Generalized Accelerated Failure Time Models for Recurrent Events

Wen, Xiaoyi; Xu, Jinfeng

doi:10.3390/math10152662


by Xiaoyi Wen ¹ and Jinfeng Xu ²,*

¹ The Institute of Statistics and Big Data, Renmin University of China, Beijing 100872, China
² School of Mathematics, Yunnan Normal University, Kunming 650092, China
* Author to whom correspondence should be addressed.

Mathematics 2022, 10(15), 2662; https://doi.org/10.3390/math10152662

Submission received: 29 June 2022 / Revised: 22 July 2022 / Accepted: 25 July 2022 / Published: 28 July 2022

(This article belongs to the Special Issue Recent Advances in Computational Statistics)


Abstract

For analyzing recurrent event data, we consider a generalization of the classical accelerated failure time model. In the proposed approach, the general function is no longer assumed to be a singleton but is allowed to be time-varying. This is in the same spirit as quantile regression, and counting process techniques can be utilized. Theoretical properties such as consistency and asymptotic normality are obtained. The methodology is illustrated through simulation studies and an application to bladder cancer data.

Keywords:

accelerated failure time model; censored quantile regression; counting processes; recurrent events; time-varying general function

MSC:

46N30; 65C60

1. Introduction

Recurrent event data arise when events of interest occur repeatedly over time. Such data are widely encountered in science and technology, especially in medical research, e.g., epileptic seizures, tumor recurrences, or asthma attacks. When studying recurrent events, investigators are often interested in estimating the effects of covariates on recurrent event times, which can support further prediction. Models can be built in many ways, and the concepts of intensity functions and counting processes are particularly useful. The nonparametric estimator of the cumulative intensity for censored failure time data, introduced independently in [1,2] and known as the Nelson–Aalen estimator, is one of the methods most widely used by investigators. Refs. [3,4], among others, studied multiplicative models for the rate and mean functions, an approach that extends to more general models, including regression. Ref. [5] proposed a generalization of the accelerated failure time model, the so-called accelerated recurrence time model, which resembles quantile regression and allows covariate effects to evolve.

In this paper, we also extend the quantile regression approach and utilize counting process techniques. For a time-to-event response T, Ref. [6] proposed a model of the form

Q_{T} (τ | Z) = \exp \{X^{T} β_{0} (τ)\}

for $τ \in (0, 1)$, where $Q_{T} (τ | Z) = \inf \{t : \Pr (T \leq t | Z) \geq τ\}$ is the $τ$-th conditional quantile of T given the covariates Z, and $X = {(1, Z^{T})}^{T}$. Here, T, $τ$ and t denote the time to event, the quantile level, and a non-negative real number, respectively. Since its introduction, quantile regression has been widely used and studied, mainly in survival analysis. Ref. [7] extended the least absolute deviations (LAD) estimation method to more general quantiles, which can also improve efficiency when the error terms are identically distributed. Ref. [5] developed a new quantile regression approach based on counting processes, and Ref. [8] adopted a constant general function in quantile regression and applied it to recurrent event data. Considerable effort has also been devoted to applications of quantile regression and to algorithms for its estimation. Ref. [9] proposed a locally weighted censored quantile regression approach that accommodates covariate-dependent censoring. Ref. [10] proposed a semi-parametric approach to a random effects quantile regression model using empirical likelihood.
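As a small illustration of the quantile definition above, the following sketch computes the generalized inverse $\inf \{t : \Pr (T \leq t) \geq τ\}$ from an empirical distribution and evaluates the log-linear quantile $\exp \{X^{T} β\}$ for a given coefficient vector. The function names `empirical_quantile` and `model_quantile` are hypothetical and introduced only for this example; censoring is ignored here for simplicity.

```python
import math


def empirical_quantile(times, tau):
    """Smallest observed t whose empirical CDF value is at least tau.

    This mirrors Q_T(tau) = inf{t : Pr(T <= t) >= tau} with the
    probability replaced by the empirical proportion (no censoring).
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie in (0, 1)")
    ordered = sorted(times)
    n = len(ordered)
    for rank, t in enumerate(ordered, start=1):
        # rank / n is the empirical CDF evaluated at the rank-th order statistic
        if rank / n >= tau:
            return t
    return ordered[-1]


def model_quantile(x, beta):
    """Quantile implied by the log-linear model exp{x^T beta}.

    x is assumed to include the leading 1 for the intercept.
    """
    return math.exp(sum(xk * bk for xk, bk in zip(x, beta)))
```

For example, `empirical_quantile([2, 1, 4, 3, 5], 0.5)` picks the third order statistic, because the empirical distribution first reaches 0.5 there.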

Overall, this work considers the situation in which the general function of the accelerated failure time model varies with time. The idea shares the spirit of parametric estimation methods, but we use a nonparametric method to estimate the effect. We then develop a new counting process model that extends the quantile regression models with this general function. This method offers some ideas for developing more diverse estimation methods for counting processes.

2. Model

2.1. Accelerated Recurrent Events Time Model

Let $T^{(j)}$ denote the time of the j-th event and define the recurrent event process $R (\cdot) = \sum_{j = 1}^{\infty} I (T^{(j)} \leq \cdot)$, where $I (\cdot)$ is the indicator function. Observation is limited to the follow-up time L, so the observed counting process is $N (\cdot) = R (\cdot \land L)$, where ∧ is the minimization operator, and the at-risk process is $Y (t) = I (L \geq t)$. We also write $μ_{Z} (t) = E \{R (t) | Z\}$, where Z is a p-vector of covariates of interest. The original accelerated recurrence time model treats covariate effects as changes of the time scale, in the same spirit as quantile regression, and is formulated through the inverse function

τ_{Z} (u) = \inf \{t : μ_{Z} (t) > G (u)\},

in a general setting. With $G (u) = u$, $τ_{Z} (u)$ denotes the time to expected frequency u. However, using a constant to estimate the expected frequency has some limitations. For example, the effect of a factor of interest on the occurrence of an event may change over time. In this case, a model that combines constant and time-varying components may be more effective. We therefore propose a new accelerated recurrence time model in which the right-hand side of the inequality in the inverse function is $G (u) = u \cdot m (t)$ instead of $G (u) = u$, where $m (t)$ is a function of time t. Regardless of the form of the inverse function, the model can be written in the following form:

\log τ_{Z} (u) = X^{T} β_{0} (u)

(1)

As can be seen from the above formula, $\log τ_{Z} (u)$ should be linear. We compare the original accelerated recurrence time model with the improved model. When the recurrent events follow a homogeneous Poisson process, as shown in Figure 1, the $\log τ_{Z} (u)$ estimated by both models is approximately linear. When the recurrent events follow a non-homogeneous Poisson process with an intensity function $λ_{0} (t)$ that depends on time t, we compare three situations in Figure 2, namely $λ_{0} (t) = b \times t$ with $b = 1, 5, 10$. As b increases, the estimate from the original accelerated recurrence time model becomes more and more nonlinear, while the proposed improved model still shows a strong linear relationship. In that way, the proposed recurrent event time model can be used in more general situations.
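To make the quantities of this subsection concrete, the sketch below evaluates the observed count $N (t) = R (t \land L)$ and the at-risk indicator $Y (t) = I (L \geq t)$ for one subject, and finds the time to expected frequency $\inf \{t : μ (t) > u\}$ by bisection for a known, continuous and increasing mean function. The function names and the bisection tolerance are assumptions made for this illustration; they are not part of the original method. For the baseline intensity $λ_{0} (t) = b t$ the mean function is $μ (t) = b t^{2} / 2$, so with $G (u) = u$ the time to expected frequency is $\sqrt{2 u / b}$, which the numerical routine should reproduce.

```python
import math


def observed_count(event_times, follow_up, t):
    """N(t) = R(min(t, L)): number of events seen by time t within follow-up L."""
    horizon = min(t, follow_up)
    return sum(1 for s in event_times if s <= horizon)


def at_risk(follow_up, t):
    """Y(t) = I(L >= t): 1 while the subject is still under observation."""
    return 1 if follow_up >= t else 0


def time_to_expected_frequency(mean_fn, target, upper, tol=1e-10):
    """Bisection for inf{t : mean_fn(t) > target} on [0, upper].

    Assumes mean_fn is continuous and non-decreasing with mean_fn(0) <= target.
    """
    if mean_fn(upper) <= target:
        raise ValueError("mean_fn(upper) must exceed target")
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_fn(mid) > target:
            hi = mid  # the crossing point is at or below mid
        else:
            lo = mid  # the crossing point is above mid
    return hi


# Baseline mean function for the intensity lambda_0(t) = b * t with b = 5.
tau_numeric = time_to_expected_frequency(lambda t: 5.0 * t * t / 2.0, 1.0, 10.0)
tau_closed_form = math.sqrt(2.0 * 1.0 / 5.0)
```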

In this paper, we consider the case $G (u) = u t$. Model (1) can then be rewritten as the estimating equation

E (X [R \{\exp (X^{T} β)\} - \exp (X^{T} β) u]) = 0

(2)

which holds if and only if $β = β_{0} (u)$. Provided that the left-hand side is monotone in β, the above equation can be transformed into

β_{0} (u) = \underset{β}{argmin} E \{\sum_{j = 1}^{\infty} {(X^{T} β - \log T^{(j)})}^{+} - \exp (X^{T} β) u\},

where $a^{+} = a \lor 0$ and ∨ denotes the maximization operator. This leads to the following theorem:

Theorem 1.

Let $G (u) = \exp (X^{T} β (u)) u$, and suppose that (1) holds and that $R \{τ_{Z} (u)\} ⊥ L ∣ Z$. Define $v^{⨂ 2} = v v^{T}$ for a vector v, let ${\dot{μ}}_{Z} (\cdot)$ denote the derivative of $μ_{Z} (\cdot)$, let $F_{Z} (x) = E \{Y (x) ∣ Z\}$, and assume that

(C1): $E [X^{⨂ 2} I \{L > τ_{Z} (u)\}]$ is non-singular;
(C2): ${\dot{μ}}_{Z} (e^{X^{T} β_{0} (u)}) F_{Z} (e^{X^{T} β_{0} (u)}) > U$ for all $u \in (0, U]$ .
Then,

$β_{0} (u) = \underset{β}{argmin} ψ (β; u),$

(3)

where

$ψ (β; u) = E \{\sum_{j = 1}^{M} {(X^{T} β \land \log L - \log T^{(j)})}^{+} - \exp (X^{T} β \land \log L) u\} .$

The proof of this result, as well as the proofs of Theorems 2 and 3 below, is given in Appendix A. Powell obtained results of this type for censored quantile regression when the censoring time is observed [7]. We next combine this result with the counting process model and apply the method to recurrent event data.

2.2. The Recurrent Events Model

For the counting process model, we begin with a review of the study by Andersen and Gill [11], which extended the Cox proportional hazards model [12] to a multivariate counting process. The close connection between multivariate counting processes and martingale methods allows us to derive further properties of the statistical estimation and testing procedures. As each component $N_{i} (\cdot)$ of a multivariate counting process has a cumulative hazard function $Λ_{T} (\cdot)$, there exist local martingales $M_{i}$ defined by $M_{i} (t) = N_{i} (t) - Λ_{T} (t)$. Since the cumulative hazard function $Λ_{T} (\cdot)$ is the integral of the hazard function, according to [11], the counting process formulation of the Cox model is given by

E \{N (t) ∣ Z\} = E \{\int_{0}^{t} Y (s) λ_{0} (s) e^{X^{T} b} d s ∣ Z\}, t > 0,

(4)

where $λ_{0} (s)$ denotes the baseline hazard function. This model accommodates recurrent event data well. Peng and Huang [5] re-examined this formulation and combined the quantile regression method for survival data with the above counting process model:

E \{N (e^{X^{T} β_{0} (u)}) ∣ Z\} = E \{\int_{0}^{u} Y (e^{X^{T} β_{0} (s)}) \frac{1}{1 - s} d s ∣ Z\}, u \in (0, 1),

(5)

which has a nice monotonicity property. In model (5), the cumulative hazard function is ${Λ}_{T} (x) = - \log (1 - x)$. Huang and Peng [13] also showed that a singleton model, in which the cumulative hazard function reduces to u, can be applied in the counting process model. Combining the singleton model with quantile regression extends the model to a more general recurrent events setting [8], and the resulting model is given by

E \{N (e^{X^{T} β_{0} (u)}) ∣ Z\} = E \{\int_{0}^{u} Y (e^{X^{T} β_{0} (s)}) g (s) d s ∣ Z\}, u \in (0, 1),

(6)

where $g (s) \equiv 1$. In model (6), the cumulative function is therefore $G (u) = u$, whose derivative g is constant. Building on Theorem 1 and the above models, we propose a new counting process model that takes the form

E \{N (e^{X^{T} β_{0} (u)}) ∣ Z\} = E \{\int_{0}^{u} Y (e^{X^{T} β_{0} (s)}) d G (s) ∣ Z\}, u \in (0, U],

(7)

where in our model, we can also show that

μ_{Z} (e^{X^{T} β_{0} (u)}) = G (u) = u t = \int_{0}^{u} g (s) d s .

Therefore, the model also has the alternative representation

τ_{Z} (G (u)) = \exp \{X^{T} β_{0} (u)\}, u \in (0, U].

(8)

The proof for this recurrent events setting is also given in Appendix A. In the simulation part, we show that our estimation method performs better when the recurrent event times follow a non-homogeneous Poisson process.

2.3. The Proposed Estimation Procedure

From Theorem 1, we can define the objective function

Ψ (β; u) = n^{- 1} \sum_{i = 1}^{n} \{\sum_{j = 1}^{M_{i}} {(X_{i}^{T} β \land \log L_{i} - \log T_{i}^{(j)})}^{+} - \exp (X_{i}^{T} β \land \log L_{i}) u\} .

(9)

Theorem 1 leads to the estimator of $β_{0} (u)$, namely $\hat{β} (u) = \underset{β}{argmin} Ψ (β; u)$.
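The empirical objective in (9) is straightforward to evaluate for a candidate β. The following is a minimal sketch, assuming that $M_{i}$ counts the events of subject i observed within the follow-up time $L_{i}$ and that each subject is stored as a tuple `(x, follow_up, event_times)` with `x` including the intercept; the function name `objective` and this data layout are hypothetical choices for the example.

```python
import math


def objective(beta, u, subjects):
    """Evaluate the empirical objective Psi(beta; u) of Equation (9).

    subjects: list of (x, follow_up, event_times); x includes the intercept 1.
    Assumption: only events with T <= L contribute (M_i observed events).
    """
    total = 0.0
    for x, follow_up, event_times in subjects:
        linear = sum(xk * bk for xk, bk in zip(x, beta))
        # X^T beta ^ log L: the linear predictor capped at the log follow-up time
        capped = min(linear, math.log(follow_up))
        observed = [t for t in event_times if t <= follow_up]
        # Sum of positive parts (capped - log T)^+ over the observed events
        hinge = sum(max(capped - math.log(t), 0.0) for t in observed)
        total += hinge - math.exp(capped) * u
    return total / len(subjects)
```

Because the objective is not convex, a routine like this is mainly useful for comparing candidate values of β or for checking the output of a dedicated optimizer.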

It can be seen that the objective function is not convex. An algorithm was developed in [13] to find a local minimizer that is asymptotically equivalent to the global minimizer. Since we proposed the counting process-based model (model 7), we can propose an estimating equation for $β_{0} (u)$:

n^{1 / 2} S_{n} (β, u) = 0,

(10)

where

S_{n} (β, u) = n^{- 1} \sum_{i = 1}^{n} X_{i} \{N_{i} (\exp \{X_{i}^{T} β (u)\}) - \int_{0}^{u} I (L_{i} \geq \exp \{X_{i}^{T} β (s)\}) d G (s)\} .

Following the results in [11], although $M_{h}$ is only a local square integrable martingale, it has the same asymptotic properties as a global martingale. Letting $S (β, u) = E \{S_{n} (β, u)\}$, the martingale property of $M (\cdot)$ implies that $S (β_{0}, u) = 0$.

Equation (8) boils down to the estimation of censored quantile regression in [5]. A common method for estimating $β_{0} (u)$ is the grid-based procedure, which treats the estimator as a right-continuous piecewise-constant function that jumps only on a grid. More specifically, we define the grid $S_{L (n)} = \{0 = u_{0} < u_{1} < \dots < u_{L (n)} = U\}$; in our recurrent events setting, the U in the grid can be greater than 1. The size of $S_{L (n)}$, denoted by $∥S_{L (n)}∥$, is the maximum value of $u_{j} - u_{j - 1}$ over $j = 1, \dots, L (n)$. It is also noteworthy that $\exp \{X^{T} \hat{β} (0)\} = 0$. Thus, based on model 9, we can estimate $\hat{β} (u_{j})$ by sequentially solving the estimating equation

n^{- 1 / 2} \sum_{i = 1}^{n} X_{i} \{R_{i} (\exp \{X_{i}^{T} β (u_{j})\}) - \sum_{k = 0}^{j - 1} I (L_{i} \geq \exp \{X_{i}^{T} \hat{β} (u_{k})\}) \int_{u_{k}}^{u_{k + 1}} g (s) d s\} = 0,

(11)
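The left-hand side of the grid-based estimating equation can be evaluated directly for a candidate coefficient vector at $u_{j}$, given the estimates already obtained at earlier grid points. The sketch below is only illustrative and rests on several assumptions that are not taken from the original text: the counts are the observed counts (events within follow-up), the increments $\int_{u_{k}}^{u_{k + 1}} g (s) d s$ are supplied by the caller in the list `dG`, and the $k = 0$ term uses $\exp \{X^{T} \hat{β} (0)\} = 0$, so every subject is at risk there. All names are hypothetical.

```python
import math


def linear_predictor(x, beta):
    """Inner product x^T beta for plain Python lists."""
    return sum(xk * bk for xk, bk in zip(x, beta))


def estimating_function(beta, prev_betas, dG, subjects):
    """Left-hand side of the grid-based estimating equation at u_j.

    beta       : candidate value for beta(u_j)
    prev_betas : estimates at u_1, ..., u_{j-1} (empty list when j = 1)
    dG         : dG[k] = integral of g over [u_k, u_{k+1}], k = 0, ..., j-1
    subjects   : list of (x, follow_up, event_times); x includes the intercept
    """
    n = len(subjects)
    total = [0.0] * len(beta)
    for x, follow_up, event_times in subjects:
        t_now = math.exp(linear_predictor(x, beta))
        # Observed number of events by time exp(x^T beta), within follow-up
        count = sum(1 for t in event_times if t <= min(t_now, follow_up))
        # k = 0: exp{x^T beta_hat(0)} = 0, so the at-risk indicator equals 1
        compensator = dG[0]
        for k, b in enumerate(prev_betas, start=1):
            if follow_up >= math.exp(linear_predictor(x, b)):
                compensator += dG[k]
        for d, xd in enumerate(x):
            total[d] += xd * (count - compensator)
    return [v / math.sqrt(n) for v in total]
```

A root, or a generalized root, of this function in `beta` gives the estimate at $u_{j}$; the function is a step function of `beta`, which is why an exact zero need not exist.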

Since Equation (10) is not continuous, an exact root may not exist. Fygenson and Ritov [14] proposed a generalized solution for monotone estimating equations. To find the generalized solution $\hat{β} (u_{j})$ of Equation (10), we perform some simple algebraic manipulations, after which the root-finding problem is equivalent to locating the minimizer of the following $L_{1}$-type convex function:

\begin{matrix} l_{j} (h) & = \sum_{i = 1}^{n} \sum_{j = 1}^{\infty} I (T_{i}^{(j)} \leq L_{i}) |\log T_{i}^{(j)} - X_{i}^{T} h| + |R^{*} - \{\sum_{i = 1}^{n} \sum_{j = 1}^{\infty} I (T_{i}^{(j)} \leq L_{i}) {(- X_{i})}^{T} h\}| \\ + |R^{*} - \{\sum_{i = 1}^{n} 2 X_{i}^{T} h \sum_{k = 0}^{j - 1} I (L_{i} \geq \exp \{X_{i}^{T} \hat{β} (u_{k})\}) \exp \{X_{i}^{T} \hat{β} (u_{k})\} (u_{k + 1} - u_{k})\}| . \end{matrix}

where $R^{*}$ is a very large number and $j = 1, \dots, L (n)$. The function $l_{j} (h)$ can be minimized with standard statistical software, such as the rq() function in the R package quantreg or the l1fit() function in S-PLUS.
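The $L_{1}$-type function above has the form of a median regression on an augmented dataset: one pseudo-observation per observed event, plus two pseudo-observations with the very large response $R^{*}$. Because $R^{*}$ dominates, the two extra absolute values act as linear terms, and the subgradient of $l_{j} (h)$ is proportional to the estimating function. The sketch below only builds the augmented responses and design rows, which could then be passed to any median regression solver. The names `build_l1_design` and `l1_objective`, and the per-subject weight `weights[i]` (standing for the inner sum over k in the last term), are assumptions of this example.

```python
import math


def build_l1_design(subjects, weights, big=1e6):
    """Augmented data (responses, rows) for the L1-type objective l_j(h).

    subjects : list of (x, follow_up, event_times); x includes the intercept
    weights  : weights[i] = sum over k of the at-risk indicator times the
               increment of G on [u_k, u_{k+1}] for subject i (assumed given)
    big      : the very large number R* in the objective
    """
    p = len(subjects[0][0])
    responses, rows = [], []
    event_sum = [0.0] * p   # sum of x_i over all observed events
    weight_sum = [0.0] * p  # sum of x_i * weights[i] over subjects
    for (x, follow_up, event_times), w in zip(subjects, weights):
        for t in event_times:
            if t <= follow_up:  # only events observed within follow-up
                responses.append(math.log(t))
                rows.append(list(x))
                for d in range(p):
                    event_sum[d] += x[d]
        for d in range(p):
            weight_sum[d] += x[d] * w
    # Pseudo-observation for |R* - sum (-x_i)^T h|
    responses.append(big)
    rows.append([-v for v in event_sum])
    # Pseudo-observation for |R* - sum 2 * weights[i] * x_i^T h|
    responses.append(big)
    rows.append([2.0 * v for v in weight_sum])
    return responses, rows


def l1_objective(h, responses, rows):
    """Sum of absolute residuals, i.e. l_j(h) on the augmented data."""
    return sum(abs(y - sum(r * hk for r, hk in zip(row, h)))
               for y, row in zip(responses, rows))
```

In R, for instance, the augmented responses and rows would play the roles of the response and the design matrix (without an extra intercept) in a median regression fit.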

2.4. Asymptotic Properties

In this section, we establish the uniform consistency and weak convergence of the proposed estimator $\hat{β} (u)$. First, we state some regularity conditions.

Define $μ (b) = E \{X N (\exp (X^{T} b))\}$, $\tilde{μ} (b) = E \{X I (\exp (X^{T} b) \leq L)\}$, ${\tilde{μ}}_{Z} (x) = E (N (x) | Z)$, $g_{Z} (x) = d {\tilde{μ}}_{Z} (x) / d x$, $F (t | Z) = \Pr (x \leq t | Z)$, and $\bar{f} (x | Z) = - f (x | Z) = - d F (x | Z) / d x$. Simple algebra then gives $B (b) = d μ (b) / d b^{T} = E \{X^{⨂ 2} e^{X^{T} b} g_{Z} (e^{X^{T} b})\}$ and $J (b) = d \tilde{μ} (b) / d b^{T} = E \{X^{⨂ 2} e^{X^{T} b} \bar{f} (e^{X^{T} b} | Z)\}$. Suppose that the assumptions of Theorem 1 hold and that $\hat{β} (u)$ is strongly consistent for $β_{0} (u)$. Assume the following conditions:

(C1’): Z and $N (L)$ are bounded;
(C2’): (a) ${\dot{μ}}_{Z} (\cdot)$ is bounded and continuous at $t = τ_{Z} (u)$ , uniformly in Z, and (b) $μ (β_{0} (u))$ is a Lipschitz function of u;
(C3’): (a) each component of $J (b) B {(b)}^{- 1}$ is uniformly bounded on $b \in B (d)$ , where $B (d)$ is a neighborhood containing $β_{0} (u)$ , (b) $g_{Z} (x) > 0$ and $E (Z^{⨂ 2})$ is positive definite;
(C4’): $\inf_{u \in [v, U]} eigmin B (β_{0} (u)) > 0$ for any $v \in (0, U]$ , where $eigmin (\cdot)$ denotes the minimum eigenvalue of a matrix.

Condition (C1’) implies the boundedness of the covariates and of the number of observed events, (C2’) gives the smoothness of $β_{0} (u)$, and (C3’) sets additional mild assumptions such as positive definiteness. Note that (C4’) is a key condition that ensures the consistency of the proposed estimator. We then have the following theorems.

Theorem 2.

Under conditions (C1’)–(C4’), if $\lim_{n \to \infty} ∥S_{L (n)}∥ = 0$, then

\sup_{u \in (v, U]} ∥\hat{β} (u) - β_{0} (u)∥ \overset{P}{\to} 0,

where $0 < v < U$.

Theorem 3.

Under conditions (C1’)–(C4’), if $\lim_{n \to \infty} n^{1 / 2} ∥S_{L (n)}∥ = 0$, then $n^{1 / 2} \{\hat{β} (u) - β_{0} (u)\}$ converges weakly to a Gaussian process for $u \in [v, U]$, where $v \in (0, U)$.

The proofs are also given in Appendix A.

3. Simulation Studies

In this part, we conducted Monte Carlo simulations to assess our model. Following the methods of Huang and Peng [13], recurrent events were generated from both homogeneous and non-homogeneous Poisson processes. We also generated two covariates, $X_{1}$ and $X_{2}$, following the $Bernoulli (0.5)$ and $Unif (- 0.5, 0.5)$ distributions, respectively. The recurrent event times were generated by

T^{(j)} = \exp \{\min (1, \frac{T^{* (j)}}{1.5 γ}) X_{1} + X_{2}\} T^{* (j)} / γ, j = 1, 2, \dots,

where, in one case, $\{T^{* (j)}, j = 1, 2, \dots\}$ are the event times of a standard homogeneous Poisson process, in other words, the gap times $T^{* (j)} - T^{* (j - 1)}$ are independent and identically distributed $exponential (1)$ variables; in the other case, $\{T^{* (j)}, j = 1, 2, \dots\}$ are the event times of a non-homogeneous Poisson process with the intensity function $λ_{0} (t) = t$. Furthermore, the frailty $γ$ followed a Gamma distribution. For the homogeneous Poisson process, we considered two situations, with the variance of $γ$ set to 0 and 0.5; for the non-homogeneous Poisson process, we only considered a variance of 0. Under our simulation setup, we have

τ_{Z} (G (u)) = \exp \{\log (u) + \min (1, u / 1.5) X_{1} + X_{2}\} .

The censoring time L was generated from $Unif (0, 12)$. For each choice of the variance of $γ$, we generated 500 datasets of sample size $n = 100$. Since we adopted the grid-based method to estimate $β$, an equally spaced grid on $u \in (0, 3]$ with step size 0.02 was used in our simulation.
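The data-generating mechanism described above can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the authors' simulation code: the Gamma frailty is taken to have mean one (shape $1 / v$ and scale $v$ for variance $v$), which the text does not specify, and the non-homogeneous process with $λ_{0} (t) = t$ is obtained from a standard Poisson process by the time change $t = \sqrt{2 s}$, since the cumulative intensity is $t^{2} / 2$. The function name `simulate_subject` and the returned dictionary layout are hypothetical.

```python
import math
import random


def simulate_subject(rng, frailty_variance=0.0, nonhomogeneous=False,
                     max_follow_up=12.0):
    """Generate covariates, censoring time and observed event times for one subject."""
    x1 = 1.0 if rng.random() < 0.5 else 0.0   # Bernoulli(0.5)
    x2 = rng.uniform(-0.5, 0.5)               # Unif(-0.5, 0.5)
    follow_up = rng.uniform(0.0, max_follow_up)
    if frailty_variance > 0.0:
        # Assumption: mean-one Gamma frailty with the requested variance
        gamma = rng.gammavariate(1.0 / frailty_variance, frailty_variance)
    else:
        gamma = 1.0  # variance 0 means no frailty
    events = []
    arrival = 0.0
    while True:
        arrival += rng.expovariate(1.0)  # exponential(1) gap of the standard process
        # Time change for lambda_0(t) = t: cumulative intensity t^2 / 2
        t_star = math.sqrt(2.0 * arrival) if nonhomogeneous else arrival
        t = math.exp(min(1.0, t_star / (1.5 * gamma)) * x1 + x2) * t_star / gamma
        if t > follow_up:
            break  # the map is increasing in t_star, so later events are censored too
        events.append(t)
    return {"x1": x1, "x2": x2, "follow_up": follow_up, "events": events}
```

A dataset of size 100 would then be `[simulate_subject(random.Random(seed)) ...]` with a shared generator, for example `rng = random.Random(2022)` followed by a list comprehension over `range(100)`.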

Figure 3 and Figure 4 show the simulation results for the homogeneous Poisson process under a Gamma frailty with variance 0 and 0.5. In the first row, we plot the empirical bias of the proposed estimator $\hat{β} (u)$ (solid lines) and the empirical bias of Sun’s estimator (dashed lines) [8]. Sun considered the doubly censored situation, whereas we only consider right-censored event times to suit a more general situation. In the second row, we plot the empirical mean squared error (MSE) against the expected frequency u. The third row presents the coverage probabilities of the 95% confidence intervals obtained from the proposed estimator (solid lines) and Sun’s estimator (dashed lines). Both methods have a slight bias that converges to 0, and the empirical MSE stabilizes as u increases. For the homogeneous Poisson process, the empirical MSE fluctuates less when the variance of the Gamma frailty equals 0 than when it equals 0.5.

Figure 5 presents the same quantities as Figure 3 and Figure 4, but for the non-homogeneous Poisson process. In the first and second rows of Figure 5, both methods perform quite similarly in terms of bias and SD. However, in the third row, the proposed estimator (solid lines) attains slightly better coverage probability than Sun’s estimator (dashed lines) in the simulation of the non-homogeneous Poisson process.
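The three summaries reported in the figures (empirical bias, empirical MSE, and coverage of 95% confidence intervals) can be computed from the replicated estimates at a fixed u as sketched below. This is a generic illustration: the function name `summarize` is hypothetical, and the intervals are assumed to be Wald-type intervals of the form estimate ± 1.96 × standard error, which the simulation section does not state explicitly.

```python
def summarize(estimates, std_errors, truth, z=1.96):
    """Empirical bias, MSE and coverage over Monte Carlo replicates.

    estimates  : one estimate of a coefficient per simulated dataset
    std_errors : the matching standard error estimates
    truth      : the true coefficient value at the chosen u
    z          : normal critical value (1.96 for nominal 95% intervals)
    """
    n = len(estimates)
    bias = sum(estimates) / n - truth
    mse = sum((e - truth) ** 2 for e in estimates) / n
    # A replicate covers the truth when |estimate - truth| <= z * standard error
    covered = sum(1 for e, s in zip(estimates, std_errors)
                  if abs(e - truth) <= z * s)
    return bias, mse, covered / n
```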

4. Application to the Bladder Tumor Studies

We applied our proposed estimation method to a well-known bladder tumor study [15]. This study was conducted to analyze the effect of two treatments, pyridoxine and thiotepa, on the recurrence of bladder tumors. A total of 118 subjects were recorded: 48 were treated with placebo, 32 with pyridoxine, and 38 with thiotepa. The covariates also included the initial number of tumors, the size of the largest initial tumor, and others. The maximum observed number of recurrences was 9.

We selected four covariates and the intercept: the two treatment methods and the initial tumor size and number. In Figure 6, we display the estimates from the proposed method (solid lines) and Sun’s method (gray line) [8], surrounded by point-wise Wald 95% confidence intervals (dashed lines). The grid-based estimation method estimated the regression coefficients over [0, 1.6]. The intercept coefficient estimates represent the log time to the expected frequency of bladder tumor recurrence for patients who received neither pyridoxine nor thiotepa. The intercept term indicates that, as time increases, bladder tumor recurrence also increases, which is in line with expectations. The negative non-intercept coefficient estimates for pyridoxine and thiotepa show that these two treatments can inhibit tumor recurrence. In contrast, the coefficient estimates for the initial tumor size and number are positive, suggesting a negative effect on tumor recurrence. From Figure 6, we can see that our method and Sun’s method do not differ much, and the overall trend is the same.

5. Discussion

In this article, we introduced the accelerated recurrence time model for recurrent events, considered a new situation in which the expected frequency in the accelerated recurrence time model varies with time, and combined this model with the counting process model. This new counting process model with a non-singleton general function is similar to Cox’s regression model but can be transformed into quantile regression. The method can also handle more general situations, such as the non-homogeneous Poisson process.

We can generalize our estimation procedure from double-censored to right-censored situations. In the recurrent events setting, the most popular choices of $G (\cdot)$ are $G (u) = u$ and $G (u) = - \ln (1 - u)$; we introduce our new estimation method with $G (u) = u t$. The assumption of a constant hazard is rarely tenable in practical problems. Therefore, in parametric estimation, a more general choice for the hazard function is the Weibull form, $g (t) = λ γ t^{γ - 1}$ for $0 < t < \infty$, so we have $G (u) = u t^{γ}$. When $γ = 0$, the Weibull function becomes a constant and the model resembles the accelerated recurrence time model. When $γ = 1$, the model becomes our proposed method. Thus, by combining the parametric and nonparametric methods, we can make the estimation method more diverse and adapt it to different types of datasets.

Author Contributions

Conceptualization, J.X.; Data curation, X.W.; Formal analysis, X.W.; Investigation, X.W. and J.X.; Methodology, J.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Proof of Theorem 1.

To estimate $β_{0}$, the left-hand side of the following equation should be monotone in β:

E (X [R \{\exp (X^{T} β)\} - \exp (X^{T} β) u]) = 0.

(A1)

Taking the derivative of the left-hand side with respect to $β$ gives

{\dot{μ}}_{Z} (e^{X^{T} β (u)}) F_{Z} (e^{X^{T} β_{0} (u)}) e^{X^{T} β (u)} X^{T} d β - u e^{X^{T} β (u)} X^{T} d β,

so, under condition (C2), the left-hand side of Equation (A1) is monotone. Meanwhile, straightforward algebra gives

\begin{matrix} ϕ (β; Z, L) & = E \{\sum_{j = 1}^{M} {(X^{T} β \land \log L - \log T^{(j)})}^{+} - \exp (X^{T} β \land \log L) u ∣ Z, L\} \\ = \int_{- \infty}^{X^{T} β \land \log L} μ_{Z} \{\exp (s)\} d s - \exp (X^{T} β \land \log L) u \end{matrix}

and it suffices to show that, for any $β \neq β_{0} (u)$,

E [ϕ (β_{0} (u); Z, L)] = ψ (β_{0} (u); u) < ψ (β; u) = E [ϕ (β; Z, L)].

To eliminate the minimization operator, we consider the following situations:

when $\log L \leq X^{T} β$ and $\log L \leq X^{T} β_{0} (u)$ ,

$ϕ (β_{0} (u); Z, L) = ϕ (β; Z, L) .$
when $\log L > X^{T} β$ and $\log L \leq X^{T} β_{0} (u)$ ,

$\begin{matrix} ϕ (β_{0} (u); Z, L) - ϕ (β; Z, L) & = \int_{X^{T} β}^{\log L} μ_{Z} \{\exp (s)\} d s - u [\exp (\log L) - \exp (X^{T} β)] \\ & = \int_{X^{T} β}^{\log L} [μ_{Z} \{\exp (s)\} - u \{\exp (s)\}] d s \leq 0 . \end{matrix}$
when $\log L > X^{T} β_{0} (u)$ , since $x = X^{T} β_{0} (u)$ is the unique value that minimizes

$\int_{- \infty}^{x} μ_{Z} \{\exp (s)\} d s - e^{x} u,$

by the monotonicity of $μ_{Z} (\cdot)$ , we have $ϕ (β_{0} (u); Z, L) \leq ϕ (β; Z, L)$ . By the non-singularity of $E [X^{⨂ 2} I \{L > τ_{Z} (u)\}]$ ,

$E [ϕ (β_{0} (u); Z, L) I \{\log L > X^{T} β_{0} (u)\}] \leq E [ϕ (β; Z, L) I \{\log L > X^{T} β_{0} (u)\}] .$

Thus, the proof is completed. □

Proof of Theorem 2.

Define $α_{0} (u) = μ (β_{0} (u))$, $\hat{α} (u) = μ (\hat{β} (u))$, and

B (d) = \{b \in R^{p + 1} : \inf_{u \in (0, U]} ∥μ (b) - μ (β_{0} (u))∥ \leq d\}, A (d) = \{μ (b) : b \in B (d)\} .

Note that, for any $b_{1}, b_{2} \in B (d)$, the equality

{(b_{1} - b_{2})}^{T} \{μ (b_{1}) - μ (b_{2})\} = E [(X^{T} b_{1} - X^{T} b_{2}) \{N (\exp (X^{T} b_{1})) - N (\exp (X^{T} b_{2}))\}] = 0

holds only when $b_{1} = b_{2}$, by condition (C3’). Hence, $μ (\cdot)$ is a one-to-one map from $B (d)$ to $A (d)$, and we denote its inverse by $κ (\cdot)$, so that $κ \{μ (b)\} = b$ for any $b \in B (d)$.

Furthermore, by the properties of the martingale

M (u) = N (\exp (X^{T} β_{0} (u))) - \int_{0}^{u} Y (e^{X^{T} β_{0} (s)}) g (s) d s,

we have

n^{- 1} \sum_{i = 1}^{n} X_{i} N_{i} (\exp \{X_{i}^{T} \hat{β} (u)\}) = n^{- 1} \sum_{i = 1}^{n} \int_{0}^{u} X_{i} I [\exp \{X_{i}^{T} \hat{β} (s)\} \leq L_{i}] g (s) d s + ξ_{n, k},

where $ξ_{n, k} = O_{p} (n^{- 1})$.

Then, the following notations are defined:

\begin{matrix} v_{n} (b) = n^{- 1} \sum_{i = 1}^{n} X_{i} N_{i} (\exp \{X_{i}^{T} b\}) - μ (b), \\ {\tilde{v}}_{n} (b) = n^{- 1} \sum_{i = 1}^{n} X_{i} I (\exp \{X_{i}^{T} b\} \leq L_{i}) - \tilde{μ} (b) . \end{matrix}

By the Glivenko–Cantelli theorem, we have

\begin{matrix} \sup_{b} ∥v_{n} (b)∥ \overset{a . s .}{\to} 0, \\ \sup_{b} ∥{\tilde{v}}_{n} (b)∥ \overset{a . s .}{\to} 0 . \end{matrix}

Since $E (X N (\exp \{X^{T} β_{0} (u)\})) = E (X \int_{0}^{u} I (L \geq \exp \{X^{T} β_{0} (s)\}) d G (s))$, the difference $\hat{α} (u) - α_{0} (u)$ is equal to

- v_{n} (\hat{β} (u)) + \int_{0}^{u} {\tilde{v}}_{n} \{\hat{β} (s)\} d G (s) + \sum_{k = 1}^{j} \int_{u_{k - 1}}^{u_{k}} [\tilde{μ} (\hat{β} (s)) - \tilde{μ} (β_{0} (s))] d G (s) + ξ_{n, k} .

By arguments similar to those in [5], $\sup_{u \in [u_{k - 1}, u_{k})} ∥\hat{α} (u) - α_{0} (u)∥$ can be bounded almost surely by $ε_{k} = {(1 + C_{4} b_{n})}^{k - 1} (C_{1} + ε_{0} C_{4} b_{n} + C_{2} n^{- 1} + C_{3} a_{n})$, where $ε_{0} = C_{3} a_{n}$. Given $\lim_{n \to \infty} a_{n} = 0$ and $L (n) = U / a_{n}$, we have

\lim_{n \to \infty} {(1 + C_{4} b_{n})}^{L (n) - 1} = \exp \{C_{4} U / (1 - U)\},

where $C_{1}$, $C_{2}$, $C_{3}$, and $C_{4}$ are positive constants. Since the $C_{i}$ are arbitrary and $\lim_{n \to \infty} a_{n} = 0$, it follows that $\sup_{u \in (0, U]} ∥\hat{α} (u) - α_{0} (u)∥ \overset{P}{\to} 0$. Applying a Taylor expansion of $κ \{\hat{α} (u)\}$ around $α_{0} (u)$ then yields the conclusion $\sup_{u \in (v, U]} ∥\hat{β} (u) - β_{0} (u)∥ \overset{P}{\to} 0$. □

Proof of Theorem 3.

Following the arguments in [13], we have the following lemma:

\begin{matrix} \sup_{u \in (0, U]} ∥n^{- 1 / 2} \sum_{i = 1}^{n} X_{i} \{N_{i} (e^{X_{i}^{T} \hat{β} (u)}) - N_{i} (e^{X_{i}^{T} β_{0} (u)})\} - n^{- 1 / 2} \{μ (\hat{β} (u)) - μ (β_{0} (u))\}∥ \overset{P}{\to} 0, \\ \sup_{u \in (0, U]} ∥n^{- 1 / 2} \sum_{i = 1}^{n} X_{i} \{I (L_{i} \geq e^{X_{i}^{T} \hat{β} (u)}) - I (L_{i} \geq e^{X_{i}^{T} β_{0} (u)})\} - n^{- 1 / 2} \{\tilde{μ} (\hat{β} (u)) - \tilde{μ} (β_{0} (u))\}∥ \overset{P}{\to} 0 . \end{matrix}

(A2)

and the above lemma implies that

\begin{matrix} - n^{1 / 2} S_{n} (β_{0}, u) & = n^{1 / 2} [μ (\hat{β} (u)) - μ (β_{0} (u))] - \int_{0}^{u} n^{1 / 2} [\tilde{μ} (\hat{β} (s)) - \tilde{μ} (β_{0} (s))] d G (s) + o_{(0, U]} (1) \\ & = n^{1 / 2} [μ (\hat{β} (u)) - μ (β_{0} (u))] \\ & \quad - \int_{0}^{u} [J (β_{0} (s)) B {(β_{0} (s))}^{- 1} + o_{(0, U]} (1)] n^{1 / 2} [μ (\hat{β} (s)) - μ (β_{0} (s))] d G (s) + o_{(0, U]} (1) \end{matrix}

since $\lim_{n \to \infty} n^{1 / 2} ∥S_{L (n)}∥ = 0$ implies $n^{1 / 2} S_{n} (\hat{β}, u) = o_{(0, U]} (1)$. We can then use product integration theory to obtain the following equation:

n^{1 / 2} [μ (\hat{β} (u)) - μ (β_{0} (u))] = φ \{- n^{1 / 2} S_{n} (β_{0}, u)\} + o_{(0, U]} (1),

where $φ (g) (u) = \int_{0}^{u} I (s, u) d g (s)$ is a one-to-one map from $F$ to $F$. Here, $F = \{g : [0, U] \to R^{p + 1}\}$ consists of functions g that are left-continuous with right limits and satisfy $g (0) = 0$. Meanwhile,

I (s, t) = π_{u \in (s, t]} [I_{p + 1} + J (β_{0} (u)) B {(β_{0} (u))}^{- 1} d G (u)] .

Following the definition given in [16], $\{X_{i} N_{i} (\exp \{X_{i}^{T} β_{0} (s)\}), s \in (0, U]\}$ is a VC-class. For the integral term $\int_{0}^{u} I (L_{i} \geq \exp \{X_{i}^{T} β_{0} (s)\}) d G (s)$, let $h (x) = \int_{0}^{x} I (L_{i} \geq \exp \{X_{i}^{T} β_{0} (s)\}) d G (s)$ and suppose that $u_{1} \leq u_{2}$; then

|h (u_{1}) - h (u_{2})| = |\int_{u_{2}}^{u_{1}} I (L_{i} \geq \exp \{X_{i}^{T} β_{0} (s)\}) d G (s)| \leq K |u_{1} - u_{2}|,

where $K = \max (\exp \{X_{i}^{T} β_{0} (s)\}, s \in [u_{1}, u_{2}])$, which implies that $h (x)$ is Lipschitz in x. The above arguments show that

\{X_{i} N_{i} (\exp \{X_{i}^{T} β_{0} (u)\}) - \int_{0}^{u} I (L_{i} \geq \exp \{X_{i}^{T} β_{0} (s)\}) d G (s), u \in (0, U]\}

is a Donsker class. By the Donsker theorem, $- n^{1 / 2} S_{n}$ converges weakly to a Gaussian process, and the limit of $φ \{- n^{1 / 2} S_{n} (β_{0}, u)\}$ is also a Gaussian process because $φ (\cdot)$ is a linear operator. By the continuous mapping theorem and a Taylor expansion of $κ [μ \{\hat{β} (u)\}]$ around $κ [μ \{β_{0} (u)\}]$, $n^{1 / 2} \{\hat{β} (u) - β_{0} (u)\}$ converges weakly to a Gaussian process with mean zero and covariance matrix $Σ (s, t) = E [ζ_{1} (s) ζ_{1} (t)]$, where $ζ_{i} (u) = B {(β_{0} (u))}^{- 1} φ (s_{i})$ and $s_{i} (u) = N_{i} (\exp \{X_{i}^{T} β (u)\}) - \int_{0}^{u} I (L_{i} \geq \exp \{X_{i}^{T} β (s)\}) d G (s)$. □

Recurrent Event Setting

Suppose that model 6 holds. By definition,

E \{N (e^{X^{T} β_{0} (u)}) | Z\} = μ_{Z} (e^{X^{T} β_{0} (u)} \land L),

and, following the definition of $τ_{Z} (u)$, the inequality $e^{X^{T} β_{0} (u)} \leq L$ can also be written as $G (u) \leq μ_{Z} (L)$. Then

E \{\int_{0}^{u} Y (e^{X^{T} β_{0} (s)}) g (s) d s | Z\} = E \{\int_{0}^{\infty} I (G (s) \leq μ_{Z} (L)) d G (s) | L, Z\} .

In model 6, $G (u) = μ_{Z} \{\exp (X^{T} β_{0} (u))\}$, so it follows from the above equation that, for $u \in (0, U]$,

E \{\int_{0}^{u} Y (e^{X^{T} β_{0} (s)}) g (s) d s ∣ Z\} = μ_{Z} (L \land e^{X^{T} β_{0} (u)}) .

Therefore, model 5 is satisfied.

Conversely, suppose that model 5 holds. Taking the derivative with respect to u on both sides of the equation in model 5, we have

{\dot{μ}}_{Z} (e^{X^{T} β (u)}) F_{Z} (e^{X^{T} β_{0} (u)}) e^{X^{T} β (u)} X^{T} d β = F_{Z} (e^{X^{T} β_{0} (u)}) g (u) d u,

that is, $d μ_{Z} (e^{X^{T} β (u)}) = g (u) d u$. Hence, $μ_{Z} (e^{X^{T} β (u)}) = G (u)$, and model 6 is satisfied.

References

1. Nelson, W. Hazard plotting for incomplete failure data. J. Qual. Technol. 1969, 1, 27–52.
2. Altshuler, B. Theory for the measurement of competing risks in animal experiments. Math. Biosci. 1970, 6, 1–11.
3. Lawless, J.F. Regression methods for Poisson process data. J. Am. Stat. Assoc. 1987, 82, 808–815.
4. Lin, D.Y.; Wei, L.J.; Yang, I.; Ying, Z. Semiparametric regression for the mean and rate functions of recurrent events. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2000, 62, 711–730.
5. Peng, L.; Huang, Y. Survival analysis with quantile regression models. J. Am. Stat. Assoc. 2008, 103, 637–649.
6. Koenker, R.; Bassett, G., Jr. Regression quantiles. Econom. J. Econom. Soc. 1978, 46, 33–50.
7. Powell, J.L. Censored regression quantiles. J. Econom. 1986, 32, 143–155.
8. Sun, X.; Peng, L.; Huang, Y.; Lai, H.J. Generalizing quantile regression for counting processes with applications to recurrent events. J. Am. Stat. Assoc. 2016, 111, 145–156.
9. Wang, H.J.; Wang, L. Locally weighted censored quantile regression. J. Am. Stat. Assoc. 2009, 104, 1117–1128.
10. Kim, M.O.; Yang, Y. Semiparametric approach to a random effects quantile regression model. J. Am. Stat. Assoc. 2011, 106, 1405–1417.
11. Andersen, P.K.; Gill, R.D. Cox’s regression model for counting processes: A large sample study. Ann. Stat. 1982, 10, 1100–1120.
12. Cox, D.R. Regression models and life-tables. J. R. Stat. Soc. Ser. B (Methodol.) 1972, 34, 187–202.
13. Huang, Y.; Peng, L. Accelerated recurrence time models. Scand. J. Stat. 2009, 36, 636–648.
14. Fygenson, M.; Ritov, Y. Monotone estimating equations for censored data. Ann. Stat. 1994, 22, 732–746.
15. Byar, D. The veterans administration study of chemoprophylaxis for recurrent stage i bladder tumours: Comparisons of placebo, pyridoxine and topical thiotepa. In Bladder Tumors and Other Topics in Urological Oncology; Springer: Berlin/Heidelberg, Germany, 1980; pp. 363–370.
16. Vaart, A.V.D.; Wellner, J.A. Weak convergence and empirical processes with applications to statistics. J. R. Stat. Soc.-Ser. A Stat. Soc. 1997, 160, 596–608.

Figure 1. Estimation results of homogeneous Poisson process.

Figure 2. Estimation results of non-homogeneous Poisson process.

Figure 3. Simulation results with sample size n = 100 and the set-up with Gamma frailty of variance 0.

Figure 4. Simulation results with sample size n = 100 and the set-up with Gamma frailty of variance 0.5.

Figure 5. Simulation results of non-homogeneous Poisson process with sample size n = 100 and the set-up with Gamma frailty of variance 0.

Figure 6. Bladder data example: coefficient estimates and 95% pointwise confidence.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
