Article

Joint Model for Estimating the Asymmetric Distribution of Medical Costs Based on a History Process

1
School of Mathematics, Jilin University, No. 2699 Qianjin Street, Changchun 130012, China
2
Department of Mathematics and Statistics, University of Regina, Regina, SK 62519, Canada
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Symmetry 2023, 15(12), 2130; https://doi.org/10.3390/sym15122130
Submission received: 11 October 2023 / Revised: 21 November 2023 / Accepted: 22 November 2023 / Published: 30 November 2023
(This article belongs to the Section Mathematics)

Abstract

In this paper, we modify a semi-parametric estimation of the joint model for the mean medical cost function with time-dependent covariates to enable it to describe the nonlinear relationship between the longitudinal variable and time points by using polynomial approximation. The observation time points are discrete and not exactly the same for all subjects; in order to use all of the information, we first estimate the mean medical cost at the observed time points for all subjects, and then we weigh these values using the kernel method. Therefore, a smooth mean function of medical costs can be obtained. The proposed estimating method can be used for asymmetric distribution statistics. The consistency of the estimator is demonstrated by theoretical analysis. For the simulation study, we first set up the values of the parameters and non-parametric functions, and then we generated random samples for the covariates and censored survival times. Finally, the longitudinal data of the response variables could be produced based on the covariates and survival times. Then, numerical simulation experiments were conducted by using the proposed method and applying the JM package in R to the generated data. The estimated results for the parameters and non-parametric functions were compared under different settings. The numerical results illustrate that the standard deviations of the parametric estimators decrease as the sample size increases and are much smaller than the preassigned threshold value. The estimates of the non-parametric functions in the model almost coincide with the true functions, as shown in the figures of the simulation study. We apply the proposed model to a real data set from a multicenter automatic defibrillator implantation trial.

1. Introduction

Survival analysis consists of analyzing and inferring the occurrence times of the events of interest. Survival time represents the time from the beginning of observation to the occurrence of the event (e.g., the patient’s death), which is affected by random factors [1]. The distribution characteristics of survival time are usually described by three functions: the probability density function, the survival function, and the hazard rate function. In research on survival analysis, researchers need to use appropriate methods to evaluate the medical insurance-related expenditures of a specific patient group within a specific time frame. This type of data includes longitudinal data of repeated measurements of the sample, time-to-event censored data, and supplementary information from covariates. As we can see, the information from additional covariates and the censorship of time-to-event data complicate the estimation of mark variables. The Kaplan–Meier estimator [2] can estimate the mark survival function if the historical process is constant before the event time, which leads to the proportional mark variable and to the terminal variable.
However, due to the ‘inducing’ information of the censoring mechanism, the survival function is unidentifiable [3]. In clinical treatment, more and more researchers have realized that the patient’s historical process is usually more informative than a simple endpoint measurement, which is called a mark variable in point process theory; it is also called a ‘cumulative variable of the historical process’. Lin et al. [4] considered the average total cost and minimized the bias from censoring by partitioning the entire time period into a series of small intervals, and the estimators were proven to be asymptotically normal. Huang and Louis [5] formulated weighted log-rank statistics in a marked point process framework and developed the asymptotic theory.
Weighted estimators for censoring data proposed by Bang and Tsiatis [6] performed well even when the survival data were heavily censored. Then, Zhao and Tsiatis [7] and Zhao and Tian [8] suggested the refinements of the weighted estimators. Furthermore, some researchers estimated the mean of the truncated mark variable indirectly. For instance, Korn [9] improved the Kaplan–Meier estimator by estimating the cumulative distribution function of an interpolated area under a quality-of-life curve. Strawderman [10] estimated the mean of the stopped longitudinal process and proved the asymptotic relative efficiency of the proposed estimators. They also proposed some new estimators with finite-sample optimality properties.
However, it should be noticed that the entire time period has to be partitioned into several small intervals so that the censored events occur at the boundaries, in order for the method proposed by Lin et al. [4] to be applicable. The estimator derived by Bang and Tsiatis [6] works only if the observation is complete, and the refined versions are very complicated to compute. Fang et al. [11] derived an estimator of the mean mark variable under right censoring, but time-dependent/independent covariates were not considered. Liu et al. [12] presented a ‘turn back time’ method with time-dependent covariates to keep the actual censoring time for the survival data while the observations of the cost data are censored before the actual censoring time. However, they did not consider repeated measurements of the longitudinal variable or the effects of time-independent covariates.
The correlation between individuals’ repeated measurements of marked variables over time and the time-to-event variables needs to be considered jointly. Fortunately, Deng [13] combined the joint model technique [1] and the inverse probability weighting method, and developed a novel approach to estimating the cumulative mean mark variable with time-dependent covariates under right censoring. Deng [13] considered a mixed effects model for the longitudinal data to examine the relationship between the longitudinal data and other covariates, and an additive risk model for the survival data to demonstrate the relationship between the longitudinal process and the hazard rate function. The proposed estimator for the state function from subjects observed at discrete time points has been proven to be consistent.
However, there is little literature that considers a non-linear relationship in the joint model. Zhao et al. [14] characterized the relationship between the longitudinal variable and the observed time points with a latent variable and an unspecified link function to describe the joint model, but the estimation of the link function was not given. Li et al. [15] linked multivariate sparse functional data to event–time data by a functional joint model. Do et al. [16] classified the under-representation group using a joint fairness model (JFM) approach for logistic regression models and proposed a joint modeling objective function to predict risk. Tang et al. [17] considered multivariate longitudinal and bivariate correlated survival data and proposed a Bayesian penalized splines method to approximate the baseline hazard functions. In fact, the expected value of the longitudinal variable may depend on time non-linearly. Thus, such a relationship should be addressed by using a semi-parametric joint model. The baseline hazard function in the joint model can be seen as an infinite-dimensional parameter in the calculation procedure. Generally, the estimation of the parameters in the joint model is carried out by expectation–maximization (EM) (refer to [18] for the specific algorithm). A drawback of the EM algorithm is that standard error (SE) estimates are not automatically produced. The estimation of the SE in this paper is attained by calling “method = piecewise-PH-GH” in the JM package, where the baseline risk function is assumed to be piecewise constant. In this paper, we modify the linear model in the longitudinal sub-model of the traditional joint model into a semi-parametric model, in which the non-parametric part adopts the method of polynomial regression to make the estimates more accurate. In Section 2, we suggest a modified semi-parametric joint model and obtain the consistency of the proposed estimator.
In Section 3, we show the feasibility of this method through numerical simulations. In Section 4, we apply the method established in this paper to a real data set from the Multicenter Automatic Defibrillator Implantation Trial (MADIT). In Section 5, we conclude this paper with a discussion.

2. Estimation Methods

2.1. Description of Models

The target data involve the variables $Y(t)$, $X(t)$, $Z(t)$, $W$, $T$, and $\delta$. $T$ is the terminal event time, and $Y(t)$ is the patient’s state history process, which is related to the vectors of time-dependent covariates $X(t)$, $Z(t)$ and the time-independent covariate $W$. Furthermore, the observations of $\{Y(t), t \ge 0\}$ cease at the terminal event time $T$, with $Y(t) = 0$ when $t \ge T$.
Furthermore, $L$ is the artificial end point according to the data situation. For convenience, let $\nu(t) = E[Y(t)]$ be the mean state function of $Y(t)$; for $\tau \in [0, L]$, let $\Gamma(\tau) = \int_0^{\tau} Y(t)\,dt$ be the cumulative variable over the time period $[0, \tau]$, and let $\mu(\tau) = E[\Gamma(\tau)]$ be the mean of the cumulative variable $\Gamma(\tau)$. Let $S_T(t) = P(T > t)$ be the survival function of $T$, $S_C(t) = P(C > t)$ be the survival function of $C$, $S_{T^*}(t) = P(T^* > t)$ be the survival function of $T^* = \min\{T, C\}$, and $\delta = I(T \le C)$ be the corresponding event indicator.
Now, for the $i$th subject, the true event time is denoted by $T_i$, $i = 1, \ldots, m$; the observed values of $Y(t)$, $Z(t)$, $X(t)$, $W$ at time point $t$ are denoted by $y_i(t)$, $z_i(t)$, $x_i(t)$, and $w_i$, $i = 1, 2, \ldots, m$; and the event indicator is denoted by $\delta_i = I(T_i \le C_i)$.
We should notice that $Y_i(t)$ cannot be observed at all time points, but only at some specific times $t_{ij}$ up to the event time $T_i^*$.
Thus, the measurements of the longitudinal data consist of the vectors $y_i = \{y_i(t_{ij}), j = 1, 2, \ldots, n_i\}$, $x_i = \{x_i(t_{ij}), j = 1, 2, \ldots, n_i\}$, and $z_i = \{z_i(t_{ij}), j = 1, 2, \ldots, n_i\}$ with $t_{i n_i} \le T_i^*$.
Now, we can describe the semi-parametric joint model for the data $\{(y_i, x_i, z_i, w_i, T_i^*, \delta_i);\ i = 1, 2, \ldots, m\}$ as follows:
$$y_i(t) = m_i(t) + \epsilon_i(t), \quad m_i(t) = x_i(t)^T \beta + g(t) + z_i(t)^T b_i, \quad h_i(t) = h_0(t) \exp\{w_i \gamma + \alpha m_i(t)\},$$
where $\beta$ is a fixed effects regression parameter, $g(t)$ is an unknown smooth and bounded function, $b_i$ denotes the random effects coefficients with $b_i \sim N(0, D)$, $\alpha$ is a fixed effects parameter that describes the correlation between the longitudinal outcome and the hazard rate of event occurrence, $h_0(t)$ is the baseline hazard function, and $\epsilon_i(t)$ is the random error, assumed to be independent of $T$ conditional on the time-independent covariates $W$.

2.2. Proposed Estimator for the Parameter

It is assumed that g ( t ) is independent of ε i and b i . The existing approaches to approximate the unknown smooth function g ( t ) include the kernel method, wavelet-based methods, smoothing splines, regression splines, and so on.
We consider the polynomial basis regression splines method to express the non-parametric part. Now, for a given sequence $0 < \tau_1 < \tau_2 < \cdots < \tau_K < \tau$ of $K$ knots, a linear combination $f(t)$ of a set of power basis functions can be used to recover $g(t)$:
$$f(t) = \eta^T B(t),$$
where:
$$\eta = (\eta_0, \ldots, \eta_{k+K})^T, \quad B(t) = \big[1, t, \ldots, t^k, (t - \tau_1)_+^k, \ldots, (t - \tau_K)_+^k\big]^T.$$
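For illustration, the truncated power basis and a least-squares recovery of $g(t)$ can be sketched as follows. This is a minimal Python sketch with hypothetical knots and data; it is not the estimation procedure of the paper, which estimates $\eta$ jointly with the other parameters by maximum likelihood.

```python
import numpy as np

def power_basis(t, knots, degree=3):
    """Truncated power basis B(t) = [1, t, ..., t^k, (t-tau_1)_+^k, ..., (t-tau_K)_+^k]."""
    t = np.asarray(t, dtype=float)
    cols = [t**j for j in range(degree + 1)]
    cols += [np.clip(t - tau, 0.0, None)**degree for tau in knots]
    return np.column_stack(cols)

def spline_fit(t, y, knots, degree=3):
    """Least-squares estimate of eta so that B(t) @ eta approximates g(t)."""
    B = power_basis(t, knots, degree)
    eta, *_ = np.linalg.lstsq(B, y, rcond=None)
    return eta
```

For a cubic $g$, the basis reproduces the function exactly, so the least-squares fit has zero residual; in general, more interior knots give more flexibility at the cost of more parameters in $\eta$.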
For convenience, let $\theta = (\theta_t^T, \theta_y^T, \theta_b^T)^T$ be the vector of all parameters, where $\theta_t = (\alpha, \gamma)$ denotes the parameters related to the event–time outcome, $\theta_y = (\beta, \eta, \sigma^2)$ are the parameters of the longitudinal outcomes, $\theta_b = D$ are the parameters in the random-effects covariance matrix, $q_b$ is the dimensionality of the random-effects vector, and $\|x\|$ is the Euclidean norm of a vector $x$. Then, the log-likelihood contribution of the $i$th subject is given by:
$$l_i = \log L_i = \log p(T_i, \delta_i, y_i; \theta) = \log \int p(T_i, \delta_i, y_i, b_i; \theta)\,db_i = \log \int p(T_i, \delta_i \mid b_i; \theta_t, \beta)\, p(y_i \mid b_i; \theta_y)\, p(b_i; \theta_b)\,db_i,$$
where:
$$p(T_i, \delta_i \mid b_i; \theta_t, \beta) = h_i\big(T_i \mid M_i(T_i); \theta_t, \beta\big)^{\delta_i}\, S_i\big(T_i \mid M_i(T_i); \theta_t, \beta\big) = \big[h_0(T_i) \exp\{\gamma w_i + \alpha m_i(T_i)\}\big]^{\delta_i} \times \exp\Big\{-\int_0^{T_i} h_0(s) \exp\{\gamma w_i + \alpha m_i(s)\}\,ds\Big\},$$
and:
$$\prod_j p\big(y_i(t_{ij}) \mid b_i; \theta_y\big)\, p(b_i; \theta_b) = \prod_j \frac{1}{\sqrt{2\pi(\sigma^2 + \sigma_{1i}^2)}} \exp\Big\{-\frac{\big(y_i(t_{ij}) - x_i(t_{ij})^T \beta - z_i^T(t_{ij}) b_i - \eta^T B(t_{ij})\big)^2}{2(\sigma^2 + \sigma_{1i}^2)}\Big\} \times (2\pi)^{-q_b/2} \det(D)^{-1/2} \exp\big\{-b_i^T D^{-1} b_i / 2\big\}.$$
Now, the estimators $\hat\beta, \hat\alpha, \hat\gamma, \hat\eta, \hat D, \hat\sigma^2$ for the parameters $\beta, \alpha, \gamma, \eta, D, \sigma^2$ can be derived, and the corresponding $\hat b_i \sim N(0, \hat D)$ and $\hat g(t) = \hat\eta^T B(t)$ can be obtained.
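The integral over the random effects $b_i$ in the log-likelihood above has no closed form; packages such as JM approximate it with (adaptive) Gauss–Hermite quadrature. The quadrature step can be sketched in one dimension as follows; the integrand `f` and the node count are illustrative assumptions, not the JM internals.

```python
import numpy as np

def gh_expectation(f, sigma, n_nodes=20):
    """Approximate E[f(b)] for b ~ N(0, sigma^2) by Gauss-Hermite quadrature.
    The change of variables b = sqrt(2)*sigma*x maps the Hermite weight
    exp(-x^2) onto the Gaussian density."""
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    b = np.sqrt(2.0) * sigma * x
    return np.sum(w * f(b)) / np.sqrt(np.pi)
```

For a smooth integrand, a modest number of nodes already gives high accuracy, which is why this quadrature is the standard device for marginalizing low-dimensional random effects.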
Remark 1.
The proposed semi-parametric model can handle inference for asymmetric distribution statistics. With $K$ interior knots, distributions with a kurtosis coefficient smaller than 0 are described well.
Remark 2.
The estimator β ^ derived by maximum likelihood is consistent. Theorem 3.1 in Zeng and Cai [19] states the strong consistency of the maximum likelihood estimator.
Remark 3.
The estimator $\hat g(t)$ is consistent in the case where $g(t)$ is a smooth and bounded function. By Taylor expansion, we have that $g(t) = g(t_0) + g'(t_0)(t - t_0) + \frac{1}{2!} g''(t_0)(t - t_0)^2 + \frac{1}{3!} g'''(t_0)(t - t_0)^3 + O\big((t - t_0)^4\big)$. Hence, $\hat\eta^T B(t) - g(t) \to 0$ as $n \to \infty$.
The fitted values of Y ( t ) and h ( t ) can be estimated by the maximum likelihood method based on the observations. For convenience, we define the set Δ = { t ( k ) ; k = 1 , 2 , , N } , where 0 = t ( 0 ) < t ( 1 ) < t ( 2 ) < < t ( N ) are the observed distinct time points for all subjects, and N is the total number of the distinct observed time points from all subjects.
Using the inverse probability weighting method, the proposed estimator for the mean state function ν at any observed time point is as follows:
$$\hat\nu(t_{(k)}) = \frac{1}{m} \sum_{i=1}^m \frac{I\{T_i^* \ge t_{(k)}\}}{\hat S_{T^*}(t_{(k)})} \hat Y_i(t_{(k)}) = \frac{1}{m} \sum_{i=1}^m \frac{I\{T_i^* \ge t_{(k)}\}}{\hat S_T(t_{(k)})\, \hat S_C(t_{(k)})} \hat Y_i(t_{(k)}),$$
where Y ^ i ( t ( k ) ) is the fitted value of Y i ( t ( k ) ) at time point t ( k ) from the joint model:
$$\hat Y_i(t_{(k)}) = x_i(t_{(k)})^T \hat\beta + \hat g(t_{(k)}),$$
S ^ C ( t ( k ) ) is a Kaplan–Meier estimator for S C ( t ( k ) ) , and  S ^ T ( t ( k ) ) is the estimator of the survival function S T ( t ( k ) ) , which can be obtained from the joint model:
$$\hat S_T(t_{(k)}) = \exp\Big\{-\int_0^{t_{(k)}} \hat h(u)\,du\Big\}.$$
Theoretically, the estimator $\hat S_T(t)$ is related to $x(s)$ for $s \in [0, t]$. It can be obtained only when $x(t)$ can be observed continuously. When $x(t)$ is not continuously observed, we replace $\hat S_T(t)$ with the Kaplan–Meier estimator.
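A minimal sketch of the inverse-probability-weighted estimator $\hat\nu(t_{(k)})$, using Kaplan–Meier estimates for both the survival and censoring curves (the simplification just described; the paper uses the joint-model-based $\hat S_T(t)$ when covariates are continuously observed). All function and variable names here are hypothetical, and ties in the event times are handled only approximately.

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier curve evaluated at the sorted observation times.
    events: 1 if the event of interest occurred, 0 otherwise."""
    order = np.argsort(times)
    t_sorted, e_sorted = times[order], events[order]
    n = len(t_sorted)
    at_risk = n - np.arange(n)                       # risk set size just before each time
    factors = np.where(e_sorted == 1, 1.0 - 1.0 / at_risk, 1.0)
    return t_sorted, np.cumprod(factors)

def km_at(t, km_times, km_surv):
    """Step-function evaluation of a Kaplan-Meier curve at time t."""
    idx = np.searchsorted(km_times, t, side="right") - 1
    return 1.0 if idx < 0 else km_surv[idx]

def ipw_state_mean(t_k, obs_times, deltas, y_fitted):
    """IPW estimate of nu(t_k): average fitted Y over subjects still under
    observation at t_k, reweighted by the probability of being observed."""
    ct, cs = kaplan_meier(obs_times, 1 - deltas)     # KM for the censoring distribution
    st_t, st_s = kaplan_meier(obs_times, deltas)     # KM for the survival distribution
    s_c = km_at(t_k, ct, cs)
    s_t = km_at(t_k, st_t, st_s)
    at_risk = obs_times >= t_k
    return np.mean(at_risk * y_fitted) / (s_t * s_c)
```

With no censoring and $Y \equiv 1$, the weights exactly cancel the risk-set shrinkage and the estimator returns 1, which is a useful sanity check on the weighting.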
The values of $\hat\nu(s)$ for $s \in [0, t]$ cannot be obtained continuously, and thus the estimator $\hat\nu(t)$ is available only at the observed time points. Therefore, we can use the kernel method to ‘smooth’ the estimator of the state function:
$$\tilde\nu(t) = \sum_{k=1}^N W_h(t - t_{(k)})\, \hat\nu(t_{(k)}),$$
where:
$$W_h(t - t_{(k)}) = \frac{K_h(t - t_{(k)})}{\sum_{k'=1}^N K_h(t - t_{(k')})},$$
$K_h(\cdot)$ is the kernel weight function that can be selected according to the actual situation, and the bandwidth $h$ scales the distance between $t$ and $t_{(k)}$. Then, the mean of the cumulative variable $\mu(t)$ at any time point $t$ can be estimated as:
$$\tilde\mu(t) = \int_0^t \tilde\nu(s)\,ds.$$
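The smoothing and integration steps above can be sketched as follows. A Gaussian kernel is used here as one common choice for $K_h$, and the trapezoidal rule stands in for the integral; both are illustrative assumptions rather than the paper's prescribed choices.

```python
import numpy as np

def smooth_state(t, t_obs, nu_hat, h):
    """Kernel-weighted smoother: nu_tilde(t) = sum_k W_h(t - t_(k)) nu_hat(t_(k)),
    with Gaussian kernel weights normalized to sum to one."""
    k = np.exp(-0.5 * ((t - t_obs) / h) ** 2)
    return np.sum(k * nu_hat) / np.sum(k)

def cumulative_mean(tau, t_obs, nu_hat, h, n_grid=200):
    """mu_tilde(tau) = integral_0^tau nu_tilde(s) ds via the trapezoidal rule."""
    grid = np.linspace(0.0, tau, n_grid)
    vals = np.array([smooth_state(s, t_obs, nu_hat, h) for s in grid])
    return float(np.sum((vals[1:] + vals[:-1]) * np.diff(grid)) / 2.0)
```

Because the weights are normalized, a constant $\hat\nu$ is reproduced exactly at every $t$, and its cumulative mean grows linearly in $\tau$, which matches the definition of $\tilde\mu$.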
Now, we introduce some notation in order to derive the consistency of the proposed estimator. Let $Q(t) = \{x(t), w\}$ denote all the observed covariate history processes, such as baseline information and observation time; $\bar L_Q(t) = \{Q(s): s < t\}$ denotes the longitudinal covariate history up to time $t$, and $\bar L_Y(t) = \{Y(s): s < t\}$ denotes the mark state response history up to time $t$. $\|\cdot\|$ is the Euclidean norm in the real space, and $Q'(t)$ is the derivative of $Q(t)$ with respect to time $t$. The following assumptions are imposed for the joint model.
Assumption 1.
$Q(t)$ is fully observed prior to time $t$; conditional on $\bar L_Q(t)$, $\bar L_Y(t)$, and $T \ge t$ for any $t \in [0, T]$, the distribution of $Q(t)$ depends only on $\bar L_Q(t)$. Furthermore, $Q(t)$ is continuously differentiable in $[0, T]$, and $\max_{t \in [0, T]} \|Q'(t)\| < \infty$ with probability one.
Assumption 2.
The intensity of the counting process $N^c(t)$, conditional on $\bar L_Q(t)$, $\bar L_Y(t)$, $Q(t)$, and $T \ge t$ for any $t < T$, depends only on $\bar L_Q(t)$ and $Q(t)$.
Assumption 3.
$P(X^T X \text{ is full rank})$ is positive. Moreover, if there exists a constant vector $R_0$ such that, with positive probability, $(w, X(t))^T R_0 = l(t)$ for a deterministic function $l(t)$ for all $t \in [0, T]$, then $R_0 = 0$ and $l(t) = 0$.
Assumption 4.
The true value of the parameter $\phi$, denoted by $\phi_0 = (\sigma_0^2, \beta^T, \gamma^T, \alpha)^T$, satisfies $\|\phi_0\| \le M_0$ and $\sigma_0^2 > M_0^{-1}$ for a known positive constant $M_0$.
Assumption 5.
The baseline hazard function h 0 ( t ) is bounded and positive in [ 0 , T ] .
Assumption 6.
There exists a positive constant $a > 0$ such that $S_T(L) \ge a$.
Now, we consider the asymptotic property of the estimator μ ˜ ( t ) .
Theorem 1.
Under Assumptions 1–6, the estimator $\tilde\mu(\tau)$ defined in Equation (7) is a consistent estimator of $\mu(\tau)$ for any $\tau \in [0, T]$.
The proof can be found in Appendix A.

3. Simulation

In this section, we present some numerical results. The joint model used in our simulation study is:
$$Y(t) = \beta x(t) + g(t) + b_0 + b_1 z_1(t) + \epsilon(t), \quad h(t) = h_0(t) \exp\{\gamma w + \alpha (Y(t) - \epsilon(t))\},$$
where $Y(t)$ is the mark state function, $x(t) = (x_1(t), \ldots, x_p(t))$ is the vector of time-dependent covariates for the fixed parameters $\beta = (\beta_1, \ldots, \beta_p)$, and $p$ is the dimension of $x(t)$. $g(t)$ is a smooth and bounded function, $z(t) = (1, z_1(t))$ is the vector of time-dependent covariates for the random effects coefficients $b = (b_0, b_1)$, $w$ is the time-independent covariate for the fixed parameter $\gamma$, and $\alpha$ is the fixed parameter for the association between the longitudinal outcome and the hazard rate of event occurrence.
The accuracy of the overall parametric estimates $\hat\beta, \hat\alpha, \hat\gamma$ is evaluated using the bias and standard deviation (std.dev). We use $g(t) = \eta^T B(t)$ with $\eta = (\eta_0, \eta_1, \eta_2, \eta_3)$, $B(t) = [1, t, t^2, t^3]^T$ to simulate the smooth function. The performance of the overall non-parametric estimate $\hat g(t)$ is evaluated by the sample standard deviation (std.dev) and the square root of the average squared errors (RASE), with $\mathrm{RASE} = \big[\frac{1}{n}\sum_{i=1}^n \big(\hat g(t_i) - g(t_i)\big)^2\big]^{1/2}$.
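The RASE criterion is straightforward to compute; a small sketch (the grid and the test functions below are illustrative):

```python
import numpy as np

def rase(g_hat, g_true, t_grid):
    """Square root of the average squared error between the fitted and true curves."""
    diff = g_hat(t_grid) - g_true(t_grid)
    return float(np.sqrt(np.mean(diff ** 2)))
```

For instance, a fit that is off by a constant 1 everywhere has RASE exactly 1, regardless of the grid.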
We summarize the steps in the following Algorithm 1:   
Algorithm 1: Estimation for the cumulative state function
Input: The sample size $n$; the covariates $x$, $w$, $z(t)$; the baseline hazard $h_0(t)$; the true function $g(t)$; the true values of the parameters $\beta, \alpha, \gamma$; the censoring rate $r$.
Process:
Generate a random sample $u_i \sim U[0, 1]$;
Generate a random sample $x_i \sim N(\mu, \sigma^2)$;
Generate a random sample $(b_{0i}, b_{1i}) \sim N\big((0, 0), \Sigma\big)$, where $\Sigma = \begin{pmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{pmatrix}$;
Generate a lifetime random sample $v_i$ with the hazard function $h(t)$;
Generate a random censoring sample $C_i \sim U[a, b]$;
Set $t_i = \min\{v_i, c_i\}$, $\delta_i = I\{v_i \le c_i\}$;
Generate the response variables $y(t_i) = \beta_0 + \beta_1 s + \beta x_i + g(t_i) + b_{0i} + b_{1i} z(t_i)$.
Output: The estimated function g ^ ( t ) , the estimated value of parameters β ^ , α ^ , γ ^ , the bias and the std.err of parameters β , α , γ , the RASE of estimated function, the std.err of η .
Estimator:
Give the estimation of parameters β , α , γ and the estimation of parameters in g ( t ) with the likelihood method.
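The data-generation steps of Algorithm 1 can be sketched as follows. To keep the event-time draw in closed form, this sketch holds the hazard's linear predictor fixed at $t = 0$, which is a simplification of the time-varying hazard in the paper; all numeric defaults are illustrative.

```python
import numpy as np

def simulate(n, beta=0.9, gamma=1.0, alpha=0.25, h0=0.1,
             sigma=(0.3, 0.2), rho=0.3, cens=(2.0, 10.0), seed=0):
    """One synthetic data set following the steps of Algorithm 1, with the
    linear predictor frozen at t = 0 so inverse-transform sampling applies."""
    rng = np.random.default_rng(seed)
    x = rng.normal(2.0, 1.0, n)                        # covariate x_i ~ N(2, 1)
    w = rng.uniform(0.0, 1.0, n)                       # covariate w_i ~ U[0, 1]
    s1, s2 = sigma
    cov = np.array([[s1**2, rho * s1 * s2],
                    [rho * s1 * s2, s2**2]])
    b = rng.multivariate_normal([0.0, 0.0], cov, n)    # random effects (b0, b1)
    lp = gamma * w + alpha * (beta * x + b[:, 0])      # linear predictor at t = 0
    u = rng.uniform(0.0, 1.0, n)
    v = -np.log(u) / (h0 * np.exp(lp))                 # inverse-transform event times
    c = rng.uniform(cens[0], cens[1], n)               # censoring times C_i ~ U[a, b]
    t_obs = np.minimum(v, c)                           # t_i = min(v_i, c_i)
    delta = (v <= c).astype(int)                       # event indicator
    return x, w, b, t_obs, delta
```

Under a constant hazard $h_0 e^{\mathrm{lp}}$, the survival function is $\exp(-h_0 e^{\mathrm{lp}} t)$, so $-\log(U)/(h_0 e^{\mathrm{lp}})$ with $U \sim U[0,1]$ has exactly that distribution; the time-varying case in the paper generally requires numerical inversion instead.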
Now, we consider the following two scenarios:
Scenario 1: Set x ( t ) = x 1 , x 1 N ( 2 , 1.0 ) , z ( t ) = t , g ( t ) = 1.5 t + 1.2 t 0.5 , w U [ 0 , 1 ] and h 0 ( t ) = h 0 . The following is the form based on the joint model:
Y ( t ) = β 1 x 1 + 1.5 t + 1.2 t 0.5 + b 0 + b 1 t + ϵ ( t ) , h ( t ) = h 0 ( t ) exp { γ w + α ( β 1 x 1 + 1.5 t + 1.2 t 0.5 + b 0 + b 1 t ) } .
Scenario 2: Set x ( t ) = x 1 , x 1 N ( 2 , 1.0 ) , z ( t ) = t , g ( t ) = 2.5 t + exp { t 6 ( 1 + t 30 ) 2 } + ( t 30 0.75 ) 0.5 , w U [ 0 , 1 ] and h 0 ( t ) = h 0 . The following is the form based on the joint model:
Y ( t ) = β 1 x 1 + 2.5 t + exp { t 6 ( 1 + t 30 ) 2 } + ( t 30 0.75 ) 0.5 + b 0 + b 1 t + ϵ ( t ) , h ( t ) = h 0 ( t ) exp { γ w + α ( β 1 x 1 + 2.5 t + exp { t 6 ( 1 + t 30 ) 2 } + ( t 30 0.75 ) 0.5 + b 0 + b 1 t ) } .
After 1000 repetitions in R, we obtain the results for the estimation of the parametric and non-parametric functions at sample sizes of $n = 125$, 250, and 500. In scenario 1, we set the censoring rate to be about 25%. The random effect $b_i$ follows a Gaussian distribution with $\rho = 0.3$, $\sigma_1 = 0.3$, and $\sigma_2 = 0.2$, and the error $\epsilon_i$ follows $\epsilon_i \sim N(0, 1)$. The true values of the parameters $\alpha, \beta, \gamma$ are chosen to be 0.25, 0.9, and 1.0. In scenario 2, we set the censoring rate to be about 30%. The random effect $b_i$ follows a Gaussian distribution with $\rho = 0.03873$, $\sigma_1 = 0.03873$, and $\sigma_2 = 0.02582$, and the error $\epsilon_i$ follows $\epsilon_i \sim N(0, 1)$. The true values of the parameters $\alpha, \beta, \gamma$ are chosen to be 0.5, 0.9, and 1.0.
Table 1 and Table 2 summarize the main findings over 1000 simulations. As the tables show, the biases are small and decrease as the sample size increases; the empirical performance improves with a larger sample size.
Table 3 and Table 4 present the results for the estimation of g ( t ) of scenario 1 and scenario 2. The results show that rather large standard deviations of η 0 may be incurred, although the RASE of η is pretty small. As the sample size increases, such biases may attenuate.
The proposed estimators of the real functions $g(t)$ under the two different settings are illustrated as continuous curves in Figure 1 and Figure 2. Additionally, Figure 3 and Figure 4 show the true curves and fitted values of the state function $\nu(t)$ and the mean of the cumulative variable $\mu(t)$ for scenario 1 and scenario 2, respectively. In all figures, the estimated functions approximate the true functions closely. Thus, the polynomial basis regression splines method works well for both simple linear functions and complex exponential functions.

4. An Application to MADIT Data

In the previous discussion, we presented an efficient estimation for the state function from subjects observed at discrete time points. Here, we validate our proposed method with the MADIT data. A total of 181 patients from 36 centers in the USA were enrolled in MADIT.
Of the 181 patients, 89 were implanted with cardiac defibrillators, while the other 92 were not. For convenience, we encode them as ICD = 1 or 0. The effect of the ICD treatment is considered in the survival model because it did not directly generate any medical costs but directly affected life expectancy in the survival part. These patients have six types of costs:
  • Hospitalization and emergency department visits;
  • Outpatient tests and procedures;
  • Physician/specialist visits;
  • Community services;
  • Medical supplies;
  • Medication.
These costs can be incurred daily from the start to the completion of the trial. It should be pointed out that the R code does not work on the daily cost data because of the large number of data points. Hence, we used 12 days as a time unit. These covariates should be considered in the longitudinal process sub-model. We used the log-cost as the price process $Y(t)$.
These data also include the patients’ ID code, observed survival time in days, and death indicator.
In the data set, a total of 181 patients have been encoded as follows:
  • ID code (from 1 to 181);
  • Treatment code (1 for ICD and 0 for conventional);
  • Observed survival time;
  • Death indicator (1 for death, 0 for censored);
  • Daily cost of the whole trial;
  • Cost type 1–6 of patients at all observed time points.
Now, the merged data points consist of 11,321 observed points for a total of 181 subjects.
To analyze this data set, the model can be described as follows:
$$Y_i(t) = \beta_1 x_{1i}(t) + \beta_2 x_{2i}(t) + \beta_3 x_{3i}(t) + \beta_4 x_{4i}(t) + \beta_5 x_{5i}(t) + \beta_6 x_{6i}(t) + g(t) + b_0 + b_1 x_{7i}(t) + \epsilon_i(t), \quad h_i(t) = h_0(t) \exp\{\gamma w_i + \alpha (Y_i(t) - \epsilon_i(t))\},$$
where $x_{ri}(t) = 1$ if the cost at time $t$ is of type $r$ and $x_{ri}(t) = 0$ otherwise, for $r = 1, 2, \ldots, 6$; $w_i = 1$ for the ICD group and $w_i = 0$ for the conventional group. We used $\eta^T B(t)$ with $\eta = (\eta_0, \eta_1, \eta_2, \eta_3)$, $B(t) = [1, t, t^2, t^3]^T$ to simulate the smooth function $g(t)$.
We first used the JM package in R with the longitudinal data and survival data to obtain the estimated vector ( β ^ 0 , , β ^ 6 ) for the fixed coefficients, the estimated parameter α ^ for the association parameter, the estimated parameter γ ^ for the time-independent regression parameter, and the estimated vector ( η ^ 0 , , η ^ 3 ) for the non-parametric function g ( t ) . The estimated mean daily medical cost ν ( t ) at observed time points can be figured out in terms of the proposed model in Equation (10), and then a smooth function for ν ( t ) can be obtained using the kernel method. Finally, the cumulative mean medical cost μ ( t ) can be calculated by integrating ν ( t ) .
The estimated coefficients for the parameters and the function are presented in Table 5 and Table 6. As one can see, all estimated parameters are significant even at a level of 0.0001.
Table 7 shows the estimated results of the event process. The association between the medical cost and the hazard rate is positive, which means that higher medical costs correspond to lower survival rates; as we know, a patient incurs higher medical costs if they are severely ill. This further demonstrates that our model is efficient and reasonable.
Parametric regression estimation for the joint model [13] describes the relationship between the natural logarithm of medical costs and the time unit through the fixed coefficient $\beta_7$ of time $t$ in the model $Y_i(t) = \beta_1 x_{1i}(t) + \beta_2 x_{2i}(t) + \beta_3 x_{3i}(t) + \beta_4 x_{4i}(t) + \beta_5 x_{5i}(t) + \beta_6 x_{6i}(t) + \beta_7 t + b_0 + b_1 t + \epsilon_i(t)$. In this case, $\beta_7$ turns out to be $-0.001389$, which is negative. In comparison, the non-parametric part in the proposed model can describe such a relationship as a precise decreasing function, as Figure 5 shows, and not just as a negative correlation. Moreover, the function decreases over time, which is consistent with the result in [13] and further shows that the model we proposed is feasible.
Compared with the Kaplan–Meier estimator, the estimated marginal survival function of the survival time T is smoother, as Figure 6 shows. The solid line is the estimated survival rate, and the dotted line represents the 95 % confidence interval of the estimated survival rate.
Figure 7 illustrates the fitted points for the mean medical cost. Figure 8 gives the fitted points for the cumulative mean medical cost, which increases linearly with unit time. Over time, the patient’s cumulative medical cost increases. The estimates suggest that this increase is almost linear and show the rate of increase in medical costs per unit time. Table 8 presents the estimated cumulative cost for 5 years and the total treatment period. In practical situations, the treatment a patient was going to select could not be prognosticated. The results can provide government agencies or insurance companies with a statistical decision recommendation.

5. Discussion

The history process is closely related to the quality-adjusted life of patients. The academic community has carried out some work on the estimation of medical costs using the joint model, but it has paid little attention to the complex nonlinear relationships that may exist between the longitudinal data and covariates, and there is not much discussion of methods for estimating the cumulative mean state function.
Therefore, this paper examines the existing estimation methods and then introduces a novel semi-parametric estimation based on the joint-model technique to assist governments or insurance companies in better predicting average costs and survival probabilities. First, the fitted values $\hat Y(t_{ij})$ for the state process $Y(t)$ at discrete times are obtained using the semi-parametric model, and the mean medical cost $\nu(t)$ is estimated at the observed time points $t_{ij}$ accordingly, using the inverse probability weighting method. Then, using the kernel function method, a smooth estimate of $\nu(t)$ is derived. Finally, the cumulative mean state function $\mu(t)$ can be estimated at any time point because $\nu(t)$ can be calculated at any time point.
Although the results for the estimation of the longitudinal process in this paper demonstrate that the proposed method performs well, the estimation of the survival process is not satisfactory. The biases of the time-independent covariate coefficients are considerably large, and the performance is only improved when the sample size is large enough. Meanwhile, although the standard deviation decreases as the sample size increases, it is still not very small. In the future, researchers can pay attention to the modeling of the survival process by considering the nonlinear survival function or the time-varying association coefficient.
To sum up, the proposed estimator is more robust than the traditional models because a non-parametric part is introduced into the longitudinal process and the kernel function method is added to the estimation of the cumulative mean state function. The consistency of the proposed estimator and the numerical studies support this conclusion.

Author Contributions

Conceptualization, S.L., D.D. and Y.H.; methodology, S.L. and D.D.; software, S.L. and D.D.; validation, D.D., Y.H. and D.Z.; formal analysis, S.L.; investigation, S.L.; resources, D.D.; data curation, S.L. and D.D.; writing—original draft preparation, S.L.; writing—review and editing, D.D.; visualization, Y.H. and D.Z.; supervision, D.D. and Y.H.; project administration, D.D.; funding acquisition, Y.H. All authors have read and agreed to the published version of the manuscript.

Funding

Han’s work is partially supported by the National Natural Science Foundation of China (grant number 11871244); Deng’s work is partially supported by the Natural Sciences and Engineering Research Council of Canada.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Proof of Theorem 1.
At first, we define $\tilde Y(t) = Y(t) - z(t)^T b = X(t)^T \beta + g(t) + \epsilon(t)$. From Equation (7), we have:
$$\tilde\nu(t) - \nu(t) = \frac{1}{m} \sum_{i=1}^m \frac{I\{T_i^* \ge t\}}{\hat S_{T^*}(t)} \hat Y_i(t) - E Y(t) = \frac{1}{m} \sum_{i=1}^m \frac{I\{T_i^* \ge t\}}{\hat S_{T^*}(t)\, S_{T^*}(t)} \hat Y_i(t) \big[S_{T^*}(t) - \hat S_{T^*}(t)\big] + \frac{1}{m} \sum_{i=1}^m \frac{I\{T_i^* \ge t\}}{S_{T^*}(t)} \big(\hat Y_i(t) - \tilde Y_i(t)\big) + \frac{1}{m} \sum_{i=1}^m \frac{I\{T_i^* \ge t\}}{S_{T^*}(t)} \tilde Y_i(t) - E Y(t) = (I) + (II) + (III).$$
It suffices to show that $(I)$, $(II)$, and $(III)$ converge to 0 as $m\to\infty$.
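For intuition about the estimator $\tilde{\nu}(t)$ appearing in the decomposition above, the following sketch (not from the paper; function names are illustrative, and the empirical survivor function is used as a stand-in for the product-limit estimator $\hat{S}_{T^{*}}$) computes the survival-weighted average $\frac{1}{m}\sum_i I\{T_i^{*}\ge t\}\hat{Y}_i(t)/\hat{S}_{T^{*}}(t)$:

```python
import numpy as np

def nu_hat(t, obs_times, y_hat_at_t):
    """Survival-weighted average (1/m) * sum_i I{T*_i >= t} / S_hat(t) * Yhat_i(t).

    obs_times are the observed times T*_i, and y_hat_at_t holds the fitted
    responses Yhat_i(t) at time t.  S_hat(t) is taken here as the empirical
    survivor function of the observed times, an illustrative stand-in for
    the Kaplan-Meier-type estimator used in the paper."""
    obs_times = np.asarray(obs_times, dtype=float)
    y = np.asarray(y_hat_at_t, dtype=float)
    at_risk = obs_times >= t          # indicators I{T*_i >= t}
    s_hat = at_risk.mean()            # empirical P(T* >= t)
    return np.mean(at_risk * y / s_hat)
```

Because $m\,\hat{S}_{T^{*}}(t)$ equals the number of subjects still at risk under this stand-in, the estimator reduces to the average fitted response among the at-risk subjects, which is the re-weighting that corrects for the shrinking risk set.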
We have assumed that $\epsilon(t)$ is independent of $T$ conditional on $X(t)$ and $W$; thus, $\tilde{Y}(t)=X(t)\beta+g(t)+\epsilon(t)$ is independent of the terminal time $T$, and $E\tilde{Y}(t)=EY(t)$. Moreover, we have:
$$
E\left[\frac{I\{T_i^{*}\ge t\}}{S_{T^{*}}(t)}\tilde{Y}_i(t)\right]
=\frac{E\big(I\{T_i^{*}\ge t\}\big)}{S_{T^{*}}(t)}\,E\tilde{Y}_i(t)=EY(t).
$$
Then, from the law of large numbers, we have:
$$
(III)=\frac{1}{m}\sum_{i=1}^{m}\left[\frac{I\{T_i^{*}\ge t\}}{S_{T^{*}}(t)}\tilde{Y}_i(t)-EY(t)\right]\to 0
\quad a.s.\ \text{as }m\to\infty.
$$
For the second term ( I I ) , we have:
$$
\begin{aligned}
(II)&=\frac{1}{m}\sum_{i=1}^{m}\frac{I\{T_i^{*}\ge t\}}{S_{T^{*}}(t)}\big(\hat{Y}_i(t)-\tilde{Y}_i(t)\big)\\
&=\frac{1}{m}\sum_{i=1}^{m}\frac{I\{T_i^{*}\ge t\}}{S_{T^{*}}(t)}x_i(t)(\hat{\beta}-\beta)
+\frac{1}{m}\sum_{i=1}^{m}\frac{I\{T_i^{*}\ge t\}}{S_{T^{*}}(t)}\big(\hat{g}(t)-g(t)\big)
+\frac{1}{m}\sum_{i=1}^{m}\frac{I\{T_i^{*}\ge t\}}{S_{T^{*}}(t)}\big(-\epsilon_i(t)\big)\\
&\equiv (IV)+(V)+(VI).
\end{aligned}
$$
From Assumption (a1), $\|x_i(t)\|\le R_0$ for some positive constant $R_0$, and, based on Theorem 3.1 in Zeng and Cai [19] and Assumption (a6), we have:
$$
(IV)\le\frac{1}{m}\sum_{i=1}^{m}\frac{I\{T_i^{*}\ge t\}}{S_{T}(t)S_{C}(t)}\|x_i(t)\|\,\|\hat{\beta}-\beta\|
\le\frac{C_0}{a^{2}}\|\hat{\beta}-\beta\|\to 0
\quad a.s.\ \text{as }m\to\infty.
$$
Moreover, as assumed, $g(t)$ is a smooth and bounded function. By Taylor expansion, $g(t)=g(t_0)+g'(t_0)(t-t_0)+\frac{1}{2!}g''(t_0)(t-t_0)^{2}+\frac{1}{3!}g'''(t_0)(t-t_0)^{3}+O\big((t-t_0)^{4}\big)$. Hence, we have:
$$
(V)=\frac{1}{m}\sum_{i=1}^{m}\frac{I\{T_i^{*}\ge t\}}{S_{T}(t)S_{C}(t)}\big(\hat{\eta}B(t)-g(t)\big)\to 0.
$$
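The polynomial approximation $\hat{\eta}B(t)$ with cubic basis $B(t)=(1,t,t^{2},t^{3})$ used in the term $(V)$ can be sketched as an ordinary least-squares fit; the function names below are illustrative, not the authors' code:

```python
import numpy as np

def fit_cubic(t, g_vals):
    """Least-squares fit of g(t) ~ eta0 + eta1*t + eta2*t^2 + eta3*t^3,
    i.e. the coefficient vector eta for the basis B(t) = (1, t, t^2, t^3)."""
    B = np.vander(np.asarray(t, dtype=float), N=4, increasing=True)
    eta, *_ = np.linalg.lstsq(B, np.asarray(g_vals, dtype=float), rcond=None)
    return eta

def eval_cubic(eta, t):
    """Evaluate the fitted polynomial eta^T B(t) on a time grid t."""
    B = np.vander(np.asarray(t, dtype=float), N=4, increasing=True)
    return B @ eta
```

When $g$ is itself a cubic, the fit recovers the coefficients exactly; for a general smooth $g$, the Taylor expansion above shows the approximation error is $O\big((t-t_0)^{4}\big)$ locally.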
Also, from the law of large numbers,
$$
(VI)=\lim_{m\to\infty}\frac{1}{m}\sum_{i=1}^{m}\frac{I\{T_i^{*}\ge t\}}{S_{T^{*}}(t)}\big(-\epsilon_i(t)\big)
=E\left[\frac{I\{T^{*}\ge t\}}{S_{T^{*}}(t)}\big(-\epsilon(t)\big)\right]
=\frac{E\big(I\{T^{*}\ge t\}\big)}{S_{T^{*}}(t)}\,E\big(-\epsilon(t)\big)=0
\quad a.s.
$$
Now, we consider the first term ( I ) :
$$
(I)=\frac{1}{m}\sum_{i=1}^{m}\frac{I\{T_i^{*}\ge t\}}{\hat{S}_{T}(t)S_{T}(t)\hat{S}_{C}(t)S_{C}(t)}x_i(t)\hat{\beta}\,\big[S_{T^{*}}(t)-\hat{S}_{T^{*}}(t)\big]
+\frac{1}{m}\sum_{i=1}^{m}\frac{I\{T_i^{*}\ge t\}}{\hat{S}_{T}(t)S_{T}(t)\hat{S}_{C}(t)S_{C}(t)}\hat{g}(t)\big[S_{T^{*}}(t)-\hat{S}_{T^{*}}(t)\big]
\equiv (VII)+(VIII).
$$
Similarly, based on Assumption (a4), Theorem 3.1 in Zeng and Cai [19], and Theorem 2 in Phadia and Ryzin [20], we have:
$$
\begin{aligned}
(VII)&\le\frac{1}{m}\sum_{i=1}^{m}\frac{I\{T_i^{*}\ge t\}}{\hat{S}_{T}(t)S_{T}(t)\hat{S}_{C}(t)S_{C}(t)}\|x_i(t)\|\,\|\hat{\beta}\|\,\big|S_{T^{*}}(t)-\hat{S}_{T^{*}}(t)\big|\\
&\le\frac{M_0C_0^{2}}{a^{4}}\big|S_{T}(t)S_{C}(t)-\hat{S}_{T}(t)\hat{S}_{C}(t)\big|\\
&\le\frac{M_0C_0^{2}}{a^{4}}\Big(S_{T}(t)\big|S_{C}(t)-\hat{S}_{C}(t)\big|+\hat{S}_{C}(t)\big|S_{T}(t)-\hat{S}_{T}(t)\big|\Big)\to 0
\quad a.s.\ \text{as }m\to\infty.
\end{aligned}
$$
Obviously, $\hat{g}(t)$ is a bounded function, and thus:
$$
(VIII)\le\frac{1}{m}\sum_{i=1}^{m}\frac{I\{T_i^{*}\ge t\}}{\hat{S}_{T}(t)S_{T}(t)}\hat{g}(t)\big|S_{T}(t)-\hat{S}_{T}(t)\big|
\le\frac{1}{a^{4}}\big|S_{T}(t)-\hat{S}_{T}(t)\big|\to 0
\quad a.s.\ \text{as }m\to\infty.
$$
Hence, we have that $(I)\to 0$ as $m\to\infty$. □
This completes the proof of Theorem 1.

References

  1. Rizopoulos, D. Joint Models for Longitudinal and Time-to-Event Data: With Applications in R; CRC Press: Boca Raton, FL, USA, 2012.
  2. Kalbfleisch, J.D.; Prentice, R.L. The Statistical Analysis of Failure Time Data; John Wiley & Sons: Hoboken, NJ, USA, 2011.
  3. Lagakos, S.W. General Right Censoring and Its Impact on the Analysis of Survival Data. Biometrics 1979, 35, 139–156.
  4. Lin, D.Y.; Feuer, E.J.; Etzioni, R.; Wax, Y. Estimating Medical Costs from Incomplete Follow-Up Data. Biometrics 1997, 53, 419–434.
  5. Huang, Y.; Lovato, L. Tests for lifetime utility or cost via calibrating survival time. Stat. Sin. 2002, 12, 707–723.
  6. Bang, H.; Tsiatis, A. Estimating medical costs with censored data. Biometrika 2000, 87, 329–343.
  7. Zhao, H.; Tsiatis, A.A. Estimating Mean Quality Adjusted Lifetime with Censored Data. Sankhyā Indian J. Stat. Ser. 2000, 62, 175–188.
  8. Zhao, H.; Tian, L. On Estimating Medical Cost and Incremental Cost-Effectiveness Ratios with Censored Data. Biometrics 2001, 57, 1002–1008.
  9. Korn, E.L. On estimating the distribution function for quality of life in cancer clinical trials. Biometrika 1993, 80, 535–542.
  10. Strawderman, R.L. Estimating the Mean of an Increasing Stochastic Process at a Censored Stopping Time. J. Am. Stat. Assoc. 2000, 95, 1192–1208.
  11. Fang, H.B.; Wang, J.; Deng, D.; Tang, M.L. Estimating the mean of a mark variable under right censoring on the basis of a state function. Comput. Stat. Data Anal. 2011, 55, 1726–1735.
  12. Liu, L.; Wolfe, R.A.; Kalbfleisch, J.D. A shared random effects model for censored medical costs and mortality. Stat. Med. 2007, 26, 139–155.
  13. Deng, D. Estimating the cumulative mean function for history process with time-dependent covariates and censoring mechanism. Stat. Med. 2016, 35, 4624–4636.
  14. Zhao, X.; Tong, X.; Sun, L. Joint analysis of longitudinal data with dependent observation times. Stat. Sin. 2012, 22, 317–336.
  15. Li, C.; Xiao, L.; Luo, S. Joint model for survival and multivariate sparse functional data with application to a study of Alzheimer’s Disease. Biometrics 2022, 78, 435–447.
  16. Do, H.; Nandi, S.; Putzel, P.; Smyth, P.; Zhong, J. A Joint Fairness Model with Applications to Risk Predictions for Under-represented Populations. Biometrics 2022, 79, 826–840.
  17. Tang, A.M.; Peng, C.; Tang, N. Semiparametric normal transformation joint model of multivariate longitudinal and bivariate time-to-event data. Stat. Med. 2023, 42, 5491–5512.
  18. Xu, C.; Baines, P.D.; Wang, J.L. Standard error estimation using the EM algorithm for the joint modeling of survival and longitudinal data. Biostatistics 2014, 15, 731–744.
  19. Zeng, D.; Cai, J. Asymptotic Results for Maximum Likelihood Estimators in Joint Analysis of Repeated Measurements and Survival Time. Ann. Stat. 2005, 33, 2132–2163.
  20. Phadia, E.G.; Ryzin, J.V. A Note on Convergence Rates for the Product Limit Estimator. Ann. Stat. 1980, 8, 673–678.
Figure 1. Estimated function g(t) for scenario 1.
Figure 2. Estimated function g(t) for scenario 2.
Figure 3. True and estimated values of the functions ν(t) and μ(t) for scenario 1.
Figure 4. True and estimated values of the functions ν(t) and μ(t) for scenario 2.
Figure 5. Estimated function of g(t).
Figure 6. Marginal survival compared with the Kaplan–Meier estimator.
Figure 7. Fitted points of cost.
Figure 8. Fitted points of cumulative cost.
Table 1. The estimates of parameters in the joint model for scenario 1.

                        n = 125             n = 250             n = 500
Parameter   True     Bias      Std. Err.   Bias     Std. Err.   Bias     Std. Err.
β           −0.9    −0.0250    0.0459      0.0164   0.0634      0.0100   0.0387
α            0.25    0.0058    0.0477      0.0161   0.0769      0.0321   0.0493
γ            1.0     0.1550    0.3130      0.0824   0.1460      0.0429   0.1560
Table 2. The estimates of parameters in the joint model for scenario 2.

                        n = 125             n = 250             n = 500
Parameter   True     Bias      Std. Err.   Bias     Std. Err.   Bias     Std. Err.
β           −0.9    −0.0209    0.0226      0.0022   0.0353      0.0121   0.0121
α            0.5     0.0148    0.0164      0.0191   0.0111      0.0187   0.0085
γ            1.0     0.0193    0.3600      0.0265   0.2280      0.0610   0.1670
Table 3. The estimates of the non-parametric components in the joint model for scenario 1.

         n = 125       n = 250      n = 500
RASE     0.4195809     0.141695     0.133229
η0       0.1100        0.2180       0.0547
η1       0.0422        0.0714       0.0216
η2       0.0486        0.0934       0.0255
η3       0.0164        0.0315       0.0087
Table 4. The estimates of the non-parametric components in the joint model for scenario 2.

         n = 125       n = 250       n = 500
RASE     0.1743747     0.1637406     0.1599371
η0       0.1380        0.0808        0.0577
η1       0.0548        0.1430        0.0216
η2       0.0580        0.0668        0.0262
η3       0.0234        0.0953        0.0259
Table 5. Estimated results of the longitudinal process.

Parameter    Value     Std. Err.   p-Value
Type 1        4.1596   0.0203      <0.0001
Type 2        1.2453   0.0605      <0.0001
Type 3        0.5954   0.0138      <0.0001
Type 4        0.2598   0.0276      <0.0001
Type 5       −0.3873   0.0725      <0.0001
Type 6        1.2308   0.0517      <0.0001
Table 6. Estimated results of g(t).

Parameter    Value     Std. Err.   p-Value
η0            4.9650   0.0796      <0.0001
η1           −0.0237   0.0017      <0.0001
η2            0.0037   0.0002      <0.0001
η3           −0.0002   0.0000      <0.0001
Table 7. Estimated results of the event process.

Parameter      Value     Std. Err.   p-Value
Treatment     −1.1999    0.3369      0.0004
Association    0.2499    0.1776      0.1594
Table 8. Estimated cumulative costs for 5 years and the total period.

Year 1      Year 2      Year 3      Year 4      Year 5      Total
23,256.29   44,811.79   68,177.86   97,343.17   120,904.7   121,210.8