Article

Bayesian Variable Selection and Estimation in Semiparametric Simplex Mixed-Effects Models with Longitudinal Proportional Data

1 Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming 650091, China
2 Department of Mathematics and Statistics, Guizhou University of Finance and Economics, Guiyang 550025, China
3 College of Mathematics and Information Science, Guiyang University, Guiyang 550005, China
* Author to whom correspondence should be addressed.
Entropy 2022, 24(10), 1466; https://doi.org/10.3390/e24101466
Submission received: 6 September 2022 / Revised: 8 October 2022 / Accepted: 9 October 2022 / Published: 14 October 2022
(This article belongs to the Special Issue Statistical Methods for Modeling High-Dimensional and Complex Data)

Abstract

In the development of simplex mixed-effects models, the random effects are generally assumed to follow a normal distribution. This normality assumption may be violated when analyzing skewed and multimodal longitudinal data. In this paper, we adopt the centered Dirichlet process mixture model (CDPMM) to specify the random effects in simplex mixed-effects models. Combining the block Gibbs sampler and the Metropolis–Hastings algorithm, we extend the Bayesian Lasso (BLasso) to simultaneously estimate the unknown parameters of interest and select important covariates with nonzero effects in semiparametric simplex mixed-effects models. Several simulation studies and a real example are employed to illustrate the proposed methodologies.

1. Introduction

Various mixed-effects models based on the simplex distribution have become increasingly popular tools for analyzing longitudinal continuous proportional data in biological, medical and clinical studies. Under the framework of generalized linear mixed models, Qiu et al. [1] developed a simplex generalized linear mixed model on the basis of penalized quasi-likelihood (PQL) and restricted maximum likelihood (REML) inference; Zhang and Wei [2] used maximum likelihood estimation, combining the stochastic approximation (SA) algorithm with the MCMC method, to fit simplex nonlinear mixed models; Zhao et al. [3] implemented an MCMC algorithm to obtain joint Bayesian estimates of simplex nonlinear mixed models; Bonat et al. [4] investigated likelihood analysis for a class of simplex mixed models with logit, probit, complementary log–log and Cauchy link functions; and Quintero [5] presented a sensitivity analysis for the variance parameters of the random effects in Bayesian simplex mixed models. The random effects in the abovementioned mixed-effects models are assumed to follow a multivariate normal distribution. In practice, however, the normality assumption is questionable when analyzing skewed, bimodal or heavy-tailed longitudinal data. It is therefore worthwhile to incorporate a semiparametric hierarchical structure for the random effects, via a Dirichlet process prior, into simplex mixed-effects models for longitudinal proportional data.
The nonparametric Bayesian approach based on a Dirichlet process (DP) prior for the random effects in mixed-effects models has received a lot of attention in recent years. For example, Kleinman and Ibrahim [6] used a Dirichlet process prior for the general distribution of the random effects in a generalized linear mixed model. As a variant of the Dirichlet process prior, the truncated Dirichlet process with stick-breaking priors has been widely incorporated into various mixed-effects models to specify the general distribution of the random effects. For example, Tang and Duan [7] used this approach in a semiparametric Bayesian analysis of generalized partial linear mixed models; Tang and Zhao [8] applied it to nonlinear reproductive dispersion mixed models; and Zhao et al. [9] employed it in a semiparametric Bayesian analysis of logistic mixed-effects models for binomial data. In particular, Duan et al. [10] used a truncated and centered Dirichlet process prior to specify the random effects in a semiparametric reproductive dispersion mixed model. However, the abovementioned DP with stick-breaking priors for the random effects is inappropriate when the underlying density of the random effects is continuous. In addition, this type of variant of the Dirichlet process prior is rather time-consuming for complicated models. Therefore, to address these issues, the goal of this paper is to propose a new semiparametric simplex mixed-effects model in which the random-effects distribution is specified by the centered Dirichlet process mixture model (CDPMM).
Although various methodologies have been developed for statistical inference in the aforementioned simplex mixed-effects models, little work has been performed on variable selection for such models. Classical model-selection methods, such as the step-wise selection method [11], model comparison via Bayes factors [12], the Akaike information criterion [13] and the deviance information criterion [14], are often used to identify important covariates in regression analysis; however, these approaches are generally computationally intensive and unstable for complicated mixed models with many covariates. On the other hand, regularization (penalization) methods have become increasingly popular tools for variable selection in regression analysis. Commonly used regularization methods in the context of linear regression include the least absolute shrinkage and selection operator (Lasso) [15], the elastic net [16] and the adaptive Lasso [17]. In addition, Park and Casella [18] proposed the Bayesian version of the Lasso (BLasso) by assigning a conditional Laplace prior to the regression coefficients and a gamma prior to the shrinkage parameter under the Bayesian framework. The BLasso procedure has been extended to various complex models, including semiparametric structural equation models [19] and semiparametric joint models of multivariate longitudinal and survival data [20]. In particular, Erd et al. [21] pointed out that Bayesian penalization methods perform similarly to, and sometimes even better than, frequentist penalization methods, since Bayesian penalization methods can easily provide credible intervals (CIs) for the parameters of interest and obtain an estimate of the penalty parameter by assigning it an appropriate prior distribution. Therefore, the other main purpose of this paper is to extend the BLasso procedure to the considered semiparametric simplex mixed-effects models.
The paper is organized as follows: In Section 2, we propose a new semiparametric simplex mixed-effects model with random effects following the centered Dirichlet process mixture model (CDPMM) and incorporate a BLasso procedure into the proposed model. The required conditional distributions are derived in Section 3. Simulation studies and a real example are used to illustrate the proposed methodologies in Section 4. Some concluding remarks are given in Section 5.

2. Model and Notation

The simplex distribution was first proposed by Barndorff-Nielsen and Jørgensen [22]; its probability density function is specified as
$$p(y;\mu,\sigma^2)=\begin{cases}\left[2\pi\sigma^2\{y(1-y)\}^3\right]^{-1/2}\exp\left\{-\dfrac{d(y;\mu)}{2\sigma^2}\right\}, & 0<y<1,\\[4pt] 0, & \text{otherwise},\end{cases}\tag{1}$$
where $\mu\in(0,1)$ denotes the mean parameter, $\sigma^2>0$ represents the dispersion parameter, and $d(y;\mu)=\dfrac{(y-\mu)^2}{y(1-y)\mu^2(1-\mu)^2}$. For simplicity of notation, in the rest of this paper we write $y\sim S(\mu,\sigma^2)$ if a random variable $y$ follows a simplex distribution with mean parameter $\mu$ and dispersion parameter $\sigma^2$.
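As a quick numerical check on the simplex density above, the density and its unit deviance can be sketched in a few lines (a minimal illustration of ours, not code from the paper; the function names are our own):

```python
import numpy as np

def deviance(y, mu):
    """Unit deviance d(y; mu) of the simplex distribution."""
    return (y - mu) ** 2 / (y * (1.0 - y) * mu ** 2 * (1.0 - mu) ** 2)

def simplex_pdf(y, mu, sigma2):
    """Density of S(mu, sigma2): positive on (0, 1), zero elsewhere."""
    y = np.asarray(y, dtype=float)
    inside = (y > 0) & (y < 1)
    out = np.zeros_like(y)
    yy = y[inside]
    norm = np.sqrt(2.0 * np.pi * sigma2 * (yy * (1.0 - yy)) ** 3)
    out[inside] = np.exp(-deviance(yy, mu) / (2.0 * sigma2)) / norm
    return out
```

Integrating the density numerically over (0, 1) recovers 1, confirming that the boundary singularity of the normalizing factor is dominated by the deviance term.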
In the context of longitudinal data analysis, let $y_{ij}$ denote the longitudinal percentage outcome for the $i$th individual at the $j$th follow-up time $t_{ij}$, with $0<y_{ij}<1$, $i=1,\dots,n$, $j=1,\dots,n_i$. We assume that, given a $q\times 1$ vector of random effects $b_i$ for the $i$th individual, the responses $y_{ij}$ are conditionally independent and each $y_{ij}|b_i$ follows a simplex distribution with conditional mean $\mu_{ij}=E(y_{ij}|b_i)$ and constant dispersion parameter $\sigma^2$; that is, $y_{ij}|b_i\sim S(\mu_{ij},\sigma^2)$. Under the framework of GLMMs, the conditional mean is linked to the explanatory variables and random effects as follows:
$$f(\mu_{ij})\triangleq\eta_{ij}=x_{ij}^{T}\beta+z_{ij}^{T}b_i,\tag{2}$$
where the unknown, monotone link function $f(\cdot)$ is chosen as the logit link; $x_{ij}$ is a $(p+1)\times 1$ vector of covariates consisting of the constant 1 and time-dependent covariates observed at time point $t_{ij}$; $\beta$ is a $(p+1)\times 1$ vector of unknown regression parameters; and $z_{ij}$ is a $q\times 1$ vector of time-dependent variables, which may include some elements of $x_{ij}$, corresponding to the random effects $b_i$. In classical random-effects models, the random effects in (2) are generally assumed to follow a multivariate normal distribution, which may give rise to biased parameter estimates or even misleading conclusions. Thus, inspired by Ohlssen and Spiegelhalter [23], we use a DP mixture of normals to specify the random effects; that is, $b_i\overset{\text{i.i.d.}}{\sim}\sum_{g=1}^{\infty}\pi_g N_q(\mu_g,\Omega_g)$ with $(\mu_g,\Omega_g)\sim\mathcal{P}$, where $\mathcal{P}$ is an unknown random probability measure. Clearly, it is rather difficult and inefficient to obtain Bayesian estimates of the regression parameter $\beta$ and dispersion parameter $\sigma^2$ in Equation (2), since the unknown form of $\mathcal{P}$ is involved. To address this difficulty, a Dirichlet process (DP) prior is usually introduced to approximate $\mathcal{P}$, i.e., $\mathcal{P}\sim\mathrm{DP}(\tau F_0)$, in which $F_0$ is a given base distribution, such as a multivariate normal distribution, that serves as a starting point for constructing the nonparametric distribution, and $\tau$ is a weight indicating the researcher's certainty that $F_0$ is the distribution of $\mathcal{P}$. In particular, Sethuraman [24] showed that the DP prior $\mathrm{DP}(\tau F_0)$ has a stick-breaking representation; however, this approach induces a nonzero mean of the random effects [25] and a discrete probability distribution of the random effects [23]. Generally, the variants of the Dirichlet process proposed by Ishwaran and Zarepour [26] and Yang et al. [25] are regarded as discrete Dirichlet processes (discrete DPs). A discrete DP with a stick-breaking prior for the random effects is inappropriate when the underlying density of the random effects is continuous. Furthermore, violation of the zero-mean assumption on the random effects may lead to non-identifiability in the aforementioned random-effects model. In addition, discrete DP methods with stick-breaking priors for the random effects are generally computationally intensive for complicated models.
To overcome the above issues, inspired by Ohlssen and Spiegelhalter [23] and Yang et al. [25], we incorporate the following variant of the Dirichlet process into the model in (2) to specify the random effects. That is,
$$b_i\overset{\text{i.i.d.}}{\sim}\sum_{g=1}^{\infty}\pi_g N_q(\mu_g,\Omega_g)\quad\text{with}\quad\mu_g=\mu_g^*-\sum_{g=1}^{\infty}\pi_g\mu_g^*\quad\text{and}\quad(\mu_g^*,\Omega_g)\overset{\text{i.i.d.}}{\sim}F_0,\tag{3}$$
where $\pi_g$ is a random probability weight satisfying $0\le\pi_g\le 1$ and $\sum_{g=1}^{\infty}\pi_g=1$. In addition, $\pi_g$ is assumed to be independent of $(\mu_g^*,\Omega_g)$. This variant of the Dirichlet process is referred to as the centered Dirichlet process mixture model (CDPMM). As in Ishwaran and Zarepour [26], we adopt the following truncated approximation of the DP mixture for $\mathcal{P}$:
$$b_i\overset{\text{i.i.d.}}{\sim}\sum_{g=1}^{G}\pi_g N_q(\mu_g,\Omega_g)\quad\text{with}\quad\mu_g=\mu_g^*-\sum_{g=1}^{G}\pi_g\mu_g^*\quad\text{and}\quad(\mu_g^*,\Omega_g)\overset{\text{i.i.d.}}{\sim}F_0,\tag{4}$$
where $G$ is a finite integer satisfying $1\le G<\infty$. As for the selection of $G$, Ishwaran and Zarepour [26] pointed out that a moderate value such as $G=25$ may be enough to obtain a good approximation in practical applications. Thus, the value of $G$ is chosen to be 25 in the rest of this paper. Furthermore, the random probability weights $\pi_g$ are specified by the following stick-breaking procedure:
$$\pi_1=\vartheta_1\quad\text{and}\quad\pi_g=\vartheta_g\prod_{\iota=1}^{g-1}(1-\vartheta_\iota)\quad\text{for }g=2,\dots,G,\tag{5}$$
where $\vartheta_g\overset{\text{i.i.d.}}{\sim}\mathrm{Beta}(1,\tau)$ for $g=1,\dots,G-1$, and $\vartheta_G=1$ so that $\sum_{g=1}^{G}\pi_g=1$. The prior distribution for the unknown parameter $\tau$ is chosen as $\tau\sim\Gamma(a_1,a_2)$, so that the posterior distribution of $\tau$ is conjugate. Here, we set the hyperparameters $a_1$ and $a_2$ to 25 and 5, respectively, so that large values of $\tau$ are generated, which results in more unique $b_i$ values.
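The truncated stick-breaking construction above is easy to simulate; the sketch below (our own illustration, with $G$ and $\tau$ as arguments) draws one realization of the weights:

```python
import numpy as np

def stick_breaking_weights(tau, G, rng):
    """Draw (pi_1, ..., pi_G) from the truncated stick-breaking prior:
    theta_g ~ Beta(1, tau) for g < G and theta_G = 1."""
    theta = rng.beta(1.0, tau, size=G)
    theta[-1] = 1.0                      # theta_G = 1 forces sum(pi) = 1
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - theta[:-1])))
    return theta * remaining             # pi_g = theta_g * prod_{l<g}(1 - theta_l)
```

Setting the last break to 1 consumes whatever is left of the stick, so the $G$ weights sum to one exactly.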
It is rather difficult and inefficient to generate observations from the posterior distribution of $b_i$ under the above DP prior via the MCMC algorithm. We therefore introduce a latent variable $L_i\in\{1,\dots,G\}$ to solve this sampling issue, since this latent variable records each $b_i$'s cluster membership and conveys its parametric value to the distribution of $b_i$. Let $L=\{L_1,\dots,L_n\}$, $\pi=\{\pi_1,\dots,\pi_G\}$, $\mu^*=\{\mu_1^*,\dots,\mu_G^*\}$ and $\Omega=\{\Omega_1,\dots,\Omega_G\}$, in which $\Omega_g=\mathrm{diag}(\omega_{g1},\dots,\omega_{gq})$ for $g=1,\dots,G$. As in Ishwaran and Zarepour [26], the hierarchical structure defined in (4) can be written as
$$L_i|\pi\overset{\text{i.i.d.}}{\sim}\sum_{g=1}^{G}\pi_g\delta_g(\cdot)\quad\text{and}\quad(\pi,\mu^*,\Omega)\sim f_1(\pi)f_2(\mu^*)f_3(\Omega),\tag{6}$$
where $\delta_g(\cdot)$ denotes a discrete probability measure concentrated at $g$, and $f_1(\pi)$ is defined in Equation (5). The prior for $\mu_g^*$ associated with $f_2(\mu^*)=\prod_{g=1}^{G}f_2(\mu_g^*)$ is defined by
$$\mu_g^*|\xi,\Psi\overset{\text{i.i.d.}}{\sim}N_q(\xi,\Psi),\quad \xi|\xi_0,\Psi_0\sim N_q(\xi_0,\Psi_0),\quad \psi_j^{-1}|c_1,c_2\sim\Gamma(c_1,c_2)\ \text{for }j=1,\dots,q,$$
and the prior for $\omega_{gj}$ related to $f_3(\Omega)=\prod_{g=1}^{G}\prod_{j=1}^{q}f_3(\omega_{gj})$ is defined by
$$\omega_{gj}^{-1}|\omega_j^{a},\varpi_j\sim\Gamma(\omega_j^{a},\varpi_j)\quad\text{and}\quad\varpi_j|\varpi_j^{a},\varpi_j^{b}\sim\Gamma(\varpi_j^{a},\varpi_j^{b}),$$
where $\Psi=\mathrm{diag}(\psi_1,\dots,\psi_q)$, $\Gamma(c_1,c_2)$ denotes the Gamma distribution with parameters $c_1$ and $c_2$, and $\xi_0$, $\Psi_0$, $c_1$, $c_2$, $\omega_j^{a}$, $\varpi_j^{a}$ and $\varpi_j^{b}$ are pre-specified hyperparameters; that is, $\xi_0=0_{q\times 1}$, $\Psi_0=I_q$, $c_1=11$, $c_2=2.5$, $\omega_j^{a}=3$, $\varpi_j^{a}=n$ and $\varpi_j^{b}=10$. Thus, given the values of $L_i$, $\mu^*$ and $\Omega$, the prior for the random effect $b_i$ is $N_q(\mu_{L_i},\Omega_{L_i})$ with $\mu_{L_i}=\mu_{L_i}^*-\sum_{g=1}^{G}\pi_g\mu_g^*$.
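Putting the truncated mixture, the stick-breaking weights and the centering together, a prior draw of the random effects can be sketched as follows (an illustrative simplification of ours in which the base measure $F_0$ is taken as standard normal atoms with unit variances, rather than the full hierarchical prior above):

```python
import numpy as np

def draw_cdpmm_random_effects(n, G, tau, q, rng):
    """Draw b_1, ..., b_n from a truncated CDPMM prior.
    Illustrative assumption: atoms mu_g^* ~ N_q(0, I_q), Omega_g = I_q."""
    theta = rng.beta(1.0, tau, size=G)
    theta[-1] = 1.0
    pi = theta * np.concatenate(([1.0], np.cumprod(1.0 - theta[:-1])))
    mu_star = rng.normal(size=(G, q))          # atoms mu_g^* from F0
    omega = np.ones((G, q))                    # diagonal variances Omega_g
    mu = mu_star - pi @ mu_star                # centering: sum_g pi_g mu_g = 0
    labels = rng.choice(G, size=n, p=pi)       # latent cluster memberships L_i
    b = rng.normal(loc=mu[labels], scale=np.sqrt(omega[labels]))
    return b, pi, mu
```

The check that the weighted atom means sum to zero is exactly the centering in (4) that keeps the prior mean of $b_i$ at zero.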
To estimate the unknown parameters $\beta$ and $\sigma^2$ in Equation (2) from the Bayesian perspective, it is necessary to specify priors for $\beta$ and $\sigma^2$. To alleviate the computational burden, the conjugate prior distribution for the dispersion parameter is taken to be
$$\sigma^{-2}\sim\Gamma(\sigma_a^2,\sigma_b^2),$$
where the hyperparameters $\sigma_a^2$ and $\sigma_b^2$ are taken to be 1 and 0.01, respectively. The main goal of this paper is to incorporate the Bayesian version of the Lasso into our proposed model (2) to conduct parameter estimation and model selection simultaneously. Similar to Park and Casella [18] and Tang et al. [20], the following Laplace prior is placed on $\beta$:
$$\pi(\beta)=\prod_{k=0}^{p}\frac{\nu}{2}\exp\left(-\nu|\beta_k|\right),$$
where $\nu$ is the regularization parameter. Because the mass of the above Laplace prior is highly concentrated around zero, with a distinct peak at zero, the posterior means or modes of the $\beta_k$'s are shrunk towards zero, which is the key principle in using the BLasso method to select important covariates. Following Robert [15], the Laplace distribution with density $\frac{a}{2}\exp(-a|x|)$ can be represented as a scale mixture of normal distributions with an independent, exponentially distributed variance; that is,
$$\frac{a}{2}\exp(-a|x|)=\int_0^{\infty}\frac{1}{\sqrt{2\pi u}}\exp\left(-\frac{x^2}{2u}\right)\cdot\frac{a^2}{2}\exp\left(-\frac{a^2u}{2}\right)du,\quad\text{for }a>0.$$
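This scale-mixture identity can be verified numerically; the sketch below (our own check, using simple trapezoidal quadrature, not code from the paper) evaluates the right-hand side and compares it with the Laplace density:

```python
import numpy as np

def laplace_density(x, a):
    """Left-hand side: (a/2) * exp(-a|x|)."""
    return 0.5 * a * np.exp(-a * abs(x))

def laplace_via_mixture(x, a, n_grid=400_000, u_max=400.0):
    """Right-hand side: integrate the N(x; 0, u) density against the
    Exp(a^2/2) mixing density of the variance u, by trapezoidal quadrature."""
    u = np.linspace(1e-8, u_max, n_grid)
    integrand = (np.exp(-x**2 / (2.0 * u)) / np.sqrt(2.0 * np.pi * u)
                 * 0.5 * a**2 * np.exp(-0.5 * a**2 * u))
    return float(np.sum((integrand[:-1] + integrand[1:]) * np.diff(u) / 2.0))
```

For $x\neq 0$ the two sides agree to quadrature accuracy.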
Therefore, the aforementioned prior for β can be reformulated as the following hierarchical structure:
$$\beta|H_\beta\sim N_{p+1}(0,H_\beta)\ \text{with}\ H_\beta=\mathrm{diag}(h_{\beta_0}^2,\dots,h_{\beta_p}^2),\quad (h_{\beta_0}^2,\dots,h_{\beta_p}^2)\sim\prod_{k=0}^{p}\frac{\nu^2}{2}\exp\left(-\frac{\nu^2}{2}h_{\beta_k}^2\right),\quad \nu^2\sim\mathrm{Gamma}(\nu_a^2,\nu_b^2),\tag{10}$$
where the hyperparameters $\nu_a^2$ and $\nu_b^2$ are set to 1 and 0.1, respectively, which implies a diffuse prior. Similar to Park and Casella [18], the posterior distributions of $h_{\beta_k}^2$ and $\nu^2$ in the hierarchical structure (10) have closed-form expressions, so this hierarchical representation greatly simplifies the computation. It follows from Equation (10) that the posterior distribution of $\nu^2$ is the following Gamma distribution:
$$\nu^2|\beta,H_\beta\sim\mathrm{Gamma}\left(\nu_a^2+p+1,\ \nu_b^2+\frac{1}{2}\sum_{k=0}^{p}h_{\beta_k}^2\right).$$
In addition, the posterior distributions of $h_{\beta_0}^2,\dots,h_{\beta_p}^2$ are derived as
$$h_{\beta_k}^{-2}|\beta_k,\nu^2\sim\mathrm{IG}\left(\frac{\nu}{|\beta_k|},\ \nu^2\right)\quad\text{for }k=0,\dots,p,$$
where $\mathrm{IG}(a,b)$ denotes the inverse Gaussian distribution with mean parameter $a$ and shape parameter $b$. As for sampling from the inverse Gaussian distribution, Tang et al. [20] gave a detailed procedure.
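One standard way to draw from an inverse Gaussian distribution is the Michael–Schucany–Haas transformation method, sketched here (our own illustration; not necessarily the exact procedure given by Tang et al. [20]):

```python
import numpy as np

def sample_inverse_gaussian(mean, shape, rng):
    """One draw from IG(mean, shape) via the Michael-Schucany-Haas
    transformation: square a standard normal, solve the quadratic for a
    root x, then pick x or mean^2/x with the appropriate probability."""
    nu = rng.standard_normal()
    y = nu * nu
    x = (mean + mean**2 * y / (2.0 * shape)
         - mean / (2.0 * shape) * np.sqrt(4.0 * mean * shape * y
                                          + mean**2 * y**2))
    if rng.random() <= mean / (mean + x):
        return x
    return mean**2 / x
```

The sample mean of repeated draws converges to the mean parameter, as the IG mean equals its first parameter.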

3. Bayesian Analysis of Model

Let $Y=\{y_{ij}: i=1,\dots,n,\ j=1,\dots,n_i\}$, $X=\{x_{ij}: i=1,\dots,n,\ j=1,\dots,n_i\}$, $Z=\{z_{ij}: i=1,\dots,n,\ j=1,\dots,n_i\}$, and let $b=\{b_i: i=1,\dots,n\}$ denote the random effects. To obtain joint Bayesian estimates of the unknown parameters $\beta$ and $\sigma^2$ and the random effects, as well as to select important covariates in the considered models, a hybrid algorithm combining the block Gibbs sampler and the Metropolis–Hastings algorithm is employed to draw a sequence of random observations from the joint posterior distribution $p(\beta,\sigma^2,b|Y,X,Z)$, as follows. In this hybrid algorithm, observations $\{\beta,\sigma^2,b\}$ are iteratively drawn from the conditional distributions $p(\beta|\sigma^2,b,Y,X,Z)$, $p(\sigma^2|\beta,b,Y,X,Z)$ and $p(b|\beta,\sigma^2,Y,X,Z)$.
Block Gibbs Sampler (A): Conditional distribution related to $\beta$
It follows from Equations (2) and (10) that the conditional distribution $p(\beta|\sigma^2,b,Y,X,Z)$ is proportional to
$$\exp\left\{-\frac{1}{2}\left[\sum_{i=1}^{n}\sum_{j=1}^{n_i}\frac{1}{\sigma^2}d(y_{ij};\mu_{ij})+\beta^{T}H_\beta^{-1}\beta\right]\right\},$$
which is a nonstandard distribution. We therefore use the well-known Metropolis–Hastings (MH) algorithm to generate observations from this conditional distribution as follows. Given the current value $\beta^{(l)}$, a new candidate $\beta^{\,*}$ is generated from the proposal distribution $N(\beta^{(l)},\sigma_\beta^2\Sigma_\beta)$ and is accepted with probability
$$\min\left\{1,\ \frac{p(\beta^{\,*}|\sigma^2,b,Y,X,Z)}{p(\beta^{(l)}|\sigma^2,b,Y,X,Z)}\right\},$$
where
$$\Sigma_\beta=\left[\sum_{i=1}^{n}\sum_{j=1}^{n_i}\frac{\ddot d(y_{ij};\bar\mu_{ij})}{2\sigma^2\{\dot f(\mu_{ij})\}^2}\,x_{ij}x_{ij}^{T}+H_\beta^{-1}\right]^{-1}$$
with $\dot f(\mu_{ij})=\partial f/\partial\mu_{ij}$ and $\ddot d(y_{ij};\bar\mu_{ij})=E_{y_{ij}}\!\left(\partial^2 d(y_{ij};\mu_{ij})/\partial\mu_{ij}^2\right)\big|_{\beta=\beta^{(l)}}$. The variance coefficient $\sigma_\beta^2$ can be chosen such that the average acceptance rate is approximately 0.25 or more.
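The accept/reject step above is a standard random-walk Metropolis–Hastings update. The generic sketch below (our own illustration, computed on the log scale and exercised on a toy Gaussian target rather than the full simplex conditional) shows the mechanics:

```python
import numpy as np

def mh_step(beta, log_post, chol_sigma, sigma_beta, rng):
    """One random-walk MH update: propose from N(beta, sigma_beta^2 * Sigma),
    where chol_sigma is a Cholesky factor of Sigma, and accept with
    probability min(1, ratio), computed on the log scale for stability."""
    cand = beta + sigma_beta * chol_sigma @ rng.standard_normal(beta.size)
    if np.log(rng.random()) < log_post(cand) - log_post(beta):
        return cand, True      # candidate accepted
    return beta, False         # candidate rejected, chain stays put
```

In practice `sigma_beta` is tuned during burn-in until the observed acceptance rate settles near the 0.25 target mentioned above.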
Block Gibbs Sampler (B): Conditional distribution related to $\sigma^2$
The conditional distribution $p(\sigma^2|\beta,b,Y,X,Z)$ can be derived as
$$p(\sigma^2|\beta,b,Y,X,Z)\propto(\sigma^2)^{-\left(0.5\sum_{i=1}^{n}n_i+\sigma_a^2\right)-1}\exp\left\{-\frac{0.5\sum_{i=1}^{n}\sum_{j=1}^{n_i}d(y_{ij};\mu_{ij})+\sigma_b^2}{\sigma^2}\right\},$$
which can be simplified as
$$\sigma^{-2}|\beta,b,Y,X,Z\sim\Gamma\left(0.5\sum_{i=1}^{n}n_i+\sigma_a^2,\ 0.5\sum_{i=1}^{n}\sum_{j=1}^{n_i}d(y_{ij};\mu_{ij})+\sigma_b^2\right).$$
Clearly, it is straightforward and efficient to draw observations of $\sigma^2$ from this Gamma distribution via any statistical software.
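For instance, the Gamma draw for $\sigma^{-2}$ is a one-liner in most software; a minimal sketch of ours (note that numpy parameterizes the Gamma by shape and scale, i.e., scale = 1/rate):

```python
import numpy as np

def update_sigma2(deviances, n_obs, sigma_a2, sigma_b2, rng):
    """Gibbs update: draw sigma^{-2} from its Gamma full conditional
    and invert. `deviances` holds the current d(y_ij; mu_ij) values."""
    shape = 0.5 * n_obs + sigma_a2
    rate = 0.5 * float(np.sum(deviances)) + sigma_b2
    inv_sigma2 = rng.gamma(shape, 1.0 / rate)   # numpy uses scale = 1/rate
    return 1.0 / inv_sigma2
```

The long-run average of the drawn $\sigma^{-2}$ values matches the Gamma mean shape/rate.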
Block Gibbs Sampler (C): Conditional distribution related to $\theta_b$
Let $\theta_b$ denote all unknown parameters associated with the distribution of the random effects $b_i$, $i=1,\dots,n$. Then $\theta_b$ can be iteratively sampled using the following nine steps:
Step (a). The conditional distribution of $\xi$ given $(\mu^*,\Psi,b)$ is given by
$$\xi|\mu^*,\Psi,b\sim N_q(A,B),$$
where $B=\left(G\Psi^{-1}+\Psi_0^{-1}\right)^{-1}$ and $A=B\left(\Psi_0^{-1}\xi_0+\Psi^{-1}\sum_{g=1}^{G}\mu_g^*\right)$.
Step (b). For $j=1,\dots,q$, the diagonal elements of $\Psi$ are conditionally distributed as
$$\psi_j^{-1}|\mu^*,\xi\sim\Gamma\left(c_1+\frac{G}{2},\ c_2+\frac{1}{2}\sum_{g=1}^{G}(\mu_{gj}^*-\xi_j)^2\right),$$
where $\mu_{gj}^*$ is the $j$th element of $\mu_g^*$ and $\xi_j$ is the $j$th element of $\xi$.
Step (c). For $j=1,\dots,q$, $\varpi_j|\Omega$ is conditionally distributed as
$$\varpi_j|\Omega\sim\Gamma\left(\varpi_j^{a}+G\omega_j^{a},\ \varpi_j^{b}+\sum_{g=1}^{G}\omega_{gj}^{-1}\right),$$
where $\omega_{gj}$ is the $j$th diagonal element of $\Omega_g$.
Step (d). Following Ishwaran and Zarepour [26], the conditional distribution of $\tau|\pi$ can be expressed as
$$\tau|\pi\sim\Gamma\left(a_1+G-1,\ a_2-\sum_{g=1}^{G-1}\log(1-\nu_g^*)\right),$$
where $\nu_g^*$ is a random weight drawn from the Beta distribution in step (e).
Step (e). The conditional distribution of $\pi|L,\tau$ is the following generalized Dirichlet distribution:
$$\pi|L,\tau\sim\mathrm{Dir}\left(a_1^*,b_1^*,\dots,a_{G-1}^*,b_{G-1}^*\right),$$
where $a_g^*=1+d_g$, $b_g^*=\tau+\sum_{\iota=g+1}^{G}d_\iota$ for $g=1,\dots,G-1$, and $d_g$ is the number of $L_i$'s (and thus individuals) whose values equal $g$. Simulating an observation from the conditional distribution $\pi|L,\tau$ can be conducted as follows. First, $\nu_g^*$ is independently generated from the $\mathrm{Beta}(a_g^*,b_g^*)$ distribution. Then, $\pi_1,\dots,\pi_G$ are obtained from the following formulae:
$$\pi_1=\nu_1^*,\quad \pi_g=\nu_g^*\prod_{\iota=1}^{g-1}(1-\nu_\iota^*)\ \text{for }g\neq 1,G,\quad \pi_G=1-\sum_{g=1}^{G-1}\pi_g.$$
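Step (e) can be sketched directly from these formulae (our own illustration; `counts[g]` plays the role of $d_g$):

```python
import numpy as np

def sample_pi_given_labels(counts, tau, rng):
    """Draw pi from the generalized Dirichlet conditional pi | L, tau.
    counts[g] = d_g, the number of labels L_i equal to cluster g."""
    G = len(counts)
    a = 1.0 + counts[:-1]                           # a_g^* = 1 + d_g
    b = tau + np.cumsum(counts[::-1])[::-1][1:]     # b_g^* = tau + sum_{l>g} d_l
    v = rng.beta(a, b)                              # nu_g^*, g = 1, ..., G-1
    pi = np.empty(G)
    pi[:-1] = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    pi[-1] = 1.0 - pi[:-1].sum()                    # pi_G takes the remainder
    return pi
```

Clusters with larger counts $d_g$ receive stochastically larger weights, which is how the labels feed back into the mixing distribution.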
Step (f). The conditional distribution of $\mu^*|\xi,\Psi,\Omega,L,b$.
Let $L_1^*,\dots,L_d^*$ be the $d$ unique values of $\{L_1,\dots,L_n\}$ (i.e., the occupied "clusters"). For $g=1,\dots,G$, $\mu_g^*$ is conditionally distributed as follows:
$$\mu_g^*|\xi,\Psi\sim N_q(\xi,\Psi)\quad\text{for }g\notin\{L_1^*,\dots,L_d^*\},$$
$$\mu_g^*|\xi,\Psi,\Omega,L,b\sim N_q(E_g,F_g)\quad\text{for }g\in\{L_1^*,\dots,L_d^*\},$$
where $F_g=\left(\Psi^{-1}+\sum_{\{i:L_i=g\}}\Omega_g^{-1}\right)^{-1}$ and $E_g=F_g\left(\Psi^{-1}\xi+\sum_{\{i:L_i=g\}}\Omega_g^{-1}b_i\right)$ for $g\in\{L_1^*,\dots,L_d^*\}$. Given the $\mu_g^*$, we set $\mu_g=\mu_g^*-\sum_{g=1}^{G}\pi_g\mu_g^*$, $\mu^*=\{\mu_1^*,\dots,\mu_G^*\}$ and $\mu=\{\mu_1,\dots,\mu_G\}$.
Step (g). The conditional distribution of $\Omega|\mu,\varpi,L,b$.
Using the notation of step (f), for given $g$ and $j=1,\dots,q$, the $j$th diagonal element of $\Omega_g$ is conditionally distributed as
$$\omega_{gj}^{-1}\sim\Gamma(\omega_j^{a},\varpi_j)\quad\text{for }g\notin\{L_1^*,\dots,L_d^*\},$$
$$\omega_{gj}^{-1}\sim\Gamma\left(\frac{d_g}{2}+\omega_j^{a},\ \varpi_j+\frac{1}{2}\sum_{\{i:L_i=g\}}(b_{ij}-\mu_{gj})^2\right)\quad\text{for }g\in\{L_1^*,\dots,L_d^*\},$$
where $b_{ij}$ is the $j$th element of $b_i$ and $\mu_{gj}$ is the $j$th element of $\mu_g$. Given the $\omega_{gj}$, $\Omega_g=\mathrm{diag}(\omega_{g1},\dots,\omega_{gq})$ and $\Omega=\{\Omega_1,\dots,\Omega_G\}$.
Step (h). The conditional distribution of $L_i|\pi,\mu,\Omega,b$ is given by
$$L_i|\pi,\mu,\Omega,b\overset{\text{ind.}}{\sim}\mathrm{multinomial}\left(\pi_{i1}^*,\dots,\pi_{iG}^*\right),$$
where $\pi_{ig}^*\propto\pi_g\,p(b_i|\mu_g,\Omega_g)$ with $b_i|\mu_g,\Omega_g\sim N_q(\mu_g,\Omega_g)$, and the $\pi_g$ $(g=1,\dots,G)$ are sampled in step (e). Given $L_i$, $\mu$ and $\Omega$, the prior of $b_i$ is $N_q(\mu_{L_i},\Omega_{L_i})$, with $\mu_{L_i}$ and $\Omega_{L_i}$ being the $L_i$th elements of the sets $\mu$ and $\Omega$, respectively.
Step (i). The conditional distribution of $b=\{b_i: i=1,\dots,n\}$.
The conditional distribution $p(b_i|\beta,\sigma^2,Y,X,Z)$ is nonstandard, so observations cannot be drawn from it directly via Gibbs sampling for $i=1,\dots,n$. Specifically,
$$p(b_i|\beta,\sigma^2,Y,X,Z)\propto p(b_i|\mu_{L_i},\Omega_{L_i})\,p(Y_i|\beta,\sigma^2,b_i,X,Z),$$
where $Y_i=\{y_{ij}: j=1,\dots,n_i\}$ and $p(Y_i|\beta,\sigma^2,b_i,X,Z)=\prod_{j=1}^{n_i}p(y_{ij};\mu_{ij},\sigma^2)$, with $p(y_{ij};\mu_{ij},\sigma^2)$ specified by Equation (1) and $\mu_{ij}$ by Equation (2). The Metropolis–Hastings algorithm used to sample $b_i$ is implemented as follows. At the $\ell$th iteration with current value $b_i^{(\ell)}$, a new candidate $b_i^{\,*}$ is drawn from the normal distribution $N_q(b_i^{(\ell)},\sigma_b^2\Sigma_{b_i})$, where $\Sigma_{b_i}=(\Omega_{L_i}^{-1}+\Xi_i)^{-1}$ and $\Xi_i=-\partial^2\ln p(Y_i|\beta,\sigma^2,b_i,X,Z)/\partial b_i\partial b_i^{T}\big|_{b_i=b_i^{(\ell)}}$. The new candidate $b_i^{\,*}$ is accepted with probability
$$\min\left\{1,\ \frac{p(b_i^{\,*}|\mu_{L_i},\Omega_{L_i})\,p(Y_i|\beta,\sigma^2,b_i^{\,*},X,Z)}{p(b_i^{(\ell)}|\mu_{L_i},\Omega_{L_i})\,p(Y_i|\beta,\sigma^2,b_i^{(\ell)},X,Z)}\right\}.$$
The variance $\sigma_b^2$ can be chosen such that the average acceptance rate is approximately 0.25 or more.
Through the above iterative process, we obtain a series of sampled observations $\{(\beta^{(\ell)},\sigma^{2(\ell)},b^{(\ell)}): \ell=1,2,\dots,L\}$. Bayesian estimates of $\beta$, $\sigma^2$ and $b_i$ (for given $i$) can then be obtained as the sample means
$$\hat\beta=\frac{1}{L}\sum_{\ell=1}^{L}\beta^{(\ell)},\quad \hat\sigma^2=\frac{1}{L}\sum_{\ell=1}^{L}\sigma^{2(\ell)},\quad \hat b_i=\frac{1}{L}\sum_{\ell=1}^{L}b_i^{(\ell)}.$$
Similarly, consistent estimates of the posterior covariance matrices of $\beta$ and $\sigma^2$ can be obtained from the sample covariance matrices.

4. Numerical Examples

To investigate the behavior of our proposed model and the BLasso method under the Bayesian framework, we conducted four simulation studies and analyzed a real example from a prospective ophthalmology study.

4.1. Simulation Studies

In the first simulation study, we assume that, given the random effects $b_i=(b_{i1},b_{i2})^T$, the longitudinal percentage responses $y_{ij}$ are conditionally independent, and each $y_{ij}|b_i$ $(i=1,\dots,100,\ j=1,\dots,6)$ follows the simplex distribution $y_{ij}|b_i\sim S(\mu_{ij},\sigma^2)$. The conditional mean parameter $\mu_{ij}=E(y_{ij}|b_i)$ is specified as follows:
$$\mathrm{logit}(\mu_{ij})=x_{ij}^{T}\beta+z_{ij}^{T}b_i=\beta_0+x_{1ij}\beta_1+x_{2ij}\beta_2+x_{3ij}\beta_3+t_{ij}\beta_4+b_{i1}+t_{ij}b_{i2},$$
where $x_{1ij}$ randomly takes the value 1 or $-1$ with equal probability; $x_{2ij}$ and $x_{3ij}\overset{\text{i.i.d.}}{\sim}N(0,1)$; and $t_{ij}=0.2j$ for $j=0,\dots,5$. The true values of the parameters are specified as $\beta=(\beta_0,\dots,\beta_4)^T=(0.45, 0.00, 0.45, 0.00, 0.45)^T$, so the covariates with zero coefficients are unimportant, and $\sigma^2=1$. The true distribution of the random effects $b_i$ is assumed to be
$$b_{i1}\overset{\text{i.i.d.}}{\sim}N(0,0.8),\quad b_{i2}=b_{i2}^*-2\ \text{with}\ b_{i2}^*\overset{\text{i.i.d.}}{\sim}\Gamma(4,2),$$
so that the random effects cover both symmetric and skewed features with mean 0. A total of 500 Monte Carlo replications were conducted on the basis of this simulation setup.
In the second simulation study, 500 simulated datasets were generated using the same setup as in the first simulation study, except for the distribution of the random effects. That is, the random effects are distributed as
$$b_{i1}\overset{\text{i.i.d.}}{\sim}0.6\,N(-0.8,0.5)+0.4\,N(1.2,0.5)\quad\text{and}\quad b_{i2}\overset{\text{i.i.d.}}{\sim}0.6\,N(-0.8,0.5)+0.4\,N(1.2,0.5),$$
so that the random effects have bimodal features with mean 0.
For each dataset generated from the two abovementioned simulation studies, the hybrid algorithm combining the block Gibbs sampler and the Metropolis–Hastings algorithm, in conjunction with the BLasso method and the stick-breaking prior of the CDPMM, was used to produce Bayesian estimates of the parameters and random effects and to simultaneously select the important covariates. To investigate the convergence of these Bayesian algorithms, we computed the estimated potential scale reduction (EPSR, proposed by Gelman et al. [27]) of the parameters via three parallel sequences of observations based on three different initial values. It can be seen from Figure 1 that the EPSR values were less than 1.2 after about 7000 iterations in both simulations for all test runs. Therefore, $L=5000$ observations collected after 7000 iterations were used to compute the simulation results for all replications. The results obtained under simulations 1 and 2 are reported in Table 1, where 'Bias' is the difference between the true value and the mean of the estimates based on 500 replications, and 'RMS' is the root mean square of the differences between the true values and their corresponding estimates based on 500 replications. Compared with the Lasso from the frequentist view, the BLasso does not shrink the non-significant elements of $\beta$ exactly to 0, since a sampling-based method is involved. Thus, as suggested by Tang et al. [20], the criterion for variable selection is that a coefficient is viewed as 0 if its 95% credible interval includes zero. In Table 1, 'F0' denotes the proportion of the 500 replications in which the 95% credible interval for a regression parameter includes zero. The larger the F0 values for non-significant regression parameters, and the smaller the F0 values for significant parameters, the better the performance of the posited model.
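The zero-inclusion rule described above is straightforward to apply to the MCMC output; a minimal sketch (our own helper, not code from the paper):

```python
import numpy as np

def selected_covariates(beta_draws, level=0.95):
    """Variable-selection rule: a coefficient is set to zero when its
    credible interval (from the MCMC draws) contains zero.
    beta_draws: one row per posterior draw, one column per coefficient."""
    lo, hi = np.quantile(beta_draws, [(1 - level) / 2, (1 + level) / 2], axis=0)
    includes_zero = (lo <= 0.0) & (hi >= 0.0)
    return ~includes_zero      # True where the covariate is kept
```

Averaging `includes_zero` over replications is exactly the F0 proportion reported in Table 1.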
Examination of Table 1 indicates that (i) the Bayesian estimates of the unknown parameters $\beta$ and $\sigma^2$ were reasonably accurate in the two abovementioned simulation studies, since their absolute biases were less than 0.10 and their RMS values were less than 0.16; and (ii) the BLasso could correctly identify the zero and nonzero coefficients in most cases, because the F0 values corresponding to important covariates were less than 10%, whilst the F0 values corresponding to unimportant covariates were near 90%. On the other hand, to investigate the effectiveness of the CDPMM prior for the random effects, we introduced the following RMSE (root mean squared error) criterion in terms of the random effects:
$$\mathrm{RMSE}(\hat b_i)=\sqrt{\frac{1}{2L}\sum_{m=1}^{2}\sum_{l=1}^{L}\left(p_m(h_l)-\hat p_m(h_l)\right)^2},$$
where $p_m(\cdot)$ and $\hat p_m(\cdot)$ denote, respectively, the true density function of the random effect $b_{im}$ and the kernel density estimate based on the estimated values $\{\hat b_{im}: i=1,\dots,n\}$, and $h_l$ is chosen to be the $\frac{l}{L}$th quantile of the dataset $\{\hat b_{im}: i=1,\dots,n\}$. The sample quantiles of the estimated RMSE values are reported in Table 2. Furthermore, we chose a typical replication whose RMSE value equals the median over the 500 replications. On the basis of this selected replication, the estimated densities of $b_{i1}$ and $b_{i2}$ under the CDPMM prior are plotted against their corresponding true densities in Figure 2 and Figure 3, which indicate that the finite mixture of normal distributions can flexibly capture the symmetric, skewed and bimodal shapes of the random effects $b_i$. From Table 2, based on the results of the 500 replications in both simulations, the estimated means and standard deviations (SD) of the random effects $b_{i1}$ and $b_{i2}$ are close to their corresponding true values, and the 25%, 50% and 75% quantiles of the $\mathrm{RMSE}(\hat b_i)$ values are small and close to each other, which indicates that the CDPMM method is robust for estimating the random effects. All these findings indicate that (i) our proposed Bayesian procedure can capture the true behavior of $b_i$ well, regardless of its true distribution and form, and (ii) the BLasso can identify the true model with high probability.
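The RMSE criterion can be sketched with a plain Gaussian kernel density estimate (our own numpy-only implementation, shown for a single random-effect component; the paper does not specify its kernel or bandwidth, so Silverman's rule of thumb is an assumption here):

```python
import numpy as np

def gaussian_kde(samples, points):
    """Gaussian kernel density estimate with Silverman's rule-of-thumb
    bandwidth, evaluated at `points`."""
    n = samples.size
    h = 1.06 * samples.std(ddof=1) * n ** (-0.2)
    z = (points[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))

def rmse_random_effect(b_hat, true_pdf, n_quantiles=100):
    """RMSE criterion for one component: compare the true density with
    the KDE of the estimated b_i at empirical quantiles h_l."""
    probs = np.arange(1, n_quantiles + 1) / n_quantiles
    h_l = np.quantile(b_hat, probs)
    return float(np.sqrt(np.mean((true_pdf(h_l) - gaussian_kde(b_hat, h_l)) ** 2)))
```

Averaging the squared errors over both components, as in the display above, recovers the paper's criterion.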
To compare the performance of the CDPMM prior for the random effects with the discrete DP given by Ishwaran and Zarepour [26] and Yang et al. [25], we conducted a third simulation study, in which 500 simulated datasets were generated using the same setup as in the first simulation study and fitted by the model with the discrete DP for the random effects. In the fourth simulation, we reanalyzed the 500 datasets generated in the second simulation using a parametric Bayesian approach with the random-effects distribution specified as a multivariate normal distribution. The aim of this simulation was to compare the semiparametric approach based on the CDPMM prior with the parametric approach based on the Gaussian prior from the Bayesian perspective. The results obtained under the third and fourth simulations are reported in Table 3. Our programs were written in Matlab. It took roughly 119.3 s and 186.9 s on an Intel(R) Xeon(R) Silver 4216 CPU @ 2.10 GHz (Intel, Santa Clara, CA, USA) server to run 12,000 iterations for our proposed CDPMM and the discrete DP, respectively, which indicates that the CDPMM method is much more efficient than the discrete DP in the considered simulations. It can be seen from Table 2 and Table 3 that (i) our proposed CDPMM and the discrete DP method have similar performance in terms of the 'Bias' and 'RMS' values, but the F0 values corresponding to the non-significant parameters under the proposed CDPMM prior are higher than those under the discrete DP prior; and (ii) the RMS values and the correct rates of variable selection based on the F0 values under the semiparametric CDPMM prior are better than those under the parametric Gaussian prior.

4.2. Real Example

In this section, the application of our proposed approach to skewed longitudinal proportional data is illustrated by the analysis of a prospective ophthalmology study [28] from the Bayesian perspective. The data were obtained from the Supplementary Materials of Song and Tan [29] and are available from https://biometrics.biometricsociety.org/home/archive/supplementary-materials, accessed on 5 September 2022. In this study, the eyes of 31 patients were injected before surgery with one of three concentration levels of the gas C3F8, and all patients were followed up three to eight times over a three-month period after surgery. The outcome variable was the percentage of gas volume remaining relative to the volume initially injected. These longitudinal proportional data were analyzed by Qiu et al. [1], Song and Tan [29] and Song et al. [30]; however, none of these authors conducted variable selection in the analysis of this dataset. Our scientific interest is to investigate the effects of the three initial C3F8 gas concentration levels and of time on the percentage of remaining gas volume, while selecting the important covariates via the BLasso method. Let the response y_ij denote the percentage of gas left in the eye for patient i at the jth follow-up day t_ij, and let y_ij | b_i ~ S(μ_ij, σ²). The conditional mean in our proposed semiparametric mixed-effects model is then given by
logit(μ_ij) = β_0 + β_1 log(t_ij) + β_2 log²(t_ij) + β_3 w_ij + b_i1 + b_i2 log(t_ij),
where t_ij is the time covariate (days since gas injection); w_ij is the gas concentration covariate, coded −1, 0 and 1 for the concentration levels 15%, 20% and 25%, respectively; and the random effects b_i1 and b_i2, which characterize the between-patient fluctuations of the intercept and the logarithmic-time effect, are specified by the CDPMM in (4).
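As an illustration of the fitted conditional mean, the sketch below plugs the point estimates from Table 4 into the logit link, with the random effects set to zero; the helper name and the evaluation point of 7 follow-up days are our illustrative choices.

```python
import numpy as np

def conditional_mean(t, w, beta, b=(0.0, 0.0)):
    """Conditional mean mu_ij of the fitted model:
    logit(mu) = beta0 + beta1*log(t) + beta2*log(t)**2 + beta3*w
                + b_i1 + b_i2*log(t)."""
    lt = np.log(t)
    eta = beta[0] + beta[1] * lt + beta[2] * lt ** 2 + beta[3] * w + b[0] + b[1] * lt
    return 1.0 / (1.0 + np.exp(-eta))  # inverse logit keeps mu in (0, 1)

# Point estimates from Table 4, evaluated at an illustrative
# follow-up time of 7 days and the middle concentration level (w = 0):
mu = conditional_mean(t=7.0, w=0.0, beta=(2.579, 0.093, -0.322, 0.368))
```

The negative β_2 estimate makes η decrease in log²(t_ij), so the expected remaining gas fraction decays over follow-up time, as the study design anticipates.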
The abovementioned MCMC algorithm was used to produce the joint Bayesian estimates of the parameters and random effects in this real example. In the implementation of the MCMC procedure, the hyperparameter values were taken to be the same as those used in the simulations. Similarly, the EPSR method described in the simulations was used to investigate the convergence of the algorithm. The EPSR values of all parameters against the iteration number are plotted in Figure 4, which indicates that the MCMC algorithm converged within 4000 iterations, since all EPSR values were less than 1.2 after about 4000 iterations. Hence, L = 4000 observations collected after 4000 iterations were used to calculate the Bayesian estimates of the parameters and random effects. The results, reported in Table 4 and Figure 5, indicate that (i) the estimated densities of the random effects b_i1 and b_i2 are bimodal and skewed, so the traditional normality assumption for random effects is inappropriate in this real example; and (ii) the squared logarithmic time (log²(t_ij)) was detected to be an important covariate with a significantly negative effect on the percentage of gas left in the eye, since its corresponding 95% credible interval did not include zero, whereas the gas concentration level (w_ij) and the logarithmic time (log(t_ij)) were insignificant at the 0.05 level because their 95% credible intervals included zero.
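The EPSR used above is the estimated potential scale reduction of Gelman [27]. A minimal sketch for one scalar parameter, assuming several parallel chains are stored as rows of an array (the simple non-split variant shown here is our choice of variant, not necessarily the exact formula in the paper's code):

```python
import numpy as np

def epsr(chains):
    """Estimated potential scale reduction (EPSR) for one scalar
    parameter; `chains` has shape (n_chains, n_draws).  Values below
    1.2 are read as evidence of convergence."""
    m, n = chains.shape
    B = n * chains.mean(axis=1).var(ddof=1)   # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()     # within-chain variance
    var_hat = (n - 1) / n * W + B / n         # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(2)
r = epsr(rng.normal(0.0, 1.0, size=(3, 2000)))  # well-mixed chains give EPSR near 1
```

For chains that explore the same distribution, the between-chain and within-chain variances agree and the ratio tends to 1; values well above 1 signal that the chains have not yet mixed.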

5. Conclusions

In this paper, we introduced a new semiparametric simplex mixed-effects model in which the random effects follow a centered Dirichlet process mixture model (CDPMM). The advantages of the proposed model are that (i) it can capture the features of skewed and bimodal longitudinal proportional data, and (ii) it can accommodate absolutely continuous distributions for the random effects. The novelty of our approach is that the BLasso procedure simultaneously estimates the parameters of interest, provides credible intervals (CIs) for them, and conducts both shrinkage and variable selection for the considered models. A hybrid algorithm combining the Gibbs sampler and the MH algorithm was used to obtain Bayesian estimates of the unknown parameters and random effects, together with their standard errors and credible intervals. Empirical results show that (i) the proposed semiparametric Bayesian method provides quite accurate parameter estimates (see Table 1); (ii) the average frequencies of correctly identifying unimportant predictors are near 90%; and (iii) the CDPMM can effectively capture the features of normal, gamma and normal-mixture distributions (see Table 2 and Figure 2, Figure 3 and Figure 5).

Author Contributions

Conceptualization, A.T. and X.D.; methodology, X.D. and A.T.; software, A.T. and Y.Z.; validation, A.T., X.D. and Y.Z.; formal analysis, A.T. and X.D.; investigation, A.T., X.D. and Y.Z.; original draft preparation, X.D. and A.T.; visualization, A.T. and Y.Z.; supervision and funding acquisition, A.T., X.D. and Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No. 11961079, No. 12161014, No. 11761016), the Guizhou Provincial Science and Technology Project ([2020]1Y009), the Natural Science Research Project of Education Department of Guizhou Province (KY[2021]134), the Project of High Level Creative Talents in Guizhou Province of China, and Guiyang University Multidisciplinary Team Construction Projects in 2021[2021-xk04].

Data Availability Statement

The research data are available on the website: https://biometrics.biometricsociety.org/home/archive/supplementary-materials, accessed on 5 September 2022.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Qiu, Z.G.; Song, P.X.K.; Tan, M. Simplex mixed-effects models for longitudinal proportional data. Scand. J. Stat. 2008, 35, 577–596.
2. Zhang, W.; Wei, H. Maximum likelihood estimation for simplex distribution nonlinear mixed models via the stochastic approximation algorithm. Rocky Mt. J. Math. 2008, 38, 1863–1875.
3. Zhao, Y.Y.; Xu, D.K.; Duan, X.D.; Dai, L. Bayesian estimation of simplex distribution nonlinear mixed models for longitudinal data. Int. J. Appl. Math. Stat. 2014, 52, 1–10.
4. Bonat, W.H.; Lopes, J.E.; Shimakura, S.E.; Ribeiro, P.J., Jr. Likelihood analysis for a class of simplex mixed models. Chil. J. Stat. 2018, 8, 3–7.
5. Quintero, F.O.L. Sensitivity analysis for variance parameters in Bayesian simplex mixed models for proportional data. Commun. Stat. Simul. Comput. 2017, 46, 5212–5228.
6. Kleinman, K.P.; Ibrahim, J.G. A semiparametric Bayesian approach to generalized linear mixed models. Stat. Med. 1998, 17, 2579–2596.
7. Tang, N.S.; Duan, X.D. A semiparametric Bayesian approach to generalized partial linear mixed models for longitudinal data. Comput. Stat. Data Anal. 2012, 56, 4348–4365.
8. Tang, N.S.; Zhao, Y.Y. Semi-parametric Bayesian analysis of nonlinear reproductive dispersion mixed models for longitudinal data. J. Multivar. Anal. 2013, 115, 68–83.
9. Zhao, Y.Y.; Xu, D.K.; Duan, X.D.; Du, J. A semiparametric Bayesian approach to binomial distribution logistic mixed-effects models for longitudinal data. J. Stat. Comput. Simul. 2022, 92, 1438–1456.
10. Duan, X.D.; Fung, W.K.; Tang, N.S. Bayesian semiparametric reproductive dispersion mixed models for non-normal longitudinal data: Estimation and case influence analysis. J. Stat. Comput. Simul. 2017, 87, 1925–1939.
11. Hocking, R.R. The analysis and selection of variables in linear regression. Biometrics 1976, 32, 1–51.
12. Kass, R.E.; Raftery, A.E. Bayes factors. J. Am. Stat. Assoc. 1995, 90, 773–795.
13. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–723.
14. Spiegelhalter, D.J.; Best, N.; Carlin, B.P.; van der Linde, A. Bayesian measures of model complexity and fit (with discussion). J. R. Stat. Soc. Ser. B 2002, 64, 583–639.
15. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B 1996, 58, 267–288.
16. Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B 2005, 67, 301–320.
17. Zou, H. The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 2006, 101, 1418–1429.
18. Park, T.; Casella, G. The Bayesian lasso. J. Am. Stat. Assoc. 2008, 103, 681–686.
19. Guo, R.; Zhu, H.; Chow, S.M.; Ibrahim, J.G. Bayesian lasso for semiparametric structural equation models. Biometrics 2012, 68, 567–577.
20. Tang, A.M.; Zhao, X.; Tang, N.S. Bayesian variable selection and estimation in semiparametric joint models of multivariate longitudinal and survival data. Biom. J. 2017, 59, 57–78.
21. van Erp, S.; Oberski, D.L.; Mulder, J. Shrinkage priors for Bayesian penalized regression. J. Math. Psychol. 2019, 89, 31–50.
22. Barndorff-Nielsen, O.E.; Jørgensen, B. Some parametric models on the simplex. J. Multivar. Anal. 1991, 39, 106–116.
23. Ohlssen, D.I.; Sharples, L.D.; Spiegelhalter, D.J. Flexible random-effects models using Bayesian semiparametric models: Applications to institutional comparisons. Stat. Med. 2007, 26, 2088–2112.
24. Sethuraman, J. A constructive definition of Dirichlet priors. Stat. Sin. 1994, 4, 639–650.
25. Yang, M.G.; Dunson, D.B.; Baird, D. Semiparametric Bayes hierarchical models with mean and variance constraints. Comput. Stat. Data Anal. 2010, 54, 2172–2186.
26. Ishwaran, H.; Zarepour, M. Markov chain Monte Carlo in approximate Dirichlet and beta two-parameter process hierarchical models. Biometrika 2000, 87, 371–390.
27. Gelman, A. Inference and monitoring convergence. In Markov Chain Monte Carlo in Practice; Gilks, W.R., Richardson, S., Spiegelhalter, D.J., Eds.; Chapman and Hall: London, UK, 1996.
28. Meyers, S.M.; Ambler, J.S.; Tan, M.; Werner, J.C.; Huang, S.S. Variation of perfluoropropane disappearance after vitrectomy. Retina 1992, 4, 359–363.
29. Song, P.X.K.; Tan, M. Marginal models for longitudinal continuous proportional data. Biometrics 2000, 56, 496–502.
30. Song, P.X.K.; Qiu, Z.G.; Tan, M. Modelling heterogeneous dispersion in marginal models for longitudinal continuous proportional data. Biom. J. 2004, 46, 540–553.
Figure 1. EPSR values of all parameters against iteration numbers for a randomly selected replication in the first simulation (left panel) and second simulation (right panel).
Figure 2. Estimated densities versus true densities for random effects b i 1 and b i 2 in the first simulation.
Figure 3. Estimated densities versus true densities for random effects b i 1 and b i 2 in the second simulation.
Figure 4. EPSR values of all parameters against iteration numbers in the ophthalmology study.
Figure 5. Estimated densities for random effects b i 1 and b i 2 in the ophthalmology study.
Table 1. Bayesian estimates of parameters in the first and second simulation studies.

Par.    True     Simulation 1                  Simulation 2
                 Bias      RMS     F0 (%)      Bias      RMS     F0 (%)
β_0     −0.45    0.054     0.117   1.60        0.031     0.143   5.40
β_1     0.00     0.007     0.109   89.20       0.002     0.105   88.20
β_2     0.45     0.017     0.116   1.40        0.018     0.132   3.00
β_3     −0.00    −0.001    0.141   94.60       −0.003    0.153   93.40
β_4     0.45     −0.074    0.136   3.80        −0.051    0.146   8.20
σ²      1.00     0.006     0.070   —           0.009     0.073   —

Note: 'Bias' denotes the difference between the true value and the mean of the estimates over 500 replications; 'RMS' denotes the root mean square of the differences between the true values and their corresponding estimates over 500 replications; 'F0' denotes the proportion of the 500 replications in which the 95% credible interval for the regression parameter includes zero.
Table 2. Estimated means, standard deviations and RMSE quantiles of random effects in the first and second simulation studies.

Random          Simulation 1                       Simulation 2
effects         Mean    Est Mean  SD      Est SD   Mean    Est Mean  SD      Est SD
b_i1            0.000   0.040     0.894   0.831    0.000   −0.030    1.100   1.032
b_i2            0.000   0.063     1.000   0.903    0.000   0.031     1.100   0.995

                Quantiles of Simulation 1          Quantiles of Simulation 2
                5%      25%       75%              5%      25%       75%
RMSE(b̂_i)      0.031   0.039     0.050            0.079   0.084     0.095

Note: 'Mean' denotes the true empirical mean of the distribution; 'Est Mean' denotes the mean of the posterior samples; 'SD' denotes the true empirical standard deviation of the distribution; 'Est SD' denotes the standard deviation of the posterior samples.
Table 3. Bayesian estimates of parameters in the third and fourth simulation studies.

Par.    True     Simulation 3                  Simulation 4
                 Bias      RMS     F0 (%)      Bias      RMS     F0 (%)
β_0     −0.45    0.005     0.101   1.20        0.064     0.151   8.40
β_1     0.00     0.000     0.104   86.60       −0.009    0.130   84.40
β_2     0.45     −0.001    0.123   1.40        0.033     0.148   3.80
β_3     −0.00    0.005     0.166   89.20       0.007     0.185   90.00
β_4     0.45     −0.003    0.119   2.00        −0.068    0.143   11.40
σ²      1.00     0.093     0.134   —           0.006     0.078   —

Note: 'Bias' denotes the difference between the true value and the mean of the estimates over 500 replications; 'RMS' denotes the root mean square of the differences between the true values and their corresponding estimates over 500 replications; 'F0' denotes the proportion of the 500 replications in which the 95% credible interval for the regression parameter includes zero.
Table 4. Bayesian estimates (BEs), standard deviations (SDs) and 95% credible intervals (CIs) for parameters in the ophthalmology study.

Par.    BE       SD       CI
β_0     2.579    0.257    (2.200, 3.031)
β_1     0.093    0.172    (−0.232, 0.427)
β_2     −0.322   0.045    (−0.412, −0.241)
β_3     0.368    0.281    (−0.131, 0.697)
σ²      9.664    1.317    (7.406, 12.516)
Tang, A.; Duan, X.; Zhao, Y. Bayesian Variable Selection and Estimation in Semiparametric Simplex Mixed-Effects Models with Longitudinal Proportional Data. Entropy 2022, 24, 1466. https://doi.org/10.3390/e24101466
