A Logit Model for Bivariate Binary Responses

Purhadi, Purhadi; Fathurahman, M.

doi:10.3390/sym13020326


by Purhadi Purhadi ¹ and M. Fathurahman ²,*

¹ Department of Statistics, Faculty of Science and Data Analytics, Institut Teknologi Sepuluh Nopember, Surabaya 60111, Indonesia

² Department of Statistics, Mulawarman University, Samarinda 75123, Indonesia

* Author to whom correspondence should be addressed.

Symmetry 2021, 13(2), 326; https://doi.org/10.3390/sym13020326

Submission received: 24 January 2021 / Revised: 6 February 2021 / Accepted: 7 February 2021 / Published: 16 February 2021

(This article belongs to the Section Mathematics)


Abstract

This article provides a bivariate binary logit model and statistical inference procedures for parameter estimation and hypothesis testing. The bivariate binary logit (BBL) model is an extension of the binary logit model to two correlated binary responses. The BBL model responses were formed using a 2 × 2 contingency table, which follows a multinomial distribution. The maximum likelihood and Berndt–Hall–Hall–Hausman (BHHH) methods were used to estimate the parameters of the BBL model. Hypothesis testing of the BBL model comprises the simultaneous test and the partial test. The test statistics of both tests were determined using the maximum likelihood ratio test method. The likelihood ratio statistics of the simultaneous test and the partial test were asymptotically chi-square distributed with 3p degrees of freedom. The BBL model was applied to a real dataset, and the BBL model with the single covariate was better than the BBL model with multiple covariates.

Keywords:

logit model; bivariate binary responses; maximum likelihood; BHHH; maximum likelihood ratio test

1. Introduction

The logit model is a model that is often used for modeling categorical data in various research fields. Several studies have recently developed logit models for multiple correlated responses. McCullagh and Nelder [1] introduced a multivariate logistic transform used to construct logit models with two or more correlated responses. A multivariate logistic transform, including the numerical optimization methods, has been proposed in [2,3,4,5,6]. Lipsitz, Laird, and Harrington [7] examined the maximum likelihood (ML) method for binary data models, which connects the probability of success at each time point to a set of covariates. Liang, Zeger, and Qaqish [8] discussed the regression modeling of the marginal means of the responses using the generalized estimating equation approach, wherein there are dependencies between responses. An alternating logit model for jointly regressing the responses on covariates and modeling the dependencies among responses in the framework of pairwise odds ratios was proposed by [9]. Cessie and Houwelingen [10] modeled regression for correlated binary responses, in which the form of marginal response probabilities is the logit link function. Lang and Agresti [11] considered the model-fitting methods for analyzing the parameters simultaneously and parsimoniously. The ML estimator properties of the kappa coefficient in the bivariate binary logistic model using the small and moderate sized samples through Monte Carlo simulation were investigated by [12,13]. Molenberghs and Lesaffre [14] presented a simple generalized linear model formulation for marginal and association modeling of multivariate categorical data. The specifications of the association models in [15], in which the dependence ratios contrast with other models for a multivariate binary response that is specified by odds ratios or correlation coefficients, were employed by [16].

Studies have also proposed both the conditional and marginal models. A flexible conditional model for a multivariate binary response vector with covariates was examined by [17]. Islam, Chowdhury, and Briollais [18] developed a new simple procedure to construct the conditional and marginal models for the bivariate binary responses. The marginal and conditional probabilities of the responses were expressed as functions of covariates. El-Sayed, Islam, and Alzaid [19] provided the estimation and test procedures for association measures in the correlated binary data. A generalized approach using both the conditional and marginal models was demonstrated by [20,21]. The responses of the model have a bivariate Bernoulli distribution. The ML and Newton–Raphson methods were used to estimate the model’s parameters, whereas the likelihood ratio test method was used to test the parameters’ significance. The properties of the ML estimators of the regression parameters have also been investigated.

Other models have also been developed. Sinha, Laird, and Fitzmaurice [22] extended the univariate logit model in [23] to the case of a logit model for multivariate-correlated responses with missing covariates and observed auxiliary information. A robust model for misclassified correlated binary responses was described by [24]. O’Brien and Dunson [25] provided an exact Bayesian analysis of a marginal logistic model. The multivariate logistic regression model in a framework of the geographically-weighted regression was proposed by [26,27].

Corresponding to the previous studies, in this study we constructed a logit model, namely, the bivariate binary logit (BBL) model, which has two correlated binary responses. Following [1,2], the BBL model’s responses follow a multinomial distribution. Therefore, the ML method can be used to estimate the BBL model’s parameters. The ML estimator has no closed form, so it must be computed iteratively with a numerical optimization method. We used the Berndt–Hall–Hall–Hausman (BHHH) iterative method [28], which has not been used in the previous studies. The BHHH method can serve as an alternative numerical optimization method when the elements of the Hessian matrix are unavailable. Following [29], the maximum likelihood ratio test (MLRT) method was used to test the significance of parameters both simultaneously and partially. The performance of the BBL model was evaluated using an empirical study.

This article is organized as follows. In Section 2, we describe the BBL model specifically. Section 3 investigates the estimation of the BBL model’s parameters using the ML and BHHH methods. Hypothesis testing of the BBL model is discussed in Section 4. Section 5 demonstrates an application of the BBL model to real data. The conclusions are given in Section 6.

2. Bivariate Binary Logit Model

Bivariate binary logit (BBL) models belong to the family of multivariate logit models and are used to model the relationships between two correlated binary responses and one or more covariates. Let $Y_{1}$ and $Y_{2}$ be two binary responses and $y = {[\begin{matrix} Y_{11} & Y_{10} & Y_{01} & Y_{00} \end{matrix}]}^{T}$ be a vector of responses. The elements of $y$ have the probabilities $γ_{11}$, $γ_{10}$, $γ_{01}$, and $γ_{00}$, respectively, which are presented in Table 1.

According to Fathurahman, Purhadi, Sutikno, and Ratnasari [27], the BBL model responses in Table 1 follow a multinomial distribution. Therefore, the joint probability function of the responses can be defined as follows:

P (Y_{11} = y_{11}, Y_{10} = y_{10}, Y_{01} = y_{01}, Y_{00} = y_{00}) = \prod_{q = 0}^{1} \prod_{r = 0}^{1} γ_{q r}^{y_{q r}},

(1)

where $0 < γ_{q r} < 1$; $q, r = 0, 1$; $y_{q r} = 0, 1$; $y_{00} = 1 - y_{11} - y_{10} - y_{01}$; and $γ_{00} = 1 - γ_{11} - γ_{10} - γ_{01}$. Here $q$ and $r$ are the values of the responses, $y_{q r}$ is the value of $Y_{q r}$, which represents the elements of the vector of responses, and $γ_{q r} = P (Y_{1} = q, Y_{2} = r)$ is the joint probability of the responses. $γ_{1} = P (Y_{1} = 1)$ and $γ_{2} = P (Y_{2} = 1)$ are the marginal probabilities of $Y_{1}$ and $Y_{2}$, respectively.

Let $x = {[\begin{matrix} 1 & X_{1} & X_{2} & \dots & X_{p} \end{matrix}]}^{T}$ be the $(p + 1)$-dimensional vector of covariates. Then the BBL model is expressed as follows:

\begin{array}{l} τ_{1} (x) = \operatorname{logit} [γ_{1} (x)] = \log [\frac{γ_{1} (x)}{1 - γ_{1} (x)}] = x^{T} θ_{1}, \\ τ_{2} (x) = \operatorname{logit} [γ_{2} (x)] = \log [\frac{γ_{2} (x)}{1 - γ_{2} (x)}] = x^{T} θ_{2}, \\ τ_{3} (x) = \log [ψ (x)] = \log [\frac{γ_{11} (x) γ_{00} (x)}{γ_{10} (x) γ_{01} (x)}] = x^{T} θ_{3}, \end{array}

(2)

where $θ_{1}$, $θ_{2}$, and $θ_{3}$ are vectors of parameters, $γ_{1} (x)$ and $γ_{2} (x)$ are the marginal probabilities of the responses, and $ψ (x)$ is the odds ratio of the responses, which depends on the covariates and describes the dependence between the responses.

The vectors of parameters are symbolized by

\begin{array}{l} θ_{1} = {[\begin{matrix} θ_{01} & θ_{11} & θ_{21} & \dots & θ_{p 1} \end{matrix}]}^{T}, \\ θ_{2} = {[\begin{matrix} θ_{02} & θ_{12} & θ_{22} & \dots & θ_{p 2} \end{matrix}]}^{T}, \\ θ_{3} = {[\begin{matrix} θ_{03} & θ_{13} & θ_{23} & \dots & θ_{p 3} \end{matrix}]}^{T} . \end{array}

(3)

The marginal probabilities of responses are defined as follows:

\begin{array}{l} P (Y_{1} = 1 | x) = γ_{1} (x) = \frac{\exp (x^{T} θ_{1})}{1 + \exp (x^{T} θ_{1})}, \\ P (Y_{2} = 1 | x) = γ_{2} (x) = \frac{\exp (x^{T} θ_{2})}{1 + \exp (x^{T} θ_{2})} . \end{array}

(4)

The joint probability $γ_{11} (x)$ in Equation (2) is defined by

P (Y_{1} = 1, Y_{2} = 1 | x) = γ_{11} (x) = \begin{cases} \frac{1}{2} {(ψ (x) - 1)}^{- 1} (a - \sqrt{a^{2} + b}), & ψ (x) \neq 1 \\ γ_{1} (x) γ_{2} (x), & ψ (x) = 1 \end{cases},

(5)

where $a = 1 + (γ_{1} (x) + γ_{2} (x)) (ψ (x) - 1)$ and $b = 4 ψ (x) (1 - ψ (x)) γ_{1} (x) γ_{2} (x)$. If $ψ (x) = 1$, then the responses are independent [30].

Based on Table 1 and Equation (5), the probabilities $γ_{10} (x)$, $γ_{01} (x)$, and $γ_{00} (x)$ in Equation (2) are as follows:

\begin{array}{l} γ_{10} (x) = γ_{1} (x) - γ_{11} (x), \\ γ_{01} (x) = γ_{2} (x) - γ_{11} (x), \\ γ_{00} (x) = 1 - γ_{11} (x) - γ_{10} (x) - γ_{01} (x) = 1 - γ_{1} (x) - γ_{2} (x) + γ_{11} (x) . \end{array}

(6)
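
As an illustration of how Equations (2) and (4)–(6) fit together, the following minimal sketch maps a covariate vector and the three parameter vectors to the four cell probabilities. The function name `bbl_cell_probabilities`, the numerical tolerance used to detect $ψ (x) = 1$, and the example values are assumptions made for this sketch, not part of the article.

```python
import math

def bbl_cell_probabilities(x, theta1, theta2, theta3):
    """Return (g11, g10, g01, g00) for a covariate vector x that includes the leading 1."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    # Marginal probabilities, Equation (4).
    g1 = 1.0 / (1.0 + math.exp(-dot(x, theta1)))
    g2 = 1.0 / (1.0 + math.exp(-dot(x, theta2)))
    # Odds ratio, third line of Equation (2).
    psi = math.exp(dot(x, theta3))
    # Joint probability g11, Equation (5); the tolerance is an assumption of this sketch.
    if abs(psi - 1.0) < 1e-12:
        g11 = g1 * g2
    else:
        a = 1.0 + (g1 + g2) * (psi - 1.0)
        b = 4.0 * psi * (1.0 - psi) * g1 * g2
        g11 = 0.5 * (a - math.sqrt(a * a + b)) / (psi - 1.0)
    # Remaining cells, Equation (6).
    g10 = g1 - g11
    g01 = g2 - g11
    g00 = 1.0 - g1 - g2 + g11
    return g11, g10, g01, g00

# Hypothetical values with p = 1 covariate.
probs = bbl_cell_probabilities([1.0, 0.5], [0.2, -0.3], [-0.1, 0.4], [0.5, 0.2])
odds_ratio = probs[0] * probs[3] / (probs[1] * probs[2])
```

In this example the recovered `odds_ratio` should agree with $\exp (x^{T} θ_{3})$, which is a quick way to check that Equation (5) selects the intended root.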

3. Estimation of the BBL Model

The estimation of the BBL model’s parameters is one of the main results of this study. The BBL model in Equation (2) has $3 (p + 1)$ parameters, where $(p + 1)$ parameters show the dependencies among responses, and $2 (p + 1)$ parameters describe the relationships between responses and covariates. The BBL model’s parameters are denoted by $θ$ and expressed as

θ = {[\begin{matrix} θ_{1}^{T} & θ_{2}^{T} & θ_{3}^{T} \end{matrix}]}^{T},

(7)

where $θ_{1}^{T}$, $θ_{2}^{T}$, and $θ_{3}^{T}$ are given by Equation (3).

To obtain the estimator of the BBL model’s parameters in Equation (7), the ML method was employed. The ML estimator $\hat{θ}$ is the value of $θ$ that maximizes the likelihood function and, equivalently, the log-likelihood function. It can be obtained by taking the first partial derivatives of the log-likelihood function and equating them to zero.

Based on Equation (2), the likelihood equation contains the interdependence equations, which have a non-explicit form. Therefore, the ML estimator of the BBL model’s parameters was not obtained analytically. The ML estimator was approximated by the likelihood equation’s roots, which were obtained via an iterative process using the BHHH method. Determining the ML estimator of the BBL model’s parameters using the BHHH method needs the gradient vector and the Hessian matrix. In the following, we present Lemmas 1 and 2 for the gradient vector and Hessian matrix, respectively.

Lemma 1.

Let $y_{i} = {[\begin{matrix} Y_{1 i} & Y_{2 i} \end{matrix}]}^{T} = {[\begin{matrix} Y_{11 i} & Y_{10 i} & Y_{01 i} & Y_{00 i} \end{matrix}]}^{T}, i = 1, 2, \dots, n$ be a random sample of vectors that are mutually independent and identically multinomial distributed, denoted by $y_{i} \sim \mathrm{MULT} (1; γ_{11} (x_{i}), γ_{10} (x_{i}), γ_{01} (x_{i}), γ_{00} (x_{i}))$, where $γ_{11} (x_{i})$, $γ_{10} (x_{i})$, $γ_{01} (x_{i})$, and $γ_{00} (x_{i})$ are the probabilities of the random variables $Y_{11 i}$, $Y_{10 i}$, $Y_{01 i}$, and $Y_{00 i}$ and depend on the parameter $θ$. If the likelihood function of the BBL model is denoted by $ℒ (θ)$, with log-likelihood function $ℓ (θ) = \log ℒ (θ)$ and $θ$ as in Equation (7), then the gradient vector is

g (θ) = {[\begin{matrix} {[\frac{\partial ℓ (θ)}{\partial θ_{1}}]}^{T} & {[\frac{\partial ℓ (θ)}{\partial θ_{2}}]}^{T} & {[\frac{\partial ℓ (θ)}{\partial θ_{3}}]}^{T} \end{matrix}]}^{T},

(8)

where

\begin{array}{l} \frac{\partial ℓ (θ)}{\partial θ_{1}} = \sum_{i = 1}^{n} \frac{1}{Δ_{1 i}} x_{i} [y_{11 i} (\frac{γ_{01 i}}{γ_{2 i}}) + y_{10 i} (\frac{γ_{00 i}}{1 - γ_{2 i}}) - y_{01 i} (\frac{γ_{11 i}}{γ_{2 i}}) - y_{00 i} (\frac{γ_{10 i}}{1 - γ_{2 i}})], \\ \frac{\partial ℓ (θ)}{\partial θ_{2}} = \sum_{i = 1}^{n} \frac{1}{Δ_{1 i}} x_{i} [y_{11 i} (\frac{γ_{10 i}}{γ_{1 i}}) - y_{10 i} (\frac{γ_{11 i}}{γ_{1 i}}) + y_{01 i} (\frac{γ_{00 i}}{1 - γ_{1 i}}) - y_{00 i} (\frac{γ_{01 i}}{1 - γ_{1 i}})], \\ \frac{\partial ℓ (θ)}{\partial θ_{3}} = \sum_{i = 1}^{n} Δ_{2 i} x_{i} [\frac{y_{11 i}}{γ_{11 i}} - \frac{y_{10 i}}{γ_{10 i}} - \frac{y_{01 i}}{γ_{01 i}} + \frac{y_{00 i}}{γ_{00 i}}], \\ Δ_{1 i} = \frac{γ_{11 i} γ_{10 i} γ_{01 i} γ_{00 i}}{γ_{1 i} (1 - γ_{1 i}) γ_{2 i} (1 - γ_{2 i}) Δ_{2 i}}, \\ Δ_{2 i} = {(\frac{1}{γ_{11 i}} + \frac{1}{γ_{10 i}} + \frac{1}{γ_{01 i}} + \frac{1}{γ_{00 i}})}^{- 1} . \end{array}

Proof of Lemma 1.

Suppose that $(Y_{1 i}, Y_{2 i}) = (Y_{11 i}, Y_{10 i}, Y_{01 i}, Y_{00 i})$ is a random sample vector that is independently and identically multinomial distributed; then the joint probability is defined by

\begin{array}{ll} f (y_{i} | θ) & = P (Y_{11 i} = y_{11 i}, Y_{10 i} = y_{10 i}, Y_{01 i} = y_{01 i}, Y_{00 i} = y_{00 i}) \\ & = γ_{11 i}^{y_{11 i}} (x_{i}) γ_{10 i}^{y_{10 i}} (x_{i}) γ_{01 i}^{y_{01 i}} (x_{i}) γ_{00 i}^{y_{00 i}} (x_{i}) . \end{array}

(9)

As in Equation (9), the likelihood function is as follows:

\begin{matrix} ℒ (θ | y) & = \prod_{i = 1}^{n} f (y_{i} | θ) \\ = \prod_{i = 1}^{n} P (Y_{11 i} = y_{11 i}, Y_{10 i} = y_{10 i}, Y_{01 i} = y_{01 i}, Y_{00 i} = y_{00 i}) \\ = \prod_{i = 1}^{n} γ_{11 i}^{y_{11 i}} (x_{i}) γ_{10 i}^{y_{10 i}} (x_{i}) γ_{01 i}^{y_{01 i}} (x_{i}) γ_{00 i}^{y_{00 i}} (x_{i}) . \end{matrix}

(10)

For simplicity, let $γ_{q r}^{y_{q r}} (x_{i}) = γ_{q r i}^{y_{q r i}}$ for $q, r = 0, 1$; then the likelihood function in Equation (10) can be rewritten as

ℒ (θ) = \prod_{i = 1}^{n} (γ_{11 i}^{y_{11 i}} γ_{10 i}^{y_{10 i}} γ_{01 i}^{y_{01 i}} γ_{00 i}^{y_{00 i}}) .

(11)

To obtain the log-likelihood function of the BBL model, both sides of the likelihood function in Equation (11) were transformed by the natural logarithm, which gives

\begin{matrix} ℓ (θ) & = \log ℒ (θ) \\ = \sum_{i = 1}^{n} (y_{11 i} \log γ_{11 i} + y_{10 i} \log γ_{10 i} + y_{01 i} \log γ_{01 i} + y_{00 i} \log γ_{00 i}) . \end{matrix}

(12)
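
Equation (12) translates almost directly into code. The sketch below is only illustrative: the function name `bbl_log_likelihood`, the ordering of the four cells as (11, 10, 01, 00), and the toy data are assumptions, and the cell probabilities are taken as given rather than computed from $θ$.

```python
import math

def bbl_log_likelihood(y, gamma):
    """Equation (12): y[i] and gamma[i] are 4-tuples ordered (11, 10, 01, 00)."""
    total = 0.0
    for y_i, g_i in zip(y, gamma):
        # Only the observed cell (y = 1) contributes to the sum.
        total += sum(y_c * math.log(g_c) for y_c, g_c in zip(y_i, g_i) if y_c)
    return total

# Three hypothetical observations that share the same cell probabilities.
y = [(1, 0, 0, 0), (0, 0, 0, 1), (0, 1, 0, 0)]
gamma = [(0.4, 0.1, 0.2, 0.3)] * 3
log_lik = bbl_log_likelihood(y, gamma)  # log(0.4) + log(0.3) + log(0.1)
```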

The log-likelihood function in Equation (12) depends on the vector $θ$, which is $3 (p + 1)$-dimensional. Following the definition in Greene [31], the gradient vector of the log-likelihood function in Equation (12) is

g (θ) = {[\begin{matrix} {[\frac{\partial ℓ (θ)}{\partial θ_{1}}]}^{T} & {[\frac{\partial ℓ (θ)}{\partial θ_{2}}]}^{T} & {[\frac{\partial ℓ (θ)}{\partial θ_{3}}]}^{T} \end{matrix}]}^{T},

(13)

where the vector $θ$ is given by Equation (7).

Regarding the BBL model in Equation (2), we define the vector $τ = {[\begin{matrix} τ_{1} & τ_{2} & τ_{3} \end{matrix}]}^{T}$, where $τ_{1} = τ_{1} (x)$, $τ_{2} = τ_{2} (x)$, and $τ_{3} = τ_{3} (x)$. The vector of the joint probabilities of $y$ is defined by $γ = {[\begin{matrix} γ_{11} & γ_{10} & γ_{01} & γ_{00} \end{matrix}]}^{T}$. Furthermore, the derivative of $τ$ with respect to $γ$ is denoted by $\partial τ / \partial γ$. To obtain a square, and hence invertible, matrix $\partial τ / \partial γ$, suppose $τ_{0} = \log γ_{. .}$ with $γ_{. .} = \sum_{q = 0}^{1} \sum_{r = 0}^{1} γ_{q r}$; then the vector $τ$ is $τ = {[\begin{matrix} τ_{0} & τ_{1} & τ_{2} & τ_{3} \end{matrix}]}^{T}$. Thus, the matrix $\partial τ / \partial γ$ is

\frac{\partial τ}{\partial γ} = [\begin{matrix} 1 & 1 & 1 & 1 \\ \frac{1}{γ_{1}} & \frac{1}{γ_{1}} & - \frac{1}{1 - γ_{1}} & - \frac{1}{1 - γ_{1}} \\ \frac{1}{γ_{2}} & - \frac{1}{1 - γ_{2}} & \frac{1}{γ_{2}} & - \frac{1}{1 - γ_{2}} \\ \frac{1}{γ_{11}} & - \frac{1}{γ_{10}} & - \frac{1}{γ_{01}} & \frac{1}{γ_{00}} \end{matrix}] .

(14)

The inverse of the matrix $\partial τ / \partial γ$ in Equation (14) is as follows:

{(\frac{\partial τ}{\partial γ})}^{- 1} = [\begin{matrix} γ_{11} & \frac{γ_{11} γ_{01}}{γ_{2} Δ_{1}} & \frac{γ_{11} γ_{10}}{γ_{1} Δ_{1}} & Δ_{2} \\ γ_{10} & \frac{γ_{10} γ_{00}}{(1 - γ_{2}) Δ_{1}} & - \frac{γ_{11} γ_{10}}{γ_{1} Δ_{1}} & - Δ_{2} \\ γ_{01} & - \frac{γ_{11} γ_{01}}{γ_{2} Δ_{1}} & \frac{γ_{01} γ_{00}}{(1 - γ_{1}) Δ_{1}} & - Δ_{2} \\ γ_{00} & - \frac{γ_{10} γ_{00}}{(1 - γ_{2}) Δ_{1}} & - \frac{γ_{01} γ_{00}}{(1 - γ_{1}) Δ_{1}} & Δ_{2} \end{matrix}],

(15)

where $Δ_{1} = \frac{γ_{11} γ_{10} γ_{01} γ_{00}}{γ_{1} (1 - γ_{1}) γ_{2} (1 - γ_{2}) Δ_{2}}$ and $Δ_{2} = {(\frac{1}{γ_{11}} + \frac{1}{γ_{10}} + \frac{1}{γ_{01}} + \frac{1}{γ_{00}})}^{- 1}$.
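
A quick numerical check can confirm that the matrix in Equation (15) inverts the one in Equation (14). The sketch below multiplies the two for one arbitrary set of cell probabilities. The function names and the test values are assumptions, and the entry in row 4, column 3 of the inverse is taken with a negative sign, which is what the last term of Equation (18) implies.

```python
def dtau_dgamma(g11, g10, g01, g00):
    """Matrix of Equation (14), columns ordered (g11, g10, g01, g00)."""
    g1, g2 = g11 + g10, g11 + g01
    return [
        [1.0, 1.0, 1.0, 1.0],
        [1 / g1, 1 / g1, -1 / (1 - g1), -1 / (1 - g1)],
        [1 / g2, -1 / (1 - g2), 1 / g2, -1 / (1 - g2)],
        [1 / g11, -1 / g10, -1 / g01, 1 / g00],
    ]

def dgamma_dtau(g11, g10, g01, g00):
    """Matrix of Equation (15), with entry (4, 3) negative as implied by Equation (18)."""
    g1, g2 = g11 + g10, g11 + g01
    d2 = 1.0 / (1 / g11 + 1 / g10 + 1 / g01 + 1 / g00)
    d1 = g11 * g10 * g01 * g00 / (g1 * (1 - g1) * g2 * (1 - g2) * d2)
    return [
        [g11, g11 * g01 / (g2 * d1), g11 * g10 / (g1 * d1), d2],
        [g10, g10 * g00 / ((1 - g2) * d1), -g11 * g10 / (g1 * d1), -d2],
        [g01, -g11 * g01 / (g2 * d1), g01 * g00 / ((1 - g1) * d1), -d2],
        [g00, -g10 * g00 / ((1 - g2) * d1), -g01 * g00 / ((1 - g1) * d1), d2],
    ]

def matmul(a, b):
    """Plain 4 x 4 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

# Arbitrary probabilities that sum to one; the product should be the identity matrix.
cells = (0.4, 0.1, 0.2, 0.3)
product = matmul(dtau_dgamma(*cells), dgamma_dtau(*cells))
```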

The gradient vector of the log-likelihood function in Equation (12) can be written as

g (θ) = \frac{\partial ℓ (θ)}{\partial θ} .

(16)

In relation to Equations (13)–(15) and the chain rule of derivatives, the elements of the gradient vector in Equation (16) can be obtained as follows:

\begin{matrix} \frac{\partial ℓ (θ)}{\partial θ_{1}} & = \sum_{i = 1}^{n} (\frac{y_{11 i}}{γ_{11 i}} (\frac{\partial γ_{11 i}}{\partial θ_{1}}) + \frac{y_{10 i}}{γ_{10 i}} (\frac{\partial γ_{10 i}}{\partial θ_{1}}) + \frac{y_{01 i}}{γ_{01 i}} (\frac{\partial γ_{01 i}}{\partial θ_{1}}) + \frac{y_{00 i}}{γ_{00 i}} (\frac{\partial γ_{00 i}}{\partial θ_{1}})) \\ = \sum_{i = 1}^{n} \frac{1}{Δ_{1 i}} x_{i} (y_{11 i} (\frac{γ_{01 i}}{γ_{2 i}}) + y_{10 i} (\frac{γ_{00 i}}{1 - γ_{2 i}}) - y_{01 i} (\frac{γ_{11 i}}{γ_{2 i}}) - y_{00 i} (\frac{γ_{10 i}}{1 - γ_{2 i}})); \end{matrix}

(17)

\begin{matrix} \frac{\partial ℓ (θ)}{\partial θ_{2}} & = \sum_{i = 1}^{n} (\frac{y_{11 i}}{γ_{11 i}} (\frac{\partial γ_{11 i}}{\partial θ_{2}}) + \frac{y_{10 i}}{γ_{10 i}} (\frac{\partial γ_{10 i}}{\partial θ_{2}}) + \frac{y_{01 i}}{γ_{01 i}} (\frac{\partial γ_{01 i}}{\partial θ_{2}}) + \frac{y_{00 i}}{γ_{00 i}} (\frac{\partial γ_{00 i}}{\partial θ_{2}})) \\ = \sum_{i = 1}^{n} \frac{1}{Δ_{1 i}} x_{i} (y_{11 i} (\frac{γ_{10 i}}{γ_{1 i}}) - y_{10 i} (\frac{γ_{11 i}}{γ_{1 i}}) + y_{01 i} (\frac{γ_{00 i}}{1 - γ_{1 i}}) - y_{00 i} (\frac{γ_{01 i}}{1 - γ_{1 i}})); \end{matrix}

(18)

\begin{matrix} \frac{\partial ℓ (θ)}{\partial θ_{3}} & = \sum_{i = 1}^{n} (\frac{y_{11 i}}{γ_{11 i}} (\frac{\partial γ_{11 i}}{\partial θ_{3}}) + \frac{y_{10 i}}{γ_{10 i}} (\frac{\partial γ_{10 i}}{\partial θ_{3}}) + \frac{y_{01 i}}{γ_{01 i}} (\frac{\partial γ_{01 i}}{\partial θ_{3}}) + \frac{y_{00 i}}{γ_{00 i}} (\frac{\partial γ_{00 i}}{\partial θ_{3}})) \\ = \sum_{i = 1}^{n} Δ_{2 i} x_{i} (\frac{y_{11 i}}{γ_{11 i}} - \frac{y_{10 i}}{γ_{10 i}} - \frac{y_{01 i}}{γ_{01 i}} + \frac{y_{00 i}}{γ_{00 i}}), \end{matrix}

(19)

where $Δ_{1 i}$ and $Δ_{2 i}$, for $i = 1, 2, \dots, n$, are defined as in Equation (15). □
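
The closed-form gradient of Lemma 1 can be checked against central finite differences of the log-likelihood. The following sketch does this for a single observation. All function names, the step size `h`, and the parameter values are assumptions chosen for illustration.

```python
import math

def cell_probs(x, th):
    """th = (theta1, theta2, theta3); returns (g11, g10, g01, g00) via Equations (4)-(6)."""
    eta = [sum(a * b for a, b in zip(x, t)) for t in th]
    g1 = 1 / (1 + math.exp(-eta[0]))
    g2 = 1 / (1 + math.exp(-eta[1]))
    psi = math.exp(eta[2])
    if abs(psi - 1) < 1e-12:
        g11 = g1 * g2
    else:
        a = 1 + (g1 + g2) * (psi - 1)
        b = 4 * psi * (1 - psi) * g1 * g2
        g11 = 0.5 * (a - math.sqrt(a * a + b)) / (psi - 1)
    return g11, g1 - g11, g2 - g11, 1 - g1 - g2 + g11

def log_lik(x, y, th):
    """Log-likelihood contribution of one observation, Equation (12)."""
    return sum(yc * math.log(gc) for yc, gc in zip(y, cell_probs(x, th)))

def score(x, y, th):
    """Analytic gradient of one observation, Equations (17)-(19)."""
    g11, g10, g01, g00 = cell_probs(x, th)
    y11, y10, y01, y00 = y
    g1, g2 = g11 + g10, g11 + g01
    d2 = 1 / (1 / g11 + 1 / g10 + 1 / g01 + 1 / g00)
    d1 = g11 * g10 * g01 * g00 / (g1 * (1 - g1) * g2 * (1 - g2) * d2)
    s1 = (y11 * g01 / g2 + y10 * g00 / (1 - g2) - y01 * g11 / g2 - y00 * g10 / (1 - g2)) / d1
    s2 = (y11 * g10 / g1 - y10 * g11 / g1 + y01 * g00 / (1 - g1) - y00 * g01 / (1 - g1)) / d1
    s3 = d2 * (y11 / g11 - y10 / g10 - y01 / g01 + y00 / g00)
    return [[s * xj for xj in x] for s in (s1, s2, s3)]

def numeric_score(x, y, th, h=1e-6):
    """Central finite-difference approximation of the same gradient."""
    out = []
    for k in range(3):
        row = []
        for j in range(len(x)):
            up = [list(t) for t in th]
            dn = [list(t) for t in th]
            up[k][j] += h
            dn[k][j] -= h
            row.append((log_lik(x, y, up) - log_lik(x, y, dn)) / (2 * h))
        out.append(row)
    return out

# Hypothetical observation in cell (1, 0) with p = 1 covariate.
x = [1.0, 0.5]
th = ([0.2, -0.3], [-0.1, 0.4], [0.5, 0.2])
y = (0, 1, 0, 0)
analytic, numeric = score(x, y, th), numeric_score(x, y, th)
max_gap = max(abs(analytic[k][j] - numeric[k][j]) for k in range(3) for j in range(2))
```

A small `max_gap` indicates that the analytic expressions and the log-likelihood agree at the chosen point; it does not by itself validate the formulas everywhere.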

Lemma 2.

If the log-likelihood function of the BBL model is $ℓ (θ)$ and $θ$ is the vector of the BBL model’s parameters, then the Hessian matrix of $ℓ (θ)$ is

H (θ) = - \frac{1}{n} [g^{T} (θ) g (θ)],

(20)

where $n$ is the sample size.

Proof of Lemma 2.

The BBL model’s parameters $θ$ and the log-likelihood function $ℓ (θ)$ were given in Equations (7) and (12), respectively. Based on Lemma 1, the gradient vector of the log-likelihood function $ℓ (θ)$ is $g (θ)$. According to Greene [31], the Hessian matrix can be obtained by the Berndt–Hall–Hall–Hausman (BHHH) method. In this method, the Hessian matrix depends on the gradient vector [31], as shown below:

E [g (θ)] = 0, \quad \operatorname{Var} [g (θ)] = E [g^{T} (θ) g (θ)] .

(21)

Meanwhile, the gradient vector and the Hessian matrix are associated with the information matrix, which can be expressed by

I (θ) = - H (θ), \quad \operatorname{Var} [g (θ)] = n I (θ) .

(22)

The information matrix in Equation (22) is also referred to as the Fisher information matrix [32]. Based on Equations (21) and (22), the Hessian matrix is

H (θ) = - \frac{1}{n} [g^{T} (θ) g (θ)] .

(23)

Regarding Lemmas 1 and 2, an iteration process can be carried out using the BHHH method. Following [33], the BHHH algorithm in this study is as follows:

1. Determine the initial value ${\hat{θ}}^{(0)} = {[\begin{matrix} {\hat{θ}}_{1}^{T (0)} & {\hat{θ}}_{2}^{T (0)} & {\hat{θ}}_{3}^{T (0)} \end{matrix}]}^{T}$ .
2. Determine the tolerance value $ε$ used to stop the BHHH iteration process.
3. Start the BHHH iteration process using the formula:

${\hat{θ}}^{(t + 1)} = {\hat{θ}}^{(t)} - H^{- 1} ({\hat{θ}}^{(t)}) g ({\hat{θ}}^{(t)}), t = 0, 1, 2, \dots, T .$

(24)
4. The iteration stops at the $T$ -th iteration if the convergence condition $\| {\hat{θ}}^{(T + 1)} - {\hat{θ}}^{(T)} \| \leq ε$ is satisfied. The parameter estimates are the values obtained in the last iteration.
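
The steps above can be sketched for the simplest case, an intercept-only BBL model fitted to grouped counts of the four cells. This is a hedged illustration rather than the authors' implementation. It reads the BHHH approximation in its usual form, with $- H$ replaced by the sum of outer products of the per-observation scores, and it adds step halving, which is not part of the listed steps. The cell probabilities use an algebraic rearrangement of Equation (5), and the function names and counts are hypothetical.

```python
import math

def cell_probs(t1, t2, t3):
    """Cell probabilities (g11, g10, g01, g00) for an intercept-only BBL model."""
    g1 = 1 / (1 + math.exp(-t1))
    g2 = 1 / (1 + math.exp(-t2))
    psi = math.exp(t3)
    a = 1 + (g1 + g2) * (psi - 1)
    b = 4 * psi * (1 - psi) * g1 * g2
    # Rearranged Equation (5): same root, and it reduces to g1 * g2 when psi = 1.
    g11 = 2 * psi * g1 * g2 / (a + math.sqrt(a * a + b))
    return g11, g1 - g11, g2 - g11, 1 - g1 - g2 + g11

def scores(g):
    """Score vector of one observation in each cell, Equations (17)-(19) with x_i = 1."""
    g11, g10, g01, g00 = g
    g1, g2 = g11 + g10, g11 + g01
    d2 = 1 / sum(1 / v for v in g)
    d1 = g11 * g10 * g01 * g00 / (g1 * (1 - g1) * g2 * (1 - g2) * d2)
    return [
        (g01 / (g2 * d1), g10 / (g1 * d1), d2 / g11),
        (g00 / ((1 - g2) * d1), -g11 / (g1 * d1), -d2 / g10),
        (-g11 / (g2 * d1), g00 / ((1 - g1) * d1), -d2 / g01),
        (-g10 / ((1 - g2) * d1), -g01 / ((1 - g1) * d1), d2 / g00),
    ]

def solve3(m, v):
    """Solve the 3 x 3 system m d = v by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(c + 1, 3):
            f = a[r][c] / a[c][c]
            a[r] = [u - f * w for u, w in zip(a[r], a[c])]
    d = [0.0] * 3
    for r in (2, 1, 0):
        d[r] = (a[r][3] - sum(a[r][k] * d[k] for k in range(r + 1, 3))) / a[r][r]
    return d

def log_lik(theta, counts):
    return sum(n * math.log(g) for n, g in zip(counts, cell_probs(*theta)))

def bhhh(counts, theta=(0.0, 0.0, 0.0), eps=1e-8, max_iter=200):
    theta = list(theta)
    for _ in range(max_iter):
        s = scores(cell_probs(*theta))
        grad = [sum(n * sc[j] for n, sc in zip(counts, s)) for j in range(3)]
        # Outer-product-of-gradients matrix standing in for -H.
        opg = [[sum(n * sc[j] * sc[k] for n, sc in zip(counts, s)) for k in range(3)]
               for j in range(3)]
        step = solve3(opg, grad)
        # Step halving (an addition of this sketch) avoids decreasing the log-likelihood.
        scale, base = 1.0, log_lik(theta, counts)
        while scale > 1e-8 and log_lik([t + scale * d for t, d in zip(theta, step)], counts) < base:
            scale /= 2
        new_theta = [t + scale * d for t, d in zip(theta, step)]
        converged = max(abs(u - t) for u, t in zip(new_theta, theta)) <= eps
        theta = new_theta
        if converged:
            break
    return theta

# Hypothetical counts for the cells (11, 10, 01, 00).
counts = (40, 10, 20, 30)
theta_hat = bhhh(counts)
fitted = cell_probs(*theta_hat)
```

Because the intercept-only model is saturated, the fitted cell probabilities should approach the observed proportions, which gives a simple sanity check on the iteration. With covariates, the same loop applies with the scores multiplied by $x_{i}$ and summed over observations.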

Akaike’s information criterion (AIC) and the Bayesian information criterion (BIC) were used to determine the best model in this study. The AIC and BIC values can be obtained by

AIC = - 2 ℓ (\hat{θ}) + 2 (p + 1),

(25)

BIC = - 2 ℓ (\hat{θ}) + \log (n) (p + 1),

(26)

where $ℓ (\hat{θ})$ is the log-likelihood value at the parameter estimates, $p$ is the number of covariates, and $n$ is the sample size. The best BBL model is the one with the smallest AIC and BIC values. □
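
For illustration, the two criteria can be computed as follows. The sketch uses the conventional AIC penalty of two per parameter; the function name `aic_bic`, the argument `k` (the parameter count entering the penalty, written as $p + 1$ in Equations (25) and (26)), and the numbers are assumptions.

```python
import math

def aic_bic(log_lik, k, n):
    """Return (AIC, BIC) for a maximized log-likelihood, k penalized parameters, sample size n."""
    aic = -2.0 * log_lik + 2.0 * k
    bic = -2.0 * log_lik + math.log(n) * k
    return aic, bic

# Hypothetical fit: log-likelihood -128 with p = 1 covariate (k = p + 1 = 2) and n = 100.
aic, bic = aic_bic(log_lik=-128.0, k=2, n=100)
```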

4. Hypothesis Testing of the BBL Model

Hypothesis testing of the BBL model comprises the simultaneous test and the partial test, which assess the significance of the BBL model’s parameters jointly and individually, respectively. Both tests in this study were carried out using the maximum likelihood ratio test (MLRT) method.

The hypotheses of the simultaneous test are as follows:

\begin{array}{l} H_{0} : θ_{1 h} = θ_{2 h} = \dots = θ_{p h} = 0, \quad h = 1, 2, 3 \\ H_{1} : \text{at least one } θ_{g h} \neq 0, \quad g = 1, 2, \dots, p . \end{array}

(27)

In the following we present a lemma used to determine the likelihood ratio (LR) statistic, the distribution of the LR statistic, and the rejection region of the simultaneous test.

Lemma 3.

If $θ$ is the BBL model’s parameter and $\hat{θ}$ is the ML estimator of $θ$, then:

(a) The LR statistic of the simultaneous test is $G_{1}^{2} = 2 (L (\hat{θ}) - L ({\hat{θ}}^{*}))$ , where ${\hat{θ}}^{*}$ is the ML estimator over the parameter space under the null hypothesis and $\hat{θ}$ is the ML estimator over the parameter space under the population.
(b) The LR statistic is asymptotically chi-square distributed, that is, $G_{1}^{2} = 2 (L (\hat{θ}) - L ({\hat{θ}}^{*})) \overset{d}{\to} χ_{v_{1}}^{2}$ as $n \to \infty$ .
(c) The rejection region at the significance level $α$ is $G_{1}^{2} > χ_{(α, v_{1})}^{2}$ .

Proof of Lemma 3.

(a) The first step in obtaining the test statistic for the hypothesis in Equation (27) is to define the parameter space under the null hypothesis, denoted by $Ω_{01} = \{θ_{01}, θ_{02}, θ_{03}\}$. The corresponding likelihood function is

L (Ω_{01}) = \prod_{i = 1}^{n} ({(γ_{11 i}^{*})}^{y_{11 i}} {(γ_{10 i}^{*})}^{y_{10 i}} {(γ_{01 i}^{*})}^{y_{01 i}} {(γ_{00 i}^{*})}^{y_{00 i}}) .

(28)

Let ${\hat{θ}}^{*} = {[\begin{matrix} {\hat{θ}}_{01} & {\hat{θ}}_{02} & {\hat{θ}}_{03} \end{matrix}]}^{T}$ be the ML estimator that maximizes the likelihood function $L (Ω_{01})$. The ML estimator of $θ^{*}$ can be obtained by using Lemmas 1 and 2. Therefore, the maximum of the likelihood function $L (Ω_{01})$ is

L ({\hat{Ω}}_{01}) = \prod_{i = 1}^{n} ({({\hat{γ}}_{11 i}^{*})}^{y_{11 i}} {({\hat{γ}}_{10 i}^{*})}^{y_{10 i}} {({\hat{γ}}_{01 i}^{*})}^{y_{01 i}} {({\hat{γ}}_{00 i}^{*})}^{y_{00 i}}) .

(29)

where

\begin{array}{l} {\hat{γ}}_{11 i}^{*} = \begin{cases} \frac{1}{2} {(ψ_{1 i} - 1)}^{- 1} (a_{1 i} - \sqrt{a_{1 i}^{2} + b_{1 i}}), & ψ_{1 i} \neq 1 \\ {\hat{γ}}_{1 i}^{*} {\hat{γ}}_{2 i}^{*}, & ψ_{1 i} = 1 \end{cases}, \\ a_{1 i} = 1 + ({\hat{γ}}_{1 i}^{*} + {\hat{γ}}_{2 i}^{*}) (ψ_{1 i} - 1), \\ b_{1 i} = 4 ψ_{1 i} (1 - ψ_{1 i}) {\hat{γ}}_{1 i}^{*} {\hat{γ}}_{2 i}^{*}, \\ ψ_{1 i} = {\hat{γ}}_{11 i}^{*} {\hat{γ}}_{00 i}^{*} / ({\hat{γ}}_{10 i}^{*} {\hat{γ}}_{01 i}^{*}), \\ {\hat{γ}}_{1 i}^{*} = \exp ({\hat{θ}}_{01}) / (1 + \exp ({\hat{θ}}_{01})), \\ {\hat{γ}}_{2 i}^{*} = \exp ({\hat{θ}}_{02}) / (1 + \exp ({\hat{θ}}_{02})), \\ {\hat{γ}}_{10 i}^{*} = {\hat{γ}}_{1 i}^{*} - {\hat{γ}}_{11 i}^{*}, \\ {\hat{γ}}_{01 i}^{*} = {\hat{γ}}_{2 i}^{*} - {\hat{γ}}_{11 i}^{*}, \\ {\hat{γ}}_{00 i}^{*} = 1 - {\hat{γ}}_{1 i}^{*} - {\hat{γ}}_{2 i}^{*} + {\hat{γ}}_{11 i}^{*} . \end{array}

The parameters under the population are $Ω_{11} = \{θ_{g h}, g = 0, 1, \dots, p; h = 1, 2, 3\}$, and the likelihood function is

L (Ω_{11}) = \prod_{i = 1}^{n} (γ_{11 i}^{y_{11 i}} γ_{10 i}^{y_{10 i}} γ_{01 i}^{y_{01 i}} γ_{00 i}^{y_{00 i}}) .

(30)

Suppose the ML estimator that maximizes the likelihood function $L (Ω_{11})$ is $\hat{θ} = {[\begin{matrix} {\hat{θ}}_{1}^{T} & {\hat{θ}}_{2}^{T} & {\hat{θ}}_{3}^{T} \end{matrix}]}^{T}$, which is obtained using Lemmas 1 and 2. Therefore, the maximum of the likelihood function $L (Ω_{11})$ is

L ({\hat{Ω}}_{11}) = \prod_{i = 1}^{n} ({\hat{γ}}_{11 i}^{y_{11 i}} {\hat{γ}}_{10 i}^{y_{10 i}} {\hat{γ}}_{01 i}^{y_{01 i}} {\hat{γ}}_{00 i}^{y_{00 i}}) .

(31)

where

{\hat{γ}}_{11 i} = \begin{cases} \frac{1}{2} {(ψ_{2 i} - 1)}^{- 1} (a_{2 i} - \sqrt{a_{2 i}^{2} + b_{2 i}}), & ψ_{2 i} \neq 1 \\ {\hat{γ}}_{1 i} {\hat{γ}}_{2 i}, & ψ_{2 i} = 1 \end{cases},

with $a_{2 i} = 1 + ({\hat{γ}}_{1 i} + {\hat{γ}}_{2 i}) (ψ_{2 i} - 1)$, $b_{2 i} = 4 ψ_{2 i} (1 - ψ_{2 i}) {\hat{γ}}_{1 i} {\hat{γ}}_{2 i}$, $ψ_{2 i} = {\hat{γ}}_{11 i} {\hat{γ}}_{00 i} / ({\hat{γ}}_{10 i} {\hat{γ}}_{01 i})$, ${\hat{γ}}_{1 i} = \exp (x_{i}^{T} {\hat{θ}}_{1}) / (1 + \exp (x_{i}^{T} {\hat{θ}}_{1}))$, ${\hat{γ}}_{2 i} = \exp (x_{i}^{T} {\hat{θ}}_{2}) / (1 + \exp (x_{i}^{T} {\hat{θ}}_{2}))$, ${\hat{γ}}_{10 i} = {\hat{γ}}_{1 i} - {\hat{γ}}_{11 i}$, ${\hat{γ}}_{01 i} = {\hat{γ}}_{2 i} - {\hat{γ}}_{11 i}$, and ${\hat{γ}}_{00 i} = 1 - {\hat{γ}}_{1 i} - {\hat{γ}}_{2 i} + {\hat{γ}}_{11 i}$.

The likelihood ratio (LR) statistic of the hypothesis in Equation (27) is

Λ = \frac{L ({\hat{Ω}}_{01})}{L ({\hat{Ω}}_{11})} .

(32)

With regard to Equations (29) and (31), the LR statistic in Equation (32) can be written as

Λ = \frac{\prod_{i = 1}^{n} ({({\hat{γ}}_{11 i}^{*})}^{y_{11 i}} {({\hat{γ}}_{10 i}^{*})}^{y_{10 i}} {({\hat{γ}}_{01 i}^{*})}^{y_{01 i}} {({\hat{γ}}_{00 i}^{*})}^{y_{00 i}})}{\prod_{i = 1}^{n} ({\hat{γ}}_{11 i}^{y_{11 i}} {\hat{γ}}_{10 i}^{y_{10 i}} {\hat{γ}}_{01 i}^{y_{01 i}} {\hat{γ}}_{00 i}^{y_{00 i}})} .

(33)

The LR statistic in Equation (33) is unwieldy to evaluate directly. Therefore, to simplify the calculation, the LR statistic is transformed into the equivalent form

Λ^{- 2} = {[\frac{L ({\hat{Ω}}_{01})}{L ({\hat{Ω}}_{11})}]}^{- 2} = {[\frac{L ({\hat{Ω}}_{11})}{L ({\hat{Ω}}_{01})}]}^{2} .

(34)

The LR statistic in Equation (34) is also transformed using the natural logarithm. Thus, the formula of the LR statistic is

G_{1}^{2} = - 2 \log Λ = - 2 \log [\frac{L ({\hat{Ω}}_{01})}{L ({\hat{Ω}}_{11})}] = 2 (L (\hat{θ}) - L ({\hat{θ}}^{*})),

(35)

where $L (\hat{θ}) = \log L ({\hat{Ω}}_{11})$ and $L ({\hat{θ}}^{*}) = \log L ({\hat{Ω}}_{01})$.

(b) Suppose the ML estimator under the population is partitioned as

\hat{θ} = {[\begin{matrix} {\hat{θ}}_{11}^{T} & {\hat{θ}}_{12}^{T} \end{matrix}]}^{T},

where ${\hat{θ}}_{11} = {[\begin{matrix} {\hat{θ}}_{1 h} & {\hat{θ}}_{2 h} & \dots & {\hat{θ}}_{p h} \end{matrix}]}^{T}$, $h = 1, 2, 3$, and ${\hat{θ}}_{12} = {[\begin{matrix} {\hat{θ}}_{01} & {\hat{θ}}_{02} & {\hat{θ}}_{03} \end{matrix}]}^{T}$.

The ML estimator under the null hypothesis, which contains the known parameter values, is partitioned as

{\hat{θ}}^{*} = {[\begin{matrix} θ_{01}^{* T} & {\hat{θ}}_{02}^{* T} \end{matrix}]}^{T},

where $θ_{01 (3 p \times 1)}^{*} = {[\begin{matrix} 0 & 0 & \dots & 0 \end{matrix}]}^{T}$ and ${\hat{θ}}_{02}^{*} = {[\begin{matrix} {\hat{θ}}_{001} & {\hat{θ}}_{002} & {\hat{θ}}_{003} \end{matrix}]}^{T}$, with the true parameter partitioned as $θ^{*} = {[\begin{matrix} θ_{01}^{* T} & θ_{12}^{T} \end{matrix}]}^{T}$.

The hypothesis in Equation (27) can be rewritten as

\begin{array}{l} H_{0} : θ_{11} = θ_{01}^{*} \\ H_{1} : θ_{11} \neq θ_{01}^{*} . \end{array}

(36)

The LR statistic in Equation (35) can be specified by

G_{1}^{2} = 2 (L (\hat{θ}) - L ({\hat{θ}}^{*})) = 2 (L (\hat{θ}) - L (θ^{*})) - 2 (L ({\hat{θ}}^{*}) - L (θ^{*})) .

(37)

The function $L (θ^{*})$ is approximated by a second-order Taylor expansion around $\hat{θ}$, which gives

L (θ^{*}) \approx L (\hat{θ}) + g (\hat{θ}) (θ^{*} - \hat{θ}) - \frac{1}{2} {(θ^{*} - \hat{θ})}^{T} I (\hat{θ}) (θ^{*} - \hat{θ}),

where

g (\hat{θ}) = {\frac{\partial L (θ)}{\partial θ} |}_{θ = \hat{θ}}, I (\hat{θ}) = - {\frac{\partial^{2} L (θ)}{\partial θ \partial θ^{T}} |}_{θ = \hat{θ}} .

Since $g (\hat{θ}) = 0$, we have

2 (L (\hat{θ}) - L (θ^{*})) \approx {(\hat{θ} - θ^{*})}^{T} I (\hat{θ}) (\hat{θ} - θ^{*}) .

(38)

Analogously to the previous step, the function $L (θ^{*})$ is approximated by a second-order Taylor expansion around ${\hat{θ}}^{*}$, which is

L (θ^{*}) \approx L ({\hat{θ}}^{*}) + g (\hat{θ}) (θ^{*} - {\hat{θ}}^{*}) - \frac{1}{2} {(θ^{*} - {\hat{θ}}^{*})}^{T} I (\hat{θ}) (θ^{*} - {\hat{θ}}^{*}),

or it can be written as

2 (L ({\hat{θ}}^{*}) - L (θ^{*})) \approx {({\hat{θ}}^{*} - θ^{*})}^{T} I (\hat{θ}) ({\hat{θ}}^{*} - θ^{*}) .

(39)

Following Equations (38) and (39), the LR statistic in Equation (37) can be rewritten as

G_{1}^{2} \approx {(\hat{θ} - θ^{*})}^{T} I (\hat{θ}) (\hat{θ} - θ^{*}) - {({\hat{θ}}^{*} - θ^{*})}^{T} I (\hat{θ}) ({\hat{θ}}^{*} - θ^{*}) .

(40)

Suppose the Fisher information matrix and its inverse are partitioned as follows:

I {(\hat{θ})}_{(3 p + 3) \times (3 p + 3)} = [\begin{matrix} I_{11 (3 p \times 3 p)} & I_{12 (3 p \times 3)} \\ I_{21 (3 \times 3 p)} & I_{22 (3 \times 3)} \end{matrix}],

and

{[I (\hat{θ})]}_{(3 p + 3) \times (3 p + 3)}^{- 1} = [\begin{matrix} I_{11 (3 p \times 3 p)} & I_{12 (3 p \times 3)} \\ I_{21 (3 \times 3 p)} & I_{22 (3 \times 3)} \end{matrix}] .

Following the concept of the conditional distribution, given $θ_{11} = θ_{01}^{*}$, ${\hat{θ}}_{11}$, and ${\hat{θ}}_{12}$, we have

{\hat{θ}}_{02}^{*} = {\hat{θ}}_{12} - I_{21} I_{11}^{- 1} ({\hat{θ}}_{11} - θ_{01}^{*}) .

(41)

Simplifying Equation (41) by manipulating the partitioned Fisher information matrix yields

{\hat{θ}}_{02}^{*} = {\hat{θ}}_{12} + I_{22}^{- 1} I_{21} ({\hat{θ}}_{11} - θ_{01}^{*}) .

(42)

Since $({\hat{θ}}^{*} - θ^{*}) = (0, {\hat{θ}}_{02}^{*} - θ_{12})$ and ${\hat{θ}}_{02}^{*} - θ_{12} = {\hat{θ}}_{12} - θ_{12} + I_{22}^{- 1} I_{21} ({\hat{θ}}_{11} - θ_{01}^{*})$, we have

\begin{array}{ll} {({\hat{θ}}^{*} - θ^{*})}^{T} I (\hat{θ}) ({\hat{θ}}^{*} - θ^{*}) & = {({\hat{θ}}_{02}^{*} - θ_{12})}^{T} I_{22} ({\hat{θ}}_{02}^{*} - θ_{12}) \\ & = {[\begin{matrix} {\hat{θ}}_{11} - θ_{01}^{*} \\ {\hat{θ}}_{12} - θ_{12} \end{matrix}]}^{T} [\begin{matrix} I_{12} I_{22}^{- 1} I_{21} & I_{12} \\ I_{21} & I_{22} \end{matrix}] [\begin{matrix} {\hat{θ}}_{11} - θ_{01}^{*} \\ {\hat{θ}}_{12} - θ_{12} \end{matrix}] . \end{array}

The LR statistic in Equation (40) can be formulated as

G_{1}^{2} \approx {({\hat{θ}}_{11} - θ_{01}^{*})}^{T} (I_{11} - I_{12} I_{22}^{- 1} I_{21}) ({\hat{θ}}_{11} - θ_{01}^{*}) = {({\hat{θ}}_{11} - θ_{01}^{*})}^{T} I_{11}^{- 1} ({\hat{θ}}_{11} - θ_{01}^{*}) .

(43)
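
The step from Equation (41) to Equation (42) and the second equality in Equation (43) both rest on the standard inverse of a partitioned matrix. As a sketch, write the blocks of ${[I (\hat{θ})]}^{- 1}$ with superscripts, $I^{11}$, $I^{12}$, $I^{21}$, $I^{22}$ (a notation assumed here only to keep them apart from the blocks of $I (\hat{θ})$ itself), and assume that $I_{22}$ is invertible. Then

```latex
\begin{aligned}
I^{11} &= (I_{11} - I_{12} I_{22}^{-1} I_{21})^{-1}, \\
I^{21} &= - I_{22}^{-1} I_{21} I^{11}, \\
I^{21} (I^{11})^{-1} &= - I_{22}^{-1} I_{21}.
\end{aligned}
```

The third line turns the minus sign of Equation (41), where the blocks are those of the inverse, into the plus sign of Equation (42), where the blocks are those of $I (\hat{θ})$. The first line gives $I_{11} - I_{12} I_{22}^{- 1} I_{21} = {(I^{11})}^{- 1}$, which Equation (43) writes as $I_{11}^{- 1}$ with $I_{11}$ read as the leading block of the inverse.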

Considering the asymptotic normality of the ML estimator under the regularity conditions [31], the distribution of the partitioned vector is

[\begin{matrix} {\hat{θ}}_{11} - θ_{01}^{*} \\ {\hat{θ}}_{12} - θ_{12} \end{matrix}] \overset{d}{\to} N (0, {[I (θ)]}^{- 1} \equiv [\begin{matrix} I_{11} & I_{12} \\ I_{21} & I_{22} \end{matrix}]), n \to \infty .

Therefore, the LR statistic is obtained as follows:

({\hat{θ}}_{11} - θ_{11}) \overset{d}{\to} N (0, I_{11}), n \to \infty, {[I_{11}]}^{- 1 / 2} ({\hat{θ}}_{11} - θ_{11}) \overset{d}{\to} N (0, I_{v_{1}}), n \to \infty, G_{1}^{2} = {({\hat{θ}}_{11} - θ_{01}^{*})}^{T} I_{11}^{- 1} ({\hat{θ}}_{11} - θ_{01}^{*}) \overset{d}{\to} χ_{v_{1}}^{2}, n \to \infty .

(44)

The LR statistic in Equation (44) has an asymptotic chi-square distribution with

v_{1}

degrees of freedom.

v_{1}

is the difference between the number of parameters in the parameter set under the population and the number in the parameter set under the null hypothesis.
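The matrix step behind Equation (43) is that the Schur complement of the nuisance block in the Fisher information equals the inverse of the upper-left block of the inverse information. The following sketch checks this identity with exact rational arithmetic. The 3 × 3 matrix `info`, the split into two tested parameters and one nuisance parameter, and all helper names are hypothetical and chosen only for illustration; they are not taken from the BBL model.

```python
from fractions import Fraction


def inverse(m):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(m)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Swap in a row with a non-zero pivot, then scale the pivot to 1.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate the pivot column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def block(m, rows, cols):
    """Extract the sub-matrix with the given row and column indices."""
    return [[m[i][j] for j in cols] for i in rows]


# Hypothetical symmetric positive definite "information" matrix.
info = [[Fraction(v) for v in row] for row in ([4, 1, 2], [1, 3, 1], [2, 1, 5])]
tested, nuisance = [0, 1], [2]

i11 = block(info, tested, tested)
i12 = block(info, tested, nuisance)
i21 = block(info, nuisance, tested)
i22 = block(info, nuisance, nuisance)

# Schur complement I_11 - I_12 I_22^{-1} I_21.
correction = matmul(matmul(i12, inverse(i22)), i21)
schur = [[a - c for a, c in zip(ra, rc)] for ra, rc in zip(i11, correction)]

# Upper-left block of the inverse of the full matrix, inverted again.
upper_left_of_inverse = block(inverse(info), tested, tested)
middle_matrix = inverse(upper_left_of_inverse)

print(schur == middle_matrix)
```

Using `Fraction` avoids floating-point tolerance questions, so the two matrices can be compared with plain equality.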

(c) Determining the rejection region or the critical region of the null hypothesis in Equation (27) requires the MLRT method, where the null hypothesis is rejected when

Λ < c

,

0 < c \leq 1

, and where

Λ

is given in Equation (32) and

c

is a constant. Let

α

be the significance level for

0 < α < 1,

and

0 < c_{α} \leq 1

be a constant; the

c_{α}

value depends on the significance level of

α

and satisfies

P_{θ \in Ω_{01}} (Λ < c_{α}) = α

. Based on the definition of the significance level of

α

, we have

\begin{array}{rl} α & = P_{θ \in Ω_{01}} (Λ < c_{α}) \\ & = P (- 2 \log Λ > - 2 \log c_{α}) \\ & = P (2 [L (\hat{θ}) - L ({\hat{θ}}^{*})] > c) \\ & = P (G_{1}^{2} > c), \end{array}

(45)

where

G_{1}^{2}

is the LR statistic, which has an asymptotic chi-square distribution with

v_{1}

degrees of freedom. The value of the constant

c = - 2 \log c_{α}

in Equation (45) is

χ_{(α, v_{1})}^{2}

and

\int_{c}^{\infty} f (w) d w = α,

where

f (w) = w^{(v_{1} / 2) - 1} e^{- w / 2} / (Γ (v_{1} / 2) 2^{v_{1} / 2})

is the probability density function of the chi-square distribution with

v_{1}

degrees of freedom. Therefore, the rejection region at the significance level of

α

is

G_{1}^{2} > χ_{(α, v_{1})}^{2}

.
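The rejection rule above needs the upper chi-square quantile and, for reporting, the upper-tail probability of the observed LR statistic. The following is a minimal sketch that uses only the Python standard library and assumes integer degrees of freedom; the function names `chi2_sf` and `chi2_critical` are illustrative and are not part of the article. The numerical inputs in the usage lines are the values reported in Section 5.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(W > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Start from a closed form: Q(1, z) for even df, Q(1/2, z) for odd df.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence Q(a + 1, z) = Q(a, z) + z^a e^{-z} / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q


def chi2_critical(alpha, df):
    """Solve chi2_sf(c, df) = alpha for c by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Critical values used in Section 5 (5% level; 9 and 3 degrees of freedom).
crit_simultaneous = chi2_critical(0.05, 9)
crit_partial = chi2_critical(0.05, 3)

# Decision for the simultaneous test and p-value for the X2 partial test.
reject_simultaneous = 99.739 > crit_simultaneous
p_value_x2 = chi2_sf(90.2309, 3)
print(crit_simultaneous, crit_partial, reject_simultaneous, p_value_x2)
```

The critical values can be compared with the tabulated 16.919 and 7.8147 quoted in Section 5, and `p_value_x2` with the p-value reported for the covariate in the first block of Table 8.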

The last test is the partial test, which identifies the covariates that individually have a significant effect on the responses. Its procedure follows that of the simultaneous test. The hypothesis of the partial test is

H_{0} : θ_{11} = θ_{12} = θ_{13} = 0, \quad H_{1} : \text{at least one } θ_{1 h} \neq 0, \; h = 1, 2, 3 .

(46)

The parameter set under the null hypothesis for each covariate is

Ω_{02} = \{θ_{01}, θ_{02}, θ_{03}\}

. The parameter set under the population for each covariate is

Ω_{12} = \{θ_{01}, θ_{11}, θ_{02}, θ_{12}, θ_{03}, θ_{13}\}

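. Analogously to the proof of Theorem 1, with

I^{11}

denoting the block of the inverse Fisher information matrix that corresponds to the tested parameters, the LR statistic for the hypothesis in Equation (46) is

G_{2}^{2} = {({\hat{θ}}_{11} - θ_{01}^{*})}^{T} {(I^{11})}^{- 1} ({\hat{θ}}_{11} - θ_{01}^{*}) \overset{d}{\to} χ_{v_{2}}^{2}, n \to \infty .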

(47)

The LR statistic in Equation (47) has an asymptotic chi-square distribution with

v_{2}

degrees of freedom. The rejection region at the significance level

α

is

G_{2}^{2} > χ_{(α, v_{2})}^{2}

. □

5. Application

The BBL model was applied to model the factors influencing the status of the human development index (HDI) and public health development index (PHDI) of regencies/municipalities in Kalimantan, Indonesia, in 2018. The HDI is an index measured from four components of the essential dimensions of human development: life expectancy, the average length of schooling, expected length of schooling, and adjusted per-capita income. Life expectancy represents an indicator of health, the average length of schooling and the expected length of schooling represent educational indicators, and adjusted per capita income represents an economic indicator [34]. The PHDI is an index that measures the health of the regencies/municipalities and provinces in the Republic of Indonesia [35].

The HDI status data and covariates’ data were collected from the National Bureau of Statistics of the Republic of Indonesia, whereas the PHDI data were collected from the Republic of Indonesia’s Ministry of Health. The variables in this study consist of two responses and five covariates. The responses are the HDI status and the PHDI status of regencies/municipalities, denoted by

Y_{1}

and

Y_{2}

. The covariates are the economic growth (

X_{1}

), the net enrollment rate of the junior high school (

X_{2}

), the percentage of people that have the minimum level of education in junior high school (

X_{3}

), the number of doctors per 1000 people (

X_{4}

), and the number of public health centers (

X_{5}

). The regencies and municipalities’ HDI status has four categories: low HDI, medium HDI, high HDI, and very high HDI [34]. Regencies/municipalities in Kalimantan, Indonesia, in 2018, had HDI in the medium and high categories. Therefore, the HDI status (

Y_{1}

) has two categories: the medium HDI coded by 0 and the high HDI coded by 1.

Meanwhile, the Ministry of Health of the Republic of Indonesia classifies regencies/municipalities’ health status based on the PHDI into two categories. Regencies/municipalities with a low PHDI have health problems, and vice versa [36]. Therefore, the PHDI status (

Y_{2}

) has two categories: the regencies/municipalities with low PHDI values are coded by 0, and the regencies/municipalities with high PHDI values are coded by 1. This study’s observation unit is the regency/municipality. Five provinces in Kalimantan, Indonesia were used (2018 data), including 47 regencies and nine municipalities. Therefore, the sample size is 56.

The descriptive statistics of the responses HDI status (

Y_{1}

) and PHDI status (

Y_{2}

), consisting of observed frequencies, are presented in Table 2.

Table 2 shows that 20 regencies/municipalities had high HDI and PHDI, and six regencies/municipalities had high HDI and low PHDI. We also see that three regencies/municipalities had medium HDI and high PHDI. Finally, 27 regencies/municipalities had medium HDI and low PHDI. The HDI status (

Y_{1}

) and PHDI status (

Y_{2}

) of regencies/municipalities are displayed in Figure 1.

The descriptive statistics of the responses show that the majority of regencies/municipalities in Kalimantan, Indonesia, in 2018, had medium HDI and low PHDI. The descriptive statistics of the covariates are summarized in Table 3.

The HDI status (

Y_{1}

) and PHDI status (

Y_{2}

) are correlated. Based on the observed frequencies in Table 2, the odds ratio (OR) value of HDI status (

Y_{1}

) and PHDI status (

Y_{2}

) was 30 with a 95% confidence interval of 6.6826 ≤ OR ≤ 134.6783. Because the whole interval lies well above 1, the responses are strongly positively associated. We also applied dependence tests to the responses HDI status (

Y_{1}

) and PHDI status (

Y_{2}

), provided in Table 4.
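The odds ratio and its interval can be recomputed from the four cell counts. The sketch below assumes the interval is the usual Wald interval on the log odds ratio, which the article does not state explicitly; the cell labels follow the layout of Table 2, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Observed frequencies as laid out in Table 2: rows Y1 = 1, 0; columns Y2 = 1, 0.
n11, n10 = 20, 3
n01, n00 = 6, 27

odds_ratio = (n11 * n00) / (n10 * n01)

# Wald standard error of log(OR): square root of the summed reciprocal counts.
se_log_or = math.sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)
z = NormalDist().inv_cdf(0.975)  # two-sided 95% normal quantile

lower = math.exp(math.log(odds_ratio) - z * se_log_or)
upper = math.exp(math.log(odds_ratio) + z * se_log_or)
print(f"OR = {odds_ratio:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

Under this assumption the computed limits agree closely with the interval quoted in the text, which suggests that a Wald interval was used.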

Three statistical tests were used to assess the dependence between HDI status (

Y_{1}

) and PHDI status (

Y_{2}

). The results in Table 4 show that all of the test statistics were greater than the chi-square critical value (i.e.,

χ_{(0.05, 1)}^{2} = 3.8415

) and that all p-values were less than the significance level (i.e., α = 0.05). Therefore, the conclusion was to reject the null hypothesis of independence (

H_{0}

), and the HDI status (

Y_{1}

) and PHDI status (

Y_{2}

) are dependent. Based on the OR value and the dependence tests, the HDI status (

Y_{1}

) and PHDI status (

Y_{2}

) are appropriate for the BBL model.
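The three dependence statistics in Table 4 can be recomputed from the observed frequencies. This is a minimal sketch using the standard formulas for a 2 × 2 table (Pearson, Pearson with Yates' continuity correction, and the likelihood ratio statistic); the variable and function names are illustrative, and the cell layout follows Table 2.

```python
import math

# Observed 2 x 2 frequencies as laid out in Table 2.
observed = [[20, 3], [6, 27]]
n = sum(map(sum, observed))
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# Expected frequencies under independence: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]
cells = [(observed[i][j], expected[i][j]) for i in range(2) for j in range(2)]

pearson = sum((o - e) ** 2 / e for o, e in cells)
yates = sum((abs(o - e) - 0.5) ** 2 / e for o, e in cells)
likelihood_ratio = 2 * sum(o * math.log(o / e) for o, e in cells)


def p_value_df1(stat):
    """For one degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(stat / 2))


for name, stat in [("Pearson", pearson), ("Yates", yates), ("LR", likelihood_ratio)]:
    print(f"{name}: statistic = {stat:.4f}, p-value = {p_value_df1(stat):.4e}")
```

Each statistic is compared with the critical value 3.8415 for one degree of freedom at the 5% level.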

The variance inflation factor (VIF) was used to check the covariates for multicollinearity. The VIF values of all covariates in Table 5 are less than ten, which indicates no serious multicollinearity among the covariates. Therefore, all covariates can be used in the BBL model.

The parameters of the BBL model were estimated by the ML method using the BHHH algorithm. Table 6 provides the bias values and the numbers of BHHH iterations of the parameter estimation process for the BBL model with the single and multiple covariates.
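To illustrate how a BHHH iteration with a tolerance and an iteration cap behaves, the sketch below applies the BHHH update to an ordinary univariate binary logit model rather than to the BBL likelihood. Everything here is an assumption made for illustration: the toy data `X` and `Y` are hypothetical (not the Kalimantan data), the stopping rule is the Euclidean norm of the parameter change, and step halving is used as a simple line search. The tolerance 1 × 10⁻⁵ and the cap of 1000 iterations mirror the values that appear in Table 6.

```python
import math

# Hypothetical toy data: one covariate, one binary response.
X = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
Y = [0, 0, 1, 0, 1, 0, 1, 1]


def loglik(beta):
    """Log-likelihood of a binary logit model with intercept and slope."""
    total = 0.0
    for x, y in zip(X, Y):
        eta = beta[0] + beta[1] * x
        total += y * eta - math.log1p(math.exp(eta))
    return total


def scores(beta):
    """Per-observation score vectors g_i = (y_i - p_i) * (1, x_i)."""
    out = []
    for x, y in zip(X, Y):
        p = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x)))
        out.append((y - p, (y - p) * x))
    return out


def bhhh(beta, tol=1e-5, max_iter=1000):
    for iteration in range(1, max_iter + 1):
        g = scores(beta)
        g0 = sum(s[0] for s in g)
        g1 = sum(s[1] for s in g)
        # The sum of outer products of the scores replaces the negative Hessian.
        b00 = sum(s[0] * s[0] for s in g)
        b01 = sum(s[0] * s[1] for s in g)
        b11 = sum(s[1] * s[1] for s in g)
        det = b00 * b11 - b01 * b01
        # Direction d solves B d = g (2 x 2 system, Cramer's rule).
        d0 = (b11 * g0 - b01 * g1) / det
        d1 = (b00 * g1 - b01 * g0) / det
        # Step halving keeps the log-likelihood from decreasing.
        step, current = 1.0, loglik(beta)
        while step > 1e-10 and loglik((beta[0] + step * d0, beta[1] + step * d1)) < current:
            step /= 2.0
        new = (beta[0] + step * d0, beta[1] + step * d1)
        change = math.hypot(new[0] - beta[0], new[1] - beta[1])
        beta = new
        if change < tol:
            return beta, iteration, True
    return beta, max_iter, False


estimate, iterations, converged = bhhh((0.0, 0.0))
print(estimate, iterations, converged)
```

A run that returns `converged = False` after `max_iter` iterations corresponds to the situation of a model that reaches the iteration cap without meeting the tolerance.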

The BBL models with the single covariate economic growth (

X_{1}

) and with the single covariate public health centers (

X_{5}

) reached 1000 iterations without meeting the convergence tolerance (Table 6). Therefore, both covariates, economic growth (

X_{1}

) and public health centers (

X_{5}

), were not used in the BBL model. Based on Table 6, the BBL model for modeling the factors that affect the HDI status and PHDI status of regencies/municipalities in Kalimantan, Indonesia, in 2018 was built from the remaining three covariates.

Table 7 displays the ML estimates of the BBL model with multiple covariates (i.e.,

X_{2}

,

X_{3}

,

X_{4}

), giving the parameter estimates, the LR statistic of the simultaneous test (

G_{1}^{2}

), the degrees of freedom (df), and the p-value.

The LR statistic value in Table 7 is 99.739, and the p-value is 1.7685 × 10⁻²¹ (p < 0.001). The chi-square critical value with nine degrees of freedom at the 5% significance level is 16.919. Because the LR statistic exceeds the critical value and the p-value is below the 5% significance level, the null hypothesis was rejected. We conclude that the net enrollment rate of the junior high school, the percentage of people that have the minimum level of education in junior high school, and the number of doctors per 1000 people jointly had a significant effect on the HDI status and the PHDI status of regencies/municipalities in Kalimantan, Indonesia, in 2018. The BBL model for the HDI status and the PHDI status of regencies/municipalities can be written as follows:

\begin{array}{l} {\hat{τ}}_{1} (x) = - 0.0034 - 0.1916 X_{2} + 0.0449 X_{3} + 0.0013 X_{4}, \\ {\hat{τ}}_{2} (x) = - 0.0016 - 0.1403 X_{2} + 0.0395 X_{3} + 0.0011 X_{4}, \\ {\hat{τ}}_{3} (x) = - 0.0023 - 0.1440 X_{2} - 0.1071 X_{3} + 0.0002 X_{4} . \end{array}

The partial test based on the MLRT method was used to identify the covariates that individually affect the HDI status and the PHDI status of regencies/municipalities. Table 8 describes the BBL models with a single covariate, giving the parameter estimates, the LR statistic value (

G_{2}^{2}

), the degrees of freedom (df), and the p-value.

For each covariate (the net enrollment rate of the junior high school, the percentage of people that have the minimum level of education in junior high school, and the number of doctors per 1000 people; Table 8), the LR statistic was greater than the chi-square critical value of 7.8147 (three degrees of freedom, 5% significance level), and the p-value was less than the 5% significance level. Therefore, we concluded that the net enrollment rate of the junior high school, the percentage of people that have the minimum level of education in junior high school, and the number of doctors per 1000 people each individually had a significant influence on the HDI status and the PHDI status of regencies/municipalities in Kalimantan, Indonesia, in 2018.

The BBL model with a single covariate (e.g.,

X_{4}

) for the HDI status and the PHDI status of regencies/municipalities can be expressed as follows:

\begin{array}{l} {\hat{τ}}_{1} (x) = - 15.3744 + 12.5080 X_{4}, \\ {\hat{τ}}_{2} (x) = - 17.0178 + 9.3722 X_{4}, \\ {\hat{τ}}_{3} (x) = 4.2464 + 4.1612 X_{4} . \end{array}
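To turn fitted predictors into the joint probabilities of Table 1, a link between the three predictors and the cell probabilities is needed. The sketch below assumes the common bivariate logistic parametrization in the style of Dale [30]: the first two predictors are the logits of the marginal probabilities and the third is the log odds ratio. This reading is an assumption, since the defining equations are not repeated in this section, and the function names are illustrative. The check uses the observed proportions of Table 2 rather than fitted values.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def joint_probabilities(tau1, tau2, tau3):
    """Map (tau1, tau2, tau3) to (g11, g10, g01, g00) under the assumed parametrization."""
    g1 = 1.0 / (1.0 + math.exp(-tau1))   # marginal P(Y1 = 1)
    g2 = 1.0 / (1.0 + math.exp(-tau2))   # marginal P(Y2 = 1)
    psi = math.exp(tau3)                 # odds ratio between Y1 and Y2
    if abs(psi - 1.0) < 1e-12:
        g11 = g1 * g2                    # independence case
    else:
        # Root of the quadratic that fixes the odds ratio given both margins.
        a = 1.0 + (g1 + g2) * (psi - 1.0)
        disc = a * a - 4.0 * psi * (psi - 1.0) * g1 * g2
        g11 = (a - math.sqrt(disc)) / (2.0 * (psi - 1.0))
    return g11, g1 - g11, g2 - g11, 1.0 - g1 - g2 + g11


# Check against Table 2: margins 23/56 and 26/56, odds ratio 30.
probs = joint_probabilities(logit(23 / 56), logit(26 / 56), math.log(30))
print([round(56 * p, 6) for p in probs])
```

Multiplying the returned probabilities by the sample size should recover the four cell counts of Table 2, which serves as a consistency check of the mapping.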

The AIC and BIC methods in Equations (25) and (26) were used for the evaluation of the BBL model’s performance. The AIC and BIC values of the BBL models are shown in Table 9.

Each BBL model with a single covariate in Table 9 has smaller AIC and BIC values than the BBL model with the multiple covariates, and the model whose single covariate is the net enrollment rate of the junior high school has the smallest values. Therefore, the BBL model with the single covariate is the better model for the relationships between the responses (i.e., the HDI status and the PHDI status) and the covariates (i.e., the net enrollment rate of the junior high school, the percentage of people that have the minimum level of education in junior high school, and the number of doctors per 1000 people) of regencies/municipalities in Kalimantan, Indonesia, in 2018. Furthermore, each of these three covariates individually had a significant effect on the HDI status and the PHDI status of regencies/municipalities in Kalimantan, Indonesia, in 2018.
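The AIC and BIC columns of Table 9 can be cross-checked against each other. The sketch below assumes the standard definitions, AIC = −2ℓ + 2k and BIC = −2ℓ + k log n, where ℓ is the maximized log-likelihood, with k = 6 parameters for a single-covariate model, k = 12 for the three-covariate model, and n = 56. Equations (25) and (26) are not repeated in this section, so these forms are an assumption; the function names are illustrative.

```python
import math


def aic(minus_two_loglik, k):
    return minus_two_loglik + 2 * k


def bic(minus_two_loglik, k, n):
    return minus_two_loglik + k * math.log(n)


n = 56  # number of regencies/municipalities

# Model label -> (AIC reported in Table 9, number of parameters k).
reported = {
    "X2": (512.9107, 6),
    "X3": (546.2378, 6),
    "X4": (546.1944, 6),
    "X2 X3 X4": (552.6286, 12),
}

implied_bic = {}
for name, (aic_value, k) in reported.items():
    minus_two_loglik = aic_value - 2 * k   # back out -2 log-likelihood from the AIC
    implied_bic[name] = bic(minus_two_loglik, k, n)
    print(name, round(implied_bic[name], 4))
```

If the assumed definitions match those of the article, the printed values reproduce the BIC column of Table 9.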

This work suggests two directions for future research. First, the logit models in this research are limited to two responses, so an extension of the BBL model to more than two responses should be considered. Second, other numerical optimization methods that may improve the performance of the BBL model should also be considered.

6. Conclusions

The BBL model is an extension of the binary logit model and was constructed using the multinomial distribution. The ML method was applied to obtain the estimator of the BBL model’s parameters. Because the ML estimator does not have a closed form, an iterative numerical procedure is needed, and the BHHH method was used. Hypothesis testing of the BBL model includes the simultaneous test and the partial test, both of which were carried out using the MLRT method. The LR statistics of the simultaneous test and the partial test are asymptotically chi-square distributed. The BBL model was applied to model the factors affecting the HDI status and the PHDI status of regencies/municipalities in Kalimantan, Indonesia, in 2018. The BBL model with the single covariate performed better than the BBL model with multiple covariates. The factors significantly affecting the HDI status and the PHDI status of regencies/municipalities in Kalimantan, Indonesia, in 2018, were the net enrollment rate of junior high schools, the percentage of people who have the minimum level of education in junior high school, and the number of doctors per 1000 people.

Author Contributions

Conceptualization, P.P. and M.F.; methodology, P.P.; software, M.F.; validation, P.P.; formal analysis, P.P. and M.F.; investigation, P.P. and M.F.; resources, P.P. and M.F.; data curation, M.F.; writing—original draft preparation, M.F.; writing—review and editing, P.P. and M.F.; visualization, M.F.; supervision, P.P.; project administration, P.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the Ministry of Research and Technology/National Agency for Research and Innovation of the Republic of Indonesia with grant number 3/E1/KP.PTNBH/2020.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset in this article was derived from the National Bureau of Statistics of the Republic of Indonesia, https://www.bps.go.id/ and the Ministry of Health of the Republic of Indonesia, https://www.litbang.kemkes.go.id/buku-ipkm-2018/.

Acknowledgments

We would like to thank the reviewers and the editors for constructive and helpful comments.

Conflicts of Interest

The authors declare no conflict of interest.

Nomenclature

AIC	Akaike’s information criterion
BHHH	Berndt–Hall–Hall–Hausman
BIC	Bayesian information criterion
HDI	Human development index
LR	Likelihood ratio
ML	Maximum likelihood
MLRT	Maximum likelihood ratio test
MULT	Multinomial distribution
PHDI	Public health development index
VIF	Variance inflation factor

References

1. McCullagh, P.; Nelder, J.A. Generalized Linear Models, 2nd ed.; Chapman and Hall: London, UK, 1989.
2. Glonek, G.F.V.; McCullagh, P. Multivariate logistic models. J. R. Stat. Soc. B 1995, 57, 533–546.
3. Kauermann, G. A note on multivariate logistic models for contingency tables. Austral. J. Stat. 1997, 39, 261–276.
4. Glonek, G.F.V. A class of regression models for multivariate categorical responses. Biometrika 1996, 83, 15–28.
5. Bergsma, W.P.; Rudas, T. Marginal models for categorical data. Ann. Stat. 2002, 30, 140–159.
6. Qaqish, B.F.; Ivanova, A. Multivariate logistic models. Biometrika 2006, 93, 1011–1017.
7. Lipsitz, S.R.; Laird, N.M.; Harrington, D.P. Maximum likelihood regression methods for paired binary data. Stat. Med. 1990, 9, 1517–1525.
8. Liang, K.Y.; Zeger, S.L.; Qaqish, B. Multivariate regression analyses for categorical data (with discussion). J. R. Stat. Soc. B 1992, 54, 3–40.
9. Carey, V.; Zeger, S.L.; Diggle, P. Modelling multivariate binary data with alternating logistic regressions. Biometrika 1993, 80, 517–526.
10. Cessie, S.L.; Houwelingen, J.C. Logistic regression for correlated binary data. Appl. Stat. 1994, 43, 95–108.
11. Lang, J.B.; Agresti, A. Simultaneously modeling joint and marginal distributions of multivariate categorical responses. J. Am. Stat. Assoc. 1994, 89, 625–632.
12. Shoukri, M.M.; Martin, S.W.; Mian, I.U.H. Maximum likelihood estimation of the kappa coefficient from models of matched binary responses. Stat. Med. 1995, 14, 83–99.
13. Shoukri, M.M.; Mian, I.U.H. Maximum likelihood estimation of the kappa coefficient from bivariate logistic regression. Stat. Med. 1996, 15, 1409–1419.
14. Molenberghs, G.; Lesaffre, E. Marginal modelling of multivariate categorical data. Stat. Med. 1999, 18, 2237–2255.
15. Ekholm, A.; Smith, P.W.F.; McDonald, J.W. Marginal regression analysis of a multivariate binary response. Biometrika 1995, 82, 847–854.
16. Ekholm, A.; McDonald, J.W.; Smith, P.W.F. Association models for a multivariate binary response. Biometrics 2000, 56, 712–718.
17. Joe, H.; Liu, Y. A model for a multivariate binary response with covariates based on compatible conditionally specified logistic regressions. Stat. Probab. Lett. 1996, 31, 113–120.
18. Islam, M.A.; Chowdhury, R.I.; Briollais, L. A bivariate binary model for testing dependence in outcomes. Bull. Malays. Math. Sci. Soc. 2012, 35, 845–858.
19. El-Sayed, A.M.; Islam, M.A.; Alzaid, A.A. Estimation and test of measures of association for correlated binary data. Bull. Malays. Math. Sci. Soc. 2013, 36, 985–1008.
20. Islam, M.A.; Alzaid, A.A.; Chowdhury, R.I.; Sultan, K.S. A generalized bivariate Bernoulli model with covariate dependence. J. Appl. Stat. 2013, 40, 1064–1075.
21. Bhuyan, M.J.; Islam, M.A.; Rahman, M.S. A bivariate Bernoulli model for analyzing malnutrition. Health Serv. Outcomes Res. Method 2018, 18, 109–127.
22. Sinha, S.S.; Laird, N.M.; Fitzmaurice, G.M. Multivariate logistic regression with incomplete covariate and auxiliary information. J. Multivar. Anal. 2010, 101, 2389–2397.
23. Horton, N.J.; Laird, N.M. Maximum likelihood analysis of logistic regression models with incomplete covariate data and auxiliary information. Biometrics 2001, 57, 34–42.
24. Chen, Z.; Yi, G.Y.; Wu, C. Marginal methods for correlated binary data with misclassified responses. Biometrika 2011, 98, 647–662.
25. O’Brien, S.M.; Dunson, D.B. Bayesian multivariate logistic regression. Biometrics 2004, 60, 739–746.
26. Fathurahman, M.; Purhadi; Sutikno; Ratnasari, V. Hypothesis testing of geographically weighted bivariate logistic regression. J. Phys. Conf. Ser. 2019, 1417, 012008.
27. Fathurahman, M.; Purhadi; Sutikno; Ratnasari, V. Geographically Weighted Multivariate Logistic Regression Model and Its Application. Abstr. Appl. Anal. 2020, 2020, 8353481.
28. Berndt, E.K.; Hall, B.H.; Hall, R.E.; Hausman, J.A. Estimation and inference in nonlinear structural models. Ann. Econ. Soc. Meas. 1974, 3, 653–665.
29. Mardalena, S.; Purhadi, P.; Purnomo, J.D.T.; Prastyo, D.D. Parameter estimation and hypothesis testing of multivariate Poisson inverse Gaussian regression. Symmetry 2020, 12, 1738.
30. Dale, J.R. Global cross-ratio models for bivariate, discrete, ordered responses. Biometrics 1986, 42, 909–917.
31. Greene, W.H. Econometric Analysis, 6th ed.; Pearson Education: Cranbury, NJ, USA, 2008.
32. Pawitan, Y. In All Likelihood: Statistical Modelling and Inference Using Likelihood, 1st ed.; Clarendon Press: Oxford, UK, 2001.
33. Rahayu, A.; Purhadi; Sutikno; Prastyo, D.D. Multivariate gamma regression: Parameter estimation, hypothesis testing, and its application. Symmetry 2020, 12, 813.
34. National Bureau of Statistics. Human Development Index 2018; BPS: Jakarta, Indonesia, 2019.
35. Ministry of Health. Public Health Development Index; LPB: Jakarta, Indonesia, 2019.
36. Ministry of Health. General Guidelines for Dealing with Areas with Health Problems; LPB: Jakarta, Indonesia, 2010.

Figure 1. The proportions of the responses with certain HDI status (

Y_{1}

) and PHDI status (

Y_{2}

).

n Y_{00}

is the number of regencies/municipalities that had medium HDI and low PHDI;

n Y_{01}

is the number of regencies/municipalities that had medium HDI and high PHDI;

n Y_{10}

is the number of regencies/municipalities that had high HDI and low PHDI;

n Y_{11}

is the number of regencies/municipalities that had high HDI and PHDI.

Table 1. Probabilities for the responses.

$Y_{1}$	$Y_{2}$		Total
$Y_{1}$	$Y_{2} = 1$	$Y_{2} = 0$	Total
$Y_{1} = 1$	$γ_{11}$	$γ_{10}$	$γ_{1}$
$Y_{1} = 0$	$γ_{01}$	$γ_{00}$	$1 - γ_{1}$
Total	$γ_{2}$	$1 - γ_{2}$	$1$

Table 2. The observed frequencies of the responses.

$Y_{1}$	$Y_{2}$		Total
$Y_{1}$	$Y_{2} = 1$	$Y_{2} = 0$	Total
$Y_{1} = 1$	20	3	23
$Y_{1} = 0$	6	27	33
Total	26	30	56

Table 3. The summary of descriptive statistics for the covariates.

Covariates	Minimum	Maximum	Mean	Standard Deviation
$X_{1}$	−4.10	7.99	5.08	1.83
$X_{2}$	68.37	98.82	81.19	8.14
$X_{3}$	35.58	81.37	54.25	11.05
$X_{4}$	1.00	46.40	10.23	9.47
$X_{5}$	5.00	33.00	17.57	6.93

Table 4. Statistical test values of the dependence test of responses.

Statistical Tests	$χ^{2}$	df	p-Value
Pearson	25.7750	1	3.8370 × 10⁻⁷
Pearson with Yates’ continuity correction	23.0840	1	1.5510 × 10⁻⁶
Likelihood ratio	28.2420	1	1.0708 × 10⁻⁷

Table 5. Variance inflation factor (VIF) values of the covariates.

Covariates	VIF
$X_{1}$	1.0404
$X_{2}$	1.4948
$X_{3}$	2.3002
$X_{4}$	2.5617
$X_{5}$	1.5165

Table 6. The bias values and the numbers of BHHH iterations for the BBL model with single and multiple covariates.

Covariates	Bias	Iteration
$X_{1}$	6.2266 × 10⁻⁵	1000
$X_{2}$	1.9227 × 10⁻⁸ *	23
$X_{3}$	2.0283 × 10⁻⁸ *	25
$X_{4}$	3.8014 × 10⁻⁶ *	9
$X_{5}$	5.6987 × 10⁻⁵	1000
$X_{2} X_{3} X_{4}$	8.5186 × 10⁻⁸ *	146

* Convergent at tolerance limit value (i.e., ε = 1 × 10⁻⁵).

Table 7. Parameter estimates and the statistical test value of the simultaneous test for the BBL model with the multiple covariates.

Parameter	Estimation	$G_{1}^{2}$	df	p-Value
$θ_{01}$	−0.0034	99.7390	9	1.7685 × 10⁻²¹
$θ_{11}$	−0.1916
$θ_{21}$	0.0449
$θ_{31}$	0.0013
$θ_{02}$	−0.0016
$θ_{12}$	−0.1403
$θ_{22}$	0.0395
$θ_{32}$	0.0011
$θ_{03}$	−0.0023
$θ_{13}$	−0.1440
$θ_{23}$	−0.1071
$θ_{33}$	0.0002

Table 8. Parameter estimates and the LR statistic value of the partial test for the BBL model with the single covariate.

Covariate	Parameter	Estimation	$G_{2}^{2}$	df	p-Value
$X_{2}$	$θ_{01}$	−0.0010	90.2309	3	1.9542 × 10⁻¹⁹
	$θ_{11}$	−0.0446
	$θ_{02}$	−0.0026
	$θ_{12}$	−0.2251
	$θ_{03}$	0.0008
	$θ_{13}$	0.0931
$X_{3}$	$θ_{01}$	−0.0104	111.4570	3	5.3304 × 10⁻²⁴
	$θ_{11}$	−0.1630
	$θ_{02}$	−0.0079
	$θ_{12}$	−0.2145
	$θ_{03}$	0.0004
	$θ_{13}$	0.0483
$X_{4}$	$θ_{01}$	−15.3744	174.2092	3	1.5700 × 10⁻³⁷
	$θ_{11}$	12.5080
	$θ_{02}$	−17.0178
	$θ_{12}$	9.3722
	$θ_{03}$	4.2464
	$θ_{13}$	4.1612

Table 9. The AIC and BIC values of the BBL models with single and multiple covariates.

Covariates	AIC	BIC
$X_{2}$	512.9107	525.0628
$X_{3}$	546.2378	558.3899
$X_{4}$	546.1944	558.3465
$X_{2} X_{3} X_{4}$	552.6286	576.9328

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

