Article

A New Matrix Statistic for the Hausman Endogeneity Test under Heteroskedasticity

by
Alecos Papadopoulos
Department of Economics, Athens University of Economics and Business, TK 10434 Athens, Greece
Econometrics 2023, 11(4), 23; https://doi.org/10.3390/econometrics11040023
Submission received: 10 August 2023 / Revised: 28 September 2023 / Accepted: 2 October 2023 / Published: 10 October 2023

Abstract

We derive a new matrix statistic for the Hausman test for endogeneity in cross-sectional Instrumental Variables estimation that incorporates heteroskedasticity in a natural way and does not use a generalized inverse. A Monte Carlo study examines the performance of the statistic for different heteroskedasticity-robust variance estimators and different skedastic situations. We find that the test statistic performs well as regards empirical size in almost all cases; as regards empirical power, however, how one corrects for heteroskedasticity matters. We also compare its performance with that of the Wald statistic from the augmented regression setup that is often used for the endogeneity test, and we find that the choice between them may depend on the desired significance level of the test.

1. Introduction

The Hausman family of specification tests was introduced by Hausman (1978) and it has seen unabated use in econometrics ever since. Amini et al. (2012) detail its wide reach and different implementations for panel data, while in a cross-sectional setting, the test has often been used to test for regressor endogeneity.
In the cross-sectional setting, the test statistic is formally based on a “vector of contrasts”, the difference of two estimators, where under the null hypothesis both are consistent and one is also efficient, while under the alternative only one is consistent. This form of the test uses a variance expression that is often singular, requiring a generalized inverse. To bypass the singularity of the variance matrix, an “augmented regression” approach has been developed, linking the test with its precursors (Durbin 1954; Wu 1973, 1974).1
The efficiency of one of the estimators under the null hypothesis has a very convenient consequence: the variance of the difference of the two estimators equals the difference of their variances, so we do not have to compute covariances. However, when heteroskedasticity is present (and it should be expected to arise regularly in cross-sectional studies), this helpful simplification is no longer valid. Adkins et al. (2012) have examined this endogeneity test under heteroskedasticity in great detail, and they take the augmented regression route to formulate the various test variants that they implement.
In this study, we push a known result in the literature to its conclusion and arrive at a new matrix Hausman statistic for an endogeneity test. This new statistic is a useful additional tool for the following reasons: it handles heteroskedasticity in a natural way; it may be a more familiar tool for researchers accustomed to matrix algebra and matrix forms; and, compared with the original form of the Hausman statistic, it does not use generalized inverses. In fact, if the matrix involved is not invertible, this reflects the existence of perfect collinearity between some instruments and some endogenous regressors, which invalidates the instruments. Finally, in the Monte Carlo simulations that we present, it performed better than the “augmented regression” test in terms of power when the test is executed at the 10% significance level.

2. The Matrix Hausman Statistic for Testing Endogeneity

We follow the notation of Adkins et al. (2012). We consider the linear regression model $y = X\beta + u$. The vectors $y$ and $u$ are $n \times 1$, and $u$ is assumed to be zero-mean. The regressor matrix is partitioned as $X = [X_1\ X_2]$: $X_1$ is an $n \times K_1$ submatrix of regressors thought to be endogenous (so not orthogonal to the error term), while $X_2$ is an $n \times K_2$ submatrix of exogenous regressors (or “internal instruments”). The unknown of interest is the vector $\beta$. We have available $\Lambda_1 \geq K_1$ “external instruments” collected in the matrix $Z_1$, and the full instrument matrix is $Z = [Z_1\ X_2]$. We write the orthogonal projection matrix $P_z = Z(Z'Z)^{-1}Z'$ and the residual-maker (or annihilator) matrix $M_x = I_n - P_x$, with $I_n$ the $n \times n$ identity matrix; the subscript on $P$ and $M$ indicates which collection of variables is used in each case. These matrices are symmetric and idempotent. We write $\hat u = M_x y$ for the residuals from the Ordinary Least Squares (OLS) regression, and $P_z X \equiv \hat X = [\hat X_1\ X_2]$ for the linear projection of $X$ on the columns of $Z$ (the “fitted values”). Note that $P_z X_2 = X_2$ because $X_2$ belongs to the column space of $P_z$.
The OLS estimator for $\beta$ is $\hat\beta_{OLS} = (X'X)^{-1}X'y$, while the benchmark Instrumental Variables (IV) estimator when the instruments outnumber the endogenous regressors is $\hat\beta_{IV} = (\hat X'\hat X)^{-1}\hat X'y$ (two-stage least squares). The basic expression of the Hausman statistic under homoskedasticity of the error term (with the OLS estimate $\hat\sigma_u^2$ of the error variance) is

$$(\hat\sigma_u^2)^{-1}\,\left(\hat\beta_{IV}-\hat\beta_{OLS}\right)'\left[(\hat X'\hat X)^{-1}-(X'X)^{-1}\right]^{-1}\left(\hat\beta_{IV}-\hat\beta_{OLS}\right). \qquad (1)$$
It is in inverting the middle matrix of this statistic that we may encounter trouble; moreover, the matrix may not even be positive definite in finite samples. This may render the test inapplicable, or necessitate the use of a generalized inverse instead.
To bypass this issue, while simultaneously accounting for heteroskedasticity, we start by noting that the core of the statistic for the Hausman test is the difference
$$\begin{aligned}
\hat\beta_{IV}-\hat\beta_{OLS} &= (\hat X'\hat X)^{-1}\hat X'y-(X'X)^{-1}X'y = (X'P_zX)^{-1}X'P_zy-(X'X)^{-1}X'y\\
&= (X'P_zX)^{-1}\left[X'P_zy-X'P_zX(X'X)^{-1}X'y\right]\\
&= (X'P_zX)^{-1}X'P_z\left[I_n-X(X'X)^{-1}X'\right]y\\
&= (X'P_zX)^{-1}X'P_zM_xy = (\hat X'\hat X)^{-1}\hat X'\hat u. \qquad (2)
\end{aligned}$$
We have used the fact that $P_z$ and $M_x$ are symmetric and idempotent, and that $M_xy = M_xu = \hat u$. Result (2) is known in the literature. For example, Greene (2012, p. 276) arrives at it but does not go further. Adkins et al. (2012) actually start with it (their Equation (1)), but then use the augmented regression approach to proceed. Later in their paper, when they re-purpose the “weak vs. strong instruments” test of Hahn et al. (2011), they “directly estimate the asymptotic covariance matrix of the contrast”, but the expression they give is inconveniently long, since this variance can be compacted nicely, as we will show. To our knowledge, the result in Equation (2) has not been pursued to the very end for the construction of a Hausman statistic and test, and this is what we do here.
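Identity (2) can be confirmed numerically. The following sketch (Python with numpy; the toy DGP and all names are ours, purely for illustration) verifies that the vector of contrasts coincides with $(\hat X'\hat X)^{-1}\hat X'\hat u$ to machine precision:

```python
import numpy as np

# Toy setup: n observations, K1 endogenous regressors, K2 exogenous ones,
# L1 external instruments (dimensions and DGP purely illustrative).
rng = np.random.default_rng(0)
n, K1, K2, L1 = 200, 2, 2, 3

Z1 = rng.normal(size=(n, L1))                        # external instruments
X2 = rng.normal(size=(n, K2))                        # exogenous regressors
u = rng.normal(size=n)
X1 = Z1[:, :K1] + 0.5 * u[:, None] + rng.normal(size=(n, K1))  # endogenous
X, Z = np.hstack([X1, X2]), np.hstack([Z1, X2])
y = X @ np.ones(K1 + K2) + u

Pz = Z @ np.linalg.solve(Z.T @ Z, Z.T)               # projection on col(Z)
Xhat = Pz @ X                                        # [X1_hat : X2]
b_ols = np.linalg.solve(X.T @ X, X.T @ y)            # OLS
b_iv = np.linalg.solve(Xhat.T @ Xhat, Xhat.T @ y)    # 2SLS
u_hat = y - X @ b_ols                                # u_hat = M_x y

lhs = b_iv - b_ols                                   # vector of contrasts
rhs = np.linalg.solve(Xhat.T @ Xhat, Xhat.T @ u_hat) # right side of (2)
print(np.allclose(lhs, rhs))                         # True
```

Since (2) is an algebraic identity, the check succeeds for any data set, not just this toy one.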
The null hypothesis of the Hausman test is that the two estimators converge to the same probability limit (plim):
$$H_0:\ \operatorname{plim}\left(\hat\beta_{IV}-\hat\beta_{OLS}\right) = 0.$$
To examine this hypothesis we consider the limiting distribution of the scaled difference and its variance, which, under the null hypothesis and given Equation (2), is
$$\operatorname{Avar}\left[\sqrt{n}\left(\hat\beta_{IV}-\hat\beta_{OLS}\right)\right] \equiv V = \operatorname{plim}\left(n^{-1}\hat X'\hat X\right)^{-1}S\ \operatorname{plim}\left(n^{-1}\hat X'\hat X\right)^{-1}. \qquad (3)$$
The middle matrix is $S = \operatorname{plim}\left(n^{-1}\hat X'\hat u\hat u'\hat X\right)$. We can then formulate a theoretical statistic for the endogeneity test,
$$q = n\left(\hat\beta_{IV}-\hat\beta_{OLS}\right)'V^{-}\left(\hat\beta_{IV}-\hat\beta_{OLS}\right)\ \xrightarrow[H_0]{d}\ \chi^2_{K_1}. \qquad (4)$$
Here, $V^{-}$ denotes a generalized inverse of $V$.2 Combining Equations (2) and (3) with (4), we arrive at the following statistic, feasible to compute for some consistent estimator $\hat S$:
$$\hat q = n^{-1}\,\hat u'\hat X(\hat X'\hat X)^{-1}\left[(\hat X'\hat X)^{-1}\hat S(\hat X'\hat X)^{-1}\right]^{-}(\hat X'\hat X)^{-1}\hat X'\hat u. \qquad (5)$$
We show in Appendix A that a generalized inverse of the middle matrix is $(\hat X'\hat X)\hat S^{+}(\hat X'\hat X)$, where $\hat S^{+}$ denotes the Moore–Penrose generalized inverse. Inserting this into the expression for $\hat q$, we can simplify:
$$\hat q = n^{-1}\,\hat u'\hat X\hat S^{+}\hat X'\hat u. \qquad (6)$$
Next, because $\hat X$ includes the submatrix $X_2$, which is by construction orthogonal to the OLS residuals $\hat u$, we obtain that
$$S = \begin{bmatrix} Q_{K_1\times K_1} & 0_{K_1\times K_2}\\ 0_{K_2\times K_1} & 0_{K_2\times K_2}\end{bmatrix},\qquad Q = \operatorname{plim}\left(n^{-1}\hat X_1'\hat u\hat u'\hat X_1\right). \qquad (7)$$
We show in Appendix B that
$$S^{+} = \begin{bmatrix} Q_{K_1\times K_1}^{-1} & 0_{K_1\times K_2}\\ 0_{K_2\times K_1} & 0_{K_2\times K_2}\end{bmatrix}. \qquad (8)$$
We have managed to eliminate the generalized inverse and to use a proper inverse.3 What remains now is to find a consistent estimator for the matrix $Q$. Decomposing the OLS residuals, we have
$$\hat X_1'\hat u\hat u'\hat X_1 = \hat X_1'M_xuu'M_x\hat X_1.$$
This expression, in which we sandwich the outer product of the error vector, may look familiar to those acquainted with the heteroskedasticity-robust estimation literature, and one could expect to use the squared residuals in place of $uu'$ to estimate $Q$. However, there is an issue: the matrix $M_x$ is $n\times n$, growing in both dimensions as the sample size increases, so it is not clear that the related proof strategy of White (1980) is applicable here. Nevertheless, by drilling down even more, we arrive, in Appendix C, at an expression that contains only matrix products with finite dimensions. Thus, we can indeed apply this substitution, which provides a consistent estimator for $Q$:
$$\hat Q = n^{-1}\hat X_1'M_x\hat\Omega_0M_x\hat X_1,\qquad \hat\Omega_0 = \operatorname{diag}\{\hat u_i^2\}. \qquad (9)$$
This is indeed a “White”-type estimator of a heteroskedastic covariance matrix. The expression is valid under the formal assumptions stated in White (1980), which we do not repeat here for brevity.
Equation (9) also allows us to conclude that the matrix $Q$ must be invertible; otherwise, at least one component of the instrument matrix is not valid.
This can be shown in the following way. Let $x_{1j}$, $j = 1,\dots,K_1$, be a column of $X_1$, the submatrix of endogenous regressors. If $P_zx_{1j} = \hat x_{1j} = x_{1j}$, we will have $M_x\hat x_{1j} = 0$, so $M_x\hat X_1$ will contain a column of zeros and $Q$ will be singular. However, $P_zx_{1j} = x_{1j}$ implies that $x_{1j}$ belongs to the column space of the instrument matrix $Z$, meaning that it is an exact linear combination of the columns of $Z$. If this were the case, it would necessarily imply that at least one of the instruments is correlated with the error term, and so $Z$ too would suffer from endogeneity: if $x_{1j} = \sum_{\ell=1}^{\Lambda_1+K_2}a_\ell z_\ell$ while $E(x_{1j}u) \neq 0$, we will have $E\left(u\sum_{\ell=1}^{\Lambda_1+K_2}a_\ell z_\ell\right) \neq 0$.
Therefore, using a proper inverse here also serves the function of alerting us, if the matrix $Q$ proves to be non-invertible, that such exact linear dependence between instruments and endogenous regressors exists. In such a case, executing the endogeneity test using a generalized inverse would be wrong; one first has to correct the instrument matrix somehow to restore the validity of the instruments.4
Lastly, using these results in the expression for $\hat q$, and again the fact that $\hat u'\hat X = [\hat u'\hat X_1 : 0]$, we arrive at the final expression for the heteroskedasticity-robust matrix Hausman statistic (where the $n^{-1}$ factors have also canceled out),
$$\hat q_{het} = \hat u'\hat X_1\left[\hat X_1'M_x\hat\Omega_0M_x\hat X_1\right]^{-1}\hat X_1'\hat u\ \xrightarrow[H_0]{d}\ \chi^2_{K_1},\qquad \hat\Omega_0 = \operatorname{diag}\{\hat u_i^2\}. \qquad (10)$$
Computing this statistic first requires running the OLS estimation of the original model to obtain the residuals $\hat u$ and their squares for $\hat\Omega_0$, and then using the matrices $P_z$ and $X_1$ (since $\hat X_1 = P_zX_1$), as well as the matrix $M_x$. The matrices $P_z$ and $M_x$ are of dimension $n\times n$; thus, for very large samples, they may be taxing for the software (although they will be used just once in an actual applied study; it is simulation studies that may be slowed down considerably by them). If one wishes to avoid them, $\hat X_1$ can be obtained by running OLS regressions of the columns of $X_1$ on $Z$, and $M_x\hat X_1$ can be computed as the residuals from regressing $\hat X_1$ on $X$, as the following sketch illustrates.
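A minimal sketch of this regression-based recipe, in Python with numpy and scipy (the paper's own computations were performed in gretl; the function and variable names here are ours):

```python
import numpy as np
from scipy import stats

def hausman_matrix_stat(y, X, X1, Z):
    """Matrix Hausman statistic of Equation (10) with HC0 weights, computed
    without ever forming the n x n matrices P_z and M_x."""
    # OLS of y on X: residuals u_hat, whose squares form Omega_0.
    b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    u_hat = y - X @ b_ols
    # X1_hat = P_z X1: fitted values from regressing each column of X1 on Z.
    G, *_ = np.linalg.lstsq(Z, X1, rcond=None)
    X1_hat = Z @ G
    # M_x X1_hat: residuals from regressing X1_hat on X.
    H, *_ = np.linalg.lstsq(X, X1_hat, rcond=None)
    MxX1hat = X1_hat - X @ H
    # Assemble u'X1_hat [X1_hat' M_x Omega_0 M_x X1_hat]^{-1} X1_hat'u.
    a = X1_hat.T @ u_hat
    B = (MxX1hat * u_hat[:, None] ** 2).T @ MxX1hat
    q = a @ np.linalg.solve(B, a)      # solve() raises if Q_hat is singular,
    K1 = X1.shape[1]                   # flagging invalid instruments
    return q, stats.chi2.sf(q, df=K1)  # statistic and chi-square(K1) p-value
```

The homoskedastic variant (11) below obtains by replacing `B` with $\hat\sigma_u^2\,\cdot$ `MxX1hat.T @ MxX1hat`, with $\hat\sigma_u^2$ the OLS error-variance estimate.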
The statistic can also be used under the assumption of homoskedasticity, in which case it becomes
$$\hat q_{hom} = (\hat\sigma_u^2)^{-1}\,\hat u'\hat X_1\left[\hat X_1'M_x\hat X_1\right]^{-1}\hat X_1'\hat u\ \xrightarrow[H_0]{d}\ \chi^2_{K_1}. \qquad (11)$$
To increase power, one should use the error variance estimator from the OLS regression.
Equations (10) and (11) are the main theoretical contribution of this study. We have exploited the expression of the vector of contrasts in terms of projected regressors and OLS residuals, and we have arrived at a matrix Hausman statistic that incorporates possible heteroskedasticity in a natural way, has a compact form, does not use generalized inverses, and guards against invalid instruments.
In the next section, we present results from a simulation study examining the performance of this matrix Hausman statistic, looking also into the variants that have been proposed for $\hat\Omega$ in an attempt to improve finite-sample performance. A recent overview and Monte Carlo study of these “HCx” estimators for heteroskedastic variance matrices can be found in MacKinnon (2013).

3. Monte Carlo Study

3.1. Description

We constructed a data generation process (DGP) with a constant term, one exogenous variable, two “suspected” endogenous variables, and three valid external instruments. We considered one case where the DGP includes an unobservable covariate uncorrelated with the regressors (so OLS is consistent even if this variable is not included in the regressor matrix), and one where it is correlated with them (so there is endogeneity and OLS is inconsistent). The first case serves to examine the empirical size of the test, while the second provides information about its power. The technical details of the Monte Carlo study are presented in Appendix D.
We created four scenarios as regards heteroskedasticity of the error term: homoskedasticity; heteroskedasticity with the error variance randomly changing per observation, independently of the regressors; “group-wise” heteroskedasticity, where the error variance takes only three distinct values with equal probability, again independently of the regressors; and finally, a “random-coefficients” model, which leads to the error variance being a function of the regressors without affecting mean-independence. We considered sample sizes $n = 50, 75, 100, 200$, and in each case we executed 10,000 repetitions. In all cases, we initialized the random number generator with the same seed. This has two consequences: first, for a given sample size, all scenarios have identical series for the observable variables and differ only in the endogeneity/heteroskedasticity aspect; second, for each scenario, as we increase the sample size, the previously generated values are fully part of the larger sample. In this way, we mimic the accumulation of data rather than the availability of independent larger data sets, as sketched below.
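One way to implement this nested design (a numpy sketch under our own conventions; gretl's generator streams are not replicated, so this mimics the logic rather than the exact draws):

```python
import numpy as np

SIZES, N_MAX, REPS, SEED = (50, 75, 100, 200), 200, 10_000, 1930021000
rng = np.random.default_rng(SEED)   # one common seed across all scenarios

for rep in range(REPS):
    draws = rng.normal(size=N_MAX)  # each building block drawn once, at n = 200
    for n in SIZES:
        sample = draws[:n]          # the n = 50 sample is a prefix of the n = 75
        ...                         # sample, and so on: the data "accumulate"
```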
As regards the statistic, we used both its homoskedastic variant (i.e., assuming, correctly or not, that the true error is homoskedastic) and the four best-known alternatives for the estimation of heteroskedastic variance matrices, HC$x$, $x = 0, 1, 2, 3$, as these are defined in MacKinnon (2013): writing $h_{ii}$ for the $i$-th diagonal element of the projection matrix $P_x$, we have
$$\text{HC0}:\ \hat\Omega_0 = \operatorname{diag}\{\hat u_i^2\},\qquad \text{HC1}:\ \hat\Omega_1 = \frac{n}{n-k}\operatorname{diag}\{\hat u_i^2\},\qquad \text{HC2}:\ \hat\Omega_2 = \operatorname{diag}\left\{\frac{\hat u_i^2}{1-h_{ii}}\right\},\qquad \text{HC3}:\ \hat\Omega_3 = \operatorname{diag}\left\{\frac{\hat u_i^2}{(1-h_{ii})^2}\right\}.$$
Note that $k$ is the number of regressors in each case. For our matrix statistic, the number of regressors is $k = K_1 + K_2$, with $K_1 = K_2 = 2$. A sketch of these variants follows.
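In numpy terms (a sketch with our own names), the four weight vectors can be computed without forming $P_x$, obtaining the leverages $h_{ii}$ from a QR decomposition:

```python
import numpy as np

def hc_weights(u_hat, X, variant="HC0"):
    """Diagonal of Omega_hat for the HC0-HC3 variants."""
    n, k = X.shape
    Q, _ = np.linalg.qr(X)           # P_x = Q Q', so h_ii = row sums of Q**2
    h = np.sum(Q**2, axis=1)
    u2 = u_hat**2
    return {"HC0": u2,
            "HC1": u2 * n / (n - k),
            "HC2": u2 / (1.0 - h),
            "HC3": u2 / (1.0 - h) ** 2}[variant]
```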

3.2. Comparative Performance of the Variants of the Matrix Hausman Statistic

Here, we assess how the statistic performs in terms of empirical size and power as we change the heteroskedasticity correction. We do not compare it with other forms of the Hausman test, because we first want to determine whether it has an acceptable performance (empirical size close to nominal, power rising fast with the sample size). If it performs acceptably, then a case arises for comparing it with other forms of the Hausman test. We present the results in Table 1, which relates to testing at the 5% significance level.
We have the following main observations. First, the behavior of the matrix Hausman statistic as regards empirical size is rather stable, across the different true skedastic scenarios as well as across the different HCx ways of incorporating possible heteroskedasticity. In fact, it performs acceptably in relation to the size of the test even if we ignore the possible presence of heteroskedasticity and use (11) instead of (10). Second, across the HCx variants, the empirical size falls monotonically as we increase the strength of the finite-sample correction that we apply. Results for testing at the 10% significance level (available upon request) show a similar behavior in relation to empirical size.
As regards empirical power, the choice of the heteroskedastic variant for $\hat\Omega$ matters even more for small sample sizes. Power also deteriorates monotonically and visibly as the finite-sample correction is strengthened, and the highest power is achieved when we use the homoskedastic variant (where the test is slightly conservative).5 Overall, for testing at the 5% significance level, the prudent choice when applying this statistic appears to be its HC0 heteroskedastic formula. When testing at the 10% significance level, power increases visibly: for example, under conditional heteroskedasticity, the power at 10% for sample sizes $n = 50, 75$ tends to be higher by a factor of 1.2 to almost 1.9, i.e., up to almost double the power at the 5% significance level, all else being equal. For the 10% significance level, therefore, it appears best to use the homoskedastic variant of the matrix statistic, Equation (11).
The finding that ignoring heteroskedasticity while it exists may lead to better-performing tests should not be surprising for small samples: to account for heteroskedasticity, we use additional estimated quantities (the individual OLS residuals), and this should be expected to affect statistical power negatively in a small-sample context.

3.3. Comparison with the Wald Statistic from the Augmented Regression Approach

Using the exact same simulated data sets, we have also computed the Wald statistic coming from the augmented regression setup to test for endogeneity. Here, we first regress the suspected endogenous regressors $X_1$ on the full instrument matrix $Z$, obtain the residuals $M_zX_1$, and include these residuals in an augmented regressor matrix $X_A = [X_1 : X_2 : M_zX_1]$.6 We then run an OLS regression of the dependent variable on $X_A$ and compute a Wald test for the coefficients of the regressors in the submatrix $M_zX_1$.7 The procedure is sketched below.
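A sketch of this procedure, assuming the statsmodels OLS API (the names and the HC3 choice below are ours; see note 7 on how $k$ and $h_{ii}$ change in this setup):

```python
import numpy as np
import statsmodels.api as sm

def augmented_wald(y, X, X1, Z):
    """Wald endogeneity test from the augmented regression, HC3 covariance."""
    # Residuals M_z X1 from regressing the suspect regressors on Z.
    G, *_ = np.linalg.lstsq(Z, X1, rcond=None)
    V = X1 - Z @ G
    XA = np.hstack([X, V])                    # X_A = [X1 : X2 : M_z X1]
    res = sm.OLS(y, XA).fit(cov_type="HC3")   # robust OLS fit
    K1 = X1.shape[1]
    R = np.zeros((K1, XA.shape[1]))           # restriction matrix: the
    R[:, -K1:] = np.eye(K1)                   # coefficients on M_z X1 are zero
    w = res.wald_test(R, use_f=False, scalar=True)
    return w.statistic, w.pvalue
```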
In the interest of space, we do not report the full results here. The Wald statistic clearly has a size problem for these small samples: it tends to over-reject the correct null hypothesis, sometimes even having an empirical size nearly double the nominal one (for both the 5% and the 10% nominal significance levels). Only with the HC3 heteroskedasticity correction does the over-rejection fall below one percentage point across the skedastic scenarios.8 In Table 2, we report the performance of this statistic for testing at the 5% significance level, and we repeat the performance metrics for our matrix statistic with the HC0 formula from Table 1.
The Wald statistic appears to have an advantage as regards power, even though some of this advantage would be lost in correcting for the slightly oversized test. The picture changes, however, if we want to test at the 10% significance level: here, it is our matrix statistic (which, moreover, assumes homoskedasticity) that has the advantage in terms of power, as shown in Table 3.
Overall, neither statistic dominates the other, and for sample sizes larger than $n = 200$, the two are essentially equivalent in terms of size and power. For smaller samples, the desired significance level of the test can be our guide in choosing between them.

Funding

This research received no external funding.

Data Availability Statement

No data were used for this study.

Conflicts of Interest

The author declares no conflicts of interest.

Appendix A

We argue that a generalized inverse of $(\hat X'\hat X)^{-1}\hat S(\hat X'\hat X)^{-1}$ is $(\hat X'\hat X)\hat S^{+}(\hat X'\hat X)$.
A generalized inverse $A^{-}$ of a matrix $A$ satisfies $AA^{-}A = A$. Setting for compactness $(\hat X'\hat X)^{-1} \equiv C^{-1}$ and $A = C^{-1}\hat SC^{-1}$, our candidate generalized inverse is $A^{-} = C\hat S^{+}C$. We have
$$AA^{-}A = \left[C^{-1}\hat SC^{-1}\right]\left[C\hat S^{+}C\right]\left[C^{-1}\hat SC^{-1}\right] = C^{-1}\hat S\hat S^{+}\hat SC^{-1} = C^{-1}\hat SC^{-1} = A,$$
which is what we wanted to show; $\hat S\hat S^{+}\hat S = \hat S$ holds because the Moore–Penrose inverse satisfies this condition, among others.

Appendix B

We argue that
$$S = \begin{bmatrix} Q_{K_1\times K_1} & 0_{K_1\times K_2}\\ 0_{K_2\times K_1} & 0_{K_2\times K_2}\end{bmatrix}\ \Longrightarrow\ S^{+} = \begin{bmatrix} Q_{K_1\times K_1}^{-1} & 0_{K_1\times K_2}\\ 0_{K_2\times K_1} & 0_{K_2\times K_2}\end{bmatrix}.$$
In order for a matrix $A^{+}$ to be indeed the unique Moore–Penrose pseudo-inverse of a matrix $A$, it must satisfy four conditions:
  • $AA^{+}A = A$
  • $A^{+}AA^{+} = A^{+}$
  • $(AA^{+})' = AA^{+}$
  • $(A^{+}A)' = A^{+}A$.
Note that $Q$ is symmetric. For condition 1, we have
$$\begin{bmatrix} Q & 0\\ 0 & 0\end{bmatrix}\begin{bmatrix} Q^{-1} & 0\\ 0 & 0\end{bmatrix}\begin{bmatrix} Q & 0\\ 0 & 0\end{bmatrix} = \begin{bmatrix} I_{K_1} & 0\\ 0 & 0\end{bmatrix}\begin{bmatrix} Q & 0\\ 0 & 0\end{bmatrix} = \begin{bmatrix} Q & 0\\ 0 & 0\end{bmatrix}.$$
For condition 2, we have
$$\begin{bmatrix} Q^{-1} & 0\\ 0 & 0\end{bmatrix}\begin{bmatrix} Q & 0\\ 0 & 0\end{bmatrix}\begin{bmatrix} Q^{-1} & 0\\ 0 & 0\end{bmatrix} = \begin{bmatrix} I_{K_1} & 0\\ 0 & 0\end{bmatrix}\begin{bmatrix} Q^{-1} & 0\\ 0 & 0\end{bmatrix} = \begin{bmatrix} Q^{-1} & 0\\ 0 & 0\end{bmatrix}.$$
For condition 3, we have
$$\begin{bmatrix} Q & 0\\ 0 & 0\end{bmatrix}\begin{bmatrix} Q^{-1} & 0\\ 0 & 0\end{bmatrix} = \begin{bmatrix} I_{K_1} & 0\\ 0 & 0\end{bmatrix},$$
which is symmetric, and for condition 4, analogously,
$$\begin{bmatrix} Q^{-1} & 0\\ 0 & 0\end{bmatrix}\begin{bmatrix} Q & 0\\ 0 & 0\end{bmatrix} = \begin{bmatrix} I_{K_1} & 0\\ 0 & 0\end{bmatrix}.$$
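The four conditions can also be confirmed numerically with numpy's pseudo-inverse routine (the positive definite $Q$ below is an arbitrary example of ours):

```python
import numpy as np

rng = np.random.default_rng(1)
K1, K2 = 2, 2
A = rng.normal(size=(K1, K1))
Q = A @ A.T + K1 * np.eye(K1)          # symmetric positive definite Q
S = np.zeros((K1 + K2, K1 + K2))
S[:K1, :K1] = Q                        # S = [[Q, 0], [0, 0]]
S_plus = np.zeros_like(S)
S_plus[:K1, :K1] = np.linalg.inv(Q)    # the claimed pseudo-inverse
print(np.allclose(np.linalg.pinv(S), S_plus))   # True
```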

Appendix C

We want to find a consistent estimator for
$$Q = \operatorname{plim}\left(n^{-1}\hat X_1'\hat u\hat u'\hat X_1\right).$$
We have
$$\hat X_1'\hat u = X_1'P_z\hat u = X_1'Z(Z'Z)^{-1}Z'\hat u = X_1'Z(Z'Z)^{-1}\begin{bmatrix} Z_1'\\ X_2'\end{bmatrix}\hat u.$$
However, $X_2'\hat u = 0$; thus, carrying out the multiplications and including the sample size as a scaling factor, we arrive at
$$Q = \operatorname{plim}\left\{n^{-1}X_1'Z\left(n^{-1}Z'Z\right)^{-1}\begin{bmatrix} n^{-1}Z_1'\hat u\hat u'Z_1 & 0\\ 0 & 0\end{bmatrix}\left(n^{-1}Z'Z\right)^{-1}n^{-1}Z'X_1\right\}.$$
The standard regularity conditions are assumed to hold, and so the matrices that sandwich the middle one are well defined and converge to a finite probability limit. Focusing on the middle one, we have
$$\hat u\hat u' = M_xuu'M_x = (I_n - P_x)uu'(I_n - P_x) = uu' - uu'P_x - P_xuu' + P_xuu'P_x.$$
Further, $Z_1'P_x = Z_1'X(X'X)^{-1}X'$ and $P_xZ_1 = X(X'X)^{-1}X'Z_1$; thus, adding scaling factors again, we have
$$\begin{aligned}
n^{-1}Z_1'\hat u\hat u'Z_1 ={}& n^{-1}Z_1'uu'Z_1 - \left[n^{-1}Z_1'uu'X\right]\left(n^{-1}X'X\right)^{-1}\left[n^{-1}X'Z_1\right]\\
&- \left[n^{-1}Z_1'X\right]\left(n^{-1}X'X\right)^{-1}\left[n^{-1}X'uu'Z_1\right]\\
&+ \left[n^{-1}Z_1'X\right]\left(n^{-1}X'X\right)^{-1}\left[n^{-1}X'uu'X\right]\left(n^{-1}X'X\right)^{-1}\left[n^{-1}X'Z_1\right].
\end{aligned}$$
Under the regularity conditions and the assumptions in White (1980), the probability limits of all these matrix products can be consistently estimated if, in place of $uu'$, we use $\hat\Omega_0 = \operatorname{diag}\{\hat u_i^2\}$. Reverting back, this means that
$$n^{-1}Z_1'M_x\hat\Omega_0M_xZ_1\ \xrightarrow{p}\ \operatorname{plim}\left(n^{-1}Z_1'M_xuu'M_xZ_1\right) = \operatorname{plim}\left(n^{-1}Z_1'\hat u\hat u'Z_1\right).$$
Thus, we have obtained that a consistent estimator of the matrix Q is (now eliminating the redundant scaling factors)
$$\hat Q = X_1'Z(Z'Z)^{-1}\begin{bmatrix} n^{-1}Z_1'M_x\hat\Omega_0M_xZ_1 & 0\\ 0 & 0\end{bmatrix}(Z'Z)^{-1}Z'X_1\ \xrightarrow{p}\ Q.$$
This can be compacted. Consider the matrix, suitably bracketed,
$$\hat X_1'M_x\hat\Omega_0M_x\hat X_1 = X_1'P_zM_x\hat\Omega_0M_xP_zX_1 = X_1'Z(Z'Z)^{-1}\left[Z'M_x\hat\Omega_0M_xZ\right](Z'Z)^{-1}Z'X_1.$$
The outer terms are identical to the outer terms of $\hat Q$. Its middle term, $Z'M_x\hat\Omega_0M_xZ$, can be decomposed as
$$Z'M_x\hat\Omega_0M_xZ = \begin{bmatrix} Z_1'\\ X_2'\end{bmatrix}M_x\hat\Omega_0M_x\begin{bmatrix} Z_1 & X_2\end{bmatrix} = \begin{bmatrix} Z_1'M_x\hat\Omega_0M_xZ_1 & Z_1'M_x\hat\Omega_0M_xX_2\\ X_2'M_x\hat\Omega_0M_xZ_1 & X_2'M_x\hat\Omega_0M_xX_2\end{bmatrix}.$$
However, $M_xX_2 = 0$ and $X_2'M_x = 0$, so
$$Z'M_x\hat\Omega_0M_xZ = \begin{bmatrix} Z_1'M_x\hat\Omega_0M_xZ_1 & 0\\ 0 & 0\end{bmatrix}.$$
This is, up to the $n^{-1}$ factor, identical to the middle component of $\hat Q$; thus, we arrive at
$$\hat Q = n^{-1}X_1'Z(Z'Z)^{-1}\left[Z'M_x\hat\Omega_0M_xZ\right](Z'Z)^{-1}Z'X_1 = n^{-1}X_1'P_zM_x\hat\Omega_0M_xP_zX_1 = n^{-1}\hat X_1'M_x\hat\Omega_0M_x\hat X_1.$$

Appendix D

We present here the details of the Monte Carlo (MC) study whose results we report in the main text. The study was conducted using the software “gretl”. For the random number generator, we have used the seed 1930021000.
Table A1 contains the random variables that we have used as building blocks.
Table A1. Building blocks of the MC simulation.

| Symbol | Distribution | Description |
| --- | --- | --- |
| $U_1$ | $F(20, 15)$ | Snedecor's F-distribution with d.f. 20 (num.) and 15 (denom.) |
| $U_3$ | $P(1)$ | Poisson with mean equal to 1 |
| $U_4$ | $\chi^2(3)$ | Chi-square with 3 d.f. |
| $U_5$ | $N(-1, 2)$ | Normal with mean equal to −1 and st.dev. 2 |
| $U_6$ | $t(6)$ | Student's t with 6 d.f. |
| $U_7$ | $U(-2, 2)$ | Continuous Uniform on (−2, 2) |
| $U_8$ | $U(0, 2)$ | Continuous Uniform on (0, 2) |
| $U_9$ | $U_d(0, 2)$ | Discrete Uniform on {0, 1, 2} |
We have generated one “unobservable” that creates endogeneity and one that does not, two regressors that become endogenous when the correlated unobservable is used in the data generation process, one exogenous regressor, and three instruments. Table A2 contains the generating expressions.
Table A2. Variables in the MC simulation.

| Symbol | Expression | Status |
| --- | --- | --- |
| $L_1$ | $0.7U_6 + U_7$ | Latent, correlated |
| $L_2$ | $U_7$ | Latent, uncorrelated |
| $X_{11}$ | $U_1 + U_3 + U_6$ | Endogenous given $L_1$ |
| $X_{12}$ | $0.5U_3 + U_5 - 0.5U_6$ | Endogenous given $L_1$ |
| $X_2$ | $U_1 + U_5$ | Exogenous |
| $Z_{11}$ | $U_3 - U_1$ | Instrument |
| $Z_{12}$ | $|U_5|$ | Instrument |
| $Z_{13}$ | $U_3 - U_5$ | Instrument |
We note that, even though $U_5$ is a Normal random variable, the instrument $Z_{12} = |U_5|$ is relevant (correlated with the endogenous variables) because $U_5$ has a non-zero mean: its absolute value is then a Folded Normal and remains correlated with $U_5$ (in contrast, if $U_5$ were a zero-mean Normal, and consequently $Z_{12}$ a Half Normal, their covariance would be zero).
As regards the scenarios of homoskedasticity, random heteroskedasticity, and groupwise heteroskedasticity, the error term (including the unobservable variable) was generated as shown in Table A3 ($s = 1$ implies that the correlated latent variable $L_1$ was used).
Table A3. Error terms in the MC simulation.

| Symbol | Expression | Model |
| --- | --- | --- |
| $u_{[1,s]}$ | $N(0, 2) + 3L_s,\ s = 1, 2$ | Homoskedasticity |
| $u_{[2,s]}$ | $N(0, 1 + U_8) + 3L_s,\ s = 1, 2$ | Random Heteroskedasticity |
| $u_{[3,s]}$ | $N(0, 1 + U_9) + 3L_s,\ s = 1, 2$ | Groupwise Heteroskedasticity |
For these three setups, the dependent variable was generated as
$$Y_{[t,s]} = 1 - 5X_2 + 2X_{11} + 1.5X_{12} + u_{[t,s]},\qquad t = 1, 2, 3,\quad s = 1, 2.$$
So, for example, $Y_{[2,2]}$ is the situation where we have random heteroskedasticity and no endogeneity; thus, it was used to assess the empirical size of the test for this specific heteroskedastic scenario.
For the conditional heteroskedasticity scheme, we used a random-coefficients model, with the mean values of the random coefficients equal to the coefficients specified above, together with the homoskedastic error term $u_{[1,s]}$, namely
$$Y_{[4,s]} = N(1, 0.2) - N(5, 1)\cdot X_2 + N(2, 0.4)\cdot X_{11} + N(1.5, 0.3)\cdot X_{12} + u_{[1,s]},\qquad s = 1, 2.$$
Decomposed, this leads to
$$Y_{[4,s]} = 1 - 5X_2 + 2X_{11} + 1.5X_{12} + u_{[4,s]},\qquad u_{[4,s]} = N(0, 0.2) - N(0, 1)\cdot X_2 + N(0, 0.4)\cdot X_{11} + N(0, 0.3)\cdot X_{12} + u_{[1,s]},\quad s = 1, 2.$$
Since all conditional heteroskedasticity factors are scaled by independent zero-mean Normals, no additional source of endogeneity is created.
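For concreteness, a numpy sketch of this DGP (our translation of Tables A1–A3; gretl's random streams are not reproduced, and the random-coefficients scenario $Y_{[4,s]}$ is omitted for brevity):

```python
import numpy as np

def make_sample(n, endogenous=True, skedastic="homo", rng=None):
    rng = rng if rng is not None else np.random.default_rng(1930021000)
    # Building blocks of Table A1.
    U1 = rng.f(20, 15, n)                 # Snedecor's F(20, 15)
    U3 = rng.poisson(1.0, n)              # Poisson(1)
    U5 = rng.normal(-1.0, 2.0, n)         # Normal, mean -1, st.dev. 2
    U6 = rng.standard_t(6, n)             # Student's t(6)
    U7 = rng.uniform(-2.0, 2.0, n)        # Uniform(-2, 2)
    U8 = rng.uniform(0.0, 2.0, n)         # Uniform(0, 2)
    U9 = rng.integers(0, 3, n)            # discrete uniform on {0, 1, 2}
    # Table A2: latent variable, regressors, instruments.
    L = 0.7 * U6 + U7 if endogenous else U7
    X11, X12, X2 = U1 + U3 + U6, 0.5 * U3 + U5 - 0.5 * U6, U1 + U5
    Z11, Z12, Z13 = U3 - U1, np.abs(U5), U3 - U5
    # Table A3: error, reading the second Normal parameter as a st.dev.
    sd = {"homo": 2.0, "random": 1.0 + U8, "group": 1.0 + U9}[skedastic]
    u = sd * rng.normal(size=n) + 3.0 * L
    y = 1.0 - 5.0 * X2 + 2.0 * X11 + 1.5 * X12 + u
    X = np.column_stack([np.ones(n), X2, X11, X12])       # OLS regressors
    X1 = np.column_stack([X11, X12])                      # suspect regressors
    Z = np.column_stack([np.ones(n), X2, Z11, Z12, Z13])  # full instruments
    return y, X, X1, Z
```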
The general matrix Hausman statistic is
$$\hat q_{het} = \hat u'\hat X_1\left[\hat X_1'M_x\hat\Omega M_x\hat X_1\right]^{-1}\hat X_1'\hat u.$$
The OLS regressions regressed $Y_{[t,s]}$ on a constant and $X = (X_2 : X_{11} : X_{12})$; $\hat u$ denotes the OLS residuals, which were also used in computing $\hat\Omega$. Further, $M_x = I_n - P_x$, where $P_x$ is the projection matrix of $X$; its diagonal elements $h_{ii}$ were used for the variants of $\hat\Omega$. Finally, $X_1 = (X_{11} : X_{12})$ and $\hat X_1 = P_zX_1$, where $P_z$ is the projection matrix of a constant and $(X_2 : Z_{11} : Z_{12} : Z_{13})$. For the homoskedastic variant of the statistic, the estimated OLS error variance $\hat\sigma_u^2$ was used instead of $\hat\Omega$.

Notes

1. Sometimes it is also called the “artificial regression” or “control function” approach.
2. In the literature, the test is presented with the use of the Moore–Penrose pseudo-inverse $V^{+}$, most likely because its uniqueness avoids the need to choose among alternatives in an ad hoc manner, as well as the uncertainty of obtaining possibly different results from different generalized inverses in finite samples. Regardless, the limiting distributional result holds for any generalized inverse; see Hausman and Taylor (1981).
3. The need for a generalized inverse in the original formulation of the test is treated as “cumbersome” in the literature (see, for example, Greene 2012, p. 276 and Wooldridge 2002, p. 119), and it is also put forth as an argument favoring the augmented regression test.
4. The “augmented regression” test also guards against this possibility, since it uses the residuals from regressing each endogenous variable on the instruments: if exact linear dependence exists, the related series of residuals will be a series of zeros.
5. This monotonic fall of power, as we “intensify” the degree to which we attempt to correct the heteroskedasticity estimator for finite-sample performance, is in accord with what MacKinnon (2013, pp. 456–57) found.
6. If there is an issue with the validity of the instruments, as discussed earlier, the augmented regression method would produce at least one series of zero residuals.
7. So, as regards the heteroskedasticity corrector HC1, the number of regressors in the augmented regression setup is $k = 2K_1 + K_2$, while for HC2 and HC3 the diagonal elements $h_{ii}$ are those of a projection matrix that includes these additional variables.
8. MacKinnon (2013, pp. 449–52) also found in his simulations that the HC3 variant performs best as regards empirical size in small samples.

References

  1. Adkins, Lee C., Randall C. Campbell, Viera Chmelarova, and R. Carter Hill. 2012. The Hausman test, and some alternatives, with heteroskedastic data. In Essays in Honor of Jerry Hausman. Advances in Econometrics, vol. 29. Leeds: Emerald Group Publishing Ltd.
  2. Amini, Shahram, Michael S. Delgado, Daniel J. Henderson, and Christopher F. Parmeter. 2012. Fixed vs. random: The Hausman test four decades later. In Essays in Honor of Jerry Hausman. Advances in Econometrics, vol. 29. Leeds: Emerald Group Publishing Ltd.
  3. Durbin, James. 1954. Errors in variables. Revue de l'Institut International de Statistique 22: 23–32.
  4. Greene, William H. 2012. Econometric Analysis, 7th ed. Harlow: Pearson Education Ltd.
  5. Hahn, Jinyong, John C. Ham, and Hyungsik Roger Moon. 2011. The Hausman test and weak instruments. Journal of Econometrics 160: 289–99.
  6. Hausman, Jerry A. 1978. Specification tests in econometrics. Econometrica 46: 1251–71.
  7. Hausman, Jerry A., and William E. Taylor. 1981. A generalized specification test. Economics Letters 8: 239–45.
  8. MacKinnon, James G. 2013. Thirty years of heteroskedasticity-robust inference. In Recent Advances and Future Directions in Causality, Prediction, and Specification Analysis. Edited by Xiaohong Chen and Norman R. Swanson. New York: Springer, pp. 437–61.
  9. White, Halbert. 1980. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48: 817–38.
  10. Wooldridge, Jeffrey M. 2002. Econometric Analysis of Cross Section and Panel Data. Cambridge: MIT Press.
  11. Wu, De-Min. 1973. Alternative tests of independence between stochastic regressors and disturbances. Econometrica 41: 733–50.
  12. Wu, De-Min. 1974. Alternative tests of independence between stochastic regressors and disturbances: Finite sample results. Econometrica 42: 529–46.
Table 1. Monte Carlo simulation study. Empirical size and power (in %) of the matrix Hausman statistic, by skedastic scenario. Nominal size: 5%.

| n | Robust Estimation | Homosk. Size | Homosk. Power | Random Size | Random Power | Group-Wise Size | Group-Wise Power | Conditional Size | Conditional Power |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 50 | Homoskedastic | 4.69 | 49.05 | 4.77 | 48.43 | 4.86 | 47.23 | 5.50 | 38.57 |
| 50 | HC0 | 5.54 | 41.09 | 5.55 | 40.67 | 5.45 | 40.11 | 5.81 | 32.04 |
| 50 | HC1 | 4.30 | 35.06 | 4.23 | 34.64 | 3.96 | 34.22 | 4.03 | 26.53 |
| 50 | HC2 | 4.03 | 33.00 | 4.00 | 32.22 | 3.67 | 32.11 | 3.59 | 24.30 |
| 50 | HC3 | 2.77 | 24.93 | 2.70 | 24.46 | 2.23 | 24.10 | 2.18 | 17.36 |
| 75 | Homoskedastic | 4.50 | 71.74 | 4.61 | 71.37 | 4.78 | 70.51 | 5.53 | 57.97 |
| 75 | HC0 | 5.21 | 64.98 | 5.30 | 64.36 | 5.35 | 63.60 | 5.61 | 51.21 |
| 75 | HC1 | 4.37 | 61.44 | 4.43 | 60.94 | 4.39 | 60.08 | 4.53 | 47.26 |
| 75 | HC2 | 4.15 | 59.41 | 4.09 | 58.97 | 4.16 | 58.06 | 4.13 | 45.04 |
| 75 | HC3 | 3.18 | 53.43 | 3.12 | 52.77 | 3.15 | 51.41 | 2.91 | 38.67 |
| 100 | Homoskedastic | 4.62 | 86.16 | 4.70 | 85.31 | 4.80 | 84.96 | 5.76 | 73.35 |
| 100 | HC0 | 5.02 | 81.33 | 5.23 | 80.72 | 5.17 | 80.25 | 5.05 | 66.79 |
| 100 | HC1 | 4.51 | 79.55 | 4.34 | 78.59 | 4.53 | 78.39 | 4.47 | 64.38 |
| 100 | HC2 | 4.34 | 78.23 | 4.16 | 77.38 | 4.24 | 77.04 | 4.15 | 62.46 |
| 100 | HC3 | 3.43 | 74.23 | 3.36 | 73.25 | 3.58 | 72.65 | 3.32 | 57.35 |
| 200 | Homoskedastic | 4.90 | 99.50 | 4.85 | 99.45 | 4.47 | 99.31 | 6.08 | 96.79 |
| 200 | HC0 | 5.19 | 99.13 | 5.00 | 99.02 | 4.86 | 98.95 | 5.26 | 94.74 |
| 200 | HC1 | 4.86 | 99.06 | 4.81 | 98.98 | 4.50 | 98.87 | 4.97 | 94.34 |
| 200 | HC2 | 4.70 | 98.94 | 4.75 | 98.85 | 4.36 | 98.73 | 4.79 | 93.77 |
| 200 | HC3 | 4.27 | 98.53 | 4.29 | 98.43 | 3.95 | 98.32 | 4.28 | 92.67 |
Table 2. Comparison in empirical size and power (in %) of the matrix Hausman statistic vs. the Wald statistic from the augmented regression setup, by skedastic scenario. Nominal size: 5%.

| n | Statistic | Homosk. Size | Homosk. Power | Random Size | Random Power | Group-Wise Size | Group-Wise Power | Conditional Size | Conditional Power |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 50 | $\hat q_{het}$-HC0 | 5.54 | 41.09 | 5.55 | 40.67 | 5.45 | 40.11 | 5.81 | 32.04 |
| 50 | Wald-HC3 | 5.76 | 45.17 | 5.74 | 44.84 | 5.76 | 44.21 | 5.64 | 34.93 |
| 75 | $\hat q_{het}$-HC0 | 5.21 | 64.98 | 5.30 | 64.36 | 5.35 | 63.60 | 5.61 | 51.21 |
| 75 | Wald-HC3 | 5.57 | 68.42 | 5.45 | 67.76 | 5.53 | 67.53 | 5.43 | 52.96 |
| 100 | $\hat q_{het}$-HC0 | 5.02 | 81.33 | 5.23 | 80.72 | 5.17 | 80.25 | 5.05 | 66.79 |
| 100 | Wald-HC3 | 5.40 | 83.62 | 5.29 | 82.63 | 5.39 | 82.56 | 5.31 | 67.84 |
| 200 | $\hat q_{het}$-HC0 | 5.19 | 99.13 | 5.00 | 99.02 | 4.86 | 98.95 | 5.26 | 94.74 |
| 200 | Wald-HC3 | 5.36 | 99.38 | 5.18 | 99.31 | 4.87 | 99.16 | 5.39 | 94.80 |
Table 3. Comparison in empirical size and power (in %) of the matrix Hausman statistic vs. the Wald statistic from the augmented regression setup, by skedastic scenario. Nominal size: 10%.

| n | Statistic | Homosk. Size | Homosk. Power | Random Size | Random Power | Group-Wise Size | Group-Wise Power | Conditional Size | Conditional Power |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 50 | $\hat q_{hom}$ | 9.72 | 62.89 | 9.63 | 62.31 | 10.14 | 60.37 | 10.60 | 51.81 |
| 50 | Wald-HC3 | 10.16 | 57.35 | 10.15 | 56.43 | 10.44 | 55.14 | 9.91 | 45.69 |
| 75 | $\hat q_{hom}$ | 9.79 | 82.04 | 9.79 | 81.41 | 9.95 | 80.93 | 11.05 | 70.08 |
| 75 | Wald-HC3 | 10.11 | 77.89 | 10.09 | 77.51 | 10.18 | 77.27 | 9.95 | 64.43 |
| 100 | $\hat q_{hom}$ | 9.77 | 92.14 | 9.74 | 91.68 | 10.26 | 91.34 | 11.23 | 82.55 |
| 100 | Wald-HC3 | 10.07 | 90.45 | 10.02 | 89.91 | 10.26 | 89.51 | 10.10 | 78.26 |
| 200 | $\hat q_{hom}$ | 9.96 | 99.80 | 9.81 | 99.74 | 9.96 | 99.75 | 11.60 | 98.45 |
| 200 | Wald-HC3 | 9.90 | 99.77 | 9.93 | 99.66 | 10.33 | 99.62 | 10.23 | 97.33 |

