Article

Non-Iterative Multiscale Estimation for Spatial Autoregressive Geographically Weighted Regression Models

1
Department of Finance and Statistics, School of Science, Xi’an Polytechnic University, Xi’an 710048, China
2
Department of Statistics, School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an 710049, China
*
Author to whom correspondence should be addressed.
Entropy 2023, 25(2), 320; https://doi.org/10.3390/e25020320
Submission received: 15 November 2022 / Revised: 2 February 2023 / Accepted: 6 February 2023 / Published: 9 February 2023
(This article belongs to the Section Information Theory, Probability and Statistics)

Abstract

Multiscale estimation for geographically weighted regression (GWR) and its related models has attracted much attention because of its advantages. This kind of estimation not only improves the accuracy of the coefficient estimators but also reveals the underlying spatial scale of each explanatory variable. However, most existing multiscale estimation approaches are backfitting-based iterative procedures that are very time-consuming. To reduce the computational cost, we propose in this paper a non-iterative multiscale estimation method and its simplified scenario for spatial autoregressive geographically weighted regression (SARGWR) models, an important class of GWR-related models that simultaneously take into account spatial autocorrelation in the response variable and spatial heterogeneity in the regression relationship. In the proposed multiscale estimation methods, the two-stage least-squares (2SLS) based GWR and the local-linear GWR estimators of the regression coefficients with a shrunk bandwidth size are respectively taken to be the initial estimators to obtain the final multiscale estimators of the coefficients without iteration. A simulation study is conducted to assess the performance of the proposed multiscale estimation methods, and the results show that the proposed methods are much more efficient than the backfitting-based estimation procedure. In addition, the proposed methods yield accurate coefficient estimators and variable-specific optimal bandwidth sizes that correctly reflect the underlying spatial scales of the explanatory variables. A real-life example is further provided to demonstrate the applicability of the proposed multiscale estimation methods.

1. Introduction

Geographically weighted regression (GWR) models [1,2,3], which are an extension of the linear regression models by allowing the regression coefficients to vary over space, have been a popular tool for modeling spatial heterogeneity in regression relationships. A GWR model is originally calibrated by the locally weighted least-squares procedure, where the local weights at each focal spatial location are determined by a pre-specified kernel function with a single bandwidth for all of the regression coefficients, and the optimal bandwidth size is chosen by a data-driven criterion, such as the cross-validation (CV) or the corrected Akaike information criterion (AICc) [2,3]. In the GWR technique, spatial heterogeneity in regression relationships is revealed by spatial variation patterns of the estimated regression coefficients. Therefore, the accuracy of coefficient estimators is essential to validly interpret spatial heterogeneity in regression relationships.
In geographical information science, spatial scale is one of the most important concepts [4], and a spatial process inherently operates at a spatial scale [5]. In a GWR model, different spatial scales of the explanatory variables may cause their respective coefficients to have different levels of spatial heterogeneity, and a common bandwidth in the traditional GWR technique cannot produce valid estimators for all of the coefficients [6,7], which has also been theoretically proven for statistically varying coefficient models [8]. To overcome this shortcoming, Yang [6] proposed GWR with flexible bandwidths, in which the backfitting procedure [9] is employed to iteratively estimate the spatially varying coefficients with the variable-specific bandwidth sizes selected by the AICc or CV criterion. Furthermore, Fotheringham et al. [7] explicitly connected the spatial scale of each explanatory variable with its specific bandwidth and termed the model a multiscale geographically weighted regression model. A similar calibration procedure was also suggested by Leong and Yue [10] from the perspective of improving the coefficient estimation accuracy of a GWR model. For ease of presentation, we henceforth refer to this multiscale estimation method as GWR-BF, meaning that a GWR model is calibrated by the backfitting-based iterative procedure. It has been demonstrated that GWR-BF not only yields more accurate estimators of the regression coefficients but also provides information about the spatial scale at which each explanatory variable operates [7,10,11].
Recently, GWR-BF has been extended to some other GWR-related models. For example, Chen and Mei [12] extended the GWR-BF method and formulated a multiscale estimation procedure for the semi-parametric GWR models originally proposed by Brunsdon et al. [13]; furthermore, combining the backfitting procedure with the profile maximum likelihood estimation for spatial autoregressive geographically weighted regression (SARGWR) models [14,15,16], Chen et al. [17] formulated a multiscale estimation method for these models; Wu et al. [18] proposed a backfitting-based multiscale estimation approach for geographically and temporally weighted regression (GTWR) models [19,20]; Zhang et al. [21] suggested a unilateral temporal weighting scheme and proposed a more flexible multiscale estimation method for GTWR models. Because of its advantages, multiscale estimation for the GWR and its related models has been applied in many areas of spatial or spatiotemporal data analysis (see, for example, [22,23,24,25,26,27,28]).
The backfitting-based multiscale estimation methods for the GWR and its related models, however, are iterative algorithms for estimating individual regression coefficients and searching for their respective optimal bandwidth sizes. Therefore, such methods are very time-consuming, especially when the number of explanatory variables is large. In particular, the multiscale estimation method for SARGWR models in Chen et al. [17] is much more time-demanding. Although this maximum-likelihood-based multiscale estimation can yield more accurate estimators for both the autoregressive and the regression coefficients, it incurs, in addition to the computation cost of the iterative backfitting algorithm and maximum likelihood estimation, the cost of optimizing the spatial autoregressive coefficient in each iteration loop, where the determinant of a matrix whose order equals the sample size needs to be calculated for each candidate value of the autoregressive coefficient. Therefore, the development of cost-effective algorithms is essential for real-world applications of multiscale estimation methods for the GWR and its related models.
Recently, there has been a rapid development of machine-learning-based GWR methods, such as geographically weighted extreme learning machine [29], geographically weighted elastic net [30], geographically (and temporally) neural network weighted regression [31,32], geographically weighted regression with the integration of machine learning [33], and the adapted geographically LASSO [34]. One can refer to the reference [35] for a comprehensive overview of spatial machine learning methods, including GWR models. The machine-learning-based GWR methods are, in general, more efficient in estimation and accurate in prediction than the multiscale and even the traditional calibration methods. Nevertheless, how to derive the spatial scale information of each explanatory variable with machine-learning-based GWR methods remains to be investigated.
Focusing on the GWR-BF multiscale estimation procedure, some efforts have been devoted to reducing the computation cost. For example, Yang [6] suggested pre-specifying a larger value of the convergence threshold. However, this may stop the iteration prematurely, before the coefficient estimators have converged. Exploiting the power of modern computers, Li and Fotheringham [36] formulated a parallel implementation of GWR-BF. Very recently, Wu et al. [37] proposed a non-iterative multiscale estimation procedure for GWR models. In this estimation method, the idea of the two-step locally weighted least-squares procedure proposed by Fan and Zhang [8] for fitting statistically varying coefficient models is employed to implement the non-iterative estimation, and the local-linear-fitting method for calibrating GWR models [38] is further used in each step to reduce the boundary effect of coefficient estimators. We henceforth abbreviate this non-iterative multiscale estimation method as GWR-LL, where “LL” means the coefficients are estimated by the local-linear-fitting method in both steps. The simulation study with a real-life example in Wu et al. [37] shows that GWR-LL is not only much more cost-effective than GWR-BF but can also significantly reduce the boundary effect of coefficient estimators. Furthermore, GWR-LL also yields variable-specific optimal bandwidth sizes that correctly characterize the respective spatial scales of the explanatory variables.
As an important GWR-related model, the spatial autoregressive geographically weighted regression (SARGWR) model incorporates a spatial lag term of the response variable into the GWR model to simultaneously take into account spatial autocorrelation in the response variable and spatial heterogeneity in the regression relationships [14,15,16]. Since SARGWR models can simultaneously consider the two fundamental properties of spatial data, i.e., spatial autocorrelation and spatial heterogeneity, they have a wide application background in spatial data analysis. Therefore, in view of the advantages of the multiscale estimation methods for calibrating the GWR and its related models, it is of great importance to develop cost-effective multiscale estimation methods for SARGWR models. As aforementioned, however, the existing iterative multiscale estimation for SARGWR models proposed by Chen et al. [17] is extremely time-consuming.
We are motivated by the simplicity of the two-stage least-squares (2SLS) method, in which the spatial lag term is replaced by an instrument-variables-based estimator and the autoregressive coefficient is then estimated by the ordinary least-squares procedure [39,40]. Considering also that the GWR-based 2SLS estimation for SARGWR models has been explicitly formulated in Mei and Chen [41], we first extend in this paper the GWR-based 2SLS estimation to the local-linear GWR-based 2SLS estimation of SARGWR models. Then, the idea of GWR-LL in Wu et al. [37] is employed to develop a non-iterative multiscale estimation method for SARGWR models. Specifically, the extended local-linear GWR-based 2SLS estimation of SARGWR models, instead of the profile maximum likelihood estimation in Chen et al. [17], is used to derive the initial estimators of the spatial autoregressive coefficient and the regression coefficients for the purpose of reducing the computation cost. Then, the GWR-LL procedure is applied to the estimation of the regression coefficients. Furthermore, the 2SLS estimation for spatial autoregressive models [39,40] is used to re-estimate the autoregressive coefficient. This non-iterative multiscale estimation method is referred to as SARGWR-LL henceforth, meaning that the local-linear GWR estimation method [38] is used in both the initial and final steps.
Moreover, considering that the local-linear estimation of GWR models is more time-consuming than their traditional estimation, and the initial estimators of the regression coefficients might have less influence on their final multiscale estimators, we further propose a simplified scenario of SARGWR-LL by replacing the initial local-linear GWR-based 2SLS estimators of the spatial autoregressive coefficient and regression coefficients with their respective GWR-based 2SLS estimators, which we refer to as SARGWR-GL in the subsequent presentation, where “G” means that the GWR-based 2SLS estimation method is employed in the initial step.
The rest of the paper is organized as follows. In Section 2, the GWR-based 2SLS estimation of SARGWR models [41] is briefly reviewed, from which its local-linear GWR-based version is derived; based on the GWR-based 2SLS estimation and its local-linear GWR-based version of SARGWR models, the SARGWR-LL and its simplified scenario SARGWR-GL are finally formulated. A simulation study and a real-life example based on Dublin voter turnout data are conducted in Section 3 to assess and compare the performance of the related multiscale estimation methods for SARGWR models. The paper ends with a conclusion and discussion.

2. Methods

2.1. Spatial Autoregressive Geographically Weighted Regression (SARGWR) Model

Let $Y$ be the response variable, and $X_1, X_2, \ldots, X_p$ be the associated explanatory variables. Given their observations $\{(y_i; x_{i1}, x_{i2}, \ldots, x_{ip})\}_{i=1}^{n}$ collected at $n$ spatial locations $\{(u_i, v_i)\}_{i=1}^{n}$, the SARGWR model studied in this paper is
$$y_i = \rho \sum_{j=1}^{n} w_{ij} y_j + \sum_{j=1}^{p} \beta_j(u_i, v_i)\, x_{ij} + \varepsilon_i, \quad i = 1, 2, \ldots, n, \tag{1}$$
where the parameter $\rho$, which is assumed to satisfy $|\rho| < 1$, is called the autoregressive coefficient, measuring the intensity of spatial autocorrelation in the response variable $Y$; $\{w_{ij}\}_{i,j=1}^{n}$ are the elements of a pre-specified row-standardized spatial weights matrix $\mathbf{W} = (w_{ij})_{n \times n}$ of the $n$ sampling locations with $w_{ii} = 0$ ($i = 1, 2, \ldots, n$) assumed by convention, which characterizes the neighborhood relationship between each sampling point and its neighbors; $\{\beta_j(u, v)\}_{j=1}^{p}$ are $p$ regression coefficients which are unknown functions of the spatial coordinates $(u, v)$; and $\{\varepsilon_i\}_{i=1}^{n}$ are independent and identically distributed errors with $E(\varepsilon_i) = 0$ and $\mathrm{Var}(\varepsilon_i) = \sigma^2 > 0$. We can take $X_1 = 1$ (i.e., $x_{i1} = 1$ for $i = 1, 2, \ldots, n$) to make the model include a spatially varying intercept.
Let $\mathbf{y} = (y_1, y_2, \ldots, y_n)^T$; $\mathbf{x}_i = (x_{i1}, x_{i2}, \ldots, x_{ip})^T$ ($i = 1, 2, \ldots, n$); $\boldsymbol{\varepsilon} = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)^T$; $\boldsymbol{\beta}(u_i, v_i) = (\beta_1(u_i, v_i), \beta_2(u_i, v_i), \ldots, \beta_p(u_i, v_i))^T$ ($i = 1, 2, \ldots, n$); and $\mathbf{M} = (\mathbf{x}_1^T \boldsymbol{\beta}(u_1, v_1), \mathbf{x}_2^T \boldsymbol{\beta}(u_2, v_2), \ldots, \mathbf{x}_n^T \boldsymbol{\beta}(u_n, v_n))^T$. The SARGWR model in Equation (1) can be expressed in matrix notation as
$$\mathbf{y} = \rho \mathbf{W}\mathbf{y} + \mathbf{M} + \boldsymbol{\varepsilon}. \tag{2}$$
Remark 1.
The above SARGWR model assumes that the autoregressive coefficient $\rho$ is constant. This means that the strength of spatial autocorrelation between the response observed at any spatial sampling unit and the responses observed at its neighbors is the same over space, which may be unrealistic for many real-life spatial data sets. A more general SARGWR model should allow $\rho$ to vary over space, i.e., use $\rho(u, v)$ instead. Considering that the SARGWR model with a constant autoregressive coefficient is a standard model in applications, and that the 2SLS procedure can be directly employed to estimate the parameter $\rho$, we focus on this kind of SARGWR model and derive its non-iterative multiscale estimation method. The extension to the SARGWR model with a spatially varying autoregressive coefficient $\rho(u, v)$ will be discussed in the final section.
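
As a minimal sketch of how data following Equation (2) could be generated, the model can be solved for the response vector, giving the reduced form y = (I - ρW)^{-1}(M + ε) whenever I - ρW is invertible (which holds for a row-standardized W and |ρ| < 1). The function names `knn_weights` and `simulate_sargwr`, the nearest-neighbor construction of W, the coefficient surfaces, and all numerical settings below are illustrative assumptions, not part of the model definition.

```python
import numpy as np

def knn_weights(coords, l=6):
    """Row-standardized l-nearest-neighbor spatial weights matrix with zero diagonal."""
    n = coords.shape[0]
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)               # exclude each point from its own neighbors
    nearest = np.argsort(d, axis=1)[:, :l]    # indices of the l nearest locations
    W = np.zeros((n, n))
    W[np.arange(n)[:, None], nearest] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def simulate_sargwr(X, beta, rho, W, sigma, rng):
    """Draw y from Equation (2) through the reduced form y = (I - rho W)^{-1} (M + eps)."""
    n = X.shape[0]
    M = np.sum(X * beta, axis=1)              # M_i = x_i^T beta(u_i, v_i)
    eps = rng.normal(0.0, sigma, size=n)
    y = np.linalg.solve(np.eye(n) - rho * W, M + eps)
    return y, M, eps

rng = np.random.default_rng(0)
n = 50
coords = rng.uniform(size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # X_1 = 1 gives a varying intercept
# Hypothetical coefficient surfaces, chosen only for illustration.
beta = np.column_stack([1.0 + coords[:, 0], np.sin(np.pi * coords[:, 1])])
W = knn_weights(coords, l=6)
y, M, eps = simulate_sargwr(X, beta, rho=0.5, W=W, sigma=0.5, rng=rng)
```

Solving the linear system rather than inverting I - ρW explicitly is a numerical convenience; the simulated `y` satisfies Equation (2) up to rounding error.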

2.2. Preliminary Methods for Formulating the Non-Iterative Multiscale Estimation of the SARGWR Model

2.2.1. GWR-Based 2SLS Estimation of the SARGWR Model

In this subsection, we briefly review the GWR-based 2SLS estimation of the SARGWR model, which we will use to formulate the SARGWR-GL multiscale estimation procedure. For more detailed derivation, one can refer to the supplementary materials at https://doi.org/10.1016/j.spasta.2022.100666 provided in Mei and Chen [41].
We rewrite the model in Equation (2) as
$$\tilde{\mathbf{y}} = \mathbf{y} - \rho \mathbf{W}\mathbf{y} = \mathbf{M} + \boldsymbol{\varepsilon} = (\mathbf{x}_1^T \boldsymbol{\beta}(u_1, v_1), \mathbf{x}_2^T \boldsymbol{\beta}(u_2, v_2), \ldots, \mathbf{x}_n^T \boldsymbol{\beta}(u_n, v_n))^T + \boldsymbol{\varepsilon}. \tag{3}$$
We treat the above model as a GWR model with the observation vector of the response variable being $\tilde{\mathbf{y}} = \mathbf{y} - \rho \mathbf{W}\mathbf{y}$, and obtain, according to Brunsdon et al. [2], that the GWR estimators of $\boldsymbol{\beta}(u_i, v_i)$ ($i = 1, 2, \ldots, n$) are
$$\hat{\boldsymbol{\beta}}_G(u_i, v_i) = (\hat{\beta}_{G1}(u_i, v_i), \hat{\beta}_{G2}(u_i, v_i), \ldots, \hat{\beta}_{Gp}(u_i, v_i))^T = (\mathbf{X}^T \mathbf{W}_h(u_i, v_i) \mathbf{X})^{-1} \mathbf{X}^T \mathbf{W}_h(u_i, v_i) (\mathbf{y} - \rho \mathbf{W}\mathbf{y}), \quad i = 1, 2, \ldots, n, \tag{4}$$
where the subscript “G” means the traditional GWR estimation method, $\mathbf{X} = (\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n)^T$ is the design matrix, and
$$\mathbf{W}_h(u_i, v_i) = \mathrm{Diag}(w_{1h}(u_i, v_i), w_{2h}(u_i, v_i), \ldots, w_{nh}(u_i, v_i)) \tag{5}$$
is the calibration weights matrix at $(u_i, v_i)$, which is related to the bandwidth $h$. Here, the word “calibration” is used in order to distinguish this matrix from the spatial weights matrix $\mathbf{W}$ in the model.
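Substituting $\hat{\boldsymbol{\beta}}_G(u_i, v_i)$ ($i = 1, 2, \ldots, n$) into $\mathbf{M}$, we obtain the estimator of $\mathbf{M}$ as
$$\hat{\mathbf{M}}_G = (\mathbf{x}_1^T \hat{\boldsymbol{\beta}}_G(u_1, v_1), \mathbf{x}_2^T \hat{\boldsymbol{\beta}}_G(u_2, v_2), \ldots, \mathbf{x}_n^T \hat{\boldsymbol{\beta}}_G(u_n, v_n))^T = \mathbf{S}_G(h)\, \tilde{\mathbf{y}} = \mathbf{S}_G(h) (\mathbf{I}_n - \rho \mathbf{W}) \mathbf{y},$$
where $\mathbf{I}_n$ is the identity matrix of order $n$, and
$$\mathbf{S}_G(h) = \begin{pmatrix} \mathbf{x}_1^T (\mathbf{X}^T \mathbf{W}_h(u_1, v_1) \mathbf{X})^{-1} \mathbf{X}^T \mathbf{W}_h(u_1, v_1) \\ \mathbf{x}_2^T (\mathbf{X}^T \mathbf{W}_h(u_2, v_2) \mathbf{X})^{-1} \mathbf{X}^T \mathbf{W}_h(u_2, v_2) \\ \vdots \\ \mathbf{x}_n^T (\mathbf{X}^T \mathbf{W}_h(u_n, v_n) \mathbf{X})^{-1} \mathbf{X}^T \mathbf{W}_h(u_n, v_n) \end{pmatrix}.$$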
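Furthermore, replacing $\mathbf{M}$ in Equation (3) with its estimator $\hat{\mathbf{M}}_G$ yields the following artificial spatial autoregressive model:
$$(\mathbf{I}_n - \mathbf{S}_G(h))\, \mathbf{y} = \rho\, (\mathbf{I}_n - \mathbf{S}_G(h))\, \mathbf{W}\mathbf{y} + \boldsymbol{\varepsilon}. \tag{6}$$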
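
The step from Equation (3) to this artificial model can be spelled out as follows. This is only a sketch, in which the error incurred by replacing $\mathbf{M}$ with $\hat{\mathbf{M}}_G = \mathbf{S}_G(h)(\mathbf{I}_n - \rho\mathbf{W})\mathbf{y}$ is absorbed into $\boldsymbol{\varepsilon}$:

$$
\begin{aligned}
\mathbf{y} - \rho \mathbf{W}\mathbf{y} &= \mathbf{S}_G(h)(\mathbf{I}_n - \rho \mathbf{W})\mathbf{y} + \boldsymbol{\varepsilon}, \\
(\mathbf{I}_n - \mathbf{S}_G(h))(\mathbf{I}_n - \rho \mathbf{W})\mathbf{y} &= \boldsymbol{\varepsilon}, \\
(\mathbf{I}_n - \mathbf{S}_G(h))\,\mathbf{y} &= \rho\,(\mathbf{I}_n - \mathbf{S}_G(h))\,\mathbf{W}\mathbf{y} + \boldsymbol{\varepsilon},
\end{aligned}
$$

which is the artificial spatial autoregressive model in Equation (6).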
In the above model, since $\mathbf{W}\mathbf{y}$ is endogenous, an instrumental estimator of $\mathbf{W}\mathbf{y}$ is needed to derive a consistent estimator of $\rho$ in the framework of least-squares estimation. As suggested by Geniaux and Martinetti [16] as well as by Mei and Chen [41], the instrumental variables $\mathbf{Q} = (\mathbf{X}, \mathbf{W}\mathbf{X}_{(1)}, \mathbf{W}^2 \mathbf{X}_{(1)})$, where $\mathbf{X}$ is the design matrix and $\mathbf{X}_{(1)}$ is the matrix obtained from $\mathbf{X}$ by removing its first column when an intercept is included in the SARGWR model, can be used to estimate $\mathbf{W}\mathbf{y}$ by formulating the following GWR model:
$$y_{wi} = \sum_{j=1}^{n} w_{ij} y_j = \mathbf{q}_i^T \boldsymbol{\alpha}(u_i, v_i) + \eta_i, \quad i = 1, 2, \ldots, n, \tag{7}$$
where $y_{wi}$ is the $i$-th element of $\mathbf{W}\mathbf{y}$ and $\mathbf{q}_i^T = (1, q_{i2}, \ldots, q_{i,3p-2})$ is the $i$-th row of the instrumental variables $\mathbf{Q}$. Calibrating the model in Equation (7) with the traditional GWR technique [2], we obtain the fitted values of $y_{wi}$ ($i = 1, 2, \ldots, n$) as
$$\hat{y}_{wi} = \mathbf{q}_i^T (\mathbf{Q}^T \mathbf{W}_h(u_i, v_i) \mathbf{Q})^{-1} \mathbf{Q}^T \mathbf{W}_h(u_i, v_i)\, \mathbf{W}\mathbf{y}, \quad i = 1, 2, \ldots, n.$$
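Therefore, the instrumental estimator of $\mathbf{W}\mathbf{y}$, which we denote by $\hat{\mathbf{y}}_w$, is
$$\hat{\mathbf{y}}_w = (\hat{y}_{w1}, \hat{y}_{w2}, \ldots, \hat{y}_{wn})^T = \mathbf{S}_Q(h)\, \mathbf{W}\mathbf{y}, \tag{8}$$
where
$$\mathbf{S}_Q(h) = \begin{pmatrix} \mathbf{q}_1^T (\mathbf{Q}^T \mathbf{W}_h(u_1, v_1) \mathbf{Q})^{-1} \mathbf{Q}^T \mathbf{W}_h(u_1, v_1) \\ \mathbf{q}_2^T (\mathbf{Q}^T \mathbf{W}_h(u_2, v_2) \mathbf{Q})^{-1} \mathbf{Q}^T \mathbf{W}_h(u_2, v_2) \\ \vdots \\ \mathbf{q}_n^T (\mathbf{Q}^T \mathbf{W}_h(u_n, v_n) \mathbf{Q})^{-1} \mathbf{Q}^T \mathbf{W}_h(u_n, v_n) \end{pmatrix}. \tag{9}$$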
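Replacing $\mathbf{W}\mathbf{y}$ in the model in Equation (6) with its instrumental estimator $\hat{\mathbf{y}}_w$ in Equation (8) and using the least-squares estimation procedure, we obtain the GWR-based 2SLS estimator of the autoregressive coefficient $\rho$ as
$$\hat{\rho}_G = \left[ \hat{\mathbf{y}}_w^T (\mathbf{I}_n - \mathbf{S}_G(h))^T (\mathbf{I}_n - \mathbf{S}_G(h))\, \hat{\mathbf{y}}_w \right]^{-1} \hat{\mathbf{y}}_w^T (\mathbf{I}_n - \mathbf{S}_G(h))^T (\mathbf{I}_n - \mathbf{S}_G(h))\, \mathbf{y}. \tag{10}$$
Substituting $\hat{\rho}_G$ into Equation (4), we obtain the GWR-based 2SLS estimators of the regression coefficient vectors $\boldsymbol{\beta}(u_i, v_i)$ ($i = 1, 2, \ldots, n$) as
$$\hat{\boldsymbol{\beta}}_G(u_i, v_i) = (\hat{\beta}_{G1}(u_i, v_i), \hat{\beta}_{G2}(u_i, v_i), \ldots, \hat{\beta}_{Gp}(u_i, v_i))^T = (\mathbf{X}^T \mathbf{W}_h(u_i, v_i) \mathbf{X})^{-1} \mathbf{X}^T \mathbf{W}_h(u_i, v_i) (\mathbf{I}_n - \hat{\rho}_G \mathbf{W})\, \mathbf{y}, \quad i = 1, 2, \ldots, n. \tag{11}$$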
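Furthermore, the fitted values of the response variable $Y$ at $\{(u_i, v_i)\}_{i=1}^{n}$ are
$$\hat{y}_{Gi}(h) = \hat{\rho}_G \sum_{j=1}^{n} w_{ij} y_j + \sum_{j=1}^{p} \hat{\beta}_{Gj}(u_i, v_i)\, x_{ij}, \quad i = 1, 2, \ldots, n,$$
and the fitted vector can be expressed as
$$\hat{\mathbf{y}}_G(h) = (\hat{y}_{G1}(h), \hat{y}_{G2}(h), \ldots, \hat{y}_{Gn}(h))^T = \mathbf{H}_G(h)\, \mathbf{y},$$
where the hat matrix is
$$\mathbf{H}_G(h) = (\mathbf{I}_n - \mathbf{S}_G(h))\, \mathbf{W}\mathbf{y} \left[ \hat{\mathbf{y}}_w^T (\mathbf{I}_n - \mathbf{S}_G(h))^T (\mathbf{I}_n - \mathbf{S}_G(h))\, \hat{\mathbf{y}}_w \right]^{-1} \hat{\mathbf{y}}_w^T (\mathbf{I}_n - \mathbf{S}_G(h))^T (\mathbf{I}_n - \mathbf{S}_G(h)) + \mathbf{S}_G(h). \tag{13}$$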
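
The form of this hat matrix can be checked directly. As a sketch, write $\mathbf{R} = \mathbf{I}_n - \mathbf{S}_G(h)$ (a shorthand introduced here only for brevity). Because $\hat{\rho}_G$ is a scalar, stacking the fitted values and using the expression for $\hat{\boldsymbol{\beta}}_G(u_i, v_i)$ in Equation (11) gives

$$
\begin{aligned}
\hat{\mathbf{y}}_G(h) &= \hat{\rho}_G \mathbf{W}\mathbf{y} + \mathbf{S}_G(h)(\mathbf{I}_n - \hat{\rho}_G \mathbf{W})\mathbf{y} \\
&= \mathbf{R}\mathbf{W}\mathbf{y}\, \hat{\rho}_G + \mathbf{S}_G(h)\mathbf{y} \\
&= \left\{ \mathbf{R}\mathbf{W}\mathbf{y} \left[ \hat{\mathbf{y}}_w^T \mathbf{R}^T \mathbf{R}\, \hat{\mathbf{y}}_w \right]^{-1} \hat{\mathbf{y}}_w^T \mathbf{R}^T \mathbf{R} + \mathbf{S}_G(h) \right\} \mathbf{y},
\end{aligned}
$$

where the last line inserts the expression for $\hat{\rho}_G$ from Equation (10).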
For ease of presentation, we henceforth refer to the GWR-based 2SLS estimators $\hat{\rho}_G$ and $\hat{\boldsymbol{\beta}}_G(u_i, v_i)$ ($i = 1, 2, \ldots, n$) in Equations (10) and (11) as their respective GWR-2SLS estimators.
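
The GWR-2SLS computations in Equations (4) to (11) can be sketched in a few lines of NumPy. The sketch assumes a Gaussian kernel with a fixed, user-supplied bandwidth h (no bandwidth selection is performed); the helper names `gwr_operators`, `smoother`, and `gwr_2sls`, as well as the toy data, are hypothetical and serve only to illustrate the formulas.

```python
import numpy as np

def gwr_operators(Z, coords, h):
    """A[i] = (Z^T W_h(u_i,v_i) Z)^{-1} Z^T W_h(u_i,v_i), stacked into shape (n, q, n)."""
    n, q = Z.shape
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    K = np.exp(-0.5 * (d / h) ** 2)          # assumed Gaussian kernel; K[i, j] = w_jh(u_i, v_i)
    A = np.empty((n, q, n))
    for i in range(n):
        ZtW = Z.T * K[i]                     # Z^T W_h without forming the diagonal matrix
        A[i] = np.linalg.solve(ZtW @ Z, ZtW)
    return A

def smoother(Z, A):
    """Matrix whose i-th row is z_i^T A[i] (S_G for Z = X, S_Q for Z = Q)."""
    return np.einsum("iq,iqn->in", Z, A)

def gwr_2sls(y, X, W, coords, h):
    n = y.shape[0]
    Wy = W @ y
    X1 = X[:, 1:]                                    # X_(1): X without the intercept column
    Q = np.column_stack([X, W @ X1, W @ W @ X1])     # instrumental variables
    y_w_hat = smoother(Q, gwr_operators(Q, coords, h)) @ Wy    # Equation (8)
    A = gwr_operators(X, coords, h)
    R = np.eye(n) - smoother(X, A)                   # I_n - S_G(h)
    a = R @ y_w_hat
    rho_hat = float(a @ (R @ y) / (a @ a))           # Equation (10)
    beta_hat = A @ (y - rho_hat * Wy)                # Equation (11), shape (n, p)
    y_fit = rho_hat * Wy + np.sum(X * beta_hat, axis=1)
    return rho_hat, beta_hat, y_fit, y_w_hat

# Toy data (hypothetical settings): 6-nearest-neighbor W and two smooth coefficient surfaces.
rng = np.random.default_rng(0)
n = 60
coords = rng.uniform(size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.column_stack([1.0 + coords[:, 0], 2.0 * coords[:, 1]])
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
np.fill_diagonal(d, np.inf)
W = np.zeros((n, n))
W[np.arange(n)[:, None], np.argsort(d, axis=1)[:, :6]] = 1.0 / 6.0
y = np.linalg.solve(np.eye(n) - 0.5 * W, np.sum(X * beta, axis=1) + rng.normal(0.0, 0.5, n))
rho_hat, beta_hat, y_fit, y_w_hat = gwr_2sls(y, X, W, coords, h=0.3)
```

Storing all local operators in one (n, q, n) array keeps the sketch short but costs memory of order n²q; for large n one would compute each location's operator on the fly instead.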

2.2.2. Local-Linear GWR-Based 2SLS Estimation of the SARGWR Model

In what follows, we extend the GWR-based 2SLS estimation of the SARGWR model to local-linear GWR-based estimation, which will be used to formulate the multiscale estimation procedure, SARGWR-LL.
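Once again, we treat the model in Equation (3) as a GWR model with the observation vector of the response variable being $\tilde{\mathbf{y}} = \mathbf{y} - \rho \mathbf{W}\mathbf{y}$. According to Wang et al. [38], the local-linear estimators of $\boldsymbol{\beta}(u_i, v_i)$ ($i = 1, 2, \ldots, n$) are
$$\hat{\boldsymbol{\beta}}_L(u_i, v_i) = (\hat{\beta}_{L1}(u_i, v_i), \hat{\beta}_{L2}(u_i, v_i), \ldots, \hat{\beta}_{Lp}(u_i, v_i))^T = (\mathbf{I}_p, \mathbf{0}_{p \times (2p)}) \left[ \mathbf{X}^T(u_i, v_i) \mathbf{W}_h(u_i, v_i) \mathbf{X}(u_i, v_i) \right]^{-1} \mathbf{X}^T(u_i, v_i) \mathbf{W}_h(u_i, v_i) (\mathbf{y} - \rho \mathbf{W}\mathbf{y}), \quad i = 1, 2, \ldots, n, \tag{14}$$
where the subscript “L” represents the local-linear GWR estimation,
$$\mathbf{X}(u_i, v_i) = \begin{pmatrix} x_{11} & \cdots & x_{1p} & x_{11}(u_1 - u_i) & \cdots & x_{1p}(u_1 - u_i) & x_{11}(v_1 - v_i) & \cdots & x_{1p}(v_1 - v_i) \\ x_{21} & \cdots & x_{2p} & x_{21}(u_2 - u_i) & \cdots & x_{2p}(u_2 - u_i) & x_{21}(v_2 - v_i) & \cdots & x_{2p}(v_2 - v_i) \\ \vdots & & \vdots & \vdots & & \vdots & \vdots & & \vdots \\ x_{n1} & \cdots & x_{np} & x_{n1}(u_n - u_i) & \cdots & x_{np}(u_n - u_i) & x_{n1}(v_n - v_i) & \cdots & x_{np}(v_n - v_i) \end{pmatrix} \tag{15}$$
is the design matrix at $(u_i, v_i)$, $\mathbf{0}_{p \times (2p)}$ is the zero matrix of order $p \times (2p)$, and $\mathbf{W}_h(u_i, v_i)$ is the same diagonal calibration weights matrix as in Equation (5). The resulting estimator of $\mathbf{M} = (\mathbf{x}_1^T \boldsymbol{\beta}(u_1, v_1), \mathbf{x}_2^T \boldsymbol{\beta}(u_2, v_2), \ldots, \mathbf{x}_n^T \boldsymbol{\beta}(u_n, v_n))^T$ is
$$\hat{\mathbf{M}}_L = \mathbf{S}_L(h) (\mathbf{I}_n - \rho \mathbf{W})\, \mathbf{y},$$
where
$$\mathbf{S}_L(h) = \begin{pmatrix} (\mathbf{x}_1^T, \mathbf{0}_{1 \times (2p)}) \left[ \mathbf{X}^T(u_1, v_1) \mathbf{W}_h(u_1, v_1) \mathbf{X}(u_1, v_1) \right]^{-1} \mathbf{X}^T(u_1, v_1) \mathbf{W}_h(u_1, v_1) \\ (\mathbf{x}_2^T, \mathbf{0}_{1 \times (2p)}) \left[ \mathbf{X}^T(u_2, v_2) \mathbf{W}_h(u_2, v_2) \mathbf{X}(u_2, v_2) \right]^{-1} \mathbf{X}^T(u_2, v_2) \mathbf{W}_h(u_2, v_2) \\ \vdots \\ (\mathbf{x}_n^T, \mathbf{0}_{1 \times (2p)}) \left[ \mathbf{X}^T(u_n, v_n) \mathbf{W}_h(u_n, v_n) \mathbf{X}(u_n, v_n) \right]^{-1} \mathbf{X}^T(u_n, v_n) \mathbf{W}_h(u_n, v_n) \end{pmatrix}. \tag{16}$$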
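The same derivation as for Equation (6) yields the following artificial spatial autoregressive model:
$$(\mathbf{I}_n - \mathbf{S}_L(h))\, \mathbf{y} = \rho\, (\mathbf{I}_n - \mathbf{S}_L(h))\, \mathbf{W}\mathbf{y} + \boldsymbol{\varepsilon}.$$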
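Substituting the instrumental estimator $\hat{\mathbf{y}}_w$ of $\mathbf{W}\mathbf{y}$ in Equation (8) into the above model, we obtain the local-linear GWR-based 2SLS (henceforth referred to as LGWR-2SLS) estimator of $\rho$ as
$$\hat{\rho}_L = \left[ \hat{\mathbf{y}}_w^T (\mathbf{I}_n - \mathbf{S}_L(h))^T (\mathbf{I}_n - \mathbf{S}_L(h))\, \hat{\mathbf{y}}_w \right]^{-1} \hat{\mathbf{y}}_w^T (\mathbf{I}_n - \mathbf{S}_L(h))^T (\mathbf{I}_n - \mathbf{S}_L(h))\, \mathbf{y}. \tag{17}$$
The LGWR-2SLS estimators of the regression coefficient vectors are
$$\hat{\boldsymbol{\beta}}_L(u_i, v_i) = (\mathbf{I}_p, \mathbf{0}_{p \times (2p)}) \left[ \mathbf{X}^T(u_i, v_i) \mathbf{W}_h(u_i, v_i) \mathbf{X}(u_i, v_i) \right]^{-1} \mathbf{X}^T(u_i, v_i) \mathbf{W}_h(u_i, v_i) (\mathbf{I}_n - \hat{\rho}_L \mathbf{W})\, \mathbf{y}, \quad i = 1, 2, \ldots, n. \tag{18}$$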
The fitted vector of the response variable at the sampling locations $\{(u_i, v_i)\}_{i=1}^{n}$ can be expressed as
$$\hat{\mathbf{y}}_L(h) = (\hat{y}_{L1}(h), \hat{y}_{L2}(h), \ldots, \hat{y}_{Ln}(h))^T = \mathbf{H}_L(h)\, \mathbf{y}, \tag{19}$$
where the hat matrix $\mathbf{H}_L(h)$ is of the same form as that in Equation (13) except that $\mathbf{S}_G(h)$ therein is replaced by $\mathbf{S}_L(h)$, shown in Equation (16).
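
The local-linear building blocks, namely the design matrix of Equation (15) and the rows of the matrix in Equation (16), can be sketched as follows. A Gaussian kernel with a fixed bandwidth is assumed, and the names `local_linear_design` and `local_linear_operators`, together with the toy coefficient surfaces, are hypothetical.

```python
import numpy as np

def local_linear_design(X, coords, i):
    """Design matrix at location i: [X, X * (u - u_i), X * (v - v_i)], shape (n, 3p)."""
    du = (coords[:, 0] - coords[i, 0])[:, None]
    dv = (coords[:, 1] - coords[i, 1])[:, None]
    return np.hstack([X, X * du, X * dv])

def local_linear_operators(X, coords, h):
    """Return A (n, p, n), mapping a response vector to local coefficients, and S_L(h)."""
    n, p = X.shape
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A, S = np.empty((n, p, n)), np.empty((n, n))
    for i in range(n):
        Xi = local_linear_design(X, coords, i)
        XtW = Xi.T * np.exp(-0.5 * (d[i] / h) ** 2)    # assumed Gaussian kernel weights
        A[i] = np.linalg.solve(XtW @ Xi, XtW)[:p]      # (I_p, 0) keeps the first p rows
        S[i] = X[i] @ A[i]                             # i-th row of S_L(h)
    return A, S

rng = np.random.default_rng(0)
n, h = 80, 0.3
coords = rng.uniform(size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
# Hypothetical coefficient surfaces that are exactly linear in (u, v).
beta = np.column_stack([1.0 + 2.0 * coords[:, 0], 0.5 - coords[:, 1]])
target = np.sum(X * beta, axis=1)       # noise-free stand-in for y - rho * W y
A, S_L = local_linear_operators(X, coords, h)
beta_hat = A @ target                   # local-linear coefficient estimates, shape (n, p)
```

Because the toy coefficient surfaces are exactly linear in the coordinates and no noise is added, the local-linear fit reproduces them up to rounding error at every location, including those near the boundary of the region.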

2.2.3. Generating the Calibration Weights Matrix $\mathbf{W}_h(u_i, v_i)$ and Selecting the Bandwidth $h$

As is well known in the GWR literature [2,3], the elements in the calibration weights matrix in Equation (5) are generated by a kernel function $K(t)$, which is usually taken to be the Gaussian kernel or the bisquare kernel, and the bandwidth $h$ can be set to be fixed or adaptive. Specifically, given a focal sampling location $(u_i, v_i)$, the weights with a fixed bandwidth at $(u_i, v_i)$ are
$$w_{jh}(u_i, v_i) = K\!\left( \frac{d_{ij}}{h} \right), \quad j = 1, 2, \ldots, n, \tag{20}$$
where $\{d_{ij}\}_{j=1}^{n}$ are the distances (usually the Euclidean distance) from $(u_i, v_i)$ to all of the sampling locations $\{(u_j, v_j)\}_{j=1}^{n}$. The weights with an adaptive bandwidth at $(u_i, v_i)$ are commonly generated by the bisquare kernel, which gives
$$w_{jk}(u_i, v_i) = \begin{cases} \left[ 1 - \left( \dfrac{d_{ij}}{h_{ik}} \right)^2 \right]^2, & d_{ij} \le h_{ik}, \\ 0, & d_{ij} > h_{ik}, \end{cases} \quad j = 1, 2, \ldots, n, \tag{21}$$
where the bandwidth $h_{ik}$ is the distance from $(u_i, v_i)$ to its $k$-th nearest sampling location and varies with $(u_i, v_i)$. As shown by Gollini et al. [42], when the sampling locations $\{(u_i, v_i)\}_{i=1}^{n}$ are irregularly distributed over space, the weights with an adaptive bandwidth perform better than those with a fixed bandwidth.
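
The two weighting schemes in Equations (20) and (21) can be illustrated with a short sketch. Two points are assumptions made here rather than statements from the text: the Gaussian kernel is taken in the form K(t) = exp(-t²/2), and the focal location itself (at distance zero) is not counted when locating the k-th nearest sampling location. The function names are hypothetical.

```python
import numpy as np

def fixed_gaussian_weights(d, h):
    """Equation (20) with the Gaussian kernel assumed to be K(t) = exp(-t^2 / 2)."""
    return np.exp(-0.5 * (np.asarray(d, dtype=float) / h) ** 2)

def adaptive_bisquare_weights(d, k):
    """Equation (21); d holds the distances from the focal location to all n locations.

    Assumption: the focal location itself (distance 0) is not counted as a neighbor,
    so h_ik is the (k + 1)-th smallest entry of d.
    """
    d = np.asarray(d, dtype=float)
    h_ik = np.sort(d)[k]
    return np.where(d <= h_ik, (1.0 - (d / h_ik) ** 2) ** 2, 0.0)

d = np.array([0.0, 1.0, 2.0, 3.0, 4.0])       # toy distances from one focal location
w_fixed = fixed_gaussian_weights(d, h=2.0)
w_adapt = adaptive_bisquare_weights(d, k=3)   # h_ik = 3.0 for these toy distances
```

With the bisquare kernel, the k-th nearest location itself receives weight zero, so only locations strictly closer than the bandwidth contribute to the local fit.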
Throughout this paper, the AICc criterion [3] is employed to select the optimal bandwidth size in both GWR-2SLS and LGWR-2SLS estimation approaches. Specifically, let $\mathbf{H}(h)$ be the hat matrix in either of the two estimation methods. The AICc score is computed by
$$\mathrm{AICc}(h) = \log\left\{ \frac{1}{n}\, \mathbf{y}^T (\mathbf{I}_n - \mathbf{H}(h))^T (\mathbf{I}_n - \mathbf{H}(h))\, \mathbf{y} \right\} + \frac{n + \mathrm{tr}(\mathbf{H}(h))}{n - 2 - \mathrm{tr}(\mathbf{H}(h))}, \tag{22}$$
where $\mathbf{H}(h) = \mathbf{H}_G(h)$ in Equation (13) for GWR-2SLS and $\mathbf{H}(h) = \mathbf{H}_L(h)$ in Equation (19) for LGWR-2SLS. The optimal bandwidth size, which we denote by $h_0$, is
$$h_0 = \arg\min_{h > 0} \mathrm{AICc}(h). \tag{23}$$
Remark 2.
When the adaptive bandwidth is set, the optimal size of the parameter $k$, which is taken as a proxy of the adaptive bandwidth $h_{ik}$ at each $(u_i, v_i)$, is selected by the AICc criterion. To avoid confusion, we henceforth call $k$ the bandwidth.
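
The AICc score in Equation (22) and the minimization in Equation (23) can be sketched as follows. The minimization is carried out here by a plain grid search over a user-supplied candidate list, which is an assumption of this sketch (the text does not prescribe a search strategy); the one-dimensional smoother `toy_hat` is a hypothetical stand-in for the hat matrices of the GWR-2SLS and LGWR-2SLS estimators.

```python
import numpy as np

def aicc(y, H):
    """AICc score of Equation (22) for response vector y and hat matrix H."""
    n = y.shape[0]
    resid = y - H @ y
    tr = np.trace(H)
    return np.log(resid @ resid / n) + (n + tr) / (n - 2.0 - tr)

def select_bandwidth(y, hat_matrix_fn, candidates):
    """Grid search: return the candidate minimizing AICc, plus all scores."""
    scores = [aicc(y, hat_matrix_fn(h)) for h in candidates]
    return candidates[int(np.argmin(scores))], scores

# Toy check with a hypothetical 1-D local-constant smoother in place of a GWR hat matrix.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 60)
y = np.sin(2.0 * np.pi * t) + rng.normal(0.0, 0.2, size=t.size)

def toy_hat(h):
    K = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
    return K / K.sum(axis=1, keepdims=True)

candidates = [0.02, 0.05, 0.1, 0.3, 1.0]
h0, scores = select_bandwidth(y, toy_hat, candidates)
```

For an adaptive bandwidth, the same search would run over integer values of k instead of distances h.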

2.3. Non-Iterative Multiscale Estimation Procedures for the SARGWR Model

2.3.1. SARGWR-LL Procedure

Based on the LGWR-2SLS estimators $\hat{\rho}_L$ in Equation (17) and $\hat{\boldsymbol{\beta}}_L(u_i, v_i)$ in Equation (18) of the autoregressive coefficient $\rho$ and the regression coefficients $\boldsymbol{\beta}(u_i, v_i)$ ($i = 1, 2, \ldots, n$), the SARGWR-LL non-iterative multiscale estimation is formulated by the following steps:
(i) Let $h_{0L}$ be the optimal bandwidth size selected in the LGWR-2SLS estimation of the SARGWR model. Fix the instrumental estimator $\hat{\mathbf{y}}_w$ of $\mathbf{W}\mathbf{y}$ in Equation (8) and $\hat{\rho}_L$ in Equation (17) at $h_{0L}$, which we denote by $\hat{\mathbf{y}}_w(h_{0L})$ and $\hat{\rho}_L(h_{0L})$, respectively;
(ii) Let $\tilde{h} = c\, h_{0L}$, where $c \in (0, 1)$ is a constant called the bandwidth shrinking parameter. Compute the coefficient estimators $\hat{\boldsymbol{\beta}}_L(u_i, v_i)$ ($i = 1, 2, \ldots, n$) in Equation (18) in which the bandwidth $h$ is replaced by $\tilde{h}$ and $\hat{\rho}_L$ is substituted by $\hat{\rho}_L(h_{0L})$. We denote the resulting estimators by
$$\tilde{\boldsymbol{\beta}}^{(\tilde{h})}(u_i, v_i) = (\tilde{\beta}_1^{(\tilde{h})}(u_i, v_i), \tilde{\beta}_2^{(\tilde{h})}(u_i, v_i), \ldots, \tilde{\beta}_p^{(\tilde{h})}(u_i, v_i))^T, \quad i = 1, 2, \ldots, n; \tag{24}$$
(iii) Fixing each $m \in \{1, 2, \ldots, p\}$ and substituting $\hat{\rho}_L(h_{0L})$ and $\{\tilde{\beta}_j^{(\tilde{h})}(u_i, v_i)\}_{j=1, j \ne m}^{p}$ into the SARGWR model in Equation (1), we formulate the following artificial GWR model with a single spatially varying coefficient:
$$y_i^{*(m)} = y_i - \hat{\rho}_L(h_{0L}) \sum_{j=1}^{n} w_{ij} y_j - \sum_{j=1, j \ne m}^{p} \tilde{\beta}_j^{(\tilde{h})}(u_i, v_i)\, x_{ij} = \beta_m(u_i, v_i)\, x_{im} + \tilde{\varepsilon}_i, \quad i = 1, 2, \ldots, n.$$
Calibrating the above model by the local-linear GWR estimation [38] with the optimal bandwidth size selected by the AICc criterion, we obtain the SARGWR-LL estimators of $\beta_m(u_i, v_i)$ ($i = 1, 2, \ldots, n$) as
$$\hat{\beta}_m^{(h_m)}(u_i, v_i) = (1, 0, 0) \left[ \mathbf{X}_m^T(u_i, v_i) \mathbf{W}_{h_m}(u_i, v_i) \mathbf{X}_m(u_i, v_i) \right]^{-1} \mathbf{X}_m^T(u_i, v_i) \mathbf{W}_{h_m}(u_i, v_i)\, \mathbf{y}^{*(m)}, \quad i = 1, 2, \ldots, n,$$
where
$$\mathbf{X}_m(u_i, v_i) = \begin{pmatrix} x_{1m} & x_{1m}(u_1 - u_i) & x_{1m}(v_1 - v_i) \\ x_{2m} & x_{2m}(u_2 - u_i) & x_{2m}(v_2 - v_i) \\ \vdots & \vdots & \vdots \\ x_{nm} & x_{nm}(u_n - u_i) & x_{nm}(v_n - v_i) \end{pmatrix}, \quad \mathbf{y}^{*(m)} = \begin{pmatrix} y_1^{*(m)} \\ y_2^{*(m)} \\ \vdots \\ y_n^{*(m)} \end{pmatrix},$$
and $h_m$ is the optimal bandwidth size selected by the AICc criterion in which $\mathbf{y}$ and the hat matrix $\mathbf{H}(h)$ in the AICc score in Equation (22) are respectively replaced by $\mathbf{y}^{*(m)}$ and
$$\mathbf{H}_m(h) = \begin{pmatrix} (x_{1m}, 0, 0)\, \mathbf{P}(u_1, v_1) \\ (x_{2m}, 0, 0)\, \mathbf{P}(u_2, v_2) \\ \vdots \\ (x_{nm}, 0, 0)\, \mathbf{P}(u_n, v_n) \end{pmatrix}$$
with
$$\mathbf{P}(u_i, v_i) = \left[ \mathbf{X}_m^T(u_i, v_i) \mathbf{W}_h(u_i, v_i) \mathbf{X}_m(u_i, v_i) \right]^{-1} \mathbf{X}_m^T(u_i, v_i) \mathbf{W}_h(u_i, v_i), \quad i = 1, 2, \ldots, n;$$
(iv) Repeating Step (iii) for each of $m = 1, 2, \ldots, p$, we finally obtain the SARGWR-LL estimators $\hat{\beta}_1^{(h_1)}(u_i, v_i), \hat{\beta}_2^{(h_2)}(u_i, v_i), \ldots, \hat{\beta}_p^{(h_p)}(u_i, v_i)$ at each of $\{(u_i, v_i)\}_{i=1}^{n}$, with $h_1, h_2, \ldots, h_p$ being the final optimal bandwidth sizes of the $p$ coefficients $\beta_1(u, v), \beta_2(u, v), \ldots, \beta_p(u, v)$ in the SARGWR model in Equation (1), respectively;
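(v) For each $i = 1, 2, \ldots, n$, substituting $\{\hat{\beta}_m^{(h_m)}(u_i, v_i)\}_{m=1}^{p}$ into the SARGWR model in Equation (1) yields the following artificial spatial autoregressive model:
$$\tilde{y}_i = y_i - \sum_{m=1}^{p} \hat{\beta}_m^{(h_m)}(u_i, v_i)\, x_{im} = \rho \sum_{j=1}^{n} w_{ij} y_j + \tilde{\varepsilon}_i, \quad i = 1, 2, \ldots, n,$$
or
$$\tilde{\mathbf{y}} = \rho \mathbf{W}\mathbf{y} + \tilde{\boldsymbol{\varepsilon}},$$
where $\tilde{\mathbf{y}} = (\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_n)^T$ and $\tilde{\boldsymbol{\varepsilon}} = (\tilde{\varepsilon}_1, \tilde{\varepsilon}_2, \ldots, \tilde{\varepsilon}_n)^T$. Replacing $\mathbf{W}\mathbf{y}$ in the above model with its instrumental estimator $\hat{\mathbf{y}}_w(h_{0L})$ and re-estimating $\rho$ by the least-squares method, we obtain the final SARGWR-LL estimator of $\rho$ as
$$\hat{\rho} = \left[ \hat{\mathbf{y}}_w^T(h_{0L})\, \hat{\mathbf{y}}_w(h_{0L}) \right]^{-1} \hat{\mathbf{y}}_w^T(h_{0L})\, \tilde{\mathbf{y}}. \tag{25}$$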
Remark 3.
In step (ii), a bandwidth shrinking parameter $c$ is introduced to shrink the optimal bandwidth size $h_{0L}$ selected in the LGWR-2SLS estimation, and the shrunk bandwidth $\tilde{h}$ is used to obtain the initial estimators of the regression coefficients. As noted by Fan and Zhang [8], a smaller bandwidth size will reduce estimation biases but increase estimation variances of the regression coefficients. The less biased initial estimators in Equation (24) are helpful in increasing the accuracy of the final estimators of the regression coefficients, while the increased variances can be expected to be smoothed out in the following step (iii).
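
Steps (ii) to (v) above can be sketched as follows, taking the outputs of step (i) as given inputs. Several choices are assumptions of this sketch rather than part of the procedure: a Gaussian kernel with fixed bandwidths, a grid search over a candidate list in place of a general AICc minimization, and the values of `c`, `h0`, and the candidates. In the toy run, `rho0` and `y_w_hat` are crude stand-ins for the LGWR-2SLS quantities fixed in step (i), and all function names are hypothetical.

```python
import numpy as np

def local_linear_fit(Z, coords, h, target):
    """Local-linear GWR fit of target on the columns of Z (Gaussian kernel, fixed h).

    Returns the coefficient estimates (n x q) and the hat matrix (n x n)."""
    n, q = Z.shape
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    coef, H = np.empty((n, q)), np.empty((n, n))
    for i in range(n):
        du = (coords[:, 0] - coords[i, 0])[:, None]
        dv = (coords[:, 1] - coords[i, 1])[:, None]
        Zi = np.hstack([Z, Z * du, Z * dv])            # local design matrix at location i
        ZtW = Zi.T * np.exp(-0.5 * (d[i] / h) ** 2)
        A = np.linalg.solve(ZtW @ Zi, ZtW)[:q]         # keep the first q rows
        coef[i], H[i] = A @ target, Z[i] @ A
    return coef, H

def aicc(y, H):
    n, tr = y.shape[0], np.trace(H)
    r = y - H @ y
    return np.log(r @ r / n) + (n + tr) / (n - 2.0 - tr)

def sargwr_ll_steps(y, X, W, coords, rho0, y_w_hat, h0, c, candidates):
    n, p = X.shape
    Wy = W @ y
    beta_init, _ = local_linear_fit(X, coords, c * h0, y - rho0 * Wy)   # step (ii)
    beta_final, bandwidths = np.empty((n, p)), []
    for m in range(p):                                                  # steps (iii)-(iv)
        partial = np.sum(np.delete(X * beta_init, m, axis=1), axis=1)
        y_star = y - rho0 * Wy - partial
        fits = [local_linear_fit(X[:, [m]], coords, h, y_star) for h in candidates]
        best = int(np.argmin([aicc(y_star, H) for _, H in fits]))
        beta_final[:, m] = fits[best][0][:, 0]
        bandwidths.append(candidates[best])
    y_tilde = y - np.sum(X * beta_final, axis=1)                        # step (v)
    rho_final = float(y_w_hat @ y_tilde / (y_w_hat @ y_w_hat))          # Equation (25)
    return rho_final, beta_final, bandwidths

# Toy run with hypothetical settings.
rng = np.random.default_rng(1)
n = 80
coords = rng.uniform(size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.column_stack([1.0 + coords[:, 0], 2.0 * coords[:, 1]])
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
np.fill_diagonal(d, np.inf)
W = np.zeros((n, n))
W[np.arange(n)[:, None], np.argsort(d, axis=1)[:, :6]] = 1.0 / 6.0
y = np.linalg.solve(np.eye(n) - 0.5 * W, np.sum(X * beta, axis=1) + rng.normal(0.0, 0.5, n))
Q = np.column_stack([X, W @ X[:, 1:], W @ W @ X[:, 1:]])
y_w_hat = Q @ np.linalg.lstsq(Q, W @ y, rcond=None)[0]   # global projection as a stand-in for Eq. (8)
cands = [0.2, 0.4, 0.8]
rho_final, beta_final, bandwidths = sargwr_ll_steps(
    y, X, W, coords, rho0=0.5, y_w_hat=y_w_hat, h0=0.5, c=0.7, candidates=cands)
```

No iteration between the coefficients takes place: each coefficient is re-estimated once from its own partial residual, which is where the computational saving over a backfitting procedure comes from.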

2.3.2. SARGWR-GL Procedure

The above SARGWR-LL procedure takes the LGWR-2SLS estimators of the regression coefficients to be the initial estimators. As shown in Equation (18), the LGWR-2SLS estimators of the regression coefficients involve a more complicated design matrix $\mathbf{X}(u_i, v_i)$, shown in Equation (15), which must be rebuilt at each of the sampling locations $\{(u_i, v_i)\}_{i=1}^{n}$. Therefore, computing $\hat{\rho}_L$ in Equation (17) and $\hat{\boldsymbol{\beta}}_L(u_i, v_i)$ ($i = 1, 2, \ldots, n$) in Equation (18) is more time-demanding than computing their GWR-2SLS counterparts $\hat{\rho}_G$ in Equation (10) and $\hat{\boldsymbol{\beta}}_G(u_i, v_i)$ ($i = 1, 2, \ldots, n$) in Equation (11). Furthermore, the initial estimators of $\rho$ and $\boldsymbol{\beta}(u_i, v_i)$ ($i = 1, 2, \ldots, n$) might have less effect on their final estimators because $\boldsymbol{\beta}(u_i, v_i)$ ($i = 1, 2, \ldots, n$) will be re-estimated by the local-linear GWR procedure and $\rho$ will be re-estimated by the 2SLS estimation method. With these considerations, we replace the initial LGWR-2SLS estimators of $\rho$ and $\boldsymbol{\beta}(u_i, v_i)$ ($i = 1, 2, \ldots, n$) with their respective GWR-2SLS estimators and propose a simplified scenario of SARGWR-LL, which we refer to, as mentioned in the introduction, as SARGWR-GL. The main steps of the SARGWR-GL procedure are as follows:
(i) Let $h_{0G}$ be the optimal bandwidth size selected in the GWR-2SLS estimation of the SARGWR model. Fix the instrumental estimator $\hat{\mathbf{y}}_w$ of $\mathbf{W}\mathbf{y}$ in Equation (8) and $\hat{\rho}_G$ in Equation (10) at $h_{0G}$, which we denote by $\hat{\mathbf{y}}_w(h_{0G})$ and $\hat{\rho}_G(h_{0G})$, respectively.
(ii) Let $\tilde{h} = c\, h_{0G}$. The estimators $\tilde{\boldsymbol{\beta}}^{(\tilde{h})}(u_i, v_i)$ ($i = 1, 2, \ldots, n$) in Equation (24) are computed from $\hat{\boldsymbol{\beta}}_G(u_i, v_i)$ ($i = 1, 2, \ldots, n$) in Equation (11), where $h$ and $\hat{\rho}_G$ are replaced by $\tilde{h}$ and $\hat{\rho}_G(h_{0G})$, respectively.
(iii) The remaining steps are exactly the same as those in Section 2.3.1, except that $\hat{\mathbf{y}}_w(h_{0L})$ in Equation (25) is replaced by $\hat{\mathbf{y}}_w(h_{0G})$.

2.3.3. SARGWR-BF Procedure

Moreover, for the purpose of comparison, we accordingly formulate a backfitting-based multiscale estimation procedure for the SARGWR model, in which the GWR-2SLS estimation method is used in each iteration. We refer to this procedure as SARGWR-BF and describe its detailed steps in Appendix A because this part is less related to the main theme of this paper.

3. Simulation Study and Real-Life Example

In this section, a simulation study is conducted to assess the performance of the proposed SARGWR-LL and SARGWR-GL multiscale estimation methods for the SARGWR model. In particular, the proposed non-iterative multiscale estimation methods and the iterative multiscale estimation method SARGWR-BF described in Appendix A are compared in both the accuracy of the coefficient estimators and the computation efficiency. Furthermore, a real-life example based on Dublin voter turnout data is given to show the applicability of the proposed multiscale estimation methods.

3.1. Simulation Study

3.1.1. Design of the Experiment

(i) Spatial layout
We took the unit square $[0, 1] \times [0, 1]$ in a Cartesian coordinate system as the spatial region. Because the sampling locations in many practical problems are irregularly distributed over space, the $n = 400$ sampling points $\{(u_i, v_i)\}_{i=1}^{n}$ were generated by drawing the coordinates of each pair $(u_i, v_i)$ independently from the uniform distribution $U(0, 1)$. The sampling points used in the simulation are depicted in Figure 1 and were fixed throughout the simulation.
(ii) Model for generating the experimental data
The following SARGWR model was considered:
$$y_i = \rho \sum_{j=1}^{n} w_{ij} y_j + \beta_1(u_i, v_i) + \beta_2(u_i, v_i) x_{i2} + \beta_3(u_i, v_i) x_{i3} + \varepsilon_i, \quad i = 1, 2, \ldots, n,$$
where the row-standardized spatial weights matrix $\mathbf{W} = (w_{ij})_{n \times n}$ was determined by the $l$-nearest neighbor procedure. As pointed out by Boots and Tiefelsdorf [43], numerous studies have found that irregular spatial tessellations share, on average, many topological properties with a hexagonal tessellation, in which a given hexagon generally has six neighbors when two hexagons are defined to be neighbors if they have a common side. Accordingly, we took $l = 6$ in the simulation study. Specifically, let $d_{ij}$ be the Euclidean distance between $(u_i, v_i)$ and $(u_j, v_j)$, and let $d_{il}$ be the distance from $(u_i, v_i)$ to its $l$-th nearest neighbor. Given each $(u_i, v_i)$, for each of $j = 1, 2, \ldots, n$ with $j \neq i$, we set $\tilde{w}_{ij} = 1$ if $d_{ij} \le d_{il}$ and $\tilde{w}_{ij} = 0$ if $d_{ij} > d_{il}$. The elements of $\mathbf{W}$ are then defined by $w_{ij} = \tilde{w}_{ij} / \sum_{k \neq i} \tilde{w}_{ik}$ for $i, j = 1, 2, \ldots, n$ with $i \neq j$, and $w_{ii} = 0$ $(i = 1, 2, \ldots, n)$ by convention. The observations $\{x_{i2}\}_{i=1}^{n}$ and $\{x_{i3}\}_{i=1}^{n}$ of the explanatory variables $X_2$ and $X_3$ were independently drawn from the uniform distribution $U(-3, 3)$ and the standard normal distribution $N(0, 1)$, respectively. The model errors $\{\varepsilon_i\}_{i=1}^{n}$ were generated from the normal distribution $N(0, 0.5^2)$. We designated the three regression coefficients as
$$\beta_1(u, v) = 2(u + v); \quad \beta_2(u, v) = \frac{4 \sin\left(12(u - 0.5)^2 + 12(v - 0.5)^2\right)}{12(u - 0.5)^2 + 12(v - 0.5)^2}; \quad \beta_3(u, v) = 64 u v (1 - u)(1 - v).$$
The true surfaces of the regression coefficients are shown in Figure 2, from which we can observe that their levels of spatial heterogeneity are clearly different. The autoregressive coefficient $\rho$ was varied from −0.9 to 0.9 in increments of 0.3. Once the values $\{x_{i2}, x_{i3}\}_{i=1}^{n}$ of the explanatory variables $X_2$ and $X_3$ and the values $\{\varepsilon_i\}_{i=1}^{n}$ of the model errors have been drawn from their respective distributions, the observation vector $\mathbf{y}$ of the response variable is generated according to the matrix form of the model in Equation (2), i.e.,
$$\mathbf{y} = (\mathbf{I} - \rho \mathbf{W})^{-1} (\mathbf{M} + \boldsymbol{\varepsilon}),$$
where $\mathbf{M} = \left(\mathbf{x}_1^{\mathrm{T}} \boldsymbol{\beta}(u_1, v_1), \mathbf{x}_2^{\mathrm{T}} \boldsymbol{\beta}(u_2, v_2), \ldots, \mathbf{x}_n^{\mathrm{T}} \boldsymbol{\beta}(u_n, v_n)\right)^{\mathrm{T}}$ with $\mathbf{x}_i = (1, x_{i2}, x_{i3})^{\mathrm{T}}$, $\boldsymbol{\beta}(u_i, v_i) = \left(\beta_1(u_i, v_i), \beta_2(u_i, v_i), \beta_3(u_i, v_i)\right)^{\mathrm{T}}$, and $\boldsymbol{\varepsilon} = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)^{\mathrm{T}}$.
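The construction of the weights matrix and the reduced-form generation of the response described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' Matlab code; the function names `knn_weights` and `simulate_response` are hypothetical, and the mean vector `M` in the demonstration is a placeholder that contains only the intercept term $\beta_1(u, v) = 2(u + v)$.

```python
import numpy as np

def knn_weights(coords, l=6):
    """Row-standardized l-nearest-neighbour spatial weights matrix."""
    n = coords.shape[0]
    # Pairwise Euclidean distances d_ij
    diff = coords[:, None, :] - coords[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(d, np.inf)        # a point is never its own neighbour (w_ii = 0)
    W = np.zeros((n, n))
    for i in range(n):
        d_il = np.sort(d[i])[l - 1]    # distance to the l-th nearest neighbour
        W[i, d[i] <= d_il] = 1.0       # binary weights (ties at d_il are all kept)
    return W / W.sum(axis=1, keepdims=True)

def simulate_response(W, M, eps, rho):
    """Solve (I - rho W) y = M + eps, i.e. y = (I - rho W)^{-1} (M + eps)."""
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - rho * W, M + eps)

# Demonstration with n = 400 points on the unit square
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 1.0, size=(400, 2))
W = knn_weights(coords, l=6)
M = 2.0 * coords.sum(axis=1)           # placeholder mean: intercept term only
eps = rng.normal(0.0, 0.5, size=400)   # errors from N(0, 0.5^2)
y = simulate_response(W, M, eps, rho=0.3)
```

Solving the linear system avoids forming the inverse of $\mathbf{I} - \rho \mathbf{W}$ explicitly, which is usually the more stable choice.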
(iii) Designs of the other experimental items
In both SARGWR-LL and SARGWR-GL, the shrinking parameter was set to $c = 0.6$, 0.7, 0.8, 0.9, and 1. The bisquare kernel with an adaptive bandwidth was used to generate the weights in Equation (21), and the optimal size of the bandwidth $k$ was selected by the AICc criterion, where $\tilde{h}$ in step (ii) of both SARGWR-LL and SARGWR-GL was taken to be the integer part of $c h_0^L$ or $c h_0^G$.
(iv) Indices for measuring accuracy of the coefficient estimators
Each experimental setting was run $N$ times, where, in each replication, both the model errors $\{\varepsilon_i\}_{i=1}^{n}$ and the observations $\{x_{i2}, x_{i3}\}_{i=1}^{n}$ of the explanatory variables $X_2$ and $X_3$ were redrawn from their respective distributions. Based on the coefficient estimators of the SARGWR model in Equation (26) over the $N$ replications, we defined the following indices for measuring the accuracy of the coefficient estimators.
For each of the SARGWR-LL, SARGWR-GL, and SARGWR-BF methods, let $\hat{\rho}_r$ be the estimator of the autoregressive coefficient $\rho$ in the $r$-th replication. We take the mean of the estimators over the $N$ replications,
$$\mathrm{Mean}(\hat{\rho}) = \frac{1}{N} \sum_{r=1}^{N} \hat{\rho}_r$$
as the final estimator of $\rho$ and use the root mean square error (RMSE)
$$\mathrm{RMSE}(\hat{\rho}) = \left[ \frac{1}{N} \sum_{r=1}^{N} \left( \hat{\rho}_r - \rho \right)^2 \right]^{1/2}$$
to measure the estimation accuracy of $\rho$.
Similarly, let $\hat{\beta}_{jr}(u_i, v_i)$ be the estimator of the $j$-th coefficient $\beta_j(u, v)$ at $(u_i, v_i)$ in the $r$-th replication. The final estimator of $\beta_j(u_i, v_i)$ is defined by
$$\mathrm{Mean}\left(\hat{\beta}_j(u_i, v_i)\right) = \frac{1}{N} \sum_{r=1}^{N} \hat{\beta}_{jr}(u_i, v_i), \quad i = 1, 2, \ldots, n.$$
In addition, the averaged root mean square error (ARMSE) over the sampling points, given by
$$\mathrm{ARMSE}\left(\hat{\beta}_j(u, v)\right) = \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{1}{N} \sum_{r=1}^{N} \left( \hat{\beta}_{jr}(u_i, v_i) - \beta_j(u_i, v_i) \right)^2 \right]^{1/2},$$
is used to measure the global estimation accuracy of $\beta_j(u, v)$.
In the simulation, we set N = 200 to compute the above indices.
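The accuracy indices defined above are straightforward to compute once the replicated estimates are stored as arrays. The following Python sketch is illustrative only; the function names and the array layout (replications along the first axis) are assumptions made for the example.

```python
import numpy as np

def mean_and_rmse_rho(rho_hat, rho_true):
    """Mean and RMSE of the N replicated estimates of rho."""
    rho_hat = np.asarray(rho_hat, dtype=float)
    mean = rho_hat.mean()
    rmse = np.sqrt(np.mean((rho_hat - rho_true) ** 2))
    return mean, rmse

def mean_surface_and_armse(beta_hat, beta_true):
    """Pointwise mean surface and ARMSE for one coefficient.

    beta_hat  : array of shape (N, n), estimates in N replications at n points
    beta_true : array of shape (n,), true coefficient values at the n points
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    beta_true = np.asarray(beta_true, dtype=float)
    mean_surface = beta_hat.mean(axis=0)
    # RMSE over replications at each sampling point, then average over points
    rmse_per_point = np.sqrt(np.mean((beta_hat - beta_true) ** 2, axis=0))
    return mean_surface, rmse_per_point.mean()
```

Note that the ARMSE averages the pointwise RMSEs; it does not take the root of a pooled mean square error, so the two quantities generally differ.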

3.1.2. Simulation Results with Analysis

(i) Estimation accuracy of the autoregressive and regression coefficients
With the experiment design in Section 3.1.1, the values of the estimation accuracy indices of the autoregressive and regression coefficients for SARGWR-LL and SARGWR-GL are reported in Table 1. For comparison, the corresponding results from SARGWR-BF with the convergence threshold $\eta_0 = 0.01$ are also included.
With regard to the autoregressive coefficient $\rho$, the fourth column of Table 1 shows that the final estimators from the three multiscale estimation methods were comparable and tended to be somewhat smaller than their respective true values, especially when autocorrelation in the response variable was extremely strong (i.e., $\rho = -0.9$ or $0.9$). The RMSE shows the same pattern for the estimation accuracy. For small and moderate values of $\rho$, however, all three multiscale estimation methods yielded accurate estimators of $\rho$. Moreover, the estimators of $\rho$ yielded by the two non-iterative multiscale estimation methods were rather robust to the value of the shrinking parameter $c$ in terms of both the mean and the RMSE.
For the regression coefficients $\beta_j(u, v)$ $(j = 1, 2, 3)$, SARGWR-LL generally yielded more accurate estimators than SARGWR-GL, except for $\beta_1(u, v)$ in the case of $\rho = 0.9$. The gain in accuracy for SARGWR-LL most likely comes from its initial estimators, which are obtained by the local-linear GWR estimation procedure: as shown by Wang et al. [38], the local-linear GWR procedure can yield more accurate estimators of the regression coefficients than the traditional GWR method, especially when a regression coefficient is a linear function of the spatial coordinates. However, the difference in estimation accuracy between the two proposed non-iterative multiscale methods was not notable, even in the cases of $\rho = -0.9$ and $0.9$, which demonstrates that the initial estimators, as expected, have no notable effect on the accuracy of the final multiscale estimators of the regression coefficients. Moreover, comparing the ARMSE values for $c = 1$ with those for $c < 1$ shows that, for both SARGWR-LL and SARGWR-GL, shrinking the optimal bandwidth size selected in the initial LGWR-2SLS or GWR-2SLS estimation improved the accuracy of the final coefficient estimators in most cases, especially for the SARGWR-GL method. The ARMSE values were, however, stable across $c = 0.6$, 0.7, 0.8, and 0.9. This finding is useful in practice because it gives the analyst an informal way to choose the value of $c$. Among the three multiscale estimation methods, SARGWR-BF generally yielded the least accurate estimators of the regression coefficients in terms of ARMSE, although its estimation accuracy could be improved by setting a smaller convergence threshold $\eta_0$. The computation efficiency of the three multiscale estimation methods is discussed later in this section.
To show visually how well the three estimation methods retrieve the true surfaces of the regression coefficients, we depict in Figure 3, Figure 4 and Figure 5 the estimated surfaces of the regression coefficients, based on their respective final estimators defined in Equation (28), for the cases of $\rho = -0.9$, $0$, $0.9$ and $c = 0.7$. The estimated surfaces in the other cases are all very similar and are omitted to save space. Comparing Figure 3, Figure 4 and Figure 5 with the true surfaces in Figure 2 shows that SARGWR-LL and SARGWR-GL retrieved the true surfaces of the regression coefficients more accurately than SARGWR-BF.
(ii) Optimal bandwidth sizes of the regression coefficients
As mentioned in the introduction, the optimal bandwidth size of each regression coefficient can characterize the underlying spatial scale of the corresponding explanatory variable, which is one of the main objectives of developing multiscale estimation methodologies for GWR and the related models. Figure 6 and Figure 7 show the boxplots of the optimal bandwidth sizes of the three regression coefficients selected in the 200 experiment replications for SARGWR-LL, SARGWR-GL, and SARGWR-BF. Here, we only show the boxplots for $c = 0.7$, $1$ and $\rho = -0.9$, $0$, $0.9$ to illustrate the impact of the shrinking parameter $c$ and the autoregressive coefficient $\rho$ on the optimal bandwidth sizes for the three multiscale estimation methods. The boxplots for the other experiment settings were all similar and are omitted here.
It can be observed from Figure 6 and Figure 7 that the three multiscale estimation methods all yielded variable-specific optimal bandwidth sizes that are consistent with the heterogeneity levels of the respective coefficients in almost all experiment settings. That is, the more heterogeneous the coefficient was, the smaller the optimal bandwidth size was, which demonstrates that the three methods can all correctly reveal the underlying spatial scales of the explanatory variables. Compared with the iterative estimation method SARGWR-BF, both non-iterative estimation methods SARGWR-LL and SARGWR-GL better reflected the difference in heterogeneity among the three coefficients, because their optimal bandwidth sizes showed, on the whole, an evident difference and the correct order of spatial heterogeneity levels among the three coefficients. However, the uncertainty of the optimal bandwidth sizes yielded by SARGWR-LL and SARGWR-GL for the least heterogeneous coefficient $\beta_1(u, v)$ was much larger than that produced by SARGWR-BF. This large uncertainty is perhaps due to the theoretical property that the optimal bandwidth size for a linear function in the local-linear kernel smoothing technique tends to infinity as the sample size increases [44]. For the other two non-linear coefficients, the uncertainty in the optimal bandwidth size was comparable among the three multiscale estimation methods. Moreover, the figures show that the variable-specific optimal bandwidth sizes for the three methods were affected by the intensity of the spatial autocorrelation in the response variable, especially for the least heterogeneous coefficient $\beta_1(u, v)$. For example, when very high positive spatial autocorrelation exists in the response variable (i.e., $\rho = 0.9$), the medians of the optimal bandwidth sizes yielded by SARGWR-LL and SARGWR-GL for $\beta_1(u, v)$ become even slightly smaller than those for $\beta_3(u, v)$. The intensity of spatial autocorrelation had less influence on the optimal bandwidth sizes for the SARGWR-BF method. In addition, comparing Figure 6 with Figure 7, we can see that the boxplots with $c = 0.7$ are very similar to the corresponding boxplots with $c = 1$ for both non-iterative multiscale estimation methods. This demonstrates that shrinking the optimal bandwidth size to obtain the initial estimators of the regression coefficients had little impact on their respective final optimal bandwidth sizes.
(iii) Computation efficiency
As one of our main motivations for developing non-iterative multiscale estimation methods for the SARGWR model, computation efficiency is essential for SARGWR-LL and SARGWR-GL. The foregoing simulation was run with our own Matlab code on a personal computer with an AMD Ryzen 5 5600G CPU @ 3.90 GHz and 16 GB of memory; the average time for running one replication was about 90 s for SARGWR-LL and about 80 s for SARGWR-GL. With the convergence threshold $\eta_0 = 0.01$, the SARGWR-BF method took about 210 s to run a replication, which is more than twice the computing time of either SARGWR-LL or SARGWR-GL. In essence, both SARGWR-LL and SARGWR-GL are one-step versions of SARGWR-BF that take the LGWR-2SLS and the GWR-2SLS estimators with a shrunk optimal bandwidth size, respectively, as the initial estimators of the regression coefficients. Therefore, SARGWR-LL and SARGWR-GL should be more efficient than SARGWR-BF whenever SARGWR-BF needs more than one iteration to converge. Moreover, as expected, SARGWR-GL was more efficient than SARGWR-LL, and the difference in computation time between SARGWR-LL and SARGWR-GL will grow as the number of explanatory variables increases. Furthermore, since both the estimation accuracy of the coefficients and the variable-specific optimal bandwidth sizes are comparable between SARGWR-LL and SARGWR-GL, SARGWR-GL provides a more efficient alternative for dealing with large data sets. Although the computation time of an estimation method depends closely on how well the code is optimized and on the computing equipment used, the above comparison still gives a meaningful picture of the relative computation efficiency of the three multiscale estimation methods, because all of the code was written by ourselves and all of the computation was carried out on the same computer.

3.2. Real-Life Example

3.2.1. Introduction to the Data Set with the Model Built

To demonstrate the applicability of the proposed non-iterative multiscale estimation methods, a Dublin voter turnout data set, which is publicly available in the R package attached in Gollini et al. [42], was analyzed in this section. This data set includes the observations of nine variables collected from 322 Electoral Divisions (EDs) of the Dublin area in the Irish 2004 Dáil elections, with the Cartesian coordinates ( u , v ) of each ED also provided. The nine variables are as follows:
  • GenEl: percentage of the voting population in each ED;
  • DiffAdd: percentage of the immigrant population one year ago in each ED;
  • LARent: percentage of renters in each ED;
  • SC1: percentage of people with high social class in each ED;
  • Unempl: percentage of unemployed people in each ED;
  • LowEduc: percentage of people without formal education in each ED;
  • Age1: percentage of people aged from 18 to 24 in each ED;
  • Age2: percentage of people aged from 25 to 44 in each ED;
  • Age3: percentage of people aged from 45 to 64 in each ED.
We are interested in exploring spatial autocorrelation in the percentage of the voting population among the $n = 322$ EDs and spatial heterogeneity in the impact of the other eight variables on the percentage of the voting population. For this purpose, we built the following SARGWR model:
$$\begin{aligned} \mathrm{GenEl}_i = {} & \rho \sum_{j=1}^{n} w_{ij} \mathrm{GenEl}_j + \beta_1(u_i, v_i) + \beta_2(u_i, v_i) \mathrm{DiffAdd}_i + \beta_3(u_i, v_i) \mathrm{LARent}_i + \beta_4(u_i, v_i) \mathrm{SC1}_i \\ & + \beta_5(u_i, v_i) \mathrm{Unempl}_i + \beta_6(u_i, v_i) \mathrm{LowEduc}_i + \beta_7(u_i, v_i) \mathrm{Age1}_i + \beta_8(u_i, v_i) \mathrm{Age2}_i \\ & + \beta_9(u_i, v_i) \mathrm{Age3}_i + \varepsilon_i, \quad i = 1, 2, \ldots, 322, \end{aligned}$$
where the spatial weights matrix $\mathbf{W} = (w_{ij})_{n \times n}$ was formulated in the same way as in the simulation study. That is, the elements of $\mathbf{W}$ were first set to binary values by the 6-nearest neighbor procedure used in the simulation study, where the Euclidean distance between the spatial coordinates of two EDs was used to determine the neighbors of each ED, and were then row-standardized. Moreover, all of the explanatory variables were standardized.

3.2.2. Model Calibration with the Results

Both the SARGWR-LL and SARGWR-GL methods were applied to the calibration of the above model, in which the bisquare kernel with an adaptive bandwidth was used to generate the calibration weights matrix in Equation (5), and the optimal bandwidth size was selected by the AICc criterion. Furthermore, the bandwidth shrinking parameter was set to $c = 0.7$. For the purpose of comparison, the SARGWR-BF method was also used to calibrate the model, with the convergence threshold set to $\eta_0 = 0.01$.
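As an illustration of these settings, the sketch below computes adaptive bisquare kernel weights and the shrunk bandwidth size in Python. It assumes the standard bisquare form $\left[1 - (d_{ij}/b_i)^2\right]^2$ for $d_{ij} < b_i$ and zero otherwise, where $b_i$ is the distance from location $i$ to its $k$-th nearest observation (counting the location itself). Equation (5) is not reproduced here, so this convention and the function names are assumptions rather than the exact definitions used in the paper.

```python
import numpy as np

def shrink_bandwidth(h0, c):
    """Integer part of c * h0 (small tolerance guards against float round-off)."""
    return int(np.floor(c * h0 + 1e-9))

def adaptive_bisquare_weights(coords, k):
    """Kernel weight matrix K with K[i, j] the weight of observation j at location i.

    The adaptive bandwidth at location i is the distance to its k-th nearest
    observation, counting i itself (an assumed convention).
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    diff = coords[:, None, :] - coords[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2))       # pairwise Euclidean distances
    b = np.sort(d, axis=1)[:, k - 1]           # location-specific bandwidths
    ratio = d / b[:, None]
    return np.where(ratio < 1.0, (1.0 - ratio ** 2) ** 2, 0.0)

# Example: shrink an initial optimal bandwidth of 100 neighbours with c = 0.7
k_shrunk = shrink_bandwidth(100, 0.7)
```

With $c = 0.7$ and an initial bandwidth of 100 nearest neighbours, the shrunk bandwidth is 70, so each local fit uses a smaller neighbourhood than the one selected by AICc.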
The estimated value of the autoregressive coefficient is 0.2547 for SARGWR-LL, 0.1810 for SARGWR-GL, and 0.1714 for SARGWR-BF. All three values are positive and of similar magnitude, indicating that positive spatial autocorrelation exists among the percentages of the voting population in the EDs. The running time on our personal computer was 56 s and 43 s for SARGWR-LL and SARGWR-GL, respectively, while SARGWR-BF took 68 s to reach the convergence threshold $\eta_0 = 0.01$.
The variable-specific optimal bandwidth sizes for the three multiscale estimation methods are listed in Table 2. To assess the impact of the shrinking parameter on the optimal bandwidth sizes of the individual explanatory variables, the corresponding optimal bandwidth sizes selected in SARGWR-LL and SARGWR-GL with $c = 1$ (i.e., without shrinking the original optimal bandwidth size) are also reported in Table 2. The table shows that although the corresponding optimal bandwidth sizes differ among the three estimation methods, their relative orders are roughly consistent, which provides information about the spatial scale of each explanatory variable. In particular, the influence of LARent, SC1, and LowEduc on the percentage of the voting population (GenEl) is the least heterogeneous because the three estimation methods all produce extremely large bandwidth sizes for these three explanatory variables, while the influence of DiffAdd is the most heterogeneous because the three estimation methods all yield very small bandwidth sizes for it. The optimal bandwidth sizes with $c = 0.7$ are all comparable to the corresponding ones with $c = 1$, which demonstrates, as in the simulation study, that the final optimal bandwidth sizes of the individual explanatory variables are very robust to the value of the shrinking parameter for the SARGWR-LL and SARGWR-GL methods.
To assess the goodness of fit of the three multiscale estimation methods and their ability to extract spatial autocorrelation, we computed the values of $R^2$ and Moran's $I$ of the residuals, with the $p$-values derived from 200 randomly permuted residual samples. The values of $R^2$ are 0.7936 for SARGWR-LL, 0.8993 for SARGWR-GL, and 0.6896 for SARGWR-BF. The smaller value of $R^2$ for SARGWR-BF may be due to the relative inaccuracy of its regression coefficient estimators, as demonstrated in the simulation study. The values of Moran's $I$ of the residuals are 0.0654 ($p$-value 0.020), 0.0555 ($p$-value 0.055), and 0.0258 ($p$-value 0.195) for SARGWR-LL, SARGWR-GL, and SARGWR-BF, respectively, showing that SARGWR-BF has a stronger ability to extract spatial autocorrelation. In addition, to compare the goodness of fit of the three multiscale estimation methods with that of other possible models, we calibrated the corresponding SARGWR model with the GWR-2SLS method, the GWR model with the traditional estimation method, and the MGWR model with the two-step estimation method, where the coefficients of LARent, SC1, and LowEduc were assumed to be constant. The values of $R^2$ are 0.7060, 0.7135, and 0.7154 for the SARGWR, GWR, and MGWR models, respectively, which are all smaller than the corresponding $R^2$ values of the multiscale estimation methods except for SARGWR-BF.
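The two diagnostics used above can be sketched in a few lines of Python. The code uses the usual definitions of $R^2$ and Moran's $I$; the permutation $p$-value is computed as the one-sided share of permuted statistics that are at least as large as the observed one, which is an assumption because the text does not state whether a one-sided or two-sided version was used. The function names are hypothetical.

```python
import numpy as np

def r_squared(y, y_fit):
    """Coefficient of determination 1 - RSS / TSS."""
    y = np.asarray(y, dtype=float)
    y_fit = np.asarray(y_fit, dtype=float)
    return 1.0 - np.sum((y - y_fit) ** 2) / np.sum((y - y.mean()) ** 2)

def morans_i(e, W):
    """Moran's I of the residual vector e under the spatial weights matrix W."""
    e = np.asarray(e, dtype=float)
    z = e - e.mean()
    return (len(e) / W.sum()) * (z @ W @ z) / (z @ z)

def permutation_p_value(e, W, n_perm=200, seed=0):
    """Observed Moran's I and a one-sided permutation p-value (assumed form)."""
    rng = np.random.default_rng(seed)
    e = np.asarray(e, dtype=float)
    observed = morans_i(e, W)
    # Recompute Moran's I after randomly reassigning residuals to locations
    perms = np.array([morans_i(rng.permutation(e), W) for _ in range(n_perm)])
    return observed, float(np.mean(perms >= observed))
```

With 200 permutations, the resulting $p$-values are multiples of 0.005, which matches the resolution of the values reported above.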
The estimators of the regression coefficients by the three estimation methods are shown via heat maps in Figure 8. It can be observed that the spatial patterns of individual coefficients are basically consistent among the three estimation methods, although some local differences exist, especially between the estimator of intercept by SARGWR-LL and those by SARGWR-GL and SARGWR-BF, which might be caused by the larger difference between the estimated values of spatial autoregressive coefficient ρ and the different multiscale estimation methods used. Relatively, however, the heat maps of the regression coefficient estimators produced by SARGWR-LL and SARGWR-GL are more similar.
Based on the heat maps of the regression coefficients, the impact of each explanatory variable on the response variable can be qualitatively interpreted. For example, the influence of Age3 on GenEl increases from north to south, showing a positive effect in the south area, especially in the southeast area, and a negative effect in the north area; Age1 has an evident negative impact on GenEl mainly in the south area, while the most positive influence of Age2 appears in this area; the influence of Unempl on GenEl is negative over the whole area with the least negative impact being on the center area. DiffAdd, whose coefficient is the most spatially heterogeneous, shows a weakly positive influence in the northern area but a strong negative influence in the center and southeast areas. Due to the very large optimal bandwidth sizes of the LARent, SC1, and LowEduc, their influence could be interpreted to be global. However, as will be discussed in the last section, a formal statistical test is still needed to verify that their corresponding coefficients are constant.

4. Conclusions and Discussion

Inspired by the cost-effective multiscale estimation approach proposed by Wu et al. [37] for calibrating GWR models and considering the importance of SARGWR models in applications, we proposed in this article two non-iterative multiscale estimation methods for SARGWR models, called SARGWR-LL and SARGWR-GL, which are based on 2SLS estimation. The simulation study and the real-life data analysis demonstrate that both SARGWR-LL and SARGWR-GL perform as well as or better than the iterative multiscale estimation method SARGWR-BF, not only in estimating the autoregressive coefficient and retrieving the underlying spatial patterns of the regression coefficients but also in revealing the spatial scales at which the explanatory variables operate. Most importantly, the proposed SARGWR-LL and SARGWR-GL methods are much more computationally efficient than SARGWR-BF.
As mentioned in Remark 1, the SARGWR model on which this paper has focused assumes a constant autoregressive coefficient. This assumption may be unrealistic for many real-life spatial data sets. Recently, Mei and Chen [41] extended the SARGWR model to allow the autoregressive coefficient to vary over space and proposed a GWR-based 2SLS estimation method for the extended SARGWR model. It seems possible to extend both the SARGWR-LL and SARGWR-GL multiscale estimation procedures to the SARGWR model with a spatially varying autoregressive coefficient, provided that the spatially varying autoregressive coefficient can be efficiently estimated by some local smoothing technique. This extension is worth investigating in view of the wide range of applications of the extended SARGWR model.
Moreover, SARGWR models assume that all of the regression coefficients vary over space, and their multiscale estimation can additionally provide information on the relative levels of spatial heterogeneity of the regression coefficients via variable-specific optimal bandwidth sizes. As pointed out by Yu et al. [45] and Fotheringham [46], formal statistical tests for identifying the constant coefficients and locally evaluating the significance of the influence of the explanatory variables at each spatial sampling point are also essential to regression-based local modeling. For the SARGWR model, Li et al. [47] proposed two kinds of tests to identify, respectively, whether spatial autocorrelation exists in the response variable and whether some of the regression coefficients are constant. Similar tests have also been proposed by Mei and Chen [41] for the extended SARGWR model. In these tests, too many null hypotheses for the regression coefficients need to be considered, which makes it very complex to comprehensively identify the constant regression coefficients in the models. By taking the information on the spatial heterogeneity of each regression coefficient provided by a multiscale estimation method as prior information (i.e., an especially large bandwidth size of a regression coefficient means that this coefficient is possibly constant), we can formulate far fewer but more specific null hypotheses for testing the possibly constant coefficients in a SARGWR or extended SARGWR model. After the constant regression coefficients in a SARGWR model are identified, a semi-parametric SARGWR model, in which some regression coefficients are constant and others vary over space, can then be formulated. It also seems possible to extend both the SARGWR-LL and SARGWR-GL multiscale estimation methods to semi-parametric SARGWR models. These issues deserve to be studied in future research.

Author Contributions

Conceptualization and methodology, C.-L.M.; software and visualization, S.-J.G.; writing—original draft preparation, C.-L.M.; writing—review and editing, C.-L.M., S.-J.G., Q.-X.X., and Z.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant numbers 11871056 and 12271420).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The Dublin voter turnout data set is publicly available in the R package GWmodel via https://cran.r-project.org/web/packages/GWmodel/index.html (accessed on 22 June 2022).

Acknowledgments

The authors sincerely thank the three reviewers for their valuable comments and constructive suggestions, which led to significant improvement in the manuscript.

Conflicts of Interest

The authors declare no conflict of interest related to this research.

Appendix A. SARGWR-BF Iterative Multiscale Estimation of the SARGWR Model

In order to make a comparison of the estimation accuracy and especially the computation efficiency between the proposed SARGWR-LL and SARGWR-GL procedures and the backfitting-based multiscale estimation for the SARGWR model, we describe in detail the SARGWR-BF method in this appendix. The SARGWR-BF method here is, in fact, more efficient than the backfitting-based multiscale estimation for the SARGWR model in Chen et al. [17] because the GWR-2SLS estimation, rather than the profile maximum likelihood estimation in Chen et al. [17], is used to derive the estimators of the autoregressive coefficient and the regression coefficients in each iteration loop. Therefore, this kind of comparison can better highlight the advantage of the SARGWR-LL and SARGWR-GL methods in reducing the computation complexity if SARGWR-LL and SARGWR-GL are more efficient than SARGWR-BF. What follows are the main steps of the SARGWR-BF procedure.
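(i) Set the initial value of the autoregressive coefficient $\rho$ to be $\hat{\rho}^{(0)} = \hat{\rho}_G(h_0^G)$ in Step (i) of SARGWR-GL and the initial estimators of the regression coefficients to be the GWR-2SLS estimators in Equation (11) with $h = h_0^G$, that is,
$$\hat{\boldsymbol{\beta}}^{(0)}(u_i, v_i) = \left( \hat{\beta}_1^{(0)}(u_i, v_i), \hat{\beta}_2^{(0)}(u_i, v_i), \ldots, \hat{\beta}_p^{(0)}(u_i, v_i) \right)^{\mathrm{T}} = \left( \mathbf{X}^{\mathrm{T}} \mathbf{W}_{h_0^G}(u_i, v_i) \mathbf{X} \right)^{-1} \mathbf{X}^{\mathrm{T}} \mathbf{W}_{h_0^G}(u_i, v_i) \left( \mathbf{I}_n - \hat{\rho}_G(h_0^G) \mathbf{W} \right) \mathbf{y}, \quad i = 1, 2, \ldots, n.$$
Fix the instrumental estimator of $\mathbf{W}\mathbf{y}$ as $\hat{\mathbf{y}}_w(h_0^G)$, which is described in Step (i) of SARGWR-GL.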
(ii) Let $\hat{\rho}^{(t-1)}$ and $\hat{\boldsymbol{\beta}}^{(t-1)}(u_i, v_i) = \left( \hat{\beta}_1^{(t-1)}(u_i, v_i), \hat{\beta}_2^{(t-1)}(u_i, v_i), \ldots, \hat{\beta}_p^{(t-1)}(u_i, v_i) \right)^{\mathrm{T}}$ $(i = 1, 2, \ldots, n)$ be the estimators of $\rho$ and $\boldsymbol{\beta}(u_i, v_i)$ $(i = 1, 2, \ldots, n)$ obtained in the $(t-1)$-th iteration, respectively. For each of $m = 1, 2, \ldots, p$, formulate the following univariate GWR model:
$$\tilde{y}_i^{(t)} = y_i - \hat{\rho}^{(t-1)} \sum_{j=1}^{n} w_{ij} y_j - \sum_{j=1}^{m-1} \hat{\beta}_j^{(t)}(u_i, v_i) x_{ij} - \sum_{j=m+1}^{p} \hat{\beta}_j^{(t-1)}(u_i, v_i) x_{ij} = \beta_m(u_i, v_i) x_{im} + \tilde{\varepsilon}_i, \quad i = 1, 2, \ldots, n.$$
Calibrating the model by the GWR technique [2] and selecting the optimal bandwidth size by the AICc criterion, we obtain the updated estimators $\hat{\beta}_m^{(t)}(u_i, v_i)$ of $\beta_m(u_i, v_i)$ $(i = 1, 2, \ldots, n)$ and the optimal bandwidth size $h_m^{(t)}$ for each of $m = 1, 2, \ldots, p$.
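(iii) Let
$$y_i^{*(t)} = y_i - \sum_{j=1}^{p} \hat{\beta}_j^{(t)}(u_i, v_i) x_{ij} = \rho \sum_{j=1}^{n} w_{ij} y_j + \tilde{\varepsilon}_i, \quad i = 1, 2, \ldots, n,$$
or
$$\mathbf{y}^{*(t)} = \left( y_1^{*(t)}, y_2^{*(t)}, \ldots, y_n^{*(t)} \right)^{\mathrm{T}} = \rho \mathbf{W}\mathbf{y} + \tilde{\boldsymbol{\varepsilon}},$$
where $\tilde{\boldsymbol{\varepsilon}} = (\tilde{\varepsilon}_1, \tilde{\varepsilon}_2, \ldots, \tilde{\varepsilon}_n)^{\mathrm{T}}$. Replacing $\mathbf{W}\mathbf{y}$ in the above model by its instrumental estimator $\hat{\mathbf{y}}_w(h_0^G)$ and using the least-squares procedure, we obtain the updated estimator of $\rho$ as
$$\hat{\rho}^{(t)} = \left( \hat{\mathbf{y}}_w^{\mathrm{T}}(h_0^G) \hat{\mathbf{y}}_w(h_0^G) \right)^{-1} \hat{\mathbf{y}}_w^{\mathrm{T}}(h_0^G) \mathbf{y}^{*(t)}.$$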
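(iv) Iteratively carry out Steps (ii) and (iii) until a pre-specified convergence criterion is reached. Here, the following convergence criterion is considered. Define
$$\eta_t = \left[ \frac{ \left\| \hat{\rho}^{(t)} \mathbf{W}\mathbf{y} - \hat{\rho}^{(t-1)} \mathbf{W}\mathbf{y} \right\|^2 + \sum_{j=1}^{p} \left\| \hat{\mathbf{f}}_j^{(t)} - \hat{\mathbf{f}}_j^{(t-1)} \right\|^2 }{ \left\| \hat{\rho}^{(t)} \mathbf{W}\mathbf{y} \right\|^2 + \sum_{j=1}^{p} \left\| \hat{\mathbf{f}}_j^{(t)} \right\|^2 } \right]^{1/2},$$
where
$$\hat{\mathbf{f}}_j^{(t)} = \left( \hat{\beta}_j^{(t)}(u_1, v_1) x_{1j}, \hat{\beta}_j^{(t)}(u_2, v_2) x_{2j}, \ldots, \hat{\beta}_j^{(t)}(u_n, v_n) x_{nj} \right)^{\mathrm{T}}, \quad j = 1, 2, \ldots, p.$$
Given a threshold $\eta_0$, when $\eta_t \le \eta_0$ for some $t \ge 1$, we terminate the iteration process and take $\hat{\rho}^{(t)}$ as the final estimator of the autoregressive coefficient and $\hat{\boldsymbol{\beta}}^{(t)}(u_i, v_i) = \left( \hat{\beta}_1^{(t)}(u_i, v_i), \hat{\beta}_2^{(t)}(u_i, v_i), \ldots, \hat{\beta}_p^{(t)}(u_i, v_i) \right)^{\mathrm{T}}$ $(i = 1, 2, \ldots, n)$ as the final estimators of the regression coefficients, with $h_1^{(t)}, h_2^{(t)}, \ldots, h_p^{(t)}$ being their respective optimal bandwidth sizes.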
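The building blocks of one backfitting iteration in Steps (ii) to (iv) can be sketched in Python as follows. This is a simplified illustration, not the authors' Matlab implementation: the kernel weight matrix `K` is taken as given for a fixed bandwidth, the AICc-based bandwidth selection of Step (ii) is omitted, and all function names are hypothetical.

```python
import numpy as np

def univariate_gwr(x, y_tilde, K):
    """Local least-squares estimates of beta_m(u_i, v_i) at every location.

    K[i, k] is the kernel weight of observation k at focal location i.
    For the model y_tilde = beta_m * x + error, the local estimator at i is
    sum_k K[i, k] x_k y_tilde_k / sum_k K[i, k] x_k^2.
    """
    return (K @ (x * y_tilde)) / (K @ (x * x))

def update_rho(yw_hat, y_star):
    """Least-squares update of rho with Wy replaced by its instrumental estimator."""
    return float(yw_hat @ y_star) / float(yw_hat @ yw_hat)

def convergence_eta(rho_new, rho_old, Wy, F_new, F_old):
    """Relative change eta_t between two consecutive iterations.

    F_new and F_old are (n, p) arrays whose j-th columns hold the additive
    terms f_j^(t) and f_j^(t-1), respectively.
    """
    num = np.sum(((rho_new - rho_old) * Wy) ** 2) + np.sum((F_new - F_old) ** 2)
    den = np.sum((rho_new * Wy) ** 2) + np.sum(F_new ** 2)
    return float(np.sqrt(num / den))
```

In a full implementation, these functions would be called inside a loop over $m = 1, 2, \ldots, p$ and over iterations $t$, stopping once `convergence_eta` falls to the threshold $\eta_0$ or below.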

References

  1. Fotheringham, A.S.; Charlton, M.; Brunsdon, C. The geography of parameter space: An investigation of spatial non-stationarity. Int. J. Geogr. Inf. Syst. 1996, 10, 605–627.
  2. Brunsdon, C.; Fotheringham, A.S.; Charlton, M. Geographically weighted regression: A method for exploring spatial nonstationarity. Geogr. Anal. 1996, 28, 281–298.
  3. Fotheringham, A.S.; Brunsdon, C.; Charlton, M. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships; Wiley: Chichester, UK, 2002.
  4. Goodchild, M.F. Modelling scale in geographical information science. In Models of Scale and Scales of Modelling; Tate, N., Atkinson, P.M., Eds.; Wiley: Chichester, UK, 2001; pp. 3–10.
  5. Gao, P.; Bian, L. Scale effects on spatially embedded contact networks. Comput. Environ. Urban Syst. 2016, 59, 142–151.
  6. Yang, W. An Extension of Geographically Weighted Regression with Flexible Bandwidths. Ph.D. Dissertation, University of St Andrews, Newtown, UK, 2014.
  7. Fotheringham, A.S.; Yang, W.; Kang, W. Multiscale geographically weighted regression (MGWR). Ann. Am. Assoc. Geogr. 2017, 107, 1247–1265.
  8. Fan, J.; Zhang, W. Statistical estimation in varying coefficient models. Ann. Stat. 1999, 27, 1491–1518.
  9. Hastie, T.J.; Tibshirani, R.J. Generalized Additive Models; Chapman and Hall: London, UK, 1990.
  10. Leong, Y.Y.; Yue, J.C. A modification to geographically weighted regression. Int. J. Health. Geogr. 2017, 16, 1–18.
  11. Murakami, D.; Lu, B.; Harris, P.; Brunsdon, C.; Charlton, M.; Nakaya, T.; Griffith, D.A. The importance of scale in spatially varying coefficient modeling. Ann. Am. Assoc. Geogr. 2019, 109, 50–70.
  12. Chen, F.; Mei, C.L. Scale-adaptive estimation of mixed geographically weighted regression models. Econ. Model. 2021, 94, 737–747.
  13. Brunsdon, C.; Fotheringham, A.S.; Charlton, M. Some notes on parametric significance tests for geographically weighted regression. J. Reg. Sci. 1999, 39, 497–524.
  14. Brunsdon, C.; Fotheringham, A.S.; Charlton, M. Spatial nonstationarity and autoregressive models. Environ. Plan. A 1998, 30, 957–973.
  15. Sun, Y.; Yan, H.; Zhang, W.; Lu, Z.D. A semiparametric spatial dynamic model. Ann. Stat. 2014, 42, 700–727.
  16. Geniaux, G.; Martinetti, D. A new method for dealing simultaneously with spatial autocorrelation and spatial heterogeneity in regression models. Reg. Sci. Urban. Econ. 2018, 72, 74–85.
  17. Chen, F.; Leung, Y.; Mei, C.L.; Fung, T. Backfitting estimation for geographically weighted regression models with spatial autocorrelation in the response. Geogr. Anal. 2022, 54, 357–381.
  18. Wu, C.; Ren, F.; Hu, W.; Du, Q. Multiscale geographically and temporally weighted regression: Exploring the spatiotemporal determinants of housing prices. Int. J. Geogr. Inf. Sci. 2019, 33, 489–511.
  19. Huang, B.; Wu, B.; Barry, M. Geographically and temporally weighted regression for modeling spatio-temporal variation in house prices. Int. J. Geogr. Inf. Sci. 2016, 24, 383–401. [Google Scholar] [CrossRef]
  20. Fotheringham, A.S.; Crespo, R.; Yao, J. Geographical and temporal weighted regression (GTWR). Geogr. Anal. 2015, 47, 431–452. [Google Scholar] [CrossRef]
  21. Zhang, Z.; Li, J.; Fung, T.; Yu, H.Y.; Mei, C.L.; Leung, Y. Multiscale geographically and temporally weighted regression with a unilateral temporal weighting scheme and its application in the analysis of spatiotemporal characteristics of house prices in Beijing. Int. J. Geogr. Inf. Sci. 2021, 35, 2262–2286. [Google Scholar] [CrossRef]
  22. Fotheringham, A.S.; Yue, H.; Li, Z. Examining the influences of air quality in China’s cities using multi-scale geographically weighted regression. Trans. GIS 2019, 23, 1444–1464. [Google Scholar] [CrossRef]
  23. Mollalo, A.; Vahedi, B.; Rivera, K.M. GIS-based spatial modeling of COVID-19 incidence rate in the continental United States. Sci. Total Environ. 2020, 728, 138884. [Google Scholar] [CrossRef]
  24. Oshan, T.M.; Smith, J.P.; Fotheringham, A.S. Targeting the spatial context of obesity determinants via multiscale geographically weighted regression. Int. J. Health Geogr. 2020, 19, 1–17. [Google Scholar] [CrossRef]
  25. Liu, N.; Zou, B.; Li, S.; Zhang, H.; Qin, K. Prediction of PM2.5 concentrations at unsampled points using multiscale geographically and temporally weighted regression. Environ. Pollut. 2021, 284, 117116. [Google Scholar] [CrossRef] [PubMed]
  26. Lyanda, A.E.; Osayomi, T. Is there a relationship between economic indicators and road fatalities in Texas? A multiscale geographically weighted regression analysis. GeoJournal 2021, 86, 2787–2807. [Google Scholar] [CrossRef]
  27. Niu, L.; Zhang, Z.; Peng, Z.; Liang, Y.; Liu, M.; Jiang, Y.; Wei, J.; Tang, R. Identifying surface urban heat island drivers and their spatial heterogeneity in China’s 281 cities: An empirical study based on multiscale geographically weighted regression. Remote Sens. 2021, 13, 4428. [Google Scholar] [CrossRef]
  28. Tomal, M. Exploring the meso-determinants of apartment prices in polish counties using spatial autoregressive multiscale geographically weighted regression. Appl. Econ. Lett. 2021, 29, 822–830. [Google Scholar] [CrossRef]
  29. Deng, M.; Yang, W.; Liu, Q. Geographically weighted extreme learning machine: A method for space–time prediction. Geogr. Anal. 2017, 49, 433–450. [Google Scholar] [CrossRef]
  30. Li, K.; Lam, N.S.N. Geographically weighted elastic net: A variable-selection and modeling method under the spatially nonstationary condition. Ann. Am. Assoc. Geogr. 2018, 108, 1582–1600. [Google Scholar] [CrossRef]
  31. Du, Z.; Wang, Z.; Wu, S.; Zhang, F.; Liu, R. Geographically neural network weighted regression for the accurate estimation of spatial non-stationarity. Int. J. Geogr. Inf. Sci. 2020, 34, 1353–1377. [Google Scholar] [CrossRef]
  32. Wu, S.; Wang, Z.; Du, Z.; Huang, B.; Zhang, F.; Liu, R. Geographically and temporally neural network weighted regression for modeling spatiotemporal non-stationary relationships. Int. J. Geogr. Inf. Sci. 2021, 35, 582–608. [Google Scholar] [CrossRef]
  33. Yang, W.; Deng, M.; Tang, J.; Luo, L. Geographically weighted regression with the integration of machine learning for spatial prediction. J. Geogr. Syst. 2022. [Google Scholar] [CrossRef]
  34. He, Y.; Zhao, Y.; Tsui, K.L. An adapted geographically weighted LASSO (Ada-GWL) model for predicting subway ridership. Transportation 2021, 48, 1185–1216. [Google Scholar] [CrossRef]
  35. Kopczewska, K. Spatial machine learning: New opportunities for regional science. Ann. Reg. Sci. 2022, 68, 713–755. [Google Scholar] [CrossRef]
  36. Li, Z.; Fotheringham, A.S. Computational improvements to multi-scale geographically weighted regression. Int. J. Geogr. Inf. Sci. 2020, 34, 1378–1397. [Google Scholar] [CrossRef]
  37. Wu, B.; Yan, J.; Lin, H. A cost-effective algorithm for calibrating multiscale geographically weighted regression models. Int. J. Geogr. Inf. Sci. 2022, 36, 893–917. [Google Scholar] [CrossRef]
  38. Wang, N.; Mei, C.L.; Yan, X.D. Local linear estimation of spatially varying coefficient models: An improvement on the geographically weighted regression technique. Environ. Plan. A 2008, 40, 986–1005. [Google Scholar] [CrossRef]
  39. Kelejian, H.H.; Prucha, I.R. A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. J. Real. Eatate. Financ. 1998, 17, 99–121. [Google Scholar] [CrossRef]
  40. Kelejian, H.H.; Prucha, I.R. A generalized moments estimator for the autoregressive parameter in a spatial model. Int. Econ. Rev. 1999, 40, 509–533. [Google Scholar] [CrossRef]
  41. Mei, C.L.; Chen, F. Detection of spatial heterogeneity based on spatial autoregressive varying coefficient models. Spat. Stat. 2022, 51, 100666. [Google Scholar] [CrossRef]
  42. Gollini, I.; Lu, B.; Charlton, M.; Brunsdon, C.; Harris, P. GWmodel: An R package for exploring spatial heterogeneity using geographically weighted models. J. Stat. Softw. 2015, 63, 1–50. [Google Scholar] [CrossRef]
  43. Boots, B.; Tiefelsdorf, M. Global and local spatial autocorrelation in bounded regular tessellations. J. Geogr. Syst. 2000, 2, 319–348. [Google Scholar] [CrossRef]
  44. Fan, J.; Gijbels, I. Local Polynomial Modelling and Its Applications; Chapman and Hall: London, UK, 1996. [Google Scholar]
  45. Yu, H.; Fotheringham, A.S.; Li, Z.; Oshan, T.; Kang, W.; Wolf, L.J. Inference in multiscale geographically weighted regression. Geogr. Anal. 2020, 52, 87–106. [Google Scholar] [CrossRef]
  46. Fotheringham, A.S. A comment on “A route map for successful applications of geographically-weighted regression”: The alternative expressway to defensible regression-based local modeling. Geogr. Anal. 2023, 55, 191–197. [Google Scholar] [CrossRef]
  47. Li, D.K.; Mei, C.L.; Wang, N. Tests for spatial dependence and heterogeneity in spatially autoregressive varying coefficient models with application to Boston house price analysis. Reg. Sci. Urban Econ. 2019, 79, 103470. [Google Scholar] [CrossRef]
Figure 1. 400 sampling points used in the simulation study.
Figure 2. True surfaces of the three regression coefficients.
Figure 3. Estimated surfaces of the regression coefficients with c = 0.7 and ρ = −0.9 by the three multiscale estimation methods.
Figure 4. Estimated surfaces of the regression coefficients with c = 0.7 and ρ = 0 by the three multiscale estimation methods.
Figure 5. Estimated surfaces of the regression coefficients with c = 0.7 and ρ = 0.9 by the three multiscale estimation methods.
Figure 6. Boxplots of the optimal bandwidth sizes of the regression coefficients selected in the 200 experiment replications by the three multiscale estimation methods for c = 0.7 and ρ = −0.9, 0, 0.9.
Figure 7. Boxplots of the optimal bandwidth sizes of the regression coefficients selected in the 200 experiment replications by the three multiscale estimation methods for c = 1 and ρ = −0.9, 0, 0.9.
Figure 8. (a) Heatmaps of the estimators of the regression coefficients Intercept, DiffAdd, and LARent by the three multiscale estimation methods. (b) Heatmaps of the estimators of the regression coefficients SC1, Unempl, and LowEduc by the three multiscale estimation methods. (c) Heatmaps of the estimators of the regression coefficients Age1, Age2, and Age3 by the three multiscale estimation methods.
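Table 1. Values of the accuracy indices in the N = 200 experiment replications.

| ρ | Method | c | ρ: Mean | ρ: RMSE | β1(u, v): ARMSE | β2(u, v): ARMSE | β3(u, v): ARMSE |
|---|---|---|---|---|---|---|---|
| −0.9 | SARGWR-LL | 0.6 | −0.9455 | 0.0541 | 0.0885 | 0.1770 | 0.1429 |
| | | 0.7 | −0.9460 | 0.0543 | 0.0917 | 0.1725 | 0.1376 |
| | | 0.8 | −0.9462 | 0.0543 | 0.0933 | 0.1708 | 0.1363 |
| | | 0.9 | −0.9462 | 0.0542 | 0.0918 | 0.1707 | 0.1368 |
| | | 1 | −0.9459 | 0.0540 | 0.0880 | 0.1703 | 0.1375 |
| | SARGWR-GL | 0.6 | −0.9564 | 0.0670 | 0.1134 | 0.2250 | 0.1710 |
| | | 0.7 | −0.9554 | 0.0654 | 0.1186 | 0.2071 | 0.1565 |
| | | 0.8 | −0.9555 | 0.0653 | 0.1216 | 0.1994 | 0.1499 |
| | | 0.9 | −0.9557 | 0.0654 | 0.1209 | 0.1995 | 0.1489 |
| | | 1 | −0.9563 | 0.0658 | 0.1210 | 0.2025 | 0.1489 |
| | SARGWR-BF | NA | −0.9408 | 0.0532 | 0.1255 | 0.2075 | 0.2340 |
| −0.6 | SARGWR-LL | 0.6 | −0.6338 | 0.0449 | 0.0887 | 0.1767 | 0.1408 |
| | | 0.7 | −0.6343 | 0.0450 | 0.0918 | 0.1720 | 0.1355 |
| | | 0.8 | −0.6346 | 0.0451 | 0.0935 | 0.1704 | 0.1342 |
| | | 0.9 | −0.6345 | 0.0449 | 0.0915 | 0.1704 | 0.1346 |
| | | 1 | −0.6342 | 0.0448 | 0.0875 | 0.1699 | 0.1353 |
| | SARGWR-GL | 0.6 | −0.6455 | 0.0590 | 0.1173 | 0.2245 | 0.1686 |
| | | 0.7 | −0.6449 | 0.0580 | 0.1222 | 0.2065 | 0.1542 |
| | | 0.8 | −0.6451 | 0.0580 | 0.1249 | 0.1989 | 0.1477 |
| | | 0.9 | −0.6453 | 0.0580 | 0.1247 | 0.1991 | 0.1467 |
| | | 1 | −0.6456 | 0.0581 | 0.1238 | 0.2020 | 0.1467 |
| | SARGWR-BF | NA | −0.6315 | 0.0477 | 0.1280 | 0.2073 | 0.2335 |
| −0.3 | SARGWR-LL | 0.6 | −0.3231 | 0.0373 | 0.0895 | 0.1764 | 0.1392 |
| | | 0.7 | −0.3236 | 0.0375 | 0.0925 | 0.1718 | 0.1339 |
| | | 0.8 | −0.3240 | 0.0375 | 0.0944 | 0.1702 | 0.1327 |
| | | 0.9 | −0.3239 | 0.0374 | 0.0931 | 0.1701 | 0.1330 |
| | | 1 | −0.3236 | 0.0373 | 0.0887 | 0.1697 | 0.1338 |
| | SARGWR-GL | 0.6 | −0.3341 | 0.0514 | 0.1223 | 0.2242 | 0.1670 |
| | | 0.7 | −0.3340 | 0.0508 | 0.1269 | 0.2061 | 0.1524 |
| | | 0.8 | −0.3343 | 0.0509 | 0.1296 | 0.1985 | 0.1458 |
| | | 0.9 | −0.3344 | 0.0508 | 0.1294 | 0.1987 | 0.1449 |
| | | 1 | −0.3345 | 0.0508 | 0.1286 | 0.2017 | 0.1448 |
| | SARGWR-BF | NA | −0.3223 | 0.0435 | 0.1326 | 0.2071 | 0.2330 |
| 0 | SARGWR-LL | 0.6 | −0.0135 | 0.0314 | 0.0939 | 0.1763 | 0.1381 |
| | | 0.7 | −0.0139 | 0.0315 | 0.0959 | 0.1716 | 0.1327 |
| | | 0.8 | −0.0142 | 0.0314 | 0.0983 | 0.1700 | 0.1315 |
| | | 0.9 | −0.0141 | 0.0315 | 0.0963 | 0.1699 | 0.1319 |
| | | 1 | −0.0139 | 0.0313 | 0.0927 | 0.1695 | 0.1327 |
| | SARGWR-GL | 0.6 | −0.0223 | 0.0439 | 0.1299 | 0.2240 | 0.1656 |
| | | 0.7 | −0.0223 | 0.0435 | 0.1332 | 0.2058 | 0.1510 |
| | | 0.8 | −0.0226 | 0.0437 | 0.1362 | 0.1983 | 0.1444 |
| | | 0.9 | −0.0226 | 0.0435 | 0.1355 | 0.1985 | 0.1433 |
| | | 1 | −0.0227 | 0.0434 | 0.1350 | 0.2015 | 0.1434 |
| | SARGWR-BF | NA | −0.0130 | 0.0392 | 0.1401 | 0.2069 | 0.2329 |
| 0.3 | SARGWR-LL | 0.6 | 0.2929 | 0.0274 | 0.1067 | 0.1762 | 0.1374 |
| | | 0.7 | 0.2927 | 0.0274 | 0.1080 | 0.1715 | 0.1320 |
| | | 0.8 | 0.2925 | 0.0274 | 0.1098 | 0.1699 | 0.1309 |
| | | 0.9 | 0.2925 | 0.0273 | 0.1082 | 0.1699 | 0.1312 |
| | | 1 | 0.2927 | 0.0273 | 0.1047 | 0.1694 | 0.1320 |
| | SARGWR-GL | 0.6 | 0.2885 | 0.0368 | 0.1436 | 0.2239 | 0.1646 |
| | | 0.7 | 0.2884 | 0.0367 | 0.1470 | 0.2056 | 0.1499 |
| | | 0.8 | 0.2881 | 0.0368 | 0.1498 | 0.1980 | 0.1432 |
| | | 0.9 | 0.2882 | 0.0366 | 0.1487 | 0.1984 | 0.1422 |
| | | 1 | 0.2881 | 0.0366 | 0.1481 | 0.2013 | 0.1423 |
| | SARGWR-BF | NA | 0.2952 | 0.0349 | 0.1546 | 0.2068 | 0.2327 |
| 0.6 | SARGWR-LL | 0.6 | 0.5860 | 0.0271 | 0.1632 | 0.1761 | 0.1372 |
| | | 0.7 | 0.5858 | 0.0272 | 0.1650 | 0.1716 | 0.1320 |
| | | 0.8 | 0.5857 | 0.0273 | 0.1663 | 0.1699 | 0.1307 |
| | | 0.9 | 0.5858 | 0.0273 | 0.1652 | 0.1699 | 0.1311 |
| | | 1 | 0.5858 | 0.0272 | 0.1629 | 0.1694 | 0.1317 |
| | SARGWR-GL | 0.6 | 0.5896 | 0.0306 | 0.1880 | 0.2237 | 0.1640 |
| | | 0.7 | 0.5895 | 0.0307 | 0.1905 | 0.2057 | 0.1493 |
| | | 0.8 | 0.5894 | 0.0308 | 0.1921 | 0.1980 | 0.1427 |
| | | 0.9 | 0.5894 | 0.0307 | 0.1913 | 0.1983 | 0.1419 |
| | | 1 | 0.5894 | 0.0307 | 0.1917 | 0.2011 | 0.1420 |
| | SARGWR-BF | NA | 0.5933 | 0.0296 | 0.1935 | 0.2073 | 0.2328 |
| 0.9 | SARGWR-LL | 0.6 | 0.8213 | 0.0835 | 1.7329 | 0.1835 | 0.1372 |
| | | 0.7 | 0.8213 | 0.0835 | 1.7334 | 0.1777 | 0.1338 |
| | | 0.8 | 0.8213 | 0.0835 | 1.7336 | 0.1770 | 0.1339 |
| | | 0.9 | 0.8213 | 0.0835 | 1.7335 | 0.1776 | 0.1344 |
| | | 1 | 0.8213 | 0.0835 | 1.7332 | 0.1773 | 0.1341 |
| | SARGWR-GL | 0.6 | 0.8376 | 0.0669 | 1.3931 | 0.2274 | 0.1640 |
| | | 0.7 | 0.8375 | 0.0670 | 1.3938 | 0.2066 | 0.1494 |
| | | 0.8 | 0.8374 | 0.0670 | 1.3949 | 0.1980 | 0.1441 |
| | | 0.9 | 0.8374 | 0.0670 | 1.3951 | 0.1985 | 0.1425 |
| | | 1 | 0.8374 | 0.0670 | 1.3947 | 0.2015 | 0.1418 |
| | SARGWR-BF | NA | 0.8374 | 0.0670 | 1.3951 | 0.2155 | 0.2388 |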
Note: NA means “not applicable”.
Table 2. Variable-specific optimal bandwidth sizes yielded by the three multiscale estimation methods, with the shrinking parameter c = 0.7 and c = 1 for SARGWR-LL and SARGWR-GL.

| Variable | Coefficient | SARGWR-LL (c = 0.7) | SARGWR-LL (c = 1) | SARGWR-GL (c = 0.7) | SARGWR-GL (c = 1) | SARGWR-BF |
|---|---|---|---|---|---|---|
| Intercept | β1(u, v) | 227 | 227 | 132 | 132 | 137 |
| DiffAdd | β2(u, v) | 167 | 132 | 97 | 117 | 92 |
| LARent | β3(u, v) | 297 | 287 | 322 | 322 | 322 |
| SC1 | β4(u, v) | 312 | 307 | 322 | 322 | 202 |
| Unempl | β5(u, v) | 137 | 172 | 117 | 132 | 107 |
| LowEduc | β6(u, v) | 247 | 247 | 247 | 247 | 322 |
| Age1 | β7(u, v) | 242 | 252 | 162 | 162 | 97 |
| Age2 | β8(u, v) | 167 | 192 | 117 | 127 | 107 |
| Age3 | β9(u, v) | 197 | 207 | 137 | 162 | 102 |

MDPI and ACS Style

Gao, S.-J.; Mei, C.-L.; Xu, Q.-X.; Zhang, Z. Non-Iterative Multiscale Estimation for Spatial Autoregressive Geographically Weighted Regression Models. Entropy 2023, 25, 320. https://doi.org/10.3390/e25020320

