Article

Some Empirical Results on Nearest-Neighbour Pseudo-populations for Resampling from Spatial Populations

by Sara Franceschi 1,*, Rosa Maria Di Biase 2, Agnese Marcelli 3,4 and Lorenzo Fattorini 1

1 Department of Economics and Statistics, University of Siena, 53100 Siena, Italy
2 Department of Sociology and Social Research, University of Milano Bicocca, 20126 Milan, Italy
3 Department for Innovation in Biological, Agro-Food and Forest Systems, University of Tuscia, 01100 Viterbo, Italy
4 Department of Sustainable Agro-Ecosystems and Bioresources, Fondazione Edmund Mach, Research and Innovation Centre, 38098 San Michele all’Adige, Italy
* Author to whom correspondence should be addressed.
Stats 2022, 5(2), 385-400; https://doi.org/10.3390/stats5020022
Submission received: 20 March 2022 / Revised: 11 April 2022 / Accepted: 12 April 2022 / Published: 15 April 2022
(This article belongs to the Special Issue Re-sampling Methods for Statistical Inference of the 2020s)

Abstract

In finite populations, pseudo-population bootstrap is the sole method preserving the spirit of the original bootstrap performed from iid observations. In spatial sampling, theoretical results about the convergence of bootstrap distributions to the actual distributions of estimators are lacking, owing to the failure of spatially balanced sampling designs to converge to the maximum entropy design. In addition, the issue of creating pseudo-populations able to mimic the characteristics of real populations is challenging in spatial frameworks where spatial trends, relationships, and similarities among neighbouring locations are invariably present. In this paper, we propose the use of the nearest-neighbour interpolation of spatial populations for constructing pseudo-populations that converge to real populations under mild conditions. The effectiveness of these proposals with respect to traditional pseudo-populations is empirically checked by a simulation study.

1. Introduction

The original bootstrap method [1] is not directly applicable to without-replacement sampling from finite populations because the values of the survey variable Y in the samples are not iid. Ref. [2] provides an extended survey of the bootstrap methods for finite populations, classifying them into three main groups: (i) the pseudo-population bootstrap (PPB) methods, where a pseudo-population (PP) that mimics the true one is first created from the sample and bootstrap samples are then selected from the resulting PP by a resampling design; (ii) direct bootstrap methods, where bootstrap samples are selected from the original sample or a rescaled version of it using a replacement sampling scheme different from the original sampling design; (iii) bootstrap weights methods, where the sample remains fixed and bootstrap weights are generated instead of bootstrap samples. In our opinion, the PPB techniques that adopt as resampling scheme the same scheme used to select the original sample are the most convincing (see also [3]). Even if PPB techniques may be computationally intensive with large samples, so that official statistical offices usually adopt weighted bootstraps [2], they preserve the original spirit of the classical bootstrap [1], where samples are iid data from an unknown distribution and bootstrap samples, like the original ones, are iid data from the empirical distribution.
To the best of our knowledge (but see also [2]), for many years the literature on PPB only dealt with the properties of bootstrap variance estimators, neglecting asymptotic considerations involving the whole bootstrap distribution. This is in contrast to the original spirit of [4], which points out that the asymptotic coincidence of the distribution of the bootstrapped estimator with the actual one is an essential requirement (see also [5], chapter 3). To meet this requirement, Ref. [3] assumes a superpopulation model generating an infinite sequence of Y values with a vector of design variables associated with each of them. Conditional on the realised sequence, a sequence of finite populations of increasing size is constructed and sampled by a sequence of designs with inclusion probabilities proportional to a function of the design variables. As the population size increases, the sequence of designs is required to converge to the rejective sampling design of maximum entropy (e.g., [6]), usually referred to as conditional Poisson sampling (CPS). In this scenario, the authors consider population parameters that are Hadamard differentiable functionals of the population distribution function and plug-in estimators of these parameters obtained by substituting the Hájek estimator of the distribution function into the functional, then proving a functional central limit theorem for these estimators. As discussed in the next section, various forms of the central limit theorem have a long history in finite population sampling. What is really novel in [3] is that the authors give conditions on PPs ensuring that the bootstrap distributions of the plug-in estimators resampled from these PPs by means of suitable resampling schemes (including that adopted to select the original sample) asymptotically coincide with the actual distributions of these estimators.
At least to our knowledge, the paper by [3] is the sole one providing a unified framework that theoretically justifies the use of PPB in a wide range of situations, including the most common sampling schemes and a large collection of parameters (e.g., totals, averages, quantiles, Lorenz curves, inequality indexes, etc.).
Unfortunately, the results by [3] cannot be exploited in spatial surveys, where the populations to be sampled are collections of units located in a study region with a Y value attached to each location. The crucial issue is that the most effective designs usually adopted in spatial sampling are aimed at achieving spatial balance, i.e., samples of units evenly spread over the study region (e.g., [7]). To this purpose, locations play the role of design variables, and spatial balance is usually obtained by schemes that avoid the selection of contiguous units. As a consequence, under spatial designs a large portion of samples (those including neighbouring units) cannot be selected. Therefore, spatial designs do not generally converge to CPS as required by [3]. This is exemplified in Appendix A, where it is proven that two sampling schemes that in different ways exclude the selection of contiguous or neighbouring units fail to converge to CPS. In addition, the issue of creating PPs able to mimic the characteristics of the real populations is challenging in a spatial framework, where spatial trends, relationships, and similarities among neighbouring locations are invariably present.
Based on these considerations, in this paper we propose the use of PPs suitable for spatial surveys. In some respects, the proposed PPs are similar to the hot-deck PPs by [3] in which, given the set of sampled units, the Y value assigned to any other location is the value of the sampled unit that is nearest in the space of the values of a size variable X. In an analogous way, but involving locations, we here propose to assign to any unsampled unit the value observed at the sampled unit that is nearest in the geographical space.
Owing to its simplicity, the principle here proposed for constructing spatial PPs constitutes a widely extended practice for mapping spatial populations, and it is commonly referred to as the nearest-neighbour interpolator (NNI). In the past, NNI has been invariably adopted as a descriptive technique. In a model-dependent perspective, Ref. [8] (Section 5.9) classified NNI among “non stochastic methods of spatial prediction”, i.e., descriptive mapping techniques for which no stochastic model is assumed and no inference is attempted. Recently, Ref. [9] approached NNI from a design-based point of view, proving its design-based consistency under mild assumptions about Y values. Under these conditions, for populations and sample sizes sufficiently large, the resulting PPs are likely to be good pictures of the true spatial populations.
The main purposes of this paper are essentially two: (i) to propose the use of PPs based on NNI that are able to converge to the true populations under mild conditions required for the Y values and for the spatial sampling schemes, so that one can expect intuitively that the bootstrap distribution of an estimator should converge to the actual distribution; (ii) to empirically evidence by means of a simulation study the superiority of the PPs based on the NNI in fitting the true populations with respect to PPs commonly used in the literature and the subsequent superiority in the convergence of the bootstrap distributions toward the actual distributions of the involved estimators. The paper is organised as follows: Section 2 is devoted to the problem of estimating totals and averages in spatial surveys pointing out the reasons and the opportunity of adopting PPB for performing a reliable design-based inference on the distributions of the Horvitz–Thompson (HT) estimators. In Section 3, some criteria to construct PPs suitable to work with spatial populations are considered, including the NNI criterion. In Section 4, the PPs considered in Section 3 are compared by a simulation study in terms of the performance of the bootstrap distributions to fit the actual distribution of the HT estimators of totals under two schemes usually adopted in spatial sampling, as well as in terms of the coverage of bootstrap confidence intervals. Concluding remarks are provided in Section 5.

2. Statement of the Problem

Design-based inference has been widely adopted for estimating totals and averages of finite spatial populations, especially in agricultural, forest, and environmental studies. Examples include estimation of totals of agricultural productions for an area frame of agricultural holdings [10], estimation of total or average of volumes for a population of trees settled in a forest stand [11], estimation of the total timber volume for a population of forest compartments in an administrative district [12], and estimation of the average of eutrophication indexes in the population of small lakes of United States [13]. In these cases, estimation is customarily performed via the HT criterion or suitable model-assisted modifications able to exploit auxiliary information, such as generalised regression estimators or ratio estimators that are, however, functions of HT estimators (e.g., [14], chapter 6).
With this goal in mind, consider a study region $A$, which is supposed to be a connected and compact set of $\mathbb{R}^2$. As is customary in the finite population asymptotic framework (e.g., [15]), in spatial surveys we suppose $V = p_1, p_2, \ldots$ to be an infinite sequence of locations in $A$ and $y(V) = y_1, y_2, \ldots$ to be the corresponding sequence of Y values, where for brevity $y_j = y(p_j)$. Moreover, let $x(V) = x_1, x_2, \ldots$ be the corresponding sequence of a strictly positive size variable X, where for brevity $x_j = x(p_j)$.
By a little abuse of notation, we denote locations by their labels. A sequence $U_k$ of spatial populations is considered, where $U_1$ consists of the first $N_1$ locations from $V$, $U_2$ consists of the first $N_2 > N_1$ locations from $V$, and so on, in such a way that $U_k$ turns out to be a sequence of nested populations of increasing sizes within $A$. Finally, suppose a sequence of spatial designs $P_k$ to select spatially balanced samples $S_k$ from $U_k$ of fixed and increasing size $n_k = p N_k$ for a fixed sampling fraction $0 < p < 1$. For each $k$ and for each location $j \in U_k$, denote by $\pi_j^{(k)}$ the first-order inclusion probability induced by the $k$th design $P_k$, which is taken proportional to the size variable X. The key motivation for this choice is the efficiency of the HT estimator of population totals and averages when there is a strong direct relationship between X and Y. For simplicity we assume
$$\pi_j^{(k)} = \frac{n_k x_j}{\sum_{l \in U_k} x_l}, \quad j \in U_k \tag{1}$$
Note, however, that equal probability designs are included in (1) when the size variable X is invariably equal to one. For each $k$ and for each pair of locations $j, h \in U_k$, denote by $\pi_{j,h}^{(k)}$ the second-order inclusion probability induced by the $k$th design. As pointed out in the introduction, for most spatial schemes these probabilities vanish for pairs of neighbouring locations for the purpose of achieving spatially balanced samples.
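As a small illustration of (1), the following Python sketch computes inclusion probabilities proportional to size. The function name and the size values are hypothetical; the explicit check that no probability exceeds 1 is an added safeguard and is not discussed in the text.

```python
def inclusion_probabilities(x, n):
    """First-order inclusion probabilities proportional to size, as in (1)."""
    total = sum(x)
    pi = [n * xj / total for xj in x]
    # Added safeguard (an assumption of this sketch): (1) is only a valid
    # probability when n * x_j / sum(x) does not exceed 1 for any unit.
    if max(pi) > 1:
        raise ValueError("some n * x_j / sum(x) exceeds 1")
    return pi

x = [2.0, 1.0, 4.0, 3.0, 5.0, 5.0]                 # hypothetical size values
pi = inclusion_probabilities(x, n=2)               # probabilities sum to n
equal = inclusion_probabilities([1.0] * 6, n=2)    # X identically 1: equal probabilities n / N
```

The second call shows the equal-probability special case mentioned above, where every unit receives probability n / N.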
Spatially balanced schemes have a long history in spatial sampling. Examples are the generalised random-tessellation stratified sampling (GRTSS) by [13], the spatially correlated Poisson sampling (SCPS) by [16], the local pivotal method (LPM) by [11], and the doubly balanced spatial sampling (DBSS) by [17], which ensures not only spatial balance but also balance with respect to a set of auxiliary variables besides those adopted to determine the inclusion probabilities. More recently, additional schemes were proposed, such as the spatial determinantal sampling designs by [18] (see also [19]), the weakly associated vector sampling by [20], and the tessellation pivotal method by [21].
By using spatial schemes that spread samples, the selection of neighbouring units is avoided, which means that a large portion of second-order inclusion probabilities is null. As these sampling designs are nonmeasurable, it is impossible to estimate the variance of the HT estimator of total without bias. As stated by [16], variance estimation is “a bit tricky” in spatial sampling.
In addition, once a variance estimate is obtained, the traditional construction of confidence intervals using standard normal quantiles would require the existence of a finite-population central limit theorem. Convergence to normality has usually been derived for large entropy designs (e.g., [22] and references therein) and more recently extended to Hadamard differentiable functionals of the distribution function, in turn estimated by the HT and/or Hájek criteria, by [23] and by [3], but once again under designs approaching CPS. More general results are obtained by [24] for a quite large class of designs including high-entropy designs as a special case, but at the cost of working in a superpopulation setup. A central limit theorem for negatively associated designs is given by [25], but at the cost of requiring that the variance of the HT estimator converges to the variance under Poisson sampling, which, as evidenced by [21], is a quite difficult requirement for fixed-size designs. The authors of [21] also solve the problem by providing a suitable variance estimator and proving convergence to normality of the HT estimator, but only for the case of tessellation pivotal sampling; similarly, [18] proves a central limit theorem for determinantal sampling.
However, from a wider point of view, as stated in the introduction and argued in Appendix A, the exclusion of neighbouring units from the samples decreases the entropy of spatial designs, which do not generally converge to large entropy designs as the population and sample sizes increase, thus generally precluding the use of the finite-population central limit theorems available in the literature. Therefore, in spatial surveys, the use of PPB may be a suitable solution for making inferences on the distribution of the HT estimator and for constructing confidence intervals.

3. Spatial Pseudo-populations

The effectiveness of any PPB rests on the capacity of constructing PPs able to well depict the real population from which the sample has been selected. However, as stated in the introduction, some PPs commonly adopted in PPB are not suitable to mimic spatial populations. First of all, PPs such as the Holmberg PP [26] or the double-calibrated PP [3] that provide random population sizes must be avoided. Indeed, if the resulting PP size is smaller than the size of the real population, we have to arbitrarily discard a set of locations without any insight about how to choose them; on the other hand, if the resulting size is greater, we have to arbitrarily add locations, also in this case without any insight on the way in which the new units should be located on A . Moreover, PPs in which the Y values of the sampled units are assigned to an established number of units in the populations, such as the conditional Poisson PP [3], should be avoided because we have to arbitrarily assign such values to units without any insight about the locations at which these values should be assigned.
The multinomial PP (MPP) by [27] is one among the “traditional” PPs that can also be applied in spatial sampling. For each location $j \in U_k$, MPP independently assigns to $j$ the values $\hat{y}_j = y_h$ and $\hat{x}_j = x_h$ with probability
$$\Pr(\hat{y}_j = y_h) = \frac{x_h^{-1}}{\sum_{l \in S_k} x_l^{-1}}, \quad h \in S_k$$
The resulting PP has obviously the same size N k of the real population. Even if virtually applicable in spatial surveys, MPP takes into consideration only the information provided by the size variable X, completely neglecting the spatial locations, while in most cases the Y values strictly depend on locations. As such, MPP is likely to provide poor representations of real populations.
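The MPP construction can be sketched in a few lines of Python. This is only an illustration under the assumption that the assignment probabilities are proportional to the reciprocal of the size variable, $1/x_h$ (that is, to the reciprocal of the inclusion probabilities); the function name and the sample values are hypothetical.

```python
import random

def multinomial_pp(sample_y, sample_x, N, rng):
    """Independently assign to each of the N locations a sampled (y, x) pair,
    drawn with probability proportional to 1 / x_h (assumed reading of MPP)."""
    weights = [1.0 / xh for xh in sample_x]
    donors = rng.choices(range(len(sample_y)), weights=weights, k=N)
    return [sample_y[h] for h in donors], [sample_x[h] for h in donors]

rng = random.Random(1)
sample_y = [3.1, 7.4, 5.0]    # hypothetical sampled Y values
sample_x = [1.0, 4.0, 2.0]    # hypothetical sampled X values
pp_y, pp_x = multinomial_pp(sample_y, sample_x, N=30, rng=rng)
```

Note that the locations never enter the function: the same donor distribution is used at every location, which is precisely the limitation pointed out above.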
The hot-deck PP (HDPP) is another PP that can be applied in spatial surveys. HDPP has been proposed by [3] based on the idea that the values of the size variable are good proxies for the Y values. Then, for each location $j \in U_k$, HDPP assigns to $j$ the values $\hat{x}_j = x_j$ and
$$\hat{y}_j = Z_{k,j}\, y_j + (1 - Z_{k,j})\, y_{nn_x(j)} \tag{2}$$
where $Z_{k,j}$ is the sample indicator variable that is equal to 1 if $j \in S_k$ and equal to 0 otherwise, and $nn_x(j) = \operatorname{argmin}_{h \in S_k} |x_j - x_h|$, i.e., HDPP predicts the Y value at any unsampled location $j$ by the Y value of the sampled location that is nearest to $j$ in the space of the X values. Practically speaking, prediction (2) constitutes an NNI in the X space. Even if HDPPs do not directly consider the information provided by locations, locations enter the predictions through the fact that any $x_j = x(p_j)$ is actually a function of its location.
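Prediction (2) can be illustrated by a brute-force Python sketch. The function name and the toy values are hypothetical, and ties in the X distance are broken by the order of the sample, which is an arbitrary choice of this sketch. The function only reads Y at sampled units, as would be the case in a real survey.

```python
def hot_deck_pp(x, y, sample):
    """Predict Y at every unit by the Y of the sampled unit nearest in X space."""
    sampled = set(sample)
    y_hat = []
    for j in range(len(x)):
        if j in sampled:
            y_hat.append(y[j])      # Z = 1: keep the observed value
        else:
            donor = min(sample, key=lambda h: abs(x[j] - x[h]))
            y_hat.append(y[donor])  # Z = 0: borrow from the X-nearest sampled unit
    return y_hat

x = [1.0, 2.0, 2.2, 5.0, 5.1]           # size variable, known for all units
y = [10.0, 20.0, 21.0, 50.0, 52.0]      # Y is used only at the sampled units
y_hat = hot_deck_pp(x, y, sample=[1, 3])
```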
Alternatively, we here propose the use of NNI in which predictions are completely based on the spatial coordinates. NNI exploits the well-known Tobler’s first law of geography, for which the Y values at locations that are close in space tend to be more similar than those at locations that are far apart [28]. Therefore, for each location $j \in U_k$, this criterion, henceforth referred to as NNPP, assigns to $j$ the values $\hat{x}_j = x_j$ and
$$\hat{y}_j = Z_{k,j}\, y_j + (1 - Z_{k,j})\, y_{nn_g(j)} \tag{3}$$
where in this case $nn_g(j) = \operatorname{argmin}_{h \in S_k} \|p_j - p_h\|$, i.e., we predict the Y value at any unsampled location $j$ by the Y value of the sampled location that is nearest to $j$ in the geographical space. Practically speaking, NNPP assigns the value of a sampled unit to each unsampled unit inside the Voronoi cell constructed around the sampled unit.
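The geographical counterpart (3) differs from the hot-deck prediction only in the distance used. The following Python sketch uses Euclidean distance between coordinates and a brute-force search over the sample, which is adequate for small populations but is not meant as an efficient implementation; the function name and the toy coordinates are hypothetical.

```python
import math

def nearest_neighbour_pp(points, y, sample):
    """Predict Y at every location by the Y of the geographically nearest sampled location."""
    sampled = set(sample)
    y_hat = []
    for j, pj in enumerate(points):
        if j in sampled:
            y_hat.append(y[j])
        else:
            donor = min(sample, key=lambda h: math.dist(pj, points[h]))
            y_hat.append(y[donor])  # value of the sampled unit whose Voronoi cell contains p_j
    return y_hat

points = [(0.1, 0.1), (0.2, 0.1), (0.9, 0.9), (0.8, 0.7), (0.5, 0.4)]
y = [1.0, 2.0, 9.0, 8.0, 5.0]   # Y is used only at the sampled locations
y_hat = nearest_neighbour_pp(points, y, sample=[0, 2])
```

With only two sampled locations, the unit square is split into two Voronoi cells and every unsampled location inherits the value of the cell it falls in.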
Ref. [9] approaches the NNI (3) from a design-based perspective. Based on the sole assumption that the function $y$ defined on $V$ is measurable and bounded by $L$, they derive the following inequality for any location $j \in U_k$ and any $\delta > 0$ (see Theorem 1 in [9])
$$E_k\left|\hat{y}_j - y_j\right| \le \Delta_k(p_j, \delta) + L \Pr\Big\{\bigcap_{h \in S_k} \big(\|p_j - p_h\| > \delta\big)\Big\} \tag{4}$$
where $B_k(p_j, \delta) = \{h : h \in U_k, \|p_j - p_h\| \le \delta\}$ is the set of locations in $U_k$ within the $\delta$-ball around $p_j$, $\Delta_k(p_j, \delta) = \sup_{h \in B_k(p_j, \delta)} |y_j - y_h|$ is the largest difference from $y_j$ within the $\delta$-ball around $p_j$, the intersection event in the second term of (4) is the event that no sample location is within the $\delta$-ball around $p_j$, and $E_k$ denotes expectation with respect to the $k$-th design $P_k$. Inequality (4) establishes that the design-based expectations of the absolute NNI errors are bounded by the sum of two terms: the first depending on the roughness of the Y values around $p_j$ and the second depending on the sampling design. Therefore, consistency of NNI holds if both terms approach 0 as $k \to \infty$. If $y$ is continuous at $p_j$, then the first term approaches 0 as $\delta \to 0$, in such a way that consistency depends on the ability of the sampling scheme to select locations around $p_j$. Practically speaking, the sampling scheme adopted should be able to asymptotically provide spatial balance, i.e., to evenly spread sampled locations in such a way that, as $k$ increases, each location of $U_k$ is likely to have neighbouring locations sampled. It is worth noting that this consistency condition is satisfied by elementary schemes such as simple random sampling without replacement (SRSWOR) and stratified spatial sampling with proportional allocation. Unfortunately, owing to their complexity, we cannot prove that the common spatial schemes mentioned in Section 2 satisfy this requirement. However, given the effectiveness of these schemes in providing spatial balance and their empirical performance, invariably superior to that of SRSWOR (e.g., [29]), consistency of NNI should also hold with these schemes. Owing to consistency, for population and sample sizes sufficiently large, NNPPs are likely to provide a precise representation of the real spatial population. It is also worth noting that, like any model, Tobler’s law may be wrong.
This is not frequent, but it may occur in nature when there is competition for resources (e.g., light/water availability for trees). However, the great appeal of design-based NNI is that, even when Tobler’s law is not suitable, consistency of NNI is ensured by the continuity of $y$ joined with the use of sampling schemes able to provide an asymptotic spatial balance.
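For SRSWOR, the design-dependent term in (4) can be evaluated explicitly, which gives a feel for why it vanishes. If $m$ of the $N$ population units lie within the $\delta$-ball around $p_j$, the probability that none of them enters a sample of size $n$ is the hypergeometric ratio $\binom{N-m}{n}/\binom{N}{n}$. The Python sketch below evaluates this ratio for a fixed 10% sampling fraction; the assumption that the ball always holds 2% of the population is purely illustrative.

```python
from math import comb

def prob_no_sampled_neighbour(N, n, m):
    """Pr(no sampled unit among the m units in the delta-ball) under SRSWOR."""
    return comb(N - m, n) / comb(N, n)

# Fixed sampling fraction n = N / 10; hypothetical: the ball holds N / 50 units.
sizes = (250, 500, 1000)
probs = [prob_no_sampled_neighbour(N, N // 10, N // 50) for N in sizes]
```

Under these assumptions the probability decreases steadily as N grows with $\delta$ held fixed, which is the behaviour required of the second term for consistency.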
On the other hand, it should be pointed out that consistency cannot be proven for the HDPPs, in which the nearest locations are identified in the X space. Indeed, it is very natural to think of the Y values as linked with the X values by a deterministic function $t$ plus a term $e$ that depends on $p_j$ and quantifies the error made by deterministically predicting $y_j$ from $x_j$. In practice, a very general scenario for the Y values is
$$y_j = t(x_j) + e_j, \quad j \in U_k \tag{5}$$
It is worth noting that (5) is not an assumption but just an accounting relationship that invariably holds once we choose a deterministic function $t$ suitable to link Y with X. Owing to (5), it is apparent that the first term of (4) cannot approach 0 even if $\delta$ approaches 0. In practice, even if the distances in the X space become smaller and smaller, i.e., $x_h$ approaches $x_j$, so that $t(x_h)$ may approach $t(x_j)$, nothing ensures that $y_h$ approaches $y_j$, owing to the presence of the spatial component that is not accounted for. These considerations evidence the theoretical appeal of NNPPs over HDPPs.
Finally, a relevant difference between the MPPs with respect to HDPPs and NNPPs is that the last two only depend on the selected sample S k , in such a way that once the sample is selected there is only one PP from which bootstrap samples are selected (conditional approach). On the other hand, MPPs depend on the sample S k as well as on the random assignment of the sample values to the population. In this case, we can construct a PP at any resampling occasion from which a bootstrap sample is selected (unconditional approach). Obviously, the unconditional approach is more computationally intensive and time consuming [3]. For this reason, as well as for performing a fair comparison of the PP criteria, only the conditional approach is considered.

4. Simulation Study

The purpose of this study is to empirically evaluate the performance of the different criteria for constructing PPs from which resampling is performed to approximate the distribution of the HT estimators of totals in spatial sampling. Comparison is performed with respect to several spatial populations showing different spatial patterns, several spatially balanced sampling schemes, and several population sizes whose increase mimics the sequence of nested populations theoretically supposed throughout the paper.

4.1. Artificial Populations

To generate finite and nested spatial populations, an artificial surface on the unit square was considered, where for any point $p = (p_1, p_2)$ the surface was defined by
$$y(p) = C \sin(3 p_1) \sin^2(3 p_2) \tag{6}$$
where the constant $C = 10 / \sin^3(1.5)$ ensured a maximum Y value of 10 (see Figure 1). The surface (6) was chosen to represent the major characteristics of spatial populations. It was continuous, in such a way that the Y values in neighbouring locations tended to be similar, thus entailing a positive spatial autocorrelation in the resulting populations. Moreover, it varied relevantly throughout the unit square, showing an increasing trend toward the centre of the square, thus entailing a spatial stratification with different values of the survey variable in different parts of the square.
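The surface (6) is straightforward to evaluate numerically. The Python sketch below assumes the reading $y(p) = C \sin(3p_1)\sin^2(3p_2)$ with $C = 10/\sin^3(1.5)$; the function name is hypothetical.

```python
import math

C = 10 / math.sin(1.5) ** 3   # normalising constant of surface (6)

def surface(p1, p2):
    """Value of the artificial surface at the point (p1, p2) of the unit square."""
    return C * math.sin(3 * p1) * math.sin(3 * p2) ** 2

centre_value = surface(0.5, 0.5)   # C is chosen so that this equals 10
corner_value = surface(0.0, 0.0)   # the surface vanishes at the origin
```

With this reading, the constant normalises the value at the centre of the square, (0.5, 0.5), to 10.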
From (6), three nested populations of N = 250 , 500 , and 1000 points were located in the unit square in accordance with four spatial patterns referred to as regular (RE), random (RA), trended (TR), and clustered (CL) patterns.
For RE pattern, the nested populations were constructed generating the first 250 points completely at random on the unit square but discarding those having distances smaller than 1 / 250 to those previously generated, then adding a further 250 points completely at random on the unit square but discarding those having distances smaller than 1 / 500 to those previously generated, and finally adding a further 500 points completely at random, discarding those having distances smaller than 1 / 1000 . For the RA pattern, the nested populations were constructed by simply generating 1000 points completely at random in the unit square and then assigning the first 250 to the first population, the first 500 to the second population, and all of them to the third.
For the TR pattern, the nested populations were constructed generating 1000 points of coordinates $(1 - u_1^2, 1 - u_2^2)$, where $u_1, u_2$ were independent random numbers uniformly distributed on $(0, 1)$. Then, the first 250 points were assigned to the first population, the first 500 to the second population, and all of them to the third. Finally, for the CL pattern, the nested populations were constructed generating 10 cluster centres completely at random on the unit square and then assigning 25 points to each cluster, generated from a spherical normal distribution centred on the cluster centre with variances 0.025, then adding a further 25 points to each cluster from the same normal distribution, and finally adding a further 50 points to each cluster from the same distribution. Points falling outside the unit square were discarded and newly generated. For each spatial pattern, Figure 2 shows the resulting populations of maximum size 1000.
For each pattern, the Y values were given by (6) at the generated points, while the X values were generated from the Y values by
$$x_j = y_j + k u_j, \quad j = 1, \ldots, 1000$$
where the $u_j$s were independent random variables uniformly distributed on $(0, 1)$ and $k$ was a multiplicative constant that ensured a correlation of 0.7 between the 1000 values of X and Y. Population totals $T_y$ obviously changed with spatial patterns and population sizes.
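The text does not say how the constant $k$ was determined. One simple possibility, sketched below in Python, is a bisection search on $k$ until the empirical correlation between X and Y reaches the target. The bisection, the helper names, and the stand-in Y values (uniform random numbers rather than values of surface (6)) are all assumptions of this sketch.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

def calibrate_k(y, u, target=0.7, hi=1000.0, iters=100):
    """Bisection on k so that corr(y + k * u, y) is close to the target."""
    lo = 0.0                              # k = 0 gives correlation 1
    for _ in range(iters):
        mid = (lo + hi) / 2
        r = pearson([yj + mid * uj for yj, uj in zip(y, u)], y)
        if r > target:
            lo = mid                      # too little noise: increase k
        else:
            hi = mid                      # too much noise: decrease k
    return (lo + hi) / 2

rng = random.Random(2022)
y = [rng.uniform(0, 10) for _ in range(1000)]   # stand-in Y values (hypothetical)
u = [rng.random() for _ in range(1000)]
k = calibrate_k(y, u)
x = [yj + k * uj for yj, uj in zip(y, u)]
r = pearson(x, y)
```

The search relies on the correlation being 1 at $k = 0$ and falling below the target for large $k$, which holds here because the noise is generated independently of Y.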

4.2. Sampling, Estimation, and Resampling

For any spatial population arising from the combination of spatial patterns and population sizes, $R = 10{,}000$ samples of fixed size $n = 0.1N$ were independently selected by means of LPM and DBSS with first-order inclusion probabilities given by (1). Moreover, the DBSS was also balanced with respect to the horizontal coordinates ($p_{j,1}$), the vertical coordinates ($p_{j,2}$), and their squared values ($p_{j,1}^2$, $p_{j,2}^2$), in such a way that the resulting samples were not only spread on the unit square, but also had approximately the same barycentre and the same dispersion as the whole population [17]. LPM and DBSS were performed using the R functions lpm1 and lcube, respectively, from the BalancedSampling library.
Then, for each sample $S_i$ selected at the $i$-th simulation run ($i = 1, \ldots, R$), the HT estimate of the population total $T_y$ was computed by means of
$$\hat{T}_i = \frac{T_x}{n} \sum_{j \in S_i} \frac{y_j}{x_j}$$
where $T_x$ was the total of the size variable in the population. Moreover, from the sample $S_i$, a PP $(\hat{y}_{i,j}, \hat{x}_{i,j},\ j = 1, \ldots, N)$ was created in accordance with each of the three criteria considered in Section 3, i.e., MPP, HDPP, and NNPP. Because NN techniques require efficient algorithms to perform the many checks necessary for identifying the closest units in either the geographical or the X space, we used the NND.hotdeck function in the R StatMatch library for constructing HDPPs and NNPPs. For constructing MPP, we used the basic R function sample.
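The HT estimate of the total under inclusion probabilities (1) reduces to $(T_x/n)\sum_{j \in S} y_j/x_j$, which takes one line of Python. In the sketch below the function name and the data are hypothetical; Y is taken exactly proportional to X to illustrate the efficiency argument of Section 2, since in that case every sample returns the true total.

```python
def ht_total(sample_y, sample_x, total_x):
    """HT estimate of the Y total when inclusion probabilities are n * x_j / T_x."""
    n = len(sample_y)
    return total_x / n * sum(yj / xj for yj, xj in zip(sample_y, sample_x))

x = [2.0, 1.0, 4.0, 3.0, 5.0, 5.0]   # hypothetical size values
y = [3.0 * xj for xj in x]           # Y exactly proportional to X
sample = [0, 4]                      # any fixed-size sample gives the same estimate here
estimate = ht_total([y[j] for j in sample], [x[j] for j in sample], sum(x))
```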
Then, the ability of the resulting PP to fit the real population was determined by means of the root of the average squared error
$$\mathrm{RASE}_i = \left[\frac{1}{N} \sum_{j=1}^{N} (y_j - \hat{y}_{i,j})^2\right]^{1/2}.$$
From each PP, $B = 1000$ bootstrap samples $S_{i,1}^*, \ldots, S_{i,B}^*$ were selected adopting the same sampling scheme adopted to select the original sample $S_i$, and for each bootstrap sample the HT estimate of the population total was computed by means of
$$\hat{T}_{i,b}^* = \frac{\hat{T}_{i,x}}{n} \sum_{j \in S_{i,b}^*} \frac{\hat{y}_{i,j}}{\hat{x}_{i,j}}, \quad b = 1, \ldots, B$$
where $\hat{T}_{i,x}$ was the total of the size variable in the $i$-th PP.
Finally, the bootstrap estimate of the relative standard error of the HT estimator of total was obtained by means of
$$\widehat{\mathrm{RSE}}_{i,B}^* = \frac{S_{i,B}^*}{\hat{T}_i} \times 100$$
where
$$S_{i,B}^{*2} = \frac{1}{B-1} \sum_{b=1}^{B} \left(\hat{T}_{i,b}^* - \bar{T}_{i,B}^*\right)^2$$
was the bootstrap estimate of variance and
$$\bar{T}_{i,B}^* = \frac{1}{B} \sum_{b=1}^{B} \hat{T}_{i,b}^*$$
was the bootstrap mean. The bootstrap distribution of the HT estimator of total was approximated by
$$\hat{F}_{i,B}^*(t) = \frac{1}{B} \sum_{b=1}^{B} I\left(\hat{T}_{i,b}^* \le t\right)$$
and the 95% bootstrap confidence interval for $T_y$ was given by $\left[\hat{T}_{i,B(25)}^*, \hat{T}_{i,B(975)}^*\right]$, where $\hat{T}_{i,B(m)}^*$ is the $m$-th order statistic from $\hat{T}_{i,1}^*, \ldots, \hat{T}_{i,B}^*$. The interval length $l_{i,B} = \hat{T}_{i,B(975)}^* - \hat{T}_{i,B(25)}^*$ was also considered.
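The resampling step and the bootstrap summaries can be sketched as follows. To keep the Python example self-contained, SRSWOR with equal inclusion probabilities (the case X = 1 of (1)) is used as a stand-in for LPM and DBSS, which are not reimplemented here; the pseudo-population is a list of random numbers and `t_hat`, the HT estimate from the original sample, is replaced by the PP total purely for illustration. All names are hypothetical.

```python
import random
import statistics

def bootstrap_totals(pp_y, n, B, rng):
    """Sorted HT totals from B SRSWOR resamples of size n (stand-in resampling design)."""
    N = len(pp_y)
    return sorted(N / n * sum(rng.sample(pp_y, n)) for _ in range(B))

def bootstrap_summary(totals, t_hat, lo_rank=25, hi_rank=975):
    """Relative standard error (%) and percentile interval from sorted bootstrap totals."""
    rse = statistics.stdev(totals) / t_hat * 100      # stdev uses the divisor B - 1
    lower, upper = totals[lo_rank - 1], totals[hi_rank - 1]  # 1-based order statistics
    return rse, lower, upper, upper - lower

rng = random.Random(7)
pp_y = [rng.uniform(0, 10) for _ in range(250)]   # stand-in pseudo-population
totals = bootstrap_totals(pp_y, n=25, B=1000, rng=rng)
t_hat = sum(pp_y)                                 # stand-in for the original HT estimate
rse, lower, upper, length = bootstrap_summary(totals, t_hat)
```

The default ranks 25 and 975 correspond to B = 1000, as in the simulation; a different B would need different ranks.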

4.3. Performance Indicators

For each combination of spatial patterns, population sizes, and PP criteria, the Monte Carlo distributions of the root of average squared error $\mathrm{RASE}_1, \ldots, \mathrm{RASE}_R$ were adopted to empirically determine the fitting performance of the criteria
$$\mathrm{FIT} = \frac{1}{R} \sum_{i=1}^{R} \mathrm{RASE}_i.$$
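A minimal Python sketch of the fitting indicator follows: RASE is computed for each pseudo-population and FIT is their average over the simulation runs. The function names and the toy values are hypothetical.

```python
import math

def rase(y, y_hat):
    """Root of the average squared error between true and pseudo-population values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))

def fit(y, pseudo_populations):
    """Average RASE over the pseudo-populations built in the simulation runs."""
    return sum(rase(y, y_hat) for y_hat in pseudo_populations) / len(pseudo_populations)

y = [1.0, 2.0, 9.0, 8.0, 5.0]              # true population values
pps = [[1.0, 1.0, 9.0, 9.0, 1.0],          # pseudo-population from run 1
       [1.0, 2.0, 9.0, 8.0, 5.0]]          # pseudo-population from run 2 (perfect fit)
fit_value = fit(y, pps)
```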
For each combination of spatial patterns and population sizes, the Monte Carlo distributions of the HT estimators of total $\hat{T}_1, \ldots, \hat{T}_R$ were adopted to empirically determine the actual distribution of the estimator
$$F(t) = \frac{1}{R} \sum_{i=1}^{R} I\left(\hat{T}_i \le t\right)$$
and the relative standard error
$$\mathrm{RSE} = \frac{V}{T_y} \times 100$$
where
$$V^2 = \frac{1}{R-1} \sum_{i=1}^{R} \left(\hat{T}_i - T_y\right)^2$$
was the Monte Carlo variance. Once the distribution and the precision of the HT estimator were empirically determined, the ability of PPB to mimic the actual distribution of the HT estimator was determined for each of the three PP criteria by means of their worst-fitting performance
$$\mathrm{WFB} = \max_{i = 1, \ldots, R} \mathrm{KS}_{i,B}$$
where, for each simulation run $i$ and each PP criterion, the fitting performance of the bootstrap distribution was quantified by the two-sample Kolmogorov–Smirnov (KS) statistic
$$\mathrm{KS}_{i,B} = \sup_{t \in \mathbb{R}} \left|F(t) - \hat{F}_{i,B}^*(t)\right|$$
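The two-sample KS statistic compares two empirical distribution functions. Since both are right-continuous step functions, the supremum of their absolute difference is attained at one of the observed values, so a direct Python sketch only needs to scan the pooled sample. The function names and the toy data are hypothetical.

```python
import bisect

def ecdf(sorted_values, t):
    """Empirical distribution function of a sorted sample evaluated at t."""
    return bisect.bisect_right(sorted_values, t) / len(sorted_values)

def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: sup over t of |F_a(t) - F_b(t)|."""
    a, b = sorted(a), sorted(b)
    # Both ECDFs are constant between observed values, so it is enough to
    # evaluate the difference at every observed value of the pooled sample.
    return max(abs(ecdf(a, t) - ecdf(b, t)) for t in a + b)

actual = [1.0, 2.0, 3.0, 4.0]       # stand-in for the Monte Carlo estimates
bootstrap = [3.0, 4.0, 5.0, 6.0]    # stand-in for the bootstrap estimates of one run
ks = ks_two_sample(actual, bootstrap)
```

The worst-fitting performance WFB is then simply the maximum of these values over the simulation runs.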
Moreover, the mimicking ability of PPs was quantified by the capacity of the 95% bootstrap confidence intervals to approach the nominal level of 95%, which was determined by means of their empirical coverage
$$\mathrm{C95}_B = \frac{1}{R} \sum_{i=1}^{R} I\left(\hat{T}_{i,B(25)}^* \le T_y \le \hat{T}_{i,B(975)}^*\right)$$
associated with their average length
$$\mathrm{L95}_B = \frac{1}{R} \sum_{i=1}^{R} l_{i,B}$$
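Empirical coverage and average length are plain averages over the simulation runs, as the following Python sketch shows. The function name and the four intervals are hypothetical, and the coverage is returned as a proportion rather than a percentage.

```python
def coverage_and_length(intervals, true_total):
    """Proportion of intervals containing the true total, and their average length."""
    R = len(intervals)
    covered = sum(1 for lower, upper in intervals if lower <= true_total <= upper)
    average_length = sum(upper - lower for lower, upper in intervals) / R
    return covered / R, average_length

# Hypothetical bootstrap intervals from four simulation runs; true total 100.
intervals = [(90.0, 110.0), (95.0, 120.0), (101.0, 130.0), (80.0, 99.0)]
c95, l95 = coverage_and_length(intervals, true_total=100.0)
```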
Finally, the capacity of the bootstrap distributions to reproduce the actual precision of the HT estimators was determined by comparing the empirical expectations of the bootstrap relative standard error estimators with the actual relative standard errors. Because the actual relative standard errors and their bootstrap estimates were likely to approach 0 as population and sample sizes increased, the ratio
$$RAT_B = \frac{E\left( \widehat{RSE}^{*}_{B} \right)}{RSE}$$
was adopted, where
$$E\left( \widehat{RSE}^{*}_{B} \right) = \frac{1}{R} \sum_{i=1}^{R} \widehat{RSE}^{*}_{i,B} .$$
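The precision indicators can be sketched as follows. The function names are hypothetical, and the bootstrap relative standard error estimates are taken as given inputs, since their computation depends on the bootstrap variance estimator defined earlier.

```python
import math


def relative_standard_error(mc_totals, true_total):
    """RSE (in percent) from R Monte Carlo estimates, with squared
    deviations taken around the true total and divisor R - 1."""
    r = len(mc_totals)
    v2 = sum((t - true_total) ** 2 for t in mc_totals) / (r - 1)
    return math.sqrt(v2) / true_total * 100


def rat(boot_rse_estimates, rse):
    """Ratio of the average bootstrap RSE estimate to the actual RSE."""
    return (sum(boot_rse_estimates) / len(boot_rse_estimates)) / rse
```

A ratio above 1 signals that the bootstrap overstates the actual relative standard error on average, i.e., that it is conservative.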

4.4. Results

Table 1 and Table 2 report the simulation results for the two spatial schemes (LPM and DBSS) adopted in the simulation. The results empirically confirmed the theoretical and practical considerations argued in the paper.
For both schemes and for all the spatial patterns, the RSEs quickly decreased as the population sizes increased, showing the presumable consistency of the HT estimator of population totals under these schemes. These findings agreed with [30], which theoretically proved the consistency of HT estimation in spatial populations under very simple schemes such as SRSWOR, but did not prove consistency under more complex spatially balanced schemes such as LPM and DBSS, owing to the lack of analytical expressions for the second-order inclusion probabilities. However, given the superiority of these schemes over SRSWOR in providing spatial balance, the authors concluded that “consistency presumably holds also for these schemes”.
For both schemes and for all spatial patterns, the fitting performance (FIT) of the NNPPs quickly improved as the population sizes increased, showing, also in this case, the presumable consistency of the NNI under these spatial schemes, as argued in Section 3 but not theoretically proven, once again owing to the complexity of the schemes and the lack of analytical expressions for the second-order inclusion probabilities. As also theoretically argued in Section 3, consistency did not hold for the HDPPs. Indeed, as the population sizes increased, the FIT indexes remained nearly unchanged, or even increased or weakly decreased. The fitting performance was even worse for MPPs, with FIT values that were about one and a half times those achieved by HDPPs (between 1.4 and 1.7 times in Table 1 and Table 2).
The fitting performance of PPs heavily impacted the ability of PPB distributions to fit the actual distributions of HT estimators. RAT values achieved under NNPPs were always greater than 1 but almost invariably smaller than those achieved under HDPPs and MPPs, and they approached 1 as population sizes increased, showing a tendency to be moderately conservative. On the other hand, RAT values achieved by HDPPs and MPPs showed a tendency to a large overestimation that unsuitably masked the actual precision of the two spatial strategies and that did not decrease, but sometimes even increased, as the population sizes increased.
The superiority of NNPPs was also demonstrated by the performance of the bootstrap confidence intervals, which for all the PP criteria showed coverages similar to or greater than the nominal level, but with average lengths that in the case of NNPPs were much smaller, in some cases even two to three times smaller, than those achieved by HDPPs and MPPs.
The same conclusion held for maximum values of the two-sample Kolmogorov–Smirnov statistic, as the WF values achieved under NNPPs were invariably smaller than those achieved under HDPPs and MPPs, and they quickly decreased as the population sizes increased while those achieved under HDPPs and MPPs remained quite stable.
Finally, in comparative terms, DBSS invariably outperformed LPM for all the performance criteria adopted in the study. This was a rather obvious result, as DBSS exploited the information provided by the spatial coordinates in addition to the information provided by the size variable, which was the sole information exploited by LPM.

5. Final Remarks

Spatial surveys, especially agricultural, forest, and environmental surveys, usually aim at estimating aggregate resources of ecological importance (e.g., crop, biomass, abundance, extent, and species richness) with quantifiable precision. They thus serve the purposes of so-called descriptive inference rather than those of so-called analytic inference, i.e., reconstructing, and hence explaining, the spatial process generating these aggregates [31]. Probably for this reason, spatial surveys have traditionally been approached from a design-based perspective, bypassing the complex task of modelling spatial phenomena, viewing these phenomena as fixed, and attributing uncertainty only to sampling (e.g., [32,33]). In this scenario, national forest inventories constitute the most prominent examples, as design-based inference has been adopted in Scandinavian countries since the beginning of the past century to estimate forest attributes at the country level (e.g., [34]). In recent years, even the mapping of ecological resources, traditionally approached in the realm of model-based geostatistical procedures (e.g., [8]), has been approached from a design-based perspective by [9], who derived the design-based properties of the NNI.
Because in a design-based framework the properties of any estimator are completely determined by the sampling design, the choice of design is crucial. Regarding the sampling schemes usually adopted in spatial surveys, in this paper we have emphasised the importance of spatial balance, i.e., the capacity of the sampling schemes to evenly spread locations over the study region in such a way that no portion of the region is over- or under-represented. At the same time, we have also outlined the drawbacks entailed by the use of spatially balanced schemes, i.e., (a) the lack of convergence of these schemes to CPS, which precludes the exploitation of finite population central limit theorems, and (b) the presence of some second-order inclusion probabilities equal to 0, which precludes the unbiased estimation of variance.
Because the use of PPB seems to be a viable solution to overcome both issues, the focus of the paper has switched to the choice of PPs capable of providing good representations of the spatial populations from which balanced samples are selected. At this step, the results by [9] have been crucial. The authors proved the design-based consistency of the NNI under mild conditions regarding the spatial populations and the sampling schemes. The conditions on populations simply require the smoothness of the Y values in neighbouring locations, which well approximates the theoretical condition of local continuity, while the conditions on the sampling scheme simply require an asymptotic spatial balance that is satisfied even by SRSWOR. Therefore, when using spatial schemes explicitly tailored for achieving spatial balance, the consistency of NNI should hold a fortiori. For these reasons, the NNI of real populations has been proposed as a criterion for constructing PPs in spatial surveys, referred to as NNPPs. The intuition behind this proposal is that if the NNPPs converge to the true populations, bootstrap distributions arising from these maps should converge to the actual distributions of the estimators. In practice, the consistency of NNPPs suggests that, as the population and sample sizes increase, a convergence similar to that proven by [3] should occur, i.e., the PPB distributions should converge to the actual distribution at the cost of requiring an asymptotic spatial balance of the design adopted to select the original and bootstrap samples, instead of requiring the convergence to CPS. The theoretical proof of this conjecture, which has been empirically confirmed by the simulation study of Section 4, is one target of our future work.
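To make the nearest-neighbour construction concrete, a minimal sketch is given here. It assumes that the pseudo-population assigns to every unit the Y value observed at its nearest sampled unit, that distances are Euclidean, and that ties are resolved in favour of the unit appearing first in the sample; the function name and argument layout are hypothetical.

```python
import math


def nearest_neighbour_pseudo_population(coords, sample_idx, y_sample):
    """Build a pseudo-population of Y values by nearest-neighbour interpolation.

    coords     : list of (x, y) coordinates for all N population units
    sample_idx : indices (into coords) of the n sampled units
    y_sample   : Y values observed at the sampled units, in the same order
    """
    pseudo = []
    for point in coords:
        # Position, within the sample, of the sampled unit closest to `point`;
        # min() keeps the first position in case of equal distances.
        nearest = min(
            range(len(sample_idx)),
            key=lambda k: math.dist(point, coords[sample_idx[k]]),
        )
        pseudo.append(y_sample[nearest])
    return pseudo
```

Sampled units are their own nearest neighbours, so their observed values are reproduced exactly, while every unsampled unit inherits the value of the closest sampled location. The brute-force search costs N × n distance evaluations, which is adequate for populations of the size considered in the simulation.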
On the other hand, the achievement of a central limit theorem universally valid for low-entropy spatially balanced schemes seems to be too wide an objective. Probably, the solution would be to prove central limit theorems for single designs as recently done by [18] for the determinantal sampling designs and by [21] for the tessellation pivotal method.

Author Contributions

Conceptualisation, L.F.; methodology, L.F.; software, R.M.D.B. and A.M.; validation, R.M.D.B., A.M. and S.F.; formal analysis, L.F.; investigation, L.F., S.F., R.M.D.B. and A.M.; data curation, R.M.D.B. and A.M.; writing—original draft preparation, L.F.; writing—review and editing, R.M.D.B., A.M. and S.F.; visualisation, A.M.; supervision, S.F.; project administration, L.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Let $U_N$ be a spatial population of $N > 2$ units and $\mathcal{S}_n$ be the family of the $\binom{N}{n}$ possible samples of size $n$ that can be selected from $U_N$. Denote by $P_{SRS}$ the probability measure induced by SRSWOR onto $\mathcal{S}_n$, in such a way that $P_{SRS}(S) = \binom{N}{n}^{-1}$ for each $S \in \mathcal{S}_n$.
Now, consider a successive scheme that performs $n$ sequential drawings by excluding at each drawing the nearest neighbour of the previously selected units [35]. Obviously, in this case, the family of possible samples reduces to $\mathcal{S}_n^{(0)} \subset \mathcal{S}_n$. If $n < N/2$, from straightforward combinatorial considerations it follows that
$$\mathrm{card}\, \mathcal{S}_n^{(0)} = \frac{N (N-2) \cdots (N - 2(n-1))}{n!}$$
in such a way that $P_{SRS}^{(0)}(S) = 1 / \mathrm{card}\, \mathcal{S}_n^{(0)}$ for each sample $S \in \mathcal{S}_n^{(0)}$, where $P_{SRS}^{(0)}$ denotes the probability measure induced by the rejective scheme onto $\mathcal{S}_n^{(0)}$.
Accordingly, the Hellinger distance between the two designs is given by
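$$\begin{aligned}
d_H\left( P_{SRS}, P_{SRS}^{(0)} \right) &= \sum_{S \in \mathcal{S}_n} \left( \sqrt{P_{SRS}(S)} - \sqrt{P_{SRS}^{(0)}(S)} \right)^2 = 2 \left( 1 - \sum_{S \in \mathcal{S}_n^{(0)}} \sqrt{P_{SRS}(S)\, P_{SRS}^{(0)}(S)} \right) \\
&= 2 \left( 1 - \sum_{S \in \mathcal{S}_n^{(0)}} \frac{1}{\sqrt{\binom{N}{n}\, \mathrm{card}\, \mathcal{S}_n^{(0)}}} \right) = 2 \left( 1 - \sqrt{\frac{\mathrm{card}\, \mathcal{S}_n^{(0)}}{\binom{N}{n}}} \right) \\
&= 2 \left( 1 - \sqrt{\frac{N (N-2) \cdots (N - 2(n-1))}{N (N-1) \cdots (N - (n-1))}} \right)
\end{aligned}$$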
Because
$$\frac{N (N-2) \cdots (N - 2(n-1))}{N (N-1) \cdots (N - (n-1))} \le \frac{N - 2(n-1)}{N - (n-1)} = \frac{N(1-2p) + 2}{N(1-p) + 1}$$
it follows that
$$d_H\left( P_{SRS}, P_{SRS}^{(0)} \right) \ge 2 \left( 1 - \sqrt{\frac{N(1-2p) + 2}{N(1-p) + 1}} \right) \ge 1 - \frac{N(1-2p) + 2}{N(1-p) + 1} = \frac{Np - 1}{N(1-p) + 1}$$
where $p = n/N$ denotes the sampling fraction.
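The last inequality in the lower bound for $d_H$ may deserve a one-line justification. Writing $x$ for the ratio $\{N(1-2p)+2\}/\{N(1-p)+1\}$ (a shorthand introduced here only for convenience), which lies in $[0, 1]$ when $n \ge 1$ and $p < 1/2$, one has

```latex
2\left(1 - \sqrt{x}\right) - (1 - x) = 1 - 2\sqrt{x} + x = \left(1 - \sqrt{x}\right)^{2} \ge 0 ,
\qquad\text{so that}\qquad
2\left(1 - \sqrt{x}\right) \ge 1 - x = \frac{Np - 1}{N(1-p) + 1} .
```

The final equality follows by putting $1 - x$ over the common denominator $N(1-p) + 1$.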
Therefore,
$$\lim_{N \to \infty} d_H\left( P_{SRS}, P_{SRS}^{(0)} \right) \ge \lim_{N \to \infty} \frac{Np - 1}{N(1-p) + 1} = \frac{p}{1-p}$$
i.e., as the population size increases, the sequence of designs $P_{SRS}^{(0)}$ selecting a constant fraction $p$ of units cannot converge to SRSWOR for any $0 < p < 1/2$. Therefore, because SRSWOR converges to CPS, by the triangle inequality it follows that $P_{SRS}^{(0)}$ does not converge to a CPS with equal first-order inclusion probabilities.
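The bound can be checked numerically. The sketch below evaluates the distance from its closed form, $2\{1 - \sqrt{\mathrm{card}\,\mathcal{S}_n^{(0)} / \binom{N}{n}}\}$, as a product of the factors $(N-2k)/(N-k)$ and compares it with the lower bound $(Np-1)/\{N(1-p)+1\}$, written here in the equivalent integer form $(n-1)/(N-n+1)$. Function names are hypothetical.

```python
import math


def hellinger_nn_exclusion(N, n):
    """Distance between SRSWOR and the nearest-neighbour-excluding scheme,
    computed as 2 * (1 - sqrt(card / binom(N, n))) for n < N / 2."""
    if not 0 < n < N / 2:
        raise ValueError("the closed form requires 0 < n < N / 2")
    ratio = 1.0
    for k in range(n):
        # k-th factor of N(N-2)...(N-2(n-1)) over N(N-1)...(N-(n-1)).
        ratio *= (N - 2 * k) / (N - k)
    return 2.0 * (1.0 - math.sqrt(ratio))


def lower_bound(N, n):
    """(Np - 1) / (N(1 - p) + 1) with p = n / N, in integer form."""
    return (n - 1) / (N - n + 1)
```

For the population sizes used in the simulation (N = 250, 500, 1000 with p = 0.1) the distance stays well above the bound, which itself approaches p/(1 − p) ≈ 0.111 from below.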
Similarly, consider the one-per-stratum sampling (OPSS) scheme in which, supposing $N$ to be a multiple of $n$, the population $U_N$ is stratified into $n$ clusters of $N/n$ neighbouring locations and a location is randomly selected within each cluster [36]. In this case, the family of possible samples $\mathcal{S}_n^{(0)} \subset \mathcal{S}_n$ reduces to the samples of size $n$ with a unique location per cluster. Therefore $\mathrm{card}\, \mathcal{S}_n^{(0)} = (N/n)^n$ and $P_{OPSS}(S) = 1 / \mathrm{card}\, \mathcal{S}_n^{(0)}$ for each sample $S \in \mathcal{S}_n^{(0)}$, where $P_{OPSS}$ denotes the probability measure induced by OPSS onto $\mathcal{S}_n^{(0)}$. Accordingly, the Hellinger distance between the two designs is given by
$$d_H\left( P_{SRS}, P_{OPSS} \right) = 2 \left( 1 - \sqrt{\frac{\mathrm{card}\, \mathcal{S}_n^{(0)}}{\binom{N}{n}}} \right) = 2 \left( 1 - \sqrt{\left( \frac{N}{n} \right)^{n} \binom{N}{n}^{-1}} \right)$$
For N and n sufficiently large, owing to the Stirling formula
$$\left( \frac{N}{n} \right)^{n} \binom{N}{n}^{-1} \sim \sqrt{\frac{2 \pi n (N-n)}{N}} \left( \frac{N-n}{N} \right)^{N-n} = \sqrt{2 \pi p (1-p) N}\, (1-p)^{N(1-p)}$$
from which it follows that
$$\lim_{N \to \infty} \frac{\mathrm{card}\, \mathcal{S}_n^{(0)}}{\binom{N}{n}} = \sqrt{2 \pi p (1-p)} \lim_{N \to \infty} \sqrt{N}\, (1-p)^{N(1-p)} = 0$$
in such a way that
$$\lim_{N \to \infty} d_H\left( P_{SRS}, P_{OPSS} \right) = 2$$
i.e., as the population size increases, the sequence of OPSS designs selecting a constant fraction $p$ of units cannot converge to SRSWOR for any $0 < p < 1$. Therefore, because SRSWOR converges to CPS, by the triangle inequality it follows that $P_{OPSS}$ does not converge to a CPS with equal first-order inclusion probabilities.
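The OPSS case can be checked in the same way. The sketch below computes the exact ratio $(N/n)^n / \binom{N}{n}$ with integer arithmetic, the resulting distance, and the Stirling approximation $\sqrt{2\pi p(1-p)N}\,(1-p)^{N(1-p)}$ for comparison. Function names are hypothetical.

```python
import math


def opss_ratio(N, n):
    """Exact value of card / binom(N, n) = (N / n)^n / binom(N, n)."""
    if N % n != 0:
        raise ValueError("N must be a multiple of n")
    return (N // n) ** n / math.comb(N, n)


def hellinger_opss(N, n):
    """Distance between SRSWOR and OPSS: 2 * (1 - sqrt(card / binom(N, n)))."""
    return 2.0 * (1.0 - math.sqrt(opss_ratio(N, n)))


def stirling_ratio(N, n):
    """Stirling approximation of the same ratio, with p = n / N."""
    p = n / N
    return math.sqrt(2 * math.pi * p * (1 - p) * N) * (1 - p) ** (N * (1 - p))
```

With p = 0.1 the ratio is already of the order of 1e-10 at N = 250 and of 1e-40 at N = 1000, so the distance is numerically indistinguishable from its limiting value 2 at the largest population size used in the simulation.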

References

  1. Efron, B. Bootstrap methods: Another look at the jackknife. Ann. Stat. 1979, 7, 1–26.
  2. Mashreghi, Z.; Haziza, D.; Léger, C. A survey of bootstrap methods in finite population sampling. Stat. Surv. 2016, 10, 1–52.
  3. Conti, P.; Marella, D.; Mecatti, F.; Andreis, F. A unified principled framework for resampling based on pseudo-populations: Asymptotic theory. Bernoulli 2020, 26, 1044–1069.
  4. Bickel, P.J.; Freedman, D.A. Some asymptotic theory for the bootstrap. Ann. Stat. 1981, 9, 1196–1217.
  5. Shao, J.; Tu, D. The Jackknife and Bootstrap; Springer: New York, NY, USA, 1995.
  6. Hájek, J. Asymptotic theory of rejective sampling with varying probabilities from a finite population. Ann. Math. Stat. 1964, 35, 1491–1523.
  7. Brown, J.A.; Robertson, B.L.; McDonald, T. Spatially balanced sampling: Applications to environmental surveys. Procedia Environ. Sci. 2015, 27, 6–9.
  8. Cressie, N.A.C. Statistics for Spatial Data; Revised ed.; Wiley: New York, NY, USA, 1993.
  9. Fattorini, L.; Marcheselli, M.; Pisani, C.; Pratelli, L. Design-based properties of the nearest neighbour spatial interpolator and its bootstrap mean squared error estimator. Biometrics 2021. Online ahead of print.
  10. Nusser, S.M.; House, C.C. Sampling, data collection, and estimation in agricultural surveys. In Handbook of Statistics 29A. Sample Surveys: Designs, Methods and Applications; Pfeffermann, D., Rao, C.R., Eds.; Elsevier: Amsterdam, The Netherlands, 2009; pp. 471–486.
  11. Grafström, A.; Lundström, N.L.P.; Schelin, L. Spatially balanced sampling through the pivotal method. Biometrics 2012, 68, 514–520.
  12. Baffetta, F.; Fattorini, L.; Franceschi, S.; Corona, P. Design-based approach to k-nearest neighbours technique for coupling field and remotely sensed data in forest surveys. Remote Sens. Environ. 2009, 113, 463–475.
  13. Stevens, D.J.; Olsen, A.R. Spatially balanced sampling of natural resources. J. Am. Stat. Assoc. 2004, 99, 262–278.
  14. Särndal, C.E.; Swensson, B.; Wretman, J. Model Assisted Survey Sampling; Springer: New York, NY, USA, 1992.
  15. Isaki, C.T.; Fuller, W.A. Survey design under the regression superpopulation model. J. Am. Stat. Assoc. 1982, 77, 89–96.
  16. Grafström, A. Spatial correlated Poisson samplings. J. Stat. Plan. Inference 2012, 14, 139–147.
  17. Grafström, A.; Tillé, Y. Doubly balanced spatial sampling with spreading and restitution of auxiliary totals. Environmetrics 2013, 24, 120–131.
  18. Loonis, V.; Mary, X. Determinantal sampling designs. J. Stat. Plan. Inference 2019, 199, 60–88.
  19. INSEE. Handbook of Spatial Analysis; Institut National de la Statistique et des Études Économiques: Montrouge, France, 2018.
  20. Jauslin, R.; Tillé, Y. Spatial spread sampling using weakly associated vectors. J. Agric. Biol. Environ. Stat. 2020, 25, 431–451.
  21. Chauvet, G.; Le Gleut, R. Inference under pivotal sampling: Properties, variance estimation and application to tessellation for spatial sampling. Scand. J. Stat. 2021, 48, 108–131.
  22. Berger, Y. Rate of convergence to normal distribution for the Horvitz–Thompson estimator. J. Stat. Plan. Inference 1998, 67, 209–226.
  23. Bertail, P.; Chautru, E.; Clémençon, S. Empirical processes in survey sampling with (conditional) Poisson designs. Scand. J. Stat. 2017, 44, 97–111.
  24. Boistard, H.; Lopuhaä, H.P.; Ruiz-Gazen, A. Functional central limit theorems for single-stage sampling designs. Ann. Stat. 2017, 45, 1728–1758.
  25. Brändén, P.; Jonasson, J. Negative dependence in sampling. Scand. J. Stat. 2012, 39, 830–838.
  26. Holmberg, A. A bootstrap approach to probability proportional to size sampling. In Proceedings of the Section on Survey Research Methods, Dallas, VA, USA, 9–13 August 1998; pp. 378–383.
  27. Sverchkov, M.; Pfeffermann, D. Prediction of finite population totals based on the sample distribution. Surv. Methodol. 2004, 30, 79–92.
  28. Tobler, W.R. A Computer Movie Simulating Urban Growth in the Detroit Region. J. Econ. Geogr. 1970, 46, 234–240.
  29. Fattorini, L.; Corona, P.; Chirici, G.; Pagliarella, M.C. Design-based strategies for sampling spatial units from regular grids with applications to forest surveys, land use and land cover estimation. Environmetrics 2015, 26, 216–228.
  30. Fattorini, L.; Marcheselli, M.; Pisani, C.; Pratelli, L. Design-based consistency of the Horvitz–Thompson estimator under spatial sampling with applications to environmental surveys. Spat. Stat. 2020, 35, 100404.
  31. Smith, T.M.F. Biometrika Centenary: Sample surveys. Biometrika 2001, 88, 67–134.
  32. Thompson, S.K. Sampling, 2nd ed.; Wiley: New York, NY, USA, 2002.
  33. Gregoire, T.G.; Valentine, H.T. Sampling Strategies for Natural Resources and the Environment; Chapman & Hall/CRC: Boca Raton, FL, USA, 2008.
  34. Tomppo, L.M.; Gschwantner, R.E.; McRoberts, R.E. National Forest Inventories: Pathways for Common Reporting; Springer: Heidelberg, Germany, 2010.
  35. Fattorini, L. Applying the Horvitz–Thompson criterion in complex designs: A computer intensive perspective for estimating inclusion probabilities. Biometrika 2006, 93, 269–278.
  36. Breidt, F.J. Markov chain designs for one-per-stratum sampling. Surv. Methodol. 1995, 21, 63–70.
Figure 1. Map of the artificial surface adopted to generate the finite spatial populations of points.
Figure 2. Maps of the largest spatial populations of 1000 points adopted in the simulation.
Table 1. Values of relative standard errors (RSE) of the Horvitz–Thompson estimator of totals, pseudo-population fitting indexes (FIT), ratios of expectations of bootstrap RSE estimators to the actual values (RAT_B), coverages of the 0.95 bootstrap confidence intervals (C95_B) and expectations (in parentheses) of their lengths (L95_B), and the worst of the two-sample Kolmogorov–Smirnov statistics (WF_B). The values are computed from R = 10,000 samples and B = 1000 bootstrap samples independently selected by means of the local pivotal method (LPM) for each spatial pattern (SP), each pseudo-population (PP) and each population size N = 250, 500, and 1000 with a sampling fraction of 0.1.

| SP | N | RSE | PP | FIT | RAT_B | C95_B (L95_B) | WF_B |
|---|---|---|---|---|---|---|---|
| Regular | 250 | 7.68 | MPP | 4.02 | 1.51 | 90.67 (373.77) | 1.00 |
| | | | HDPP | 2.80 | 1.55 | 92.8 (386.22) | 0.99 |
| | | | NNPP | 1.39 | 1.50 | 99.47 (372.31) | 0.82 |
| | 500 | 4.96 | MPP | 4.10 | 1.64 | 92.26 (538.57) | 1.00 |
| | | | HDPP | 2.73 | 1.65 | 93.47 (543.8) | 0.99 |
| | | | NNPP | 0.97 | 1.33 | 99.99 (436.74) | 0.72 |
| | 1000 | 3.45 | MPP | 4.23 | 1.66 | 91.76 (791.25) | 1.00 |
| | | | HDPP | 2.78 | 1.67 | 93.73 (793.61) | 1.00 |
| | | | NNPP | 0.67 | 1.18 | 100 (559.54) | 0.56 |
| Random | 250 | 7.82 | MPP | 3.96 | 1.48 | 91.72 (371.86) | 1.00 |
| | | | HDPP | 2.65 | 1.64 | 92.11 (386.42) | 1.00 |
| | | | NNPP | 1.36 | 1.54 | 98.9 (359.55) | 0.85 |
| | 500 | 5.13 | MPP | 4.04 | 1.59 | 92.33 (527.48) | 1.00 |
| | | | HDPP | 2.65 | 1.68 | 93.5 (543.99) | 0.99 |
| | | | NNPP | 0.92 | 1.3 | 100 (425.09) | 0.67 |
| | 1000 | 3.58 | MPP | 3.98 | 1.65 | 93.75 (757.8) | 1.00 |
| | | | HDPP | 2.67 | 1.69 | 93.97 (765.29) | 0.99 |
| | | | NNPP | 0.64 | 1.20 | 100 (536.11) | 0.54 |
| Cluster | 250 | 6.15 | MPP | 3.40 | 1.73 | 92.04 (348.86) | 1.00 |
| | | | HDPP | 2.34 | 1.77 | 94.53 (353.22) | 1.00 |
| | | | NNPP | 0.45 | 1.04 | 100 (209.73) | 0.48 |
| | 500 | 4.33 | MPP | 3.38 | 1.68 | 91.40 (479.39) | 1.00 |
| | | | HDPP | 2.23 | 1.70 | 94.25 (476.90) | 0.99 |
| | | | NNPP | 0.31 | 1.01 | 100 (287.67) | 0.22 |
| | 1000 | 3.01 | MPP | 3.39 | 1.73 | 90.47 (685.49) | 1.00 |
| | | | HDPP | 2.26 | 1.73 | 95.24 (684.41) | 0.98 |
| | | | NNPP | 0.23 | 1.02 | 100 (403.87) | 0.17 |
| Trended | 250 | 8.35 | MPP | 4.12 | 1.73 | 91.76 (364.24) | 1.00 |
| | | | HDPP | 2.45 | 1.67 | 92.08 (355.36) | 0.99 |
| | | | NNPP | 1.30 | 1.46 | 98.9 (307.95) | 0.85 |
| | 500 | 5.71 | MPP | 3.78 | 1.83 | 92.96 (479.22) | 1.00 |
| | | | HDPP | 2.31 | 1.74 | 92.80 (457.10) | 1.00 |
| | | | NNPP | 0.84 | 1.25 | 99.99 (326.84) | 0.72 |
| | 1000 | 3.85 | MPP | 3.79 | 1.91 | 93.66 (695.12) | 1.00 |
| | | | HDPP | 2.39 | 1.85 | 93.9 (672.29) | 1.00 |
| | | | NNPP | 0.59 | 1.17 | 100 (426.69) | 0.53 |
Table 2. Values of relative standard errors (RSE) of the Horvitz–Thompson estimator of totals, pseudo-population fitting indexes (FIT), ratios of expectations of bootstrap RSE estimators to the actual values (RAT_B), coverages of the 0.95 bootstrap confidence intervals (C95_B) and expectations (in parentheses) of their lengths (L95_B), and the worst of the two-sample Kolmogorov–Smirnov statistics (WF_B). The values are computed from R = 10,000 samples and B = 1000 bootstrap samples independently selected by means of the doubly balanced spatial sampling (DBSS) for each spatial pattern (SP), each pseudo-population (PP) and each population size N = 250, 500, and 1000 with a sampling fraction of 0.1.

| SP | N | RSE | PP | FIT | RAT_B | C95_B (L95_B) | WF_B |
|---|---|---|---|---|---|---|---|
| Regular | 250 | 5.48 | MPP | 4.02 | 2.04 | 93.15 (362.55) | 1.00 |
| | | | HDPP | 2.80 | 2.11 | 97.36 (374) | 0.98 |
| | | | NNPP | 1.42 | 1.77 | 95.06 (313.69) | 0.93 |
| | 500 | 3.00 | MPP | 4.10 | 2.64 | 95.19 (524.86) | 1.00 |
| | | | HDPP | 2.73 | 2.64 | 98.05 (526.33) | 0.98 |
| | | | NNPP | 0.99 | 1.67 | 97.32 (332.87) | 0.89 |
| | 1000 | 1.67 | MPP | 4.23 | 3.38 | 96.33 (779.86) | 1.00 |
| | | | HDPP | 2.78 | 3.38 | 98.45 (775.83) | 0.99 |
| | | | NNPP | 0.69 | 1.54 | 99.17 (351.90) | 0.85 |
| Random | 250 | 5.19 | MPP | 3.96 | 2.16 | 94.61 (362.74) | 1.00 |
| | | | HDPP | 2.64 | 2.33 | 96.93 (366.47) | 0.99 |
| | | | NNPP | 1.40 | 2.01 | 94.29 (300.56) | 0.97 |
| | 500 | 3.02 | MPP | 4.04 | 2.63 | 95.33 (514.21) | 1.00 |
| | | | HDPP | 2.65 | 2.74 | 97.86 (523.63) | 0.99 |
| | | | NNPP | 0.95 | 1.68 | 97.92 (316.92) | 0.91 |
| | 1000 | 1.7 | MPP | 3.98 | 3.42 | 96.68 (747.25) | 1.00 |
| | | | HDPP | 2.67 | 3.46 | 98.39 (748.44) | 1.00 |
| | | | NNPP | 0.66 | 1.66 | 99.70 (338.19) | 0.97 |
| Cluster | 250 | 3.98 | MPP | 3.40 | 2.56 | 93.16 (336.18) | 1.00 |
| | | | HDPP | 2.33 | 2.61 | 97.72 (339.91) | 0.99 |
| | | | NNPP | 0.50 | 1.18 | 97.17 (152.54) | 0.94 |
| | 500 | 2.10 | MPP | 3.39 | 3.34 | 95.25 (463.40) | 1.00 |
| | | | HDPP | 2.23 | 3.39 | 98.39 (464.74) | 0.99 |
| | | | NNPP | 0.31 | 1.08 | 100 (149.2) | 0.46 |
| | 1000 | 1.06 | MPP | 3.39 | 4.82 | 95.06 (670.18) | 1.00 |
| | | | HDPP | 2.26 | 4.85 | 99.03 (672.88) | 1.00 |
| | | | NNPP | 0.23 | 1.13 | 100 (157.43) | 0.38 |
| Trended | 250 | 6.10 | MPP | 4.12 | 2.21 | 93.36 (341.83) | 1.00 |
| | | | HDPP | 2.46 | 2.21 | 97.29 (343.09) | 0.99 |
| | | | NNPP | 1.35 | 1.79 | 93.52 (276.32) | 0.99 |
| | 500 | 3.70 | MPP | 3.77 | 2.71 | 96.02 (460.72) | 1.00 |
| | | | HDPP | 2.31 | 2.62 | 97.50 (445.38) | 0.99 |
| | | | NNPP | 0.87 | 1.53 | 99.36 (260.24) | 0.93 |
| | 1000 | 2.05 | MPP | 3.79 | 3.48 | 96.89 (675.68) | 1.00 |
| | | | HDPP | 2.39 | 3.39 | 98.33 (658.15) | 0.99 |
| | | | NNPP | 0.61 | 1.50 | 99.90 (291.60) | 0.79 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
