Article

Multivariate Modeling for Spatio-Temporal Radon Flux Predictions

1 National Future Center of Biodiversity, 90133 Palermo, Italy
2 DES-Sect. of Mathematics and Statistics, University of Salento, 73100 Lecce, Italy
3 National Center of High Performance Computing, Big Data and Quantum Computing, 40121 Bologna, Italy
* Author to whom correspondence should be addressed.
Entropy 2023, 25(7), 1104; https://doi.org/10.3390/e25071104
Submission received: 9 February 2023 / Revised: 16 June 2023 / Accepted: 27 June 2023 / Published: 24 July 2023

Abstract

Nowadays, various fields in environmental sciences require appropriate techniques to exploit the information given by multivariate spatial or spatio-temporal observations. In particular, radon flux data, which are of high interest for monitoring greenhouse gas emissions and assessing human exposure to indoor radon, are determined by the deposits of uranium and radium (precursor elements). Furthermore, they are also affected by various atmospheric variables, such as humidity, temperature, precipitation and evapotranspiration. To this aim, a significant role can be played by the tools of multivariate geostatistics, which support the modeling and prediction of the variables under study. In this paper, the spatio-temporal distribution of radon flux densities over the Veneto Region (Italy) and its estimation at unsampled points in space and time are discussed. In particular, the spatio-temporal linear coregionalization model is identified on the basis of the joint diagonalization of the empirical covariance matrices evaluated at different spatio-temporal lags and is used to produce predicted radon flux maps for different months. Maps of the probability that the radon flux density in the upcoming months is greater than each of three historical statistics are then built. This might be of interest especially in summer months, when the risk of radon exhalation is higher. Moreover, a comparison with alternative models in the univariate and multivariate context is provided.

1. Introduction

Radon is a colorless, odorless, tasteless, inert radioactive gas derived from the decay of uranium, a radioactive element found in small quantities in all sediments and rocks. According to the IARC (International Agency for Research on Cancer) and the WHO (World Health Organization), radon pollution represents the second leading cause of lung cancer after smoking. Since radon is present in the depths of the Earth in gaseous phase, it reaches the surface after interacting with other natural elements, such as uranium and radium (precursor elements). Moreover, various atmospheric variables, such as humidity, temperature and precipitation [1,2], also affect the transport of radon to the surface. In the literature, different variables related to radon migration from the soil to the atmosphere have been used and various methods based on spatial regression techniques (i.e., Geographically Weighted Regression, Empirical Bayesian Regression Kriging, Machine Learning and Forecast Regression) have been proposed [3,4,5,6,7]. However, these contributions focus on the analysis of the spatial radon distribution and disregard the opportunity to propose a spatio-temporal model in which the spatial and temporal dimensions of the investigated phenomenon, as well as their possible interactions, are considered. In this context, geostatistics offers useful techniques and tools to address estimation problems in space and time, not only in the univariate case but also in the multivariate one, which represents an innovative approach for the study of radon data. For the latter, it is crucial to estimate the spatio-temporal multiple covariance function and define an apt multivariate correlation model which can ensure reliable inference on the radon flux variable. Indeed, efforts are focused on the estimation and modeling of the matrix-valued covariance function, which explains the direct and cross linear dependence in space or in space–time among the variables.
Besides models for multivariate spatial data which were extensively explored [8,9,10,11], various contributions can be found in the literature regarding multivariate spatio-temporal data modeling starting from the early nineties [12,13,14,15,16,17,18]. However, the linear coregionalization model (LCM) developed in space and in space–time (ST-LCM) is sufficiently flexible computationally speaking to be applied extensively for a large variety of scientific fields [9,19,20,21,22,23]. Li et al. [24] proposed a methodology to evaluate the appropriateness of several common assumptions, such as symmetry, separability and linear model of coregionalization, on multivariate covariance functions in the spatio-temporal context, while Choi et al. [13] proposed an ST-LCM where the multivariate spatio-temporal process was expressed as a linear combination of independent Gaussian processes in space–time with mean zero and a separable spatio-temporal covariance. Apanasovich & Genton [25] considered some solutions to the symmetry problem; moreover, they proposed a class of cross-covariance functions for multivariate random fields based on the work of Gneiting [26]. The maximum likelihood estimation of heterotopic spatio-temporal models with spatial LCM components and temporal dynamics was developed by Fassó and Finazzi [27]. A GSLib [28] routine for cokriging was properly modified in De Iaco et al. [29] to incorporate the ST-LCM, previously developed [15] using the generalized product–sum variogram model. In [30,31,32], an automatic procedure for fitting the ST-LCM was presented and some computational aspects, analytically described by a main flow-chart, were discussed. Simultaneous diagonalization of the sample matrix variograms or the sample covariance matrices was used to isolate the basic components of an ST-LCM and it has been illustrated how nearly simultaneous diagonalization of the covariance matrices simplifies their modeling.
This paper aims at presenting a spatio-temporal multivariate geostatistical modeling approach, based on joint diagonalization of the empirical covariance matrices evaluated at different spatio-temporal lags. Thus, the possibility to consider a reduced number of uncorrelated variables (lower than the number of observed variables) and to separately model the spatio-temporal evolution of these uncorrelated components represents a substantial simplification in multivariate modeling. A space–time linear coregionalization model (ST-LCM) with proper parametric models for the latent components was fitted to the matrix-valued covariance function estimated for the radon flux and three relevant atmospheric variables, which include evapotranspiration, minimum humidity and mean temperature. The analysis highlighted how to identify the space–time components and choose the corresponding model by evaluating some characteristics of the components, such as separability and type of non-separability. Apart from the practical importance of this study in the specific field of application, it empirically proves the flexibility of modeling the matrix-valued covariance structure using the ST-LCM when more than two or three variables are involved. This is made possible through an approach based on the joint diagonalization of sample covariance matrices at different lags. Indeed, it enables analysts to overcome the complexity of fitting the ST-LCM, particularly when the number of variables to be analyzed increases, as it does not require modeling all direct and cross-covariance functions. Furthermore, through the aforementioned procedure, analysts can easily identify the basic components of the ST-LCM and model each component according to its empirical characteristics, which may necessitate the use of different classes of covariance models featuring various types of non-separability [33].
In the following, after an introduction of the theoretical framework of the multivariate spatio-temporal random function and its features (Section 2), the ST-LCM, its assumptions and appropriate statistical tests are presented (Section 3), then techniques for prediction and risk assessment maps are introduced (Section 4). Finally, the spatio-temporal multivariate exploratory analysis concerning radon flux and three atmospheric variables (temperature, humidity and evapotranspiration) in the Veneto Region (Italy), and the subsequent modeling step are provided (Section 5 and Section 6). Predictions of the primary variable (radon flux) are obtained through spatio-temporal cokriging (Section 7) for some future months. Then, risk maps showing the probability that the radon flux in a summer month exceeds the value of some chosen statistics (25th percentile, average, median), computed on the corresponding historical data, are produced (Section 8). The choice of summer month is justified by the need to investigate the risk of radon exhalation when warmer climatic conditions in the study area favor the increase of radon flux.
It is worth pointing out that the prediction results of this multivariate study might be of interest for their reflections in public health and for planning consequent remediation strategies.

2. Multivariate Spatio-Temporal Random Function

Let $\mathbf{Z}(\mathbf{u}) = [Z_1(\mathbf{u}), \ldots, Z_p(\mathbf{u})]^T$ be a vector of $p$ spatio-temporal random functions (STRF) defined on the domain $D \times T \subseteq \mathbb{R}^{d+1}$, with $d \leq 3$; then
$$\{\mathbf{Z}(\mathbf{u}),\ \mathbf{u} = (\mathbf{s}, t) \in D \times T \subseteq \mathbb{R}^{d+1}\},$$
represents a multivariate spatio-temporal random function (MSTRF), where $\mathbf{s} = (s_1, \ldots, s_d)$ are the coordinates of the spatial domain $D \subseteq \mathbb{R}^{d}$ and $t$ is the coordinate of the temporal domain $T \subseteq \mathbb{R}$.
In the following, the MSTRF will be denoted by $\mathbf{Z}$ and its components by $Z_i$. The $p$ STRF $Z_i$, $i = 1, \ldots, p$, are associated with the spatio-temporal variables under study and are called coregionalized variables [34]. The observations $z_i(\mathbf{u}_\alpha)$, $i = 1, \ldots, p$, $\alpha = 1, \ldots, N_i$, of the $p$ variables $Z_i$ at the points $\mathbf{u}_\alpha \in D \times T$ are considered a finite realization of an MSTRF $\mathbf{Z}$, and $N_i$ is the number of spatio-temporal points for the variable $Z_i$.
An MSTRF $\mathbf{Z}$, with $p$ components, is second-order stationary if:
  • for any STRF $Z_i$, $i = 1, \ldots, p$,
    $$E[Z_i(\mathbf{u})] = m_i, \quad \mathbf{u} \in D \times T, \quad i = 1, \ldots, p;$$
  • for any pair of STRF $Z_i$ and $Z_j$, $i, j = 1, \ldots, p$, the cross-covariance $C_{ij}$ depends only on the spatio-temporal separation vector $\mathbf{h} = (\mathbf{h}_s, h_t)$ between the points $\mathbf{u}$ and $\mathbf{u} + \mathbf{h}$:
    $$C_{ij}(\mathbf{h}) = E[(Z_i(\mathbf{u} + \mathbf{h}) - m_i)(Z_j(\mathbf{u}) - m_j)] = E[Z_i(\mathbf{u} + \mathbf{h})\, Z_j(\mathbf{u})] - m_i m_j,$$
    where $\mathbf{u}, \mathbf{u} + \mathbf{h} \in D \times T$, $i, j = 1, \ldots, p$.
    The function $C_{ij}(\cdot)$ is also called direct covariogram, if $i = j$, or cross-covariogram, if $i \neq j$.
There exist several physical phenomena for which neither the variance nor the covariance exists; however, it is possible to assume the existence of the variogram.
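For reference, a standard way to write the cross-variogram and its link with the cross-covariance under second-order stationarity is sketched below. The symbol $\gamma_{ij}$ does not appear elsewhere in this section and is introduced here only for illustration.

```latex
\gamma_{ij}(\mathbf{h}) = \frac{1}{2}\, E\left\{ [Z_i(\mathbf{u}+\mathbf{h}) - Z_i(\mathbf{u})]\,[Z_j(\mathbf{u}+\mathbf{h}) - Z_j(\mathbf{u})] \right\},
\qquad
\gamma_{ij}(\mathbf{h}) = C_{ij}(\mathbf{0}) - \frac{1}{2}\left[ C_{ij}(\mathbf{h}) + C_{ij}(-\mathbf{h}) \right].
```

The second identity follows by expanding the product and using $C_{ji}(\mathbf{h}) = C_{ij}(-\mathbf{h})$; it only holds when the covariances exist, whereas the first expression can remain well defined without them.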

2.1. Separability for an MSTRF

The cross-covariance $C_{ij}$ for a second-order stationary MSTRF $\mathbf{Z}$ is separable if:
$$C_{ij}(\mathbf{h}) = \rho(\mathbf{h})\, a_{ij}, \quad \mathbf{h} = (\mathbf{h}_s, h_t) \in D \times T, \quad i, j = 1, \ldots, p, \tag{3}$$
where $a_{ij}$ are the elements of a $(p \times p)$ positive definite matrix and $\rho(\cdot)$ is a correlation function. In this case:
$$\frac{C_{ij}(\mathbf{h})}{C_{ij}(\mathbf{h}')} = \frac{\rho(\mathbf{h})}{\rho(\mathbf{h}')}, \quad \mathbf{h}, \mathbf{h}' \in D \times T, \quad i, j = 1, \ldots, p,$$
hence the changes of the cross-covariances, with respect to the changes of the vector $\mathbf{h}$, do not depend on the pair of STRF $Z_i$, $Z_j$.
The cross-covariance $C_{ij}$ for a second-order stationary MSTRF $\mathbf{Z}$ is fully separable if:
$$C_{ij}(\mathbf{h}_s, h_t) = \rho_S(\mathbf{h}_s)\, \rho_T(h_t)\, a_{ij}, \quad (\mathbf{h}_s, h_t) \in D \times T, \quad i, j = 1, \ldots, p, \tag{4}$$
where $a_{ij}$ are the elements of a $(p \times p)$ positive definite matrix, $\rho_S(\cdot)$ is a spatial correlation function and $\rho_T(\cdot)$ is a temporal correlation function. In the literature, many statistical tests for separability have been proposed, based on parametric models [35,36,37], likelihood ratio tests and subsampling [38] or spectral methods [39,40].
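The ratio property of a separable cross-covariance can be checked numerically. The following is a minimal sketch, assuming exponential correlation functions, a one-dimensional spatial lag and a hypothetical matrix of coefficients $a_{ij}$; none of these choices comes from the case study.

```python
import math

# Hypothetical correlation models (exponential), chosen only for illustration.
def rho_s(h_s, spatial_range=50.0):
    return math.exp(-abs(h_s) / spatial_range)

def rho_t(h_t, temporal_range=3.0):
    return math.exp(-abs(h_t) / temporal_range)

# Hypothetical symmetric positive definite matrix of coefficients a_ij.
A = [[1.0, 0.6],
     [0.6, 2.0]]

def cross_cov(i, j, h_s, h_t):
    """Fully separable cross-covariance: rho_S(h_s) * rho_T(h_t) * a_ij."""
    return rho_s(h_s) * rho_t(h_t) * A[i][j]

h, h_prime = (10.0, 1.0), (30.0, 2.0)

# Ratio C_ij(h) / C_ij(h') for every pair (i, j).
ratios = [cross_cov(i, j, *h) / cross_cov(i, j, *h_prime)
          for i in range(2) for j in range(2)]

# Under separability every ratio equals rho(h) / rho(h'), whatever (i, j) is.
expected = (rho_s(h[0]) * rho_t(h[1])) / (rho_s(h_prime[0]) * rho_t(h_prime[1]))
```

All four entries of `ratios` coincide with `expected`, which is what the ratio identity above states: the coefficient $a_{ij}$ cancels out.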

2.2. Symmetry for an MSTRF

The cross-covariance $C_{ij}$ of a second-order stationary MSTRF $\mathbf{Z}$, with $p$ components, is symmetric if:
$$C_{ij}(\mathbf{h}) = C_{ij}(-\mathbf{h}), \quad \mathbf{h} \in D \times T, \quad i, j = 1, \ldots, p,$$
or, equivalently, if:
$$C_{ij}(\mathbf{h}) = C_{ji}(\mathbf{h}), \quad \mathbf{h} \in D \times T, \quad i, j = 1, \ldots, p.$$
The cross-covariance $C_{ij}$ of a second-order stationary MSTRF $\mathbf{Z}$, with $p$ components, is fully symmetric if:
$$C_{ij}(\mathbf{h}_s, h_t) = C_{ij}(-\mathbf{h}_s, h_t), \quad (\mathbf{h}_s, h_t) \in D \times T, \quad i, j = 1, \ldots, p,$$
or, equivalently,
$$C_{ij}(\mathbf{h}_s, h_t) = C_{ij}(\mathbf{h}_s, -h_t), \quad (\mathbf{h}_s, h_t) \in D \times T, \quad i, j = 1, \ldots, p.$$
Atmospheric, environmental and geophysical processes are often under the influence of prevailing air or water flows, resulting in a lack of full symmetry [26,41,42]. Regarding the relationships between separability, symmetry, stationarity and the LCM in the general class of cross-covariances of an MSTRF $\mathbf{Z}$, it is worth recalling that if a cross-covariance is separable, then it is symmetric. However, in general, the converse is not true. Moreover, the hypothesis of full separability is a special case of full symmetry. Several tests to check the symmetry and separability of direct and cross-covariances can be found in the literature [24,40,43,44,45].
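The implication from separability to symmetry can be written out in one line. The sketch below assumes, as is usual for a positive definite matrix of covariance coefficients, that the matrix of the $a_{ij}$ is symmetric ($a_{ij} = a_{ji}$):

```latex
C_{ij}(\mathbf{h}) = \rho(\mathbf{h})\, a_{ij} = \rho(\mathbf{h})\, a_{ji} = C_{ji}(\mathbf{h}),
\qquad \mathbf{h} \in D \times T, \quad i, j = 1, \ldots, p .
```

The converse fails because a symmetric cross-covariance may still combine several correlation functions with different coefficients, so that it cannot be factorized into a single $\rho(\mathbf{h})$ times $a_{ij}$.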

2.3. Direct and Cross-Covariance Estimators

Structural analysis requires covariance and variogram estimation, starting from the observed values of the variables under study. Hence, it is essential to introduce the estimators of the direct and cross-covariances, which are defined below.
Let $A_i$, $i = 1, \ldots, p$, be the sets of points of the domain $D \times T \subseteq \mathbb{R}^{d+1}$ where the $p$ variables have been observed: $A_i = \{\mathbf{u}_\alpha = (\mathbf{s}, t)_\alpha,\ \alpha = 1, \ldots, N_i\}$, $i = 1, \ldots, p$; then the estimators of the direct and cross-covariances are built as follows:
$$\widehat{C}_{ij}(\mathbf{r}_s, r_t) = \frac{1}{|L_{ij}(\mathbf{r}_s, r_t)|} \sum_{L_{ij}(\mathbf{r}_s, r_t)} [Z_i(\mathbf{s} + \mathbf{h}_s, t + h_t) - \widehat{m}_i]\,[Z_j(\mathbf{s}, t) - \widehat{m}_j],$$
with $i, j = 1, \ldots, p$ ($i = j$ for the sample direct covariance and $i \neq j$ for the sample cross-covariance), where
$$\widehat{m}_i = \frac{1}{N_i} \sum_{\alpha = 1}^{N_i} Z_i(\mathbf{s}, t)_\alpha, \quad i = 1, \ldots, p,$$
while $|L_{ij}(\mathbf{r}_s, r_t)|$ is the cardinality of the following set:
$$L_{ij}(\mathbf{r}_s, r_t) = \{(\mathbf{s} + \mathbf{h}_s, t + h_t) \in A_i,\ (\mathbf{s}, t) \in A_j : \|\mathbf{r}_s - \mathbf{h}_s\| < Tol(\mathbf{r}_s)\ \text{and}\ |r_t - h_t| < Tol(r_t)\},$$
where $\mathbf{r}_s$ is the spatial separation vector (lag) and $r_t$ is the temporal lag, while $Tol(\mathbf{r}_s)$ and $Tol(r_t)$ are specific regions of tolerance around $\mathbf{r}_s$ and $r_t$, respectively.
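The estimator above can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: each observation is stored as a hypothetical triple `(s, t, z)` with `s` a coordinate tuple, the tolerances are treated as scalar radii, and all pairs are scanned by brute force.

```python
import math

def cross_covariance(data_i, data_j, r_s, r_t, tol_s, tol_t):
    """Sample cross-covariance between variables i and j at lag (r_s, r_t).

    data_i, data_j: lists of (s, t, z), with s a tuple of spatial coordinates.
    r_s: spatial lag vector (tuple); r_t: temporal lag.
    tol_s, tol_t: scalar tolerances (an assumption made for this sketch).
    """
    m_i = sum(z for _, _, z in data_i) / len(data_i)
    m_j = sum(z for _, _, z in data_j) / len(data_j)
    total, count = 0.0, 0
    for s_head, t_head, z_head in data_i:        # point (s + h_s, t + h_t) in A_i
        for s_tail, t_tail, z_tail in data_j:    # point (s, t) in A_j
            h_s = tuple(a - b for a, b in zip(s_head, s_tail))
            h_t = t_head - t_tail
            if math.dist(r_s, h_s) < tol_s and abs(r_t - h_t) < tol_t:
                total += (z_head - m_i) * (z_tail - m_j)
                count += 1
    # No pair falls in the tolerance region: the estimate is undefined.
    return total / count if count else float("nan")

# Tiny hypothetical data set: one variable observed at two points, same time.
sample = [((0.0,), 0, 1.0), ((1.0,), 0, 3.0)]
c_zero_lag = cross_covariance(sample, sample, (0.0,), 0, 0.5, 0.5)
c_unit_lag = cross_covariance(sample, sample, (1.0,), 0, 0.5, 0.5)
```

With the two hypothetical observations, the zero-lag value is the sample variance (1.0), while the unit-lag value is built from the single pair whose separation matches the lag (-1.0). For realistic data sizes a spatial index or lag binning would replace the double loop.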

3. Linear Coregionalization Model in Space–Time

The ST-LCM is based on the hypothesis that each direct or cross-covariance function can be represented as a linear combination of some basic models, and each direct or cross-covariance function must be built using the same basic models. Let $\mathbf{Z}$ be a second-order stationary MSTRF with $p$ components, $Z_i$, $i = 1, \ldots, p$. The covariance matrix $\mathbf{C}$ for the second-order stationary $\mathbf{Z}$ is built as follows:
$$\mathbf{C}(\mathbf{h}) = \sum_{l=1}^{L} \mathbf{B}_l\, c_l(\mathbf{h}), \tag{5}$$
where $\mathbf{h} = (\mathbf{h}_s, h_t)$, $\mathbf{u} = (\mathbf{s}, t)$, $\mathbf{u} + \mathbf{h} = (\mathbf{s} + \mathbf{h}_s, t + h_t)$, $c_l$ are covariance functions, called basic structures, and the matrices $\mathbf{B}_l = [b_{ij}^{l}]$, called coregionalization matrices, must be positive definite, where the coefficients $b_{ij}^{l}$ satisfy the following property:
$$b_{ij}^{l} = b_{ji}^{l}, \quad i, j = 1, \ldots, p.$$
Thus, on the basis of this assumption, $C_{ij}(\mathbf{h}) = C_{ij}(-\mathbf{h})$ and $C_{ij}(\mathbf{h}) = C_{ji}(\mathbf{h})$, with $i, j = 1, \ldots, p$, $i \neq j$.
The basic structures $c_l(\mathbf{h}) = c_l(\mathbf{h}_s, h_t)$ of the ST-LCM (5) can be modelled by using several space–time covariance models known in the literature, according to their empirical characteristics, which may necessitate the use of different classes of covariance models featuring various types of non-separability [33].
Fitting an ST-LCM to the data requires the identification of the space–time basic covariance functions and the corresponding positive definite coregionalization matrices. However, this is often a hard step to tackle. An approach based on the joint diagonalization of a set of covariance matrices computed for several spatio-temporal lags allows the ST-LCM parameters to be determined in a simple way.
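To make the structure of the ST-LCM (5) concrete, the following sketch assembles the matrix-valued covariance for $p = 2$ variables and $L = 2$ basic structures. The two separable exponential structures, their ranges and the coregionalization matrices are hypothetical values chosen for illustration, not the fitted model of this study.

```python
import math

# Hypothetical basic structures c_l(h_s, h_t): a short-range and a long-range one.
def c_short(h_s, h_t):
    return math.exp(-abs(h_s) / 20.0) * math.exp(-abs(h_t) / 2.0)

def c_long(h_s, h_t):
    return math.exp(-abs(h_s) / 100.0) * math.exp(-abs(h_t) / 12.0)

basic_structures = [c_short, c_long]

# Hypothetical symmetric positive definite coregionalization matrices B_l.
B = [
    [[1.0, 0.5], [0.5, 1.5]],
    [[2.0, -0.8], [-0.8, 1.0]],
]

def st_lcm_covariance(h_s, h_t):
    """Return the (p x p) matrix C(h) = sum_l B_l * c_l(h) as nested lists."""
    p = len(B[0])
    return [[sum(B[l][i][j] * basic_structures[l](h_s, h_t) for l in range(len(B)))
             for j in range(p)]
            for i in range(p)]

C_zero = st_lcm_covariance(0.0, 0.0)     # equals B_1 + B_2 since c_l(0, 0) = 1
C_lag = st_lcm_covariance(10.0, 1.0)
C_lag_reversed = st_lcm_covariance(-10.0, -1.0)
```

Because every $\mathbf{B}_l$ is symmetric and each basic structure used here is an even function of the lag, `C_lag` is a symmetric matrix and coincides with `C_lag_reversed`, in line with the symmetry property stated above.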

3.1. Checking the Model Assumptions

In several environmental applications [10], the cross-covariance function is not symmetric, as for example, in time series in the presence of a delay effect, as well as in hydrology, for the cross-correlation between a variable and its derivative, such as water head and transmissivity [46]. Hence, this assumption should be tested before fitting an ST-LCM.
The appropriateness of the symmetry assumption of an ST-LCM can be tested by using the methodology proposed by Li et al. [24] and discussed in De Iaco et al. [43]. This is based on the asymptotic joint normality of the sample space–time cross-covariance estimators. Given a set $\Lambda$ of user-chosen spatio-temporal lags and the cardinality $c$ of $\Lambda$, let $\mathbf{G} = \{C_{ij}(\mathbf{h}_s, h_t) : (\mathbf{h}_s, h_t) \in \Lambda,\ i, j = 1, \ldots, p\}$ be a vector of $c p^2$ cross-covariances at the spatio-temporal lags $\mathbf{k} = (\mathbf{h}_s, h_t)$ in $\Lambda$. Moreover, let $\widehat{C}_{ij}(\mathbf{h}_s, h_t)$ be the estimator of $C_{ij}(\mathbf{h}_s, h_t)$ based on the sample data in the spatio-temporal domain $D \times T_n$, where $D$ represents the spatial domain and $T_n = \{1, \ldots, n\}$ the temporal one, and define $\widehat{\mathbf{G}}_n = \{\widehat{C}_{ij}(\mathbf{h}_s, h_t) : (\mathbf{h}_s, h_t) \in \Lambda,\ i, j = 1, \ldots, p\}$. Under the assumptions given in Li et al. [24], $|T_n|^{1/2} (\widehat{\mathbf{G}}_n - \mathbf{G}) \xrightarrow{d} N_{c p^2}(\mathbf{0}, \boldsymbol{\Sigma})$, where $|T_n|\, \mathrm{Cov}(\widehat{\mathbf{G}}_n, \widehat{\mathbf{G}}_n)$ converges to $\boldsymbol{\Sigma}$. The tests for symmetry properties can then be based on the following statistic
$$TS = |T_n|\, (\mathbf{A} \widehat{\mathbf{G}}_n)^T (\mathbf{A} \boldsymbol{\Sigma} \mathbf{A}^T)^{-1} (\mathbf{A} \widehat{\mathbf{G}}_n) \xrightarrow{d} \chi^2_a,$$
where $a$ is the row rank of the matrix $\mathbf{A}$, which is such that $\mathbf{A} \mathbf{G} = \mathbf{0}$ under the null hypothesis.
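The computation of the test statistic can be sketched as follows. The sketch assumes that NumPy is available and that an estimate of $\boldsymbol{\Sigma}$ has already been obtained by some other means; the numerical values of the sample cross-covariances, of $\boldsymbol{\Sigma}$ and of the number of time points are hypothetical.

```python
import numpy as np

def symmetry_test_statistic(G_hat, Sigma, A, n_times):
    """Return (TS, a), with TS = |T_n| (A G)^T (A Sigma A^T)^{-1} (A G), a = rank(A)."""
    G_hat = np.asarray(G_hat, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    A = np.atleast_2d(np.asarray(A, dtype=float))
    contrast = A @ G_hat                      # A G_hat, zero in expectation under H0
    middle = A @ Sigma @ A.T                  # A Sigma A^T
    ts = n_times * contrast @ np.linalg.solve(middle, contrast)
    return float(ts), int(np.linalg.matrix_rank(A))

# Hypothetical example: one lag h, testing H0: C_12(h) = C_21(h).
G_hat = [0.42, 0.35]                          # [C_12(h), C_21(h)], made-up values
Sigma = [[0.5, 0.1], [0.1, 0.5]]              # assumed already estimated
A = [[1.0, -1.0]]                             # contrast with A G = 0 under H0
ts, dof = symmetry_test_statistic(G_hat, Sigma, A, n_times=196)
# ts is then compared with a chi-square quantile with `dof` degrees of freedom
# (about 3.841 at the 5% level when dof = 1).
```

In this made-up case the statistic is about 1.2, below the 5% critical value for one degree of freedom, so symmetry at that lag would not be rejected.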
Moreover, the choice of modeling the MSTRF $\mathbf{Z}$ by an ST-LCM is based on the prior assumption that the multivariate correlation structure of the variables under study is characterized by $L$ ($L \geq 2$) scales of spatio-temporal variability. On the other hand, if the multivariate correlation of a set of variables does not present different scales of variability ($L = 1$), then the cross-covariance functions are separable, as in (3). Hence, as in the spatial context [10], a space–time intrinsic coregionalization model can be considered. Obviously, this last model is just a particular case ($L = 1$) of the ST-LCM defined in (5) and it is much more restrictive than the linear model [34], since it requires that all the variables have the same correlation function, with possible changes in the sill values. Note that, if a cross-covariance is separable, then it is symmetric.
Once the basic components $c_l$, $l = 1, \ldots, L$, are estimated, it is necessary to proceed with their modeling. The choice of a reasonable class of models to be fitted to each empirical component $\widehat{c}_l$ can be supported by analyzing the characteristics of the empirical basic covariance surfaces [47], such as the type of non-separability through the computation of the sample non-separability ratios [14], or by applying a statistical test [48,49].
Remark 1. 
In the ST-LCM, each component is represented as a linear combination of latent, uncorrelated univariate spatio-temporal processes. However, the smoothness of any component defaults to that of the roughest latent process, and thus the standard approach does not admit individually distinct smoothness properties, unless structural zeros are imposed on the latent process coefficients [9].

3.2. Some Computational Aspects on the ST-LCM Fitting

The model in (5) can be fitted by following a simplified procedure. First, the empirical basic covariance functions are detected through the joint diagonalization of the sample $(p \times p)$ covariance matrices $\widehat{\mathbf{C}}(\mathbf{h}_s, h_t)_k = [\widehat{C}_{ij}(\mathbf{h}_s, h_t)_k]$ evaluated at different space–time lags $(\mathbf{h}_s, h_t)_k = (\mathbf{h}_{s_k}, h_{t_k})$, with $k = 1, \ldots, K$. In particular, after determining the $(p \times p)$ orthogonal matrix $\boldsymbol{\Psi}$, such that
$$\boldsymbol{\Psi}\, \widehat{\mathbf{C}}(\mathbf{h}_s, h_t)_k\, \boldsymbol{\Psi}^T = \boldsymbol{\Delta}(\mathbf{h}_s, h_t)_k, \quad k = 1, \ldots, K,$$
where $\boldsymbol{\Delta}_k$ are the diagonal $(p \times p)$ matrices, the sample basic uncorrelated components $\widehat{c}_l$ (estimates of $c_l$, $l = 1, \ldots, p$) are obtained by extracting all the diagonal entries across the $K$ matrices $\boldsymbol{\Delta}_k$ [50,51]. Joint diagonalization with respect to the lags implies that the matrix $\boldsymbol{\Psi}$ does not depend on the lags. Many algorithms exist for joint diagonalization; see Illner et al. [52] for an overview. In this study, the algorithm based on Jacobi rotations [53] and included in the R package JADE [54] is used. Note that only the $L \leq p$ basic components $\widehat{c}_l$, $l = 1, \ldots, L$, are included in the ST-LCM; the selected components are the ones that exhibit distinct spatio-temporal scales of variability (corresponding to the lag where the surface decays).
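The idea of joint diagonalization by Jacobi rotations can be illustrated with a compact NumPy sketch. It is not the routine of the R package JADE: it is an independent, simplified version written for illustration, which symmetrizes the input matrices (an assumption) and stops when all rotation angles in a sweep are negligible. The test matrices are synthetic.

```python
import numpy as np

def joint_diagonalize(mats, tol=1e-10, max_sweeps=100):
    """Approximate joint diagonalization of K symmetric (p x p) matrices.

    Returns (Psi, deltas) such that Psi @ mats[k] @ Psi.T = deltas[k],
    with every deltas[k] as close to diagonal as the rotations allow.
    """
    A = np.array(mats, dtype=float)
    A = 0.5 * (A + A.transpose(0, 2, 1))      # symmetrize each matrix
    _, p, _ = A.shape
    V = np.eye(p)
    for _ in range(max_sweeps):
        rotated = False
        for i in range(p - 1):
            for j in range(i + 1, p):
                h1 = A[:, i, i] - A[:, j, j]
                h2 = 2.0 * A[:, i, j]
                ton = h1 @ h1 - h2 @ h2
                toff = 2.0 * (h1 @ h2)
                # Angle minimizing the summed squared (i, j) off-diagonal entries.
                theta = 0.25 * np.arctan2(toff, ton)
                c, s = np.cos(theta), np.sin(theta)
                if abs(s) > tol:
                    rotated = True
                    R = np.eye(p)
                    R[i, i] = R[j, j] = c
                    R[i, j], R[j, i] = -s, s
                    A = R.T @ A @ R           # rotate all K matrices at once
                    V = V @ R
        if not rotated:
            break
    return V.T, A

# Synthetic check: matrices sharing the same (hidden) orthogonal basis Q.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
mats = np.array([Q @ np.diag(rng.uniform(0.1, 2.0, 4)) @ Q.T for _ in range(6)])
Psi, deltas = joint_diagonalize(mats)
components = np.array([np.diag(d) for d in deltas])   # row k: p components at lag k
```

Column $l$ of `components` collects the values of the $l$-th empirical component across the $K$ lags, which is the quantity that is subsequently inspected and modelled.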
These selected basic components are then modelled by adopting appropriate classes of models (according to the empirical characteristics of each basic component). Finally, the coregionalization matrices are computed and their admissibility is checked. A reasonable class of models to be fitted to each component can be assessed according to some characteristics, such as full symmetry ($c_l(\mathbf{h}_s, h_t) = c_l(-\mathbf{h}_s, h_t) = c_l(\mathbf{h}_s, -h_t)$) and full separability ($c_l(\mathbf{h}_s, h_t) = c_l(\mathbf{h}_s)\, c_l(h_t) / c_l(\mathbf{0}, 0)$), which the sample covariance surfaces might satisfy. In the case of non-separability, the type of non-separability can be studied through the computation of the sample non-separability ratios, as in [14], which measure the discrepancy between the sample covariance function (supposed non-separable) and the one corresponding to the separable case (i.e., the product of the spatial and temporal marginals). Some statistical tests given in [48,49] can be easily applied toward this aim, without assuming any specific distribution for the data.
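A simple way to compute such ratios is sketched below. It assumes that the ratio compares the sample covariance with the product of its spatial and temporal marginals divided by the value at the origin, following the separability condition stated above; the exact definition adopted in [14] may differ in detail. The lag indexing and the test surface are hypothetical.

```python
def nonseparability_ratios(c_hat, n_spatial, n_temporal):
    """Ratios c(h_s, h_t) * c(0, 0) / (c(h_s, 0) * c(0, h_t)) on a lag grid.

    c_hat: dict mapping (i, k) -> sample covariance at the i-th spatial lag
           and k-th temporal lag, with index 0 denoting the zero lag.
    """
    c00 = c_hat[(0, 0)]
    ratios = {}
    for i in range(1, n_spatial):
        for k in range(1, n_temporal):
            separable_value = c_hat[(i, 0)] * c_hat[(0, k)] / c00
            ratios[(i, k)] = c_hat[(i, k)] / separable_value
    return ratios

# Hypothetical separable surface: sill 2.0, geometric decay in space and time.
surface = {(i, k): 2.0 * 0.8 ** i * 0.5 ** k for i in range(4) for k in range(3)}
ratios = nonseparability_ratios(surface, n_spatial=4, n_temporal=3)
```

For this separable surface every ratio equals one; with real sample surfaces, ratios systematically above or below one point to a non-separable component and help in choosing the class of models.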
Finally, the elements $b_{ij}^{l}$ of $\mathbf{B}_l$, $l = 1, \ldots, L$, of the model in (5) are estimated. They correspond to the ratio between the contribution of $\widehat{C}_{ij}$ at the $l$-th scale of variability and $c_l(\mathbf{0}, 0)$, i.e.,
$$b_{ij}^{l} = \frac{[\widehat{C}_{ij}(\mathbf{h}_s, h_t)_{l-1}] - [\widehat{C}_{ij}(\mathbf{h}_s, h_t)_{l}]}{[c_l(\mathbf{0}, 0)]}, \quad l = 1, \ldots, L,$$
where $\widehat{C}_{ij}(\mathbf{h}_s, h_t)_0 = \widehat{C}_{ij}(\mathbf{0}, 0)$, with $i, j = 1, \ldots, p$, $i \neq j$.
The positive definiteness condition of the matrices $\mathbf{B}_l$, $l = 1, \ldots, L$, is verified by checking that their eigenvalues are non-negative. In particular, after performing the spectral decomposition of these matrices,
$$\mathbf{B}_l = \mathbf{V}_l \boldsymbol{\Lambda}_l \mathbf{V}_l^T, \quad l = 1, \ldots, L,$$
where $\mathbf{V}_l$ are the eigenvector matrices and $\boldsymbol{\Lambda}_l$ the diagonal matrices of the eigenvalues, any negative eigenvalues are set equal to zero. In this case, the transformed coregionalization matrix $\widetilde{\mathbf{B}}_l$ is derived through the following expression
$$\widetilde{\mathbf{B}}_l = \mathbf{V}_l \widetilde{\boldsymbol{\Lambda}}_l \mathbf{V}_l^T, \quad l = 1, \ldots, L,$$
where the diagonal matrix of the eigenvalues $\widetilde{\boldsymbol{\Lambda}}_l$ differs from the original $\boldsymbol{\Lambda}_l$ in that zeros replace the negative eigenvalues.
Further details on computational aspects can be found in [55].
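The admissibility correction described above (spectral decomposition, negative eigenvalues set to zero, matrix rebuilt) can be sketched with NumPy as follows. The input matrix is a hypothetical example with one negative eigenvalue; the function name is chosen for this illustration only.

```python
import numpy as np

def make_admissible(B):
    """Replace negative eigenvalues of a symmetric matrix with zeros and rebuild it."""
    B = np.asarray(B, dtype=float)
    B = 0.5 * (B + B.T)                       # guard against tiny asymmetries
    eigenvalues, V = np.linalg.eigh(B)        # B = V diag(eigenvalues) V^T
    clipped = np.clip(eigenvalues, 0.0, None) # zeros in place of negative values
    return V @ np.diag(clipped) @ V.T

# Hypothetical coregionalization matrix with eigenvalues 2.2 and -0.2.
B_raw = np.array([[1.0, 1.2],
                  [1.2, 1.0]])
B_fixed = make_admissible(B_raw)
```

Here `B_fixed` has all entries equal to 1.1 and eigenvalues 2.2 and 0, so it is positive semi-definite. A matrix that already has non-negative eigenvalues is returned unchanged up to rounding.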

4. Prediction and Risk Assessment in Space–Time

For prediction purposes, various cokriging algorithms can be found in the literature [56,57]. As a natural extension of spatial ordinary cokriging to the spatio-temporal context, the linear space–time predictor can be written as
$$\widehat{Z}_i(\mathbf{u}) = \sum_{j=1}^{p} \sum_{\alpha=1}^{N_j} w_{\alpha}^{ij}(\mathbf{u})\, Z_j(\mathbf{u}_\alpha), \quad i = 1, \ldots, p, \tag{10}$$
where $\mathbf{u} = (\mathbf{s}, t) \in D \times T$ is a point in the space–time domain, $\mathbf{u}_\alpha = (\mathbf{s}, t)_\alpha \in D \times T$, $\alpha = 1, \ldots, N_j$, are the data points in the same domain and $w_{\alpha}^{ij}(\mathbf{u})$ are the weights assigned to the value of the $j$th variable, $j = 1, \ldots, p$, at the $\alpha$th data point, to predict the $i$th variable, $i = 1, \ldots, p$, at the point $\mathbf{u} \in D \times T$.
The predicted space–time random vector $\widehat{\mathbf{Z}}(\mathbf{u})$ at $\mathbf{u} \in D \times T$ is such that each component $\widehat{Z}_i(\mathbf{u})$, $i = 1, \ldots, p$, is obtained by using all the information at the data points $\mathbf{u}_\alpha = (\mathbf{s}, t)_\alpha \in D \times T$, $\alpha = 1, \ldots, N_j$, $j = 1, \ldots, p$.
The weights $w_{\alpha}^{ij}(\mathbf{u})$, $\alpha = 1, \ldots, N_j$, are determined by ensuring the unbiasedness condition for the predictor $\widehat{Z}_i(\mathbf{u})$ and the efficiency condition, obtained by minimizing the error variance [34].
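In the usual ordinary cokriging formulation, where the means $m_j$ are unknown and may differ across variables, the unbiasedness condition translates into the following constraints on the weights (stated here as a standard result, not as a detail taken from the case study):

```latex
\sum_{\alpha=1}^{N_i} w_{\alpha}^{ii}(\mathbf{u}) = 1,
\qquad
\sum_{\alpha=1}^{N_j} w_{\alpha}^{ij}(\mathbf{u}) = 0, \quad j \neq i .
```

With these constraints, $E[\widehat{Z}_i(\mathbf{u})] = \sum_{j} m_j \sum_{\alpha} w_{\alpha}^{ij}(\mathbf{u}) = m_i$, so the predictor of the $i$th variable is unbiased whatever the values of the means.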
Similarly, for environmental risk assessment, the formalism of the multivariate spatio-temporal indicator random function (MSTIRF) and the corresponding predictor have to be introduced. Let
$$\mathbf{I}(\mathbf{u}, \mathbf{z}) = [I_1(\mathbf{u}, z_1), \ldots, I_p(\mathbf{u}, z_p)]^T,$$
be a vector of $p$ spatio-temporal indicator random functions (STIRF) defined on the domain $D \times T \subseteq \mathbb{R}^{d+1}$, with $d \leq 3$, as follows
$$I_i(\mathbf{u}, z_i) = \begin{cases} 1 & \text{if } Z_i \text{ is not greater (or not less) than the threshold } z_i, \\ 0 & \text{otherwise,} \end{cases}$$
where $\mathbf{z} = [z_1, \ldots, z_p]^T$. Then
$$\{\mathbf{I}(\mathbf{u}, \mathbf{z}),\ \mathbf{u} = (\mathbf{s}, t) \in D \times T \subseteq \mathbb{R}^{d+1}\},$$
represents an MSTIRF. In other words, for each coregionalized variable $Z_i$, with $i = 1, \ldots, p$, an STIRF $I_i$ can be appropriately defined. Then the linear space–time predictor (10) can be easily written in terms of the indicator random variables $I_i$, $i = 1, \ldots, p$. If the spatio-temporal correlation structure of an MSTIRF is modelled by using the ST-LCM, cokriging can be used to produce risk assessment maps for one or all the variables under study. If $p = 1$, the dependence of the indicator variable is characterized by the corresponding indicator covariance of $I$, $C_{ST}(\mathbf{h})$, which depends solely on the lag vector $\mathbf{h} = (\mathbf{h}_s, h_t)$ for any pair of points $(\mathbf{s}, t)$ and $(\mathbf{s} + \mathbf{h}_s, t + h_t)$, where $(\mathbf{s}, \mathbf{s} + \mathbf{h}_s) \in D^2$ and $(t, t + h_t) \in T^2$. The fitted model for $C_{ST}$ must satisfy an admissibility condition in order to be valid, and ordinary kriging can be used to generate the environmental risk assessment maps.
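The indicator coding that precedes indicator (co)kriging can be sketched with the standard library. The sketch computes three thresholds of the kind mentioned in the Introduction (25th percentile, average, median) from a hypothetical historical series and codes the data accordingly; the values and the function name are illustrative assumptions.

```python
import statistics

def indicator(values, threshold, kind="not_less"):
    """Code values as 1/0 with respect to a threshold.

    kind="not_less":    1 if value >= threshold (exceedance-type indicator)
    kind="not_greater": 1 if value <= threshold
    """
    if kind == "not_less":
        return [1 if v >= threshold else 0 for v in values]
    if kind == "not_greater":
        return [1 if v <= threshold else 0 for v in values]
    raise ValueError("kind must be 'not_less' or 'not_greater'")

# Hypothetical historical values for one month (not the Veneto data).
historical = [12.0, 15.5, 18.2, 20.1, 22.4, 25.0, 27.3, 30.8]

thresholds = {
    "p25": statistics.quantiles(historical, n=4)[0],
    "mean": statistics.mean(historical),
    "median": statistics.median(historical),
}

coded = {name: indicator(historical, z) for name, z in thresholds.items()}
```

Each list in `coded` is the realization of an indicator variable for one threshold. Predicting these indicators at unsampled points yields values that can be read as estimated probabilities of the event defined by the indicator, which is the basis of the risk maps.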
The GSLib routine “COK2ST” [29] can be used to produce multivariate predictions in space–time, for one or all the variables under study, using the ST-LCM (5).

5. Space–Time Multivariate Analysis for Radon Flux Measurements

In this section, after a brief description of the geographical area under study, in terms of geological characteristics and meteorological conditions, the spatio-temporal multivariate data set related to the variables radon flux, average temperature, minimum humidity and evapotranspiration is presented. Finally, a description of the spatial and temporal profile of the variables is proposed.

5.1. Study Area

The Veneto Region is located in the north-eastern part of Italy and occupies an area of about 18,400 km². It is divided into seven provinces (Belluno, Padua, Rovigo, Treviso, Venice, Verona and Vicenza), as shown in Figure 1.
Veneto’s geomorphological setting is highly varied: 57% of its surface is covered by plain, 29% is mountainous (i.e., the Carnic Alps, eastern Dolomites and Venetian Prealps, in the northern zone), and the remaining 14% consists of hills. The Veneto Region is crossed by several of the most important rivers in Italy (i.e., the Adige, Brenta, Piave, Po and Tagliamento) and includes the eastern shore of Lake Garda. The permeability of the lithologies varies from low and moderately low in the plain area, characterized by sandy and silty-clay deposits, to moderately high and high in the south-west and north of the region, due to the presence of limestone, sandstone and calcarenite soils. Finally, the large number of faults indicates intense tectonic activity in the region.
With regard to the climate, two climatic zones can be distinguished in Veneto: the Alpine region, which is characterized by mild summers and cold winters with frequent snowfalls, and the hilly and plain areas, which have a continental climate with hot summers and very cold winters.
The wide variety of geological features and meteorological conditions that characterize the Veneto Region strongly contributes to the exhalation of radon from the soil; hence, studies of radon flux over this region are of particular interest. To the best of our knowledge, up to now only a few studies have referred to indoor radon concentrations in Veneto [58,59,60], based on the monitoring campaigns conducted starting from 1989 with the aim of assessing human health risks. However, these monitoring campaigns were discontinuous over time and mainly concerned the risk areas identified in 1989 and only a few buildings. Although atmospheric radon flux does not represent a health risk, since its magnitudes are smaller than those of indoor radon, an accurate evaluation of its exhalation rate could be relevant for epidemiological studies and for determining emission strategies for greenhouse gases. Furthermore, outdoor radon and radon flux represent an important factor for the correction of indoor values [1,2].
In the present paper, a geostatistical analysis of the spatio-temporal behavior of radon flux jointly with three meteorological variables, which affect the gas dispersion [61,62], has been proposed. As pointed out in [63,64], many meteorological variables can affect radon exhalation from soil; for this reason, the average air temperature, minimum humidity and evapotranspiration have been considered. In the following spatio-temporal multivariate analysis of radon flux, the prediction maps of environmental radioactivity over the Veneto Region might contribute to an accurate assessment of the indoor radon levels, as well as to the identification of radon priority areas [1,2] and might be used by local or regional authorities for land-use planning and urban development. The latter aspect is very important especially considering that, ranking fourth in Italy, Veneto has a population of over 4.8 million inhabitants, with a yearly increase in population growth rate of 0.6‰ (higher with respect to the Italian population growth rate equal to 0.4‰).

5.2. Spatio-Temporal Multivariate Data Set

The multivariate spatio-temporal data set involves monthly values of radon flux (Rn flux, in KBq m⁻² s⁻¹), which is a measure of radon exhaled per surface unit and per time unit, as well as monthly averages of evapotranspiration ($ET_0$, in mm), minimum humidity ($H_m$, in %) and mean temperature ($T_M$, in °C) observed over the Veneto Region (Italy) from January 2006 to April 2022 (i.e., 196 temporal observations). As can be seen in Figure 1, the meteorological variables ($T_M$, $H_m$ and $ET_0$) were measured at 72 survey stations belonging to the agency for environmental protection and downloaded from https://www.arpa.veneto.it/ (accessed on 25 June 2023). On the other hand, the monthly Rn flux data refer to 69 spatial points regularly distributed over a grid with 15 km × 15 km cell size. It is worth noting that the Rn exhalation rate is freely downloadable from https://meta.icos-cp.eu/objects/5-Z-zRaqFgddALv0ohLonzWD (accessed on 25 June 2023) for the whole European land surface. Rn flux data have been computed by using measurements of soil properties and uranium content, as well as model-derived soil moisture and water-table depth, as described in [65].
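The number of temporal observations quoted above can be verified with a short calculation; the helper name below is chosen only for this illustration.

```python
from datetime import date

def months_inclusive(start, end):
    """Number of calendar months from start to end, both months included."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1

# January 2006 to April 2022, as stated for the monthly series.
n_months = months_inclusive(date(2006, 1, 1), date(2022, 4, 1))
```

The result is 196, that is 16 full years (192 months) plus the first four months of 2022.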

5.3. Exploratory Data Analysis

In Figure 2, the spatial representation of Rn flux and of the three meteorological variables has been provided, by computing the monthly averages in January, April, July and October for the analyzed 17-year span. The represented values for the fixed four months can be considered representative of the behavior of the variables of interest over the domain during the four seasons of a year.
The areas with high Rn exhalation rates, for the fixed four months, are located in the Province of Belluno and in the west of Verona. Moreover, some isolated areas with high radon levels have also been identified in the Province of Treviso, bordering Friuli Venezia Giulia, and in the Province of Padua (i.e., the Euganean Hills area, which is characterized by a complex geological context). Note that, as pointed out in [58,60], the same areas are also characterized by high indoor radon concentrations. The highest average values have been observed in July (mean and median equal to 22.10 and 23.62 KBq m⁻² s⁻¹, respectively, and coefficient of variation equal to 53.86%). On the other hand, the lowest values (from 0.18 to 20.56 KBq m⁻² s⁻¹) have been observed in January (mean and median values equal to 11.53 and 12.21 KBq m⁻² s⁻¹, respectively, and coefficient of variation equal to 40.01%). It is worth noting that the exhalation rates increase during the summer period, when $T_M$ and $ET_0$ increase and $H_m$ decreases [2].
Regarding the meteorological variables ($T_M$, $H_m$ and $ET_0$), Figure 2 shows that their spatial behavior changes significantly from the plain to the mountain areas.
The differences in terms of T M between the high mountains and the plain and coastal areas are quite large. The  T M in January varies from 2.7 to 4.92 °C along the plain and coast, while in the mountainous zones it can be very low during winters (from 6.19 to 3.97 °C); wider differences are also evident in July.
The Veneto Region climate is characterized by high humidity levels, indeed the percentage of H m is very high in the 73 monitoring stations, especially in January and October, with a minimum percentage between 38.38 and 48.61, respectively, and maximum percentage ranging between 64.88 (in October) and 75.36 (in January).
Looking at the colour maps of the E T 0 (Figure 2), which represents a combination of evaporation and transpiration processes of water from the soil to the air, low levels have been observed in the northern and north-eastern part of the Veneto Region and high in the plain. In January, the values vary from 0.29 mm on the mountainous interior zones to 0.62 mm in the Treviso Province. On the other hand, in July the E T 0 levels are high all over the domain with peaks in the central part (plain areas). Moreover, the well-known positive correlation between air temperature and evapotranspiration, as well as the negative correlation between air temperature (or evapotranspiration) and humidity were confirmed via the spatial profile analyses. In addition, in the central part of the study area the geological characteristics of the soil as well as meteorological and climatic condition affect Rn exhalations from the soil to the atmosphere.
The temporal profiles of the analyzed variables have been evaluated through box plots of the monthly values at the sample points (Figure 3a–d). All the four variables exhibit a seasonal behavior: Rn flux, T M and E T 0 are characterized by increasing values during the spring and summer periods, on the other hand the H m denotes an opposite behavior, with increasing values during autumn/winter and decreasing values during spring/summer. Moreover, as previously pointed out for the spatial profile analyses, it is evident that low (high) values of Rn flux are associated to low (high) values of the T M and E T 0 and to high (low) H m [2].
Since the time series of the analyzed variables exhibit a seasonal component, the twelve-month averages have been computed for each station and the periodic component of each variable has been removed from the monthly values by subtracting the average seasonality. In Figure 3e–h the deseasonalized values of the four variables, grouped by months, are shown.
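The deseasonalization step just described can be sketched in a few lines. The following is only an illustrative sketch on synthetic data, not the authors' code: the function name `deseasonalize`, the array layout (months in rows, stations in columns) and the simulated series are all assumptions made for the example.

```python
import numpy as np

def deseasonalize(values, start_month=1):
    """Remove the average seasonality from monthly series.

    values: array of shape (n_months, n_stations), one column per station.
    start_month: calendar month (1-12) of the first row.
    Returns (residuals, seasonal), where seasonal has shape (12, n_stations)
    and holds the twelve monthly averages of each station.
    """
    values = np.asarray(values, dtype=float)
    n_months = values.shape[0]
    # Calendar month index (0..11) of every row
    month = (np.arange(n_months) + start_month - 1) % 12
    # Twelve-month averages, computed station by station
    seasonal = np.vstack([values[month == m].mean(axis=0) for m in range(12)])
    # Subtract the average seasonality from the monthly values
    residuals = values - seasonal[month]
    return residuals, seasonal

# Synthetic demo: 196 months (as in January 2006 - April 2022), 3 hypothetical stations
rng = np.random.default_rng(0)
t = np.arange(196)
series = 15 + 8 * np.sin(2 * np.pi * (t[:, None] - 3) / 12) + rng.normal(0, 1, (196, 3))
residuals, seasonal = deseasonalize(series)
```

Adding `seasonal[month]` back to predicted residuals returns values on the original scale, which is the operation used later when forecasts are produced.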
Then, as described in Section 6, the spatio-temporal direct and cross-covariance functions of the residuals of Rn flux, $T_M$, $H_m$ and $ET_0$ were analyzed and an appropriate ST-LCM was selected.
Space–time modeling and prediction techniques were applied in order to forecast the Rn flux values over the area of interest for the period May–December 2022 (i.e., the months after the last available time point in the analyzed data set). In particular, the following aspects were considered:
(1) estimating and modeling the space–time correlation among the residual variables; in the ST-LCM fitting stage, the procedure proposed in [30], based on the joint diagonalization of several sample covariance matrices, was performed and the most apt covariance model [66] was fitted for each basic component;
(2) predicting Rn flux during the period May–December 2022 by using spatio-temporal cokriging based on the estimated ST-LCM;
(3) producing risk maps showing the probability that Rn flux in a summer month exceeds the value of some chosen statistics, by using indicator kriging [67].

6. Modeling the ST-LCM for the Study Variables

Modeling the spatio-temporal correlation among the variables under study by using the ST-LCM first requires checking the adequacy of such a model. In particular, the symmetry assumption was checked by using the methodology proposed by [24] and mentioned in Section 3.1. After selecting three pairs of spatial points and a temporal lag equal to 1 month, which is the lag corresponding to the largest empirical cross-correlations for all variable combinations, the test statistic $TS$ in (6) was equal to 0.34, with a corresponding p-value equal to 0.99. Since the null hypothesis of symmetry cannot be rejected, it is reasonable to consider the ST-LCM a suitable model for the data set under study.
Through the fitting procedure developed by [33], the uncorrelated basic components underlying the investigated phenomenon were identified, starting from the sample covariance matrices computed for a set of spatio-temporal lags and using the outputs of the joint diagonalization of these matrices. In the present case study, according to the geometry of the spatio-temporal domain, 8 spatial lags and 15 temporal lags, for a total of 120 spatio-temporal lags, were fixed to estimate the 4 direct covariance functions and the 6 cross-covariance functions, whose surfaces are shown in Figure 4. Note that all the empirical direct covariance surfaces show a strong linear relationship for short and medium lags in space–time, while decaying otherwise; the empirical cross-covariance surfaces reveal a negative linear relationship between Rn flux and $H_m$ for short and medium lags, as well as between $H_m$ and $ET_0$ for short lags, while the relationship is positive in the other cases, as can reasonably be expected from the natural characteristics of the variables.
Subsequently, the 120 symmetric matrices of the sample direct and cross-covariances (each of dimension $4 \times 4$) were jointly diagonalized (as previously mentioned, the diagonalization was carried out using the R package JADE), yielding 120 diagonal matrices and the following orthogonal matrix $\Psi$:
$$\Psi = \begin{bmatrix} 0.92644863 & 0.34811734 & 0.14315315 & 0.003798338 \\ 0.36544978 & 0.91846172 & 0.12952119 & 0.078094773 \\ 0.08630179 & 0.17383656 & 0.98088864 & 0.013795947 \\ 0.02629437 & 0.07087482 & 0.02426757 & 0.996843243 \end{bmatrix}.$$
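To make the idea of joint diagonalization concrete, the sketch below treats the idealized case in which all matrices share exactly the same orthogonal eigenvector matrix. In that case the eigenvectors of a generic weighted sum of the matrices diagonalize every one of them. This is a simplified stand-in, not the approximate joint diagonalization algorithm implemented in the JADE package, and the matrices are synthetic (the names `psi_true`, `common_eigvecs` and `off_diagonal_max` are hypothetical).

```python
import numpy as np

def common_eigvecs(mats, weights):
    """Eigenvectors of a weighted sum of symmetric matrices.

    If the matrices are exactly jointly diagonalizable and the weighted sum
    has distinct eigenvalues, these eigenvectors diagonalize all of them.
    """
    combo = sum(w * m for w, m in zip(weights, mats))
    _, vecs = np.linalg.eigh(combo)
    return vecs

def off_diagonal_max(psi, mats):
    """Largest absolute off-diagonal entry of psi^T M psi over all M."""
    worst = 0.0
    for m in mats:
        d = psi.T @ m @ psi
        worst = max(worst, np.abs(d - np.diag(np.diag(d))).max())
    return worst

rng = np.random.default_rng(1)
# Hypothetical orthogonal matrix (Q factor of a random matrix)
psi_true, _ = np.linalg.qr(rng.normal(size=(4, 4)))
# 120 synthetic 4 x 4 symmetric matrices sharing the same eigenvectors
diags = rng.uniform(0.1, 2.0, size=(120, 4))
mats = [psi_true @ np.diag(d) @ psi_true.T for d in diags]
weights = rng.uniform(0.5, 1.5, size=120)

psi_hat = common_eigvecs(mats, weights)
residual = off_diagonal_max(psi_hat, mats)  # close to zero in this exact case
```

With real sample covariance matrices the common eigenvector structure holds only approximately, which is why an iterative approximate joint diagonalization is needed in practice.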
From the obtained 120 diagonal matrices, the sample basic uncorrelated components $\hat{c}_l$, which correspond to the estimates of $c_l$ $(l = 1, \ldots, 4)$, were determined by extracting all the diagonal entries across the 120 matrices. Through a graphical check of the $\hat{c}_l$ surfaces (the 3D plots shown in Figure 5), the following distinct scales of spatio-temporal variability, i.e., the distances in space and time at which the surface decays, have been detected:
  • 20 km in space and 2 months in time (very small scale),
  • 30 km in space and 3 months in time (small scale),
  • 55 km in space and 7 months in time (medium scale),
  • 120 km in space and 12 months in time (large scale).
The different behaviors in space and time of the basic components suggested retaining all of them and constructing the ST-LCM with four uncorrelated components, as follows:
$$\mathbf{C}(\mathbf{h}) = \mathbf{B}_1 c_1(\mathbf{h}) + \mathbf{B}_2 c_2(\mathbf{h}) + \mathbf{B}_3 c_3(\mathbf{h}) + \mathbf{B}_4 c_4(\mathbf{h}), \qquad (11)$$
where the coregionalization matrices $\mathbf{B}_l$, $l = 1, \ldots, 4$, have to be computed as indicated in (8) and the basic components $c_l(\mathbf{h})$, $l = 1, \ldots, 4$, have to be modelled after identifying the most apt class of covariance models with respect to some features (full symmetry, non-separability and type of non-separability) of the sample basic covariance surfaces. To this end, the statistical tests for symmetry and separability, based on the asymptotic joint normality of the sample space–time covariance estimators [68], were carried out for each basic component according to the procedure in [48,49]. The same papers give the details on the test statistics, denoted by $TS_1$ and $TS_2$, and on the corresponding probability distributions.
From the results of these tests, it was possible to conclude that, at the 5% significance level, the null hypothesis of full symmetry cannot be rejected for any basic component, while the null hypothesis of separability is rejected for all basic components. With regard to the type of non-separability, the non-separability ratios defined in [14] have been calculated for the spatial and temporal lags at which the correlation in space and time is stronger, and the corresponding values have been summarized through the box and whisker plots shown in Figure 6. These graphs are very useful tools for establishing the type of non-separability: in this case study, the non-separability ratios grouped by spatial and temporal lags are always smaller than one, hence a uniform negative non-separability assumption is reasonable for the four basic components.
From the obtained results, the class of fully symmetric and uniform negative non-separable covariance models turned out to be the most appropriate for each $c_l$. In particular, the product–sum covariance function has been adopted, i.e.,
$$c_l(\mathbf{h}_s, h_t) = k_{1l}\, C_{sl}(\mathbf{h}_s)\, C_{tl}(h_t) + k_{2l}\, C_{sl}(\mathbf{h}_s) + k_{3l}\, C_{tl}(h_t), \quad l = 1, \ldots, 4,$$
with $C_{sl} = Exp(\|\mathbf{h}_s\|; a_l)$ the spatial exponential covariance model in $\mathbb{R}^d$ with practical range $a_l$, $C_{tl} = Exp(|h_t|; b_l)$ the temporal exponential covariance model in $\mathbb{R}$ with practical range $b_l$, and parameters $k_{1l}$, $k_{2l}$ and $k_{3l}$, $l = 1, \ldots, 4$, as reported in Table 1. These estimates ensure the strict positive definiteness of the basic models [70]. This kind of covariance model is widely used not only in environmental sciences but also in other scientific fields, such as Demography [69].
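The product–sum structure and its link with the non-separability ratio can be illustrated numerically. The sketch below rests on several assumptions: the exponential model is written as $\exp(-3h/a)$ with unit sill (a common practical-range convention), the ratio is taken as $\rho(\mathbf{h}_s, h_t) / [\rho(\mathbf{h}_s, 0)\,\rho(\mathbf{0}, h_t)]$, and the coefficients $k_1$, $k_2$, $k_3$ are hypothetical values rather than the estimates of Table 1; only the ranges of 55 km and 7 months are borrowed from the medium scale listed above.

```python
import numpy as np

def exp_cov(h, practical_range):
    # Assumed convention: unit-sill exponential model exp(-3h/a),
    # so the correlation is about 0.05 at the practical range a.
    return np.exp(-3.0 * np.abs(h) / practical_range)

def product_sum_cov(hs, ht, a, b, k1, k2, k3):
    """Product-sum covariance k1*Cs*Ct + k2*Cs + k3*Ct."""
    cs = exp_cov(hs, a)
    ct = exp_cov(ht, b)
    return k1 * cs * ct + k2 * cs + k3 * ct

def nonsep_ratio(hs, ht, **p):
    """Ratio rho(hs, ht) / (rho(hs, 0) * rho(0, ht)); values below one
    indicate negative non-separability (assumed definition)."""
    c00 = product_sum_cov(0.0, 0.0, **p)
    rho = product_sum_cov(hs, ht, **p) / c00
    rho_s = product_sum_cov(hs, 0.0, **p) / c00
    rho_t = product_sum_cov(0.0, ht, **p) / c00
    return rho / (rho_s * rho_t)

# Ranges from the medium scale above; k1, k2, k3 are hypothetical
params = dict(a=55.0, b=7.0, k1=0.5, k2=0.3, k3=0.2)
hs = np.array([15.0, 30.0, 45.0])[:, None]   # spatial lags (km)
ht = np.array([1.0, 2.0, 3.0])[None, :]      # temporal lags (months)
ratios = nonsep_ratio(hs, ht, **params)      # 3 x 3 grid of ratios
```

Under these assumptions, and with positive $k_2$ and $k_3$, the difference between denominator and numerator is proportional to $k_2 k_3 (1 - C_s)(1 - C_t)$, so the ratio stays below one at every non-zero lag, in line with the behavior reported for the fitted components.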
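Finally, the matrices $\mathbf{B}_l$, $l = 1, \ldots, 4$, whose entries were computed by the expression in (8), are the following:
$$\mathbf{B}_1 = \begin{bmatrix} 663.8366 & 86.7824 & 121.4861 & 8.5663 \\ 86.7824 & 109.8702 & 89.6217 & 9.3176 \\ 121.4861 & 89.6217 & 1214.1283 & 21.6994 \\ 8.5663 & 9.3176 & 21.6994 & 1.8972 \end{bmatrix}, \quad \mathbf{B}_2 = \begin{bmatrix} 0.1100 & 0.0236 & 0.0443 & 0.0029 \\ 0.0236 & 0.0380 & 0.0057 & 0.0037 \\ 0.0443 & 0.0057 & 0.0601 & 0.0005 \\ 0.0029 & 0.0037 & 0.0005 & 0.0004 \end{bmatrix},$$
$$\mathbf{B}_3 = \begin{bmatrix} 0.2043 & 0.0537 & 0.0640 & 0.0012 \\ 0.0537 & 0.0485 & 0.0156 & 0.0029 \\ 0.0640 & 0.0156 & 0.0520 & 0.0020 \\ 0.0012 & 0.0029 & 0.0020 & 0.0040 \end{bmatrix}, \quad \mathbf{B}_4 = \begin{bmatrix} 0.4020 & 0.2880 & 0.1704 & 0.0200 \\ 0.2880 & 0.5900 & 0.2110 & 0.0100 \\ 0.1704 & 0.2110 & 0.4000 & 0.0116 \\ 0.0200 & 0.0100 & 0.0116 & 0.0011 \end{bmatrix}.$$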
Note that the above coregionalization matrices are all positive semi-definite, i.e., the corresponding eigenvalues are non-negative, thus satisfying the admissibility condition for the fitted ST-LCM.
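The admissibility condition on a coregionalization matrix can be checked through its eigenvalues. The following is a minimal sketch on hypothetical matrices (not the fitted ones); the function name `is_admissible` and the tolerance are assumptions introduced for the example.

```python
import numpy as np

def is_admissible(b, tol=1e-10):
    """True if b is symmetric with non-negative eigenvalues (up to tol)."""
    b = np.asarray(b, dtype=float)
    if not np.allclose(b, b.T):
        return False
    # eigvalsh is meant for symmetric matrices and returns real eigenvalues
    return bool(np.linalg.eigvalsh(b).min() >= -tol)

rng = np.random.default_rng(2)
a = rng.normal(size=(4, 4))
b_ok = a @ a.T                               # positive semi-definite by construction
b_bad = np.array([[1.0, 2.0], [2.0, 1.0]])   # eigenvalues 3 and -1

check_ok = is_admissible(b_ok)
check_bad = is_admissible(b_bad)
```

A matrix with off-diagonal entries that are too large relative to its diagonal, such as `b_bad`, fails the check because it would imply a cross-correlation larger than one in absolute value.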
In the next stage of the analysis, the adequacy of model (11) will be evaluated, and then the same model will be used to produce spatio-temporal predictions of the Rn flux.
The detection of the uncorrelated components, the identification of an apt covariance model for each component, as well as the computation of the coregionalization matrices can be carried out in the R environment by calling purpose-built functions, which are available upon request from the corresponding author.
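As a language-neutral illustration of how a fitted ST-LCM is evaluated at a given lag, the sketch below assembles $\mathbf{C}(\mathbf{h}) = \sum_l \mathbf{B}_l c_l(\mathbf{h})$ for two variables and two product–sum components. It is not one of the R functions mentioned above: the matrices and the coefficients are hypothetical, the exponential model is assumed to follow the $\exp(-3h/a)$ practical-range convention, and only the ranges (20 km and 2 months, 120 km and 12 months) echo the scales detected in this section.

```python
import numpy as np

def product_sum(hs, ht, a, b, k1, k2, k3):
    """Product-sum basic covariance with exponential marginals (assumed form)."""
    cs = np.exp(-3.0 * abs(hs) / a)
    ct = np.exp(-3.0 * abs(ht) / b)
    return k1 * cs * ct + k2 * cs + k3 * ct

def st_lcm(hs, ht, b_mats, params):
    """Matrix-valued covariance C(h) = sum_l B_l * c_l(h) at a single lag."""
    return sum(np.asarray(bm, dtype=float) * product_sum(hs, ht, **p)
               for bm, p in zip(b_mats, params))

# Hypothetical 2 x 2 coregionalization matrices (both positive definite)
b_mats = [np.array([[2.0, 0.5], [0.5, 1.0]]),
          np.array([[1.0, -0.3], [-0.3, 0.4]])]
# Hypothetical coefficients; each set sums to one so that c_l(0) = 1
params = [dict(a=20.0, b=2.0, k1=0.5, k2=0.3, k3=0.2),
          dict(a=120.0, b=12.0, k1=0.5, k2=0.3, k3=0.2)]

c_zero = st_lcm(0.0, 0.0, b_mats, params)   # equals B_1 + B_2 here
c_lag = st_lcm(30.0, 3.0, b_mats, params)   # 30 km and 3 months apart
```

The diagonal entries of the result are the direct covariances of the two variables and the off-diagonal entries are their cross-covariances; at a lag of 30 km and 3 months the short-scale component has almost vanished, so the large-scale component dominates.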

Adequacy of the Fitted Model

The suitability of the fitted ST-LCM was assessed using a three-fold procedure, i.e.,
(a) a comparative analysis with respect to the intrinsic coregionalization model, defined from the ST-LCM by neglecting the presence of different scales of spatio-temporal variability,
(b) the leave-one-out cross-validation technique and the computation of the linear correlation coefficient between the available data for the Rn flux and their estimates,
(c) the jackknife prediction of Rn flux for the last four available months (January, February, March and April 2022), whose data have not been used in the previous structural analysis, and the comparison of the predicted values with the true ones.
With regard to point (a), an intrinsic coregionalization model, namely an ST-LCM with only one basic component, has been chosen as the alternative contender model. In particular, it has been assumed that the study data set did not present different scales of variability in space and time, and that only the basic component with the largest scale of spatio-temporal variability (120 km in space and 12 months in time) was common to the investigated variables. Hence, the following model has been considered:
$$\mathbf{C}(\mathbf{h}) = \mathbf{B}\, c(\mathbf{h}), \qquad (13)$$
where the unique basic component has been modelled by the product–sum covariance function with a spatial exponential covariance model whose practical range is equal to 120 km, a temporal exponential covariance model with practical range equal to 12 months, and the parameters $k_1$, $k_2$ and $k_3$ equal, respectively, to 0.010, 17.0644 and 2.9732. The following coregionalization matrix
$$\mathbf{B} = \begin{bmatrix} 2.8031 & 0.1150 & 0.4013 & 0.0122 \\ 0.1150 & 1.0793 & 0.0464 & 0.0501 \\ 0.4013 & 0.0464 & 3.2720 & 0.0378 \\ 0.0122 & 0.0501 & 0.0378 & 0.0160 \end{bmatrix}$$
has been estimated to complete the contender model in (13).
The goodness of fit of the two ST-LCMs has been measured on the basis of the errors between the sample covariance values and the theoretical ones, for the first 5 spatial lags and 8 temporal lags, which represent the lags where the correlation is reasonably stronger. In particular, the Root Average Error (RAE), defined by [71] as the square root of the ratio between the sum of the squared errors and the sum of the squared sample values, as well as the Relative Mean Absolute Error (RMAE), computed as the ratio between the sum of the absolute errors and the sum of the absolute sample values, were computed and the results are reported in Table 2.
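The two error measures follow directly from their verbal definitions. The sketch below implements them as stated in the text; the function names and the small numerical example are illustrative assumptions, not values from Table 2.

```python
import numpy as np

def rae(sample, fitted):
    """Root Average Error: sqrt(sum of squared errors / sum of squared sample values)."""
    sample = np.asarray(sample, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    return float(np.sqrt(np.sum((sample - fitted) ** 2) / np.sum(sample ** 2)))

def rmae(sample, fitted):
    """Relative Mean Absolute Error: sum of absolute errors / sum of absolute sample values."""
    sample = np.asarray(sample, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    return float(np.sum(np.abs(sample - fitted)) / np.sum(np.abs(sample)))

# Hypothetical sample covariances and model values at four lags
sample_cov = np.array([4.0, 3.0, 2.0, 1.0])
model_cov = np.array([3.0, 3.0, 2.0, 2.0])

rae_value = rae(sample_cov, model_cov)    # sqrt(2 / 30)
rmae_value = rmae(sample_cov, model_cov)  # 2 / 10
```

Both measures are dimensionless, so they can be compared across direct and cross-covariances whose magnitudes differ by orders of magnitude.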
The values of the error metrics are almost always greater for the model with a unique basic component; these results highlight the adequacy of the ST-LCM with four basic components, while model (13) provides the worse fit for the study variables. In other words, the ST-LCM with four basic product–sum models better describes the multivariate spatio-temporal correlation which characterizes the investigated phenomenon.
The second check of the suitability of the fitted model in (11), performed through the leave-one-out cross-validation technique, required the computation of the cokriging estimates for the Rn flux residuals. Spatio-temporal cokriging was implemented by alternatively using the ST-LCMs (11) and (13), and the correlation coefficients between the Rn flux estimates and the recorded values were then calculated. In the case of model (11), the correlation coefficient was equal to 0.972 (significant at the 1% level). On the other hand, for model (13) a correlation coefficient equal to 0.773 was found. Hence, the adequacy of the ST-LCM with four basic components fitted by the product–sum covariance models was also confirmed by the cross-validation results.
Finally, the last procedure to assess the adequacy of the ST-LCM (11) consisted of jackknife predictions of the Rn flux levels in the last four months (January, February, March and April 2022), whose available data have been used as a test set. The predictions have been computed, alternatively, through
  • the spatio-temporal cokriging based on the ST-LCM in (11),
  • the spatio-temporal cokriging based on the intrinsic coregionalization model in (13),
  • the spatio-temporal kriging based on the product–sum covariance model of the Rn flux residuals, already included in the ST-LCM (11).
Rn flux residuals were first predicted through the above interpolation techniques and, subsequently, the Rn flux seasonal components were added to the forecasts.
The final results were then compared on the basis of the correlation coefficient between the true values and the predicted ones, as well as of the error measures RAE and RMAE, which quantify the relative average discrepancy between the true values of Rn flux and its predictions. All these statistics are summarized in Table 3, which shows the better performance of the cokriging based on the ST-LCM (11) with respect to the other two prediction techniques.
From the comparison with the kriging results, it is clear that the direct and cross-correlations of the primary and secondary variables contribute to enhancing the weights of the cokriging estimator and thus to improving the predictions. In addition, the comparison with the intrinsic coregionalization model has highlighted the positive effect of retaining four scales of variability in the construction of the final ST-LCM and in the production of better predictions. The suitability of the ST-LCM (11), confirmed by all the procedures discussed above, has allowed the adoption of this model to predict Rn flux values over the Veneto Region at unobserved time points, as detailed in the next section.

7. Rn Flux Prediction Maps

The fitted and validated model (11) has been used to produce predictions of the Rn flux over the study area from May 2022 (the month after the last one available in the data set) up to December 2022. To this end, space–time cokriging has been applied by using the GSLib routine “COK2ST” [29], whose parameter file was set up with all the required information concerning, among others, the basic components’ models and the neighbourhood of sample data to be used in the cokriging system for both the primary variable (Rn flux) and the auxiliary variables (the three meteorological variables herein analyzed). In this way, the residuals of monthly Rn flux have been predicted first; then the seasonal component, previously calculated point-by-point as described in Section 5.3, has been added to the predictions in order to obtain the Rn flux monthly forecasts. In Figure 7, the contour maps of the predicted Rn flux values for May, August and November 2022 are shown: this selection of months is justified by the behavior of the Rn flux during the year, with the highest values in summer and the lowest ones in winter. Indeed, as highlighted in [72] and already discussed in Section 5.3, a temperature increase generally favors the flux of radon from soil to atmosphere; therefore, the Rn flux tends to increase in the spring–summer season and to decrease in the winter, due to the interaction with the meteorological conditions which characterize the study area during the warmest and the coldest months of the year.
Regarding the spatial profile, the prediction maps clearly show the territories with a very high exposure to Rn exhalation from the ground (Figure 7). In particular, these territories are located in the north-eastern part of the Veneto Region, namely in the Provinces of Treviso and Belluno, in the central part belonging to the Province of Padua, and in the western part of the Region over the Province of Verona. This portion of land, which crosses the study area from northeast to southwest, is characterized by complex geo-lithological conditions which may favor the exhalation of Rn gas from the soil, as was pointed out in previous research [58,60,72] that focused on the investigation of Rn outdoor and indoor concentrations over the Veneto Region.

8. Rn Flux Risk Maps

Risk assessment maps have been associated with the prediction maps. Indicator kriging has been applied to assess the probability that Rn flux predictions for August 2022 exceed, alternatively,
(a) the 25th percentile (19.326 KBq m$^{-2}$ s$^{-1}$),
(b) the average value (22.75 KBq m$^{-2}$ s$^{-1}$),
(c) the median (23.994 KBq m$^{-2}$ s$^{-1}$)
of the distribution of the historical data measured in August from 2006 to 2021 at all spatial points. The choice of this month is justified by the need to investigate the risk of radon exhalation when warmer climatic conditions in the study area favor the increase of Rn flux. Thus, the three values $z_1 = 19.326$, $z_2 = 22.75$ and $z_3 = 23.994$ have been considered as the target thresholds in order to define the following indicator variables
$$I_k(\mathbf{s}, t; z_k) = \begin{cases} 1 & \text{if } Rn > z_k, \\ 0 & \text{otherwise}, \end{cases} \qquad k = 1, 2, 3,$$
where $\mathbf{s} \in D$, $t \in T$, and $z_1$, $z_2$, $z_3$ represent the thresholds listed above.
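The indicator coding that precedes indicator kriging can be sketched as follows. The thresholds are the three values given in the text, while the Rn flux values and the names `THRESHOLDS`, `indicator` and `rn_demo` are hypothetical and serve only to show the transformation.

```python
import numpy as np

# Thresholds from the text: 25th percentile, mean and median of the August data
THRESHOLDS = {"p25": 19.326, "mean": 22.75, "median": 23.994}

def indicator(rn, z):
    """Indicator variable: 1 where Rn flux strictly exceeds z, 0 otherwise."""
    return (np.asarray(rn, dtype=float) > z).astype(int)

# Hypothetical Rn flux values at five space-time points
rn_demo = np.array([18.0, 19.326, 21.5, 23.0, 25.2])

indicators = {name: indicator(rn_demo, z) for name, z in THRESHOLDS.items()}
# Share of points above each threshold
exceed_freq = {name: float(ind.mean()) for name, ind in indicators.items()}
```

Kriging these 0/1 values, rather than the flux itself, yields at each location an estimate of the probability that the threshold is exceeded, which is what the risk maps display.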
Subsequently, three spatio-temporal indicator kriging procedures were conducted to obtain the risk maps (Figure 8), referring to August 2022, of exceeding the chosen Rn flux threshold levels over the study area [73]. Note that these probability maps have been produced by considering only the variable Rn flux, taking into account that the kriging performance in Table 3 was satisfactory. However, the multivariate indicator approach might also be used in order to include the behavior of the meteorological conditions.
All maps highlight the presence of hazardous areas in the south-western part of the Veneto Region over the Province of Verona, as well as in the north-eastern sub-area inside the Province of Treviso, where there is a very high probability that even the largest threshold (the median) is exceeded. Apart from the map in Figure 8a, constructed for a moderate threshold value, it is important to highlight that the risk areas emerging in the last two maps (red territories in Figure 8b,c) include several districts that are among the most populated ones in the region. Therefore, any plan of urban expansion will require an accurate evaluation that takes into account the high risk of Rn gas exhalation from the soil.
In conclusion, the obtained probability maps represent very useful tools for detecting areas in need of strong controls, especially under the consideration that Rn flux measurements could be a fine proxy of outdoor and indoor Rn concentrations which have been proved to be highly dangerous to human health.

9. Summary

In the present paper, a critical review of the tools of multivariate geostatistics for spatio-temporal modeling and prediction was provided. In addition, the ST-LCM fitting procedure, based on the joint diagonalization of the sample covariance matrices computed at different lags, was applied to describe the spatio-temporal correlation among Rn flux and some monthly meteo-climatic conditions (mean air temperature, minimum humidity and evapotranspiration) measured over the Veneto Region from 2006 to 2022. After model validation, a comparative analysis of the prediction performances, with respect to cokriging based on an alternative multivariate model and to kriging based only on the product–sum model fitted exclusively for the Rn flux, confirmed the superiority of the ST-LCM. Thus, the proposed analysis pointed out that the efforts required for the identification of the multivariate spatio-temporal correlation model are crucial for the final results and are rewarded with reliable estimates.
In conclusion, the results obtained in this paper can be considered particularly important since they might contribute to an accurate assessment of indoor radon levels, as well as to the identification of radon priority areas. These might then be used by local or regional authorities for land-use planning and urban development. In the future, other multivariate approaches can be used for comparative purposes [74].

Author Contributions

Conceptualization, S.D.I.; Methodology, S.D.I. and M.P.; Validation, M.P.; Data curation, C.C.; Writing—original draft, C.C., A.C. and M.P.; Writing—review & editing, C.C., A.C. and M.P.; Visualization, A.C.; Supervision, S.D.I. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Publicly available datasets were analyzed in this study. Rn flux data are available at https://meta.icos-cp.eu/objects/5-Z-zRaqFgddALv0ohLonzWD. Raw data on $T_M$, $H_m$ and $ET_0$ are available at https://www.arpa.veneto.it.

Acknowledgments

The authors are grateful to the anonymous reviewers for their helpful and constructive comments on earlier versions of the manuscript. The authors would like to thank Giorgia Cinelli at the National Agency for New Technologies, Energy, and Sustainable Economic Development (ENEA) for sharing with us her knowledge on the availability of Rn flux data used in this paper and supporting us on their interpretation.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Čeliković, I.; Pantelić, G.; Vukanac, I.; Krneta Nikolić, J.; Živanović, M.; Cinelli, G.; Gruber, V.; Baumann, S.; Quindos Poncela, L.S.; Rabago, D. Outdoor Radon as a Tool to Estimate Radon Priority Areas—A Literature Overview. Int. J. Environ. Res. Public Health 2022, 19, 662.
  2. Čeliković, I.; Pantelić, G.; Vukanac, I.; Nikolić, J.K.; Živanović, M.; Cinelli, G.; Gruber, V.; Baumann, S.; Ciotoli, G.; Poncela, L.S.Q.; et al. Overview of Radon Flux Characteristics, Measurements, Models and Its Potential Use for the Estimation of Radon Priority Areas. Atmosphere 2022, 13, 2005.
  3. Bossew, P. Mapping the geogenic radon potential and estimation of radon prone areas in Germany. Radiat. Emerg. Med. 2015, 4, 13–20.
  4. Ciotoli, G.; Voltaggio, M.; Tuccimei, P.; Soligo, M.; Pasculli, A.; Beaubien, S.E. Geographically weighted regression and geostatistical techniques to construct the geogenic radon potential map of the Lazio region: A methodological proposal for the European Atlas of Natural Radiation. J. Environ. Radioact. 2017, 166, 355–375.
  5. Fernández, A.; Sainz, C.; Celaya, S.; Quindós, L.; Rábago, D.; Fuente, I. A new methodology for defining radon priority areas in Spain. Int. J. Environ. Res. Public Health 2021, 18, 1352.
  6. Giustini, F.; Ciotoli, G.; Rinaldini, A.; Ruggiero, L.; Voltaggio, M. Mapping the geogenic radon potential and radon risk by using Empirical Bayesian Kriging regression: A case study from a volcanic area of central Italy. Sci. Total Environ. 2019, 661, 449–464.
  7. Petermann, E.; Meyer, H.; Nussbaum, M.; Bossew, P. Mapping the geogenic radon potential in Germany using machine learning. Sci. Total Environ. 2021, 754, 142291.
  8. Gelfand, A.E.; Schmidt, A.M.; Banerjee, S.; Sirmans, C.E. Nonstationary multivariate process modeling through spatially varying coregionalization. Test 2004, 13, 263–312.
  9. Gneiting, T.; Kleiber, W.; Schlather, M. Matérn cross-covariance functions for multivariate random fields. J. Am. Stat. Assoc. 2010, 105, 1167–1177.
  10. Wackernagel, H. Multivariate Geostatistics: An Introduction with Applications; Springer: Berlin/Heidelberg, Germany, 2003.
  11. Ver Hoef, J.M.; Barry, R.P. Constructing and fitting models for cokriging and multivariable spatial prediction. J. Stat. Plan. Inference 1998, 69, 275–294.
  12. Berrocal, V.; Gelfand, A.E.; Holland, D.M. A bivariate space–time downscaler under space and time misalignment. Ann. Appl. Stat. 2010, 4, 1942–1975.
  13. Choi, J.; Fuentes, M.; Reich, B.J.; Davis, J.M. Multivariate spatial-temporal modeling and prediction of speciated fine particles. J. Stat. Theory Pract. 2009, 3, 407–418.
  14. De Iaco, S.; Posa, D. Positive and negative non-separability for space–time covariance models. J. Stat. Plan. Inference 2013, 143, 378–391.
  15. De Iaco, S.; Myers, D.E.; Posa, D. The linear coregionalization model and the product-sum space–time variogram. Math. Geol. 2003, 35, 25–38.
  16. Goovaerts, P.; Sonnet, P. Study of spatial and temporal variations of hydrogeochemical variables using factorial kriging analysis. In Geostatistics Troia ’92; Soares, A., Ed.; Springer: Dordrecht, The Netherlands, 1993; Volume 24, pp. 269–286.
  17. Krupskii, P.; Genton, M.G. Factor copula models for data with spatio-temporal dependence. Spat. Stat. 2017, 22, 180–195.
  18. Rouhani, S.; Wackernagel, H. Multivariate geostatistical approach to space–time data analysis. Water Resour. Res. 1990, 26, 585–591.
  19. Babak, O.; Deutsch, C.V. An intrinsic model of coregionalization that solves variance inflation in collocated cokriging. Comput. Geosci. 2009, 35, 603–614.
  20. Bevilacqua, M.; Hering, A.S.; Porcu, E. On the flexibility of multivariate covariance models: Comment on the paper by Genton and Kleiber. Stat. Sci. 2015, 30, 167–169.
  21. De Iaco, S.; Palma, M.; Posa, D. Modeling and prediction of multivariate space–time random fields. Comput. Stat. Data Anal. 2005, 48, 525–547.
  22. Emery, X. Iterative algorithms for fitting a linear model of coregionalization. Comput. Geosci. 2010, 36, 1150–1160.
  23. Genton, M.G.; Kleiber, W. Cross-covariance functions for multivariate geostatistics. Stat. Sci. 2015, 30, 147–163.
  24. Li, B.; Genton, M.G.; Sherman, M. Testing the covariance structure of multivariate random fields. Biometrika 2008, 95, 813–829.
  25. Apanasovich, T.V.; Genton, M.G. Cross-covariance functions for multivariate random fields based on latent dimensions. Biometrika 2010, 97, 15–30.
  26. Gneiting, T. Nonseparable, stationary covariance functions for space–time data. J. Am. Stat. Assoc. 2002, 97, 590–600.
  27. Fassó, A.; Finazzi, F. Maximum likelihood estimation of the dynamic coregionalization model with heterotopic data. Environmetrics 2011, 22, 735–748.
  28. Deutsch, C.V.; Journel, A.G. GSLib: Geostatistical Software Library and User’s Guide; Oxford University Press: New York, NY, USA, 1998.
  29. De Iaco, S.; Myers, D.E.; Palma, M.; Posa, D. FORTRAN programs for space–time multivariate modeling and prediction. Comput. Geosci. 2010, 36, 636–646.
  30. De Iaco, S.; Maggio, S.; Palma, M.; Posa, D. Towards an automatic procedure for modeling multivariate space–time data. Comput. Geosci. 2012, 41, 1–11.
  31. De Iaco, S.; Myers, D.E.; Palma, M.; Posa, D. Using Simultaneous Diagonalization to Identify a Space–Time Linear Coregionalization Model. Math. Geosci. 2013, 45, 69–86.
  32. De Iaco, S.; Posa, D.; Cappello, C.; Maggio, S. Isotropy, symmetry, separability and strict positive definiteness for covariance functions: A critical review. Spat. Stat. 2019, 29, 89–108.
  33. De Iaco, S.; Palma, M.; Posa, D. Choosing suitable linear coregionalization models for spatio-temporal data. Stoch. Environ. Res. Risk Assess. 2019, 33, 1419–1434.
  34. Goovaerts, P. Geostatistics for Natural Resources Evaluation; Oxford University Press: New York, NY, USA, 1997.
  35. Brown, P.E.; Kåresen, K.F.; Roberts, G.O.; Tonellato, S. Blur-generated nonseparable space–time models. J. R. Stat. Soc. Ser. B 2000, 62, 847–860.
  36. Guo, J.H.; Billard, L. Some inference results for causal autoregressive processes on a plane. J. Time Ser. Anal. 1998, 19, 681–691.
  37. Shitan, M.; Brockwell, P. An asymptotic test for separability of a spatial autoregressive model. Commun. Stat. Theory Method 1995, 24, 2027–2040.
  38. Mitchell, M.W.; Genton, M.G.; Gumpertz, M.L. Testing for separability of space–time covariances. Environmetrics 2005, 16, 819–831.
  39. Fuentes, M. Testing for separability of spatial-temporal covariance functions. J. Stat. Plan. Inference 2006, 136, 447–466.
  40. Scaccia, L.; Martin, R.J. Testing axial symmetry and separability of lattice processes. J. Stat. Plan. Inference 2005, 131, 19–39.
  41. de Luna, X.; Genton, M.G. Predictive spatio-temporal models for spatially sparse environmental data. Stat. Sin. 2005, 15, 547–568.
  42. Stein, M. Space–time covariance functions. J. Am. Stat. Assoc. 2005, 100, 310–321.
  43. De Iaco, S.; Palma, M.; Posa, D. A general procedure for selecting a class of fully symmetric space–time covariance functions. Environmetrics 2016, 27, 212–224.
  44. Lu, N.; Zimmerman, D.L. The likelihood ratio test for a separable covariance matrix. Stat. Probab. Lett. 2005, 73, 449–457.
  45. Lu, N.; Zimmerman, D.L. Testing for directional symmetry in spatial dependence using the periodogram. J. Stat. Plan. Inference 2005, 129, 369–385.
  46. Thiebaux, H.J. Spatial objective analysis. In Encyclopedia of Physical Science and Technology, 1990 Yearbook; Academic Press: New York, NY, USA, 1990; pp. 535–540.
  47. De Iaco, S.; Posa, D.; Myers, D.E. Characteristics of some classes of space–time covariance functions. J. Stat. Plan. Inference 2013, 143, 2002–2015.
  48. Cappello, C.; De Iaco, S.; Posa, D. Testing the type of non-separability and some classes of space–time covariance function models. Stoch. Environ. Res. Risk Assess. 2018, 32, 17–35.
  49. Cappello, C.; De Iaco, S.; Posa, D. covatest: An R Package for selecting a class of space–time covariance functions. J. Stat. Softw. 2020, 94, 1–42.
  50. Myers, D.E. The linear coregionalization and simultaneous diagonalization of the variogram matrix function. Sci. Terre 1995, 32, 125–139.
  51. Xie, T.; Myers, D.E. Fitting matrix-valued variogram models by simultaneous diagonalization: (Part I: Theory). Math. Geol. 1995, 27, 867–876.
  52. Illner, K.; Miettinen, J.; Fuchs, C.; Taskinen, S.; Nordhausen, K.; Oja, H.; Theis, F.J. Model selection using limiting distributions of second-order source separation algorithms. Signal Process. 2015, 113, 95–103.
  53. Cardoso, J.F.; Souloumiac, A. Jacobi angles for simultaneous diagonalization. SIAM J. Matrix Anal. Appl. 1996, 17, 161–164. [Google Scholar] [CrossRef] [Green Version]
  54. Miettinen, J.; Nordhausen, K.; Taskinen, S. Blind Source Separation Based on Joint Diagonalization in R: The Packages JADE and BSSasymp. J. Stat. Softw. 2017, 76, 1–31. [Google Scholar] [CrossRef] [Green Version]
  55. Cappello, C.; De Iaco, S.; Palma, M. Computational advances for spatio-temporal multivariate environmental models. Comput. Stat. 2022, 37, 651–670. [Google Scholar] [CrossRef]
  56. Chilés, J.; Delfiner, P. Geostatistics—Modeling Spatial Uncertainty; Wiley: Hoboken, NJ, USA, 1999. [Google Scholar]
  57. Journel, A.G.; Huijbregts, C.J. Mining Geostatistics; Academic Press: London, UK, 1981. [Google Scholar]
  58. Coletti, C.; Ciotoli, G.; Benà, E.; Brattich, E.; Cinelli, G.; Galgaro, A.; Massironi, M.; Mazzoli, C.; Mostacci, D.; Morozzi, P.; et al. The assessment of local geological factors for the construction of a Geogenic Radon Potential map using regression kriging. A case study from the Euganean Hills volcanic district (Italy). Sci. Total Environ. 2022, 808, 152064. [Google Scholar] [CrossRef] [PubMed]
  59. Strati, V.; Baldoncini, M.; Bezzon, G.P.; Broggini, C.; Buso, G.P.; Caciolli, A.; Callegari, I.; Carmignani, L.; Colonna, T.; Fiorentini, G.; et al. Total natural radioactivity, Veneto (Italy). J. Maps 2015, 11, 545–551. [Google Scholar] [CrossRef] [Green Version]
  60. Trotti, F.; Tanferi, A.; Lanciai, M.; Mozzo, P.; Panepinto, V.; Poli, S.; Predicatori, F.; Righetti, F.; Tacconi, A.; Zorzin, R. Mapping of Areas with Elevated Indoor Radon Levels in Veneto. Radiat. Prot. Dosim. 1998, 78, 11–14. [Google Scholar] [CrossRef]
  61. Li, T.Y. Diurnal variations of radon and meteorological variables near the ground. Bound.-Layer Meteorol. 1974, 7, 185–198. [Google Scholar] [CrossRef]
  62. Singh, M.; Ramola, R.C.; Singh, S.; Virk, H.S. The influence of meteorological parameters on soil gas radon. J. Assoc. Explor. Geophys. 1988, 9, 85–90. [Google Scholar]
  63. Porstendörfer, J. Properties and behaviour of radon and thoron and their decay products in the air. J. Aerosol Sci. 1994, 25, 219–263. [Google Scholar] [CrossRef]
  64. Yang, J.; Busen, H.; Scherb, H.; Hürkamp, K.; Guo, Q.; Tschiersch, J. Modeling of radon exhalation from soil influenced by environmental parameters. Sci. Total Environ. 2019, 656, 1304–1311. [Google Scholar] [CrossRef]
  65. Karstens, U.; Schwingshackl, C.; Schmithüsen, D.; Levin, I. A process-based 222radon flux map for Europe and its comparison to long-term observations. Atmos. Chem. Phys. 2015, 15, 12845–12865. [Google Scholar] [CrossRef] [Green Version]
  66. De Iaco, S.; Myers, D.E.; Posa, D. Space–time analysis using a general product-sum model. Stat. Probab. Lett. 2001, 52, 21–28. [Google Scholar] [CrossRef]
  67. Journel, A.G. Non-parametric estimation of spatial distribution. Math. Geol. 1983, 15, 445–468. [Google Scholar] [CrossRef]
  68. Li, B.; Genton, M.G.; Sherman, M. A nonparametric assessment of properties of space–time covariance functions. J. Am. Stat. Assoc. 2007, 102, 736–744. [Google Scholar] [CrossRef]
  69. De Iaco, S.; Palma, M.; Posa, D. Spatio-temporal geostatistical modeling for French fertility predictions. Spat. Stat. 2015, 14 Pt C, 546–562. [Google Scholar] [CrossRef]
  70. De Iaco, S.; Posa, D. Strict positive definiteness in geostatistics. Stoch. Environ. Res. Risk Assess. 2018, 32, 577–590. [Google Scholar] [CrossRef]
  71. Theil, H. Economic Forecasts and Policy; North-Holland: Amsterdam, The Netherlands, 1958; 567p. [Google Scholar]
  72. Cinelli, G.; De Cort, M.; Tollefsen, T. (Eds.) European Atlas of Natural Radiation; Publications Office of the European Union: Luxembourg, 2019. [Google Scholar]
  73. De Iaco, S.; Posa, D. Predicting spatio-temporal random fields: Some computational aspects. Comput. Geosci. 2012, 41, 12–24. [Google Scholar] [CrossRef]
  74. Muehlmann, C.; De Iaco, S.; Nordhausen, K. Blind recovery of sources for multivariate space–time environmental data. Stoch. Environ. Res. Risk Assess. 2023, 37, 1593–1613. [Google Scholar] [CrossRef]
Figure 1. (Left panel): map of Italian regions (study area in orange). (Middle panel): Veneto Provinces. (Right panel): location map of meteorological and radon sample points over the study area.
Figure 2. Colour maps of Rn flux (in KBq m$^{-2}$ s$^{-1}$), $T_M$ (in °C), $H_m$ (in %) and $ET_0$ (in mm) monthly averages calculated for (a) January, (b) April, (c) July and (d) October.
Figure 3. Box and whisker plots showing (a) Rn flux (in KBq m$^{-2}$ s$^{-1}$), (b) $T_M$ (in °C), (c) $H_m$ (in %) and (d) $ET_0$ (in mm) and their corresponding residual values (e–h), grouped by month. The symbol ∘ indicates values which lie more than 1.5 times the interquartile range below the first quartile or above the third quartile.
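The outlier rule quoted in the caption of Figure 3 can be illustrated with a minimal sketch. The function name `tukey_outliers` and the sample values are hypothetical, and the quartile convention (the "inclusive" method of Python's standard library) is an assumption: plotting software may compute quartiles slightly differently, so flagged points near the fences can differ.

```python
from statistics import quantiles


def tukey_outliers(values):
    """Return the values lying more than 1.5 * IQR below Q1 or above Q3."""
    data = [float(v) for v in values]
    # Quartiles with the 'inclusive' convention (an assumption of this sketch).
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in data if v < lower or v > upper]


# Hypothetical sample, not data from the study.
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0]
print(tukey_outliers(sample))
```

For this sample the quartiles are 3 and 7, so the fences sit at -3 and 13 and only the value 100 is flagged.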
Figure 4. Sample space–time direct covariance surfaces for the residuals of (a) Rn flux, (e) $T_M$, (h) $H_m$ and (j) $ET_0$, together with the cross-covariance surfaces of (b) Rn flux vs. $T_M$, (c) Rn flux vs. $H_m$, (d) Rn flux vs. $ET_0$, (f) $T_M$ vs. $H_m$, (g) $T_M$ vs. $ET_0$ and (i) $H_m$ vs. $ET_0$, computed on the basis of the estimator in (4).
Figure 5. Sample spatio-temporal covariance surfaces (on the left) and fitted models (on the right), for the basic components at (a) very small, (b) small, (c) medium and (d) large scale of spatio-temporal variability.
Figure 6. Box and whisker plots of sample non-separability ratios classified by spatial (on the left) and temporal (on the right) lags, computed for the basic components at (a) very small, (b) small, (c) medium and (d) large scale of spatio-temporal variability. The symbol * indicates values which lie more than 1.5 times the interquartile range below the first quartile or above the third quartile.
Figure 7. Prediction maps of Rn flux (in KBq m$^{-2}$ s$^{-1}$) monthly averages for (a) May, (b) August and (c) November 2022.
Figure 8. Risk maps of the probability that Rn flux predicted in August 2022 exceeds (a) the 25th percentile ($z_1 = 19.326$ KBq m$^{-2}$ s$^{-1}$), (b) the mean ($z_2 = 22.75$ KBq m$^{-2}$ s$^{-1}$) and (c) the median value ($z_3 = 23.994$ KBq m$^{-2}$ s$^{-1}$) of the corresponding historical measurements.
Table 1. Covariance model parameters estimated for basic components in (12). Each column refers to the $l$-th basic component.

| Parameter | $l = 1$ | $l = 2$ | $l = 3$ | $l = 4$ |
|---|---|---|---|---|
| $k_1$ | 0.0047 | 15.0756 | 15.1367 | 0.01 |
| $k_2$ | 0.0358 | 10.6321 | 30.703 | 17.0644 |
| $k_3$ | 0.0081 | 33.4605 | 0.01 | 2.9732 |
| $a$ (km) | 20 | 30 | 55 | 120 |
| $b$ (months) | 2 | 3 | 7 | 12 |
Table 2. Statistics for models’ performance assessment.

| Direct or cross-covariance | ST-LCM with 4 basic models: RAE | ST-LCM with 4 basic models: RMAE | ST-LCM with 1 basic model: RAE | ST-LCM with 1 basic model: RMAE |
|---|---|---|---|---|
| Rn flux | 0.156 | 0.152 | 0.839 | 1.018 |
| Rn flux vs. $T_M$ | 0.623 | 0.544 | 1.280 | 1.419 |
| Rn flux vs. $H_m$ | 0.454 | 0.426 | 0.422 | 0.377 |
| Rn flux vs. $ET_0$ | 0.697 | 0.717 | 1.353 | 1.588 |
| $T_M$ | 0.304 | 0.223 | 0.641 | 0.637 |
| $T_M$ vs. $H_m$ | 0.392 | 0.378 | 0.851 | 0.825 |
| $T_M$ vs. $ET_0$ | 0.495 | 0.497 | 0.154 | 0.130 |
| $H_m$ | 0.857 | 0.886 | 1.572 | 2.219 |
| $H_m$ vs. $ET_0$ | 0.916 | 0.898 | 2.249 | 2.377 |
| $ET_0$ | 0.969 | 0.748 | 2.110 | 2.302 |
Table 3. Statistics of the performances of prediction methods.

| Statistic | Cokriging with ST-LCM in (11) | Cokriging with ST-LCM in (13) | Kriging |
|---|---|---|---|
| Correlation coefficient | 0.909 * | 0.792 * | 0.787 * |
| RAE | 0.222 | 0.311 | 0.325 |
| RMAE | 0.124 | 0.246 | 0.145 |

* Correlation is significant at 1% level.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

De Iaco, S.; Cappello, C.; Congedi, A.; Palma, M. Multivariate Modeling for Spatio-Temporal Radon Flux Predictions. Entropy 2023, 25, 1104. https://doi.org/10.3390/e25071104
