1. Introduction
The problem of determining the location and intensity of a radiation source arises in several settings including emergency response to mitigate nuclear threats, structural and nuclear health monitoring in nuclear reactors, and environmental cleanup of biomedical and industrial nuclear waste. In this paper, we consider the development of a robust sensor network design for determining the location and intensity of a radiation source in a simulated urban environment. Specifically, we consider source localization in a simulated 250 m × 180 m block in downtown Washington, DC.
There are several difficulties that are intrinsic to this source localization problem. The first is that inverse problems of this nature are inherently ill-posed and require some form of regularization to obtain reasonable approximate solutions [
1]. This difficulty is exacerbated by the fact that sensor observations are often coarsely spaced, so one cannot estimate source attributes that vary on scales finer than the grid spacing. As detailed in [
2], this can yield erroneous results if ignored.
The computational complexity of deterministic [
3] and stochastic, Monte Carlo [
4] radiation transport models poses a second challenge since it limits the number of model realizations that can be obtained for optimization, or Bayesian or frequentist inference. This has led to the development of alternative parameterizations or surrogate models. For example, in [
5] the authors modeled the radiation source as a point gamma source and employed a physics-based parameterization of gamma particle transport. A fast radiation transport model is also available as a component of Synth, a gamma-ray simulation code written by Pacific Northwest National Laboratory [
6,
7]. In [
8], the authors employ a Gaussian mixture to model the radiation field.
In [
9], we addressed challenges associated with optimization and Bayesian inference as a prelude to the robust sensor design problem considered in this paper. Specifically, we implemented a fast piecewise-continuously differentiable radiation transport model and solved the associated inverse problem using combined global [
10,
11] and local [
12] optimization algorithms, and Bayesian inference techniques [
13,
14]. As in [
9], we assume here that the threat is a point source and that the model accounts only for photons that travel directly from source to detector, with no intervening collisions. The radiation source is parameterized with three components: its 2-D location coordinates and intensity.
To improve computational efficiency and permit gradient-based optimization and the implementation of a robust design algorithm [
15], we implement and verify a continuously differentiable surrogate model based on radial basis functions to approximate the response for all possible detector locations. We employ this surrogate for subsequent optimization and robust design.
There are three main strategies for taking measurements in general applications. The first seeks locations for a given number of stationary sensors, the second relies on moving sensors, and the third, termed scanning, activates only a subset of a given number of stationary sensors at any given moment. Existing methods for identifying sensor locations usually employ random field analysis [
16], information theory [
17] and optimum experimental design theory [
15]. Moreover, sensor placement algorithms can be classified as discrete or continuous depending on the nature of the search space. In the context of nuclear source identification, Michaud [
18] used Gaussian process optimization [
19] to solve a continuous detector placement problem and Schmidt [
20] applied Shannon entropy [
21,
22,
23] to guide mobile sensors over a discrete grid of possible measurement sites.
To form a basis for comparing different networks, a quantitative measure of efficiency is required. In this study, we explore criteria applied in optimum experimental design [
15] to solve a discrete stationary detector placement problem. These criteria are defined in terms of the Fisher information matrix associated with the unknown characteristics of the source. One of the main difficulties in optimizing sensor locations is that the optimal solutions depend on the unknown true values of the source characteristics or on prior approximations of them. To remove this dependency, we employ a robust design strategy based on maximizing the expectation of the corresponding local optimality criterion over the domain of source characteristics. We then transform the resulting stochastic optimization problem into a combinatorial optimization problem by generating a finite set of possible detector locations. We solve the combinatorial optimization problem and test the resulting optimal network of sensors against randomly selected networks. The surrogate model makes it feasible to solve the combinatorial optimization problem, which would otherwise be computationally intractable.
The remainder of the paper is organized as follows. In
Section 2, we discuss the radiation transport model and radial basis function surrogate along with associated statistical models. We also describe the domain geometry. The inverse problem based on the surrogate models is formulated in
Section 3. In
Section 4, we present the theoretical framework of the robust design in the average sense employed in this paper. In
Section 5, we present the numerical solution of the robust design problem and compare the optimal network performance to randomly selected networks. We draw conclusions in
Section 6. The Particle Swarm algorithm used to solve the inverse problem is summarized in
Appendix A.
2. Radiation Transport Model and Surrogate Formulations
Gamma transport phenomena, as derived from Boltzmann transport theory, can be modeled by the partial differential equation (PDE)
Here
I and
S respectively denote the gamma intensity per unit area and external gamma source in the medium characterized by the position vector
$\mathbf{r}$, energy
E, and unit vector in the direction of the gamma
$\widehat{\Omega}$. The parameters include the total macroscopic cross-section for gamma interactions
${\Sigma}_{t}$, and the double-differential macroscopic scattering cross-section
${\Sigma}_{s}$, which depends on the change in gamma energy from incident energy
${E}^{\prime}$ to emergent energy
E (i.e.,
${E}^{\prime}\to E$) and the change in gamma direction from incident direction
${\Omega}^{\prime}$ to
$\Omega $ (i.e.,
${\Omega}^{\prime}\to \Omega $). We refer readers to Shultis and Faw [
3] for a more detailed treatment of transport theory.
The problem of inferring the radiation source location and intensity from sensor measurements requires evaluating the Boltzmann radiation transport model (1) at many points in the admissible parameter space. Numerically solving the PDE (1) is computationally expensive even on high-performance computing (HPC) systems. Solving an inverse problem constrained by Equation (1), or propagating uncertainties forward using Monte Carlo simulations, is therefore not computationally feasible, since both require solving Equation (1) many times.
Instead, we employ a model that only considers gamma rays that travel directly from source to detectors, without taking into account photons that incur collisions. This approach relies on the assumption that photons undergoing interactions in the medium have a very small probability of ever arriving at a detector. We also assume that the physical scale of our problem is sufficiently large so that both the source and detectors can be localized to points inside the domain. We will denote the location of the source as
${\mathbf{r}}_{s}$ and associated intensity by
${S}_{0}$.
${S}_{0}$ can be treated as time-independent. Most radionuclides of interest for source search have half-lives on the order of several years to tens-of-thousands of years. Consequently, radioactive decay of the source is insignificant during the measurement. Under these assumptions, Equation (
1) can be simplified to
where $\widehat{\Omega}$ is a unit vector pointing in the direction of travel of the gamma rays, ${E}_{0}$ is the source emission energy, and $\delta$ denotes the Dirac delta function; see [3] for more details. Equation (2) can be solved to determine the intensity of photons arriving at any point $\mathbf{r}$ inside the domain. This enables the computation of the count rate measured by the
i-th detector
${D}_{i}$ assuming that detectors are point detectors with face area
${A}_{i}$ and dwell time
$\Delta {t}_{i}$. The detector intrinsic efficiency
${\epsilon}_{i}\in [0,1]$ is usually known in practice.
If the
$i$-th detector is located at point
${\mathbf{r}}_{d}^{i}$, the solution
of Equation (
2) predicts the number of counts observed by the sensor given the location and intensity
$\mathit{\theta}=({\mathbf{r}}_{s},{S}_{0})$ of the source. Here $\mathcal{X}$ denotes the space of all possible sources and $\mathbb{R}$ the one-dimensional real coordinate space. The derivation of model response (
3) follows in a manner similar to that shown in Shultis and Faw [
3] (Chapter 10.1.3), where the resulting solution is evaluated at the detector location
${\mathbf{r}}_{d}^{i}$.
2.1. Model Geometry
To provide an example of an urban area, we selected a
$250\phantom{\rule{3.33333pt}{0ex}}\mathrm{m}\times 180\phantom{\rule{3.33333pt}{0ex}}\mathrm{m}$ block in downtown Washington, D.C., located at approximately
${38}^{\circ}54$′48″ N by
${77}^{\circ}1$′60″ W (Johnson Avenue NW) to serve as our domain. Buildings in this area are primarily brick and concrete residential housing and are generally 1–5 stories in height. Using data from the OpenStreetMap database (
https://www.openstreetmap.org/), we constructed a 2-D representation of the area to serve as the test geometry. Our implementation treats the buildings as a set of disjoint polygons
${P}_{j},\phantom{\rule{3.33333pt}{0ex}}j=1,2,\dots ,{N}_{g}$, each of which is assigned a corresponding macroscopic cross-section
${\Sigma}_{t}$. A satellite photo of the area with an overlay of the constructed representation is provided in
Figure 1.
Approximate calculations indicate that wood and concrete buildings correspond to an optical thickness of around 3 mean free paths (MFPs), where the mean free path denotes the mean distance traveled by the photons between collisions with atoms of the building. Consequently, we randomly selected cross-sections for each building so that their optical thickness is between 1 and 5 MFPs. The random sampling was also weighted according to the volume of each building, so that smaller buildings were biased towards smaller optical thicknesses and vice versa. The regions between buildings were treated as dry air at standard temperature and pressure, with cross-sections taken from the NIST XCOM database (
http://www.nist.gov/pml/data/xcom/).
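For this geometry, the admissible parameter space is
$$\mathcal{X}=[0,250]\,\mathrm{m}\times [0,180]\,\mathrm{m}\times [5\times {10}^{8},5\times {10}^{10}]\,\mathrm{Bq}.\qquad (4)$$
The first two dimensions define the spatial location within the simulated 250 m × 180 m urban block. The third dimension restricts the source intensity to lie between $5\times {10}^{8}$ and $5\times {10}^{10}$ Bq.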
2.2. Numerical Model for Detector Response
To determine the intensity of photons arriving at a given detector location
${\mathbf{r}}_{d}^{i}$, the algorithm employs a simple ray-tracing scheme. Starting at the location of the source
${\mathbf{r}}_{s}$, we draw a ray from
${\mathbf{r}}_{s}$ to
${\mathbf{r}}_{d}^{i}$. We then compute the intersection of this ray with the disjoint polygons
${P}_{j},\phantom{\rule{3.33333pt}{0ex}}j=1,2,\dots ,{N}_{g},$ representing the set of buildings in our domain. This yields a series of line segments expressing the path traversed in each region. We assume that a given ray intersects
${N}_{\ell}$ polygons,
${N}_{\ell}<{N}_{g}$, and let
$\mathcal{L}={\left\{({\ell}_{j},{\Sigma}_{T}^{(j)})\right\}}_{j=1}^{{N}_{\ell}}$ be the set of all intersecting segments, where
${\ell}_{j}$ is the Euclidean length of the
j-th segment and
${\Sigma}_{T}^{(j)}$ is the corresponding value for the macroscopic total cross-section. With this assumption, Equation (
3) takes the form
Equation (
5) provides an analytic expression estimating the expected detector response, and its computation primarily requires the intersection of lines with the model geometry. Equation (
5) represents a significant simplification to the solution of (
1), a nonlinear PDE with seven independent variables whose solution in complex geometries can require many hours even on a supercomputer. We implemented the numerical model (
5) in a short Python program that uses the Shapely library (
https://pypi.python.org/pypi/Shapely) for performing the computational geometry calculations. The model takes as input a specification of polygons representing the different regions of the domain, cross-section data, detector locations, source intensity, and source location.
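The ray-tracing computation described above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the standard uncollided form in which the response is proportional to $S_0 \exp(-\sum_j \Sigma_T^{(j)} \ell_j) / (4\pi d^2)$, it approximates buildings by axis-aligned rectangles so that only the Python standard library is needed (the implementation in the paper uses general polygons via Shapely), and the function and parameter names are hypothetical.

```python
import math


def chord_length(p0, p1, rect):
    """Length of the segment p0 -> p1 that lies inside an axis-aligned rectangle.

    rect = (xmin, ymin, xmax, ymax); uses Liang-Barsky parametric clipping.
    """
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = x1 - x0, y1 - y0
    xmin, ymin, xmax, ymax = rect
    t_in, t_out = 0.0, 1.0
    for p, q in ((-dx, x0 - xmin), (dx, xmax - x0),
                 (-dy, y0 - ymin), (dy, ymax - y0)):
        if p == 0.0:
            if q < 0.0:
                return 0.0  # parallel to this edge and outside the rectangle
            continue
        t = q / p
        if p < 0.0:
            t_in = max(t_in, t)    # ray enters through this edge
        else:
            t_out = min(t_out, t)  # ray leaves through this edge
    if t_in >= t_out:
        return 0.0
    return (t_out - t_in) * math.hypot(dx, dy)


def detector_response(source_xy, s0, detector_xy, buildings, sigma_air=0.0,
                      area=1.0, efficiency=1.0, dwell=1.0):
    """Expected uncollided counts at a point detector (assumed standard form).

    buildings: list of (rect, sigma) pairs, sigma in 1/m (hypothetical layout).
    """
    dist = math.dist(source_xy, detector_xy)
    optical_depth = 0.0
    length_in_buildings = 0.0
    for rect, sigma in buildings:
        ell = chord_length(source_xy, detector_xy, rect)
        optical_depth += sigma * ell
        length_in_buildings += ell
    # Whatever is not inside a building is treated as air.
    optical_depth += sigma_air * (dist - length_in_buildings)
    geometric = area / (4.0 * math.pi * dist ** 2)
    return s0 * efficiency * dwell * geometric * math.exp(-optical_depth)


# One building with Sigma = 0.5 1/m crossed over 3 m gives 1.5 mean free paths.
example = detector_response((0.0, 0.0), 4.0 * math.pi * 100.0, (10.0, 0.0),
                            [((2.0, -1.0, 5.0, 1.0), 0.5)])
```

The kinks in the response arise because the set of intersected buildings, and hence the optical depth, changes abruptly as the source or detector moves across building edges.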
2.3. Statistical Model
To construct statistical models associated with
N detectors, we consider a background with constant expected intensity
B. We denote by
${\mathit{\theta}}_{0}$ the true location and intensity of the radiation source. Radioactive decay and detection are well known to be Poisson random processes. Including these Poisson random effects for each of the
N available detectors, we obtain the statistical model
associated with the
$i$-th detector response,
$i=1,\dots ,N$. The Poisson distribution with mean
is denoted by
$\mathtt{P}$. For large numbers (>30) of observed photons, the Poisson distribution is adequately approximated by a normal distribution, yielding the approximate statistical model
where
${({\sigma}_{i}^{o})}^{2}={F}_{i}({\mathit{\theta}}_{0})$; i.e., with variance equal to the mean. This is equivalent to
In this manner, we model the observations associated with each detector as random variables ${\mathrm{Y}}_{i},\phantom{\rule{3.33333pt}{0ex}}i=1,\dots ,N$.
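To illustrate the normal approximation used above, the following minimal sketch compares the Poisson probability mass function with the density of a normal distribution whose variance equals its mean, and draws approximate observations. The helper names and the mean of 100 counts are hypothetical choices for illustration.

```python
import math
import random


def poisson_pmf(k, mean):
    """Poisson probability of observing k counts, evaluated in log space."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def normal_pdf(x, mean, variance):
    """Density of N(mean, variance) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * variance)) / math.sqrt(
        2.0 * math.pi * variance)


def sample_counts_normal(mean_counts, rng):
    """One draw from N(mean, mean), the large-count approximation."""
    return rng.gauss(mean_counts, math.sqrt(mean_counts))


mean_counts = 100.0  # hypothetical expected counts at one detector
exact = poisson_pmf(100, mean_counts)
approx = normal_pdf(100.0, mean_counts, mean_counts)

rng = random.Random(0)
draws = [sample_counts_normal(mean_counts, rng) for _ in range(10000)]
sample_mean = sum(draws) / len(draws)
```

At this count level the two densities agree to well under one percent at the mode; the agreement degrades for small expected counts, which is why the approximation is restricted to large numbers of observed photons.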
2.4. Radial Basis Function Surrogate Model
Due to the presence of the buildings, the model response (
7) is non-differentiable with respect to both position and intensity. To apply sensitivity analysis to determine an optimal sensor configuration, smoothness of the model responses must be assured. To address these issues and reduce computational times, we used radial basis functions to provide continuously differentiable approximations of the model responses (
7).
Radial basis function methods provide interpolants to sampled values associated with irregularly positioned points inside the input domain. A radial basis function approximation of the model response
${F}_{i}(\mathit{\theta})$ has the formulation
where
θ denotes a source in the domain
$\mathcal{X}$,
$\psi :\mathbb{R}\to \mathbb{R}$ is a radial basis function and
$\epsilon $ is a shape parameter. Possible choices of radial basis functions
$\psi $ include multiquadrics and their inverse formulations, Gaussian functions, and thin plate splines. A more comprehensive list can be found in [
14,
24]. We employ Gaussian radial basis functions. The coefficients
${\lambda}_{k}$ are computed by requiring that
${\tilde{F}}_{i}({\mathit{\theta}}_{k})={F}_{i}({\mathit{\theta}}_{k}),k=1,\dots ,\mathcal{L},$ where
${\mathit{\theta}}_{k}$ are selected to cover the entire domain
$\mathcal{X}$ and
$\mathcal{L}$ is the number of interpolation points. We employed the MATLAB radial basis function toolbox based on Cholesky factorization and Tikhonov regularization. We also tested other approximations of the model response based on Legendre and Lagrange polynomials and on Gaussian processes. Our results (not shown here) indicated that the radial basis function approximation had the best accuracy for our application.
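The interpolation conditions above lead to a linear system for the coefficients ${\lambda}_{k}$. The following minimal sketch illustrates this for a Gaussian basis, assuming the common convention $\psi(\epsilon r)=\exp(-(\epsilon r)^2)$ and an optional Tikhonov term added to the diagonal. It is not the MATLAB toolbox used in the paper: the dense Gaussian-elimination solver, the function names, and the one-dimensional non-smooth test function are all hypothetical choices made to keep the example self-contained.

```python
import math


def gaussian_rbf(r, eps):
    """Gaussian radial basis function with shape parameter eps (assumed form)."""
    return math.exp(-(eps * r) ** 2)


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_rbf(centers, values, eps, reg=0.0):
    """Coefficients lambda_k from (Psi + reg * I) lambda = values."""
    n = len(centers)
    a = [[gaussian_rbf(math.dist(centers[i], centers[j]), eps)
          for j in range(n)] for i in range(n)]
    for i in range(n):
        a[i][i] += reg  # Tikhonov regularization; reg = 0 interpolates exactly
    return solve_linear(a, values)


def eval_rbf(theta, centers, weights, eps):
    """Evaluate the surrogate sum_k lambda_k psi(eps * ||theta - theta_k||)."""
    return sum(w * gaussian_rbf(math.dist(theta, c), eps)
               for w, c in zip(weights, centers))


# Smooth surrogate of a function with a kink at 0.5.
centers = [(0.0,), (0.25,), (0.5,), (0.75,), (1.0,)]
values = [abs(c[0] - 0.5) for c in centers]
weights = fit_rbf(centers, values, eps=4.0)
```

With `reg=0.0` the surrogate reproduces the sampled values at the interpolation points; a small positive `reg` trades exact interpolation for better conditioning of the linear system.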
2.5. Surrogate Statistical Model
The analysis of interpolation error relies on smoothness properties of the map being approximated. In our case, neither such properties nor error bounds are directly available. Instead, we assume that the errors of the surrogate model responses at the true source can be modeled as normal random variables
${\epsilon}_{i}^{m}\sim \mathcal{N}(0,{({\sigma}_{i}^{m})}^{2})$, yielding the statistical model
with
${\sigma}_{i}^{m}$ being the standard deviation of the surrogate model response errors.
A statistical model incorporating both model errors
${\epsilon}_{i}^{m}$ and observation errors
${\epsilon}_{i}^{o}$ introduced in (
9), is
Assuming the independence of model errors and observation errors, and exploiting the fact that the sum of independent normal random variables is also a normal random variable, (
12) can be expressed as
where
${\tilde{\epsilon}}_{i}\sim \mathcal{N}(0,{\sigma}_{i}^{2}),$ and
${\sigma}_{i}^{2}={({\sigma}_{i}^{o})}^{2}+{({\sigma}_{i}^{m})}^{2}$.
5. Numerical Examples
As detailed in
Section 3, the problem under investigation consists of identifying the location and intensity of a radiation source in a simulated downtown Washington, DC block with minimum error with respect to the true source location and intensity. The solution to this problem can be obtained by applying optimal sensor location strategies [
15]. Instead of applying a local optimal design method, whose solution depends on some
a priori estimate of the true source, we propose a robust design strategy to remove this dependency. Specifically, we propose a ‘compromise’ design where the obtained network is good enough (in a least-error sense) to identify any possible source from the admissible domain
$\mathcal{X}$.
As detailed in (
4), we take the admissible parameter space to be
$\mathcal{X}=[0,250]\,\mathrm{m}\times [0,180]\,\mathrm{m}\times [5\cdot {10}^{8},5\cdot {10}^{10}]\,\mathrm{Bq}$. Next we specify the set of all possible detector locations to be a discrete set of 30 spatial positions. By sampling from a uniform distribution, we generate the possible detector locations in the domain, denoted by diamond marks in
Figure 2. In this way, we avoid the problem of overlapping sensors encountered for a continuous formulation. The specific dispersal pattern was selected to spread the detectors evenly throughout the area. We assume that detectors have facial areas
${A}_{i}$, with 3-inch diameters and 3-inch lengths, for incident gamma energy of 662 keV. This is standard packaging for sodium iodide (NaI) scintillators, which have an intrinsic efficiency of
${\epsilon}_{i}=62\%$ for 662 keV gammas. The dwell time
$\Delta {t}_{i}$ for all detectors was chosen to be 1 s.
Finally, we set the size of the network to 10 detectors and formulate the robust design problem in the average sense:
Find the network ${\mathit{\xi}}_{10}^{*}=\{{\mathit{D}}_{i},\,i=1,\dots ,10\}$ consisting of 10 detectors out of the 30 possible detector locations depicted in Figure 2 that solves
By using a sufficiently large number of sources, the integral in (
27) can be accurately approximated. To evaluate the efficiency of the robust design network, we employ the metric
where
${\mathit{\theta}}_{0}^{\ell},\,\ell =1,\dots ,\mathcal{M}$, are
$\mathcal{M}$ distinct true radiation sources. For each true source
${\mathit{\theta}}_{0}^{\ell}$, we can compute the associated estimate
${\widehat{\tilde{\mathit{\theta}}}}_{{\mathit{\xi}}_{10}}$ as the solution of the problem (
15) using the network
${\mathit{\xi}}_{10}$. The score (
28) corresponding to the robust design
${\mathit{\xi}}_{10}^{*}$ will be tested against scores obtained by randomly selected networks
${\mathit{\xi}}_{10}$ of 10 detectors. The smallest score should be obtained by the robust design
${\mathit{\xi}}_{10}^{*}$, which would validate the approach.
The discrete nature of the space of all possible detector locations transforms (
27) into a combinatorial optimization problem. The number of possible networks is $\binom{30}{10}=30{,}045{,}015$, the number of ways to choose 10 of the 30 candidate sensor locations. By requiring that each possible network
${\mathit{\xi}}_{10}$ contains only one detector out of the three possible choices from each of the ten rectangular areas shown in
Figure 2, we decrease the number of possible networks to ${3}^{10}=59{,}049$. This makes the combinatorial problem computationally feasible.
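The two network counts can be checked directly, and the constrained set is small enough to enumerate. In the minimal sketch below, the assignment of candidate indices to the ten rectangular areas (three consecutive indices per area) is a hypothetical labeling chosen for illustration; the actual grouping is the one shown in Figure 2.

```python
import math
from itertools import product

# Unconstrained design space: any 10 of the 30 candidate locations.
n_unconstrained = math.comb(30, 10)

# Constrained design space: exactly one candidate from each of 10 areas.
# Hypothetical labeling: area g holds candidates 3g, 3g + 1, 3g + 2.
areas = [(3 * g, 3 * g + 1, 3 * g + 2) for g in range(10)]
networks = list(product(*areas))

print(n_unconstrained, len(networks))
```

Each element of `networks` is a tuple of ten candidate indices, so an exhaustive search only needs to score 59,049 tuples instead of roughly 30 million subsets.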
In
Figure 3, we plot the model response
${F}_{8}$ (
7) corresponding to the sensor location
${\mathit{D}}_{8}$ in
Figure 2 and a source intensity of
$3.5\times {10}^{9}$ Bq. The source location is varied inside the domain and the non-smooth nature of the model response is observed. The Fisher information matrix requires that the model response be differentiable with respect to the source location and intensity. This motivates replacing the model responses for all 30 possible locations with the differentiable radial basis function surrogate model described in
Section 2.4.
We employ radial basis interpolation (
10) to generate 30 surrogate model responses, one for each possible detector location. We use 29,791 interpolation points distributed inside the domain
$\mathcal{X}$. Specifically, for each of the three dimensions, we selected 31 points evenly distributed over the corresponding interval, giving ${31}^{3}=29{,}791$ points. For each possible detector location and source
${\mathit{\theta}}_{k}$, the response model
${F}_{i}$ (
7) was used to calculate the corresponding interpolation points
$({\mathit{\theta}}_{k},{F}_{i}({\mathit{\theta}}_{k})),\phantom{\rule{3.33333pt}{0ex}}k=1,\dots $,29,791,
$i=1,\dots ,30$. We tested several values of the shape parameter
$\epsilon $ and the most accurate surrogate models were obtained for
${\epsilon}^{p}=8.06$.
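The tensor-product layout of the interpolation points can be sketched as below. This is an illustration under stated assumptions: the text does not say whether the interval endpoints are included or whether the intensity axis is spaced linearly or logarithmically, so the sketch assumes linear spacing with endpoints included, and the helper name `linspace` is hypothetical.

```python
from itertools import product


def linspace(lo, hi, n):
    """n evenly spaced points from lo to hi, endpoints included (assumption)."""
    return [lo + (hi - lo) * k / (n - 1) for k in range(n)]


xs = linspace(0.0, 250.0, 31)          # source x-coordinate in m
ys = linspace(0.0, 180.0, 31)          # source y-coordinate in m
intensities = linspace(5e8, 5e10, 31)  # source intensity in Bq

# Every combination (x, y, S0) is one interpolation point theta_k.
grid = list(product(xs, ys, intensities))
print(len(grid))
```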
In
Figure 4a, we compare the surrogate model predictions against the outputs of model response
${F}_{8}$ for
${\mathit{D}}_{8}$ and 100 different sources uniformly randomly sampled from
$\mathcal{X}$. These sources were not included in the training set. The red curve corresponds to the model response outputs whereas the blue curve denotes the surrogate model predictions.
Figure 4b illustrates the relative errors of the predicted intensities.
To compute accurate variances for each of the 30 surrogate model responses, we generated a data set of
${10}^{5}$ different sources sampled uniformly at random from the domain
$\mathcal{X}$. The root mean square errors (RMSE) are shown in
Figure 5 for all 30 surrogate models. The largest error is observed for the surrogate model response associated with
${\mathit{D}}_{3}$. We note that whereas the source intensity ranges between
$5\times {10}^{8}$ and
$5\times {10}^{10}$ Bq, the largest RMSE is on the order of
$4.2\times {10}^{6}$ counts per second (cps). The discrepancies between the outputs of the models
${\tilde{F}}_{i}$ and
${F}_{i}$ are then used to compute variances
${({\sigma}_{i}^{m})}^{2},\phantom{\rule{3.33333pt}{0ex}}i=1,\dots ,30$.
Next we generated observations for all possible networks
${\mathit{\xi}}_{10}^{\ell},\ell =1,\dots $, 59,049, using statistical model (
9) based on the model response discussed in
Section 2.2. The observation errors associated with each detector and source
θ are normally distributed with mean 0 and variance
${({\sigma}_{i}^{o})}^{2}={F}_{i}(\mathit{\theta}),\phantom{\rule{3.33333pt}{0ex}}i=1,\dots ,30$.
The robust design problem solution is given by the network
${\mathit{\xi}}_{10}$ associated with the largest score
${\Gamma}_{ED}$ (
27). To determine its maximum value, we calculate the associated Fisher information matrix for all possible networks
${\mathit{\xi}}_{10}^{\ell},\phantom{\rule{3.33333pt}{0ex}}\ell =1,\dots $, 59,049 and a collection of possible sources
${\mathit{\theta}}_{\ell},\ell =1,\dots ,9880$ spread throughout the domain
$\mathcal{X}$. This allows us to estimate the integral in (
27). The Fisher information matrix depends on the derivatives of the surrogate models with respect to the source characteristics and on the variances
${({\sigma}_{i})}^{2}={({\sigma}_{i}^{o})}^{2}+{({\sigma}_{i}^{m})}^{2},\phantom{\rule{3.33333pt}{0ex}}i=1,\dots ,30$. The gradients
$\frac{\partial {\tilde{F}}_{i}}{\partial {\mathit{\theta}}_{l}}$ are computed from (
10) knowing that
$\psi $ is the Gaussian radial basis function and
${\parallel \cdot \parallel}_{2}$ is the Euclidean norm. These sources
${\mathit{\theta}}_{\ell},\ell =1,\dots ,9880$, differ from those used for constructing the surrogate models and are uniformly distributed over the entire domain. This spatial distribution was employed since we selected
$p(\theta )$ to be the uniform distribution over
$\mathcal{X}$ in score
${\Gamma}_{ED}$ (
27).
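The computation described in this paragraph can be sketched as follows, under assumptions that should be read carefully. The Gaussian surrogate is taken as $\sum_k \lambda_k \exp(-\epsilon^2 \|\theta-\theta_k\|_2^2)$, whose gradient is available in closed form. The Fisher information matrix is assembled in the common form $\sum_i \sigma_i^{-2}\,\nabla \tilde{F}_i \nabla \tilde{F}_i^{T}$, which treats the variances as fixed. The determinant is used as the local criterion purely for illustration, since the criterion behind ${\Gamma}_{ED}$ is defined in a section not reproduced here. All function names are hypothetical.

```python
import math


def rbf_value_and_grad(theta, centers, weights, eps):
    """Value and analytic gradient of a Gaussian RBF surrogate at theta."""
    value = 0.0
    grad = [0.0] * len(theta)
    for w, c in zip(weights, centers):
        diff = [t - ck for t, ck in zip(theta, c)]
        phi = math.exp(-eps ** 2 * sum(d * d for d in diff))
        value += w * phi
        for l, d in enumerate(diff):
            grad[l] += -2.0 * eps ** 2 * d * w * phi
    return value, grad


def fisher_information(grads, variances):
    """Sum over detectors of grad grad^T / sigma^2 (variances held fixed)."""
    p = len(grads[0])
    m = [[0.0] * p for _ in range(p)]
    for g, var in zip(grads, variances):
        for a in range(p):
            for b in range(p):
                m[a][b] += g[a] * g[b] / var
    return m


def determinant(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    return det


def average_criterion(grads_per_source, variances_per_source):
    """Sample mean of det(FIM) over sources drawn uniformly from the domain."""
    dets = [determinant(fisher_information(g, v))
            for g, v in zip(grads_per_source, variances_per_source)]
    return sum(dets) / len(dets)
```

Averaging the local criterion over uniformly sampled sources is a Monte Carlo estimate of the integral against the uniform density $p(\theta)$; scoring one network then amounts to calling `average_criterion` with the gradients and variances of its ten detectors at each sampled source.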
The values
${\Gamma}_{ED}({\mathit{\xi}}_{10}^{\ell})$ are computed for all possible networks and the results are shown in
Figure 6a. Allowing each network to include only one detector out of the three possible choices over each of the ten rectangular areas—see
Figure 2—likely explains the periodic behavior. The optimization problem does not have a unique solution as seen from the expected values. Three different networks produce the largest score.
Figure 6b shows the detector locations of one of these three networks, the one with index 35,714.
To test the robust design network, we evaluate Formula (
28) for the robust design and for the 11 randomly selected sensor networks plotted in
Figure 7. Next we set
$\mathcal{M}=50$ and select 50 true sources uniformly at random
${\mathit{\theta}}_{0}^{\ell}$ from
$\mathcal{X}$. Observations were then generated using the statistical model (
9) for each source
${\mathit{\theta}}_{0}^{\ell},\,\ell =1,\dots ,50$, and each network of sensors, including the optimal one. We then solve the inverse problem (
15) using the Particle Swarm algorithm [
10] detailed in the
Appendix A.
The Particle Swarm approach is a global, meta-heuristic optimization algorithm motivated by social-psychological principles [
27]. It was originally introduced in [
10] and was designed to imitate social behavior such as the movement of birds in a flock or fish in a shoal. The algorithm was later simplified, and its performance in solving optimization problems was reported in [
28].
For our example, we set the inertia parameter to $1.1$ and the neighborhood size of each particle to 4. The self and social adjustment coefficients ${y}_{1}$ and ${y}_{2}$ are both set to $1.49$. We use a swarm size of 70 and solve the inverse problem for each source ${\mathit{\theta}}_{0}^{\ell},\,\ell =1,\dots ,50,$ and each of the 12 networks (the 11 randomly selected networks plus the optimal one).
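For readers unfamiliar with the method, a minimal particle swarm sketch is given below. It is a simplified variant and not the algorithm of Appendix A: it uses a single global best rather than neighborhoods of 4 particles, and a fixed inertia of 0.7 (a hypothetical choice, since this simple variant does not adapt the inertia and a fixed value of 1.1 would not settle). The swarm size of 70 and the coefficients of 1.49 follow the text; the quadratic test objective stands in for the inverse problem misfit.

```python
import random


def particle_swarm(objective, bounds, swarm_size=70, iters=200,
                   inertia=0.7, c_self=1.49, c_social=1.49, seed=0):
    """Global-best particle swarm minimization over a box (simplified sketch)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    best_pos = [p[:] for p in pos]             # best point seen by each particle
    best_val = [objective(p) for p in pos]
    g = min(range(swarm_size), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # best point seen by the swarm
    for _ in range(iters):
        for i in range(swarm_size):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_self * r1 * (best_pos[i][d] - pos[i][d])
                             + c_social * r2 * (g_pos[d] - pos[i][d]))
                # Move and clip to the admissible box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def misfit(theta):
    """Hypothetical smooth stand-in for the inverse problem objective."""
    return (theta[0] - 3.0) ** 2 + (theta[1] + 1.0) ** 2 + (theta[2] - 0.5) ** 2


estimate, value = particle_swarm(misfit, [(-10.0, 10.0)] * 3)
```

Because the method only requires objective evaluations, it can be applied to the non-smooth model response as well as to the smooth surrogate.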
The errors of the inverse problem solutions (i.e., the estimated sources characteristics) are shown in
Figure 8 for all considered networks and sources. We note that the solution obtained using the optimal network does not have the smallest error for every source. For example, for source number 28, the errors in the source characteristics obtained using the optimal network are larger than the estimation errors associated with all the random networks except Network 4. This is not unexpected, since the optimal design was obtained following an average sense formulation.
Next, the errors of the inverse problem solutions are averaged over the entire set of sources and the results of Formula (
28) are illustrated in
Figure 9 for all 12 considered networks. The index associated with the optimal network is 12 and corresponds to the smallest RMSE. This result suggests that we were able to identify the robust design in the average sense for the nuclear transport inverse problem.