Article

A Probabilistic Approach for Stereo 3D Point Cloud Reconstruction from Airborne Single-Channel Multi-Aspect SAR Image Sequences

1 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
2 Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, China
3 School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
4 School of Electronic Information Engineering, North China University of Technology, Beijing 100144, China
* Author to whom correspondence should be addressed.
Remote Sens. 2022, 14(22), 5715; https://doi.org/10.3390/rs14225715
Submission received: 13 September 2022 / Revised: 5 November 2022 / Accepted: 9 November 2022 / Published: 12 November 2022

Abstract

We investigate the problem of obtaining dense 3D reconstructions from airborne multi-aspect synthetic aperture radar (SAR) image sequences. Dense 3D reconstructions from multi-view SAR images are vulnerable to anisotropic scatterers. To address this issue, we propose a probabilistic 3D reconstruction method based on jointly estimating each pixel's height and degree of anisotropy. Specifically, we propose a mixture distribution model for the stereo-matching results, where the degree of anisotropy is modeled as an underlying error source. Then, a Bayesian filtering method is proposed for dense 3D point cloud generation. For real-time applications, redundancy in multi-aspect observations is further exploited in a probabilistic manner to accelerate the stereo-reconstruction process. To verify the effectiveness and reliability of the proposed method, 3D point cloud generation is tested on Ku-band drone SAR data for a domestic airport area.

1. Introduction

Over the last two decades, synthetic aperture radar (SAR) multi-aspect collection modes have enabled new imaging techniques in fields such as urban monitoring and 3D mapping [1,2,3]. Conventional SAR data collected over linear apertures typically represent our 3D world with 2D images. Data collected over diverse azimuth and elevation angles, i.e., multi-aspect or multi-view SAR data, can be used to build a 3D representation of the imaged targets/scenes [4]. Image artifacts such as layover and foreshortening effects in the 2D images can be corrected in the 3D representation [5]. As a consequence, an accurate and dense 3D reconstruction can generally facilitate 3D visualization and computer-aided automatic target recognition (ATR) [6,7].
In this paper, we focus on dense 3D reconstruction methods for SAR data collected over the full 360° of the target's aspect. Two main techniques can be used: interferometric and radargrammetric methods. Interferometric methods, including multi-baseline interferometric SAR (InSAR) [8], tomographic SAR (tomoSAR) [9], array-InSAR [10], etc., are currently the mainstream methods for the 3D reconstruction of man-made targets/scenes; their 3D resolution is achieved by synthesizing a coherent aperture along the height direction. However, such techniques may not be suitable for reconnaissance unmanned aerial vehicle (UAV) uses, as most drones are not capable of repeating several strictly parallel flights or carrying large antenna arrays [11]. In contrast, stereo radargrammetric methods can be applied to almost all radar systems and platforms, as only radar amplitude images are required [12]. However, due to its relatively low height accuracy, radargrammetry has been less studied for urban 3D mapping.
Over the past decade, multi-aspect SAR observations have greatly enhanced the 3D positioning accuracy of radargrammetric methods [13]. With dozens to hundreds of SAR images collected over varying azimuth angles, radargrammetry may achieve dense 3D positioning accuracy close to that of InSAR measurements. In 2012, Palm et al. reported an urban digital surface model (DSM) estimated from circular SAR (CSAR) data using stereo radargrammetry, which achieved meter-level height accuracy [5]. Since then, the DSM estimation problem has been studied for airborne multi-aspect SAR data [14,15,16]. However, current studies are still far from meeting the needs of reconnaissance uses: crucial 3D structures such as building facades are lost in the 2.5D DSM representation, meter-level DSMs are not sufficient for good visualization of city blocks, and too many outliers remain in the final results. The main purpose of this paper is to design a dense stereo 3D reconstruction method with sub-meter height accuracy for commonly used multi-aspect SAR observation modes.
In stereo SAR methods, height information is extracted by correlating two (or more) SAR images acquired from different viewing angles. We believe that the ill-posedness of stereo-matching problems is mainly caused by the anisotropic phenomenon of SAR images, i.e., local image features of scatterers vary too much across certain viewing angles [17,18]. One well-known example is the azimuth glint phenomenon of building components. Therefore, an intuitive idea is that if we knew in which images the scatterers are characterized similarly, we could better generate stereo estimates from those images. An elegant framework to achieve this goal is robust probabilistic inference [19]. In this paper, we model the anisotropy as an underlying factor affecting the correctness of each stereo-matching result and attempt to simultaneously estimate the pixel height and degree of anisotropy. The degree of anisotropy refers to the probability that the height of a pixel can be correctly estimated by the stereo-matching method under a given radar perspective. (Here, the concept of anisotropy may deviate from the terminology used in the literature [20]). Final 3D estimates are then generated by probabilistically fusing multiple stereo height measurements, where inaccurate height measurements are automatically down-weighted. In Section 2, the above process is achieved by a mixture distribution model for stereo height measurements and a Bayesian filtering method for the parameter estimation of this mixture distribution model. Moreover, in Section 3, we further design an algorithmic framework to speed up this probabilistic 3D point cloud reconstruction method. The main contributions of this paper are:
  • A probabilistic treatment of anisotropy-robust radargrammetry.
  • The full incorporation and quantification of stereo measurement uncertainty.
  • The reduction of the computational load for the 3D reconstruction of consecutive SAR images.
This paper is organized as follows. Section 2 introduces the proposed 3D reconstruction method for a single SAR image. Section 3 introduces an efficient 3D point cloud generation method for a consecutive SAR image sequence. Section 4 presents the experimental results. Experiments are conducted on Ku-band drone SAR data collected over polygon flight tracks. The 3D point cloud reconstruction was tested over an airport area containing buildings and aircraft, and the final results are quantitatively analyzed to verify the accuracy and efficiency of the proposed method.

2. Probabilistic Stereo Height Estimation

In this section, we introduce the proposed probabilistic stereo height estimation method. We first propose a Gaussian+Uniform mixture distribution model for the stereo height measurements generated by the stereo matching of two SAR images, where an underlying variable γ is used to control the degree of anisotropy (degree of ill-posedness) for each pixel. We then develop a Bayesian inference method to estimate the posterior distribution of the pixel's height and degree of ill-posedness, using multiple height measurements generated from multi-aspect SAR observations. Finally, we present a small simulation experiment to demonstrate the robustness of the proposed method.
To begin with, we introduce two common multi-aspect SAR data collection geometries.

2.1. Multi-Aspect SAR Imaging Geometry

In Figure 1, we give two sketches of the common multi-aspect SAR data collection geometries. Both circular and polygon flight trajectories can collect complete 360° data in the azimuth angle domain. The main advantage of adopting a circular trajectory is its relatively high data collection efficiency, while difficulties lie in data processing and motion compensation (MC) for curvilinear SAR data [21]. In contrast, a polygon collection geometry often produces higher-quality SAR images, because both the image formation and MC methods have been systematically studied. Polygon flights are more popular with drone platforms, as drones are often not equipped with high-precision position and orientation systems (POS) or flexible-pointing antennas. The main disadvantage of the polygon geometry is its relatively inefficient data collection, but this problem can hopefully be remedied by adopting drone swarms. For a stereo algorithm using the complete 360° multi-aspect SAR data, there is usually no essential difference between the two collection modes, as CSAR typically employs a sub-aperture process to generate SAR images for radargrammetric applications. Therefore, we will not distinguish between them in the following text.
In the following, we assume that all SAR images are projected to the ground plane as a pre-processing step for stereo radargrammetry. The image registration error is also considered negligible; thus, we can leverage the epipolar geometry [22] for the stereo-matching process.

2.2. Probabilistic Stereo Measurement Model

In this subsection, we present a probabilistic interpretation of the stereo-matching result of two SAR images.
We first review a standard stereo-matching process between two SAR images. Consider the stereo SAR configuration in Figure 2. Let p be a pixel from a reference ortho-image I. For a particular height value ḣ, one can calculate the corresponding 3D point x(ḣ) located ḣ meters above the image plane of I, through the following Range–Doppler (R-D) equations [23]:
\[ \lVert \mathbf{s} - \mathbf{x} \rVert = \lVert \mathbf{s} - \mathbf{p} \rVert, \qquad (\mathbf{s} - \mathbf{x}) \cdot \mathbf{V}_s = (\mathbf{s} - \mathbf{p}) \cdot \mathbf{V}_s = 0 \tag{1} \]
where s is the position of the antenna phase center (APC) at a certain radar slow time and V_s is the platform velocity vector. Let I′ be the slave image and q be the projection of x(ḣ) onto I′, calculated by the R-D equations. During stereo matching, for each possible 3D location x(ḣ), we evaluate the similarity between pixels p and q by a certain similarity criterion, such as normalized cross-correlation (NCC). Typically, the height value h̃ that maximizes the NCC scores is chosen as a measurement of the true height value h. A minimal sketch of this per-pixel height search is given below.
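The following Python sketch illustrates this height-scan-and-match loop. It is illustrative only: the projection callback project_to_slave (standing in for the R-D projection), the candidate grid h_candidates, and the window half-size win are hypothetical names introduced here, and boundary handling is simplified.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equal-size patches."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float((a * b).mean())

def stereo_height_measurement(ref_img, slave_img, p, project_to_slave,
                              h_candidates, win=6):
    """Scan height candidates, project each hypothesis x(h) into the slave
    image, and return the height that maximizes the NCC score."""
    r0, c0 = p
    ref_patch = ref_img[r0 - win:r0 + win + 1, c0 - win:c0 + win + 1]
    best_h, best_score = None, -np.inf
    for h in h_candidates:
        # project_to_slave is assumed to return integer pixel indices (row, col)
        r1, c1 = project_to_slave(p, h)
        if not (win <= r1 < slave_img.shape[0] - win and
                win <= c1 < slave_img.shape[1] - win):
            continue  # hypothesis projects outside the slave image
        patch = slave_img[r1 - win:r1 + win + 1, c1 - win:c1 + win + 1]
        score = ncc(ref_patch, patch)
        if score > best_score:
            best_h, best_score = h, score
    return best_h, best_score
```

Here win = 6 corresponds to the 13 × 13 matching window used later in Section 4.2.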
However, for man-made targets, stereo-matching results may be affected by ill-posed regions in the stereo image pair. For strongly anisotropic targets, stereo matching may give wrong results. Inspired by the mixture models used for robust sensor fusion problems (cf. Chapter 21 of [24]), we propose a novel probabilistic interpretation of the stereo-matching results.
First, the stereo SAR configuration in Figure 2 is viewed as a sensor that can generate a measurement of the pixel's height. However, this sensor is two-fold: a mixture of a good sensor model and a bad one. The good sensor always generates height measurements that are fairly close to the true height value h, while measurements generated by the bad sensor tend to be randomly distributed among all height candidates. The final sensor model is a weighted combination of the good sensor and the bad one, with the weight controlled by the pixel's degree of anisotropy.
We formulate the above ideas in terms of probability. A height measurement h̃ is modeled with a Gaussian+Uniform (GU) mixture distribution. The probabilistic stereo SAR sensor produces two types of height measurements with probabilities γ and 1 − γ, respectively: (1) a good height measurement h̃ that is normally distributed around the correct height h, with probability γ, and (2) an outlier height measurement h̃ that is randomly distributed among all height candidates in the interval [h_min, h_max], with probability 1 − γ:
\[ p(\tilde{h} \mid h, \gamma) = \underbrace{\gamma \, \mathcal{N}(\tilde{h} \mid h, \tau^2)}_{\text{inlier measurement}} + \underbrace{(1 - \gamma) \, \mathcal{U}(\tilde{h} \mid h_{min}, h_{max})}_{\text{outlier measurement}} \tag{2} \]
with
\[ \mathcal{N}(\tilde{h} \mid h, \tau^2) = \frac{1}{\sqrt{2\pi}\,\tau} \exp\left( -\frac{(\tilde{h} - h)^2}{2\tau^2} \right) \]
\[ \mathcal{U}(\tilde{h} \mid h_{min}, h_{max}) = \begin{cases} 1/(h_{max} - h_{min}), & \tilde{h} \in [h_{min}, h_{max}] \\ 0, & \text{otherwise} \end{cases} \]
where p(h̃ | h, γ) is the likelihood of the stereo height measurement h̃, N(·) denotes the Gaussian distribution for an inlier height measurement, τ² is the variance of an inlier measurement, U(·) denotes the uniform distribution, and h_min and h_max are set by our prior knowledge of the scene boundaries. The likelihood in Equation (2) involves a 2D latent state: one dimension is the height h, and the other is the inlier probability γ.
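For concreteness, the following is a direct transcription of Equation (2) into code, assuming nothing beyond the symbols defined above:

```python
import numpy as np

def gu_likelihood(h_meas, h, gamma, tau2, h_min, h_max):
    """Gaussian+Uniform mixture likelihood p(h_meas | h, gamma) of Equation (2)."""
    gauss = np.exp(-(h_meas - h) ** 2 / (2.0 * tau2)) / np.sqrt(2.0 * np.pi * tau2)
    unif = 1.0 / (h_max - h_min) if h_min <= h_meas <= h_max else 0.0
    return gamma * gauss + (1.0 - gamma) * unif
```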

Height Measurement Uncertainty

Different stereo SAR configurations often produce height measurements with different precisions. In Equation (2), the variance τ² of a good height measurement is calculated from the relative geometric configuration of the two flight tracks producing the stereo-pair images. A numerical method is used for the calculation of τ². We assume that, under the inlier measurement model, the stereo-matching process can always find the correct homonymous point location with a fixed error variance δ² along the epipolar lines, where δ² is roughly caused by the Gaussian noise in the stereo-pair images. In other words, we assume that each inlier height measurement h̃ has a fixed variance δ² when projected to the SAR image plane. Thus, δ² can be back-projected to the height direction to calculate τ²: as depicted in Figure 2, a small feasible delta value δh is manually added to the true height h, the projection of δh onto I′ is calculated by the R-D equations as δl, and then τ² is computed by
\[ \tau^2 = \left( \frac{\delta h}{\delta l} \right)^2 \delta^2 \tag{3} \]
In practice, we do not know the true value of h, so h is approximated by its maximum a posteriori (MAP) estimate ĥ. It is worth noting that τ² in Equation (3) can also be computed analytically, which seems more convenient for spaceborne images. For more details, we recommend a previous work of the authors [25].
By Equation (3), τ 2 reflects the fact that stereo-pair images with larger intersection angles tend to produce more precise height measurements.
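A numerical sketch of Equation (3), assuming the same hypothetical project_to_slave callback as above and an illustrative height step delta_h of one meter:

```python
import numpy as np

def measurement_variance(project_to_slave, p, h_hat, delta2, delta_h=1.0):
    """Back-project the image-domain matching variance delta2 onto the
    height direction (Equation (3)) by finite differences."""
    q0 = np.asarray(project_to_slave(p, h_hat), dtype=float)
    q1 = np.asarray(project_to_slave(p, h_hat + delta_h), dtype=float)
    delta_l = np.linalg.norm(q1 - q0)            # image displacement caused by delta_h
    return (delta_h / delta_l) ** 2 * delta2     # tau^2
```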

2.3. Bayesian Inference for Height

In the last subsection, we proposed a probabilistic model for the stereo height measurements from two SAR images, in which the likelihood of a height measurement h̃ follows the GU mixture distribution model. In this section, we introduce a Bayesian filtering method to estimate its parameters, the height h and the inlier probability γ, from multi-aspect SAR images.
In this subsection, we focus on the Bayesian methods for updating the posterior distribution p(h, γ). Figure 3 presents a simple schematic of the probabilistic height estimation process. First, a reference image is separately matched with N slave images within its neighborhood, generating N height measurements h̃_1, …, h̃_N for each pixel p. The likelihood of each h̃ follows Equation (2). We assume that all stereo height measurements h̃_1, …, h̃_N are statistically independent. With N height measurements, the Bayesian posterior for h and γ can be sequentially updated by:
\[ \underbrace{p(h, \gamma \mid \tilde{h}_1, \ldots, \tilde{h}_N)}_{\text{posterior PDF}} \propto \underbrace{p(h, \gamma)}_{\text{prior PDF}} \cdot \prod_{k=1}^{N} p(\tilde{h}_k \mid h, \gamma) \tag{4} \]
where p(h, γ) is the prior distribution of h and γ. Before the Bayesian inference, we manually set the prior distribution of h and γ, and then calculate the posterior PDF from the stereo height measurements h̃_1, …, h̃_N. The final height estimate is the MAP estimate of h.
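Before turning to the parametric filter, Equation (4) can be checked with a brute-force grid evaluation. The sketch below assumes a flat prior and per-measurement variances tau2s; it is meant for intuition and small-scale validation, not for the dense reconstruction itself.

```python
import numpy as np

def grid_posterior(measurements, tau2s, h_min, h_max, n_h=200, n_g=100):
    """Evaluate the 2D posterior p(h, gamma | measurements) of Equation (4)
    on a grid, assuming a flat prior over both parameters."""
    h = np.linspace(h_min, h_max, n_h)[:, None]   # height axis (column)
    g = np.linspace(0.01, 0.99, n_g)[None, :]     # inlier-probability axis (row)
    log_post = np.zeros((n_h, n_g))
    for z, tau2 in zip(measurements, tau2s):
        gauss = np.exp(-(z - h) ** 2 / (2 * tau2)) / np.sqrt(2 * np.pi * tau2)
        lik = g * gauss + (1 - g) / (h_max - h_min)   # Equation (2), broadcast
        log_post += np.log(lik + 1e-300)
    post = np.exp(log_post - log_post.max())
    return h.ravel(), g.ravel(), post / post.sum()
```

The MAP estimate is then simply the grid cell with the largest posterior mass; this is the kind of 2D posterior shown in the bottom row of Figure 4.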

2.3.1. Bayesian Filters for 2D Posterior Calculation

The posterior in Equation (4) can generally be calculated by numerical methods, such as particle filters. However, because p(h, γ) is a 2D distribution, considering the memory and computation limits of dense 3D reconstruction, we adopt a parametric solution suggested in reference [26] to approximately calculate this 2D posterior.
As suggested in reference [26], we adopt a Beta×Gaussian distribution as a uni-modal approximation to the true posterior distribution of h and γ, motivated by the fact that the Beta×Gaussian is the approximating distribution that minimizes the Kullback–Leibler divergence to the posterior of the Gaussian+Uniform mixture model. This approximation has been used in many vision problems [26,27]. The approximate posterior distribution follows:
\[ p(h, \gamma \mid a, b, \mu, \sigma^2) = \mathrm{Beta}(\gamma \mid a, b) \cdot \mathcal{N}(h \mid \mu, \sigma^2) \tag{5} \]
with
\[ \mathrm{Beta}(\gamma \mid a, b) = \frac{\Gamma(a + b)}{\Gamma(a)\,\Gamma(b)} \, \gamma^{a-1} (1 - \gamma)^{b-1} \tag{6} \]
where Γ(·) denotes the Gamma function, and a, b and μ, σ² are the parameters of the Beta and Gaussian distributions, respectively. The true posterior p(h, γ | h̃_1, …, h̃_N) is approximated by p(h, γ | a_N, b_N, μ_N, σ_N²), so only four parameters are needed to model the entire 2D distribution.
We briefly introduce the Bayesian filtering method for the computation of the 2D posterior. Assume that p(h, γ | a_{k−1}, b_{k−1}, μ_{k−1}, σ_{k−1}²) is the approximate posterior after the (k−1)-th iteration. Upon the k-th measurement h̃_k, the update of the posterior of h and γ is approximated as:
\[ p(h, \gamma \mid \tilde{h}_1, \ldots, \tilde{h}_k) \approx p(h, \gamma \mid a_k, b_k, \mu_k, \sigma_k^2) \cdot \text{const} \propto p(h, \gamma \mid a_{k-1}, b_{k-1}, \mu_{k-1}, \sigma_{k-1}^2) \cdot p(\tilde{h}_k \mid h, \gamma) \cdot \text{const} \]
The update rules for the parameters (a_k, b_k, μ_k, σ_k²) are listed in Algorithm 1, where the 2D posterior p(h, γ) of each 3D map point is estimated by an individual Bayesian filter. For a detailed derivation, one can refer to the original work in [26].
Algorithm 1 Estimation of the Bayesian posterior p(h, γ | h̃_1, …, h̃_N).
Input:
    N stereo height measurements h̃_k, k ∈ {1, 2, …, N}.
Output:
    parameters of the approximate posterior PDF (a_N, b_N, μ_N, σ_N²);
    MAP estimates ĥ and γ̂ of h and γ.
1: set initial values (a_0, b_0, μ_0, σ_0²).
2: for each k ∈ [1, N] do
3:     with stereo measurement h̃_k, update each Bayesian filter by:
4:     calculating the auxiliary variables:
\[ s^2 = \left( \frac{1}{\sigma_{k-1}^2} + \frac{1}{\tau^2} \right)^{-1}, \qquad m = s^2 \left( \frac{\mu_{k-1}}{\sigma_{k-1}^2} + \frac{\tilde{h}_k}{\tau^2} \right) \]
\[ C_1 = \frac{a_{k-1}}{a_{k-1} + b_{k-1}} \, \mathcal{N}(\tilde{h}_k \mid \mu_{k-1}, \tau^2 + \sigma_{k-1}^2), \qquad C_2 = \frac{b_{k-1}}{a_{k-1} + b_{k-1}} \, \mathcal{U}(\tilde{h}_k \mid h_{min}, h_{max}) \]
\[ C_1 \leftarrow \frac{C_1}{C_1 + C_2}, \qquad C_2 \leftarrow \frac{C_2}{C_1 + C_2} \]
\[ f = C_1 \frac{a_{k-1} + 1}{a_{k-1} + b_{k-1} + 1} + C_2 \frac{a_{k-1}}{a_{k-1} + b_{k-1} + 1} \]
\[ e = C_1 \frac{(a_{k-1} + 1)(a_{k-1} + 2)}{(a_{k-1} + b_{k-1} + 1)(a_{k-1} + b_{k-1} + 2)} + C_2 \frac{a_{k-1}(a_{k-1} + 1)}{(a_{k-1} + b_{k-1} + 1)(a_{k-1} + b_{k-1} + 2)} \]
5:     updating (a, b, μ, σ²):
\[ \mu_k = C_1 m + C_2 \mu_{k-1}, \qquad \sigma_k^2 = C_1 (s^2 + m^2) + C_2 (\sigma_{k-1}^2 + \mu_{k-1}^2) - \mu_k^2 \]
\[ a_k = \frac{e - f}{f - e/f}, \qquad b_k = \frac{1 - f}{f} \, a_k \]
6: end for
7: return (a_N, b_N, μ_N, σ_N²), ĥ = μ_N, γ̂ = (a_N − 1)/(a_N + b_N − 2).
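The update in Algorithm 1 follows the Gaussian×Beta depth-filter recursion of [26]. Below is a compact sketch of one update step, with variable names matching the algorithm (z stands for h̃_k); this is our reading of [26], not the authors' released code:

```python
import numpy as np

def update_filter(a, b, mu, sigma2, z, tau2, h_min, h_max):
    """One update of the Gaussian x Beta approximate posterior (Algorithm 1)."""
    # Likelihood of z under the inlier model, marginalized over h.
    norm = (np.exp(-(z - mu) ** 2 / (2 * (sigma2 + tau2)))
            / np.sqrt(2 * np.pi * (sigma2 + tau2)))
    s2 = 1.0 / (1.0 / sigma2 + 1.0 / tau2)
    m = s2 * (mu / sigma2 + z / tau2)
    C1 = a / (a + b) * norm
    C2 = b / (a + b) / (h_max - h_min)
    C1, C2 = C1 / (C1 + C2), C2 / (C1 + C2)
    f = C1 * (a + 1) / (a + b + 1) + C2 * a / (a + b + 1)
    e = (C1 * (a + 1) * (a + 2) / ((a + b + 1) * (a + b + 2))
         + C2 * a * (a + 1) / ((a + b + 1) * (a + b + 2)))
    mu_new = C1 * m + C2 * mu
    sigma2_new = C1 * (s2 + m * m) + C2 * (sigma2 + mu * mu) - mu_new ** 2
    a_new = (e - f) / (f - e / f)
    b_new = a_new * (1.0 - f) / f
    return a_new, b_new, mu_new, sigma2_new
```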
After the Bayesian filtering, the final 3D point cloud can be calculated from the MAP estimate of h by the R-D equations. In Section 3, we give a faster implementation of this Bayesian inference method, optimized for computational speed.

2.3.2. A Simulation Experiment

At the end of this section, we present a small simulation to demonstrate the robustness of the proposed method to erroneous height measurements. As depicted in Figure 4, we simulated 40 stereo height measurements for a pixel with 1/4, 1/2, and 3/4 outliers, respectively. The top row shows the histograms of these height measurements. As a comparison to our method, the middle row shows the estimated posterior PDF of h when the height measurements are modeled with 1D Gaussian distributions. We can see obvious biases in the estimated mean values of h, suggesting that this model easily loses robustness when many stereo measurements are erroneous. The bottom row shows the 2D posterior PDF of h and γ generated by our method. By comparison, the estimated mean values of h are far less biased by the outliers, which demonstrates the robustness of the proposed method.
In our method, the 2D posterior can also provide a reasonable confidence measure for the final height estimates. For example, in Figure 4c, both the low peak location of γ's posterior and the large variance of h's posterior indicate an unreliable height estimate. Thus, we can easily tell whether a height estimate has converged by querying the shape of the 2D posterior.
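A small script in the spirit of this simulation, reusing the update_filter sketch above; the true height (5 m), scene bounds, and measurement noise are illustrative values:

```python
import numpy as np

rng = np.random.default_rng(0)
h_true, tau2 = 5.0, 0.5 ** 2
h_min, h_max = 0.0, 20.0

for outlier_frac in (0.25, 0.5, 0.75):
    n_out = int(40 * outlier_frac)
    z = np.concatenate([rng.normal(h_true, np.sqrt(tau2), 40 - n_out),  # inliers
                        rng.uniform(h_min, h_max, n_out)])              # outliers
    rng.shuffle(z)

    # Gaussian-only fusion: the sample mean is pulled away by outliers.
    naive = z.mean()

    # Proposed filter: sequential GU-mixture updates stay near h_true.
    a, b, mu, s2 = 10.0, 10.0, 0.5 * (h_max + h_min), (h_max - h_min) ** 2 / 36
    for zk in z:
        a, b, mu, s2 = update_filter(a, b, mu, s2, zk, tau2, h_min, h_max)
    gamma_hat = (a - 1) / (a + b - 2)
    print(f"outliers={outlier_frac:.2f}  naive={naive:5.2f}  "
          f"filtered={mu:5.2f}  gamma_hat={gamma_hat:.2f}")
```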

3. Efficient 3D Point Cloud Generation Methodology

In the previous section, we proposed a probabilistic 3D reconstruction method. Dense 3D reconstruction is computationally very expensive: we attempt to estimate the 3D location of every pixel with a non-negligible image gradient. Taking advantage of processing consecutive SAR images, we designed a fast implementation for 3D reconstruction on an entire multi-aspect SAR dataset.
A typical multi-aspect SAR dataset consists of dozens to hundreds of independent views. We propose to sequentially reconstruct a 3D point cloud for each collected view. Thus, the redundant 3D information between adjacent views can be exploited to speed up the overall 3D mapping process.
A flowchart of the complete 3D reconstruction algorithm is depicted in Figure 5. Before starting, the SAR images are reordered by their acquisition azimuth angles so that adjacent images share a similar collection geometry. Using the Bayesian filtering method proposed in Section 2, the Bayesian height estimation process is performed sequentially on each image. The key to speeding up the overall algorithm is a 3D map propagation process: each reconstructed 3D map is projected onto an adjacent new image and used as prior knowledge to speed up the 3D estimation process on that image.
The main steps of the proposed 3D mapping algorithm in Figure 5 are explained as follows.
(1) Probabilistic 3D map propagation.
This step projects each reconstructed 3D map onto an adjacent new image. The propagated 3D information is used as prior knowledge to initialize the new 3D estimates.
The probabilistic 3D map propagation is analogous to the prediction step in a Kalman filter, but works in a heuristic way. Once the 3D estimation for a map point p is finished, the knowledge about h_p is propagated to an adjacent unestimated image. First, we calculate the image location of the 3D point x(ĥ_p) in the new image by the R-D equations, where ĥ_p is the MAP estimate of h_p. Assuming that x(ĥ_p) is projected onto the pixel q in the new image, the posterior distribution of h_q is initialized by:
\[ \mu_{0,q} = \hat{h}_p \tag{7} \]
\[ \sigma_{0,q}^2 = \sigma_p^2 + \sigma_{pred}^2 \tag{8} \]
where σ_pred² is a manually added prediction uncertainty that acts similarly to the process noise in a Kalman filter, and μ and σ² are the parameters of the Gaussian model in Equation (5). Through Equations (7) and (8), the projected 3D map points serve as prior knowledge to initialize the new 3D estimates, as sketched below.
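A minimal sketch of this propagation step; project_to_new is the same kind of hypothetical R-D projection callback as before, and the default sigma2_pred is an illustrative value, not one prescribed by the paper:

```python
def propagate_point(h_hat_p, sigma2_p, project_to_new, p, sigma2_pred=0.25):
    """Project a finished 3D estimate into the next image and convert it
    into a Gaussian prior via Equations (7) and (8)."""
    q = project_to_new(p, h_hat_p)       # pixel hit by x(h_hat_p) in the new image
    mu0_q = h_hat_p                      # Equation (7)
    sigma2_0_q = sigma2_p + sigma2_pred  # Equation (8): inflate by process noise
    return q, mu0_q, sigma2_0_q
```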
(2) Initialize new Bayesian filters.
During the 3D mapping process, each probabilistic map point is updated by an individual Bayesian filter. For each new pixel q in an unestimated image, we set the initial values (a_{0,q}, b_{0,q}, μ_{0,q}, σ_{0,q}²) to create a new Bayesian filter. In our method, (a_{0,q}, b_{0,q}) are initialized with preset values, and (μ_{0,q}, σ_{0,q}²) are initialized from the propagated prior knowledge.
After the probabilistic 3D map propagation, if a 3D point is projected onto the pixel q, we adopt Equations (7) and (8) to initialize (μ_{0,q}, σ_{0,q}²) for the new Bayesian filter. If more than one 3D point is projected onto the same pixel, multiple independent filters are initialized to handle potential layover pixels. If no 3D point is projected onto q, (μ_{0,q}, σ_{0,q}²) is initialized in the general form μ_0 = (h_max + h_min)/2 and σ_0² = (h_max − h_min)²/36 (because we have no prior knowledge of the pixel's 3D location, we initialize μ_0 to the median height of the scene and the variance σ_0² to a relatively large value). In the following experiments, we set a_{0,q} = b_{0,q} = 10, which gives a prior for the inlier probability γ with a mode of 0.5 and a standard deviation of about 0.11, calculated from the probability density function of the Beta distribution.
Setting the initial values (a_{0,q}, b_{0,q}, μ_{0,q}, σ_{0,q}²) is analogous to the initialization of a Kalman filter. Here, all initial parameters are set manually without elaborate tuning, and we cannot give an optimal choice of these parameters in this paper. On the other hand, when the number of images is large enough, the initial parameters typically have little effect on the final results, so it suffices to initialize the uncertainties of h and γ to sufficiently large values. The two initialization paths are summarized in the sketch below.
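A sketch combining the two cases (propagated prior vs. uninformed prior); the default scene bounds are illustrative:

```python
def init_filter(prior=None, h_min=0.0, h_max=20.0, a0=10.0, b0=10.0):
    """Create a Bayesian filter state (a, b, mu, sigma2) for a new pixel.
    'prior' is an optional (mu, sigma2) pair propagated from an adjacent view."""
    if prior is not None:
        mu0, sigma2_0 = prior                    # from Equations (7) and (8)
    else:
        mu0 = 0.5 * (h_max + h_min)              # median scene height
        sigma2_0 = (h_max - h_min) ** 2 / 36.0   # deliberately vague prior
    return a0, b0, mu0, sigma2_0
```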
(3) Update of the Bayesian filters.
In this step, we generate the height measurements h̃_1, …, h̃_N and sequentially update the Bayesian filter for each image pixel to be estimated. The update of the Bayesian filters follows Algorithm 1, while the height measurements are generated by the pixel-wise stereo-matching process. With the propagated prior knowledge, the stereo height search range can be limited to μ ± ασ (we use α = 3, inspired by the three-sigma rule of thumb in statistics, as we expect a new measurement to fall into the three-sigma range of the current height estimate μ). Therefore, we do not need to search all possible height candidates within [h_min, h_max], which saves a large amount of computation in the 3D reconstruction of consecutive SAR images; see the sketch after this paragraph.
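A small helper expressing the restricted search range; the candidate step size is an illustrative choice:

```python
import numpy as np

def height_candidates(mu, sigma2, h_min, h_max, step=0.1, alpha=3.0):
    """Height candidates restricted to mu +/- alpha*sigma (clipped to the
    scene bounds) instead of the full [h_min, h_max] interval."""
    lo = max(h_min, mu - alpha * np.sqrt(sigma2))
    hi = min(h_max, mu + alpha * np.sqrt(sigma2))
    return np.arange(lo, hi + step, step)
```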
(4) Three-dimensional (3D) point cloud creation.
After the Bayesian inference, 3D points are created from the converged Bayesian filters. We use the estimated posterior PDF of h and γ to judge whether an estimate has converged or diverged. Let γ_thr and σ_thr² be two threshold values. If γ̂ > γ_thr and σ̂² < σ_thr², we judge a Bayesian filter as converged (γ̂ is the MAP estimate of γ, and σ̂² is the variance of the estimated posterior PDF of h). Final 3D points are then created from the converged Bayesian filters by the R-D equations.
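The convergence test in code form, with the threshold defaults taken from Section 4.2:

```python
def is_converged(a, b, sigma2, gamma_thr=0.65, sigma2_thr=0.25):
    """Accept a filter when the inlier probability is high and the height
    posterior is tight."""
    gamma_hat = (a - 1.0) / (a + b - 2.0)  # mode of the Beta posterior
    return gamma_hat > gamma_thr and sigma2 < sigma2_thr
```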

4. Experimental Evaluation

4.1. Experimental Data

The 3D point cloud generation is tested on multi-aspect drone SAR data, which are now partially open source [28]. The data were collected by a Ku-band frequency-modulated continuous-wave (FMCW) radar mounted on a multi-rotor drone over Taiyuan Yaocheng Airport, China.
The data collection geometry is depicted in Figure 6. The SAR system was operated in a broadside stripmap mode with a 5° azimuth integration angle, so a 72-sided polygon collection geometry was adopted for the 360° full-aspect SAR data collection. The data collection task was originally designed for ATR studies of anisotropic targets, as part of an open-source full-aspect SAR dataset following the MSTAR public data [29]. The radar spotting radius is approximately 100 m. The main targets within this area include a terminal, a small hangar, and several light propeller aircraft. Optical satellite imagery of this airport from Google Earth is given in Figure 6. The full 72-pass SAR data were collected over 5 days, during which some aircraft moved their positions. For more details about the data, one can refer to [28].
For 3D radargrammetric applications, the raw SAR data were processed into 72 individual geo-coded ortho-images. Before stereo matching, the radar images were processed with a Lee filter to reduce speckle noise. Some parameters of the radar sensor and data processing details are listed in Table 1.
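For reference, a basic Lee filter is a local linear MMSE estimator that shrinks each pixel toward its local mean in proportion to the local variance. A minimal sketch follows; the window size and the crude global noise-variance estimate are illustrative choices, not the exact settings used for this dataset:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def lee_filter(img, win=5):
    """Basic Lee speckle filter on an amplitude image."""
    mean = uniform_filter(img, win)
    mean_sq = uniform_filter(img ** 2, win)
    var = np.maximum(mean_sq - mean ** 2, 0.0)   # local variance
    noise_var = var.mean()                       # crude global noise estimate
    gain = var / (var + noise_var + 1e-12)
    return mean + gain * (img - mean)            # shrink toward the local mean
```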
Four example SAR images from different aspect angles are presented in Figure 7. It can be seen from the figure that the scattering properties of the targets change significantly with the radar viewing angle, among which the aircraft targets are the most obvious. In the building area, we can see the variation of shadows and image distortions, multi-bounce echoes, glint scattering from building facades and roofs, and other SAR imaging artifacts. The layover phenomenon of these buildings is not as serious as in dense built-up areas, where the 3D structures are generally more complicated. Homogeneous areas, such as flat roofs and grass, are noisy in nearly all directions.

4.2. Experimental Results

The 3D reconstruction result for the whole imaging scene is presented as a point cloud in Figure 8, where the 3D points are colored by their height values. This figure includes a total of 1.6 million 3D points generated by the 3D reconstruction algorithm proposed in this paper.
The window size for stereo matching was set to 13 × 13 pixels. During Bayesian inference, each reference image is matched with N = 10 slave images within its neighborhood, which spans a maximum intersection angle of 25° for the stereo-pair images. For 3D point creation, we set the threshold variables to γ_thr = 0.65 and σ_thr² = 0.25. (These two parameters are set with respect to the visual quality of the final point cloud: increasing γ_thr or decreasing σ_thr² generally yields more accurate 3D points, but the density of the point cloud also decreases.)
We attempted to generate a 3D estimate for each pixel with an obvious image gradient. In total, 2.2 million Bayesian filters were initialized from image pixels with strong gradients. After Bayesian inference, around 1.6 million 3D estimates were judged as converged (by γ_thr and σ_thr²) and kept in the final 3D point cloud. The 3D reconstruction for the 72 images took 6.9 min on an Intel Xeon 2.50 GHz CPU. We expect that, with further optimization and parallel acceleration, the data processing algorithm may satisfy most real-time applications.
Compared with 2D SAR images, the dense 3D reconstruction in Figure 8 shows a great advantage for scene interpretation. There are no image deformations, such as layover or foreshortening, in the 3D representation. As a by-product, the 3D reconstruction fuses SAR image data from 72 different views, so features missing in a single SAR image due to shadowing, visibility, or target-specific problems are recovered in this 3D representation.
In Figure 9, we provide zoomed-in views of the terminal region from two opposite perspectives (i.e., front and rear). Thanks to the 360° multi-aspect SAR data, we can construct a complete 3D representation of the building, including occluded facades. The final 3D point cloud, with 0.4 million points, is visually very clean. Some complex structures with severe layover, such as the octagonal tower, the Chinese character signs on the roof, and the facades of the terminal, are correctly reconstructed in the 3D point cloud, as verified against the optical photos. At the same time, we can also see some differences between microwave vision and optical vision, such as the point-like scattering on the colored steel roofs of the terminal.
In this experiment, we only attempted to generate 3D estimates for pixels with strong gradients. Generally, pixels in homogeneous regions cannot be directly estimated by our algorithm. However, once a preliminary 3D reconstruction is obtained, many mature algorithms from multi-view stereo (MVS) can be adopted to further generate a 3D reconstruction for all pixels in the image, but this is beyond the scope of this paper.
For comparative studies, we chose the multi-aspect SAR stereo radargrammetry method proposed by Palm et al. [5]. We chose this method because it is one of the few stereo algorithms that has been tested on real-world data in built-up areas, with promising performance. It was originally proposed for DSM reconstruction, but it can also be used to reconstruct a 3D point cloud. Moreover, the principles of our method and Palm's method are very close, except that we adopt probabilistic processing; therefore, the two methods share many common parameters, such as the NCC window size, the largest intersection angle for stereo matching, etc.
We chose the hangar as our experimental target because of its relatively simple 3D structure, which makes it suitable for a quantitative analysis of the 3D positioning accuracy. In Figure 10, we use Palm's method to generate a 3D point cloud with a density comparable to that of our method. By visual comparison, the noise level of our method is significantly lower, and the structures on the building's facades are better preserved.
For a quantitative analysis of the 3D positioning accuracy, we ran a standard precision test on the reconstructed 3D points. We did not collect ground-truth data to evaluate the absolute 3D positioning accuracy; however, for 3D visualization, the relative positioning accuracy is what really matters. The roof of the hangar is a sloped structure made of colored steel, so we can test the relative 3D positioning accuracy of the scattering points on this roof. The calculated error curves are depicted in Figure 11. By comparison, our method outperforms Palm's method across all precision levels, confirming that our method provides more accurate 3D points for the same SAR data.
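As an aside, an error curve of the kind shown in Figure 11 can be computed as below. The plane-fit protocol here is our assumption of how relative errors on a sloped roof might be measured; the paper does not spell out its exact procedure.

```python
import numpy as np

def roof_plane_residuals(pts):
    """Fit a plane z = c0 + c1*x + c2*y to roof points by least squares and
    return absolute residuals as relative positioning errors."""
    A = np.c_[np.ones(len(pts)), pts[:, 0], pts[:, 1]]
    coef, *_ = np.linalg.lstsq(A, pts[:, 2], rcond=None)
    return np.abs(pts[:, 2] - A @ coef)

def error_curve(errors, d_max=2.0, n=100):
    """Fraction of points whose error falls within each distance d (cf. Figure 11)."""
    d = np.linspace(0.0, d_max, n)
    return d, np.array([(errors <= di).mean() for di in d])
```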
In Figure 11, around 80% of the points have 3D positioning errors of less than one meter, which shows that our 3D reconstruction achieves sub-meter accuracy.

5. Conclusions

This work used conventional stereo-matching methods and multi-aspect SAR images to reconstruct a 3D point cloud of the imaged scene. Stereo-based 3D reconstruction of urban areas is a seriously ill-posed problem, mainly due to the matching difficulties caused by the anisotropic properties of man-made targets. Our initial motivation was the observation that almost all current approaches deal with these matching difficulties by averaging, whether the matching results are averaged in the image domain or over multiple images. However, we believe that matching errors caused by anisotropic scatterers differ fundamentally from Gaussian noise and, thus, cannot be removed by averaging; moreover, too much averaging results in a loss of 3D detail. We therefore developed the probabilistic 3D reconstruction method in this paper. Our method can be viewed as a 'robust averaging' of multiple error-prone stereo SAR measurements, as the initial ideas were inspired by the robust sensor fusion problem. We proposed a Gaussian+Uniform mixture distribution model for the stereo-matching results that, for the first time, models the anisotropy as an independent error source affecting the matching correctness. Experiments show that, with the Bayesian filtering method, we can obtain more accurate height estimates. Moreover, because our method does not employ heavy averaging in the spatial domain, the final result can be presented as a detailed point cloud. Our design choices are not unique, and we gave much consideration to the computational speed of the algorithm rather than maximizing its accuracy, so further work can be conducted to improve the reconstruction accuracy.
In this paper, our method was tested on an airport area with individual buildings and aircraft, where we obtained relatively good 3D reconstruction results. Compared to built-up areas in large cities, the test scene is relatively simple. We do not yet have data to test our method in denser building areas, which may serve as a future study.
For some simple scenes, the radargrammetry-based 3D reconstruction method can be used as an alternative to InSAR measurements, especially when only a single-channel radar system is available. From the results in Figure 9 and Figure 10, we can see that the 3D points are mainly generated from regions with obvious image gradients, while surface regions, such as lawns, are not successfully reconstructed by this method. This is quite different from conventional InSAR results, where surface targets are more easily reconstructed. The combination of these two technologies could also be a future research direction.
Currently, stereo-based 3D target reconstruction methods are more relevant to emergency and reconnaissance applications than to conventional mapping uses. Because our drone platform cannot collect high-quality CSAR data, the test data were collected with a polygon geometry; the data collection therefore took several days, and the collection efficiency was low. With a better drone platform, the data could be collected along a circular flight trajectory, requiring only one flight pass. In the future, we may also test this method on circular SAR data.

Author Contributions

Methodology and writing, H.Z.; investigation, F.T.; supervision, W.H. and Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China under grant number 61860206013 and the Foundation of Equipment Pre-Research Area of China under grant number E1M3080106.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wen, H. Progress in circular SAR imaging technique. J. Radars 2012, 1, 124–135.
  2. Lin, Y.; Hong, W.; Tan, W.; Wang, Y.; Xiang, M. Airborne circular SAR imaging: Results at P-band. In Proceedings of the 2012 IEEE International Geoscience and Remote Sensing Symposium, Munich, Germany, 22–27 July 2012; pp. 5594–5597.
  3. Palm, S.; Stilla, U. 3-D Point Cloud Generation From Airborne Single-Pass and Single-Channel Circular SAR Data. IEEE Trans. Geosci. Remote Sens. 2021, 59, 8398–8417.
  4. Feng, S.; Lin, Y.; Wang, Y.; Teng, F.; Hong, W. 3D point cloud reconstruction using inversely mapping and voting from single pass CSAR images. Remote Sens. 2021, 13, 3534.
  5. Palm, S.; Oriot, H.M.; Cantalloube, H.M. Radargrammetric DEM extraction over urban area using circular SAR imagery. IEEE Trans. Geosci. Remote Sens. 2012, 50, 4720–4725.
  6. Zhang, F.; Hu, C.; Yin, Q.; Li, W.; Li, H.C.; Hong, W. Multi-aspect-aware bidirectional LSTM networks for synthetic aperture radar target recognition. IEEE Access 2017, 5, 26880–26891.
  7. Ai, J.; Mao, Y.; Luo, Q.; Jia, L.; Xing, M. SAR Target Classification Using the Multikernel-Size Feature Fusion-Based Convolutional Neural Network. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5214313.
  8. Zhu, X.X.; Wang, Y.; Montazeri, S.; Ge, N. A review of ten-year advances of multi-baseline SAR interferometry using TerraSAR-X data. Remote Sens. 2018, 10, 1374.
  9. Martín del Campo, G.D.; Shkvarko, Y.V.; Reigber, A.; Nannini, M. TomoSAR Imaging for the Study of Forested Areas: A Virtual Adaptive Beamforming Approach. Remote Sens. 2018, 10, 1822.
  10. Hu, F.; Wang, F.; Ren, Y.; Xu, F.; Qiu, X.; Ding, C.; Jin, Y. Error analysis and 3D reconstruction using airborne array InSAR images. ISPRS J. Photogramm. Remote Sens. 2022, 190, 113–128.
  11. Garcia-Fernandez, M.; Alvarez-Lopez, Y.; Las Heras, F. Autonomous Airborne 3D SAR Imaging System for Subsurface Sensing: UWB-GPR on Board a UAV for Landmine and IED Detection. Remote Sens. 2019, 11, 2357.
  12. Leberl, F.W. Radargrammetric Image Processing; Artech House: London, UK, 1990.
  13. Luo, Y.; Qiu, X.; Dong, Q.; Fu, K. A Robust Stereo Positioning Solution for Multiview Spaceborne SAR Images Based on the Range–Doppler Model. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
  14. Li, Y.; Chen, L.; An, D.; Zhou, Z. A Novel DEM Extraction Method Based on Chain Correlation of CSAR Subaperture Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 8718–8728.
  15. Zhang, J.; Suo, Z.; Li, Z.; Zhang, Q. DEM generation using circular SAR data based on low-rank and sparse matrix decomposition. IEEE Geosci. Remote Sens. Lett. 2018, 15, 724–728.
  16. Li, Y.; Chen, L.; An, D.; Zhou, Z. A Method for Extracting DEM From CSAR Sub-aperture Correlation. In Proceedings of the 2021 13th Global Symposium on Millimeter-Waves & Terahertz (GSMM), Nanjing, China, 23–26 May 2021; pp. 1–3.
  17. Oliver, C.; Quegan, S. Understanding Synthetic Aperture Radar Images; SciTech Publishing: Raleigh, NC, USA, 2004.
  18. Ai, J.; Tian, R.; Luo, Q.; Jin, J.; Tang, B. Multi-Scale Rotation-Invariant Haar-Like Feature Integrated CNN-Based Ship Detection Algorithm of Multiple-Target Environment in SAR Imagery. IEEE Trans. Geosci. Remote Sens. 2019, 57, 10070–10087.
  19. Levine, S. Reinforcement learning and control as probabilistic inference: Tutorial and review. arXiv 2018, arXiv:1805.00909.
  20. Teng, F.; Hong, W.; Lin, Y. Aspect entropy extraction using circular SAR data and scattering anisotropy analysis. Sensors 2019, 19, 346.
  21. Chen, J.; Xing, M.; Yu, H.; Liang, B.; Peng, J.; Sun, G.C. Motion compensation/autofocus in airborne synthetic aperture radar: A review. IEEE Geosci. Remote Sens. Mag. 2021, 10, 185–206.
  22. Gutjahr, K.; Perko, R.; Raggam, H.; Schardt, M. The epipolarity constraint in stereo-radargrammetric DEM generation. IEEE Trans. Geosci. Remote Sens. 2013, 52, 5014–5022.
  23. Méric, S.; Fayard, F.; Pottier, É. Radargrammetric SAR image processing. In Geoscience and Remote Sensing; IntechOpen: London, UK, 2009; pp. 421–454.
  24. Jaynes, E.T. Probability Theory: The Logic of Science; Cambridge University Press: Cambridge, UK, 2003.
  25. Feng, S.; Lin, Y.; Wang, Y.; Yang, Y.; Shen, W.; Teng, F.; Hong, W. DEM generation with a scale factor using multi-aspect SAR imagery applying radargrammetry. Remote Sens. 2020, 12, 556.
  26. Forster, C.; Pizzoli, M.; Scaramuzza, D. SVO: Fast semi-direct monocular visual odometry. In Proceedings of the 2014 IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, 31 May–7 June 2014; pp. 15–22.
  27. Newcombe, R.A.; Lovegrove, S.J.; Davison, A.J. DTAM: Dense tracking and mapping in real-time. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2320–2327.
  28. Wang, R.; Zhang, H.; Han, B.; Zhang, Y.; Guo, J.; Hong, W.; Sun, W.; Hu, W. Multiangle SAR Dataset Construction of Aircraft Targets Based on Angle Interpolation Simulation. J. Radars 2022, 10, 637–651.
  29. Mossing, J.C.; Ross, T.D. Evaluation of SAR ATR algorithm performance sensitivity to MSTAR extended operating conditions. In Algorithms for Synthetic Aperture Radar Imagery V; SPIE: Bellingham, WA, USA, 1998; Volume 3370, pp. 554–565.
Figure 1. Multi-aspect SAR data collection geometry. (a) Circular trajectory. (b) Polygon trajectory.
Figure 2. Schematic of a two-view stereo SAR configuration; I is the reference image and I′ is a slave image.
Figure 3. Processes for generating a probabilistic height estimation.
Figure 4. Posterior distribution for height. The top row shows the histograms of 40 height measurements corrupted by outliers. The second row shows the posterior distributions of h when the height measurements are modeled with a single Gaussian distribution. The bottom row shows the 2D posterior distribution estimated by the proposed method, where the vertical axis is the inlier probability γ and the horizontal axis is the height h. The true height value for this simulation is h = 5 m. (a) 40 height measurements with 1/4 outliers; (b) 40 height measurements with 1/2 outliers; (c) 40 height measurements with 3/4 outliers.
Figure 5. Flowchart for 3D point cloud generation; 3D map points are maintained and updated by Bayesian filters.
Figure 6. Multi-aspect flight campaign over a domestic airport. Yellow dashed circle: spot area with 360° illumination. Red arrow: flight trajectory (partial).
Figure 7. Example SAR images collected from 4 different aspects. Radar line of sight is indicated on each image with white arrows.
Figure 8. Airport 3D reconstruction. This figure contains 1.6 million points. The result is displayed in a Cartesian coordinate system. (a) Top view. (b) Perspective view. (c) Side view.
Figure 9. Point cloud of the terminal from front and rear perspectives, presented together with optical photos. This figure demonstrates the full 360° 3D reconstruction ability of multi-aspect SAR.
Figure 10. Comparison of the 3D reconstructions from Palm's method and our method; the 3D point clouds are presented in a perspective view and a side view. By comparison, the roof structure is clearer in our method. (a) Palm's method. (b) Our method. (c) Photo of the hangars.
Figure 11. Error curve evaluation. An error curve shows, for a given error distance d, how much of the result falls within d of the ground truth. We can see that our method outperforms Palm12 across all precision levels.
Table 1. Acquisition and processing parameters.
Parameter | Description
Center frequency | 14.6 GHz
Bandwidth | up to 1200 MHz
Flight height | 150 m AGL
Flight radius | 150 m
Polarization | HH
Incidence angle | 35°∼55°
Theoretical resolution (slant range × azimuth) | 0.12 m × 0.12 m
Processed azimuth integration angle | 5°
Radar heading difference between 2 adjacent flights | 5°