1. Introduction
Ground penetrating radar (GPR) is an electromagnetic (EM) imaging device that uses the reflection and scattering characteristics of EM waves in discontinuous media to achieve non-destructive detection [1]. It has been used in many engineering detection fields, such as ground ice detection [2], underground pipeline detection, and criminal investigation [3]. Recently, space exploration probes, such as lunar and Mars probes, have also been equipped with GPR for geological stratification studies. However, owing to the complex underground environment, the radar echo always contains clutter, such as multiple waves, antenna coupling waves, and reflections from non-target objects, which seriously obscures the signal of buried targets and makes GPR data difficult to interpret.
Many scholars have conducted in-depth research into clutter suppression methods. Some studies design an appropriate antenna system to enhance the echo of a buried object and reduce the background noise. For example, Liu et al. [4] developed a hybrid dual-polarization GPR system with one circularly polarized transmit antenna and two linearly polarized receive antennas to improve the detection and estimation of slender tubular targets. Other studies are based on signal processing algorithms. For example, the reference wave method averages a few A-scan echoes that contain no target information to find the wavelet of the detection scene, and then subtracts this mean from the original data [5]. The background matrix subtraction (BMS) method is an improvement of the average cancellation method [6,7]. The background noise matrix is obtained through a series of sliding windows, sample exclusion, weighting, and iteration, and is then removed from the original data to suppress the clutter. However, these methods rely heavily on prior information about the detection scene, and in actual detection it is often difficult to know in advance which echoes do not contain the target signal.
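The reference wave method described above can be sketched in a few lines. This is a minimal pure-Python illustration, not the implementation used in [5]: the function name `subtract_reference_wave`, the toy B-scan, and the choice of reference traces are assumptions made for the example.

```python
def subtract_reference_wave(bscan, reference_indices):
    """Subtract the mean of target-free A-scans from every A-scan.

    bscan: list of A-scans, each a list of samples of equal length.
    reference_indices: indices of A-scans assumed to contain no target.
    """
    n_samples = len(bscan[0])
    n_ref = len(reference_indices)
    # Estimate the background wavelet as the sample-wise mean of the reference traces.
    reference = [
        sum(bscan[i][k] for i in reference_indices) / n_ref
        for k in range(n_samples)
    ]
    # Remove the estimated background from each trace.
    cleaned = [[trace[k] - reference[k] for k in range(n_samples)] for trace in bscan]
    return reference, cleaned


# Toy B-scan (hypothetical values): the same background on every trace,
# plus a weak target reflection on trace 2 only.
background = [0.0, 5.0, -3.0, 1.0, 0.0, 0.0]
bscan = [list(background) for _ in range(4)]
bscan[2][4] += 0.5

reference, cleaned = subtract_reference_wave(bscan, reference_indices=[0, 1])
```

The example only works because traces 0 and 1 are known to be target-free, which is exactly the prior information that is hard to obtain in practice.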
To overcome the limitations of wavelet extraction, multi-scale filtering methods were developed. Bao [8] proposed a noise attenuation method based on the curvelet transform, in which the background noise is extracted according to its distribution characteristics in the curvelet domain. Kumlu [9] designed a multiscale directional bilateral filter (MDBF), which can flexibly extract the directional details corresponding to different geometric structures from the original data. The inverse MDBF is then applied to reconstruct the image of the targets. In addition, there are subspace projection techniques, such as morphological component analysis (MCA) [10], singular value decomposition (SVD) [11], and principal component analysis (PCA) [12]. These methods project the detection signals into different subspaces and remove the noise subspace to reconstruct the clutter-free signal. However, when the complex underground environment leads to serious signal distortion or strong clutter, the wavelet is spread over multiple principal components, which makes it difficult to determine where to truncate the eigenvalues. A sparse blind deconvolution (SBD) method has also recently been proposed for GPR data processing [13]. However, it would likely be more effective to estimate the optimum wavelet for each individual transmitter location separately, rather than for the whole data set. Therefore, the existing methods can only be used in specific detection environments and are not suitable for wavelet extraction in practical engineering.
This paper designs an LSTM neural network to predict the wavelet directly from GPR A-scan echoes. First, the feasibility of extracting the wavelet with a neural network is analyzed based on the characteristics of GPR signals and of the network. Then, A-scan data generated with a 3D-FDTD simulator are used to train a two-layer LSTM network, and simulated and measured GPR A-scan echoes are used to test the performance of the trained model. The results show that this extraction method can be applied to different detection environments without any prior information. It avoids the heavy labeling workload in large-scale detection areas and also addresses the calibration problem in special detection environments.
2. Detection Principle and Data Characteristics of GPR
As shown in Figure 1a, the GPR detection trolley is used to detect a buried object in a sand bunker. The trolley moves at a constant speed along a straight line to collect 512 channels of A-scan data, each with 1024 samples. Generally, the echo of a GPR detection system is the convolution of the GPR wavelet with the reflection sequence. The radar first receives the strong reflection from the upper surface. When the transmitted pulse encounters an underground target, the target produces a reflected signal, which is relatively weak. Figure 1b is the time stacking diagram of the first 512 sampling points. It can be seen that the strong wavelet submerges the target signal. Moreover, the wavelets at adjacent monitoring points on the same survey line in the same detection area are very similar.
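The convolution model of the echo can be illustrated with a small synthetic trace. The sketch below is an assumption-laden toy: the Ricker pulse shape, the 0.1 ns sampling interval, and the reflector positions and amplitudes are all hypothetical values chosen for illustration, not the parameters of the measurement in Figure 1.

```python
import math


def ricker(f0, dt, n):
    """Ricker pulse with centre frequency f0 (Hz), sampled every dt (s), centred in n samples."""
    half = (n - 1) / 2
    samples = []
    for i in range(n):
        t = (i - half) * dt
        a = (math.pi * f0 * t) ** 2
        samples.append((1.0 - 2.0 * a) * math.exp(-a))
    return samples


def convolve(x, h):
    """Full linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        if xi == 0.0:
            continue  # skip empty reflectivity samples
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


# Hypothetical wavelet and reflection sequence.
wavelet = ricker(f0=200e6, dt=0.1e-9, n=101)
reflectivity = [0.0] * 300
reflectivity[20] = 1.0    # strong surface reflection
reflectivity[150] = 0.1   # weak reflection from a buried target

# Echo = wavelet convolved with the reflection sequence.
echo = convolve(reflectivity, wavelet)
```

In this toy trace the target response is ten times weaker than the surface response, which mirrors the situation in which the strong wavelet masks the target.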
Three correlation coefficients, namely Pearson’s linear correlation coefficient [14], Kendall’s tau coefficient [15], and Spearman’s rho [16], are used to evaluate the relevance of adjacent A-scan echoes. A correlation coefficient greater than zero indicates that the two groups of data are positively correlated, whereas a coefficient less than zero indicates that they are negatively correlated; the larger the absolute value, the stronger the correlation. The three correlation coefficients of the A-scan echoes in Figure 1 are shown in Figure 2. Their mean values are 1, 0.9998, and 0.9894, respectively, which confirms the strong correlation between adjacent echoes. Several isolated valley points on the curves correspond to the buried target, most visibly on the Kendall’s tau curve. This strong correlation in the time series can be exploited by neural networks with memory.
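For reference, the three coefficients can be computed as follows. This is a self-contained pure-Python sketch with hypothetical sample traces; the tau-b variant of Kendall’s tau and the average-rank treatment of ties are assumptions, since the text does not state which variants were used.

```python
import math


def pearson(x, y):
    """Pearson's linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(x):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau(x, y):
    """Kendall's tau-b using a simple O(n^2) pair count."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: contributes to neither factor
            if dx == 0:
                only_x_tied += 1
            elif dy == 0:
                only_y_tied += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / math.sqrt(
        (untied + only_x_tied) * (untied + only_y_tied)
    )


# Two hypothetical adjacent A-scans that differ only slightly.
trace_a = [0.0, 1.0, 4.0, -2.0, 0.5, 0.1]
trace_b = [0.1, 1.2, 3.8, -2.1, 0.4, 0.2]

r_pearson = pearson(trace_a, trace_b)
r_spearman = spearman(trace_a, trace_b)
r_kendall = kendall_tau(trace_a, trace_b)
```

With SciPy available, `scipy.stats.pearsonr`, `scipy.stats.spearmanr`, and `scipy.stats.kendalltau` provide the same quantities and are preferable for long traces.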
3. Two-Layer LSTM Network for Wavelet Prediction
3.1. Structure of the Two-Layer LSTM Network
Recurrent neural networks (RNNs) are commonly applied to explore relations in sequential data. Their special network structure can selectively store past information and use it together with the current input to predict future information. Long Short-Term Memory (LSTM) is a variant of the RNN that can prevent gradients from vanishing or exploding. It can fully explore the non-linear relationship between variables and process complex long-term time-series information [17].
Figure 3 shows the cell structure of the LSTM network, which has a memory unit R_t and three control gates, namely the input gate ①, forget gate ②, and output gate ③. The computing flow is expressed by the following equations [18].
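In a commonly used form, the gates and states of an LSTM cell can be written as below. This is a standard textbook formulation adapted to the symbols used in this section, given here as an assumption; the exact expressions in [18] may differ in detail. The candidate state \tilde{R}_t, the concatenation [y_{t-1}, x_t], and the element-wise product \odot are notation introduced for this sketch.

```latex
\begin{aligned}
I_t &= \sigma\left(w_I \ast [y_{t-1}, x_t] + b_I\right) \\
F_t &= \sigma\left(w_F \ast [y_{t-1}, x_t] + b_F\right) \\
O_t &= \sigma\left(w_O \ast [y_{t-1}, x_t] + b_O\right) \\
\tilde{R}_t &= \tanh\left(w_R \ast [y_{t-1}, x_t] + b_R\right) \\
R_t &= F_t \odot R_{t-1} + I_t \odot \tilde{R}_t \\
y_t &= O_t \odot \tanh\left(R_t\right)
\end{aligned}
```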
Here, y_t is the output and x_t is the input; σ is the sigmoid function and tanh is the hyperbolic tangent function; w and b are the weights and biases, respectively. The subscripts I, F, O, and R denote the input gate, forget gate, output gate, and memory unit, respectively. The symbol “∗” denotes convolution. As is well known, the wavelet sequence of a given GPR system and detection environment is always stable. The characteristics of the LSTM network make it possible to extract the wavelet sequence from the GPR echoes. In the following, multiple A-scan echoes on one survey line are connected end to end to form a longer sequence for wavelet prediction.
The network structure contains two LSTM layers, two dropout layers, and one dense layer. The dropout probability is set to 0.2 to prevent overfitting. The timestep is 100, the batch size is 200, the number of epochs is 200, and the learning rate is 1 × 10^−6. The input length of each timestep is 100, and the output length is 1. If the loss does not decrease within 10 epochs, the learning rate is reduced to 0.1 times its previous value.
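The learning-rate rule stated above can be sketched as a small plateau scheduler. The class name `PlateauScheduler`, the `HYPERPARAMS` dictionary, and the way "does not decrease" is interpreted (no new minimum of the monitored loss for 10 consecutive epochs) are assumptions made for illustration; only the numerical values come from the text.

```python
# Hyperparameters as stated in the text, collected in a hypothetical config dict.
HYPERPARAMS = {
    "timestep": 100,
    "batch_size": 200,
    "epochs": 200,
    "learning_rate": 1e-6,
    "dropout": 0.2,
}


class PlateauScheduler:
    """Multiply the learning rate by `factor` after `patience` epochs without improvement."""

    def __init__(self, lr, factor=0.1, patience=10):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")
        self.wait = 0

    def step(self, loss):
        """Update with the loss of the epoch just finished and return the learning rate to use next."""
        if loss < self.best:
            self.best = loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr *= self.factor
                self.wait = 0
        return self.lr


# Toy loss history: one improving epoch followed by ten stagnant epochs.
scheduler = PlateauScheduler(lr=HYPERPARAMS["learning_rate"])
history = [scheduler.step(loss) for loss in [1.0] + [1.0] * 10]
```

In Keras, a comparable behaviour is available through the `ReduceLROnPlateau` callback with `factor=0.1` and `patience=10`; whether that callback was used here is not stated.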
3.2. Network Training
As shown in Figure 4a, a three-layer infinite medium is set up to simulate air, cement, and limestone from top to bottom. The relative permittivities of cement and limestone are ε_r2 = 6 and ε_r3 = 9, respectively. The cement layer thickness is d_1 = 0.4 m. The top air layer and the bottom limestone layer are both infinite. A PEC cube with a 0.2 m edge length is partially buried between the cement layer and the limestone layer. The depth from the top of the cube to the ground surface is d = 0.3 m. The distance between the transmitting and receiving antennas is L = 0.4 m, and their height above the ground is h = 0.4 m. The two antennas are vertically located on both sides of the survey line and move synchronously along the survey line indicated by the red arrow. The signal source is a Ricker wavelet with a center frequency of 200 MHz and a time sampling interval of 0.0385 ns. The spatial sampling interval along the survey line is 0.04 m. A 3D-FDTD simulation tool [19] is used to generate the A-scan data. In total, 95 channels of A-scan echoes, each with 780 samples, are accumulated along the survey line to form the B-scan image, as shown in Figure 4b. To reduce the signal similarity caused by dense sampling along the survey line, we selected 30 channels of data at equal intervals as the data set. From these data, 18 channels of A-scan echoes are randomly selected for the training set, 6 channels for the validation set, and the remaining 6 channels for the testing set.
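The channel selection and the 18/6/6 split can be sketched as follows. The interval of three channels, the starting channel, and the random seed are assumptions made for illustration; the text only states that 30 of the 95 channels are taken at equal intervals and then split randomly.

```python
import random

N_CHANNELS = 95   # A-scans along the survey line
STEP = 3          # assumed interval between selected channels

# Channels 0, 3, ..., 87 (zero-based): 30 channels at equal intervals.
selected = list(range(0, N_CHANNELS, STEP))[:30]

# Random split into training, validation, and testing channels.
rng = random.Random(0)   # fixed seed so the sketch is reproducible
shuffled = selected[:]
rng.shuffle(shuffled)
train_idx = sorted(shuffled[:18])
val_idx = sorted(shuffled[18:24])
test_idx = sorted(shuffled[24:30])
```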
The 18 channels of A-scan echoes in the training set are randomly connected end to end to construct a longer input sequence with 780 × 18 = 14,040 samples. In each step of network training, the input length is 100, which means that 100 consecutive samples of the input sequence, starting with the first 100, are used to predict the next sample. The time window of the input data then slides along the sequence step by step. The network uses the mean square error (MSE) as the loss function.
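The construction of the training pairs can be sketched as follows. The random placeholder traces stand in for the simulated A-scans, and the helper names `make_windows` and `mse` are hypothetical; a stride of one sample per step is assumed.

```python
import random


def make_windows(sequence, window=100):
    """Pair each run of `window` consecutive samples with the sample that follows it."""
    inputs, targets = [], []
    for start in range(len(sequence) - window):
        inputs.append(sequence[start:start + window])
        targets.append(sequence[start + window])
    return inputs, targets


def mse(predicted, actual):
    """Mean square error between two equally long sequences."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


rng = random.Random(0)
# Placeholder data: 18 traces of 780 samples each (real traces would come from the simulator).
traces = [[rng.uniform(-1.0, 1.0) for _ in range(780)] for _ in range(18)]
rng.shuffle(traces)  # random connection order

# Connect the traces end to end, then slide a 100-sample window along the result.
sequence = [sample for trace in traces for sample in trace]
inputs, targets = make_windows(sequence, window=100)
```

With these sizes the 14,040-sample sequence yields 13,940 input/target pairs, each consisting of 100 input samples and one target sample.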
Figure 5 shows the loss during the training process. The loss of the one-layer network decreases slowly, and its final loss is much greater than that of the two-layer network. The loss of the three-layer network drops quickly, but slight over-fitting occurs, which leads to poor prediction results. The two-layer network converges after 180 steps, with a loss of less than 0.001.
Figure 6 compares the predicted wavelets. It can be seen that the red line predicted by the two-layer network is the most similar to the ideal wavelet, which is the echo obtained when no target is buried in the ground. The two-layer LSTM network is therefore superior in training time and performance and is used for wavelet prediction in the following.
3.3. Network Testing
From the test set, we randomly selected two channels and connected them into a sequence with 780 × 2 = 1560 samples as the network input for wavelet prediction. For example, the 5th, 32nd, and 48th channels of A-scan data (corresponding to A, B, and C in Figure 4b, respectively) are used to evaluate the wavelet prediction ability of the trained network. Position A is far from the target, so its echo contains little target signal; this channel of A-scan data is used as the ground truth of the wavelet. Position B is close to the target, and the corresponding A-scan echo contains some target signal. As shown in Figure 7, the predicted wavelet (green line) is consistent with the ground truth (black dotted line). The enlarged view in the upper right corner shows that the predicted wavelet amplitude is non-zero after the 510th sampling point, owing to the reflected signal of the underground layer.
Another interesting observation concerns wavelet prediction with the A-scan data at position C, which is directly above the PEC target. The strong reflection from the metal is much larger than the background wavelet. In this case, the network cannot ignore the very strong target interference, and wavelet prediction fails. Therefore, when predicting the wavelet from measured data, echoes containing strong target signals must be removed first.
The effect of wavelet removal and deconvolution strongly depends on the quality of the wavelet. If the extracted wavelet is incorrect, the direct wave cannot be cancelled from the original data, and additional interference is introduced. Inaccurate wavelet tails also degrade the resolution of deep detection. In the following, the results of wavelet removal and deconvolution with several methods are compared to evaluate the accuracy of the wavelet.
Figure 8 shows the results of wavelet removal with different wavelet extraction methods. The upper two panels still contain residual direct waves and layered interference. Although the reference wave method removes the direct wave cleanly, the layered interference is still obvious. Subtracting the wavelet predicted by the LSTM network effectively removes the direct wave and the layered interference at the same time and improves the signal-to-noise ratio and resolution.
Figure 9 shows the deconvolution results of the wavelets. It can be seen that the deconvolution with the LSTM-predicted wavelet can effectively compress the wavelet tailing and the layered signal so that the signal of the deep target is highlighted.
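To make the role of the wavelet in deconvolution concrete, the sketch below applies a Wiener-style regularised spectral division to a synthetic trace. The deconvolution algorithm used for Figure 9 is not stated, so this particular technique, the three-sample wavelet, the reflector positions, and the `noise_level` parameter are all assumptions chosen for illustration.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short toy traces)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        out.append(s / n if inverse else s)
    return out


def wiener_deconvolve(echo, wavelet, noise_level=1e-3):
    """Estimate the reflection sequence by regularised division in the frequency domain."""
    n = len(echo)
    padded = list(wavelet) + [0.0] * (n - len(wavelet))
    E, W = dft(echo), dft(padded)
    # Regularisation term relative to the strongest spectral component of the wavelet.
    eps = noise_level * max(abs(w) ** 2 for w in W)
    R = [e * w.conjugate() / (abs(w) ** 2 + eps) for e, w in zip(E, W)]
    return [r.real for r in dft(R, inverse=True)]


# Hypothetical wavelet and reflection sequence.
wavelet = [1.0, -0.6, 0.2]
reflectivity = [0.0] * 32
reflectivity[5] = 1.0    # strong shallow reflector
reflectivity[20] = 0.3   # weaker deep reflector

# Synthetic echo: wavelet convolved with the reflection sequence, truncated to 32 samples.
echo = [0.0] * 32
for i, r in enumerate(reflectivity):
    for j, w in enumerate(wavelet):
        if i + j < 32:
            echo[i + j] += r * w

recovered = wiener_deconvolve(echo, wavelet)
```

Here the wavelet is known exactly, so both reflectors are recovered as sharp spikes; with an inaccurate wavelet the division leaves residual tails, which is why the quality of the extracted wavelet matters.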
5. Conclusions
This paper presented a wavelet prediction method based on an LSTM network. The method takes advantage of the strong correlation among GPR signals to construct a quasi-periodic input signal by splicing the data end to end. The network is trained to learn the commonality of the input data while ignoring random interference, and thus predicts a smooth wavelet.
Several groups of experiments (corresponding to different excitation source waveform, target location, shape, material, background geometry, etc.) show the generalization ability of the network for different detection scenarios. As long as the antenna system parameters remain unchanged, the network trained with the simulation data of one arbitrary scene can be used to predict the wavelet of many different scenes. In order to expand the applicability of the network to different media, the network is optimized by expanding the range of dielectric parameters of the simulation model. For the application of actual detection data, we directly use a small amount of measured data for network training, and then use the trained network for detection data in other different scenes. Compared with the wavelet extracted by SVD, the predicted wavelet has obvious advantages in the integrity and adaptability of the detection area, and the method can be used in large-scale underground exploration projects such as pipeline detection under urban roads, defect detection inside a tunnel, and so on.
This method has many advantages, as follows:
This method does not rely on prior knowledge and can effectively extract the wavelets of different scenes.
There is no need for manual labeling during network training. The A-scan echoes can be used directly as training data, and the input data are easy to obtain.
The trained network has good generalization ability and can solve many practical problems, such as the heavy labeling workload of a large-scale detection area, the inability to label special detection environments, and poor processing results caused by inaccurate calibration.
The proposed neural network method has very strong generalization ability for wavelet prediction with the same antenna system. However, if the antenna system is changed, the network must be retrained. This issue needs further study. A possible approach is to conduct a large number of network training experiments while controlling a single variable or multiple variables, such as the antenna system parameters or the environment parameters. A large amount of detection data would allow the network to learn the underlying relationships. Moreover, we will also try to use the Transformer network structure for prediction.