Article

DOA Estimation Using Deep Neural Network with Angular Sliding Window

1 Radar Research Laboratory, School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
2 Beijing Institute of Technology Chongqing Innovation Center, Chongqing 401120, China
3 Electromagnetic Sensing Research Center of CEMEE State Key Laboratory, School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
4 Beijing Key Laboratory of Embedded Real-Time Information Processing Technology, Beijing 100081, China
5 Advanced Technology Research Institute, Beijing Institute of Technology, Jinan 250300, China
6 Beijing Racobit Electronic Information Technology Co., Ltd., Beijing 100081, China
* Author to whom correspondence should be addressed.
Electronics 2023, 12(4), 824; https://doi.org/10.3390/electronics12040824
Submission received: 11 January 2023 / Revised: 29 January 2023 / Accepted: 3 February 2023 / Published: 6 February 2023
(This article belongs to the Topic Radar Signal and Data Processing with Applications)

Abstract

Deep neural networks (DNNs) have shown great potential in direction-of-arrival (DOA) estimation. In high dynamic signal-to-noise ratio (SNR) scenarios, however, the estimation accuracy of the weaker sources may degrade significantly due to insufficient training samples. This paper proposes a deep neural network framework with a sliding window operation. The whole field-of-view (FOV) is divided into a series of sub-regions via sliding windows, and each sub-region is assumed to contain at most one source. Thus, single-source data can be used to train all the networks, alleviating the need for large training sets and for prior information on the number of sources. Each sub-region is served by a detector network and an estimator network, enabling accurate estimation of both the DOAs and the number of sources. Simulation and real data experiments show that the proposed method achieves excellent DOA and source number estimation performance. Specifically, in the real data experiment, the RMSE of the proposed method reaches 0.071, at least 0.03 lower than that of FFT, MUSIC, and ESPRIT, while a deep learning competitor, the deep convolutional network (DCN), fails to estimate the lower-SNR source in the high dynamic SNR scenario.

1. Introduction

Direction-of-arrival (DOA) estimation is an essential task in array signal processing due to the extensive applications in radar, wireless communications, and acoustics [1]. Model-based DOA estimation approaches have been comprehensively investigated, including Fourier Transform (FT), subspace methods, and compressed sensing (CS) [2]. FT is easy to implement with high computational efficiency and robustness [3]. However, its angular resolution is restricted by the size of the array. Subspace methods, such as multiple signal classification (MUSIC) [4] and estimation of signal parameters via rotational invariance techniques (ESPRIT) [5], can break the resolution barrier. They utilize the orthogonality between the signal subspace and the noise subspace, which usually requires multiple snapshots to estimate the covariance matrix and the underlying subspaces [6]. CS methods solve the angular observation equation by introducing certain sparse regularizations such as L0 or L1-norm [7,8]. The accuracy and resolution are improved at a much higher computational cost [9].
Despite their wide applications, model-based approaches are susceptible to adverse conditions, such as low signal-to-noise ratio (SNR) and array imperfections. Recently, owing to their powerful nonlinear fitting capabilities, deep neural networks (DNNs) have been introduced to DOA estimation [10,11]. As a data-driven approach, DNN can deal with adverse situations given sufficient training samples. The key concept of existing DNN methods is to discretize the field of view (FOV) and transform DOA estimation into a multi-label classification task. To cope with low SNR conditions, networks are trained across a range of low SNRs and outperform their competitors in the low SNR regime [12,13]. To adapt to array imperfections, autoencoders are constructed to reduce the influence of noise and array imperfections, after which the network performs satisfactorily on imperfect arrays [14]. In addition, a new deep learning architecture has been proposed for imperfect arrays whose output is a spectrum-estimation vector, which avoids discretizing the spatial domain [15]. Furthermore, to deal with grid mismatch, some networks exploit the Toeplitz property to reconstruct the covariance matrix, and model-based approaches, such as MUSIC or root-MUSIC, are then applied to obtain gridless DOAs [16,17].
DNN has shown great potential in adverse conditions. However, high dynamic SNR remains a limiting factor for DOA estimation accuracy. DNN requires enough training samples to cover all practical situations. When the number of sources is large, it is difficult to cover all possible combinations of positions and amplitudes, and the resulting training set becomes huge, which also leads to very high computational cost [13]. When the SNRs of the sources vary significantly, weaker sources are estimated with much lower accuracy or may even be missed. Moreover, the number of sources is usually required as prior information.
In this paper, we propose a DNN framework with a sliding window (DNN-SW) for DOA estimation to cope with high dynamic SNR scenarios. The entire FOV is divided into a series of overlapping angular sub-regions through a sliding window operation. The core network for each sub-region consists of a detector network and an estimator network: the detector determines whether the sub-region contains a source, and the estimator provides its angle. This paper assumes that there is at most one source in each sub-region. Based on this assumption, single-source data are used to train the networks, and the training-sample requirement is relaxed. Compared with existing DNN-based methods, the DOA estimation task is jointly accomplished by multiple networks, and each network only estimates the angle within its own sub-region, which greatly simplifies the task of each network and improves performance in high dynamic scenarios. Furthermore, the number of sources is obtained adaptively from the number of sub-regions in which sources are detected.
The rest of the paper is organized as follows. In Section 2, we present the signal model. In Section 3, we introduce the input data preprocessing and the network structure in detail, including the sliding window module, the detection module, the DOA estimation module, and the angle merging step. In Section 4, we present simulation results that demonstrate the advantages of the proposed method over other approaches, and we verify it on collected real data. Finally, conclusions are summarized in Section 5.

2. Signal Model

In this paper, the single snapshot scenario is considered [18]. As shown in Figure 1, we consider a uniform linear array (ULA) with $N$ elements operating in narrow-band mode. It is assumed that there are $K$ far-field sources from directions $\boldsymbol{\theta} = [\theta_1, \theta_2, \ldots, \theta_K]^T$, where $\theta_k$ denotes the direction of the $k$th source. The received signal can be formulated as

$$\mathbf{x} = \sum_{k=1}^{K} \mathbf{a}(\theta_k) S_k + \mathbf{n} = \mathbf{A}(\boldsymbol{\theta})\mathbf{S} + \mathbf{n}, \tag{1}$$

where $\mathbf{S} = [S_1, S_2, \ldots, S_K]^T$ represents the amplitudes of the sources and $\mathbf{n}$ is an $N \times 1$ vector of statistically independent white Gaussian noise with zero mean and unknown variance $\sigma_n^2$. The steering vector $\mathbf{a}(\theta_k)$ can be expressed as

$$\mathbf{a}(\theta_k) = \left[1,\ e^{j\frac{2\pi}{\lambda} d \sin\theta_k},\ \ldots,\ e^{j\frac{2\pi}{\lambda}(N-1) d \sin\theta_k}\right]^T, \tag{2}$$

where $\lambda = c/f$ is the wavelength of the transmitted signal, $f$ is the carrier frequency, and $c$ is the speed of light. Additionally, $d$ denotes the array element spacing. $\mathbf{A}(\boldsymbol{\theta})$ is the array manifold matrix, which can be written as

$$\mathbf{A}(\boldsymbol{\theta}) = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ e^{j\frac{2\pi d}{\lambda}\sin\theta_1} & e^{j\frac{2\pi d}{\lambda}\sin\theta_2} & \cdots & e^{j\frac{2\pi d}{\lambda}\sin\theta_K} \\ \vdots & \vdots & \ddots & \vdots \\ e^{j\frac{2\pi (N-1) d}{\lambda}\sin\theta_1} & e^{j\frac{2\pi (N-1) d}{\lambda}\sin\theta_2} & \cdots & e^{j\frac{2\pi (N-1) d}{\lambda}\sin\theta_K} \end{bmatrix}. \tag{3}$$
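As a concrete illustration of the signal model (1)-(3), the following sketch (not part of the original paper) simulates a single-snapshot received signal for a half-wavelength ULA; the function names and the complex-noise normalization are our own assumptions.

```python
import numpy as np

def steering_vector(theta_deg, N, d_over_lambda=0.5):
    """Steering vector of an N-element ULA for a far-field source at theta_deg (degrees), Eq. (2)."""
    n = np.arange(N)
    return np.exp(2j * np.pi * d_over_lambda * n * np.sin(np.deg2rad(theta_deg)))

def received_snapshot(thetas_deg, amplitudes, N, noise_var, d_over_lambda=0.5):
    """Single-snapshot received signal x = A(theta) S + n of Eq. (1) for K far-field sources."""
    A = np.stack([steering_vector(t, N, d_over_lambda) for t in thetas_deg], axis=1)  # N x K manifold
    S = np.asarray(amplitudes, dtype=complex)                                          # source amplitudes
    noise = np.sqrt(noise_var / 2) * (np.random.randn(N) + 1j * np.random.randn(N))    # circular Gaussian noise
    return A @ S + noise

# Example: two sources at -6.25 deg and 3.18 deg received by a 40-element half-wavelength ULA
x = received_snapshot([-6.25, 3.18], [1.0, 1.0], N=40, noise_var=0.1)
```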

3. The Proposed Method

The structure of DNN-SW is shown in Figure 2. The FOV is divided into several overlapping sub-regions, and the sources are assumed to be located in different sub-regions. Specifically, there are four main modules in the proposed network. First, in the sliding window module, the FOV is split into a series of overlapping sub-regions, each of which contains at most one source. Then, the detection module and the DOA estimation module follow. The detection module contains one detector for each sub-region; the detector network determines whether a source exists in the corresponding sub-region, which is formulated as a binary classification task. Similar to the detection module, the estimation module also includes multiple estimator networks to obtain the angles of the sources. Since the sub-regions overlap, one source can be detected and estimated in several sub-regions. Therefore, an angle merging module is applied to give the final results.

3.1. Input Data Preprocessing

To retain the amplitude and phase information of the single snapshot signal x [19], we extract four parts of information from x, including the real part, imaginary part, angle value, and modulus. The input vector X can be described as
$$\mathbf{X} = \left[\operatorname{Re}(\mathbf{x})^T,\ \operatorname{Im}(\mathbf{x})^T,\ \angle(\mathbf{x})^T,\ \operatorname{abs}(\mathbf{x})^T\right]^T. \tag{4}$$
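A minimal preprocessing sketch consistent with (4), assuming NumPy and a complex snapshot vector x; the function name is illustrative only.

```python
import numpy as np

def preprocess_snapshot(x):
    """Stack the real part, imaginary part, phase angle, and modulus of x into the network input X."""
    return np.concatenate([np.real(x), np.imag(x), np.angle(x), np.abs(x)]).astype(np.float32)

X = preprocess_snapshot(x)   # real-valued input vector of length 4N
```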

3.2. Sliding Window Module

To alleviate the negative influence among sources of different strengths in high dynamic SNR scenarios, the sliding window module is proposed. By detecting and estimating the sources separately, the network can focus on and easily extract the information of each source. Furthermore, based on this structure, the detectors and estimators are trained using single-source data. Compared to other methods [11,12,13,14], this greatly reduces the number of training samples and the training time. Here, the range of the FOV is $[\varphi_{\min},\ \varphi_{\max}]$, and it is divided into several overlapping sub-regions, as shown in Figure 3. In this paper, the range of a sub-region, $\delta_1$, is set close to $\Delta\theta_{3\mathrm{dB}}$ (the 3 dB beamwidth of the array), so that at most one source is expected in each sub-region. According to [20], $\Delta\theta_{3\mathrm{dB}}$ can be calculated by
$$\Delta\theta_{3\mathrm{dB}} = \frac{0.886\,\lambda}{d\,(N-1)\cos\theta_m}, \tag{5}$$
where $\theta_m$ is the center of the beam. Moreover, $\delta_2$ is the step size used when dividing the FOV. To alleviate the problem of missed sources and to improve estimation accuracy, each source is placed in multiple sub-regions for repeated detection and estimation, which requires $\delta_2 \le \delta_1/2$. Thus, in this paper, $\delta_2$ is configured to be $\delta_1/2$, and the number of sub-regions $L$ is
$$L = \frac{\varphi_{\max} - \varphi_{\min} - \delta_1}{\delta_2} + 1. \tag{6}$$
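The sub-region division can be summarized by the following sketch, which reproduces (5) and (6) under the parameter choices used later in the paper; the function and variable names are illustrative assumptions.

```python
import numpy as np

def beamwidth_3db(N, d_over_lambda=0.5, theta_m_deg=30.0):
    """Approximate 3 dB beamwidth (degrees) of an (N-1)d aperture steered to theta_m, Eq. (5)."""
    bw_rad = 0.886 / (d_over_lambda * (N - 1) * np.cos(np.deg2rad(theta_m_deg)))
    return np.rad2deg(bw_rad)

def sub_regions(fov_min, fov_max, delta1, delta2):
    """Return the [start, end] limits of the L overlapping sub-regions, Eq. (6)."""
    L = int((fov_max - fov_min - delta1) / delta2) + 1
    return [(fov_min + l * delta2, fov_min + l * delta2 + delta1) for l in range(L)]

print(beamwidth_3db(40))                                   # about 3.01 degrees
regions = sub_regions(-60.0, 60.0, delta1=3.0, delta2=1.5) # 79 sub-regions
```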

3.3. Detection Module

The structure of this module is shown in Figure 2. $L$ detector networks are constructed to accomplish the detection task in the $L$ sub-regions. The data are fed into the $L$ detectors to decide in which sub-regions the sources are located. Since this is a supervised task, the training and testing procedures are introduced in turn. During the training process, take the $i$th detector as an example; given the $j$th training sample $\mathbf{X}_j^{\mathrm{train}}$, the output $\tilde{d}_{ij}$ is obtained as
$$\tilde{d}_{ij} = f_W\!\left(f_{W-1}\!\left(\cdots f_1\!\left(\mathbf{X}_j^{\mathrm{train}}\right)\right)\right), \tag{7}$$
where $f_1$ to $f_W$ are $W$ fully connected layers. $f_1$ to $f_{W-1}$ are each followed by a rectified linear unit (ReLU) layer. Finally, a Tanh layer is applied to generate the final detection output. The ReLU and Tanh layers are defined as
$$\mathrm{ReLU}(x) = \max(0, x), \tag{8}$$

$$\mathrm{Tanh}(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}. \tag{9}$$
Furthermore, $d_{ij}$ represents the ground-truth label of the sample $\mathbf{X}_j^{\mathrm{train}}$. If $\mathbf{X}_j^{\mathrm{train}}$ contains a source in the $i$th sub-region, the label $d_{ij}$ is set to 1; otherwise, $d_{ij}$ is set to 0. For each detector, the mean square error serves as the loss function for backpropagation to evaluate the detection performance. The objective function can be written as
$$\mathrm{loss}_1 = \frac{1}{J}\sum_{j=1}^{J}\left(d_{ij} - \tilde{d}_{ij}\right)^2, \tag{10}$$
where J denotes the number of all the training samples. The parameters of the network are updated by minimizing (10) through the adaptive moment estimation (Adam) optimizer.
In the testing process, a threshold $Th_1$ is designed to judge whether a source lies in the $i$th sub-region. Its value is determined by a statistical analysis of the training samples to guarantee satisfactory detection; the details are discussed in the simulation section. When a testing sample $\mathbf{X}_j^{\mathrm{test}}$ is fed into the $i$th detector, the detection result $y_{ij}$ is defined as
$$y_{ij} = \begin{cases} 1, & \text{if } \tilde{d}_{ij} > Th_1 \\ 0, & \text{otherwise.} \end{cases} \tag{11}$$
When $y_{ij} = 1$, a source is considered to be present in this sub-region; otherwise, no source is considered to exist in it.
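A possible PyTorch realization of one detector is sketched below, following the {16, 1} layer sizes, the MSE loss of (10), the Adam optimizer, and the threshold test of (11). The class and helper names are our own, batching and the training loop are omitted, and the code is an illustration rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class Detector(nn.Module):
    """Binary detector for one sub-region; hidden layer sizes follow the {16, 1} setting."""
    def __init__(self, input_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 16), nn.ReLU(),
            nn.Linear(16, 1), nn.Tanh(),
        )

    def forward(self, X):
        return self.net(X).squeeze(-1)     # soft detection score d_tilde

detector = Detector(input_dim=4 * 40)      # input is the 4N-dimensional vector X
criterion = nn.MSELoss()                   # mean square error loss of Eq. (10)
optimizer = torch.optim.Adam(detector.parameters(), lr=1e-3)

def detect(detector, X_test, th1=0.25):
    """Threshold test of Eq. (11); Th1 = 0.25 is the value selected in Section 4.3."""
    with torch.no_grad():
        return (detector(X_test) > th1).long()
```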

3.4. DOA Estimation Module

After detecting which sub-regions the sources are in, the DOA estimation module performs the specific angle estimation task. As shown in Figure 2, similar to the detection module, $L$ estimator networks are established for the $L$ sub-regions. For each estimator, the sub-region is further divided into a series of grids with step $\delta_3$, so the number of grids is $M = \delta_1/\delta_3$. During the training process, to avoid the negative effects of detection errors, the estimators are trained independently of the detectors. Consider the $i$th estimator: given the sample $\mathbf{X}_j^{\mathrm{train}}$, the input of this estimator is $d_{ij} \times \mathbf{X}_j^{\mathrm{train}}$, which means only the data whose source lies in the $i$th sub-region contribute to training. The label is encoded by the one-hot encoding method, i.e., a vector representing the probabilities of all candidate angles. The ground-truth label of $\mathbf{X}_j^{\mathrm{train}}$ for the $i$th estimator can be written as $\mathbf{P}_{ij} = [p_{ij}^1,\ p_{ij}^2, \ldots,\ p_{ij}^M]^T$. If the source lies in the $m$th grid, $p_{ij}^m$ is 1; otherwise, $p_{ij}^m$ is 0. During training, the output of the estimator $\tilde{\mathbf{P}}_{ij}$ can be expressed as
$$\tilde{\mathbf{P}}_{ij} = f_Z\!\left(f_{Z-1}\!\left(\cdots f_1\!\left(d_{ij} \times \mathbf{X}_j^{\mathrm{train}}\right)\right)\right) = \left[\tilde{p}_{ij}^1,\ \tilde{p}_{ij}^2,\ \ldots,\ \tilde{p}_{ij}^M\right]^T, \tag{12}$$
where f 1 to f Z are Z fully connected layers. f 1 to f Z 1 are followed by a ReLU layer and f Z is followed by a Softmax layer. The Softmax layer is defined as
$$\mathrm{Softmax}\!\left(\tilde{p}_{ij}^m\right) = \frac{\exp\left(\tilde{p}_{ij}^m\right)}{\sum_h \exp\left(\tilde{p}_{ij}^h\right)}. \tag{13}$$
According to the ground-truth label P i j and the predicted label P ˜ i j , cross-entropy is selected as the loss function
$$\mathrm{loss}_2 = -\frac{1}{J}\sum_{j=1}^{J}\sum_{m=1}^{M} p_{ij}^m \log \tilde{p}_{ij}^m, \tag{14}$$
where J denotes the number of training samples. The parameters of the estimator are updated by minimizing (14) through the Adam optimizer.
In the testing process, for the $i$th estimator, the sample $\mathbf{X}_j^{\mathrm{test}}$ serves as the input, gated by the detection result; the input can be formulated as $y_{ij} \times \mathbf{X}_j^{\mathrm{test}}$. Only when $y_{ij}$ is 1 does the output of the estimator make sense. The grid with the maximum probability is used to calculate the estimated angle $\tilde{\theta}_{ij}$, which can be expressed as
$$\tilde{\theta}_{ij} = \arg\max\left(\tilde{\mathbf{P}}_{ij}\right) + (i-1)\times\delta_2 + \varphi_{\min}. \tag{15}$$
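A corresponding estimator sketch in PyTorch with the {128, 256, 30} layer sizes is given below. Here the loss is computed from logits (PyTorch's CrossEntropyLoss applies the Softmax of (13) internally), and the angle decoding of (15) is interpreted as mapping the winning grid index to an offset of m·δ3 within the sub-region; both of these are assumptions on our part, not the authors' code.

```python
import torch
import torch.nn as nn

class Estimator(nn.Module):
    """Per-sub-region estimator; hidden sizes {128, 256} and an M = 30 grid output."""
    def __init__(self, input_dim, num_grids=30):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, 256), nn.ReLU(),
            nn.Linear(256, num_grids),
        )

    def forward(self, X):
        return self.net(X)                 # logits; CrossEntropyLoss applies the softmax of Eq. (13)

estimator = Estimator(input_dim=4 * 40)
criterion = nn.CrossEntropyLoss()          # cross-entropy loss of Eq. (14)
optimizer = torch.optim.Adam(estimator.parameters(), lr=1e-3)

def estimate_angle(estimator, X_test, region_index, fov_min=-60.0, delta2=1.5, delta3=0.1):
    """Assumed decoding of Eq. (15): region_index is the zero-based sub-region index (i - 1)."""
    with torch.no_grad():
        m = torch.argmax(estimator(X_test), dim=-1)
    return fov_min + region_index * delta2 + m.float() * delta3
```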

3.5. Angle Merging

Through the detection and DOA estimation modules, a source is first detected and then its specific angle is obtained. However, although the adjacent overlapping sub-regions make the detection results more accurate and complete, the same source will be detected and estimated repeatedly in multiple sub-regions; that is, several angles may be obtained for one source. An angle merging algorithm is proposed to solve this angle redundancy problem. Estimated angles originating from the same source are assumed to differ only slightly, so they can be merged into one angle as the final output. A fusion threshold $Th_2$ is introduced to decide whether two angles should be fused. Since high-resolution DOA estimation is not considered in this paper, $Th_2$ is set slightly lower than $\Delta\theta_{3\mathrm{dB}}$. If the difference between two estimated angles $\tilde{\theta}_l$ and $\tilde{\theta}_h$ is lower than $Th_2$, they are merged by
$$\tilde{\theta} = \frac{1}{2}\left(\tilde{\theta}_l + \tilde{\theta}_h\right), \quad \text{if } \left|\tilde{\theta}_l - \tilde{\theta}_h\right| < Th_2. \tag{16}$$
After the merging algorithm, the final estimated angles are obtained, and the number of sources is acquired automatically.
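The fusion rule of (16) can be implemented, for example, as the following greedy pass over the sorted estimates. This sequential variant is an assumption, since the paper only specifies the pairwise merging rule.

```python
def merge_angles(angles, th2=2.0):
    """Fuse estimates closer than Th2 degrees; the mean of close estimates is the final angle."""
    merged = []
    for a in sorted(angles):
        if merged and abs(a - merged[-1]) < th2:
            merged[-1] = 0.5 * (merged[-1] + a)   # Eq. (16): average the two close estimates
        else:
            merged.append(a)
    return merged

# Example: two redundant detections of one source and one isolated source
print(merge_angles([-6.3, -6.2, 3.2]))   # -> roughly [-6.25, 3.2]
```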

4. Experiment

In this section, we conduct DOA estimation on both simulated and real data to evaluate the proposed method. First, simulated data are used to verify the effectiveness and advantages of our method. Then, real radar signals are collected to assess the performance of DNN-SW in a practical application.

4.1. Simulation Settings

In the simulations, the FOV is the range $[-60°,\ 60°]$, and a 40-element uniform linear array with $\lambda/2$ inter-element spacing is considered. According to (5), $\Delta\theta_{3\mathrm{dB}}$ differs for different beam centers; in this paper, $\theta_m = 30°$ is used, for which $\Delta\theta_{3\mathrm{dB}}$ is 3.01°. The range of the sub-region $\delta_1$ is set to 3° and the step size $\delta_2$ is set to 1.5°. As a result, the number of sub-regions $L$ is 79. Additionally, the sampling interval $\delta_3$ is specified as 0.1°, so that the number of grids per sub-region $M$ is 30 and the FOV contains 1200 candidate directions in total. The detailed parameters are listed in Table 1.
For the training dataset, the SNR of the source is 15 dB, and 30 samples are collected for each direction, giving 36,000 training samples in total. Each detector is trained on all 36,000 samples, while each estimator is trained on the 900 samples falling in its sub-region. For all the experiments, the SNR is defined as in [13]:
$$\mathrm{SNR} = 10\log_{10}\frac{\min\left(\sigma_1^2, \ldots, \sigma_K^2\right)}{\sigma_n^2}, \tag{17}$$
where $\sigma_i^2$, $i = 1, \ldots, K$, represents the power of the $i$th source, and $\sigma_n^2$ represents the power of the noise.
For each detector, the number of neurons per layer of the network is {16, 1} with a batch size of 128 during 100 training epochs. Similarly, for each estimator, the number of neurons per layer of the network is {128, 256, 30} with a batch size of 128 during 200 training epochs. Moreover, the learning rate is configured to 0.001 for all networks.
The simulations are carried out on a workstation with MATLAB R2022a, an Intel Xeon Gold 6240 processor at 2.60 GHz, and an NVIDIA A100 Tensor Core GPU. The detector and estimator networks are implemented in PyTorch 1.11.0 with Python 3.9.12. Under the training conditions above, the average running time of the 79 detector networks and estimator networks is about 90.2 s and 5.6 s, respectively. In the testing process, each detector network and each estimator network cost about 3.09 µs and 3.89 µs per sample, respectively, obtained by averaging the running time over 1000 testing samples.

4.2. Evaluation Metrics

In the simulations, two evaluation metrics are utilized to objectively and effectively evaluate the performance of DNN-SW: Acc and the root mean square error (RMSE). Since the number of sources is unknown, Acc is an important metric; it describes the percentage of testing samples whose number of sources is estimated correctly by the network [21]. It can be formulated as
$$\mathrm{Acc} = \frac{1}{m}\sum_{i=1}^{m} p_i \times 100\%, \tag{18}$$
where
$$p_i = \begin{cases} 1, & \text{if } \mathrm{num}(\tilde{\boldsymbol{\theta}}_i) = \mathrm{num}(\boldsymbol{\theta}_i) \\ 0, & \text{otherwise.} \end{cases} \tag{19}$$
Here, $\boldsymbol{\theta}_i$ and $\tilde{\boldsymbol{\theta}}_i$ denote the ground-truth and predicted directions of the $i$th testing sample, respectively, $i = 1, 2, \ldots, m$, where $m$ denotes the number of testing samples.
Additionally, RMSE is also a classic and common metric in past research [13,22]. We calculate the RMSE of the testing samples whose source number is estimated correctly. RMSE can be obtained by
$$\mathrm{RMSE} = \sqrt{\frac{1}{HQ}\sum_{h=1}^{H}\sum_{q=1}^{Q}\left(\tilde{\theta}_{h,q} - \theta_{h,q}\right)^2}, \tag{20}$$
where $H$ represents the number of samples whose source number is estimated correctly, and $Q$ represents the number of sources in a testing sample. $\theta_{h,q}$ and $\tilde{\theta}_{h,q}$ denote the $q$th ground-truth and estimated directions of the $h$th sample, respectively.
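For reference, the two metrics can be computed as follows, assuming each sample's estimates and ground truths are given as lists of angles and that angles are paired after sorting (an assumption, since the pairing rule is not stated in the paper).

```python
import numpy as np

def acc(pred_angle_lists, true_angle_lists):
    """Percentage of test samples whose estimated source number matches the ground truth, Eq. (18)-(19)."""
    hits = [len(p) == len(t) for p, t in zip(pred_angle_lists, true_angle_lists)]
    return 100.0 * np.mean(hits)

def rmse(pred_angle_lists, true_angle_lists):
    """RMSE over samples with correctly estimated source number, Eq. (20); angles sorted before pairing."""
    errs = []
    for p, t in zip(pred_angle_lists, true_angle_lists):
        if len(p) == len(t):
            errs.extend((np.sort(p) - np.sort(t)) ** 2)
    return np.sqrt(np.mean(errs))
```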

4.3. Determination of Th1 and Th2

In this part, the methods for determining $Th_1$ and $Th_2$ are described in detail. Since the detection process can be regarded as a binary classification task and $Th_1$ is an important threshold that decides the detection results, the F1 score serves as the criterion for selecting the optimal $Th_1$. In binary classification tasks, the F1 score is widely used to analyze the accuracy of machine learning models [23,24,25,26]. It takes both the precision and the recall of the model into account to provide an objective description of the method.
In order to obtain the F1 score for the detector network, the samples can be split into four parts according to their ground truth and predicted labels, as shown in Table 2.
In the detector network, the sample is considered positive if its source is in the corresponding sub-region. According to [25], precision and recall are first calculated by
$$\mathrm{precision} = \frac{TP}{TP + FP}, \tag{21}$$

$$\mathrm{recall} = \frac{TP}{TP + FN}, \tag{22}$$
where precision is the proportion of the predicted positive samples that are actually positive, and recall is the proportion of the actual positive samples that are correctly predicted as positive. It should be noted that since there are L = 79 detectors in our method, the numbers of samples used to obtain the precision and recall are pooled over all 79 detectors. In this way, the overall performance of the detection module is assessed, and the results are not dominated by the extreme results of individual detectors. The $F_1$ score is then obtained by calculating the harmonic mean of precision and recall:
$$F_1 = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} = \frac{2 \times TP}{2 \times TP + FP + FN}, \tag{23}$$
where the range of $F_1$ is $[0,\ 1]$. If all the positive samples are wrongly predicted, $F_1$ equals 0, the minimum value; when all the samples are correctly predicted, $F_1$ equals 1, the maximum value.
In the simulation, to select the best threshold T h 1 , we randomly generate 10,000 testing samples. The samples contain two sources, which are not located in one sub-region, and the SNR of these sources is configured to 15 dB. According to (23), we calculate the F 1 for each T h 1 with the step of 0.05 from 0 to 1, and the results are shown in Figure 4.
From Figure 4, we can observe that the $F_1$ score first increases and then decreases as $Th_1$ increases. When $Th_1$ is 0.25, $F_1$ reaches its highest value of 0.942. Thus, $Th_1$ is fixed to 0.25 in the remaining simulations and real data experiments.
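The threshold sweep described above can be sketched as follows, pooling TP/FP/FN over all detectors as in (23); the function names and list-based inputs are illustrative assumptions.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, Eq. (23)."""
    return 2.0 * tp / (2.0 * tp + fp + fn) if tp else 0.0

def select_th1(scores, labels, step=0.05):
    """Sweep candidate thresholds Th1 from 0 to 1 in steps of 0.05 and keep the best F1."""
    best_th, best_f1 = 0.0, -1.0
    for k in range(int(1 / step) + 1):
        th = k * step
        pred = [s > th for s in scores]
        tp = sum(p and l for p, l in zip(pred, labels))
        fp = sum(p and not l for p, l in zip(pred, labels))
        fn = sum((not p) and l for p, l in zip(pred, labels))
        f1 = f1_score(tp, fp, fn)
        if f1 > best_f1:
            best_th, best_f1 = th, f1
    return best_th, best_f1
```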
As for the threshold of angle merging T h 2 , since the high-resolution DOA estimation is not considered, it is supposed to be lower than δ 1 . Therefore, T h 2 is set to 2° in the simulation and real data experiments.

4.4. Simulation Results

4.4.1. Sources in the Same SNR Scenarios

Two simulations are conducted under adverse conditions to assess the performance of DNN-SW, including low SNR and array imperfections. In this part, the high dynamic condition is not considered, which means that the SNR of all the sources is the same.
Since DNN-SW can estimate the number and the directions of the sources simultaneously, in this simulation both quantities are unknown and need to be estimated. To further evaluate the proposed method, comparison methods are applied. Existing methods can rarely accomplish the two tasks at the same time, so the whole task is divided into a source number estimation part and a DOA estimation part for the comparison methods. In the source number estimation part, two conventional methods are employed for comparison: AIC and MDL [27,28,29,30,31]. In the DOA estimation part, FFT, MUSIC, and ESPRIT are utilized. Furthermore, since MUSIC and ESPRIT are based on the covariance matrix, the spatial smoothing algorithm is applied to generate the covariance matrix before estimation [32].
Firstly, we consider two sources in the low SNR situation, and both have the same SNR in each testing sample. The two sources impinge on the array from the directions of −6.25° and 3.18°. The SNR varies from 0 dB to 20 dB with a step of 2 dB. For each SNR, the RMSE is obtained by averaging the results of 1000 Monte Carlo (MC) runs. The source number estimation and DOA estimation results are shown in Figure 5. In (a), we can observe that the Acc of the three source number estimation methods is very high. For MDL and DNN-SW, when the SNR is larger than 4 dB, the Acc is constantly above 99.5%. For AIC, the Acc is a little lower, about 98%, but it is more robust to SNR. The results indicate that the proposed method achieves advanced performance for source number estimation. Furthermore, since AIC is more robust and the difference in RMSE between the two criteria is small owing to the high Acc and the adequate number of MC runs, the results of AIC are used for the DOA estimation of FFT, MUSIC, and ESPRIT. As for the DOA estimation results in (b), the RMSE of all methods decreases as the SNR increases. Among all the methods, DNN-SW performs best when the SNR is below 18 dB. The results indicate that, compared with the other methods, DNN-SW achieves better DOA estimation performance in low SNR conditions due to its strong data-fitting capability.
Moreover, three kinds of array imperfections are considered, including gain inconsistency, phase inconsistency, and position perturbation [14]. It is assumed that the gain-phase inconsistencies and position perturbations of the different antennas are uniformly distributed within 3η dB, 30η°, and 0.15λη, respectively, where η is an imperfection factor measuring the severity of the imperfections, varying from 0 to 1 with a step of 0.1. In this simulation, two sources impinge on the array from the directions of −6.25° and 3.18°, and their SNR is set to 15 dB. For each η, the RMSE is obtained by averaging the results of 1000 MC runs. The results are shown in Figure 6. From (a), it can be observed that the three methods can all estimate the source number accurately. The Acc of DNN-SW is above 99% when η is 0.8 and still reaches 92% even when η is 1, which means the source number estimation results of DNN-SW are reliable. For MDL and AIC, the results are both satisfying and show a trend similar to the previous simulation. Here, the number estimation results of AIC are again used as the basis for the DOA estimation comparison methods. (b) gives the RMSE of the different methods; DNN-SW consistently performs better than the other methods as the error increases. The results show that the deep learning method is more robust to array imperfections because it can adaptively learn detailed information from the input data.

4.4.2. Sources in High Dynamic SNR Scenarios

In this part, we focus on the high dynamic SNR scenarios, in which the SNRs of different sources differ. Since the results in Section 4.4.1 show that deep learning methods perform better under adverse conditions than conventional methods, two deep learning methods are applied as comparisons in this simulation: DNN-NSW and the deep convolution network (DCN) [11]. The difference between DNN-NSW and DNN-SW is that there is no overlap between adjacent sub-regions in DNN-NSW; that is, in DNN-NSW, the sliding window step size $\delta_2$ is set to 3°, the same as $\delta_1$, so a source appears in only one sub-region. The remaining parameters and configurations of the two methods are the same. Furthermore, since the DCN method requires multiple snapshots, the number of snapshots is set to 50.
Firstly, we vary the directions of the two sources to assess the DOA estimation performance. The first source $\theta_1$ varies from −49.59° to 49.41° with a step of 1°, and the direction of the second source is $\theta_2 = \theta_1 + 6.25°$. The directions of the sources are all off-grid. When the SNR of both sources is 10 dB, the estimation results of DCN, DNN-NSW, and DNN-SW are shown in Figure 7a–c, respectively. We can observe that the three deep learning methods all achieve satisfying performance in this case. Figure 7d–f depicts the results when the SNRs of the two sources are 10 dB and 18 dB, respectively. The results demonstrate that the ΔSNR between the two sources severely degrades the performance of DCN; the source with the lower SNR is rarely estimated. By contrast, DNN-SW can significantly alleviate this problem due to the design of the sub-region network structure: the lower-SNR source is estimated in its own sub-region, and the influence of the other source is reduced. In addition, compared with DNN-NSW, we can infer that the sliding window improves both the accuracy of estimating the number of sources and the DOA estimation performance.
Additionally, to investigate the effect of the SNR difference between the two sources, their directions are fixed at 6.28° and 15.72°, and their SNRs are configured as 10 dB and 10 dB + ΔSNR, respectively. In the simulation, ΔSNR varies from 0 dB to 10 dB, and the RMSE for each ΔSNR is obtained by averaging the results of 1000 MC runs. As shown in Figure 7d, DCN may miss the low-SNR source, so the Acc of the three methods is also discussed. The results are shown in Figure 8a; it can be seen that DNN-SW can precisely estimate the source number even for high ΔSNR, while DCN fails to estimate the source number when ΔSNR > 5 dB. As for the RMSE given in Figure 8b, the RMSE of DNN-SW is much lower than that of DNN-NSW due to the overlapping design and the repeated estimation of the sources. As for DCN, it achieves a more precise estimation when ΔSNR is small because its input contains information from multiple snapshots. However, as ΔSNR increases, DNN-SW shows its advantages due to the design of the sub-region network structure.

4.5. Real Data Experiment Results

To further evaluate the practical application value of DNN-SW, real data are collected using an MMWCAS-RF-EVM radar in a practical scenario, and the experiments are conducted based on these data. The specific configuration of the radar antennas is described in Figure 9. The radar has 12 transmit and 16 receive antennas, resulting in 86 non-overlapping azimuth virtual array elements; in this experiment, 40 virtual array elements are used. The data collection scenario is shown in Figure 10. Two different corner reflectors are fixed at a distance of 6 m from the radar, and their directions relative to the radar are −7.2° and 4.8°, respectively. In [13], the RMSE of ESPRIT is lower than 0.01 when the array has 16 elements, the number of snapshots is 1000, and the SNR of the sources is 15 dB. Based on this result, under our experimental conditions, the directions of the corner reflectors computed by ESPRIT using all 86 virtual array elements and 1000 snapshots are expected to have an RMSE lower than 0.01; therefore, they are taken as the ground truth in this real data experiment.
Based on the measured reflected signals, the DOA estimation results of the different methods are shown in Figure 11 and Table 3. We can observe from the spectrum in Figure 11 that the difference in SNR between the two corner reflectors is 5.3 dB. In this case, DNN-SW performs best, with an RMSE of only 0.071. For the conventional methods, the estimation error is larger, and MUSIC performs best among them. As for the deep learning method DCN, the stronger source is estimated more accurately than by most conventional methods, while the weaker source is missed. The results verify the effectiveness of DNN-SW in practical applications and indicate that the structure of our method improves performance in high dynamic scenarios.

5. Conclusions

In this paper, a deep neural network framework with an angular sliding window is proposed for DOA estimation in highly dynamic SNR scenarios. The method divides the FOV into a set of sub-regions, and in each sub-region the sources are estimated separately. A detector network and an estimator network are designed for source detection and estimation. Based on the assumption that there is at most one source in each sub-region, each network can be trained with single-source data, which alleviates the requirement on training data and improves DOA estimation performance in highly dynamic scenarios. Simulation results verify the effectiveness of DNN-SW and show that it can accurately estimate source directions in highly dynamic SNR scenarios. Furthermore, the real data experiment shows that the RMSE of the proposed method is 0.071, which is superior to FFT, MUSIC, ESPRIT, and DCN.

Author Contributions

Conceptualization, Y.L. and Y.W.; methodology, Y.L., Y.W. and Z.H.; software, Z.H. and C.L.; validation, Z.H., Y.L. and C.L.; formal analysis, C.L. and L.Z.; investigation, Y.L. and Z.H.; resources, Y.W. and L.Z.; data curation, J.W., Y.Z. and H.L.; writing—original draft preparation, Y.L. and Z.H.; writing—review and editing, Y.W., C.L., L.Z., J.W., Y.Z. and H.L.; visualization, J.W., Y.Z. and H.L.; supervision, Y.W. and L.Z.; project administration, Y.W., Y.L. and Z.H.; funding acquisition, Y.W. and Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key R&D Program of China (Grant No. 2018YFE0202101, Grant No. 2018YFE0202103), China Postdoctoral Science Foundation (Grant No. 2021M690412), Natural Science Foundation of Chongqing, China (Grant No. cstc2020jcyj-msxmX0812) and project ZR2021MF134 supported by Shandong Provincial Natural Science Foundation.

Data Availability Statement

Data available on request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wang, Q.; Hu, X.; Deng, X.; Buris, N.E. DoA Estimation Using Neural Tangent Kernel under Electromagnetic Mutual Coupling. Electronics 2021, 10, 1057.
  2. Hassanien, A.; Amin, M.G.; Zhang, Y.D.; Ahmad, F. High-resolution single-snapshot DOA estimation in MIMO radar with colocated antennas. In Proceedings of the 2015 IEEE Radar Conference (RadarCon), Arlington, VA, USA, 10–15 May 2015; pp. 1134–1138.
  3. Xiang, H.; Chen, B.; Yang, T.; Liu, D. Improved De-Multipath Neural Network Models With Self-Paced Feature-to-Feature Learning for DOA Estimation in Multipath Environment. IEEE Trans. Veh. Technol. 2020, 69, 5068–5078.
  4. Chen, H.; Chen, K.; Cheng, K.; Chen, Q.; Fu, Y.; Li, L. An Efficient Hardware Accelerator for the MUSIC Algorithm. Electronics 2019, 8, 511.
  5. Jung, Y.; Jeon, H.; Lee, S.; Jung, Y. Scalable ESPRIT Processor for Direction-of-Arrival Estimation of Frequency Modulated Continuous Wave Radar. Electronics 2021, 10, 695.
  6. Sun, S.; Petropulu, A.P.; Poor, H.V. MIMO Radar for Advanced Driver-Assistance Systems and Autonomous Driving: Advantages and Challenges. IEEE Signal Process. Mag. 2020, 37, 98–117.
  7. Wei, R.; Wang, Q.; Zhao, Z. Two-Dimensional DOA Estimation Based on Separable Observation Model Utilizing Weighted L1-Norm Penalty and Bayesian Compressive Sensing Strategy. In Proceedings of the 2017 4th International Conference on Information Science and Control Engineering (ICISCE), Changsha, China, 21–23 July 2017; pp. 1764–1768.
  8. Bosse, J.; Rabaste, O. Subspace Rejection for Matching Pursuit in the Presence of Unresolved Targets. IEEE Trans. Signal Process. 2018, 66, 1997–2010.
  9. Wei, Z.; Wang, W.; Dong, F.; Liu, Q. Gridless One-Bit Direction-of-Arrival Estimation Via Atomic Norm Denoising. IEEE Commun. Lett. 2020, 24, 2177–2181.
  10. Barthelme, A.; Utschick, W. A Machine Learning Approach to DoA Estimation and Model Order Selection for Antenna Arrays With Subarray Sampling. IEEE Trans. Signal Process. 2021, 69, 3075–3087.
  11. Wu, L.; Liu, Z.-M.; Huang, Z.-T. Deep Convolution Network for Direction of Arrival Estimation With Sparse Prior. IEEE Signal Process. Lett. 2019, 26, 1688–1692.
  12. Guo, Y.; Zhang, Z.; Huang, Y.; Zhang, P. DOA Estimation Method Based on Cascaded Neural Network for Two Closely Spaced Sources. IEEE Signal Process. Lett. 2020, 27, 570–574.
  13. Papageorgiou, G.; Sellathurai, M.; Eldar, Y. Deep Networks for Direction-of-Arrival Estimation in Low SNR. IEEE Trans. Signal Process. 2021, 69, 3714–3729.
  14. Liu, Z.-M.; Zhang, C.; Yu, P.S. Direction-of-Arrival Estimation Based on Deep Neural Networks With Robustness to Array Imperfections. IEEE Trans. Antennas Propagat. 2018, 66, 7315–7327.
  15. Chen, P.; Chen, Z.; Liu, L.; Chen, Y.; Wang, X. SDOAnet: An Efficient Deep Learning-Based DOA Estimation Network for Imperfect Array. arXiv 2022, arXiv:2203.10231.
  16. Wu, X.; Yang, X.; Jia, X.; Tian, F. A Gridless DOA Estimation Method Based on Convolutional Neural Network With Toeplitz Prior. IEEE Signal Process. Lett. 2022, 29, 1247–1251.
  17. Su, X.; Hu, P.; Liu, Z.; Shi, J.; Li, X. Deep Alternating Projection Networks for Gridless DOA Estimation With Nested Array. IEEE Signal Process. Lett. 2022, 29, 1589–1593.
  18. Häcker, P.; Yang, B. Single snapshot DOA estimation. Adv. Radio Sci. 2010, 8, 251–256.
  19. Zhang, L.; Shi, C.; Niu, J.; Ji, Y.; Jonathan Wu, Q.M. DOA Estimation for HFSWR Target Based on PSO-ELM. IEEE Geosci. Remote Sensing Lett. 2022, 19, 1–5.
  20. Richards, M.A. Fundamentals of Radar Signal Processing, 2nd ed.; McGraw-Hill Education: New York, NY, USA, 2014; ISBN 978-0-07-179833-4.
  21. Yang, Y.; Gao, F.; Qian, C.; Liao, G. Model-Aided Deep Neural Network for Source Number Detection. IEEE Signal Process. Lett. 2020, 27, 91–95.
  22. Lima de Oliveira, M.L.; Bekooij, M.J.G. ResNet Applied for a Single-Snapshot DOA Estimation. In Proceedings of the 2022 IEEE Radar Conference (RadarConf22), New York, NY, USA, 21–25 March 2022; pp. 1–6.
  23. Huang, H.; Xu, H.; Wang, X.; Silamu, W. Maximum F1-Score Discriminative Training Criterion for Automatic Mispronunciation Detection. IEEE/ACM Trans. Audio Speech Lang. Process. 2015, 23, 787–797.
  24. Pillai, I.; Fumera, G.; Roli, F. Designing multi-label classifiers that maximize F measures: State of the art. Pattern Recognit. 2017, 61, 394–404.
  25. Calders, T.; Esposito, F.; Hüllermeier, E.; Meo, R. (Eds.) Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2014, Nancy, France, 15–19 September 2014, Proceedings, Part II; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2014; Volume 8725, ISBN 978-3-662-44850-2.
  26. Chicco, D.; Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 2020, 21, 6.
  27. DeRidder, F.; Pintelon, R.; Schoukens, J.; Gillikin, D.P. Modified AIC and MDL Model Selection Criteria for Short Data Records. IEEE Trans. Instrum. Meas. 2005, 54, 144–150.
  28. Ding, J.; Tarokh, V.; Yang, Y. Bridging AIC and BIC: A New Criterion for Autoregression. IEEE Trans. Inform. Theory 2018, 64, 4024–4043.
  29. Seghouane, A.-K. Asymptotic bootstrap corrections of AIC for linear regression models. Signal Process. 2010, 90, 217–224.
  30. Huang, L.; So, H.C. Source Enumeration Via MDL Criterion Based on Linear Shrinkage Estimation of Noise Subspace Covariance Matrix. IEEE Trans. Signal Process. 2013, 61, 4806–4821.
  31. Bazzi, A.; Slock, D.T.M.; Meilhac, L. Detection of the number of superimposed signals using modified MDL criterion: A random matrix approach. In Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 20–25 March 2016; pp. 4593–4597.
  32. Liao, W.; Fannjiang, A. MUSIC for single-snapshot spectral estimation: Stability and super-resolution. Appl. Comput. Harmon. Anal. 2016, 40, 33–67.
Figure 1. The system model for the DOA estimation with the far-field narrowband source.
Figure 2. The architecture of DNN-SW for DOA estimation.
Figure 3. The sub-region division method.
Figure 4. (a) The value of F1 for each Th1 with a step of 0.05 from 0 to 1. (b) The zoomed-in view of the red box in (a).
Figure 5. The DOA estimation performance with different SNRs. (a) Acc, (b) RMSE.
Figure 6. The DOA estimation performance with array imperfections. η is an imperfection factor. The gain-phase errors and position perturbation are 3η dB, 30η°, and 0.15λη, respectively. (a) Acc, (b) RMSE.
Figure 7. The DOA estimation of off-grid sources. First row: the SNR of the two sources is 15 dB. Second row: the SNRs of the two sources are 10 dB and 18 dB. (a) DCN, Acc: 100%, RMSE: 0.061; (b) DNN-NSW, Acc: 100%, RMSE: 0.149; (c) DNN-SW, Acc: 100%, RMSE: 0.128; (d) DCN, Acc: 15%, RMSE: 0.129; (e) DNN-NSW, Acc: 78%, RMSE: 0.396; (f) DNN-SW, Acc: 97%, RMSE: 0.262.
Figure 8. The DOA estimation performance of two unequally-powered sources from the directions of 6.28° and 15.72°. (a) Acc, (b) RMSE.
Figure 9. The PCB antenna arrays of the MMWCAS-RF-EVM radar. 1, 2, and 3 are the receiving antennas; 4 is the transmitting antennas.
Figure 10. The measured scenario with the 77 GHz millimeter-wave array. (a) The measured scenario, (b) the scenario schematic.
Figure 11. The DOA estimation of two different corner reflectors.
Table 1. Parameter settings.

Sensor array
  Configuration: ULA
  Inter-element spacing: d = λ/2
  Number of elements: N = 40
Sub-region
  FOV: [φmin, φmax] = [−60°, 60°]
  Range of sub-region: δ1 = 3°
  Step of sub-region: δ2 = 1.5°
  Interval of the FOV: δ3 = 0.1°
  Threshold of detector ¹: Th1 = 0.25
  Threshold of angle merging ¹: Th2 = 2°
Detector and estimator networks
  Hidden layer sizes, detector network: 16, 1
  Hidden layer sizes, estimator network: 128, 256, 30
  Output activation: Detector: Tanh; Estimator: Softmax
¹ The values of Th1 and Th2 are described in Section 4.3.
Table 2. The standard confusion matrix.

                 | Predicted Positive | Predicted Negative
Actual positive  | TP                 | FN
Actual negative  | FP                 | TN
Table 3. DOA estimation results of the real data.

Method    | Target 1 (−7.2°) estimate (error) | Target 2 (4.8°) estimate (error) | RMSE
FFT       | −6.955° (0.245°)                  | 5.042° (0.242°)                  | 0.243
MUSIC     | −7.1° (0.1°)                      | 4.9° (0.1°)                      | 0.1
ESPRIT    | −7.2° (0°)                        | 5.0° (0.2°)                      | 0.141
DCN ¹     | −7.25° (0.05°)                    | /                                | /
DNN-SW ²  | −7.2° (0°)                        | 4.7° (0.1°)                      | 0.071
¹ Deep convolution network (DCN) was proposed in [11]. ² Deep neural network with sliding window (DNN-SW) is proposed in this paper.
