
Off-grid DOA estimation using improved root sparse Bayesian learning for non-uniform linear arrays

Abstract

This paper concerns direction of arrival (DOA) estimation based on a sparse Bayesian learning (SBL) approach. We address two inherent problems of this class of DOA estimation methods: (i) a predefined dictionary introduces off-grid errors into an SBL DOA estimator; (ii) a parametric prior generally enforces the solution to be sparse, but the presence of noise can greatly affect the sparsity of the solution. Both issues may have a negative impact on the estimation accuracy. In this paper, we propose an improved root SBL (IRSBL) method for off-grid DOA estimation that adopts a coarse grid to generate an initial dictionary. To reduce the bias caused by dictionary mismatch, we integrate the polynomial rooting approach into the SBL method to refine the spatial angle grid. Then, we integrate a constant false alarm rate (CFAR) rule into the SBL framework to enforce sparsity and improve computational efficiency. Finally, we generalize the IRSBL method to the case of non-uniform linear arrays. Numerical analysis demonstrates that the proposed IRSBL method provides improved performance in terms of both estimation accuracy and computational complexity over the most relevant existing methods.

1 Introduction

Direction-of-arrival (DOA) estimation has been a popular research field in array processing for several decades [1,2,3,4,5]. The problem aims at retrieving the directional information of sources from the signals received by an array, and plays a fundamental role in various practical scenarios, such as radar, sonar, and navigation. Conventional subspace-based DOA estimation methods, such as ESPRIT [6, 7] and MUSIC [8], usually need multiple snapshots for accurate DOA estimation. Alternatively, sparse signal representation (SSR) methods have attracted a lot of attention because they exhibit various advantages over the above subspace-type DOA estimation methods, such as improved robustness to noise and to signal correlation, and the ability to work with a limited number of snapshots. Moreover, they do not require prior knowledge of the number of sources [9,10,11].

With the development of compressed sensing theory, several SSR approaches have been applied to DOA estimation. For example, \(\ell _{1}\)-SVD [9], LASSO [12], basis pursuit denoising [13], and the iterative soft thresholding algorithm [14, 15] are favorable for the SSR problem because of their guaranteed recovery accuracy. Nevertheless, these methods require the setting of a penalty parameter to balance the influence of the residual-fitting term and the sparsity-promoting term, which limits their performance and usage in practical applications. Sparse Bayesian learning (SBL) [16,17,18,19,20,21,22] is another popular method in the SSR category. Under the SBL framework, the SSR problem is formulated in a Bayesian setting where the signal of interest is assumed to have a sparse prior. As an example, a Student t prior leads to a maximum a posteriori (MAP) estimate that is equivalent to the reweighted \(\ell _{2}\)-norm approach [21]. The penalty parameter, as well as the weight parameters, can then be adaptively updated by exploiting the expectation-maximization (EM) method. As a result, a valuable advantage of the SBL method is that it is free from empirical settings of the penalty parameter [23]. Therefore, the SBL method has recently attracted a lot of attention, especially in applications such as DOA estimation.

Like the majority of SSR methods, the conventional SBL method exploits the spatial sparsity of the incident signal: the angle space is divided into a grid whose points are the potential echo incident directions. Unfortunately, it is often unrealistic to assume that the DOA lies exactly on a predefined grid, because the angle space is continuous. In early research, a dense grid was introduced to reduce the bias between the actual DOA and its nearest point on the grid. However, a dense grid results in a high-dimensional steering dictionary, whose computational cost is unacceptable for low-cost hardware. Therefore, some researchers turned to off-grid SSR methods, where the DOA is no longer constrained to the discrete grid. In [24], the authors formalized the bias between the actual DOA and its nearest point on the grid as an errors-in-variables estimation problem and invoked a sparse total least squares solver to obtain the perturbation matrix. If the perturbation matrix caused by the basis mismatch obeys a Gaussian distribution, the result is equivalent to an optimal MAP estimator; obviously, this assumption is not always suitable for the DOA estimation problem. In [25], Yang et al. proposed an off-grid SBL (OGSBL) method by exploiting a first-order Taylor expansion on the discrete grid and assuming that the off-grid gap obeys a (non-informative) uniform distribution. Since the corresponding covariance matrix is not invertible, it is analytically intractable to obtain an explicit expression of the step size in the Taylor expansion approximation. Instead, a truncated version was exploited to ensure a feasible solution, but this inevitably leads to model errors, especially when a coarse grid is used. To address this problem, Dai et al. applied the polynomial rooting method to refine the grid of the dictionary, followed by an alternate optimization operation [26]. In this way, the method not only improves the computational efficiency, but also avoids modeling errors. To further reduce the computational complexity, Wang et al. [27] proposed an off-grid SBL method with a generalized double Pareto prior, called CGDP-SBL. The CGDP-SBL method first runs the conventional SBL algorithm until convergence and then exploits a fixed-step search method for grid gap estimation. Although this reduces the computational complexity, the estimation result is not optimal because alternate iterative optimization is not used, especially at high signal-to-noise ratio (SNR) or with a large number of snapshots. In [28], Dai et al. proposed a real-valued variational Bayesian inference (RVBI) method that estimates the mean and covariance matrix of the weight matrix by embedding generalized approximate message passing into a real-valued SBL framework, instead of a brute-force matrix computation, which improves the computational efficiency to some extent. This motivated us to develop a computationally efficient method with improved estimation accuracy.

Moreover, a parametric prior generally enforces the model to be sparse with few non-zero weights, but due to the presence of noise there are still many weights close to zero but not equal to zero. As a result, we cannot obtain a sparse solution, and sparsity must be explicitly enforced. Reviewing existing SBL methods, they can be roughly divided into three categories. i) Methods that use prior distributions different from the Gaussian, with sharper peaks at zero, to enhance sparsity, for example, the Laplace [29], Gauss-Exp-\(\hbox {Chi}^{2}\) [30], and generalized double Pareto [27] priors. However, because of the noise, there are still several very small estimates corresponding to wrong candidate columns of the dictionary, which may negatively impact the estimation. ii) According to the correlation subspace property [28], a real orthogonal projection transformation is incorporated into the observation model, which greatly reduces the power of the wrong grid points but causes the signal power to degenerate into a pseudo-spectrum. iii) The variance of the prior in the SBL method represents a measure of the spread of the values of the random variable around its mean. For example, for a Gaussian prior \(p(x_{n}|0,\alpha _{n})\), the estimate of \(x_{n}\) is forced to zero (or to be sparse for a vector \(\varvec{x}\)) with higher probability as the variance approaches zero (\(\alpha _{n}\rightarrow 0\)). Therefore, schemes that prune the columns of the dictionary whose variance value in \({\varvec{\alpha }}\) falls below a fixed threshold are exploited to enforce sparsity [16, 31, 32]. Notably, this process both enforces sparsity and reduces the dimension of the dictionary, thus decreasing the computational complexity. In some cases, however, these thresholds may not be tractable choices because they have a significant effect on the estimation errors. Alternatively, the authors in [33] proposed an empirical threshold set by a specific SNR value, which seems more robust but lacks theoretical support. In [34], the authors modeled the variance of the prior as Gamma distributed with distinct parameters and used Chebyshev's inequality to determine the threshold; however, this method relies on prior knowledge of the sparsity level (or, for DOA estimation problems, the number of sources). The constant false alarm rate (CFAR) criterion is a classical rule that makes a detection problem robust against changes in the operating conditions. In this paper, we enforce sparsity by integrating a CFAR algorithm into the SBL DOA estimator.

Recently, non-uniform linear arrays (NULAs) have attracted widespread attention because they can achieve lower estimation errors than a ULA with the same number of sensors [35, 36]. However, several traditional methods, such as spatial smoothing MUSIC [37], least squares ESPRIT [6], and total least squares ESPRIT [7], cannot be directly applied since they require dividing the array into subarrays with similar geometry. Fortunately, the SBL method is not limited by the array geometry used for sampling measurements, and the aforementioned SBL, OGSBL, and CGDP-SBL methods can easily be generalized to the case of non-uniform sampling. The Root SBL method, however, is slightly different, since the polynomial coefficients occupy non-uniformly spaced orders. Generalizing the proposed algorithm to non-uniform sampling is another contribution of this paper.

Summarizing, the main contributions of this paper are:

  1. (i)

    To enforce sparsity, reduce off-grid errors, and keep the computational complexity reasonable, a CFAR rule is integrated into the SBL model, and the adaptive threshold is derived for pruning the columns of the dictionary.

  2. (ii)

    To generalize the proposed method to the NULA configuration, an irregular polynomial rooting method is incorporated into the SBL framework. As a result, off-grid DOAs can be obtained as roots of a certain polynomial.

  3. (iii)

    Finally, an alternating iterative optimization scheme is exploited to obtain the hyperparameters \({\varvec{\alpha }}^{{\text{new}}}\), \(\beta ^{{\text{new}}}\), the spatial angle grids \({\varvec{\theta }}^{{\text{new}}}\), the adaptive thresholds \(T(\tilde{\varvec{\alpha }}_{n})\), and the expectation of the weight matrix \(\varvec{X}\). Several simulation results are presented to show that the proposed method achieves remarkable performance, with estimation accuracy close to the Cramér-Rao bound (CRB) and low computational complexity.

Notations: Throughout this paper, parameters, vectors, matrices, and sets are denoted by italic letters, italic lowercase bold letters, italic uppercase bold letters, and uppercase outline letters, respectively. The superscripts \((\cdot )^{*}\), \((\cdot )^{T}\), \((\cdot )^{H}\) and \((\cdot )'\) represent the complex conjugate, the transpose, the complex conjugate transpose, and the derivative, respectively. \({\mathbb {R}}\) and \({\mathbb {C}}\) indicate the real and complex spaces, respectively. \({\mathbb {E}}(\cdot )\) indicates the expectation operation. \({\text{diag}(\varvec{\alpha })}\) denotes a diagonal matrix whose diagonal elements are the entries of the vector \(\varvec{\alpha }\). \({\text{Tr}(\cdot )}\) returns the trace of a matrix, i.e., the sum of its diagonal elements. \(\odot\) indicates the Hadamard product.

The remaining part of this paper is organized as follows: Sect. 2 presents the problem formulation for DOA estimation in ULA or NULA. In Sect. 3, an improved root SBL method is presented for off-grid DOA estimation, which provides a CFAR rule for dictionary basis selection. Section 4 provides several simulation results, and finally, the conclusions are drawn in Sect. 5.

2 Problem formulation

Fig. 1 Array geometries

We suppose that K narrow-band far-field sources impinge on an M-element uniform linear array (ULA) or non-uniform linear array (NULA) as depicted in Fig. 1, and the corresponding sensor positions are respectively given by \({\mathbb {S}}_{{\text{ULA}}}d=[0,1,2,3,4,5]d\) and \({\mathbb {S}}_{{\text{NULA}}}d=[0,1,2,5,8,11]d\), where d is set to half the wavelength. The K sources arrive at the array from distinct directions, with respect to (w.r.t.) the normal of the array. The \(M\times 1\) array output vector \({\varvec{y}}_t\) is given by

$$\begin{aligned} {\varvec{y}}_t={\varvec{A}}{\varvec{x}}_t+{\varvec{n}}_t,t=0,1,\cdots ,L-1 \end{aligned}$$
(1)

where \({\varvec{y}}_t=[y_{1}(t),y_{2}(t),\cdots ,y_{M}(t)]^{T}\); \({\varvec{x}}_t=[x_{1}(t),x_{2}(t), \cdots ,x_{N}(t)]^{T}\) represents the weight vector, K entries of which are non-zero; \({\varvec{A}}=[{\varvec{a}}(\theta _{1}),{\varvec{a}}(\theta _{2}),\cdots ,{\varvec{a}}(\theta _{N})]\in {\mathbb {C}}^{M\times N}\) indicates the steering matrix, K columns of which correspond to the DOAs of the K sources; N is the dictionary size; \({\varvec{a}}(\theta _{n})=e^{j2\pi d{{\mathbb {S}}}\sin (\theta _{n})/\lambda }\); and L is the number of snapshots. Herein, we define \({{\mathbb {S}}}={{\mathbb {S}}}_{{\text{ULA}}}\) for the ULA, while \({{\mathbb {S}}}={{\mathbb {S}}}_{{\text{NULA}}}\) for the NULA. \({\varvec{n}}_t=[n_{1}(t),n_{2}(t),\cdots ,n_{M}(t)]^{T}\) is the noise vector. Define \({\varvec{Y}}=[{\varvec{y}}_0,{\varvec{y}}_1,\cdots ,{\varvec{y}}_{L-1}]\), \({\varvec{N}}=[{\varvec{n}}_0,{\varvec{n}}_1,\cdots ,{\varvec{n}}_{L-1}]\), and \({\varvec{X}}=[{\varvec{x}}_0,{\varvec{x}}_1,\cdots ,{\varvec{x}}_{L-1}]\). The array output can then be rewritten as the multiple measurement vector model:

$$\begin{aligned} {\varvec{Y}}={\varvec{A}}{\varvec{X}}+{\varvec{N}} \end{aligned}$$
(2)

The weight matrix \(\varvec{X}\) to be estimated is jointly sparse (or row-sparse), i.e., all columns of \(\varvec{X}\) are sparse and share the same support.
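To make the model concrete, the following Python sketch builds the steering dictionary of Eq. (1) for the geometries of Fig. 1 and simulates the multiple measurement vector model of Eq. (2). The helper name steering_matrix and the unit-power, circular Gaussian source model are our illustrative assumptions, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sensor position sets (in units of d = lambda/2), as in Fig. 1
S_ULA = np.array([0, 1, 2, 3, 4, 5])
S_NULA = np.array([0, 1, 2, 5, 8, 11])

def steering_matrix(theta_deg, S, d_over_lambda=0.5):
    """Columns a(theta_n) = exp(j*2*pi*(d/lambda)*S*sin(theta_n)), Eq. (1)."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    return np.exp(1j * 2 * np.pi * d_over_lambda * np.outer(S, np.sin(theta)))

# Coarse dictionary: N = 31 candidate angles with kappa = 6 degree spacing
grid = np.arange(-90.0, 91.0, 6.0)
A = steering_matrix(grid, S_NULA)                 # M x N steering dictionary

# K = 2 off-grid sources, L snapshots, white complex Gaussian noise (Eq. (2))
true_doas = np.array([-7.2, 13.3])
M, L = len(S_NULA), 320
S_sig = (rng.standard_normal((2, L)) + 1j * rng.standard_normal((2, L))) / np.sqrt(2)
sigma2 = 0.1                                      # noise variance beta^{-1}
Noise = np.sqrt(sigma2 / 2) * (rng.standard_normal((M, L))
                               + 1j * rng.standard_normal((M, L)))
Y = steering_matrix(true_doas, S_NULA) @ S_sig + Noise
```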

3 Improved root SBL method

3.1 Hierarchical sparse Bayesian framework

Under the assumption of circular symmetric complex white Gaussian noise [38], we have

$$\begin{aligned} p({\varvec{N}}|\beta )=\prod _{t=0}^{L-1}\mathcal{C}\mathcal{N}({\varvec{n}}_{t}|0,\beta ^{-1}{\varvec{I}}) \end{aligned}$$
(3)

where \(\beta ^{-1}=\sigma ^{2}\) indicates the noise variance. According to the array output model of Eq. 2, we have

$$\begin{aligned} p({\varvec{Y}}|{\varvec{X}},\beta ;{\varvec{\theta }})=\prod _{t=0}^{L-1}\mathcal{C}\mathcal{N} ({\varvec{y}}_{t}|{\varvec{A}}{\varvec{x}}_{t},\beta ^{-1}{\varvec{I}}) \end{aligned}$$
(4)

\({\varvec{y}}_{t}\) indicates the t-th column of \({\varvec{Y}}\) and \({\varvec{\theta }}=[\theta _1,\theta _2,\cdots ,\theta _N]\). In order to obtain a two-stage hierarchical prior that forces most rows of \({\varvec{X}}\) to be zero, a sparse prior is required for the sparse matrix \({\varvec{X}}\) of interest. Typically, under the SBL framework, a zero-mean Gaussian prior is often applied [16, 18, 25]:

$$\begin{aligned} p({\varvec{X}}|{\varvec{\alpha }})=\prod _{t=0}^{L-1}\mathcal{C}\mathcal{N}({\varvec{x}}_{t}|0,{\varvec{\Lambda }}) \end{aligned}$$
(5)

where \({\varvec{\Lambda }}={\text{diag}({\varvec{\alpha }})}\) and \({\varvec{\alpha }}=[\alpha _{1},\alpha _{2},\cdots ,\alpha _{N}]^{T}\) is a column vector whose entries, \(\alpha _{n}\), are referred to as the hyperparameters. When \(\alpha _{n}\) approaches zero, \(\langle {\varvec{x}}_{t}\rangle _{n}\) is forced to zero, since the distribution of \(\langle {\varvec{x}}_{t}\rangle _{n}\) becomes highly concentrated around its mean (zero). Furthermore, Gamma distributions are considered over \(\varvec{\alpha }\) and \(\beta\) [16, 18, 25]:

$$\begin{aligned}&p({\varvec{\alpha }})=\prod _{n=1}^{N}{\text{Gamma}}(\alpha _{n};1,\rho ) \end{aligned}$$
(6)
$$\begin{aligned}&p(\beta )={\text{Gamma}}(\beta ;a,b) \end{aligned}$$
(7)

where \(p(\alpha _{n};1,\rho )={\rho {\text{exp}}(-\rho \alpha _{n})}\) and \(p(\beta ;a,b)=\beta ^{a-1}b^{a} \cdot \frac{{\text{exp}}(-b\beta )}{\Gamma (a)}\). The parameters \(\rho\), a, and b are generally set to small positive values so as to obtain broad hyperpriors [16, 18]. The conditional distribution \(p({\varvec{Y}}|{\varvec{X}},\beta ;{\varvec{\theta }})\) and the prior distribution \(p({\varvec{X}}|{\varvec{\alpha }})\) constitute the two-level hierarchical SBL framework depicted in Fig. 2.
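As a quick illustration of this generative hierarchy, the sketch below draws one sample from the two-level prior of Eqs. (5)-(7); the numerical values of \(\rho\), a, b, and N are our illustrative choices, not values prescribed by the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
N, rho, a, b = 31, 0.01, 1.0, 0.01          # illustrative hyperprior values (ours)
# Gamma(alpha_n; 1, rho) has rate rho, i.e. scale 1/rho in NumPy's convention
alpha = rng.gamma(shape=1.0, scale=1.0 / rho, size=N)      # Eq. (6)
beta = rng.gamma(shape=a, scale=1.0 / b)                   # Eq. (7)
# x_t ~ CN(0, diag(alpha)): entry n has variance alpha_n, Eq. (5)
x_t = np.sqrt(alpha / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
```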

Fig. 2 Hierarchical graphical model of the Root sparse Bayesian learning method

3.2 Bayesian inference

The work in [26] discussed the off-grid DOA estimation problem using a polynomial rooting method. However, that method is only valid for the ULA case. To generalize it to the NULA case, we re-derive the update expressions for the hyperparameters and the polynomial coefficients. To estimate the unknown hyperparameters, we adopt the MAP estimator. According to Bayes' theorem, the posterior distribution can be written as follows:

$$\begin{aligned} p({\varvec{X}},{\varvec{\alpha }},\beta |{\varvec{Y}};{\varvec{\theta }})=\frac{p({\varvec{Y}}|{\varvec{X}},{\varvec{\alpha }},\beta ;{\varvec{\theta }})p({\varvec{X}},{\varvec{\alpha }},\beta )}{ p({\varvec{Y}})} \end{aligned}$$
(8)

However, the normalizing integral in the denominator, \(p({\varvec{Y}})=\iiint p({\varvec{X}},{\varvec{\alpha }},\beta ) p({\varvec{Y}}|{\varvec{X}},{\varvec{\alpha }},\beta )d{\varvec{X}}d{\varvec{\alpha }}d\beta\), cannot be calculated directly. Instead, we decompose the posterior as \(p({\varvec{X}},{\varvec{\alpha }},\beta |{\varvec{Y}})=p({\varvec{X}}|{\varvec{\alpha }},\beta ,{\varvec{Y}})p({\varvec{\alpha }},\beta |{\varvec{Y}})\). When the hyperparameters achieve their most-probable values \({\varvec{{\hat{\alpha }}}}\), \({\hat{\beta }}\), \(p({\varvec{\alpha }},\beta |{\varvec{Y}})\) can be approximated by a Dirac delta function [16]. The reason we can adopt this point estimate is that the predictions generated using the posterior mode values are nearly identical to those obtained from the full posterior distribution. Consequently, the posterior can be simplified as \(p({\varvec{X}},{\varvec{\alpha }},\beta |{\varvec{Y}};{\varvec{\theta }})=p({\varvec{X}}|{\varvec{\alpha }},\beta ,{\varvec{Y}};{\varvec{\theta }})\delta ({\varvec{\alpha }}-{\varvec{{\hat{\alpha }}}})\delta (\beta -{\hat{\beta }})=p({\varvec{X}}|{\varvec{{\hat{\alpha }}}},{\hat{\beta }},{\varvec{Y}};{\varvec{\theta }})\delta ({\varvec{\alpha }}-{\varvec{{\hat{\alpha }}}})\delta (\beta -{\hat{\beta }})\).

To derive the MAP estimate of the hyperparameters, an EM procedure can be exploited. The E-step integrates over the weights \({\varvec{X}}\) to obtain the expectation function \({\text{ln}}p(\beta ,{\varvec{\alpha }}|{\varvec{Y}};{\varvec{\theta }})\), or equivalently \({\text{ln}}p({\varvec{Y}},\beta ,{\varvec{\alpha }};{\varvec{\theta }})\), since \({\text{ln}}p({\varvec{Y}},\beta ,{\varvec{\alpha }};{\varvec{\theta }})={\text{ln}}[p(\beta ,{\varvec{\alpha }}|{\varvec{Y}};{\varvec{\theta }})p({\varvec{Y}})]\) and \(p({\varvec{Y}})\) is independent of the parameters \(\beta ,{\varvec{\alpha }},{\varvec{\theta }}\). In the M-step, the expectation function is maximized w.r.t. the parameters of interest, i.e., \({\hat{\beta }},{\varvec{{\hat{\alpha }}}},{\varvec{{\hat{\theta }}}}\). Once \({\hat{\beta }},{\varvec{{\hat{\alpha }}}},{\varvec{{\hat{\theta }}}}\) are obtained, the associated posterior \(p({\varvec{X}},{\varvec{\alpha }},\beta |{\varvec{Y}};{\varvec{\theta }})\) is approximated as \(p({\varvec{X}},\hat{\varvec{\alpha }},\hat{\beta }|{\varvec{Y}};\hat{\varvec{\theta }})\), and the expectation of this approximate posterior is used as a point estimate of \({\varvec{X}}\).

In the E-step, we treat \({\varvec{X}}\) as latent variables. From Bayes’ rule, the posterior distribution \(p({\varvec{X}}|{\varvec{Y}},{\varvec{\alpha }},\beta ;{\varvec{\theta }})\) turns out to be a multivariate complex Gaussian [16, 18]:

$$\begin{aligned} p({\varvec{X}}|{\varvec{Y}},{\varvec{\alpha }},\beta ;{\varvec{\theta }})&=\frac{p({\varvec{Y}}|{\varvec{X}}, \beta ;{\varvec{\theta }})p({\varvec{X}}|{\varvec{\alpha }})}{\int p({\varvec{Y}}|{\varvec{X}},\beta ;{\varvec{\theta }})p({\varvec{X}}|{\varvec{\alpha }})d{\varvec{X}}}\\&=\prod _{t=0}^{L-1}\mathcal{C}\mathcal{N}({\varvec{x}}_{t}|{\varvec{\mu }}_{t},{\varvec{\Sigma }}) \end{aligned}$$
(9)

where (see Footnote 1)

$$\begin{aligned} {\varvec{\mu }}_{t}&=\beta {\varvec{\Sigma }}{\varvec{A}}^{H}{\varvec{y}}_{t},t=0,1,\cdots ,L-1 \end{aligned}$$
(10)
$$\begin{aligned} {\varvec{\Sigma }}&=(\beta {\varvec{A}}^{H}{\varvec{A}}+{\varvec{\Lambda }}^{-1})^{-1} \end{aligned}$$
(11)

According to the expectation framework, the complete data log-likelihood function is written as:

$$\begin{aligned}&{\text{ln}}p(\beta ,{\varvec{\alpha }},{\varvec{Y}},{\varvec{X}};{\varvec{\theta }})\\&\quad ={\text{ln}}[p({\varvec{Y}}|{\varvec{X}},\beta ;{\varvec{\theta }})p({\varvec{X}}|{\varvec{\alpha }})p(\beta )p({\varvec{\alpha }})]\\&\quad =-L\sum _{n=1}^{N}{\text{ln}}(\pi \alpha _{n})-\sum _{t=0}^{L-1}({{\varvec{x}}_{t}^{H}}{\varvec{\Lambda }}^{-1}{\varvec{x}}_{t})+LM{\text{ln}}\left( \frac{\beta }{\pi }\right) \\&\quad -\sum _{t=0}^{L-1}(\beta \Vert {\varvec{y}}_{t}-{\varvec{A}}{\varvec{x}}_{t}\Vert _{2}^{2})+\sum _{n=1}^{N}(-\rho \alpha _{n})+(a-1){\text{ln}}(\beta )\\&\quad -b\beta +N{\text{ln}}(\rho )+{\text{ln}}(b^{a}/\Gamma (a)) \end{aligned}$$
(12)

The above expression is consistent with the hierarchical Bayesian model of Fig. 2. In the E-step, under SBL framework, we then formulate the expectation function w.r.t. \({\varvec{X}}\), yielding:

$$\begin{aligned}&{\mathbb {E}}_{{\varvec{X}}|{\varvec{Y}},{\varvec{\alpha }},\beta ;{\varvec{\theta }}}\{{\text{ln}}[p(\beta ,{\varvec{\alpha }},{\varvec{Y}},{\varvec{X}};{\varvec{\theta }})]\}\\&\quad =-\sum _{t=0}^{L-1}{{\mathbb {E}}}_{{\varvec{X}}|{\varvec{Y}},{\varvec{\alpha }},\beta ;{\varvec{\theta }}}\left[ {{\varvec{x}}_{t}^{H}}{\varvec{\Lambda }}^{-1}{\varvec{x}}_{t}+\beta \Vert {\varvec{y}}_{t}-{\varvec{A}}{\varvec{x}}_{t}\Vert _{2}^{2}\right] \\&\quad +LM{\text{ln}}(\beta )-L\sum _{n=1}^{N}{\text{ln}}(\alpha _{n})+\sum _{n=1}^{N}(-\rho \alpha _{n})\\&\quad +(a-1){\text{ln}}(\beta )-b\beta +Const1 \end{aligned}$$
(13)

Finally, the expectation function can be simplified as (See Appendix A for detailed derivation)

$$\begin{aligned}&{\text{ln}}\left[ p(\beta ,{\varvec{\alpha }},{\varvec{Y}};{\varvec{\theta }})\right] =-\sum _{t=0}^{L-1}[\beta \Vert {\varvec{y}}_{t}-{\varvec{A}}{\varvec{\mu }}_{t}\Vert _{2}^{2}\\&\quad +{\varvec{\mu }}_{t}^{H}{{\varvec{\Lambda }}^{-1}}{\varvec{\mu }}_{t}]+L{\text{ln}}{|{\varvec{\Sigma }}|} +LM{\text{ln}}\left( \frac{\beta }{\pi }\right) \\&\quad -L\sum _{n=1}^{N}{\text{ln}}(\pi \alpha _{n})+\sum _{n=1}^{N}(-\rho \alpha _{n})\\&\quad +(a-1){\text{ln}}(\beta )-b\beta +Const2 \end{aligned}$$
(14)

where \(Const1={N{\text{ln}}(\rho )+{\text{ln}}(b^{a}/\Gamma (a))}-LM{\text{ln}}\pi -LN{\text{ln}}\pi\) and \(Const2=Const1+LN{\text{ln}}\pi\). Hence the Q function has the form \(Q(\beta ,{\varvec{\alpha }};{\varvec{\theta }})={\text{ln}}[p(\beta ,{\varvec{\alpha }},{\varvec{Y}};{\varvec{\theta }})]\) [21]. In the M-step, taking the derivatives of \(Q(\beta ,{\varvec{\alpha }};{\varvec{\theta }})\) w.r.t. the hyperparameters \({\alpha }_{n}\) and \(\beta\) and setting them to zero results in:

$$\begin{aligned}&\frac{\partial {Q(\beta ,{\varvec{\alpha }};{\varvec{\theta }})} }{\partial \alpha _{n}} =\sum _{t=0}^{L-1}\langle ({\varvec{\mu }}_{t}{\varvec{\mu }}_{t}^{H})\rangle _{nn}\frac{1}{\alpha _{n}^{2}}\\&\quad +L\langle {\varvec{\Sigma }}\rangle _{nn}\frac{1}{\alpha _{n}^{2}}-\frac{L}{\alpha _{n}}-\rho =0 \end{aligned}$$
(15)
$$\begin{aligned}&\frac{\partial {Q(\beta ,{\varvec{\alpha }};{\varvec{\theta }})}}{\partial \beta } =-\sum _{t=0}^{L-1}\Vert {\varvec{y}}_{t}-{\varvec{A}}{\varvec{\mu }}_{t}\Vert _{2}^{2}\\&\quad -L{\text{Tr}}({\varvec{A\Sigma }}{\varvec{A}}^{H})+\frac{LM+a-1}{\beta }-b=0 \end{aligned}$$
(16)

Herein, a corollary is adopted: \([{\text{ln}}|{\varvec{\Sigma }}|]'={\text{Tr}}({\varvec{\Sigma }}^{-1}[{\varvec{\Sigma }}]')\) [39]; defining \({\varvec{\Sigma }}^{-1}={\varvec{Z}}=\beta {\varvec{A}}^{H}{\varvec{A}}+{\varvec{\Lambda }}^{-1}\), we then have \(L[{\text{ln}}{|{\varvec{\Sigma }}|}]'=L[{\text{ln}}{|{\varvec{Z}}|^{-1}}]'=-L{\text{Tr}}({\varvec{Z}}^{-1}[{\varvec{Z}}]')=-L{\text{Tr}}({\varvec{\Sigma }}[{\varvec{Z}}]')\). From Eqs. (15) and (16) we obtain:

$$\begin{aligned}&{\alpha }_{n}^{{\text{new}}} =\frac{-L+\sqrt{L^{2}+4\rho \sum _{t=0}^{L-1} \langle {\varvec{\Xi }}_{t}\rangle _{nn}}}{2\rho } \end{aligned}$$
(17)
$$\begin{aligned}&\beta ^{{\text{new}}} =\frac{LM+(a-1)}{b+\sum _{t=0}^{L-1}\Vert {\varvec{y}}_{t}-{\varvec{A}}{\varvec{\mu }}_{t}\Vert _{2}^{2}+L{\text{Tr}}({\varvec{A}}{\varvec{\Sigma }}{\varvec{A}}^{H})} \end{aligned}$$
(18)

where \({\varvec{\Xi }}_{t}={\varvec{\mu }}_{t}{\varvec{\mu }}_{t}^{H}+{\varvec{\Sigma }}\).
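For reference, the following NumPy sketch (ours, not the authors' released code) implements one iteration of this inference: the E-step of Eqs. (10)-(11) and the M-step updates of Eqs. (17)-(18). The function name and default hyperprior values are our assumptions.

```python
import numpy as np

def em_step(Y, A, alpha, beta, rho=0.01, a=1e-4, b=1e-4):
    """One EM iteration: E-step Eqs. (10)-(11), M-step Eqs. (17)-(18)."""
    M, L = Y.shape
    # E-step: posterior covariance and posterior means of the weights
    Sigma = np.linalg.inv(beta * (A.conj().T @ A) + np.diag(1.0 / alpha))  # Eq. (11)
    Mu = beta * Sigma @ A.conj().T @ Y                                     # Eq. (10)
    # sum_t <Xi_t>_nn = sum_t |mu_tn|^2 + L*Sigma_nn, with Xi_t = mu mu^H + Sigma
    xi = np.sum(np.abs(Mu) ** 2, axis=1) + L * np.real(np.diag(Sigma))
    alpha_new = (-L + np.sqrt(L ** 2 + 4.0 * rho * xi)) / (2.0 * rho)      # Eq. (17)
    resid = np.sum(np.abs(Y - A @ Mu) ** 2)
    tr = L * np.real(np.trace(A @ Sigma @ A.conj().T))
    beta_new = (L * M + a - 1.0) / (b + resid + tr)                        # Eq. (18)
    return Mu, Sigma, alpha_new, beta_new
```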

3.3 Finding the roots of an irregular polynomial

We now focus on the estimation of \({\varvec{\theta }}\). The update of \({\varvec{\theta }}\) follows a similar process as for \({\varvec{\alpha }}\) and \(\beta\). Ignoring independent terms in Eq. (14), we just need to maximize

$$\begin{aligned}&{Q({\varvec{\theta }})}=-\beta \sum _{t=0}^{L-1}\Vert {\varvec{y}}_{t}-{\varvec{A}}{\varvec{\mu }}_{t}\Vert _{2}^{2}+L{\text{ln}}{|{\varvec{\Sigma }}|} \end{aligned}$$
(19)

Taking the derivative of Eq. (19) w.r.t. \(\theta _{k}\) and setting it to zero gives (see Appendix B for the detailed derivation)

$$\begin{aligned}&({\varvec{a}}_{k}^{'})^{H}\underbrace{\left( L\sum _{j\ne k}{\varvec{a}}_{j}{\gamma }_{jk}-\sum _{t=0}^{L-1}\mu _{tk}^{*}{\varvec{y}}_{t\setminus k}\right) }_{\triangleq {\varvec{\varphi }}_{k}}\\&\quad +({\varvec{a}}_{k}^{'})^{H}{\varvec{a}}_{k}\underbrace{\left( \sum _{t=0}^{L-1}|\mu _{tk}|^{2}+L{\gamma }_{kk}\right) }_{\triangleq \phi _{k}}=0 \end{aligned}$$
(20)

where \({\varvec{y}}_{t{\setminus } k}={\varvec{y}}_{t}-\sum _{j\ne k}\mu _{tj}{\varvec{a}}_{j}\); \({\varvec{a}}_{k}\), \(\mu _{tk}\), \({\varvec{\gamma }_{k}}\), and \(\gamma _{jk}\) denote the k-th column of \({\varvec{A}}\), the k-th element of \({\varvec{\mu }}_{t}\), the k-th column of \({\varvec{\Sigma }}\), and the (j,k)-th element of \({\varvec{\Sigma }}\), respectively. We define \(v_{\theta _{k}}\triangleq \exp ({j2\pi d{\sin }(\theta _{k})/\lambda })\), \(u_{\theta _{k}}\triangleq ({j2\pi d{\cos }(\theta _{k})/\lambda })\), and \(({\varvec{a}}_{k}^{'})=u_{\theta _{k}}{{\mathbb {S}}}\odot {\varvec{a}}_{k}\). Right-multiplying by the vector \({\varvec{a}}_{k}\) results in \(u_{\theta _{k}}({{\mathbb {S}}}^{T}\odot {\varvec{a}}_{k}^{H}){\varvec{a}}_{k}=u_{\theta _{k}}\Vert {{\mathbb {S}}}\Vert _{1}\), where \(\Vert \cdot \Vert _{1}\) denotes the 1-norm, which returns the sum of the elements of a vector; \(\Vert {{\mathbb {S}}}\Vert _{1}=M(M-1)/2\) for an M-element ULA. Therefore, our algorithm can be applied to both ULA and non-ULA scenarios. It is worth noting that \(\langle ({\varvec{a}}_{k}^{'})\rangle _{1}=u_{\theta _{k}}\langle {{\mathbb {S}}}\odot {\varvec{a}}_{k}\rangle _{1}=0\) since \(\langle {{\mathbb {S}}}\rangle _{1}=0\); hence, the coefficient of the highest-order term of the polynomial can be written as \(\Vert {\mathbb {S}}\Vert _{1}\phi _{k}+\langle ({\varvec{a}}_{k}^{'})\rangle _{1}\langle {\varvec{\varphi }}_{k}\rangle _{1}=\Vert {\mathbb {S}}\Vert _{1}\phi _{k}\), where \(\langle {\varvec{\varphi }}_{k}\rangle _{m}\) denotes the m-th element of the vector \({\varvec{\varphi }}_{k}\). After simple manipulations, we have

$$\begin{aligned} \begin{bmatrix}1&v_{\theta _{k}}^{\langle {{\mathbb {S}}}\rangle _{2}}&\cdots&v_{\theta _{k}}^{\langle {{\mathbb {S}}}\rangle _{M}}\end{bmatrix}\begin{bmatrix}\Vert {\mathbb {S}}\Vert _{1}\phi _{k}\\ \langle {{\mathbb {S}}}\rangle _{2}\langle {\varvec{\varphi }}_{k}\rangle _{2}\\ \vdots \\ \langle {{\mathbb {S}}}\rangle _{M}\langle {\varvec{\varphi }}_{k}\rangle _{M} \end{bmatrix}=0 \end{aligned}$$
(21)

The solution to (21) gives the estimate of \(\theta _{k}\). In fact, this equation can be reformulated as an \(\langle {{\mathbb {S}}}\rangle _{M}\)-order polynomial rooting problem, and the corresponding polynomial coefficient vector \({\varvec{w}}\in {\mathbb {C}}^{(\langle {{\mathbb {S}}}\rangle _{M}+1)\times 1}\) can be written as

$$\begin{aligned} \langle {\varvec{w}}\rangle _{i}\!=\!{\left\{ \begin{array}{ll} \Vert {\mathbb {S}}\Vert _{1}\phi _{k}, &{} i=1\\ \langle {{\mathbb {S}}}\rangle _{m}\langle {\varvec{\varphi }}_{k}\rangle _{m}, &{} i=\langle {{\mathbb {S}}}\rangle _{m}+1,m\in [2,\cdots ,M]\\ 0, &{} i\notin ({{\mathbb {S}}}+1) \end{array}\right. } \end{aligned}$$
(22)

Clearly, as the polynomial is of order \(\langle {{\mathbb {S}}}\rangle _{M}\), it has \(\langle {{\mathbb {S}}}\rangle _{M}\) roots, but only one is the solution. Due to the presence of noise, the roots may not fall exactly on the unit circle. Following the general scheme of polynomial rooting methods, we directly select the root \({\hat{v}}_{\theta _{k}}\) closest to the unit circle. The angle for the refined grid can be obtained as:

$$\begin{aligned} \theta _{k}^{{\text{new}}}={\text{arcsin}}\left( \frac{-\lambda }{2\pi d}\,{\text{angle}}\left( {\hat{v}}_{\theta _{k}}\right) \right) \end{aligned}$$
(23)

Since the initial coarse grid covers the true DOA range, the coarse-grid point \(\theta _{n}\) lies near the refined DOA \(\theta _{k}^{{\text{new}}}\). Therefore, a valid solution \(\theta _{k}^{{\text{new}}}\) should fall into the interval \([\theta _{n}-\kappa /2,\theta _{n}+\kappa /2]\), where \(\kappa\) indicates the grid interval; otherwise, it is rejected and the corresponding grid point remains unchanged.
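This grid-refinement step can be sketched as follows: the irregular polynomial coefficients of Eq. (22) are assembled from \({\varvec{\varphi }}_{k}\) and \(\phi _{k}\) defined under Eq. (20), the polynomial is rooted, and the root-selection and interval tests just described are applied. The function name and the use of np.roots are our illustrative choices; we assume \(d=\lambda /2\) and integer sensor positions \({\mathbb {S}}\) in units of d.

```python
import numpy as np

def refine_grid_point(k, grid, S, Y, Mu, Sigma, A, kappa=6.0):
    """Refine grid angle theta_k via irregular polynomial rooting, Eqs. (20)-(23).
    S holds integer sensor positions in units of d, e.g. [0, 1, 2, 5, 8, 11]."""
    L = Y.shape[1]
    N = A.shape[1]
    mu_k = Mu[k]                                     # mu_tk for t = 0..L-1
    mask = np.arange(N) != k
    # y_{t\k} = y_t - sum_{j != k} mu_tj a_j, for all snapshots at once
    Y_k = Y - A[:, mask] @ Mu[mask]
    # varphi_k and phi_k as defined under Eq. (20)
    varphi = L * (A @ Sigma[:, k]) - Y_k @ mu_k.conj()
    varphi -= L * A[:, k] * Sigma[k, k]              # remove the j = k term
    phi = np.sum(np.abs(mu_k) ** 2) + L * np.real(Sigma[k, k])
    # Irregular polynomial coefficients of Eq. (22); w[i] multiplies v^i
    w = np.zeros(S[-1] + 1, dtype=complex)
    w[0] = np.sum(S) * phi                           # ||S||_1 * phi_k
    w[S[1:]] = S[1:] * varphi[1:]
    roots = np.roots(w[::-1])                        # np.roots wants high order first
    v_hat = roots[np.argmin(np.abs(np.abs(roots) - 1.0))]        # closest to unit circle
    theta_new = np.degrees(np.arcsin(-np.angle(v_hat) / np.pi))  # Eq. (23), d = lambda/2
    # Accept the refined angle only if it stays within half a grid cell
    return theta_new if abs(theta_new - grid[k]) <= kappa / 2 else grid[k]
```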

3.4 Sparsification process with CFAR rule

The root sparse Bayesian learning method described above aims at estimating the sparse matrix \({\varvec{X}}\), but strict sparsity cannot be achieved since numerous weights are close to zero but not exactly equal to zero. Unfortunately, these small weights can deteriorate the estimation accuracy. Therefore, in this section, we formulate a rule to reject them. According to binary hypothesis testing theory, the signal is absent from the corresponding spatial grid point under the \({{\mathcal {H}}}_{0}\) hypothesis, while it is present under the \({{\mathcal {H}}}_{1}\) hypothesis. Mathematically, the detection problem corresponds to choosing between these two hypotheses:

$$\begin{aligned} {\left\{ \begin{array}{ll} \alpha _{n}=0 &{} {{\mathcal {H}}}_{0}\\ \alpha _{n}>0 &{} {{\mathcal {H}}}_{1} \end{array}\right. } \end{aligned}$$
(24)

for \(n=1,2,\cdots ,N\), where N is the size of the dictionary. Since we model \({\varvec{x}}_{t}\) with a zero-mean Gaussian distribution \(\mathcal{C}\mathcal{N}({\varvec{x}}_{t}|0,{\varvec{\Lambda }})\), when the variance \(\alpha _{n}\) goes to zero the corresponding entry \(\langle {\varvec{x}}_{t}\rangle _{n}\) approaches 0 with probability 1; in other words, choosing \({\mathcal {H}}_{0}\) as the true hypothesis is equivalent to pruning the n-th row from the weight matrix \(\varvec{X}\) [17, 31].

Fig. 3 Block diagram of CFAR processor

Collect all hyperparameters \(\alpha _{n}\) to form the set \(\varvec{\alpha }\), which is the input of the CFAR processor; its block diagram is shown in Fig. 3. We then split the set \(\varvec{\alpha }\) into training-cell vectors \(\tilde{\varvec{\alpha }}_{n}=[\alpha _{n-Q/2-1},\cdots ,\alpha _{n-2},\alpha _{n+2},\cdots ,\alpha _{n+Q/2+1}]^{T}\in {\mathbb {R}}^{Q\times 1}\), \(\tilde{\varvec{\alpha }}_{n}\subsetneq {\varvec{\alpha }}\). Note that \(\alpha _{n}\) is the cell under test (CUT), while \(\alpha _{n-1}\) and \(\alpha _{n+1}\) are the guard cells. According to the previous assumptions of the sparse Bayesian learning framework, the hyperparameters \(\varvec{\alpha }\) are independent and identically Gamma distributed, \({\text{Gamma}}(\alpha _{q};1,\rho )\). Consequently, the joint probability density function of the vector \(\tilde{\varvec{\alpha }}_{n}\) can be expressed as:

$$\begin{aligned} p(\tilde{\varvec{\alpha }}_{n})&=\prod _{q=1}^{Q}\rho {\text{exp}} (-\rho \alpha _{q})=\rho ^{Q}{\text{exp}}\left( -\rho \sum _{q=1}^{Q}\alpha _{q}\right) \end{aligned}$$
(25)

where \(\alpha _{q}\in \tilde{\varvec{\alpha }}_{n},q\in [1,2,\cdots ,Q]\). According to the method of moments, \(\rho\) can be estimated as:

$$\begin{aligned} \hat{\rho }^{-1}=\frac{1}{Q}\sum _{q=1}^{Q}\alpha _{q} \end{aligned}$$
(26)

In this paper, a well-known detector, the cell-averaging constant false alarm rate (CA-CFAR) detector [40], is applied. The detection threshold \(T(\tilde{\varvec{\alpha }}_{n})\) is calculated in terms of the multiplier \(\lambda\) and \(\hat{\rho }\):

$$\begin{aligned} T(\tilde{\varvec{\alpha }}_{n})=\lambda \hat{\rho }^{-1}=\frac{\lambda }{Q}\sum _{q=1}^{Q}\alpha _{q} \end{aligned}$$
(27)

A false alarm occurs when the CUT exceeds the threshold under \({{\mathcal {H}}}_{0}\), yielding:

$$\begin{aligned} p_{{\text{FA}}}=\int _{0}^{\infty }p({\text{CUT}}>T(\tilde{\varvec{\alpha }}_{n})) p(\tilde{\varvec{\alpha }}_{n})d{\tilde{\varvec{\alpha }}_{n}} \end{aligned}$$
(28)

Herein, by substituting Eq. (27), the first factor in the integrand of Eq. (28) is calculated as:

$$\begin{aligned}&p({\text{CUT}}>T(\tilde{\varvec{\alpha }}_{n})) =\int _{T(\tilde{\varvec{\alpha }}_{n})}^{\infty }\rho {\text{exp}}(-\rho x)dx\\&\quad ={\text{exp}}(-\rho T(\tilde{\varvec{\alpha }}_{n})) ={\text{exp}}\left( -\frac{\rho \lambda \sum _{q=1}^{Q}\alpha _{q}}{Q}\right) \end{aligned}$$
(29)

Finally, substituting Eqs. (25) and (29) into Eq. (28), we can further obtain:

$$\begin{aligned}&p_{{\text{FA}}}=\int _{0}^{\infty }{\text{exp}}\left( -\frac{\rho \lambda \sum _{q=1}^{Q}\alpha _{q}}{Q}\right) \rho ^{Q}{\text{exp}}\left( -\rho \sum _{q=1}^{Q}\alpha _{q}\right) d{\tilde{\varvec{\alpha }}_{n}}\\&\quad =\prod _{q=1}^{Q}\int _{0}^{\infty }\rho {\text{exp}}\left( -\frac{\rho (\lambda +Q)\alpha _{q}}{Q}\right) d\alpha _{q}\\&\quad =\left( 1+\frac{\lambda }{Q}\right) ^{-Q} \end{aligned}$$
(30)

Equation (30) clearly shows that the false alarm probability \(p_{{\text{FA}}}\) is independent of \(\rho\). Solving Eq. (30) for \(\lambda\) and substituting it into Eq. (27) yields a rather simple expression for the threshold:

$$\begin{aligned} T(\tilde{\varvec{\alpha }}_{n})=(p_{{\text{FA}}}^{-1/Q}-1)\sum _{q=1}^{Q}\alpha _{q} \end{aligned}$$
(31)

As a result, for each test cell \(\tilde{\varvec{\alpha }}_{n}\in {{\mathbb {R}}}^{Q\times 1}\), the detector makes decisions according to the following strategy:

$$\begin{aligned} \alpha _{n}\mathop {\gtrless }_{{\mathcal {H}}_{0}}^{{\mathcal {H}}_{1}}T(\tilde{\varvec{\alpha }}_{n}) \end{aligned}$$
(32)
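A minimal sketch of this sparsification step is given below: for each hyperparameter it forms the Q training cells (skipping one guard cell on each side of the CUT), computes the adaptive threshold of Eq. (31), and applies the test of Eq. (32). Wrap-around indexing at the edges of the grid is our simplifying assumption, as the paper does not specify edge handling. The short Monte Carlo check at the end verifies Eq. (30) empirically.

```python
import numpy as np

def cfar_prune(alpha, Q=8, p_fa=1e-3):
    """CA-CFAR test over the hyperparameter vector, Eqs. (31)-(32).
    Returns a boolean mask: True where H1 is declared (keep the column)."""
    N = len(alpha)
    multiplier = p_fa ** (-1.0 / Q) - 1.0              # lambda/Q from Eq. (31)
    keep = np.zeros(N, dtype=bool)
    half = Q // 2
    for n in range(N):
        # Q training cells around the CUT, skipping the two guard cells
        offsets = [o for o in range(-half - 1, half + 2) if abs(o) > 1]
        train = alpha[[(n + o) % N for o in offsets]]  # wrap-around at edges
        T = multiplier * np.sum(train)                 # adaptive threshold, Eq. (31)
        keep[n] = alpha[n] > T                         # decision rule, Eq. (32)
    return keep

# Monte Carlo sanity check of Eq. (30): empirical P_FA vs (1 + lambda/Q)^(-Q)
rng = np.random.default_rng(0)
Q, p_fa = 8, 1e-3
mult = p_fa ** (-1.0 / Q) - 1.0
train = rng.exponential(scale=1.0, size=(200000, Q))   # alpha_q ~ Gamma(1, rho), rho = 1
cut = rng.exponential(scale=1.0, size=200000)          # CUT under H0
print(np.mean(cut > mult * train.sum(axis=1)))         # approximately 1e-3
```

Columns declared \({{\mathcal {H}}}_{0}\) are then pruned by setting the corresponding \(\alpha _{n}\) to \(10^{-30}\) (see Footnote 2), which shrinks the dictionary used in subsequent iterations.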

The details of the proposed IRSBL method are summarized in Alg. 1 (see Footnotes 2-4).

Algorithm 1 The proposed IRSBL method

4 Simulation results

In this section, we present some numerical results obtained by Monte Carlo simulation. We investigate the performance of the proposed algorithm and compare it to that of the previously proposed SBL [18, 22], off-grid SBL (OGSBL) [25], Root SBL (RSBL) [26], complex generalized double Pareto SBL (CGDP1 and CGDP2) [27], and real-valued variational Bayesian inference (RVBI) [28] methods. The difference between the CGDP1 and CGDP2 methods is that they adopt different expressions for the noise variance update; CGDP2 is more efficient in some respects. In the experiments, the ULA and the NULA are composed of \(M=6\) sensors with position sets \({\mathbb {S}}_{{\text{ULA}}}=[0,1,2,3,4,5]\) and \({\mathbb {S}}_{{\text{NULA}}}=[0,1,2,5,8,11]\). \(K=2\) statistically independent, equal-power signals with incident angles \([-7.2^{\circ },13.3^{\circ }]\) impinge on the array. The initial grid interval is \(\kappa =6^{\circ }\), so the initial candidate angles of incidence are \([-90^{\circ },-84^{\circ },\cdots ,-12^{\circ },-6^{\circ },0^{\circ }, 6^{\circ },12^{\circ },\cdots ,84^{\circ },90^{\circ }]\), for a total of 31 grid points. In our case, the true DOAs are not on the initial grid. The false alarm probability \(P_{{\text{FA}}}\) was empirically chosen as \(10^{-3}\). The root mean square error (RMSE) is calculated by

$$\begin{aligned} {\text{RMSE}}=20\log _{10}\sqrt{\frac{1}{VK}\sum _{v=1}^{V}\sum _{k=1}^{K} {(\hat{\theta }_{k,v}-\theta _{k})^{2}}} \end{aligned}$$
(33)

where V is the number of Monte-Carlo trials, \(\hat{\theta }_{k,v}\) denotes the estimate of the DOA in the v-th trial, and \(\theta _{k}\) is the corresponding ground truth. In the following experiments, the RMSE was obtained through 1000 independent Monte-Carlo trials. In our simulations, we compare the RMSE versus the SNR, the number of snapshots, and the grid interval of the initial dictionary, and benchmark against the stochastic CRB [41].
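In code, Eq. (33) reduces to the following helper (angles in degrees; the names theta_hat for the \(V\times K\) estimate array and theta_true for the ground truth are ours):

```python
import numpy as np

def rmse_db(theta_hat, theta_true):
    """Eq. (33): theta_hat is V x K (degrees), theta_true has length K."""
    err2 = (np.asarray(theta_hat) - np.asarray(theta_true)) ** 2
    return 20.0 * np.log10(np.sqrt(err2.mean()))
```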

Fig. 4 RMSE versus SNR for ULA case

4.1 Experiment 1

Fig. 5 Evolution of the spatial angle grids

Fig. 6 Evolution of grid points closest to true value

In this experiment, the number of snapshots is fixed at \(L=320\), and the SNR is set to 10 dB and 25 dB, respectively. Since \({\varvec{\mu }}_{t}\) is the expectation of the weight vector \({\varvec{x}}_{t}\), the incident source power on the updated grid points can be calculated as \(\sum _{t=0}^{L-1}{\varvec{\mu }}_{t}\odot {\varvec{\mu }}_{t}^{*}/L\). After the stopping criterion is satisfied, the incident source power on the updated grid points is plotted in Fig. 4. Figure 4 shows that the proposed IRSBL method outperforms the OGSBL, RSBL, CGDP1, CGDP2, and RVBI methods because the distribution of the source power is more accurate. Compared with the RSBL method, the IRSBL method enforces sparsity and thus yields better estimation accuracy. Moreover, it can be observed that as the SNR increases, the RSBL estimates of the off-grid points become more accurate. For the other methods, such as the OGSBL, CGDP1, CGDP2, and RVBI methods, grid errors are inevitable. As mentioned above, compared with the OGSBL method, the RVBI method reduces the power of the wrong grid points by subspace projection, which makes its DOA estimation performance better than that of the OGSBL method, but also leads to pseudo-spectrum problems. In addition, both of the above methods encounter grid point errors due to model truncation. Conversely, the CGDP1 and CGDP2 methods are affected by two problems: i) the choice of the search step size involves a trade-off between computational complexity and estimation accuracy; ii) since alternate iterative optimization is not performed on the hyperparameters and the angular grid, there is a mismatch in their estimated values. Therefore, neither the hyperparameters nor the angular grid are optimal, which inevitably leads to estimation errors.

We then show the evolution of several parameters of interest as a function of the iteration number. The evolution of the spatial angle grids of the RSBL, RVBI, and IRSBL methods is presented in Fig. 5(a–f). The black circles and red lines indicate the spatial angle grid points and the true angles, respectively. As shown in Fig. 5, all methods approach the true angles after several iterations. The initial values of the 15th grid point \((\theta _{15}=-6^\circ )\) and the 18th grid point \((\theta _{18}=12^\circ )\) indicate the directions of the incident signals, since they are the grid points closest to the true values \(-7.2^{\circ }\) and \(13.3^{\circ }\), respectively, where \(|-6+7.2|<\kappa\), \(|12-13.3|<\kappa\), and \(\kappa =6^\circ\) in our simulation. In order to give a more intuitive impression of how the 15th and 18th grid points are updated with the iterations, we present their evolution in Fig. 6(a–d). Meanwhile, in order to show the estimation performance, the CRB regions are calculated as \([-7.2-CRB_{-7.2^{\circ }},-7.2+CRB_{-7.2^{\circ }}]\) and \([13.3-CRB_{13.3^{\circ }},13.3+CRB_{13.3^{\circ }}]\), respectively, where \(CRB_{-7.2^{\circ }}\) and \(CRB_{13.3^{\circ }}\) denote the CRBs on the DOAs \(-7.2^{\circ }\) and \(13.3^{\circ }\), respectively. The corresponding CRB regions are drawn in Fig. 6 as dashed lines. As depicted in Fig. 6(a–d), it can be clearly observed that the proposed IRSBL method yields more efficient convergence, and the final grids fall well within the CRB regions.

Fig. 7 Performance comparison among SBL, OGSBL, CGDP1, CGDP2, RSBL, IRSBL methods for different SNR situations

4.2 Experiment 2

4.2.1 RMSE versus SNR

Fig. 8 Evolution of the relative rate of hyperparameters \(\varvec{\alpha }\)

Fig. 9 Performance of SBL, OGSBL, CGDP1, CGDP2, RSBL, RVBI, and IRSBL methods versus the number of snapshots

In this scenario, the number of snapshots L is fixed at 320, and the SNR varies from −10 to 30 dB with a 5 dB step. The RMSE of the DOA estimates is illustrated in Figs. 7a and 7b, and the associated time consumption is depicted in Figs. 7c and 7d. Figures 7a and 7b show that the proposed IRSBL method has remarkable performance compared with the other existing methods. When the SNR is larger than 0 dB for the ULA or −5 dB for the NULA, the proposed IRSBL method achieves the CRB. In addition, the IRSBL method is superior to the RSBL method because it performs a sparsification process, which avoids grid errors caused by several small weight values, especially in low SNR situations. The IRSBL method outperforms the RVBI method because it avoids the grid point errors caused by model truncation. Furthermore, as expected for the OGSBL, RSBL, CGDP1, CGDP2, RVBI, and IRSBL methods, the RMSE of the DOA estimates decreases monotonically with the SNR. Nevertheless, the SBL method can only converge to the two nearest grid points, \(-6^{\circ }\) and \(12^{\circ }\), due to the limitation of the initial grid. As a result, the RMSE of the SBL method remains at \(20\log _{10}\sqrt{(1.2^{2}+1.3^{2})/2}\approx 1.945\,{\text{dB}}\) when the SNR is large enough.

We also provide the average consumption time over 1000 Monte-Carlo trials, which is shown in Figs. 7c and 7d. When the SNR is large enough, the proposed IRSBL method has lower computational complexity than the SBL, OGSBL, RSBL, RVBI, and CGDP methods since it satisfies the stopping criterion after a few iterations (see Fig. 8).

4.2.2 RMSE versus number of snapshots

In this scenario, the SNR was fixed at 10 dB, and the number of snapshots spans from 10 to 2560. The RMSE of the DOA estimates is illustrated in Figs. 9a and 9b in a log-log plot, and the related time consumption is depicted in Figs. 9c and 9d. Figures 9a and 9b indicate that the RMSE of the proposed IRSBL method is not only superior to those of the other methods, but also close to the CRB. Furthermore, as expected, the RMSE of the OGSBL, CGDP1, CGDP2, RSBL, RVBI, and IRSBL methods decreases monotonically as the number of snapshots increases. However, the SBL method can only converge to the two nearest grid points, \(-6^{\circ }\) and \(12^{\circ }\), due to the limitation of the initial grid. As a result, the RMSE of the SBL method remains at \(20\log _{10}\sqrt{(1.2^{2}+1.3^{2})/2}\approx 1.945\,{\text{dB}}\). As shown in Figs. 9c and 9d, compared with the SBL, OGSBL, CGDP1, CGDP2, and RSBL methods, the proposed IRSBL method yields lower computational complexity. In addition, as expected, the computational complexity of the above methods increases with the number of snapshots, because the dimensions of both the measurement matrix \(\varvec{Y}\) and the weight matrix \({\varvec{X}}\) increase. However, for the RVBI method, the computational complexity of the covariance matrix construction is almost negligible compared to the SBL-based hyperparameter update. Therefore, as the number of snapshots increases, its computational complexity remains stable to a certain extent.

5 Conclusion

In this paper, we proposed an improved root sparse Bayesian learning algorithm for off-grid DOA estimation with ULAs and NULAs, which relies on a polynomial rooting method. A CFAR algorithm is integrated into the SBL framework to achieve a sparse result even in the presence of grid mismatch and additive noise. The proposed IRSBL method has remarkable performance both in terms of estimation accuracy and computational complexity. Numerical results show that the proposed method is more efficient and converges faster, and thus has lower computational complexity, than the existing SBL, OGSBL, CGDP1, CGDP2, and RSBL methods. However, when the SNR is very low, the convergence of the proposed IRSBL method becomes slower and the computational load increases. Moreover, to a certain extent, the computational complexity of the RVBI method is insensitive to the number of snapshots. As a result, when the number of snapshots is large, the RVBI method has lower computational complexity than the proposed IRSBL method.

Notes

  1. For marginal and conditional distributions that respectively obey the Gaussian distributions \(p(\varvec{x})={\mathcal{C}\mathcal{N}({\varvec{x}}|{\varvec{\nu }},{\varvec{\Xi }})}\) and \(p({\varvec{y}}|{\varvec{x}})={\mathcal{C}\mathcal{N}({\varvec{y}}|{\varvec{\Phi x+b}},{\varvec{\Delta }})}\), the posterior distribution w.r.t. \(\varvec{x}\) can be written as \(p({\varvec{x}}|{\varvec{y}})=\mathcal{C}\mathcal{N}({\varvec{x}}|{{({{\varvec{\Xi }}^{-1}}+{{\varvec{\Phi }}^{H}}{{\varvec{\Delta }}^{-1}}{\varvec{\Phi }})}^{-1}}\{{{\varvec{\Phi }}^{H}}{{\varvec{\Delta }}^{-1}}(\varvec{y}-\varvec{b})+{{\varvec{\Xi }}^{-1}}\varvec{\nu }\},{{({{\varvec{\Xi }}^{-1}}+{{\varvec{\Phi }}^{H}}{{\varvec{\Delta }}^{-1}}{\varvec{\Phi }})}^{-1}})\). Then we can easily obtain Eq. (9) from Eqs. (4) and (5).

  2. To avoid the distribution \({\mathcal{C}\mathcal{N}}({\varvec{x}}_{t}|0,{\varvec{\Lambda }})\) losing its physical meaning, we set \(\alpha _{n}\) to \(10^{-30}\) instead of 0 under the \({\mathcal {H}}_{0}\) hypothesis [31]. Notably, after setting the variance to \(10^{-30}\), the corresponding weight is close to 0 with probability almost 1.

  3. K denotes the number of elements greater than \(10^{-30}\) in the set \({\varvec{\alpha }}\).

  4. \({\Vert {\mathbb {S}}}\Vert _{1}={M(M-1)}/{2}\) for an M elements ULA.

References

  1. H. Krim, M. Viberg, Two decades of array signal processing research: the parametric approach. IEEE Signal Process. Mag. 13(4), 67–94 (1996)


  2. A. Farina, F. Gini, M. Greco, DOA estimation by exploiting the amplitude modulation induced by antenna scanning. IEEE Trans. Aerosp. Electron. Syst. 38(4), 1276–1286 (2002)


  3. M. Greco, F. Gini, A. Farina, Joint use of sum and delta channels for multiple radar target DOA estimation. IEEE Trans. Aerosp. Electron. Syst. 43(3), 1146–1154 (2007)


  4. M. Pardini, F. Lombardini, F. Gini, The hybrid Cramér-Rao bound on broadside DOA estimation of extended sources in presence of array errors. IEEE Trans. Signal Process. 56(4), 1726–1730 (2008)


  5. M. Greco, F. Gini, A. Farina, M. Rangaswamy, DOA estimation and multi-user interference in a two-radar system. Signal Process. 89(4), 355–364 (2009)


  6. R. Roy, A. Paulraj, T. Kailath, ESPRIT - a subspace rotation approach to estimation of parameters of cisoids in noise. IEEE Trans. Acoust. Speech Signal Process. 34(5), 1340–1342 (1986)


  7. R. Roy, T. Kailath, ESPRIT - estimation of signal parameters via rotational invariance techniques. IEEE Trans. Acoust. Speech Signal Process. 37(7), 984–995 (1989)


  8. R.O. Schmidt, Multiple emitter location and signal parameter estimation. IEEE Trans. Antennas Propag. 34(3), 276–280 (1986)


  9. D. Malioutov, M. Cetin, A.S. Willsky, A sparse signal reconstruction perspective for source localization with sensor arrays. IEEE Trans. Signal Process. 53(8), 3010–3022 (2005)


  10. M. Carlin, P. Rocca, G. Oliveri, F. Viani, A. Massa, Directions-of-arrival estimation through Bayesian compressive sensing strategies. IEEE Trans. Antennas Propag. 61(7), 3828–3838 (2013)


  11. S. Fortunati, R. Grasso, F. Gini, M.S. Greco, K. LePage, Single-snapshot DOA estimation by using compressed sensing. EURASIP J. Adv. Signal Process. 2014(1), 1–17 (2014)


  12. R. Tibshirani, Regression shrinkage and selection via the lasso. J. Roy. Stat. Soc. Ser. B 58(1), 267–288 (1996)


  13. S.S. Chen, D.L. Donoho, M.A. Saunders, Atomic decomposition by basis pursuit. SIAM Rev. 43(1), 129–159 (2001)


  14. D.L. Donoho, I.M. Johnstone, Ideal spatial adaptation by wavelet shrinkage. Biometrika 81(3), 425–455 (1994)


  15. L. Stanković, M. Daković, On a gradient-based algorithm for sparse signal reconstruction in the signal/measurements domain. Math. Probl. Eng. 2016, 6212674 (2016)


  16. M.E. Tipping, Sparse Bayesian learning and the relevance vector machine. J. Mach. Learn. Res. 1(Jun), 211–244 (2001)


  17. D.P. Wipf, B.D. Rao, Sparse Bayesian learning for basis selection. IEEE Trans. Signal Process. 52(8), 2153–2164 (2004)


  18. S. Ji, Y. Xue, L. Carin, Bayesian compressive sensing. IEEE Trans. Signal Process. 56(6), 2346–2356 (2008)


  19. Z. Zhang, B.D. Rao, Sparse signal recovery with temporally correlated source vectors using sparse Bayesian learning. IEEE J. Sel. Top. Signal Process. 5(5), 912–926 (2011)


  20. S. Huanfeng, L. Xinghua, Z. Liangpei, T. Dacheng, Z. Chao, Compressed sensing-based inpainting of aqua moderate resolution imaging spectroradiometer band 6 using adaptive spectrum-weighted sparse Bayesian dictionary learning. IEEE Trans. Geosci. Remote Sens. 52(2), 894–906 (2014)


  21. R. Giri, B. Rao, Type i and type ii Bayesian methods for sparse signal recovery using scale mixtures. IEEE Trans. Signal Process. 64(13), 3418–3428 (2016)


  22. P. Gerstoft, C.F. Mecklenbräuker, A. Xenaki, S. Nannuru, Multisnapshot sparse Bayesian learning for DOA. IEEE Signal Process. Lett. 23(10), 1469–1473 (2016)


  23. Z. Zhang, B.D. Rao. Clarify some issues on the sparse Bayesian learning for sparse signal recovery. Technical report, University of California, San Diego (Sep. 2011). Available at http://dsp.ucsd.edu/~zhilin/papers/clarify.pdf

  24. H. Zhu, G. Leus, G.B. Giannakis, Sparsity-cognizant total least-squares for perturbed compressive sampling. IEEE Trans. Signal Process. 59(5), 2002–2016 (2011)


  25. Z. Yang, L. Xie, C. Zhang, Off-grid direction of arrival estimation using sparse Bayesian inference. IEEE Trans. Signal Process. 61(1), 38–43 (2012)


  26. J. Dai, X. Bao, W. Xu, C. Chang, Root sparse Bayesian learning for off-grid DOA estimation. IEEE Signal Process. Lett. 24(1), 46–50 (2016)


  27. Q. Wang, H. Yu, J. Li, F. Ji, F. Chen, Sparse Bayesian learning using generalized double Pareto prior for DOA estimation. IEEE Signal Process. Lett. 28, 1744–1748 (2021)


  28. J. Dai, H.C. So, Real-valued sparse Bayesian learning for DOA estimation with arbitrary linear arrays. IEEE Trans. Signal Process. 69, 4977–4990 (2021)


  29. S.D. Babacan, R. Molina, A.K. Katsaggelos, Bayesian compressive sensing using Laplace priors. IEEE Trans. Image Process. 19(1), 53–63 (2009)


  30. P. Zhao, W. Si, G. Hu, L. Wang, DOA estimation for a mixture of uncorrelated and coherent sources based on hierarchical sparse Bayesian inference with a Gauss-Exp-Chi2 prior. Int. J. Antennas Propag. 2018, 3505918 (2018)


  31. D.P. Wipf, B.D. Rao, An empirical Bayesian strategy for solving the simultaneous sparse approximation problem. IEEE Trans. Signal Process. 55(7), 3704–3716 (2007)


  32. K. Qiu, A. Dogandzic, Sparse signal reconstruction via ECME hard thresholding. IEEE Trans. Signal Process. 60(9), 4551–4569 (2012)


  33. D. Shutin, T. Buchgraber, S.R. Kulkarni, H.V. Poor, Fast variational sparse Bayesian learning with automatic relevance determination for superimposed signals. IEEE Trans. Signal Process. 59(12), 6257–6261 (2011)


  34. A. Al Hilli, L. Najafizadeh, A. Petropulu, Weighted sparse Bayesian learning (WSBL) for basis selection in linear underdetermined systems. IEEE Trans. Veh. Technol. 68(8), 7353–7367 (2019)


  35. Y. Ma, X. Cao, X. Wang, M.S. Greco, F. Gini, Multi-source off-grid DOA estimation with single snapshot using non-uniform linear arrays. Signal Process. 189, 108238 (2021)


  36. X. Wang, M.S. Greco, F. Gini, Adaptive sparse array beamformer design by regularized complementary antenna switching. IEEE Trans. Signal Process. 69, 2302–2315 (2021)


  37. T. Shan, M. Wax, T. Kailath, On spatial smoothing for direction-of-arrival estimation of coherent signals. IEEE Trans. Acoust. Speech Signal Process. 33(4), 806–811 (1985)


  38. A. Lapidoth. A foundation in digital communication, 2nd edn. Cambridge University Press, The Edinburgh Building, Cambridge CB2 8RU (2017)

  39. J.R. Schott, Matrix Analysis for Statistics, 3rd edn. (John Wiley, Hoboken, New Jersey, 2016)


  40. H.M. Finn, Adaptive detection mode with threshold control as a function of spatially sampled clutter level estimates. RCA Review 29, 414–464 (1968)


  41. P. Stoica, R.L. Moses et al., Spectral Analysis of Signals (Pearson Prentice Hall, Upper Saddle River, NJ, 2005)



Acknowledgements

The work of J. Shen was supported by the China Scholarship Council for 1 year’s study at the University of Pisa. The work of F. Gini and M.S. Greco has been partially supported by the Italian Ministry of Education and Research (MIUR) in the framework of the FoReLab project (Departments of Excellence). The work of T. Zhou was supported by the National Natural Science Foundation of China under Grant 42176192, U1709203, 41976176, the Open Research Project under Grant BY119C008, the Fundamental Research Funds for the Central Universities under Grant 3072020CFT0501.

Author information


Contributions

All authors read and approved the final manuscript.

Corresponding author

Correspondence to Tian Zhou.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: derivation of Eq. (14)

The first term in Eq. (13) can be simplified as:

$$\begin{aligned}&-\sum _{t=0}^{L-1}{{\mathbb {E}}}_{{\varvec{X}}|{\varvec{Y}},{\varvec{\alpha }},\beta ;{\varvec{\theta }}}[{{\varvec{x}}_{t}^{H}}{\varvec{\Lambda }}^{-1}{\varvec{x}}_{t}+\beta \Vert {\varvec{y}}_{t}-{\varvec{A}}{\varvec{x}}_{t}\Vert _{2}^{2}]\\&\quad =-\sum _{t=0}^{L-1}{{\mathbb {E}}}_{{\varvec{X}}|{\varvec{Y}},{\varvec{\alpha }},\beta ;{\varvec{\theta }}}[\beta \Vert {\varvec{y}}_{t}-{\varvec{A}}{\varvec{\mu }}_{t}\Vert _{2}^{2}\\&\quad +\beta {\varvec{x}}_{t}^{H}{\varvec{A}}^{H}{\varvec{A}}{\varvec{x}}_{t}-\beta {\varvec{\mu }}_{t}^{H}{\varvec{A}}^{H}{\varvec{A}}{\varvec{\mu }}_{t}+{{\varvec{x}}_{t}^{H}}{\varvec{\Lambda }}^{-1}{\varvec{x}}_{t}]\\&\quad =-\sum _{t=0}^{L-1}{{\mathbb {E}}}_{{\varvec{X}}|{\varvec{Y}},{\varvec{\alpha }},\beta ;{\varvec{\theta }}}[\beta \Vert {\varvec{y}}_{t}-{\varvec{A}}{\varvec{\mu }}_{t}\Vert _{2}^{2}\\&\quad +{\varvec{x}}_{t}^{H}{\varvec{\Sigma }}^{-1}{\varvec{x}}_{t}-\beta {\varvec{\mu }}_{t}^{H}{\varvec{A}}^{H}{\varvec{A}}{\varvec{\mu }}_{t}]\\&\quad =-\sum _{t=0}^{L-1}{{\mathbb {E}}}_{{\varvec{X}}|{\varvec{Y}},{\varvec{\alpha }},\beta ;{\varvec{\theta }}}[\beta \Vert {\varvec{y}}_{t}-{\varvec{A}}{\varvec{\mu }}_{t}\Vert _{2}^{2}\\&\quad +{\varvec{x}}_{t}^{H}{\varvec{\Sigma }}^{-1}{\varvec{x}}_{t}+{\varvec{\mu }}_{t}^{H}{{\varvec{\Lambda }}^{-1}}{\varvec{\mu }}_{t}-{\varvec{\mu }}_{t}^{H}{{\varvec{\Sigma }}^{-1}}{\varvec{\mu }}_{t}]\\&\quad =-\sum _{t=0}^{L-1}{{\mathbb {E}}}_{{\varvec{X}}|{\varvec{Y}},{\varvec{\alpha }},\beta ;{\varvec{\theta }}}[\beta \Vert {\varvec{y}}_{t}-{\varvec{A}}{\varvec{\mu }}_{t}\Vert _{2}^{2}+{\varvec{x}}_{t}^{H}{\varvec{\Sigma }}^{-1}{\varvec{x}}_{t}\\&\quad +{\varvec{\mu }}_{t}^{H}{{\varvec{\Lambda }}^{-1}}{\varvec{\mu }}_{t}-2{\varvec{x}}_{t}^{H}{{\varvec{\Sigma }}^{-1}}{\varvec{\mu }}_{t}+{\varvec{\mu }}_{t}^{H}{{\varvec{\Sigma }}^{-1}}{\varvec{\mu }}_{t}] \end{aligned}$$
(34)

Notably, we used the identity \({\mathbb {E}}_{{\varvec{X}}|{\varvec{Y}},{\varvec{\alpha }},\beta ;{\varvec{\theta }}}[-2{\varvec{x}}_{t}^{H}{{\varvec{\Sigma }}^{-1}}{\varvec{\mu }}_{t}+{\varvec{\mu }}_{t}^{H}{{\varvec{\Sigma }}^{-1}}{\varvec{\mu }}_{t}]=-{\varvec{\mu }}_{t}^{H}{{\varvec{\Sigma }}^{-1}}{\varvec{\mu }}_{t}\); then Eq. (34) can be rewritten as:

$$\begin{aligned}&-\sum _{t=0}^{L-1}{{\mathbb {E}}}_{{\varvec{X}}|{\varvec{Y}},{\varvec{\alpha }},\beta ;{\varvec{\theta }}}[\beta \Vert {\varvec{y}}_{t}-{\varvec{A}}{\varvec{\mu }}_{t}\Vert _{2}^{2}\\&\quad +({\varvec{x}}_{t}-{\varvec{\mu }}_{t})^{H}{\varvec{\Sigma }}^{-1}({\varvec{x}}_{t}-{\varvec{\mu }}_{t})+{\varvec{\mu }}_{t}^{H}{{\varvec{\Lambda }}^{-1}}{\varvec{\mu }}_{t}] \end{aligned}$$
(35)

and \(Const1={N{\text{ln}}(\rho )+{\text{ln}}(b^{a}/\Gamma (a))}-LM{\text{ln}}\pi -LN{\text{ln}}\pi\) can be regarded as a constant since these terms are independent of \({\varvec{\alpha }}\), \(\beta\), and \({\varvec{\theta }}\). Herein, the expectation \({{\mathbb {E}}}_{{\varvec{X}}|{\varvec{Y}},{\varvec{\alpha }},\beta ;{\varvec{\theta }}}[({\varvec{x}}_{t}-{\varvec{\mu }}_{t})^{H}{\varvec{\Sigma }}^{-1}({\varvec{x}}_{t}-{\varvec{\mu }}_{t})]\) can be evaluated using the normalization of a multivariate complex Gaussian distribution, giving

$$\begin{aligned} \int {\text{exp}}[-({\varvec{x}}_{t}-{\varvec{\mu }}_{t})^{H}{\varvec{\Sigma }}^{-1}({\varvec{x}}_{t}-{\varvec{\mu }}_{t})]d{\varvec{x}}_{t}=|{\varvec{\Sigma }}|\pi ^{N} \end{aligned}$$
(36)

Finally, the expectation function can be simplified as

$$\begin{aligned}&{\text{ln}}[p(\beta ,{\varvec{\alpha }},{\varvec{Y}};{\varvec{\theta }})] =-\sum _{t=0}^{L-1}[\beta \Vert {\varvec{y}}_{t}-{\varvec{A}}{\varvec{\mu }}_{t}\Vert _{2}^{2}\\&\quad +{\varvec{\mu }}_{t}^{H}{{\varvec{\Lambda }}^{-1}}{\varvec{\mu }}_{t}]+L{\text{ln}}{|{\varvec{\Sigma }}|}+LM{\text{ln}}\left( \frac{\beta }{\pi }\right) \\&\quad -L\sum _{n=1}^{N}{\text{ln}}(\pi \alpha _{n})+\sum _{n=1}^{N}(-\rho \alpha _{n})\\&\quad +(a-1){\text{ln}}(\beta )-b\beta +Const2 \end{aligned}$$
(37)

where \(Const2=Const1+LN{\text{ln}}\pi\).

Appendix B: detailed derivation of Eq. (20)

Taking the derivative of Eq. (19) w.r.t. \(\theta _{k}\) gives

$$\begin{aligned}&\frac{{\partial Q({\varvec{\theta }})}}{\partial \theta _{k}}=-\beta \sum _{t=0}^{L-1}\frac{\partial \Vert {\varvec{y}}_{t}-{\varvec{A}}{\varvec{\mu }}_{t}\Vert _{2}^{2}}{\partial \theta _{k}}-L\beta {\text{Tr}}\left( {\varvec{\Sigma }}\frac{\partial ({\varvec{A}}^{H}{\varvec{A}})}{\partial \theta _{k}}\right) \\&\quad =-2\beta ({\varvec{a}}_{k}^{'})^{H}\sum _{t=0}^{L-1}(|\mu _{tk}|^{2}{\varvec{a}}_{k}-\mu _{tk}^{*}{\varvec{y}}_{t}\\&\quad +\sum _{j\ne k}\mu _{tk}^{*}\mu _{tj}{\varvec{a}}_{j})-2L\beta ({\varvec{a}}_{k}^{'})^{H}{\varvec{A}}{\varvec{\gamma }}_{k}\\&\quad =-2\beta ({\varvec{a}}_{k}^{'})^{H}\sum _{t=0}^{L-1}\left( |\mu _{tk}|^{2}{\varvec{a}}_{k}-\mu _{tk}^{*}{\varvec{y}}_{t}+\sum _{j\ne k}\mu _{tk}^{*}\mu _{tj}{\varvec{a}}_{j}\right) \\&\quad -2L\beta ({\varvec{a}}_{k}^{'})^{H}\left[ {\varvec{a}}_{k}{\gamma }_{kk}+\sum _{j\ne k}{\varvec{a}}_{j}{\gamma }_{jk}\right] \end{aligned}$$
(38)

where \(L[{\text{ln}}{|{\varvec{\Sigma }}|}]{'}=-L{\text{Tr}}({\varvec{\Sigma }}[{\varvec{Z}}]{'})=-L\beta {\text{Tr}}({\varvec{\Sigma }}[{\varvec{A}}^{H}{\varvec{A}}]{'})\). Simplifying Eq. (38) and setting it to zero yields:

$$\begin{aligned}&({\varvec{a}}_{k}^{'})^{H}\underbrace{\left( L\sum _{j\ne k}{\varvec{a}}_{j}{\gamma }_{jk}-\sum _{t=0}^{L-1}\mu _{tk}^{*}{\varvec{y}}_{t\setminus k}\right) }_{\triangleq {\varvec{\varphi }}_{k}}\\&\quad +({\varvec{a}}_{k}^{'})^{H}{\varvec{a}}_{k}\underbrace{\left( \sum _{t=0}^{L-1}|\mu _{tk}|^{2}+L{\gamma }_{kk}\right) }_{\triangleq \phi _{k}}=0 \end{aligned}$$
(39)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article


Cite this article

Shen, J., Gini, F., Greco, M.S. et al. Off-grid DOA estimation using improved root sparse Bayesian learning for non-uniform linear arrays. EURASIP J. Adv. Signal Process. 2023, 34 (2023). https://doi.org/10.1186/s13634-023-00991-7



  • DOI: https://doi.org/10.1186/s13634-023-00991-7
