
Sparse covariance fitting for direction of arrival estimation

Abstract

This article proposes a new algorithm for finding the angles of arrival of multiple uncorrelated sources impinging on a uniform linear array of sensors. The method is based on sparse signal representation and requires neither knowledge of the number of sources nor a previous initialization. The proposed technique considers a covariance matrix model based on an overcomplete basis representation and fits the unknown signal powers to the sample covariance matrix. Sparsity is enforced by means of an l1-norm penalty. The final problem reduces to an objective function with a non-negative constraint that can be solved efficiently using the LARS/homotopy algorithm. The method described herein provides high resolution with a low computational burden. It proceeds in an iterative fashion, solving at each iteration a small linear system of equations until a stopping condition is fulfilled. The proposed stopping criterion is based on the residual spectrum and arises in a natural way when the LARS/homotopy is applied to the considered objective function.

1. Introduction

Brief summary of classical direction of arrival estimators

The estimation of the directions of arrival (DoA) of multiple sources using sensor arrays is a long-standing problem that plays a key role in array signal processing. During the last five decades, a plethora of methods has been proposed for finding the DoA of narrowband signals impinging on a passive array of sensors. These methods can be divided into two categories: parametric and nonparametric estimators.

Nonparametric methods include beamforming and subspace methods. The former rely on scanning the power received from different locations. Representatives of this category are the conventional beamformer [1] and Capon's method [2]. The conventional beamformer, also known as the Bartlett beamformer, suffers from poor spatial resolution and cannot resolve sources within the Rayleigh resolution limit [1]. As is well known, this lack of resolution can be mitigated only by increasing the number of sensors of the array, because improving the SNR or increasing the number of time observations does not increase the resolution. In contrast, Capon's minimum variance method can resolve sources within the Rayleigh cell if the SNR is high enough, the number of observations is sufficient, and the sources are not correlated. Unfortunately, in practice, Capon's power profile is strongly dependent on the beamwidth, which, in turn, depends on the explored direction, and in some scenarios this can lead to a resolution loss. To counteract this, an estimator of the spectral density obtained from Capon's power estimate was derived in [3], achieving better resolution properties. Herein this method will be referred to as Normalized Capon. Another well-known category of nonparametric DoA estimators comprises the subspace methods. These algorithms provide high resolution and outperform beamforming methods. The most prominent member of this family is MUltiple SIgnal Classification (MUSIC) [4], which relies on an appropriate separation between the signal and noise subspaces. This characterization is costly and needs a previous estimation of the number of incoming signals.
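For reference, the two scans mentioned above admit compact closed forms: the conventional (Bartlett) spectrum P_B(ϕ) = s^H(ϕ) R̂ s(ϕ)/M^2 and Capon's spectrum P_C(ϕ) = 1/(s^H(ϕ) R̂^{-1} s(ϕ)). The following minimal NumPy sketch evaluates both over a grid of steering vectors; it is our own illustration, not part of the original article, and the 1/M^2 normalization assumes ‖s(ϕ)‖^2 = M.

import numpy as np

def bartlett_spectrum(R_hat, S_G):
    # Conventional beamformer scan: P_B(phi) = s^H R s / M^2 per grid angle
    M = R_hat.shape[0]
    return np.real(np.einsum('mi,mn,ni->i', S_G.conj(), R_hat, S_G)) / M**2

def capon_spectrum(R_hat, S_G):
    # Capon's minimum variance scan: P_C(phi) = 1 / (s^H R^{-1} s)
    R_inv = np.linalg.inv(R_hat)
    return 1.0 / np.real(np.einsum('mi,mn,ni->i', S_G.conj(), R_inv, S_G))

Here R_hat is the (sample) spatial covariance matrix and S_G stacks the steering vectors of the scanned directions as columns, following the notation introduced in Section 2.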

Parametric methods based on the maximum likelihood criterion [5] exhibit good performance at the expense of a high computational cost. These techniques estimate the parameters of a given model instead of searching for the maxima of the spatial spectrum. Unfortunately, they often lead to difficult multidimensional optimization problems with a heavy computational burden.

An interesting algorithm that lies between the parametric and nonparametric classes is the CLEAN algorithm. This method was first introduced by Högbom [6] and has applications in several areas: array signal processing, image processing, radar, and astronomy. Recently, Stoica and Moses shed light on the semiparametric nature of the algorithm [7]. In broad outline, it operates recursively, subtracting at each iteration a fraction of the strongest signal from the observed spatial spectrum.

Readers interested in a more detailed and comprehensive survey of angle-of-arrival estimators are referred to [1, 8].

Sparse signal representation

Sparse representation of signals over redundant dictionaries is a topic that has attracted the interest of researchers in many fields during the last decade, such as image reconstruction [9], variable selection [10], and compressed sensing [11]. The most basic problem aims to find the sparsest vector x such that y = Ax, where y is the measured vector and A is known. The matrix A is called the dictionary and is overcomplete, i.e., it has more columns than rows. As a consequence, without imposing a sparsity prior on x, the set of equations y = Ax is underdetermined and admits many solutions. Formally, the objective is to minimize ‖x‖_0 subject to y = Ax, where ‖·‖_0 denotes the l0-norm [12]. This is, in general, an intractable NP-hard combinatorial problem [13]. Fortunately, if the vector is sufficiently sparse, the problem can be relaxed by replacing the l0-norm with the l1-norm, defined as ‖x‖_1 = ∑_i |x_i|, leading to a convex optimization problem with a lower computational burden. The conditions that ensure the uniqueness of the solution were studied in [14].

In case of an observation vector contaminated by noise, a natural variation is to relax the equality constraint to allow some error tolerance ε ≥ 0:

\min_{\mathbf{x}} \|\mathbf{x}\|_1 \quad \text{subject to} \quad \|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2 \le \varepsilon
(1)

or alternatively,

\min_{\mathbf{x}} \|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2 \quad \text{subject to} \quad \|\mathbf{x}\|_1 \le \beta
(2)

where the constraint ‖x‖_1 ≤ β with β ≥ 0 promotes sparsity. This formulation is known as the Least Absolute Shrinkage and Selection Operator (LASSO) and was originally proposed by Tibshirani [15]. The augmented formulation of (2) is well known in signal processing and is commonly called Basis Pursuit Denoising (BPDN) [16]:

\min_{\mathbf{x}} \|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2 + \tau \|\mathbf{x}\|_1 \quad \text{with} \quad \tau \ge 0
(3)

The three formulations (1)-(3) are equivalent in the sense that their sets of solutions coincide over all possible choices of the parameters τ, ε, and β. To go from one formulation to another, we only need a proper correspondence of the parameters. Nevertheless, even if the mapping between the regularization parameters exists, this correspondence is not trivial and is possibly nonlinear and discontinuous [17].

When the vector x is real, the LASSO problem (2), or its equivalent formulation (3), can be solved with standard quadratic programming techniques [15]. However, these techniques are time-consuming and faster methods are preferred. Osborne et al. [18] and later Efron et al. [19] proposed an efficient algorithm for solving the LASSO. This algorithm is known as the "homotopy method" [18] or LARS (Least Angle Regression) [19]. In this article this technique will be referred to as LARS/homotopy. A variant of the traditional LASSO problem, which will be especially useful in the covariance fitting addressed later on, is the so-called positive LASSO. In this case, an additional constraint on the entries of the vector x is added to the LASSO problem to force the components of the vector to be non-negative:

\min_{\mathbf{x}} \|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2 \quad \text{subject to} \quad \|\mathbf{x}\|_1 \le \beta \ \text{and} \ x_i \ge 0 \ \forall i
(4)

The positive LASSO problem (4) can be solved in an efficient way by introducing some slight modifications into the traditional LARS/homotopy. This approach was proposed by Efron et al. [19], but it is not as widely known as the traditional one. Briefly, the algorithm starts with a very large value of τ and gradually decreases the regularization parameter until the desired value is attained. As τ evolves, the optimal solution for a given τ, x(τ), moves on a piecewise affine path. Since the minimizer x(τ) is a piecewise-linear function of τ, we only need to find the critical regularization parameters τ_0, τ_1, τ_2, ..., τ_stop where the slope changes [17]; these values are the so-called breakpoints. The algorithm starts with x = 0 and operates in an iterative fashion, calculating the critical regularization parameters τ_0 > τ_1 > ··· > τ_stop ≥ 0 and the associated minimizers x(τ_0), x(τ_1), ..., x(τ_stop), at which an inactive component of x becomes positive or an active element becomes equal to zero. Normally, the number of active components increases as τ decreases. Nevertheless, this cannot be guaranteed: at some breakpoints, entries may need to be removed from the active set. A small numerical illustration of problem (4) is sketched below.
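For concreteness, the following sketch solves a small instance of the positive LASSO (4) with a generic convex solver (CVXPY); this is our own toy illustration with arbitrary sizes and values, not the algorithm proposed in this article.

import cvxpy as cp
import numpy as np

# Toy instance: a 30 x 100 random dictionary and a 2-sparse non-negative vector
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 100))
x_true = np.zeros(100)
x_true[[5, 40]] = [2.0, 1.5]
y = A @ x_true + 0.05 * rng.standard_normal(30)

# Positive LASSO (4): since x >= 0, the l1-norm reduces to the sum of entries
x = cp.Variable(100, nonneg=True)
beta = 4.0
problem = cp.Problem(cp.Minimize(cp.sum_squares(y - A @ x)), [cp.sum(x) <= beta])
problem.solve()
print(np.flatnonzero(x.value > 1e-6))  # recovered support, ideally [5, 40]

The LARS/homotopy algorithm reaches the same solutions far more cheaply by tracing the entire regularization path instead of solving one fixed-parameter problem.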

Sparse representation in source location

Although some pioneering studies were carried out in the late nineties, e.g., [20, 21], the application of sparse representation to direction finding has gained noticeable interest during the last decade. Recent techniques based on sparse representation show promising results that outperform conventional high-resolution methods such as MUSIC. In [20] a recursive weighted minimum-norm algorithm called FOCUSS was presented. This algorithm considers a single snapshot and requires a proper initialization. The extension to the multiple-snapshot case was carried out in [22] and is known as M-FOCUSS. Unfortunately, as described in [23], this technique is computationally expensive and requires the tuning of two hyperparameters that can affect the performance of the method significantly.

If multiple snapshots can be collected by an array of sensors, they can be used to improve the estimation of the angles of arrival. Several approaches for summarizing multiple observations have been proposed in the literature. The first of these approaches is the so-called l1-SVD presented by Malioutov et al. [24]. This method is based on the application of a singular value decomposition (SVD) to the received data matrix and leads to a second-order cone optimization problem. The algorithm requires an initial estimate of the number of sources. Although this estimate does not have to be exact, a small error is needed for good performance. Even if the effect of an incorrect determination of the number of sources has no catastrophic consequences, such as the disappearance of sources in MUSIC, an underestimation or an overestimation can considerably degrade the performance of the algorithm. Another important drawback is that l1-SVD depends on a user-defined parameter which is not trivial to select. An alternative approach to summarizing multiple snapshots is the use of mixed norms over multiple measurement vectors (MMV) that share the same sparsity pattern [22, 25]. This formulation is useful in array signal processing, especially when the number of snapshots is smaller than the number of sensors. If we assume that the snapshots are collected during the coherence time of the angles, the positions of the sources remain identical across the snapshots; the only difference between them resides in the amplitudes of the impinging rays. Basically, this approach, which is beyond the scope of this article, combines the multiple snapshots using the l2-norm and promotes sparsity along the spatial dimension by means of the l1-norm. Unfortunately, this joint optimization problem is complex and carries a high computational burden. When the number of snapshots increases, the computational load becomes too high for practical real-time source location.

Recently, new techniques based on a covariance matrix fitting approach have been considered to summarize multiple snapshots, e.g., [26-28]. Basically, these methods try to fit the covariance matrix to a certain model. The main advantage of covariance fitting approaches is that they lead to convex optimization problems with an affordable computational burden. Moreover, they do not require a previous estimation of the number of incoming sources or heavy computations such as an SVD of the data. It should also be pointed out that, as these methods work directly with the covariance matrix, less storage space is needed because they do not need to store large amounts of time data. The technique proposed by Yardibi et al. [26] leads to an optimization problem that can be solved efficiently using quadratic programming (QP). In the case of the approach of Picard and Weiss [27], the solution is obtained by means of linear programming (LP). The main drawback of this last method is that it depends on a user-defined parameter that is difficult to adjust. Similarly, Liu et al. [29] propose a method that is based on a heuristically determined hyperparameter. In contrast, Stoica et al. [28, 30] propose an iterative algorithm named SParse Iterative Covariance-based Estimation (SPICE), which can be used in noisy data scenarios without the need to choose any hyperparameter. The major drawback of this method is that it needs to be initialized.

Article contribution

This article proposes a simple, fast, and accurate algorithm for finding the angles of arrival of multiple sources impinging on a uniform linear array (ULA). In contrast to other methods in the literature, the proposed technique does not depend on user-defined parameters and requires neither knowledge of the number of sources nor initialization. It assumes white noise and uncorrelated point sources.

The method considers a structured covariance matrix model based on an overcomplete basis representation and fits the unknown signal powers of the model to the sample covariance. Sparsity is promoted by means of an l1-norm penalty imposed on the powers. The final problem reduces to an objective function with a non-negative constraint that can be solved efficiently using the LARS/homotopy algorithm, which is, in general, faster than QP [19] and LP [17]. The method described herein proceeds in an iterative manner, solving at each iteration a small linear system of equations until a stopping condition is fulfilled. The proposed stopping criterion is based on the residual spectrum and arises in a natural way when the LARS/homotopy is applied to the considered objective function. To the best of our knowledge, this stopping condition has never been considered before in sparse signal representation.

2. The proposed method: sparse covariance fitting for source location

Consider L narrowband signals {x_i[k]}_{i=1}^{L} impinging on an array of M sensors. The k-th observation can be expressed as:

\mathbf{y}[k] = \mathbf{S}(\boldsymbol{\theta})\,\mathbf{x}[k] + \mathbf{w}[k], \quad k = 1, \ldots, N
(5)

where x[k] = [x_1[k] ··· x_L[k]]^T is the vector of unknown source signals, the matrix S(θ) ∈ C^{M×L} is the collection of the steering vectors corresponding to the angles of arrival of the sources θ = [θ_1, ..., θ_L]^T, that is, S(θ) = [s(θ_1) ··· s(θ_L)], and w[k] ∈ C^{M×1} denotes a zero-mean additive noise, spatially and temporally white, independent of the sources, with covariance matrix σ_w^2 I_M, where I_M is the identity matrix of size M.

Taking into account (5) the spatial covariance matrix can be expressed as:

\mathbf{R} = E\{\mathbf{y}[k]\,\mathbf{y}^H[k]\} = \mathbf{S}(\boldsymbol{\theta})\,\mathbf{P}\,\mathbf{S}^H(\boldsymbol{\theta}) + \sigma_w^2 \mathbf{I}_M
(6)

where P = E{x[k] x^H[k]}. The classical direction finding problem can be reformulated as a sparse representation problem. To this end, let us consider an exploration grid of G equally spaced angles Φ = {ϕ_1, ..., ϕ_G} with G ≫ M and G ≫ L. If the set of angles of arrival of the impinging signals θ is a subset of Φ, the received signal model (5) can be rewritten in terms of an overcomplete matrix S_G constructed by the horizontal concatenation of the steering vectors corresponding to all the potential source locations:

\mathbf{y}[k] = \mathbf{S}_G\,\mathbf{x}_G[k] + \mathbf{w}[k]
(7)

where S_G ∈ C^{M×G} contains the steering vectors corresponding to the angles of the grid, S_G = [s_1 ··· s_G] with s_i = s(ϕ_i), and x_G[k] ∈ C^{G×1} is a sparse vector. The non-zero entries of x_G[k] are at the positions that correspond to the source locations. In other words, the n-th element of x_G[k] is different from zero, and equal to the q-th component of the vector x[k] defined in (5), denoted by x_q[k], if and only if ϕ_n = θ_q. It is important to point out that the matrix S_G is known and does not depend on the source locations.

The assumption that the set of angles of arrival is a subset of Φ is only required for the derivation of the algorithm. Obviously, it does not always hold. Actually, this is a common assumption in many exploration methods in the direction finding literature (e.g., Capon, Normalized Capon, MUSIC, etc.). In the case that θ ⊄ Φ, the contribution of the sources leaks into the neighboring elements of the grid. A sketch of the construction of the grid dictionary S_G for a ULA follows.
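The following minimal NumPy sketch builds S_G for a half-wavelength ULA over a uniform grid; it is our own code, and the phase-sign convention of the steering vector is an assumption, since the article does not state it explicitly.

import numpy as np

def steering_dictionary(M, grid_deg, spacing=0.5):
    # Columns are s(phi_i) for each grid angle; spacing is in wavelengths.
    # Assumed model: s(phi) = [1, e^{-j2*pi*d*sin(phi)}, ..., e^{-j2*pi*d*(M-1)*sin(phi)}]^T
    phi = np.deg2rad(np.asarray(grid_deg, dtype=float))
    m = np.arange(M)[:, None]                  # sensor indices 0..M-1
    return np.exp(-2j * np.pi * spacing * m * np.sin(phi)[None, :])  # M x G

S_G = steering_dictionary(M=10, grid_deg=np.arange(-90, 91))  # 1 degree grid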

Bearing in mind (7), and assuming white noise with covariance matrix σ_w^2 I_M, the spatial covariance matrix of (5) can be rewritten in terms of S_G as:

\mathbf{R} = E\{\mathbf{y}[k]\,\mathbf{y}^H[k]\} = \mathbf{S}_G\,\mathbf{D}\,\mathbf{S}_G^H + \sigma_w^2 \mathbf{I}_M
(8)

with D = E{x_G[k] x_G^H[k]}. An important remark is that D ∈ C^{G×G} is different from the source covariance matrix P ∈ C^{L×L} introduced in (6). Actually, since only L^2 entries out of G^2 can differ from zero, D is a sparse matrix.

A common assumption in many direction finding problems is that the sources are uncorrelated. Under this assumption, the matrix D is a diagonal matrix with only L non-zero entries, given by diag(D) = [p_1 ··· p_G]^T = p, where p ∈ R_+^{G×1}.

Note that p is a G × 1 sparse vector with non-zero entries at positions corresponding to source locations. Furthermore, the elements of p are real-valued and non-negative.

To cast the problem into a positive LASSO with real variables, let us perform some manipulations on (8). Vectorizing (8) yields:

\text{vec}\{\mathbf{R}\} = \left(\mathbf{S}_G^{*} \otimes \mathbf{S}_G\right) \text{vec}\{\mathbf{D}\} + \sigma_w^2\, \text{vec}\{\mathbf{I}_M\}
(9)

where ⊗ and vec{·} denote the Kronecker product and the vectorization operator, respectively. It should be remarked that S_G^* ⊗ S_G ∈ C^{M^2×G^2}.

Since D is a diagonal matrix because the sources are uncorrelated, only G columns of S_G^* ⊗ S_G have to be taken into account. Using this fact, the dimensionality of the problem can be reduced. In this way, it is straightforward to rewrite expression (9) in terms of the vector p by simply removing some columns of S_G^* ⊗ S_G:

\text{vec}\{\mathbf{R}\} = \tilde{\mathbf{A}}\,\mathbf{p} + \sigma_w^2\, \text{vec}\{\mathbf{I}_M\}
(10)

with Ã = [s_1^* ⊗ s_1  s_2^* ⊗ s_2  ···  s_G^* ⊗ s_G]. Note that Ã ∈ C^{M^2×G}.

Separating real and imaginary parts the above equation takes the form:

\begin{bmatrix} \mathbf{r}_r \\ \mathbf{r}_i \end{bmatrix} = \begin{bmatrix} \tilde{\mathbf{A}}_r \\ \tilde{\mathbf{A}}_i \end{bmatrix} \mathbf{p} + \sigma_w^2 \begin{bmatrix} \text{vec}\{\mathbf{I}_M\} \\ \mathbf{0}_{M^2 \times 1} \end{bmatrix}
(11)

where

\mathbf{r}_r = \text{Re}\{\text{vec}[\mathbf{R}]\}, \quad \tilde{\mathbf{A}}_r = \text{Re}\{\tilde{\mathbf{A}}\}, \quad \mathbf{r}_i = \text{Im}\{\text{vec}[\mathbf{R}]\}, \quad \tilde{\mathbf{A}}_i = \text{Im}\{\tilde{\mathbf{A}}\}

In expression (11), vec{I_M} denotes the vectorization of the identity matrix of dimensions M × M and 0_{M^2×1} is a vector of zeros of size M^2 × 1. More compactly, expression (11) can be rewritten as:

r = A p + n
(12)

with obvious definitions for r, A, p, and n. Note that r, n ∈ R^{2M^2×1} and A ∈ R^{2M^2×G}.
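As an illustration, the stacked real-valued model (12) can be assembled directly from the covariance matrix and the grid dictionary. The sketch below is our own NumPy code following definitions (9)-(12); the variable names are ours, not the paper's.

import numpy as np

def build_linear_model(R_hat, S_G):
    # Columns of A_tilde: s_i^* (Kronecker) s_i = vec(s_i s_i^H), one per angle
    M, G = S_G.shape
    A_tilde = np.column_stack([np.kron(S_G[:, i].conj(), S_G[:, i])
                               for i in range(G)])       # M^2 x G, complex
    r_c = R_hat.flatten(order='F')                       # vec(R_hat)
    A = np.vstack([A_tilde.real, A_tilde.imag])          # 2M^2 x G, real
    r = np.concatenate([r_c.real, r_c.imag])             # 2M^2 vector
    return r, A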

Unfortunately, the spatial covariance matrix is unknown in practice and is normally replaced by the sample covariance matrix obtained from a set of N observations, R̂ = (1/N) ∑_{k=1}^{N} y[k] y^H[k]. A possible method for finding p is the following constrained least squares problem:

\min_{\mathbf{p}} \|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}\|_2^2 \quad \text{subject to} \quad p_i \ge 0, \; i = 1, \ldots, G, \quad \|\mathbf{p}\|_1 = \sum_{i=1}^{G} p_i \le \beta \ \text{with} \ \beta \ge 0
(13)

where r̂ = [Re{vec(R̂)}^T  Im{vec(R̂)}^T]^T.

Note that (13) is a positive LASSO problem. The main idea behind (13) is to fit the unknown powers to the model such that the solution is sparse. The method minimizes the residual or, in other words, maintains the fidelity of the sparse representation to the received data, subject to a non-negativity constraint on the powers and to ∑_{i=1}^{G} p_i ≤ β. This last constraint promotes sparsity, as exposed in (2), but also imposes a bound on the received signal power. Unfortunately, the parameter β is unknown and has to be estimated. Even worse, the solution of (13) is very sensitive to β: a small error in the estimation of this parameter can lead to a wrong solution vector.

Instead of solving (13), let us consider the following equivalent formulation:

\min_{\mathbf{p}} \|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}\|_2^2 + \tau \|\mathbf{p}\|_1 \quad \text{subject to} \quad \tau \ge 0, \; p_i \ge 0, \; i = 1, \ldots, G
(14)

Problems (13) and (14) are equivalent in the sense that the path of solutions of (13) parametrized by a positive β matches the solution path of (14) as τ varies. To go from one formulation to the other, we only need a proper correspondence between the parameters.

Problem (14) can be solved in an efficient way with the LARS/homotopy algorithm for the positive LASSO. The method operates in an iterative fashion, computing the critical regularization parameters τ_0 > τ_1 > ··· > τ_stop ≥ 0 and the associated minimizers p(τ_0), p(τ_1), ..., p(τ_stop), at which an inactive component of p becomes positive or an active element becomes equal to zero. Let us remark that only one candidate enters or leaves the active set at each iteration (this is the "one at a time" condition described by Efron et al. [19]).

The algorithm is based on the computation of the so-called vector of residual correlations, or simply the residual correlation, b(τ) = A^T(r̂ − Ap(τ)), at each iteration. The method starts with p = 0, which is the solution of (14) for all τ ≥ τ_0 = 2 max_i (A^T r̂)_i, where (A^T r̂)_i is the i-th component of the vector A^T r̂, and proceeds in an iterative manner solving reduced-order linear systems. The whole procedure is summarized in Algorithm 1 (see [19, 31] for further details). This iterative procedure must be halted when a stopping condition is satisfied. This stopping criterion, which is the main contribution of this article, will be described in Section 3.

It should be pointed out that the least squares error of the covariance fitting method in (14) decreases at each iteration of the LARS/homotopy algorithm. This result is justified by the following two theorems.

Theorem 1: The sum of the powers increases monotonically at each iteration of the algorithm. Given two vectors with non-negative elements p(τ_{n+1}) and p(τ_n) that are minimizers of (14) for two breakpoints τ_{n+1} and τ_n, respectively, with τ_n > τ_{n+1}, it can be stated that ‖p(τ_{n+1})‖_1 ≥ ‖p(τ_n)‖_1.

Proof: See Appendix 1.

Theorem 2: The least squares error ‖r̂ − Ap(τ)‖_2^2 decreases at each iteration of the LARS/homotopy algorithm. Given two vectors with non-negative elements p(τ_n) and p(τ_{n+1}) that are minimizers of (14) for two consecutive breakpoints τ_n and τ_{n+1} of the LARS/homotopy, with τ_n > τ_{n+1}, it can be stated that ‖r̂ − Ap(τ_{n+1})‖_2^2 ≤ ‖r̂ − Ap(τ_n)‖_2^2.

Proof: See Appendix 2.

Algorithm 1 Proposed method

INITIALIZATION: p = 0, τ_0 = 2 max_i (A^T r̂)_i, n = 0

J = active set = ∅, I = inactive set = J^c

while the stopping criterion is not met and ∃ i ∈ I such that b_i > 0 do

  1) Compute the residual correlation b = A^T(r̂ − Ap).

  2) Determine the maximal components of b. These will be the non-zero elements of p(τ_n) (active components): J = arg max_j {b_j}, I = J^c.

  3) Calculate the update direction u such that all the active components lead to a uniform decrease of the residual correlation (equiangular direction): u_J = (A_J^T A_J)^{-1} 1_J.

  4) Compute the step size γ such that a new element of b becomes equal to the maximal ones (∃ i ∈ I such that b_i(τ_{n+1}) = b_{j∈J}(τ_{n+1})) or a non-zero component of p becomes zero (∃ j ∈ J such that p_j(τ_{n+1}) = 0).

  5) Update p ← p + γu, τ_{n+1} = τ_n − 2γ, n = n + 1.

end while
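For illustration, a compact NumPy re-implementation of Algorithm 1 follows. It is a didactic sketch under our own conventions, not the authors' code: ties and the removal of active components are handled only in the simplest way, and stop_fn is a placeholder for the stopping criterion of Section 3.

import numpy as np

def positive_lasso_homotopy(r, A, stop_fn=None, max_iter=200):
    # LARS/homotopy sketch for the positive LASSO (14); tau_n = 2 max_i b_i
    G = A.shape[1]
    p = np.zeros(G)
    for _ in range(max_iter):
        b = A.T @ (r - A @ p)                 # 1) residual correlation
        if stop_fn is not None and stop_fn(b):
            break                             # stopping criterion (Section 3)
        bmax = b.max()
        if bmax <= 0:
            break                             # no i in I with b_i > 0 remains
        J = np.flatnonzero(b >= bmax - 1e-10)             # 2) active set
        AJ = A[:, J]
        uJ = np.linalg.solve(AJ.T @ AJ, np.ones(len(J)))  # 3) direction u_J
        v = A.T @ (AJ @ uJ)                   # decrease rate of each b_i
        gamma = np.inf                        # 4) step to the next breakpoint:
        for i in np.setdiff1d(np.arange(G), J):
            if v[i] < 1 - 1e-12:              # active rates are exactly 1
                g = (bmax - b[i]) / (1 - v[i])
                if 1e-12 < g < gamma:
                    gamma = g                 # an inactive b_i catches up ...
        neg = uJ < -1e-12
        if neg.any():                         # ... or an active power hits zero
            gamma = min(gamma, np.min(-p[J][neg] / uJ[neg]))
        if not np.isfinite(gamma):
            break
        p[J] += gamma * uJ                    # 5) update; tau <- tau - 2*gamma
        p[p < 1e-12] = 0.0                    # guard against round-off
    return p

Each iteration only solves a |J| × |J| linear system, which is what keeps the computational burden of the method low.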

3. Stopping criterion: the cumulative spectrum

The definition of an appropriate stopping criterion is of paramount importance because it determines the final regularization parameter τ_stop and, consequently, the number of active positions in the solution vector. In general, larger values of τ produce sparser solutions. Nevertheless, this cannot be guaranteed: at some breakpoints, some entries may need to be removed from the active set.

Most traditional approaches in the literature for choosing the regularization parameter in discrete ill-posed problems are based, in one way or another, on the norm of the residual error, e.g., the discrepancy principle, cross-validation, and the L-curve. Nevertheless, recent publications [32, 33] suggest the use of a new parameter-choice method based on the residual spectrum. This technique evaluates the shape of the Fourier transform of the residual error. To the best of the authors' knowledge, this approach has never been used as a stopping criterion in sparse representation problems. The method exposed herein is inspired by the same idea, with some slight modifications. The main difference resides in the fact that no Fourier transform needs to be computed over the residual: as will be shown, the spatial spectrum of the residual arises in a natural way when the LARS/homotopy is applied to (14). The following result is the key point of the stopping criterion proposed in this article.

Theorem 3: When the LARS/homotopy is applied to problem (14), the residual correlation obtained at the k-th iteration of the algorithm, expressed as b(τ_k) = A^T(r̂ − Ap(τ_k)), is equivalent to the Bartlett estimator applied to the residual covariance matrix Ĉ_k = R̂ − ∑_{i=1}^{G} p_i(τ_k) s_i s_i^H. Then, the i-th component of the vector of residual correlations satisfies b_i(τ_k) = s_i^H Ĉ_k s_i.

Proof: See Appendix 3.

This theorem provides an alternative interpretation of the residual correlation at the k-th iteration, b(τ_k), which can be seen as a residual spatial spectrum. Bearing in mind this idea, and under the assumption that the noise is zero-mean and spatially white, the following parameter-choice method is proposed: stop as soon as the residual correlation resembles white noise.

Under the assumption that the noise is spatially white, its power is distributed uniformly over all the angles of arrival and the spatial spectrum has to be flat. To determine whether the residual correlation corresponds to a white noise spectrum, a statistical tool has to be considered. Several tests are available in the literature for the hypothesis of white noise. Herein, the metric considered to decide whether the residual looks like noise is:

c_k(l) = \frac{\sum_{i=1}^{l} b_i(\tau_k)}{\sum_{i=1}^{G} b_i(\tau_k)}, \quad l = 1, \ldots, G
(15)

where the subindex k, with k = 0, ..., k_stop, denotes the k-th iteration of the LARS/homotopy algorithm. The metric c_k is a slight modification of the conventional normalized cumulative periodogram proposed by Bartlett [34] and later by Durbin [35]. Traditionally, the cumulative periodogram has been defined for real-valued time series. In the real case, the spectrum is symmetric and only half of the spectrum needs to be computed. However, it can easily be extended to embrace complex-valued vectors, as shown in (15). Throughout this document, c_k will be referred to as the normalized cumulative spectrum (NCS).

For an ideal white noise, the plot of the NCS is a straight line and resembles the cumulative distribution function of a uniform distribution. Thus, any distributional test, such as the Kolmogorov-Smirnov (K-S) test, can be used to determine the "goodness of fit" between the cumulative spectrum and the theoretical straight line. In [34], Bartlett proposed the use of the K-S test, which is based on the largest deviation in absolute value between the cumulative spectrum and the theoretical straight line. The K-S test rejects the hypothesis of white noise whenever the maximum deviation between the cumulative spectrum and the straight line is too large. Conversely, the cumulative spectrum is considered white noise if it lies within the K-S limits. The upper and lower K-S limits, as a function of the index l, are given by

\frac{l}{G} \pm \frac{\delta}{\sqrt{MN}}
(16)

where δ = 1.36 for the 95% confidence band and δ = 1.63 for the 99% band.

Notice that the NCS does not require an accurate estimation of the noise power at the receiver. Since the cumulative spectrum (15) is normalized by the total residual power at each iteration k, the decision metric depends only on the shape of the spatial spectrum.

The proposed stopping condition is thus: stop as soon as the residual correlation resembles white noise, that is, when the NCS lies within the K-S limits.
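A minimal sketch of this test follows (our own code; it assumes the l/G ± δ/√(MN) band of (16) and plugs into the homotopy sketch of Section 2 as stop_fn).

import numpy as np

def ncs_is_white(b, M, N, delta=1.63):
    # True if the NCS (15) of the residual correlation b lies within the
    # K-S band (16); delta = 1.63 corresponds to the 99% confidence level
    G = len(b)
    c = np.cumsum(b) / np.sum(b)            # c_k(l), l = 1, ..., G
    line = np.arange(1, G + 1) / G          # ideal white-noise NCS
    return np.max(np.abs(c - line)) <= delta / np.sqrt(M * N)

# Used as the stopping rule of the homotopy sketch:
# p = positive_lasso_homotopy(r, A, stop_fn=lambda b: ncs_is_white(b, M, N))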

4. Numerical results

The aim of this section is to analyze the performance of the covariance fitting method proposed in this article. To this end, some simulations have been carried out in Matlab. Throughout the simulations, a uniform grid with 1° resolution has been considered for all the analyzed techniques. Furthermore, zero-mean white Gaussian noise with power σ_w^2 = 1 has been considered. The generated source signals are uncorrelated and distributed as circularly symmetric i.i.d. complex Gaussian variables with zero mean. Since the same power P is considered for all the sources, throughout this section the signal-to-noise ratio (SNR) is defined as SNR(dB) = 10 log_10(P/σ_w^2).

To illustrate the algorithm and the new stopping condition based on the cumulative spectrum, we have considered four uncorrelated sources located at -36°, -30°, 30°, and 50° that impinge on a ULA with M = 10 sensors separated by half a wavelength. The SNR is set to 0 dB and the sample covariance matrix is computed with N = 600 snapshots. Figures 1 and 2 show the evolution of the NCS and of the vector of residual correlations, respectively. As shown in Figure 1, the algorithm is stopped after 16 iterations, when the NCS lies within the K-S limits of the 99% confidence band. The final solution p is shown in Figure 3. Note that the residual spectrum of the final solution in Figure 2 is almost flat and the residual correlation resembles white noise. An end-to-end sketch of this scenario is given below.
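The scenario is easy to reproduce. The following NumPy sketch (our own code, reusing steering_dictionary, build_linear_model, positive_lasso_homotopy, and ncs_is_white from the earlier sketches) generates the snapshots, forms the sample covariance, and runs the proposed method end to end.

import numpy as np
rng = np.random.default_rng(0)

M, N = 10, 600
doas = np.array([-36.0, -30.0, 30.0, 50.0])        # true angles; SNR = 0 dB
S = steering_dictionary(M, doas)
L_src = len(doas)
x = (rng.standard_normal((L_src, N)) + 1j * rng.standard_normal((L_src, N))) / np.sqrt(2)
w = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
y = S @ x + w                                      # model (5), unit-power sources
R_hat = (y @ y.conj().T) / N                       # sample covariance matrix

grid = np.arange(-90, 91)                          # 1 degree exploration grid
S_G = steering_dictionary(M, grid)
r, A = build_linear_model(R_hat, S_G)
p = positive_lasso_homotopy(r, A, stop_fn=lambda b: ncs_is_white(b, M, N))
print(grid[p > 1e-3])                              # estimated DoAs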

Figure 1

Cumulative spectrum as a function of the angle index. The scenario is composed of four sources located at θ = [-36°, -30°, 30°, 50°], M = 10 sensors, N = 600 snapshots, SNR = 0 dB. The final solution is achieved after 16 iterations of the LARS/homotopy and is chosen as the first one that lies within the K-S limits of the 99% confidence band.

Figure 2

Evolution of the vector of residual correlations b across the iterations of the LARS/homotopy. The scenario is composed of four sources located at θ = [-36°, -30°, 30°, 50°], M = 10 sensors, N = 600 snapshots, SNR = 0 dB. The final solution is achieved after 16 iterations of the LARS/homotopy and is chosen as the first one that lies within the K-S limits of the 99% confidence band. Note that the residual correlation is almost flat.

Figure 3

The power estimate of the proposed covariance matrix fitting method. Final solution p obtained by the LARS/homotopy after 16 iterations. The settings are: four sources located at θ = [-36°, -30°, 30°, 50°], M = 10 sensors, N = 600 snapshots, SNR = 0 dB.

Next, the probability of resolution of the covariance fitting method as a function of the SNR is investigated. To this end, we have considered two uncorrelated sources located at -36° and -30° that impinge on a ULA with M = 9 sensors. Both sources transmit with the same power, and the sample covariance has been computed with N = 1000 snapshots. Figure 4 shows the results of the covariance fitting method compared to other classical estimators: MUSIC [4], Capon [2], and Normalized Capon [3]. In order to make a fair comparison between the different techniques, the number of sources for the MUSIC algorithm has been estimated with the Akaike information criterion (AIC) [7]. The curves in Figure 4 are averaged over 300 independent simulation runs. From this figure, it is clear that the proposed covariance fitting technique outperforms the other classical estimators: it is about 6 dB better than the MUSIC algorithm and about 12 dB better than the Normalized Capon method.

Figure 4

Probability of resolution against SNR. θ = [-36°, -30°], M = 9 sensors, N = 1000 snapshots. The curves were obtained by averaging the results of 300 independent simulation runs.

Next, the performance of the proposed method in terms of root mean square error (RMSE) is analyzed and presented in Figure 5. Two uncorrelated sources separated by Δθ = 6° impinging on an array of M = 9 sensors were considered in the simulations. In this case, the positions of the sources do not correspond to the angles of the grid: the angle of the first source θ_1 is generated as a random variable following a uniform distribution between -80° and 80°, and the angle of the second source is generated as θ_2 = θ_1 + Δθ. The sample covariance has been computed with 900 snapshots. Figure 5 shows the RMSE of the proposed method and MUSIC as a function of the SNR, over the range in which the two sources are resolved with probability equal to 1. In the case of MUSIC, the determination of the number of signal sources is performed by the AIC. The two curves are based on the average of 300 independent runs. From Figure 5 it can be concluded that at low SNR the proposed method outperforms MUSIC. When the SNR increases, both methods tend to exhibit the same performance.

Figure 5

Root mean square error as a function of the signal-to-noise ratio. Two uncorrelated sources separated by Δθ = 6°, M = 9 sensors, N = 900 snapshots. The curves were obtained by averaging the results of 300 independent simulation runs.

Finally, the resolution capability of the method as a function of the number of snapshots is investigated. The scenario considered for this purpose is the following: two sources located at θ_1 = -36° and θ_2 = -30° impinge on a ULA with M = 9 sensors. In this case, the transmitted signals have constant modulus, which is a common situation in communications applications: s_1(t) = e^{jφ_1(t)} and s_2(t) = e^{jφ_2(t)}. The signal phases {φ_k(t)}_{k=1}^{2} are independent and follow a uniform distribution in [0, 2π]. Figure 6 shows the probability of resolution of the proposed method and MUSIC as a function of the number of snapshots N. In this case, the signal-to-noise ratio is fixed to 1 dB. As in the previous cases, in order to make a fair comparison between the two techniques, the number of sources for the MUSIC algorithm has been determined using the AIC. The curves were obtained by averaging the results of 500 independent trials. Note that the covariance fitting method clearly outperforms MUSIC and is able to resolve the two sources with probability greater than 95% if N ≥ 30.

Figure 6

Probability of resolution as a function of the number of snapshots. Two uncorrelated sources located at θ = [-36°, -30°], M = 9 sensors. The curves were obtained by averaging the results of 500 independent trials.

5. Conclusions

A new method for finding the DoA of multiple sources impinging on a ULA has been presented in this article. The proposed technique is based on sparse signal representation and outperforms classical direction finding algorithms, even subspace methods, in terms of RMSE and probability of resolution. The proposed technique assumes white noise and uncorrelated point sources. Furthermore, it requires neither knowledge of the number of sources nor a previous initialization.

Appendix 1: proof of Theorem 1

The LARS/homotopy provides all the breakpoints τ_0 > τ_1 > ··· > τ_stop ≥ 0 and the associated solutions p(τ_0), p(τ_1), ..., p(τ_stop) at which a new component enters or leaves the support (the set of active elements) of p(τ). It can be proved that the sum of powers increases monotonically at each iteration of the algorithm. Consider two non-negative vectors p(τ_n) and p(τ_{n+1}) that are minimizers of (14) for the regularization parameters τ_n and τ_{n+1}, respectively, with τ_n > τ_{n+1} ≥ 0. The following inequality holds for the breakpoint τ_n:

\|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}(\tau_n)\|_2^2 + \tau_n \|\mathbf{p}(\tau_n)\|_1 \le \|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}(\tau_{n+1})\|_2^2 + \tau_n \|\mathbf{p}(\tau_{n+1})\|_1
(17)

Note that the regularization parameter τ_n is the same on both sides of the inequality. The right-hand side of (17) is equal to ‖r̂ − Ap(τ_{n+1})‖_2^2 + τ_{n+1}‖p(τ_{n+1})‖_1 + (τ_n − τ_{n+1})‖p(τ_{n+1})‖_1. Therefore, expression (17) can be rewritten as:

\|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}(\tau_n)\|_2^2 + \tau_n \|\mathbf{p}(\tau_n)\|_1 \le \|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}(\tau_{n+1})\|_2^2 + \tau_{n+1} \|\mathbf{p}(\tau_{n+1})\|_1 + (\tau_n - \tau_{n+1}) \|\mathbf{p}(\tau_{n+1})\|_1
(18)

By the minimization property, since p(τ_{n+1}) is the minimizer of (14) for the regularization parameter τ_{n+1}, the following inequality holds:

\|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}(\tau_{n+1})\|_2^2 + \tau_{n+1} \|\mathbf{p}(\tau_{n+1})\|_1 \le \|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}(\tau_n)\|_2^2 + \tau_{n+1} \|\mathbf{p}(\tau_n)\|_1
(19)

Note that the regularization parameter τ_{n+1} is the same on both sides of the inequality. Combining (19) and (18), it is straightforward to obtain

\|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}(\tau_n)\|_2^2 + \tau_n \|\mathbf{p}(\tau_n)\|_1 \le \|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}(\tau_n)\|_2^2 + \tau_{n+1} \|\mathbf{p}(\tau_n)\|_1 + (\tau_n - \tau_{n+1}) \|\mathbf{p}(\tau_{n+1})\|_1
(20)

If the term τ_n‖p(τ_n)‖_1 is added and subtracted on the right-hand side of inequality (20), the following expression is obtained

\|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}(\tau_n)\|_2^2 + \tau_n \|\mathbf{p}(\tau_n)\|_1 \le \|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}(\tau_n)\|_2^2 + \tau_n \|\mathbf{p}(\tau_n)\|_1 + (\tau_n - \tau_{n+1}) \left( \|\mathbf{p}(\tau_{n+1})\|_1 - \|\mathbf{p}(\tau_n)\|_1 \right)
(21)

From this expression we can conclude that (τ_n − τ_{n+1})(‖p(τ_{n+1})‖_1 − ‖p(τ_n)‖_1) ≥ 0. As τ_n > τ_{n+1} ≥ 0, then ‖p(τ_{n+1})‖_1 − ‖p(τ_n)‖_1 ≥ 0. Finally, we obtain ‖p(τ_{n+1})‖_1 ≥ ‖p(τ_n)‖_1.

Appendix 2: proof of Theorem 2

If p(τ_{n+1}) is a vector with non-negative components that minimizes problem (14) for τ_{n+1} > 0, then the following inequality is fulfilled:

\|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}(\tau_{n+1})\|_2^2 + \tau_{n+1} \|\mathbf{p}(\tau_{n+1})\|_1 \le \|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}(\tau_n)\|_2^2 + \tau_{n+1} \|\mathbf{p}(\tau_n)\|_1
(22)

which can be rewritten as:

\tau_{n+1} \left( \|\mathbf{p}(\tau_{n+1})\|_1 - \|\mathbf{p}(\tau_n)\|_1 \right) \le \|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}(\tau_n)\|_2^2 - \|\hat{\mathbf{r}} - \mathbf{A}\mathbf{p}(\tau_{n+1})\|_2^2
(23)

Since τ_{n+1} > 0 and ‖p(τ_{n+1})‖_1 − ‖p(τ_n)‖_1 ≥ 0, as proved in Theorem 1, the following inequality is fulfilled: ‖r̂ − Ap(τ_n)‖_2^2 − ‖r̂ − Ap(τ_{n+1})‖_2^2 ≥ 0. Finally, we obtain ‖r̂ − Ap(τ_n)‖_2^2 ≥ ‖r̂ − Ap(τ_{n+1})‖_2^2.

Appendix 3: an alternative interpretation of the residual

The residual correlation b that appears when the LARS/homotopy algorithm is applied to problem (14) has a clear physical interpretation.

Bearing in mind (11), the residual correlation b when the LARS/homotopy is applied to (14) takes the form

\mathbf{b}(\tau) = \mathbf{A}^T \left( \hat{\mathbf{r}} - \mathbf{A}\mathbf{p}(\tau) \right) = \begin{bmatrix} \tilde{\mathbf{A}}_r^T & \tilde{\mathbf{A}}_i^T \end{bmatrix} \left( \begin{bmatrix} \hat{\mathbf{r}}_r \\ \hat{\mathbf{r}}_i \end{bmatrix} - \begin{bmatrix} \tilde{\mathbf{A}}_r \\ \tilde{\mathbf{A}}_i \end{bmatrix} \mathbf{p}(\tau) \right)
(24)

which can be rewritten in terms of the complex matrix Ã defined in (10) and the sample covariance matrix R̂:

\mathbf{b}(\tau) = \text{Re}\left\{ \tilde{\mathbf{A}}^H \left( \text{vec}\{\hat{\mathbf{R}}\} - \tilde{\mathbf{A}}\mathbf{p}(\tau) \right) \right\}
(25)

The term Ãp(τ) can be expressed as

\tilde{\mathbf{A}}\mathbf{p}(\tau) = \begin{bmatrix} \mathbf{s}_1^{*} \otimes \mathbf{s}_1 & \mathbf{s}_2^{*} \otimes \mathbf{s}_2 & \cdots & \mathbf{s}_G^{*} \otimes \mathbf{s}_G \end{bmatrix} \begin{bmatrix} p_1(\tau) \\ p_2(\tau) \\ \vdots \\ p_G(\tau) \end{bmatrix} = \sum_{i=1}^{G} p_i(\tau)\, \mathbf{s}_i^{*} \otimes \mathbf{s}_i
(26)

Since s_i^* ⊗ s_i = vec(s_i s_i^H), then Ãp(τ) = vec{∑_{i=1}^{G} p_i(τ) s_i s_i^H}.

Substituting (26) into (25), the residual correlation at breakpoint τ yields:

\mathbf{b}(\tau) = \text{Re}\left\{ \tilde{\mathbf{A}}^H \left( \text{vec}\{\hat{\mathbf{R}}\} - \text{vec}\left\{ \sum_{i=1}^{G} p_i(\tau)\, \mathbf{s}_i \mathbf{s}_i^H \right\} \right) \right\} = \text{Re}\left\{ \tilde{\mathbf{A}}^H \text{vec}\left\{ \hat{\mathbf{R}} - \sum_{i=1}^{G} p_i(\tau)\, \mathbf{s}_i \mathbf{s}_i^H \right\} \right\}
(27)

Bearing in mind the matrix Ã presented in (10), the last expression can be rewritten as:

\mathbf{b}(\tau) = \text{Re}\left\{ \begin{bmatrix} \mathbf{s}_1^{T} \otimes \mathbf{s}_1^{H} \\ \mathbf{s}_2^{T} \otimes \mathbf{s}_2^{H} \\ \vdots \\ \mathbf{s}_G^{T} \otimes \mathbf{s}_G^{H} \end{bmatrix} \text{vec}\{\hat{\mathbf{C}}_\tau\} \right\} = \text{Re}\left\{ \begin{bmatrix} (\mathbf{s}_1^{T} \otimes \mathbf{s}_1^{H})\,\text{vec}\{\hat{\mathbf{C}}_\tau\} \\ \vdots \\ (\mathbf{s}_G^{T} \otimes \mathbf{s}_G^{H})\,\text{vec}\{\hat{\mathbf{C}}_\tau\} \end{bmatrix} \right\} = \text{Re}\left\{ \begin{bmatrix} \mathbf{s}_1^{H} \hat{\mathbf{C}}_\tau \mathbf{s}_1 \\ \vdots \\ \mathbf{s}_G^{H} \hat{\mathbf{C}}_\tau \mathbf{s}_G \end{bmatrix} \right\}
(28)

where Ĉ_τ = R̂ − ∑_{i=1}^{G} p_i(τ) s_i s_i^H.

The i-th component of b(τ) is real because it fulfills s_i^H Ĉ_τ s_i = s_i^H Ĉ_τ^H s_i, since Ĉ_τ is Hermitian. Therefore, the residual correlation yields:

\mathbf{b}(\tau) = \begin{bmatrix} \mathbf{s}_1^{H} \hat{\mathbf{C}}_\tau \mathbf{s}_1 & \mathbf{s}_2^{H} \hat{\mathbf{C}}_\tau \mathbf{s}_2 & \cdots & \mathbf{s}_G^{H} \hat{\mathbf{C}}_\tau \mathbf{s}_G \end{bmatrix}^T
(29)

This result provides an alternative interpretation of the residual correlation. At each breakpoint τ, the corresponding residual b(τ) can be seen as the Bartlett estimator applied to the residual covariance matrix Ĉ_τ = R̂ − ∑_{i=1}^{G} p_i(τ) s_i s_i^H.
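This identity is easy to verify numerically. The snippet below (our own code, reusing the variables of the sketches in Sections 2 and 4) compares the residual correlation of the real-valued model with the Bartlett scan of Ĉ_τ.

import numpy as np

b = A.T @ (r - A @ p)                      # residual correlation, model (12)
C = R_hat - (S_G * p) @ S_G.conj().T       # C_tau = R_hat - sum_i p_i s_i s_i^H
bartlett = np.real(np.einsum('mi,mn,ni->i', S_G.conj(), C, S_G))
assert np.allclose(b, bartlett)            # Theorem 3 holds exactly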

References

  1. Van Trees HL: Detection, Estimation, and Modulation Theory, Part IV: Optimum Array Processing. John Wiley & Sons, New York, USA; 2002.

  2. Capon J: High-resolution frequency-wavenumber spectrum analysis. Proc IEEE 1969, 57(8):1408-1418.

  3. Lagunas MA, Gasull A: An improved maximum likelihood method for power spectral density estimation. IEEE Trans Acoustics Speech Signal Process 1984, ASSP-32(1):170-173.

  4. Schmidt R: Multiple emitter location and signal parameter estimation. IEEE Trans Antennas Propag 1986, 34(3):276-280.

  5. Stoica P, Nehorai A: Performance study of conditional and unconditional direction-of-arrival estimation. IEEE Trans Acoustics Speech Signal Process 1990, 38(10):1783-1795.

  6. Högbom J: Aperture synthesis with a non-regular distribution of interferometer baselines. Astron Astrophys 1974, 15:417-426.

  7. Stoica P, Moses R: Spectral Analysis of Signals. Prentice Hall, NJ, USA; 2005.

  8. Tuncer TE, Friedlander B: Classical and Modern Direction-of-Arrival Estimation. Elsevier Academic Press, Burlington, USA; 2009.

  9. Charbonnier P, Blanc-Feraud L, Aubert G, Barlaud M: Deterministic edge-preserving regularization in computed imaging. IEEE Trans Image Process 1997, 6:298-311.

  10. Zou H, Hastie T: Regularization and variable selection via the elastic net. J Royal Stat Soc Ser B 2005, 67(2):301-320.

  11. Donoho DL: Compressed sensing. IEEE Trans Inf Theory 2006, 52(4):1289-1306.

  12. Boyd S, Vandenberghe L: Convex Optimization. Cambridge University Press, Cambridge; 2004.

  13. Donoho DL, Elad M: Optimally sparse representation in general (nonorthogonal) dictionaries via l1 minimization. Proc Natl Acad Sci 2003, 100(5):2197-2202.

  14. Donoho DL: For most large underdetermined systems of equations, the minimal l1-norm near-solution approximates the sparsest near-solution. Commun Pure Appl Math 2006, 59(7):907-934.

  15. Tibshirani R: Regression shrinkage and selection via the lasso. J Royal Stat Soc Ser B 1996, 58:267-288.

  16. Chen SS, Donoho DL, Saunders MA: Atomic decomposition by basis pursuit. SIAM J Sci Comput 1998, 20:33-61.

  17. Donoho DL, Tsaig Y: Fast solution of l1-norm minimization problems when the solution may be sparse. IEEE Trans Inf Theory 2008, 54:4789-4812.

  18. Osborne MR, Presnell B, Turlach BA: A new approach to variable selection in least squares problems. IMA J Numer Anal 2000, 20(3):389-403.

  19. Efron B, Hastie T, Johnstone I, Tibshirani R: Least angle regression. Ann Stat 2004, 32:407-499.

  20. Gorodnitsky IF, Rao BD: Sparse signal reconstruction from limited data using FOCUSS: a re-weighted minimum norm algorithm. IEEE Trans Signal Process 1997, 45(3):600-616.

  21. Fuchs J: Linear programming in spectral estimation: application to array processing. In International Conference on Acoustics, Speech and Signal Processing (ICASSP). Atlanta, GA; 1996:3161-3164.

  22. Cotter SF, Rao BD, Engan K, Kreutz-Delgado K: Sparse solutions to linear inverse problems with multiple measurement vectors. IEEE Trans Signal Process 2005, 53(7):2477-2488.

  23. Yardibi T, Li J, Stoica P, Xue M, Baggeroer AB: Source localization and sensing: a nonparametric iterative adaptive approach based on weighted least squares. IEEE Trans Aerospace Electron Syst 2010, 46(1):425-443.

  24. Malioutov DM, Çetin M, Willsky AS: A sparse signal reconstruction perspective for source localization with sensor arrays. IEEE Trans Signal Process 2005, 53:3010-3022.

  25. Eldar YC, Rauhut H: Average case analysis of multichannel sparse recovery using convex relaxation. IEEE Trans Inf Theory 2010, 56(1):505-519.

  26. Yardibi T, Li J, Stoica P, Cattafesta LN: Sparsity constrained deconvolution approaches for acoustic source mapping. J Acoust Soc Am 2008, 123(5):2631-2642.

  27. Picard JS, Weiss AJ: Direction finding of multiple emitters by spatial sparsity and linear programming. In International Symposium on Communications and Information Technologies (ISCIT). Incheon, Korea; 2009:1258-1262.

  28. Stoica P, Babu P, Li J: SPICE: a sparse covariance-based estimation method for array processing. IEEE Trans Signal Process 2011, 59(2):629-638.

  29. Liu Z, Huang Z, Zhou Y: Direction-of-arrival estimation of wideband signals via covariance matrix sparse representation. IEEE Trans Signal Process 2011, 59(9):4256-4270.

  30. Stoica P, Babu P, Li J: A sparse covariance-based method for direction of arrival estimation. In International Conference on Acoustics, Speech and Signal Processing (ICASSP). Prague, Czech Republic; 2011:2844-2847.

  31. Mørup M, Madsen KH, Hansen LK: Approximate L0 constrained non-negative matrix and tensor factorization. In International Symposium on Circuits and Systems (ISCAS). Seattle, WA; 2008:1328-1331.

  32. Rust BW, O'Leary DP: Residual periodograms for choosing regularization parameters for ill-posed problems. Inverse Probl 2008, 24(3):1-30.

  33. Hansen P, Kilmer M, Kjeldsen R: Exploiting residual information in the parameter choice for discrete ill-posed problems. BIT Numer Math 2006, 46(1):41-59.

  34. Bartlett M: An Introduction to Stochastic Processes. Cambridge University Press, Cambridge; 1966.

  35. Durbin J: Tests for serial correlation in regression analysis based on the periodogram of least-squares residuals. Biometrika 1969, 56(1):1-15.


Acknowledgements

This study was partially supported by the Spanish Ministry of Economy and Competitiveness under projects TEC2011-29006-C03-01 (GRE3N-PHY) and TEC2011-29006-C03-02 (GRE3N-LINK-MAC) and by the Catalan Government under grant 2009 SGR 891.

Author information

Correspondence to Luis Blanco.

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Blanco, L., Nájar, M. Sparse covariance fitting for direction of arrival estimation. EURASIP J. Adv. Signal Process. 2012, 111 (2012). https://doi.org/10.1186/1687-6180-2012-111
