 Research
 Open access
 Published:
Realified L1PCA for directionofarrival estimation: theory and algorithms
EURASIP Journal on Advances in Signal Processing volume 2019, Article number: 30 (2019)
Abstract
Subspacebased directionofarrival (DoA) estimation commonly relies on the PrincipalComponent Analysis (PCA) of the sensorarray recorded snapshots. Therefore, it naturally inherits the sensitivity of PCA against outliers that may exist among the collected snapshots (e.g., due to unexpected directional jamming). In this work, we present DoAestimation based on outlierresistant L1norm principal component analysis (L1PCA) of the realified snapshots and a complete algorithmic/theoretical framework for L1PCA of complex data through realification. Our numerical studies illustrate that the proposed DoA estimation method exhibits (i) similar performance to the conventional L2PCAbased method, when the processed snapshots are nominal/clean, and (ii) significantly superior performance when the snapshots are faulty/corrupted.
1 Introduction
Directionofarrival (DoA) estimation is a fundamental problem in signal processing theory with important applications in localization, navigation, and wireless communications [1–6]. Existing DoAestimation methods can be broadly categorized as (i) likelihood maximization methods [7–13], (ii) spectral estimation methods, as in the early works of [14, 15], and (iii) subspacebased methods [16–19]. Subspacebased methods have enjoyed great popularity in applications, mostly due to their favorable tradeoff between angle estimation quality and computational simplicity in implementation.
In their most common form, subspacebased DoA estimation methods rely on the L2norm principal components (L2PCs) of the recorded snapshots, which can be simply obtained by means of singularvalue decomposition (SVD) of the sensorarray data matrix, or by eigenvalue decomposition (EVD) of the receivedsignal autocorrelation matrix [20]. Importantly, under nominal system operation (i.e., no faulty measurements or unexpected jamming/interfering sources), in additive white Gaussian noise (AWGN) environment, such methods are known to offer unbiased, asymptotically consistent DoA estimates [21–23] and exhibit high targetangle resolution (“superresolution” methods).
However, in many realworld applications, the collected snapshot record may be unexpectedly corrupted by faulty measurements, impulsive additive noise [24–26], and/or intermittent directional interference. Such interference may appear either as an endogenous characteristic of the underlying communication system, as for example in frequencyhopped spreadspectrum systems [27], or as an exogenous factor (e.g., jamming). In cases of such snapshot corruption, L2PCbased methods are well known to suffer from significant performance degradation [28–30]. The reason is that, as squared errorfitting minimizers, L2PCs respond strongly to corrupted snapshots that appear in the processed data matrix as points that lie far from the nominal signal subspace [29]. Accordingly, DoA estimators that rely upon the L2PCs are inevitably misled.
At the same time, research in signal processing and data analysis has shown that absolute errorfitting minimizers place much less emphasis on individual data points that diverge from the nominal signal subspace than squarefittingerror minimizers. Based on this observation, in the past few years, there have been extended documented research efforts toward defining and calculating L1norm principal components (L1PCs) of data under various forms of L1norm optimality, including absoluteerror minimization and projection maximization [31–46]. Recently, Markopoulos et al. [47, 48] calculated optimally the maximumprojection L1PCs of realvalued data, for which up to that point only suboptimal approximations were known [36–38]. Experimental studies in [47–53] demonstrated the sturdy resistance of optimal L1norm principalcomponent analysis (L1PCA) against outliers, in various signal processing applications. Recently, [43, 45] introduced a heuristic algorithm for L1PCA that was shown to attain stateoftheart performance/cost tradeoff. Another popular approach for outlierresistant PCA is “Robust PCA” (RPCA), as introduced in [29] and further developed in [54, 55].
In this work, we consider system operation in the presence of unexpected, intermittent directional interference and propose a new method for DoAestimation that relies on the L1PCA of the recorded complex snapshots. Importantly, this work introduces a complete paradigm on how L1PCA, defined and solved over the real field [47, 48], can be used for processing complex data, through a simple “realification" step. An alternative approach for L1PCA of complexvalued data was presented in [46], where the authors reformulated complex L1PCA into unimodular nuclearnorm maximization (UNM) and estimated its solution through a sequence of converging iterations. It is noteworthy that for the UNM introduced in [46], no general exact solver exists to date.
Our numerical studies show that the proposed L1PCAbased DoAestimation method attains performance similar to the conventional L2PCAbased one (i.e., MUSIC [16]) in the absence of jamming sources, while it offers significantly superior performance in the case of unexpected, sporadic contamination of the snapshot record.
Preliminary results were presented in [56]. The present paper is significantly expanded to include (i) an Appendix section with all necessary technical proofs, (ii) important new theoretical findings (Proposition 3 on page 7), (iii) new algorithmic solutions (Section 3.5), and (iv) extensive numerical studies (Section 4).
The rest of the paper is organized as follows. In Section 2, we present the system model and offer a preliminary discussion on subspacebased DoA estimation. In Section 3, we describe in detail the proposed L1PCAbased DoAestimation method and present three algorithms for L1PCA of the snapshot record. Section 4 presents our numerical studies on the performance of the proposed DoA estimation method. Finally, Section 5 holds some concluding remarks.
1.1 Notation
We denote by \(\mathbb {R}\) and \(\mathbb {C}\) the set of real and complex numbers, respectively, and by j the imaginary unit (i.e., j^{2}=−1). ℜ{(·)},I{(·)},(·)^{∗},(·)^{⊤}, and (·)^{H} denote the real part, imaginary part, complex conjugate, transpose, and conjugate transpose (Hermitian) of the argument, respectively. Bold lowercase letters represent vectors and bold uppercase letters represent matrices. diag(·) is the diagonal matrix formed by the entries of the vector argument. For any \(\mathbf {A} \in \mathbb {C}^{m \times n}, [\mathbf {A}]_{i,q}\) denotes its (i,q)th entry, [A]_{:,q} its qth column, and [A]_{i,:} its ith row; \(\left \ \mathbf {A} \right \_{p} \stackrel {\triangle }{=} \left (\sum \nolimits _{i=1}^{m} \sum \nolimits _{q=1}^{n}  [\mathbf {A}]_{i,q} ^{p}\right)^{\frac {1}{p}}\) is the pth entrywise norm of A,∥A∥_{∗} is the nuclear norm of A (sum of singular values), span(A) represents the vector subspace spanned by the columns of A,rank(A) is the dimension of span(A), and null(A^{⊤}) is the kernel of span(A) (i.e., the nullspace of A^{⊤}). For any square matrix \(\mathbf {A} \in \mathbb {C}^{m \times m}, \text {det} (\mathbf {A})\) denotes its determinant, equal to the product of its eigenvalues. ⊗ and ⊙ are the Kronecker and entrywise (Hadamard) product operators [57], respectively. 0_{m×n},1_{m×n}, and I_{m} are the m×n allzero, m×n allone, and sizem identity matrices, respectively. Also, \(\mathbf {E}_{m} \stackrel {\triangle }{=} \left [ \begin {array}{cc} 0 & 1 \\ 1 & 0 \end {array} \right ] \otimes \mathbf {I}_{m}\), for \(m \in \mathbb {N}_{\geq 1}\), and e_{i,m} is the ith column of I_{m}. Finally, E{·} is the statisticalexpectation operator.
2 System model and preliminaries
We consider a uniform linear antenna array (ULA) of D elements. The lengthD response vector to a farfield signal that impinges on the array with angle of arrival \(\theta \in (\frac {\pi }{2}, \frac {\pi }{2}]\) with respect to (w.r.t.) the broadside is defined as
where f_{c} is the carrier frequency, c is the signal propagation speed, and d is the fixed interelement spacing of the array. We consider that the uniform interelement spacing d is no greater than half the carrier wavelength, adhering to the Nyquist spatial sampling theorem; i.e., \(d \leq \frac {c}{2 f_{c}}\). Accordingly, for any two distinct angles of arrival \(\theta, \theta ' \in (\frac {\pi }{2}, \frac {\pi }{2}]\), the corresponding array response vectors s(θ) and s(θ^{′}) are linearly independent.
The ULA collects N narrowband snapshots from K sources of interest (targets) arriving from distinct DoAs \(\theta _{1}, \theta _{2}, \ldots, \theta _{K} \in \left (\frac {\pi }{2}, \frac {\pi }{2} \right ], K < D \leq N\). We assume that the system may also experience intermittent directional interference from L independent sources (jammers), at angles \(\theta _{1}^{\prime }, \theta _{2}^{\prime }, \ldots, \theta _{L}^{\prime } \in \left (\frac {\pi }{2}, \frac {\pi }{2} \right ]\). A schematic illustration of the targets and jammers is given in Fig. 1. We assume that \(\theta _{i} \neq \theta _{q}^{\prime }\), for any i∈{1,2,…,K} and q∈{1,2,…,L}. For any l∈{1,2,…,L}, the lth jammer may be active during any of the N snapshots with some fixed and unknown to the receiver probability p_{l}. Accordingly, the nth downconverted received data vector is of the form
where, x_{n,k} and \(x_{n,l}^{\prime } \in \mathbb {C}\) denote the statistically independent signal values of target k and jammer l, respectively, comprising powerscaled information symbols and flatfading channel coefficients, and γ_{n,l} is the activity indicator for jammer l, modeled as a {0,1}Bernoulli random variable with activation probability p_{l}. \(\mathbf {n}_{n} \in \mathbb {C}^{D \times 1}\) accounts for additive white Gaussian noise (AWGN) with mean equal to zero and perelement variance σ^{2}; i.e., \(\mathbf {n}_{n} \sim \mathcal {CN} \left (\mathbf {0}_{D}, \sigma ^{2} \mathbf {I}_{D}\right)\). Henceforth, we refer to the case of targetonly presence in the collected snapshots (i.e., γ_{n,l}=0 for every n=1,2,…,N and every l=1,2,…,L) as normal system operation.
Defining \(\mathbf {x}_{n}\! \stackrel {\triangle }{=} [x_{n,1}, x_{n,2}, \ldots, x_{n,K}]^{\top }, \mathbf {x}_{n}^{\prime } \stackrel {\triangle }{=} [x_{n,1}^{\prime }, x_{n,2}^{\prime }, \ldots, x_{n,L}^{\prime }]^{\top }, {\mathbf {\Gamma }}_{n} \stackrel {\triangle }{=} \mathbf {diag}\left ([\gamma _{n,1}, \gamma _{n,2}, \ldots, \gamma _{n,L}]^{\top }\right)\), and \( \mathbf {S}_{\Phi } \stackrel {\triangle }{=} \left [ \mathbf {s} (\phi _{1}), \mathbf {s} (\phi _{2}), \ldots, \mathbf {s}(\phi _{m}) \right ] \in \mathbb {C}^{D \times m} \) for any sizem set of angles \(\Phi \stackrel {\triangle }{=} \{{\phi }_{1}, {\phi }_{2}, \dots, {\phi }_{m} \} \in \left (\frac {\pi }{2}, \frac {\pi }{2} \right ]^{m}\),^{Footnote 1} (2) can be rewritten as
for \(\Theta \stackrel {\triangle }{=} \left \{{\theta }_{1}, {\theta }_{2}, \dots, {\theta }_{K} \right \}\) and \(\Theta ^{\prime } \stackrel {\triangle }{=} \left \{{\theta }_{1}^{\prime }, {\theta }_{2}^{\prime }, \dots, {\theta }_{L}^{\prime }\right \}\). The goal of a DoA estimator is to identify correctly all angles in the DoA set Θ. Importantly, by the Vandermonde structure of S_{Θ}, it holds that
for any \( \phi \in \left (\frac {\pi }{2}, \frac {\pi }{2} \right ]\) [16]. That is, given \(\mathcal {S} \stackrel {\triangle }{=} \text {span}(\mathbf {S}_{\Theta })\), the receiver can decide accurately for any candidate angle \( \phi \in \left (\frac {\pi }{2}, \frac {\pi }{2} \right ]\) whether it is a DoA in Θ, or not.
2.1 DoA estimation under normal system operation.
Considering for a moment p_{l}=0 for every l∈{1,2,…,L}, (2) becomes
with autocorrelation matrix \( \mathbf {R} \stackrel {\triangle }{=} E \left \{\mathbf {y}_{n} \mathbf {y}_{n}^{\mathrm {H}} \right \} = \mathbf {S}_{\Theta } E \left \{\mathbf {x}_{n} \mathbf {x}_{n}^{\mathrm {H}}\right \} \mathbf {S}_{\Theta }^{\mathrm {H}} + \sigma ^{2} \mathbf {I}_{D} \). Certainly, \(\mathcal {S} = \text {span}(\mathbf {S}_{\Theta })\) coincides with the Kdimensional principal subspace of R, spanned by its K highesteigenvalue eigenvectors [5]. Therefore, being aware of R, the receiver could obtain \(\mathcal {S}\) through standard EVD and then conduct accurate DoA estimation by means of (4). However, in practice, the nominal receivedsignal autocorrelation matrix R is unknown to the receiver and sampleaverage estimated as \(\hat {\mathbf {R}} = \frac {1}{N} \sum \nolimits _{n=1}^{N} \mathbf {y}_{n} \mathbf {y}_{n}^{\mathrm {H}}\) [5, 16]. Accordingly, \(\mathcal {S}\) is estimated by the span of the K highesteigenvalue eigenvectors of \(\hat {\mathbf {R}}\), which coincide with the K highestsingularvalue left singularvectors of Y=△[y_{1},y_{2},…,y_{N}]. The eigenvectors of \(\hat {\mathbf {R}}\), or left singularvectors of Y, are also commonly referred to as the L2PCs of Y, since they constitute a solution to the L2PCA problem
In accordance to (4), the DoA set Θ is estimated by the arguments that yield the K local maxima (peaks) of the familiar MUSIC [16] spectrum
which clarifies why MUSIC is, in fact, an L2PCAbased DoA estimation method. Certainly, as N increases asymptotically, \(\hat {\mathbf {R}}\) tends to R,Q_{L2} tends to span S_{Θ}, and P(ϕ) goes to infinity for every ϕ∈Θ and finding its peaks becomes a criterion equivalent to (4). Therefore, for sufficient N, L2PCAbased MUSIC is wellknown to attain high performance in normal system operation.
2.2 Complications in the presence of unexpected jamming
In this work, we focus on the case where p_{l}>0 for all l, so that some snapshots in Y are corrupted by unexpected, unknown, directional interference, as modeled in (2). In this case, the K eigenvectors of \(\mathbf {R} = E \left \{\mathbf {y}_{n} \mathbf {y}_{n}^{\mathrm {H}}\right \}\) do not span \(\mathcal {S}\) any more. Thus, the K eigenvectors of \(\hat {\mathbf {R}}\) or singularvectors of Y would be of no use, even for very high samplesupport N. In fact, interferencecorrupted snapshots in Y may constitute outliers with respect to \(\mathcal {S}\). Accordingly, due to the well documented high responsiveness of L2PCA in (6) to outlying data, Q_{L2} may diverge significantly from \(\mathcal {S}\) [29, 48], rendering DoA estimation by means of (7) highly inaccurate. Below, we introduce a novel method that exploits the outlierresistance of L1PCA [36, 47, 48] to offer improved DoA estimates.
3 Proposed DoA estimation method
3.1 Operation on realified snapshots
In order to employ L1PCA algorithms that are defined for the processing of realvalued data, the proposed DoA estimation method operates on realvalued representations of the recorded complex snapshots in (2), similar to a number of previous works in the field [58–60]. In particular, we define the realvalued representation of any complexvalued matrix \(\mathbf {A} \in \mathbb {C}^{m \times n}\), by concatenating its real and imaginary parts, as
In Lie algebras and representation theory, this transition from C^{m×n} to \( \mathbb {R}^{2m \times 2n}\) is commonly referred to as complexnumber realification [61, 62] and is a method that allows for any complex system of equations to be converted into (and solved through) a corresponding real system [63]. Lemmas 1, 2, and 3 presented in the Appendix provide three important properties of realification. By (8) and Lemma 1, the nth complex snapshot y_{n} in (3) can be realified as
In accordance with Lemma 2, the rank of \( \overline {\mathbf {S}}_{\Theta } \) is 2K and, hence, \(\mathcal {S}_{R} \stackrel {\triangle }{=} \text {span} \left (\overline {\mathbf {S}}_{\Theta }\right) \) is a 2Kdimensional subspace wherein the K realified signal components of interest with angles of arrival in Θ lie. The following Proposition, deriving straightforwardly from (4) by means of Lemma 1 and Lemma 2, highlights the utility of \(\mathcal {S}_{R}\) for estimating the target DoAs.
Proposition 1
For any \( \phi \in \left (\frac {\pi }{2}, \frac {\pi }{2} \right ]\), it holds that
Set equality may hold only if K=1. ■
By Proposition 1, given an orthonormal basis \(\mathbf {Q}_{R} \in \mathbb {R}^{2D \times 2K}\) that spans \(\mathcal {S}_{R}\), the receiver can decide accurately whether some \(\phi \in \left (\frac {\pi }{2}, \frac {\pi }{2} \right ]\) is a target DoA, or not, by means of the criterion
Similar to the complexdata case presented above, in normal system operation, \(\mathcal {S}_{R}\) coincides with the span of the K dominant eigenvectors of \( \mathbf {R}_{R} \stackrel {\triangle }{=} \mathrm {E} \left \{\overline {\mathbf {y}}_{n} \overline {\mathbf {y}}_{n}^{\top } \right \}\). When the receiver, instead of R_{R}, possesses only the realified snapshot record \(\overline {\mathbf {Y}}, \mathcal {S}_{R}\) can be estimated as the span of
Then, in accordance with (11), the target DoAs can be estimated as the arguments that yield the K highest peaks of the spectrum
Similar to (6), the solution to (12) can be obtained by singularvalue decomposition (SVD) of \(\overline {\mathbf {Y}}\). Interestingly, the L2PCAbased DoA estimator of (13) is equivalent to the complexfield MUSIC estimator presented in Section 2. In fact, as we prove in the Appendix,
Hence, exhibiting performance identical to that of MUSIC, (12) can offer highly accurate estimates of the target DoAs under normal system operation. However, when Y contains corrupted snapshots, the L2PCAcalculated span(Q_{R,L2}) is a poor approximation to \(\mathcal {S}_{R}\) and DoA estimation by means of P_{R}(ϕ;Q_{R,L2}) tends to be highly inaccurate. In the following subsection, we present an alternative, L1PCAbased method for obtaining an outlierresistant estimate of Θ.
3.2 DoA estimation by realified L1PCA
Over the past few years, L1PCA has been shown to be far more resistant than L2PCA against outliers in the data matrix [31–40, 47, 48]. In this work, we propose the use of a DoAestimation spectrum analogous to that in (13) that is formed by the L1PCs of \(\overline {\mathbf {Y}}\). Specifically, the proposed method has two steps. First, we obtain the L1PCs of \(\overline {\mathbf {Y}}\), solving the L1PCA problem
That is, (15) searches for the subspace that maximizes data presence, quantified as the aggregate L1norm of the projected points.
Then, similarly to MUSIC, we estimate the target angles in Θ by the K highest peaks of the L1PCAbased spectrum
In accordance to standard practice, to find the K highest peaks of (16), we examine every angle in \(\left \{\phi =\frac {\pi }{2}+k \Delta \phi :~ k\in \left \{1, 2, \ldots, \left \lfloor \frac {\pi }{\Delta \phi } \right \rfloor ~ \right \}\right \}\), for some small scanning step Δϕ>0. Next, we place our focus on solving the L1PCA in (15).
3.3 Principles of realified L1PCA
Although L1PCA is not a new problem in the literature (see, e.g., [36–38]), its exact optimal solution was unknown until the recent work in [48], where the authors proved that (15) is formally NPhard and offered the first two exact algorithms for solving it. Proposition 2 below, originally presented in [48] for realvalued data matrices of general structure (i.e., not having necessarily the realified structure of \(\overline {\mathbf {Y}}\)) translates L1PCA in (15) to a nuclearnorm maximization problem over the binary field.
Proposition 2
If B_{opt} is a solution to
and \(\overline {\mathbf {Y}} \mathbf {B}_{\text {opt}}\) admits SVD \(\overline {\mathbf {Y}} \mathbf {B}_{\text {opt}} \overset {\text {}}{=} \mathbf {U} \mathbf {\Sigma }_{2K \times 2K} \mathbf {V}^{\top }\), then
is a solution to ( 15 ). Moreover, \(\left \ {\mathbf {Q}}_{R,L1}^{\top } \overline {\mathbf {Y}}\right \_{1} = \left \ \overline {\mathbf {Y}} \mathbf {B}_{\text {opt}}\right \_{*}\). ■
Since Q_{R,L1} can be obtained by B_{opt} via standard SVD, L1PCA is in practice equivalent to a combinatorial optimization problem over the 4N,K binary variables in B. The authors in [48] presented two algorithms for exact solution of (17), defined upon realvalued data matrices of general structure.
In this work, for the first time, we simplify the solutions of [48] in view of the special, realified structure of \(\overline {\mathbf {Y}}\). Specifically, in the following Proposition 3, we show that for K=1 we can exploit the special structure of \(\overline {\mathbf {Y}}\) and reduce (17) to a binary quadraticform maximization problem over half the number of binary variables (i.e., 2N instead of 4N). A proof for Proposition 3 is provided in the Appendix.
Proposition 3
If b_{opt} is a solution to
then [b_{opt}, E_{N}b_{opt}] is a solution to
with \( \ \overline {\mathbf {Y}}~[\mathbf {b}_{\text {opt}}, \mathbf {E}_{N} \mathbf {b}_{\text {opt}}]\_{*}^{2} = 4~ \\overline {\mathbf {Y}} \mathbf {b}_{\text {opt}} \_{2}^{2}. \label {nucnorm5} \) ■
In view of Propositions 2 and 3, Q_{R,L1} derives easily from the solution of
for m=1, if K=1, or m=2K, if K>1.
Since (21) is a combinatorial problem, the conceptually simplest approach for solving it is an exhaustive search (possibly in parallel fashion) over all elements of its feasibility set {±1}^{2N×m}. By means of this method, one should conduct 2^{2Nm} nuclear norm evaluations (e.g., by means of SVD of \(\overline {\mathbf {Y}} \mathbf {B}\)) to identify the optimum argument in the feasibility set; thus, the asymptotic complexity of this method is \(\mathcal {O}\left (2^{2Nm}\right)\). Exploiting the wellknown nuclearnorm properties of columnpermutation and columnnegation invariance, we can expedite practically the exhaustive procedure by searching for a solution to (21) in the set of all binary matrices that are columnwise built by the elements of a sizem multiset^{Footnote 2} of {b∈{±1}^{2N}: [b]_{1}=1}. By this modification, the exact number of binary matrices examined (thus, the number of nuclearnorm evaluations) decreases from 2^{2Nm} to \({{2^{2N1}+2K1}\choose {m}}\). Of course, exhaustivesearch approaches, being of exponential complexity in N, become impractical as the number of snapshots increases. For completeness, in Fig. 2, we provide a pseudocode for the exhaustivesearch algorithm presented above.
For the case of engineering interest where N>D and D is a constant, the authors in [48] presented a polynomialcost algorithm that solves (21) with complexity \(\mathcal {O}(N^{2Dm})\). In the following subsection, we exploit further the structure of \(\overline {\mathbf {Y}}\) and reduce significantly the computational cost of this algorithm.
3.4 Polynomialcost realified L1PCA
The authors in [48] showed that, according to Proposition 2, a solution to (21) can be found among the binary matrices that draw columns from
where \(\Omega _{2D} \stackrel {\triangle }{=} \left \{\mathbf {a} \in \mathbb {R}^{2D \times 1}:~ \\mathbf {a} \_{2}=1, [\!\mathbf {a}]_{2D} >0\right \}\) –with the positivity constraint in the last entry of a deriving from the invariance of the nuclear norm to column negations of its matrix argument. That is, a solution to (21) belongs to the mth Cartesian power of \(\mathcal {B}, \mathcal {B}^{m} \subseteq \{\pm 1 \}^{2N \times m}\).
In addition, [48] pointed out that, since the nuclearnorm maximization is also invariant to column permutations of the argument, we can maintain problem equivalence while further narrowing down our search to the elements of a set \( \tilde {\mathcal {B}}\), subset of \(\mathcal {B}^{m}\), that contains the \({{\mathcal {B} +m1}\choose {m}}\) binary matrices that are built by the elements of all sizem multisets of \(\mathcal {B}\). That is, we can obtain a solution to (21) by solving instead
Importantly, \(\tilde {\mathcal {B}} = {{\mathcal {B} +m1}\choose {m}} < \mathcal {B}^{m} = \mathcal {B}^{m}\). The exact multisetextraction procedure for obtaining \(\tilde {\mathcal {B}}\) from \(\mathcal {B}\) follows.
Calculation of\(\tilde {\boldsymbol{\boldsymbol{\mathcal {B}}}}\) from\(\boldsymbol{\boldsymbol{\mathcal {B}}}\)[48]. For every \( i \in \left \{1,2,\ldots, \binom {\mathcal {B} + m1 }{m} \right \}\), we define a distinct indicator function \( f_{i} : \mathcal {B} \mapsto \{0, 1, \ldots, m\}\) that assigns to every \( \mathbf {b} \in \mathcal {B}\) a natural number f_{i}(b)≤m, such that \(\sum \nolimits _{\mathbf {b}\in \mathcal {B}} f_{i}(\mathbf {b})=m\). Then, for every \(i \in \left \{1,2,\ldots, {{\mathcal {B} +m1}\choose {m}}\right \}\), we define a unique binary matrix B_{i}∈{±1}^{2N×m} such that every \(\mathbf {b} \in \mathcal {B}\) appears exactly f_{i}(b) times among the columns of B_{i}. Finally, we define the soughtafter set as \(\tilde {\mathcal {B}} \stackrel {\triangle }{=} \left \{\mathbf {B}_{1}, \mathbf {B}_{2}, \ldots, \mathbf {B}_{{{\mathcal {B} +m1}\choose {m}}}\right \}\).
Evidently, the cost to solve (23), and thus (21), amounts to the cost of constructing the feasibility set \(\tilde {\mathcal {B}}\) added to the cost of conducting nuclearnorm evaluations (through SVD) over all its elements. Therefore, the cost to solve (23) depends on the construction cost and cardinality of \(\tilde {\mathcal {B}}\). As seen above, \(\tilde {\mathcal {B}} = {{\mathcal {B} + m1}\choose {m}}\) and \(\tilde {\mathcal {B}}\) can be constructed online, by multiset selection on \(\mathcal {B}\), with negligible computational cost. Therefore, for determining the cardinality and construction cost of \(\tilde {\mathcal {B}}\), we have to find the cardinality and construction cost of \(\mathcal {B}\).
Next, we present a novel method to construct \({\mathcal {B}}\), different than the one in [48], that exploits the realified structure of \(\overline {\mathbf {Y}}\) to achieve lower computational cost.
Construction of \(\boldsymbol{\boldsymbol{\mathcal {B}}}\) , in view of the structure of \(\overline {\mathbf {Y}}\) .
Considering that any group of m≤2D columns of \(\overline {\mathbf {Y}}\) spans a mdimensional subspace, for each index set \(\mathcal {X} \subseteq \{1, 2, \ldots, 2N \}\) –elements in ascending order (e.a.o.)– of cardinality \(\mathcal {X} = 2D1\), we denote by \(\mathbf {z}(\mathcal {X})\) the unique leftsingular vector of \(\left [\overline {\mathbf {Y}}\right ]_{:,\mathcal {X}}\) that corresponds to zero singular value. Calculation of \(\mathbf {z}(\mathcal {X})\) can be achieved either by means of SVD or by simple GramSchmidt Orthonormalization (GMO) of \(\left [\overline {\mathbf {Y}}\right ]_{:,\mathcal {X}}\) –both SVD and GMO are of constant cost with respect to N. Accordingly, we define
Being a scaled version of \(\mathbf {z}(\mathcal {X}), \mathbf {c}(\mathcal {X}) \) also belongs to \( \text {null}\left (\left [\overline {\mathbf {Y}}\right ]_{:,\mathcal {X}}^{\top }\right) \), satisfying \( \left [\overline {\mathbf {Y}}\right ]_{:,\mathcal {X}}^{\top } \mathbf {c}(\mathcal {X}) = \mathbf {0}_{2D1} \label {c}. \) Next, we define the set of binary vectors
of cardinality \(\mathcal {B}(\mathcal {X}) = 2^{2D1}\), where \(\mathcal {X}^{c} \stackrel {\triangle }{=} \{1, 2, \ldots, 2N \} \setminus \mathcal {X}\) (e.a.o.) is the complement of \(\mathcal {X}\). In [48], the authors showed that
Since \(\mathcal {X}\) can take \({{2N}\choose {2D1}}\) different values, \(\mathcal {B}\) can be built by (26) through \({{2N}\choose {2D1}}\) nullspace calculations in the form of (24), with cost \({{2N}\choose {2D1}}D^{3} \in \mathcal {O}\left (N^{2D}\right)\). Accordingly, \(\mathcal {B}\) consists of
elements. In fact, in view of [64], the exact cardinality of \(\mathcal {B}\) is
Next, we show for the first time how we can reduce the cost of calculating \(\mathcal {B}\), exploiting the realified structure of \(\overline {\mathbf {Y}}\).
Consider \(\mathcal {X}_{1} \subseteq \{1, 2, \ldots, N \}\) (e.a.o.), \(\mathcal {X}_{2} \subseteq \{N+1, N+2, \ldots, 2N \}\) (e.a.o.), and their union \(\mathcal {X}_{A} = \{\mathcal {X}_{1}, \mathcal {X}_{2} \} \) (e.a.o.), such that \(\mathcal {X}_{1} < D\) and \(\mathcal {X}_{A}=\mathcal {X}_{1} + \mathcal {X}_{2} = 2D1\). Define also the set of indices \(\mathcal {X}_{B} = \{\mathcal {X}_{1} + N, \mathcal {X}_{2}  N\}\) (e.a.o.) with \(\mathcal {X}_{B} = 2D1\). By the structure of \(\overline {\mathbf {Y}}\), it is straightforward that
In turn, by the definition in (25) and (29), it holds that
The proof of (29) and (30) is offered in the Appendix. Notice now that, for every \(\mathcal {X} \subset \{1, 2, \ldots, 2N \}\) with \(\mathcal {X} = 2D1\), there exist \(\mathcal {X}_{1} \subset \{1, 2, \ldots, N\}\) and \(\mathcal {X}_{2} \subset \{N+1, N+2, \ldots, 2N\}\), satisfying \(\mathcal {X}_{1} < D\) and \(\mathcal {X}_{1} + \mathcal {X}_{2}=2D1\), such that
Thus, by (26) and (31), \(\mathcal {B}\) can constructed as
In view of (30), \(\mathcal {B} (\{\mathcal {X}_{1} +N, \mathcal {X}_{2} N\}) \) can be directly constructed from \(\mathcal {B} (\{\mathcal {X}_{1}, \mathcal {X}_{2} \})\) with negligible computational overhead. In addition, for \(\mathcal {X}_{1} < D\) and \(\mathcal {X}_{1}+\mathcal {X}_{2} =2D1\), by the ChuVandermonde binomialcoefficient property [65], \(\{\mathcal {X}_{1}, \mathcal {X}_{2} \}\) can take
values. Therefore, exploiting the structure of \(\overline {\mathbf {Y}}\), the proposed algorithm constructs \(\mathcal {B}\) by (32), avoiding half the nullspace calculations needed in the generic method of [48], presented in (26).
In view of (28), the feasibility set of (23), \(\tilde {\mathcal {B}}\), consists of exactly
elements. Thus, \(\mathcal {O}(N^{2Dm  m})\) nuclearnorm evaluations suffice to obtain a solution to (17). The asymptotic complexity for solving (15) by the presented algorithm is then \(\mathcal {O}\left (N^{2Dmm}\right)\). The described polynomialtime algorithm is presented in detail in Fig. 3, including elementbyelement construction of \(\mathcal {B}\).
3.5 Iterative realified L1PCA
For large problem instances (large N,D), the above presented optimal L1PCA calculators could be computationally impractical. Therefore, at this point we present and employ the bitflippingbased iterative L1PCA calculator, originally introduced in [43] for processing general realvalued data matrices. Given \(\overline {\mathbf {Y}} \in \mathbb {R}^{2D \times 2N}\), and some m<2 min(D,N), the algorithm presented below attempts to solve (21), by conducting a converging sequence of optimal singlebit flips.
Specifically, the algorithm is initializes at a 2N×m binary matrix B^{(1)} and conducts optimal singlebit flipping iterations. Specifically, at the tth iteration step (t>1), the algorithm generates the new matrix B^{(t)} so that (i) B^{(t)} differs from B^{(t−1)} in exactly one entry (bit flipping) and (ii) \(\ \overline {\mathbf {Y}} \mathbf {B}^{(t)} \_{*} > \ \overline {\mathbf {Y}} \mathbf {B}^{(t1)} \_{*}\). Mathematically, we notice that if we flip at the tth iteration the (n,k)th bit of B^{(t−1)} setting B^{(t)}=B^{(t−1)}−2[B^{(t−1)}]_{n,k}e_{n,2N}e_{k,m}^{⊤}, it holds that
Therefore, at step t, the presented algorithm searches for a solution (n,k) to
The constraint set \(\mathcal {L}^{(t)} \subseteq \{1,2, \ldots, 2Nm \}\), employed to restrain the greediness of the presented iterations, contains the indices^{Footnote 3} of bits that have not been flipped before and, thus, is initialized as \(\mathcal {L}^{(1)} = \{1,2, \ldots, 2Nm \}\). Having obtained the solution to (36), (n,k), the algorithm proceeds as follows. If \( \left \ \overline {\mathbf {Y}} \mathbf {B}^{(t1)}  2 \left [\mathbf {B}^{(t1)}\right ]_{n,k} [\!\overline {\mathbf {Y}}]_{:,n} \mathbf {e}_{k,m}^{\top } \right \_{*} > \left \ \overline {\mathbf {Y}} \mathbf {B}^{(t1)} \right \_{*}\), the algorithm generates \( \mathbf {B}^{(t)} = \mathbf {B}^{(t1)}  2 \left [\mathbf {B}^{(t1)}\right ]_{n,k} \mathbf {e}_{n,2N}\mathbf {e}_{k,m}^{\top }\) and updates \(\mathcal {L}^{(t+1)}\) to \(\mathcal {L}^{(t)} \setminus \{ (k1)2N+n\}\). If, otherwise, \( \left \ \overline {\mathbf {Y}} \mathbf {B}^{(t1)}  2 \left [\mathbf {B}^{(t1)}\right ]_{n,k}[\!\overline {\mathbf {Y}}]_{:,n} \mathbf {e}_{k,m}^{\top } \right \_{*} \leq \left \ \overline {\mathbf {Y}} \mathbf {B}^{(t1)} \right \_{*}\), the algorithm obtains a new solution (n,k) to (36) after resetting \(\mathcal {L}^{(t)}\) to {1,2,…,2Nm}. If this new (n,k) is such that \( \left \ \overline {\mathbf {Y}} \mathbf {B}^{(t1)}  2 \left [\mathbf {B}^{(t1)}\right ]_{n,k} [\!\overline {\mathbf {Y}}]_{:,n} \mathbf {e}_{k,m}^{\top } \right \_{*} > \left \ \overline {\mathbf {Y}} \mathbf {B}^{(t1)} \right \_{*}\), then the algorithm sets \(\mathbf {B}^{(t)} = \mathbf {B}^{(t1)}  2 [\mathbf {B}^{(t1)}]_{n,k} \mathbf {e}_{n,2N}\mathbf {e}_{k,m}^{\top }\) and updates \(\mathcal {L}^{(t+1)} = \mathcal {L}^{(t)} \setminus \{(k1)2N+n\}\). Otherwise, the iterations terminate and the algorithm returns B^{(t)} as a heuristic solution to (21). Notice that since at each iteration, the optimization metric increases. At the same time, the metric is certainly upperbounded by \(\ \overline {\mathbf {Y}} \mathbf {B}_{opt}\_{*}\). Therefore, the iterations are guaranteed to terminate in a finite number of steps, for any initialization B^{(1)}. Our studies have shown that, in fact, the iterations terminate for t<2Nm, with very high frequency of occurrence.
For solving (36), one has to calculate \( \left \{\vphantom {\mathbf {e}_{l,m}^{\top }}} \overline {\mathbf {Y}} \mathbf {B}^{(t1)} 2 \left [\mathbf {B}^{(t1)}\right ]_{j,l} [\!\overline {\mathbf {Y}}]_{:,j} \mathbf {e}_{l,m}^{\top } \right \_{*}\), for all (j,l)∈{1,2,…,2N}×{1,2,…,m} such that \( (l1)2N+j \in \mathcal {L}\). At worst case, \(\mathcal {L} = \{1, 2, \ldots, 2Nm \}\) and this demands 2Nm independent singularvalue/nuclearnorm calculations. Therefore, the total cost for solving (36) is \(\mathcal {O}\left (N^{2}m^{3}\right)\). If we limit the number of iterations to 2Nm, for the sake of practicality, then the total cost for obtaining a heuristic solution to (21) is \(\mathcal {O} \left (N^{3} m^{4}\right)\) –significantly lower than the cost of the polynomialtime optimal algorithm presented above, \(\mathcal {O}\left (N^{2Dmm}\right)\). When the iterations terminate, the algorithm returns the bitflippingderived L1PC matrix \( {\mathbf {Q}}_{R,BF} \stackrel {\triangle }{=} \mathbf {U} \mathbf {V}^{\top } \), where \( \mathbf {U} \mathbf {\Sigma }_{2K \times 2K} \mathbf {V}^{\top } \overset {\text {svd}}{=} \overline {\mathbf {Y}} \mathbf {B}_{\text {opt}} \). Formal performance guarantees for the presented bitflipping procedure were offered in [43], for general realvalued matrices and K=1.
A pseudocode of the presented algorithm for the calculation of the 2K L1PCs \(\overline {\mathbf {Y}}_{2D \times 2N}\) is presented in Fig. 4.
4 Numerical results and discussion
We present numerical studies to evaluate the DoA estimation performance of realified L1PCA, compared to other PCA calculation counterparts. Our focus lies on cases where a nominal source (the DoA of which we are looking for) operates in the intermittent presence of a jammer located at a different angle. Ideally, we would like the DoA estimator to be able to identify successfully the DoA of the source of interest, despite the unexpected directional interference.
To offer a first insight into the performance of the proposed method, in Fig. 5 we present a realization of the DoAestimation spectra P_{R}(ϕ;Q_{R,L1}) and P_{R}(ϕ;Q_{R,L2}), as defined in (14) and (16), respectively. In this study, we calculate the exact L1PCs of \(\overline {\mathbf {Y}}\), using the polynomialcost optimal algorithm of Fig. 3. The receiver antennaarray is equipped with D=3 elements and collects N = 8 snapshots. All snapshots contain a signal from the single source of interest (K=1) impinging on the array with DoA − 20^{∘}. One out of the eight snapshots is corrupted by two jamming sources with DoAs 31^{∘} and 54^{∘}. The signaltonoise ratio (SNR) is set to 2 dB for the target source and to 5 dB for each of the jammers. We observe that standard MUSIC (L2PCA) is clearly misled by the two jammercorrupted measurement. Interestingly, the proposed L1PCAbased method manages to identify the target location successfully.
Next, we generalize our study to include probabilistic presence of an jammer. Specifically, we keep D=3 and N=8 and consider K=1 target at θ=− 41^{∘} with SNR 2 dB, and L=1 jammer at θ^{′}=24^{∘} with activation probability p taking values in {0,.1,.2,.3,.4,.5}.
In Fig. 6, we plot the rootmeansquareerror (RMSE)^{Footnote 4}, calculated over 5000 independent realizations, vs. jammer SNR, for three DoA estimators: (a) the standard L2PCAbased one (MUSIC), (b) the proposed L1PCA DoA estimator with the L1PCs calculated optimally by means of the polynomialcost algorithm of Fig. 3, and (c) the proposed L1PCA estimator with the L1PCs found by means of the algorithm of Fig. 4. For all three methods, we plot the performance attained for each value of p∈{0,.1,.2,.3,.4,.5}. Our first observation is that the two L1PCAbased estimators exhibit almost identical performance for every value of p and jammer SNR. Then, we notice that, in normal system operation (p=0) the RMSEs of the L2PCAbased and L1PCAbased estimators are extremely close to each other and low, with slight (almost negligible) superiority of the L2PCAbased method. Quite interestingly, for any nonzero jammer activation probability p and over the entire range of jammer SNR values, the RMSE attained by the proposed L1PCAbased methods is lower than that attained by the L2PCAbased one. For instance, for jammer SNR 12 dB and p=.1, the proposed methods offer 8^{∘} smaller RMSE than MUSIC. Of course, at high jammer SNR values and p =.5 the RMSE of both methods approaches 65^{∘}, which is the angular distance of the target and the jammer; i.e. both methods tend to peak the significantly (18 dB) stronger jammer present in half the snapshots.
In Fig. 7, we change the metric and study the more general Subspace Representation Ratio (SRR), attained by L2PCA and L1PCA. For any orthonormal basis \(\mathbf {Q} \in \mathbb {R}^{2D \times 2K}\), SRR is defined as
In Fig. 7, we plot SRR(Q_{R,L2}) (L2PCA), SRR(Q_{R,L1}) (optimal L1PCA), and SRR(Q_{R,BF}) averaged over 5000 realizations, for multiple values of p, versus the jammer SNR. We observe that, again, the performance of the optimal and heuristic L1PCA calculators almost coincides for every value of p and jammer SNR. Also, we notice that under normal system operation (p=0) the spans of Q_{R,L2},Q_{R,L1}, and Q_{R,BF} are equally good approximations to \(\mathcal {S}_{R}\) and their respective SRR curves lie close (as close as the target SNR and the number of snapshots allow) to the benchmark of SRR(U), where U is an orthonormal basis for the exact \(\mathcal {S}_{R}\). On the other hand, when half the snapshots are jammer corrupted (p=.5) both methods capture more of the interference. Similar to Fig. 6, for any jammer activation probability and over the entire range of jammer SNR values, the SRR attained by L1PCA (both algorithms) is superior to that attained by conventional L2PCA.
Next, we set D=4,N=10,θ=− 20^{∘} and θ^{′}=50^{∘}. The source SNR is set to 5 dB. In Fig. 8, we plot the RMSE vs. jamming SNR performance attained by L2PCA, RPCA^{Footnote 5} (algorithm of [29]), and L1PCA (proposed –computed by efficient Algorithm 3), all computed on the realified snapshots. We observe that for p = 0 (i.e., no jamming corruption) all methods perform well; in particular, L2PCA and L1PCA demonstrate almost identical performance of about 3^{∘} RMSE. For jammer operation probability p>0, we observe that the proposed L1PCA method outperforms clearly all counterparts, exhibiting from 5^{∘} (for jammer SNR 6 dB) to 20^{∘} (for jammer SNR 11 dB) lower RMSE.
In Fig. 9. we plot the RMSE attained by the three counterparts, this time fixing jamming SNR to 10 dB and varying the snapshot corruption probability p∈{0,.1,.2,.3,.4,.5,.6}. Once again, we observe that, for p=0 (no jamming activity), all methods perform well. For p>0, L1PCA outperforms both counterparts across the board.
Finally, in the study of Fig. 9, we measure the computation time expended by the three PCA methods, for p=0 and p=0.5. We observe that standard PCA, implemented by SVD, is the fastest method, with average computation time about 4·10^{−5} s, for both values of p. The computation time of RPCA is 1.5·10^{−2} s for p = 0 and 1.9·10^{−2} s for p=0.5. L1PCA (Algorithm 3) computation takes, on average, 4.3·10^{−2} s for both values of p, comparable to RPCA.^{Footnote 6}
5 Conclusions
We considered the problem of DoA estimation in the possible presence of unexpected, intermittent directional interference and presented a new method that relies on the L1PCA of the recorded snapshots. Accordingly, we presented three algorithms (two optimal ones and one iterative/heuristic) for realified L1PCA; i.e., L1PCA of realified complex data matrices. Our numerical studies showed that the proposed method attains performance similar to conventional L2PCAbased DoA estimation (MUSIC) in normal system operation (absence of jammers), while it attains significantly superior performance in the case of unexpected, sporadic corruption of the snapshots.
6 Appendix
6.1 Useful properties of realification
Lemma 1 below follows straightforwardly from the definition in (8).
Lemma 1
For any \(\mathbf {A}, \mathbf {B} \in \mathbb {C}^{m \times n}\), it holds that \(\overline {(\mathbf {A} + \mathbf {B})} = \overline {\mathbf {A}} + \overline {\mathbf {B}}\). For any \(\mathbf {A} \in \mathbb {C}^{m \times n}\) and \(\mathbf {B} \in \mathbb {C}^{n \times q}\), it holds that \(\overline {(\mathbf {A} \mathbf {B})} = \overline {\mathbf {A}}\; \overline {\mathbf {B}}\) and \(\overline {\left (\mathbf {A}^{\mathrm {H}} \right)} = \overline {\mathbf {A}}^{\top }\). ■
Lemma 2 below was discussed in [70] and [20], in the form of problem 8.6.4. Here, we also provide a proof, for the sake of completeness.
Lemma 2
For any \(\mathbf {A} \in \mathbb {C}^{m \times n}, \text {rank}(\overline {\mathbf {A}}) =2~ \text {rank}(\mathbf {A})\). In particular, each singular value of A will appear twice among the singular values of \(\overline {\mathbf {A}}\). ■
Proof
Consider a complex matrix \(\mathbf {A} \in \mathbb {C}^{m \times n}\) of rank k≤ min{m,n} and its singular value decomposition \(\mathbf {A} \overset {\text {SVD}}{=} \mathbf {U}_{m \times m} \mathbf {\Sigma }_{m \times n} \mathbf {V}_{n \times n}^{\mathrm {H}}\), where
and \(\boldsymbol \sigma \stackrel {\triangle }{=} [\sigma _{1}, \sigma _{2}, \ldots, \sigma _{k}]^{\top } \in \mathbb {R}_{+}^{k}\) is the length k vector containing (in descending order) the positive singular values of A. By Lemma 1,
with \(\overline {\mathbf {U}}^{\top } \overline {\mathbf {U}} = \overline {\mathbf {U}} \; \overline {\mathbf {U}}^{\top } = \mathbf {I}_{2m}\) and \(\overline {\mathbf {V}}^{\top } \overline {\mathbf {V}} = \overline {\mathbf {V}} \; \overline {\mathbf {V}}^{\top } = \mathbf {I}_{2n}\). Define now, for every \(a,b \in \mathbb {N}_{\geq 1}\), the ab×ab permutation matrix
where \({\mathbf {e}_{i}^{b}} \stackrel {\triangle }{=} [\mathbf {I}_{b}]_{:,i}\), for every i∈{1,2,…,b}. Then,
By (39),
where \(\check {\mathbf {U}} \stackrel {\triangle }{=} \overline {\mathbf {U}} \mathbf {Z}_{2,m}^{\top }, \check {\mathbf {V}} \stackrel {\triangle }{=} \overline {\mathbf {V}} \mathbf {Z}_{2,n}^{\top }\), and \(\check {\boldsymbol \Sigma } \stackrel {\triangle }{=} \mathbf {Z}_{2,m} (\mathbf {I}_{2} \otimes \boldsymbol \Sigma) \mathbf {Z}_{2,n}^{\top }\). It is easy to show that \(\check {\mathbf {U}}^{\top } \check {\mathbf {U}} = \check {\mathbf {U}} \check {\mathbf {U}}^{\top } = \mathbf {I}_{2m}, \check {\mathbf {V}}^{\top } \check {\mathbf {V}} = \check {\mathbf {V}} \check {\mathbf {V}}^{\top } = \mathbf {I}_{2n}\), and \(\check {\boldsymbol \Sigma } = \boldsymbol \Sigma \otimes \mathbf {I}_{2}\). Therefore, (42) constitutes the standard (sorted singular values) SVD of \(\overline {\mathbf {A}}\) and \(\text {rank}(\overline {\mathbf {A}}) = 2k\). □
Lemma 3 below follows from Lemmas 1 and 2.
Lemma 3
For any \(\mathbf {A} \in \mathbb {C}^{m \times n}, \ \mathbf {A}\_{2}^{2} = \frac {1}{2}\ \overline {\mathbf {A}} \_{2}^{2}\). ■
6.2 Proof of (14)
We commence our proof with the following auxiliary Lemma 4.
Lemma 4
For any matrix \(\mathbf {A} \in \mathbb {C}^{m \times n}\), if \(\mathbf {Q}_{real} \in \mathbb {R}^{2m \times 2l}, l < m \leq n\) is a solution to
and \(\mathbf {Q}_{comp.} \in \mathbb {C}^{m \times l}\) is a solution to
then \(\mathbf {Q}_{{\text {real}}} \mathbf {Q}_{{\text {real}}}^{\top } = \overline {\left (\mathbf {Q}_{{\mathrm {comp.}}} \mathbf {Q}_{{\mathrm {comp.}}}^{\mathrm {H}}\right)} = \overline {\mathbf {Q}}_{{\mathrm {comp.}}}\vspace *{2pt} \overline {\mathbf {Q}}_{{\mathrm {comp.}}}^{\top } \). ■
Proof
Orthonormal basis \([\check {\mathbf {U}}]_{:,1:2l}\), defined in the proof of Lemma 2 above, contains the 2l highestsingularvalue leftsingular vectors of \(\overline {\mathbf {A}}\), and, thus, solves (43) [20]. Since the objective value in (43) is invariant to column permutations of the argument Q, any column permutation of \([\check {\mathbf {U}}]_{:,1:2l}\) is still a solution to (43). Next, we define the permutation matrix W_{m,l}=△[I_{l}, 0_{l×(m−l)}]^{⊤} and notice that \( [\check {\mathbf {U}}]_{:,1:2l} = \overline {\mathbf {U}} [\mathbf {I}_{2} \otimes {\mathbf {e}_{1}^{m}}, \mathbf {I}_{2} \otimes {\mathbf {e}_{2}^{m}}, \ldots, \mathbf {I}_{2} \otimes {\mathbf {e}_{l}^{m}}] \) is a column permutation of \(\overline {\mathbf {U}} (\mathbf {I}_{2} \otimes \mathbf {W}_{m,l}) =\overline {\mathbf {U}} \; \overline {\mathbf {W}_{m,l}} \), which, by Lemma 1, equals \( \overline {\left (\mathbf {U} \mathbf {W}_{m,l} \right)} = \overline {[\mathbf {U}]_{:,1:l}}\). Thus, \( \overline {[\mathbf {U}]_{:,1:l}}\) solves (43) too. At the same time, by (39), [U]_{:,1:l} contains the l highestsingularvalue leftsingular vectors of A and solves (44) [20]. By the above, we conclude that a realification per (8) of any solution to (44) constitutes a solution to (43) and, thus, \(\mathbf {Q}_{{\text {real}}} \mathbf {Q}_{{\text {real}}}^{\top } = \overline {\left (\mathbf {Q}_{{\mathrm {comp.}}} \mathbf {Q}_{{\mathrm {comp.}}}^{\mathrm {H}}\right)} = \overline {\mathbf {Q}}_{{\mathrm {comp.}}} \overline {\mathbf {Q}}_{{\mathrm {comp.}}}^{\top } \). □
By Lemmas 1, 3, and 4, (14) holds true.
6.3 Proof of (29)
We commence our proof by defining \(d = \mathcal {X}_{1}\) and the sets \({\mathcal {X}_{A}^{c}} \stackrel {\triangle }{=} \{1, 2, \ldots, 2N \}\setminus \mathcal {X}_{A}\) (e.a.o.) and \({\mathcal {X}_{B}^{c}} \stackrel {\triangle }{=} \{1, 2, \ldots, 2N \}\setminus \mathcal {X}_{B}\) (e.a.o.) Then, we notice that
where \(\mathbf {P} \stackrel {\triangle }{=} \left [ [\mathbf {I}_{2D1}]_{:,d+1:2D1}, ~[\mathbf {I}_{2D1}]_{:,1:d} \right ] \). Similarly,
where \(\mathbf {P}_{c} \stackrel {\triangle }{=} \left [ [\mathbf {I}_{2N 2D+1}]_{:,Nd+1:2N2D+1}, ~[\mathbf {I}_{2N 2D + 1}]_{:,1:Nd} \right ] \). Then,
Consider now \(\mathbf {z} = \text {sgn}{([\mathbf {c}(\mathcal {X}_{A})]_{D})} \mathbf {E}_{D} \mathbf {c}(\mathcal {X}_{A})\). It holds that [z]_{2D}>0 and
Therefore, \(\mathbf {z} = \mathbf {c}(\mathcal {B}) = \in \text {null} \left ([\!\overline {\mathbf {Y}}]_{:,\mathcal {X}_{B}}^{\top }\right) \cap \Omega _{2D}\) and, hence, (29) holds true.
6.4 Proof of Prop. 3
We begin by rewriting the maximization argument of (20) as
where b_{1} and b_{2} are the first and second columns of B, respectively. Evidently, the maximum value attained at (17) is upper bounded as
Considering now a solution b_{opt} to \( {\text {maximize}}_{\mathbf {b} \in \{\pm 1\}^{2N \times 1}}~ \\overline {\mathbf {Y}} \mathbf {b} \_{2}^{2}, \) and defining \(\mathbf {b}_{\text {opt}}^{\prime } = \mathbf {E}_{N} \mathbf {b}_{\text {opt}} \), we notice that \( \\overline {\mathbf {Y}} \mathbf {b}_{\text {opt}}^{\prime } \_{2}^{2} = \\overline {\mathbf {Y}} \mathbf {E}_{N} \mathbf {b}_{\text {opt}} \_{2}^{2} = \\mathbf {E}_{D} \overline {\mathbf {Y}} \mathbf {b}_{\text {opt}} \_{2}^{2} = \ \overline {\mathbf {Y}} \mathbf {b}_{\text {opt}} \_{2}^{2} \) and \( \mathbf {b}_{\text {opt}}^{\top } \overline {\mathbf {Y}}^{\top } \overline {\mathbf {Y}} \mathbf {b}_{\text {opt}}^{\prime } = \mathbf {b}_{\text {opt}}^{\top } \overline {\mathbf {Y}}^{\top } \overline {\mathbf {Y}} \mathbf {E}_{N} \mathbf {b}_{\text {opt}} = \mathbf {b}_{\text {opt}}^{\top } \overline {\mathbf {Y}}^{\top } \mathbf {E}_{D} \overline {\mathbf {Y}} \mathbf {b}_{\text {opt}} = 0. \) Therefore, \( \ \overline {\mathbf {Y}}~\left [\mathbf {b}_{\text {opt}}, \mathbf {b}_{\text {opt}}^{\prime }\right ]\_{*}^{2} = 4~ \\overline {\mathbf {Y}} \mathbf {b}_{\text {opt}} \_{2}^{2} \) and, in view of (50), [b_{opt}, E_{N}b_{opt}] is a solution to (20).
6.5 Proof of (30)
By (29) and (46), it holds that
Consider now some \(\mathbf {b} \in \mathcal {B}(\mathcal {X}_{A})\) and define \(\mathbf {b}^{\prime } \stackrel {\triangle }{=} \text {sgn}{([\mathbf {c}(\mathcal {X}_{A})]_{D})} \mathbf {E}_{N} \mathbf {b}\). By (46), (51), and the definition in (25), it holds that
Hence, b^{′} belongs to \(\mathcal {B}(\mathcal {X}_{B})\) and (30) holds true.
Notes
\(\mathbf {S}_{\Phi } \in \mathbb {C}^{D \times m}\) is a transposed Vandermonde matrix [66] and has rank m if Φ=m<D.
For any underlying set of elements \(\mathcal {A}\) of cardinality n, a size m multiset may be defined as a pair \(\left (\mathcal {A}, f\right)\) where \(f:~ \mathcal {A} \to \mathbb {N}_{\geq 1}\) is a function from \(\mathcal {A}\) to \(\mathbb {N}_{\geq 1}\) such that \(\sum \nolimits _{a \in \mathcal {A}} f(a) =m\); the number of all distinct size m multisets defined upon \(\mathcal {A}\) is \({{n + m 1}\choose {m}}\) [67].
The (n,k)th bit of the binary matrix argument in (21) has the corresponding singleinteger index (k−1)2N+n∈{1,2,…,2Nm}.
Denoting by θ the DoA and \(\hat {\theta }\) the DoA estimate, RMSE is defined as the square root of the average value of \(\hat {\theta }  {\theta }^{2}\).
Given \(\overline {\mathbf {Y}}\), RPCA minimizes ∥L∥_{∗}+λ∥O∥_{1}, over L and O, subject to \(\overline {\mathbf {Y}} = \mathbf {L} + \mathbf {O}\) and \(\lambda = {\sqrt {\max \{2D, 2N \}}}^{1}\) [29]. Then, it returns the 2K L2PCs of L (computed by SVD).
Abbreviations
 AWGN:

Additive white Gaussian noise
 DoA:

Directionofarrival
 EVD:

Eigenvalue decomposition
 L1PCA:

L1norm principalcomponent analysis
 L1PCs:

L1norm principal components
 L2PCs:

L2norm principal components
 MUSIC:

Multiple signal classification
 PCA:

Principalcomponent analysis
 RMSE:

Rootmeansquarederror
 RPCA:

Robust PCA
 SNR:

Signaltonoise ratio
 SRR:

Subspacerepresentation ratio
 SVD:

Singularvalue decomposition
 ULA:

Uniform linear array
 UNM:

Unimodular nuclearnorm maximization
References
S. A. Zekavat, M. Buehrer, Handbook of Position Location: Theory, Practice and Advances (WileyIEEE Press, New York, 2011).
W. J. Zeng, X. L. Li, Highresolution multiple wideband and nonstationary source localization with unknown number of sources. IEEE Trans. Signal Process.58:, 3125–3136 (2010).
M. G. Amin, W. Sun, A novel interference suppression scheme for global navigation satellite systems using antenna array. IEEE J. Select Areas Commun.23:, 999–1012 (2005).
S. Gezici, Z. Tian, G. B. Giannakis, H. Kobayashi, A. F. Molisch, H. V. Poor, Z. Sahinoglu, Localization via ultrawideband radios. IEEE Signal Process. Mag.22:, 70–84 (2005).
L. C. Godara, Application of antenna arrays to mobile communications, part ii: Beamforming and directionofarrival considerations. Proc. IEEE. 85:, 1195–1245 (1997).
H. Krim, M. Viberg, Two decades of array signal processing research. IEEE Signal Process. Mag., 67–94 (1996).
J. M. Kantor, C. D. Richmond, D. W. Bliss, B. C. Jr., Meansquarederror prediction for bayesian directionofarrival estimation. IEEE Trans. Signal Process.61:, 4729–4739 (2013).
P. Stoica, A. B. Gershman, Maximumlikelihood doa estimation by datasupported grid search. IEEE Signal Process. Lett.6:, 273–275 (1999).
P. Stoica, K. C. Sharman, Maximum likelihood method for direction of arrival estimation. IEEE Trans. Acoust. Speech Signal Process., 1132–1143 (1990).
J. Sheinvald, M. Wax, A. J. Weiss, On maximumlikelihood localization of coherent signals. IEEE Trans. Signal Process.44:, 2475–2482 (1996).
B. Ottersten, M. Viberg, P. Stoica, A. Nehorai, in Radar Array Processing, ed. by S. Haykin, J. Litva, and T. J. Shepherd. Exact and large sample maximum likelihood techniques for parameter estimation and detection in array processing (SpringerBerlin, 1993), pp. 99–151.
M. I. Miller, D. R. Fuhrmann, Maximum likelihood narrowband direction finding and the em algorithm. IEEE Trans. Acoust., Speech, Signal Process.38:, 1560–1577 (1990).
F. C. Schweppe, Sensor array data processing for multiplesignal sources. IEEE Trans. Inf. Theory. 14:, 294–305 (1968).
D. H. Johnson, The application of spectral estimation methods to bearing estimation problems. Proc. IEEE. 70:, 1018–1028 (1982).
R. T. Lacoss, Data adaptive spectral analysis method. Geophysics. 36:, 661–675 (1971).
R. O. Schmidt, Multiple emitter location and signal parameter estimation. IEEE Trans. Antennas Propag.34:, 276–280 (1986).
S. Haykin, Advances in Spectrum Analysis and Array Processing (PrenticeHall, Englewood Cliffs, 1995).
H. L. V. Trees, Optimum Array Processing, Part IV of Detection, Estimation, and Modulation Theory (Wiley, New York, 2002).
R. Grover, D. A. Pados, M. J. Medley, Subspace direction finding with an auxiliaryvector basis. IEEE Trans. Signal Process.55:, 758–763 (2007).
G. H. Golub, C. F. V. Loan, Matrix Computations, 4th edn. (The Johns Hopkins Univ. Press, Baltimore, 2012).
P. Stoica, A. Nehorai, Music, maximum likelihood, and cramerrao bound. IEEE Trans. Acoust. Speech Signal Process.37:, 720–741 (1989).
P. Stoica, A. Nehorai, Music, maximum likelihood, and cramerrao bound: further results and comparisons. IEEE Trans. Acoust. Speech Signal Process.38:, 2140–2150 (1990).
H. Abeida, J. P. Delmas, Efficiency of subspacebased doa estimators. Signal Process. (Elsevier). 87:, 2075–2084 (2007).
K. L. Blackard, T. S. Rappaport, C. W. Bostian, Measurements and models of radio frequency impulsive noise for indoor wireless communications. IEEE J. Select Areas Commun.11:, 991–1001 (1993).
J. B. Billingsley, Ground clutter measurements for surfacesited radar. Massachusetts Inst. Technol. Cambridge, MA, Tech. Rep. 780 (1993). https://apps.dtic.mil/docs/citations/ADA262472.
F. Pascal, P. Forster, J. P. Ovarlez, P. Larzabal, Performance analysis of covariance matrix estimates in impulsive noise. IEEE Trans. Signal Process.56:, 2206–2217 (2008).
R. L. Peterson, R. E. Ziemer, D. E. Borth, Introduction to Spread Spectrum Communications (Prentice Hall, Englewood Cliffs, 1995).
O. Besson, P. Stoica, Y. Kamiya, Direction finding in the presence of an intermittent interference. IEEE Trans. Signal Process.50:, 1554–1564 (2002).
E. J. Candes, X. Li, Y. Ma, J. Wright, Robust principal component analysis?. J. ACM. 58(11), 1–37 (2011).
F. De la Torre, M. J. Black, in Proc. Int. Conf. Computer Vision (ICCV). Robust principal component analysis for computer vision (Vancouver, 2001), pp. 1–8.
Q. Ke, T. Kanade, Robust subspace computation using L1 norm,. Internal Technical Report, Computer Science Department, Carnegie Mellon University, CMUCS03–172 (2003). http://www2.cs.cmu.edu/~ke/publications/CMUCS03172.pdf.
Q. Ke, T. Kanade, in Proc. IEEE Conf. Comput. Vision Pattern Recog. (CVPR). Robust l _{1} norm factorization in the presence of outliers and missing data by alternative convex programming (SPIESan Diego, 2005), pp. 739–746.
L. Yu, M. Zhang, C. Ding, in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP). An efficient algorithm for l _{1}norm principal component analysis (IEEEKyoto, 2012), pp. 1377–1380.
J. P. Brooks, J. H. Dulá, The l _{1}norm bestfit hyperplane problem. Appl. Math. Lett.26:, 51–55 (2013).
J. P. Brooks, J. H. Dulá, E. L. Boone, A pure l _{1}norm principal component analysis. J. Comput. Stat. Data Anal.61:, 83–98 (2013).
N. Kwak, Principal component analysis based on l _{1}norm maximization. IEEE Trans. Pattern Anal. Mach. Intell.30:, 1672–1680 (2008).
F. Nie, H. Huang, C. Ding, D. Luo, H. Wang, in Proc. Int. Joint Conf. Artif. Intell. (IJCAI). Robust principal component analysis with nongreedy l _{1}norm maximization (IJCAI, 2011), pp. 1433–1438.
M. McCoy, J. A. Tropp, Electron JS. Two proposals for robust PCA using semidefinite programming. 5:, 1123–1160 (2011).
C. Ding, D. Zhou, X. He, H. Zha, in Proc. Int. Conf. Mach. Learn. (ICML). r _{1}PCA: Rotational invariant l _{1}norm principal component analysis for robust subspace factorization (Proc. ICMLPittsburgh, 2006), pp. 281–288.
X. Li, Y. Pang, Y. Yuan, l _{1}normbased 2DPCA. IEEE Trans. Syst. Man. Cybern. B. Cybern.40:, 1170–1175 (2009).
Y. Liu, D. A. Pados, Compresssenseddomain L1PCA video surveillance. IEEE Trans. Mult.18:, 351–363 (2016).
P. P. Markopoulos, D. A. Pados, G. N. Karystinos, M. Langberg, in Proc. SPIE Compressive Sensing Conference, Defense and Commercial Sensing (SPIE DCS 2017). L1norm principalcomponent analysis in l2normreducedrank data subspaces (Anaheim, 2017), pp. 1021104–1102110410.
P. P. Markopoulos, S. Kundu, S. Chamadia, D. A. Pados, Efficient L1norm principalcomponent analysis via bit flipping. IEEE Trans. Signal Process.62:, 4252–4264 (2017).
N. Tsagkarakis, P. P. Markopoulos, D. A. Pados, in Proc. IEEE International Conference on Machine Learning and Applications (IEEE ICMLA 2016). On the L1norm approximation of a matrix by another of lower rank (IEEEAnaheim, 2016), pp. 768–773.
P. P. Markopoulos, S. Kundu, S. Chamadia, D. A. Pados, in Proc. IEEE International Conference on Machine Learning and Applications (IEEE ICMLA 2016). L1norm principalcomponent analysis via bit flipping (IEEEAnaheim, 2016), pp. 326–332.
N. Tsagkarakis, P. P. Markopoulos, G. Sklivanitis, D. A. Pados, L1norm principalcomponent analysis of complex data. IEEE Trans. Signal Process.66:, 3256–3267 (2018).
P. P. Markopoulos, G. N. Karystinos, D. A. Pados, in Proc. 10th Int. Symp. on Wireless Commun. Syst. (ISWCS). Some options for l _{1}subspace signal processing (IEEEIlmenau, 2013), pp. 622–626.
P. P. Markopoulos, G. N. Karystinos, D. A. Pados, Optimal algorithms for l _{1}subspace signal processing. IEEE Trans. Signal Process.62:, 5046–5058 (2014).
P. P. Markopoulos, S. Kundu, D. A. Pados, in Proc. IEEE Int. Conf. Image Process. (ICIP). L1fusion: Robust lineartime image recovery from few severely corrupted copies (IEEEQuebec City, 2015), pp. 1225–1229.
P. P. Markopoulos, F. Ahmad, in Proc. IEEE Radar Conference (IEEE Radarcon 2017). Indoor human motion classification by L1norm subspaces of microdoppler signatures (IEEESeattle, 2017).
D. G. Chachlakis, P. P. Markopoulos, R. J. Muchhala, A. Savakis, in Proc. SPIE Compressive Sensing Conference, Defense and Commercial Sensing (SPIE DCS 2017). Visual tracking with L1grassmann manifold modeling (SPIEAnaheim, 2017), pp. 1021102–1102110210.
P. P. Markopoulos, F. Ahmad, in Int. Microwave Biomed. Conf. (IEEE IMBioC 2018), IEEE. Robust radarbased human motion recognition with L1norm linear discriminant analysis (IEEEPhiladelphia, 2018), pp. 145–147.
A. Gannon, G. Sklivanitis, P. P. Markopoulos, D. A. Pados, S. N. Batalama, Semiblind signal recovery in impulsive noise with L1norm PCA (IEEE, Pacific Grove, 2018). to appear.
X. Ding, L. He, L. Carin, Bayesian robust principal component analysis. IEEE Trans. Image Process.20(12), 3419–3430 (2011).
J. Wright, A. Ganesh, S. Rao, Y. Peng, Y. Ma, in Adv. Neural Info. Process. Syst. (NIPS). Robust principal component analysis: Exact recovery of corrupted lowrank matrices via convex optimization, (2009), pp. 2080–2088.
P. P. Markopoulos, N. Tsagkarakis, D. A. Pados, G. N. Karystinos, in Proc. IEEE Phased Array Systems and Technology (PAST 2016). Directionofarrival estimation by L1norm principal components (Waltham, 2016), pp. 1–6.
A. Graham, Kronecker Products and Matrix Calculus with Applications (Ellis Horwood, Chichester, 1981).
M. Haardt, J. Nossek, Unitary esprit: How to obtain increased estimation accuracy with a reduced computational burden. IEEE Trans. Signal Process.43:, 1232–1242 (1995).
M. Pesavento, A. B. Gershman, M. Haardt, Unitary rootmusic with a realvalued eigendecomposition: A theoretical and experimental performance study. IEEE Trans. Signal Process.48:, 1306–1314 (2000).
S. A. Vorobyov, A. B. Gershman, Z. Q. Luo, Robust adaptive beamforming using worstcase performance optimization. IEEE Trans. Signal Process.51:, 313–324 (2003).
J. E. Humphreys, Introduction to Lie Algebras and Representation Theory (Springer, New York, 1972).
A. W. Knapp, Lie Groups Beyond an Introduction vol. 140 (Birkhauser, Boston, 2002).
L. W. Ehrlich, Complex matrix inversion versus real. Commun. ACM. 13:, 561–562 (1970).
G. N. Karystinos, A. P. Liavas, Efficient computation of the binary vector that maximizes a rankdeficient quadratic form. IEEE Trans. Inf. Theory. 56:, 3581–3593 (2010).
E. S. Andersen, Two summation formulae for product sums of binomial coefficients. Math. Scand.1:, 261–262 (1953).
C. D. Meyer, Matrix Analysis and Applied Linear Algebra (SIAM, Philadelphia, 2001).
R. P. Stanley, Enumerative Combinatorics vol. 1, 2nd edn. (Cambridge University Press, New York, 2012).
P. P. Markopoulos, L1PCA Toolbox. https://www.mathworks.com/matlabcentral/fileexchange/64855l1PCAtoolbox. Accessed 18 July 2019.
D. Laptev, RobustPCA. https://github.com/dlaptev. Accessed 18 July 2019.
D. Day, M. A. Heroux, Solving complexvalued linear systems via equivalent real formulations. SIAM J. Sci. Comput.23(2), 480–498 (2001).
Acknowledgements
The authors would like to thank the U.S. NSF, the U.S. AFOSR, and the Ministry of Education, Research, and Religious Affairs of Greece for their support in this research work.
Funding
This work was supported in part by the U.S. National Science Foundation (NSF) under grants CNS1117121, ECCS1462341, and OAC1808582, the U.S. Air Force Office of Scientific Research (AFOSR) under the Dynamic Data Driven Applications Systems (DDDAS) program, and the Ministry of Education, Research, and Religious Affairs of Greece under Thales Program Grant MIS379418DISCO.
Author information
Authors and Affiliations
Contributions
Authors’ contributions
All the authors have contributed to the presented theoretical, algorithmic, and numerical results, as well as the drafting of the manuscript. All the authors read and approved the final manuscript.
Authors’ information
Panos P. Markopoulos (pxmeee@rit.edu) is an Assistant Professor with the Dept. of Electrical and Microelectronic Engineering, Rochester Institute of Technology, Rochester, NY, USA. Nicholas Tsagkarakis (nikolaos.tsagkarakis@ericsson.com) is a System Engineer with Ericsson, Gothenburg, Sweden. Dimitris A. Pados (dpados@fau.edu) is an ISENSE Fellow and Professor with the Dept. of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL, USA. George N. Karystinos (karystinos@telecom.tuc.gr) is an Professor with the School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable. The manuscript does report any studies involving human participants, human data, or human tissue.
Consent for publication
Not applicable. The manuscript does not contain any individual person’s data.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Markopoulos, P.P., Tsagkarakis, N., Pados, D.A. et al. Realified L1PCA for directionofarrival estimation: theory and algorithms. EURASIP J. Adv. Signal Process. 2019, 30 (2019). https://doi.org/10.1186/s1363401906255
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s1363401906255