3.1 Temporal smoothing
The authors of [4] found that at most two arbitrary electromagnetic sources can be uniquely identified by a single vector sensor. That is, the data matrix in (11) is rank deficient whenever the number of incoming signals exceeds two. In this subsection, we apply the temporal smoothing technique [26] to address this rank deficiency problem and show that, under certain conditions, temporal smoothing restores the rank of the data matrix.
Define a 6×N data matrix \(\mathbf{Z} = [\mathbf{z}(\Delta_{T}), \cdots, \mathbf{z}(N\Delta_{T})]\), where \(\mathbf{z}(\Delta_{T}), \mathbf{z}(2\Delta_{T}), \cdots, \mathbf{z}(N\Delta_{T})\) are the N snapshots sampled at time instants \(\Delta_{T}, 2\Delta_{T}, \cdots, N\Delta_{T}\), respectively. For simplicity of analysis, we neglect the noise terms. Then, we define P temporally shifted data subsets of Z, each containing N−P+1 data samples. The first and the pth temporally shifted data subsets can be expressed as
$$\begin{array}{@{}rcl@{}} \mathbf{Z}_{1} & = & [\mathbf{z}(\Delta_{T}), \cdots, \mathbf{z}((N-P+1)\Delta_{T})] \\ & = & \mathbf{M}\mathbf{A} \underbrace{\left[ \begin{array}{cccc} \beta_{1,1}e^{j (\omega_{1,1} \Delta_{T} + \psi_{1,1})} & \beta_{1,1}e^{j (\omega_{1,1} 2 \Delta_{T} + \psi_{1,1})} & \cdots & \beta_{1,1}e^{j (\omega_{1,1} (N-P+1) \Delta_{T} + \psi_{1,1})} \\ \beta_{1,2}e^{j (\omega_{1,2} \Delta_{T} + \psi_{1,2})} & \beta_{1,2}e^{j (\omega_{1,2} 2 \Delta_{T} + \psi_{1,2})} & \cdots & \beta_{1,2}e^{j (\omega_{1,2} (N-P+1) \Delta_{T} + \psi_{1,2})} \\ \vdots & \vdots & \ddots & \vdots \\ \beta_{K,1}e^{j (\omega_{K,1} \Delta_{T} + \psi_{K,1})} & \beta_{K,1}e^{j (\omega_{K,1} 2 \Delta_{T} + \psi_{K,1})} & \cdots & \beta_{K,1}e^{j (\omega_{K,1} (N-P+1) \Delta_{T} + \psi_{K,1})} \\ \beta_{K,2}e^{j (\omega_{K,2} \Delta_{T} + \psi_{K,2})} & \beta_{K,2}e^{j (\omega_{K,2} 2 \Delta_{T} + \psi_{K,2})} & \cdots & \beta_{K,2}e^{j (\omega_{K,2} (N-P+1) \Delta_{T} + \psi_{K,2})} \\ \end{array} \right]}_{\mathbf{S}^{T}} \end{array} $$
(13)
$$\begin{array}{@{}rcl@{}} \mathbf{Z}_{p} & = & [\mathbf{z}(p\Delta_{T}), \cdots, \mathbf{z}((N-P+p)\Delta_{T})] \\ & = & \mathbf{M}\mathbf{A} \left[ \begin{array}{cccc} \beta_{1,1}e^{j (\omega_{1,1} p\Delta_{T} + \psi_{1,1})} & \beta_{1,1}e^{j (\omega_{1,1} (p+1) \Delta_{T} + \psi_{1,1})} & \cdots & \beta_{1,1}e^{j (\omega_{1,1} (N-P+p) \Delta_{T} + \psi_{1,1})} \\ \beta_{1,2}e^{j (\omega_{1,2} p\Delta_{T} + \psi_{1,2})} & \beta_{1,2}e^{j (\omega_{1,2} (p+1) \Delta_{T} + \psi_{1,2})} & \cdots & \beta_{1,2}e^{j (\omega_{1,2} (N-P+p) \Delta_{T} + \psi_{1,2})} \\ \vdots & \vdots & \ddots & \vdots \\ \beta_{K,1}e^{j (\omega_{K,1} p\Delta_{T} + \psi_{K,1})} & \beta_{K,1}e^{j (\omega_{K,1} (p+1) \Delta_{T} + \psi_{K,1})} & \cdots & \beta_{K,1}e^{j (\omega_{K,1} (N-P+p) \Delta_{T} + \psi_{K,1})} \\ \beta_{K,2}e^{j (\omega_{K,2} p\Delta_{T} + \psi_{K,2})} & \beta_{K,2}e^{j (\omega_{K,2} (p+1) \Delta_{T} + \psi_{K,2})} & \cdots & \beta_{K,2}e^{j (\omega_{K,2} (N-P+p) \Delta_{T} + \psi_{K,2})} \\ \end{array} \right] \\ & = & \mathbf{M}\mathbf{A} \underbrace{\left[ \begin{array}{ccccc} e^{j \omega_{1,1} (p - 1)\Delta_{T}} & & & & \\ & e^{j \omega_{1,2} (p - 1)\Delta_{T}} & & & \\ & & \ddots & & \\ & & & e^{j \omega_{K,1} (p - 1)\Delta_{T}} & \\ & & & & e^{j \omega_{K,2} (p - 1)\Delta_{T}} \\ \end{array} \right]}_{\boldsymbol{\Phi}_{p}} \mathbf{S}^{T} \end{array} $$
(14)
where
$$ \begin{aligned} {}\boldsymbol{\Phi}_{p} = \text{diag}&\left[e^{j\omega_{1,1} (p-1)\Delta_{T}}, e^{j\omega_{1,2} (p-1)\Delta_{T}}, \cdots,\right.\\ &\quad\left. e^{j\omega_{K,1} (p-1)\Delta_{T}}, e^{j\omega_{K,2} (p-1)\Delta_{T}} \right] \end{aligned} $$
(15)
is a diagonal matrix that depends only on the temporal shift (p − 1)Δ_T and the frequencies of the signals, and
$$\begin{array}{@{}rcl@{}} \mathbf{S} = [\!\mathbf{s}(\Delta_{T}), \cdots, \mathbf{s}((N - P + 1)\Delta_{T})]^{T} \end{array} $$
(16)
is an (N−P+1)×2K signal matrix. Then, for p = 1,⋯,P, we have P different data sets \(\{\mathbf{Z}_{1}, \cdots, \mathbf{Z}_{P}\}\). Note that these P data sets differ from one another because the matrices \(\boldsymbol{\Phi}_{p}\) differ from one set to another. Next, the 6P×(N−P+1) temporally smoothed data matrix is defined by stacking \(\mathbf{Z}_{p}\) for p = 1,⋯,P as
$$\begin{array}{@{}rcl@{}} \mathbf{Z}_{\text{TS}} = \left[\mathbf{Z}_{1}^{T}, \cdots, \mathbf{Z}_{P}^{T}\right]^{T} \end{array} $$
(17)
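For concreteness, the following is a minimal NumPy sketch (not part of the original derivation) of how the temporally smoothed matrix in (17) can be assembled from a 6×N snapshot matrix Z by stacking the P shifted subsets defined in (13) and (14); the function name and arguments are illustrative.

```python
# Minimal sketch; assumes Z is a 6 x N complex snapshot matrix and P is the number of shifts.
import numpy as np

def temporal_smoothing(Z: np.ndarray, P: int) -> np.ndarray:
    """Stack the P temporally shifted subsets of Z into the 6P x (N-P+1) matrix Z_TS of (17)."""
    N = Z.shape[1]
    L = N - P + 1                              # columns per shifted subset
    # Z_p (p = 1, ..., P) collects the snapshots at p*dT, ..., (N-P+p)*dT
    blocks = [Z[:, p:p + L] for p in range(P)]
    return np.vstack(blocks)                   # Z_TS, cf. (17)
```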
Theorem 1
If P≥2K and N≥4K−1, then the temporally smoothed data matrix ZTS is of full rank 2K.
Proof
The matrix ZTS can be expressed in a column-wise Kronecker matrix product form as
$$\begin{array}{@{}rcl@{}} \mathbf{Z}_{\text{TS}}~=~(\boldsymbol{\Psi} \odot \mathbf{MA}) \mathbf{S}^{T} \end{array} $$
(18)
where
$$ {{\boldsymbol{\Psi} = \left[ \begin{array}{ccccc} 1 & 1 & \cdots & 1 & 1 \\ e^{j\omega_{1,1} \Delta_{T}} & e^{j\omega_{1,2} \Delta_{T}} & \cdots & e^{j\omega_{K,1} \Delta_{T}} & e^{j\omega_{K,2} \Delta_{T}} \\ \vdots & \vdots & \cdots & \vdots & \vdots \\ e^{j\omega_{1,1} (P-1) \Delta_{T}} & e^{j\omega_{1,2} (P-1) \Delta_{T}} & \cdots & e^{j\omega_{K,1} (P-1) \Delta_{T}} & e^{j\omega_{K,2} (P-1) \Delta_{T}} \\ \end{array} \right]}} $$
(19)
Since all the signals are assumed to be IP and have distinct frequencies, the Vandermonde matrix S is of full column rank 2K if and only if (N − P + 1) ≥ 2K. Next, by results in [27], we have
$$\begin{array}{@{}rcl@{}} \text{rank}(\boldsymbol{\Psi} \odot \mathbf{MA})~\leq~\min\{2K, \text{rank}(\boldsymbol{\Psi})\cdot\text{rank}(\mathbf{MA})\} \end{array} $$
and a sufficient condition for equality is to have Ψ and/or MA tall and full rank. Then, if P ≥ 2K, the Vandermonde matrix Ψ is tall and is of rank 2K. In this case,
$$\begin{array}{@{}rcl@{}} \text{rank}(\boldsymbol{\Psi} \odot \mathbf{MA})~=~\min\{2K, 2K\cdot\mbox{rank}(\mathbf{MA})\}~=~2K \end{array} $$
Finally, combining P ≥ 2K with (N − P + 1) ≥ 2K gives N ≥ 4K − 1. Hence, if P ≥ 2K and N ≥ 4K − 1, ZTS is of full rank 2K, since (Ψ⊙MA) is of full column rank and ST is of full row rank. This concludes the proof. □
Theorem 1 establishes sufficient but not necessary conditions for constructing temporally smoothed matrices to resolve K IP monochromatic signals with a single vector sensor. Specifically, on the basis of Theorem 1, an arbitrarily large number of uncorrelated signals with distinct frequencies may potentially be resolved as N approaches infinity.
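As a purely numerical illustration of Theorem 1 (under assumed parameters, not taken from the paper), the snippet below draws a random 6×2K matrix in place of MA, generates 2K complex exponentials with distinct frequencies, and compares the rank of the plain data matrix with that of the temporally smoothed one for K = 4.

```python
# Numerical illustration of Theorem 1 with assumed parameters (K, P, N, dT).
import numpy as np

rng = np.random.default_rng(0)
K, P, N, dT = 4, 8, 64, 1e-3                          # P >= 2K and N >= 4K - 1
omega = 2 * np.pi * rng.uniform(50.0, 450.0, 2 * K)   # distinct frequencies (rad/s)
t = dT * np.arange(1, N + 1)
S = np.exp(1j * np.outer(t, omega))                   # N x 2K signal matrix, cf. (16)
Q = rng.standard_normal((6, 2 * K)) + 1j * rng.standard_normal((6, 2 * K))  # stands in for MA
Z = Q @ S.T                                           # 6 x N, rank at most 6 < 2K
Z_TS = np.vstack([Z[:, p:p + N - P + 1] for p in range(P)])
print(np.linalg.matrix_rank(Z), np.linalg.matrix_rank(Z_TS))  # expected: 6 and 2K = 8
```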
3.2 Angle and mutual coupling matrix estimation
In this subsection, we propose an ESPRIT-based algorithm to estimate the angles and the coupling matrix from the data matrix ZTS. For analytical purposes, we consider the ideal noiseless case. Let \(\mathbf{E}_{s}\) be the 6P × 2K signal-subspace eigenvector matrix, whose columns are the 6P × 1 signal-subspace eigenvectors associated with the 2K largest eigenvalues of \(\mathbf {Z}_{\text {TS}}\mathbf {Z}_{\text {TS}}^{H}\). Using the basic idea of ESPRIT [28], we have
$$\begin{array}{@{}rcl@{}} \mathbf{E}_{s}~=~(\boldsymbol{\Psi} \odot \mathbf{M}\mathbf{A}) \mathbf{T} = \mathbf{B} \mathbf{T} \end{array} $$
(20)
where B = Ψ⊙MA, and T is a unique 2K × 2K non-singular matrix. Next, define the following two selection matrices
$$\begin{array}{@{}rcl@{}}{} \mathbf{J}_{1}=\ [\!\mathbf{I}_{6P~-~6}, \mathbf{0}_{(6P~-~6) \times 6}], \mathbf{J}_{2} = [\!\mathbf{0}_{(6P~-~6)~\times~6}, \mathbf{I}_{6P~-~6}] \end{array} $$
(21)
and let B1 = J1B and B2 = J2B. The shift invariance structure in B indicates that
$$\begin{array}{@{}rcl@{}} \mathbf{B}_{2}~=~\mathbf{B}_{1} \boldsymbol{\Phi} \end{array} $$
(22)
where
$$\begin{array}{@{}rcl@{}} \boldsymbol{\Phi}~=~\text{diag}\left[\!e^{j\omega_{1,1}\Delta_{T}}, e^{j\omega_{1,2} \Delta_{T}}, \cdots, e^{j\omega_{K,1}\Delta_{T}}, e^{j\omega_{K,2} \Delta_{T}}\right] \end{array} $$
(23)
From (20) and (22), we obtain
$$\begin{array}{@{}rcl@{}} \mathbf{T}^{-1} \boldsymbol{\Phi} \mathbf{T}~=~\mathbf{E}_{1}^{\dag} \mathbf{E}_{2} \end{array} $$
(24)
where \(\mathbf{E}_{1} = \mathbf{J}_{1}\mathbf{E}_{s}\) and \(\mathbf{E}_{2} = \mathbf{J}_{2}\mathbf{E}_{s}\). Consequently, the ESPRIT eigenvalues, i.e., the eigenvalues of \(\mathbf {E}_{1}^{\dag } \mathbf {E}_{2}\), equal the diagonal elements of Φ, and the corresponding right eigenvectors constitute the columns of \(\mathbf{T}^{-1}\). Thus, the matrix B1 may be estimated as
$$\begin{array}{@{}rcl@{}} \hat{\mathbf{B}}_{1} = \frac{1}{2}\left\{\mathbf{E}_{1} \mathbf{T}^{-1} + \mathbf{E}_{2} \mathbf{T}^{-1} \boldsymbol{\Phi}^{-1}\right\} \end{array} $$
(25)
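A hedged NumPy sketch of steps (20)–(25) follows; it is an illustrative implementation rather than the authors' code. The signal subspace is taken from the sample matrix Z_TS Z_TS^H, and the eigen-pairs of E_1^† E_2 yield the diagonal of Φ and T^{-1}, both only up to the column scaling and ordering ambiguities discussed later.

```python
# Sketch of (20)-(25); assumes Z_TS is the 6P x (N-P+1) smoothed data matrix and K is known.
import numpy as np

def esprit_step(Z_TS: np.ndarray, K: int):
    """Return (phi, Tinv, B1_hat): ESPRIT eigenvalues, eigenvector matrix T^{-1}, and B1 estimate."""
    w, V = np.linalg.eigh(Z_TS @ Z_TS.conj().T)
    Es = V[:, np.argsort(w)[::-1][:2 * K]]               # signal subspace E_s (6P x 2K), cf. (20)
    E1, E2 = Es[:-6, :], Es[6:, :]                       # rows kept by J_1 and J_2 in (21)
    phi, Tinv = np.linalg.eig(np.linalg.pinv(E1) @ E2)   # eigen-pairs of E_1^dagger E_2, cf. (24)
    # phi[k] estimates e^{j omega Delta_T}; columns of Tinv estimate T^{-1} up to scale/order
    B1_hat = 0.5 * (E1 @ Tinv + (E2 @ Tinv) / phi)       # cf. (25); the division applies Phi^{-1}
    return phi, Tinv, B1_hat
```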
Note that the matrix B1 has the form
$$\begin{array}{@{}rcl@{}} \mathbf{B}_{1} & = & \left[\mathbf{Q}_{1}^{T}, \mathbf{Q}_{2}^{T}, \cdots, \mathbf{Q}_{P-1}^{T} \right]^{T} \\ & = & \left[\mathbf{Q}^{T}, (\mathbf{Q}\boldsymbol{\Phi})^{T}, \cdots, \left(\mathbf{Q}\boldsymbol{\Phi}^{(P-2)}\right)^{T} \right]^{T} \end{array} $$
(26)
where Q = MA. Therefore, the matrix Q can be estimated from \(\hat {\mathbf {B}}_{1}\) as
$$\begin{array}{@{}rcl@{}} \hat{\mathbf{Q}}~=~\frac{1}{P-1} \sum_{p=1}^{P-1} \mathbf{Q}_{p} \boldsymbol{\Phi}^{-(p-1)} \end{array} $$
(27)
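In code, (26)–(27) amount to back-rotating each 6×2K block of the estimated B1 and averaging; a sketch under the same assumptions as above (phi denotes the ESPRIT eigenvalues from the previous step) is:

```python
# Sketch of (26)-(27); B1_hat is 6(P-1) x 2K and phi holds the diagonal of Phi.
import numpy as np

def estimate_Q(B1_hat: np.ndarray, phi: np.ndarray) -> np.ndarray:
    """Average the back-rotated 6 x 2K blocks of B1_hat to estimate Q = MA, cf. (27)."""
    n_blocks = B1_hat.shape[0] // 6                        # equals P - 1
    Q_hat = np.zeros((6, phi.size), dtype=complex)
    for p in range(n_blocks):                              # p = 0 corresponds to Q_1
        Q_hat += B1_hat[6 * p:6 * (p + 1), :] * phi ** (-p)   # Q_p Phi^{-(p-1)}
    return Q_hat / n_blocks
```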
It should be pointed out that the estimated \(\hat {\mathbf {Q}}\) suffers from unknown scaling ambiguities in its columns. That is, the columns of the estimated \(\hat {\mathbf {Q}}\) in fact satisfy
$$\begin{array}{@{}rcl@{}} \mathbf{q}_{2k~-~1}~=~\alpha_{1,k} \mathbf{C} \mathbf{v}_{1,k}, \ \ \mathbf{q}_{2k}~=~\alpha_{2,k} \mathbf{C} \mathbf{v}_{2,k} \end{array} $$
(28)
$$\begin{array}{@{}rcl@{}} \bar{\mathbf{q}}_{2k~-~1}~=~\bar \alpha_{1,k} \mathbf{C} \mathbf{v}_{2,k}, \ \ \bar{\mathbf{q}}_{2k}~=~\bar \alpha_{2,k} \mathbf{C} \mathbf{v}_{1,k} \end{array} $$
(29)
where \(\mathbf{q}_{2k-1}\), \(\mathbf{q}_{2k}\) and \(\bar{\mathbf{q}}_{2k-1}\), \(\bar{\mathbf{q}}_{2k}\), k = 1,⋯,K, respectively, denote the top three and bottom three rows of the (2k − 1)th and (2k)th columns of \(\hat {\mathbf {Q}}\), and \(\alpha_{i,j}\) and \(\bar \alpha _{i,j}\), i = 1,2, j = 1,⋯,K, represent the unknown scalars. Note that since \(q_{k} \neq 1\), \(\alpha_{i,j}\) is in general unequal to \(\bar \alpha _{i,j}\).
The scaling ambiguities can be easily eliminated in the proposed method. Using q2k = α2,kCv2,k, we can form the following three equations:
$$\begin{array}{*{20}l} \alpha_{2,k} (-~c_{1}\sin \phi_{k}~+~c_{2} \cos \phi_{k}) &~=~& q_{2k,1} \end{array} $$
(30)
$$\begin{array}{*{20}l} \alpha_{2,k} (-~c_{2}\sin \phi_{k}~+~c_{1} \cos \phi_{k}) &~=~& q_{2k,2} \end{array} $$
(31)
$$\begin{array}{*{20}l} \alpha_{2,k} (-~c_{2}\sin \phi_{k}~+~c_{2} \cos \phi_{k}) &~=~& q_{2k,3} \end{array} $$
(32)
where q2k,1, q2k,2, and q2k,3 are, respectively, the first, second, and third entries of q2k. Solving these three equations yields the azimuth angle and coupling coefficient estimates:
$$\begin{array}{@{}rcl@{}} \hat \phi_{k}~=~\arctan\left(\frac{q_{2k,1}~-~q_{2k,3}}{q_{2k,3}~-~q_{2k,2}}\right) \end{array} $$
(33)
$$ {{\begin{aligned} {}\hat c~=~c_{2}/c_{1}~&=~\frac{1}{2}\left(\frac{q_{2k,1} \cos \hat \phi_{k}~+~q_{2k,2} \sin \hat \phi_{k}}{q_{2k,1} \sin \hat \phi_{k}~+~q_{2k,2} \cos \hat \phi_{k}}\right.\\ &\left.\quad+\frac{q_{2k,3} \cos \hat \phi_{k}} {q_{2k,2} \cos \hat \phi_{k}~+~q_{2k,3} \sin \hat \phi_{k}~-~q_{2k,2} \sin \hat \phi_{k}}\right) \end{aligned}}} $$
(34)
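The closed forms (33) and (34) can be evaluated directly from the first three entries of the (2k)th column of the estimated Q; a hedged sketch follows, where the common complex scale α_{2,k} cancels in each ratio so only the real part is retained (variable names are illustrative).

```python
# Sketch of (33)-(34); q2k holds the top three entries of the (2k)-th column of Q_hat.
import numpy as np

def azimuth_and_coupling(q2k: np.ndarray):
    """Return (phi_hat, c_hat) from q_{2k}, cf. (33) and (34)."""
    q1, q2, q3 = q2k[0], q2k[1], q2k[2]
    phi_hat = np.arctan(np.real((q1 - q3) / (q3 - q2)))            # (33); alpha_{2,k} cancels
    s, c = np.sin(phi_hat), np.cos(phi_hat)
    c_hat = 0.5 * (np.real((q1 * c + q2 * s) / (q1 * s + q2 * c))
                   + np.real(q3 * c / (q2 * c + q3 * s - q2 * s)))  # (34)
    return phi_hat, c_hat
```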
With the estimation of the coupling coefficient \(\hat c\), we can construct an estimate of the mutual coupling matrix \(\hat {\mathbf {C}}\) as
$$\begin{array}{@{}rcl@{}} \hat{\mathbf{C}}~=~\left[ \begin{array}{ccc} 1 & \hat c & \hat c \\ \hat c & 1 & \hat c \\ \hat c & \hat c & 1 \\ \end{array} \right] \end{array} $$
(35)
It is easy to see that the matrix product \(\hat {\mathbf {C}}^{-1} \mathbf {C}\) becomes a scaled identity matrix. This means that the mutual coupling coefficients, which constitute the non-diagonal elements of C, are completely eliminated. With the estimates \(\hat \phi _{k}\) and \(\hat c\), using \(\bar {\mathbf {q}}_{2k}~=~\bar \alpha _{2,k} \mathbf {C} \mathbf {v}_{1,k}\), we can form the following three equations:
$$ \begin{aligned} {}\bar \alpha_{2,k} \left(\cos \hat \phi_{k} \cos \theta~+~\hat c \sin \hat \phi_{k} \cos \theta~-~\hat c \sin \theta\right) = \bar q_{2k,1} \\ \end{aligned} $$
(36)
$$ \begin{aligned} {}\bar \alpha_{2,k} \left(\hat c \cos \hat \phi_{k} \cos \theta~+~\sin \hat \phi_{k} \cos \theta~-~\hat c \sin \theta\right)=\bar q_{2k,2} \\ \end{aligned} $$
(37)
$$ \begin{aligned} {}\bar \alpha_{2,k} \left(\hat c \cos \hat \phi_{k} \cos \theta~+~\hat c \sin \hat \phi_{k} \cos \theta~-~\sin \theta\right)=\bar q_{2k,3} \end{aligned} $$
(38)
Solving these three equations leads to the elevation angle estimates
$$ \begin{aligned} {}\hat \theta_{k}&=\arctan\left(\hat c \left(\cos \hat \phi_{k}~+~\sin \hat \phi_{k}\right) - \frac{\bar q_{2k,3}}{\bar q_{2k,1} - \bar q_{2k,2}} \right.\\ &\quad\times\left.(1~-~\hat c) \left(\cos \hat \phi_{k}~-~\sin \hat \phi_{k}\right)\right) \end{aligned} $$
(39)
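Given the azimuth and coupling estimates, (39) uses the bottom three entries of the (2k)th column of the estimated Q; a corresponding sketch (illustrative names, same assumptions as above) is:

```python
# Sketch of (39); qbar2k holds the bottom three entries of the (2k)-th column of Q_hat.
import numpy as np

def elevation(qbar2k: np.ndarray, phi_hat: float, c_hat: float) -> float:
    """Return theta_hat for the k-th signal, cf. (39)."""
    r1, r2, r3 = qbar2k[0], qbar2k[1], qbar2k[2]
    ratio = np.real(r3 / (r1 - r2))                 # common scale alpha-bar_{2,k} cancels here
    s, c = np.sin(phi_hat), np.cos(phi_hat)
    return np.arctan(c_hat * (c + s) - ratio * (1 - c_hat) * (c - s))
```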
Note that the estimates \(\hat \theta _{k}\) and \(\hat \phi _{k}\) are automatically paired without any additional processing.
In practice, apart from the scaling ambiguities, the estimated \(\hat {\mathbf {Q}}\) may also suffer from permutation ambiguities. In this case, \(\mathbf{q}_{2k}\) may not be the estimate of \(\alpha_{2,k} \mathbf{C} \mathbf{v}_{2,k}\). Thus, the estimates \(\hat \phi _{k}\) and \(\hat c\) obtained from \(\mathbf{q}_{2k}\) by using (33) and (34) may be erroneous, which may in turn lead to an erroneous estimate \(\hat \theta _{k}\). Unlike the scaling ambiguities, the permutation ambiguities cannot be eliminated in the same manner. Here, we provide a solution to this permutation ambiguity problem as follows: first, for all k = 1,⋯,2K, obtain a set of 2K different azimuth angle estimates from \(\mathbf{q}_{k}\). Each of these 2K azimuth angle estimates is then used to produce its own coupling coefficient and elevation angle estimates. Thus, the kth azimuth angle, elevation angle, and coupling coefficient estimates are automatically matched. Only K of these estimates are true estimates. Theoretically, the K true coupling coefficient estimates are identical, while the K erroneous coupling coefficient estimates are, in general, distinct from one another and from the K true estimates. Therefore, we can take homogeneity of the coupling coefficient estimates as a criterion for determining the true estimates of the angles and coupling coefficients, i.e., we take the set of K angle estimates associated with K identical coupling coefficient estimates as the true estimates. Without loss of generality, assume that the first K estimates are true and the last K estimates are erroneous; then, we have \(\hat c_{1} = \cdots = \hat c_{K}~=~\hat c~\neq ~\hat c_{K~+~1} \neq \cdots ~\neq ~\hat c_{2K}\). Finally, we obtain the estimates \((\hat \theta _{k}, \hat \phi _{k}), k~=~1, \cdots, K\) as the angle estimates of the K signals.
3.3 Remarks
In the presence of noise, the estimation procedures in Section 3.2 become approximate. Specifically, with noise, the K true coupling coefficient estimates are in general no longer identical. Nevertheless, we can search for the set of K coupling coefficient estimates with the “most similar values” and treat them as the “identical” estimates.
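One possible implementation of this selection rule (not prescribed by the paper) is to evaluate all 2K candidate triples and keep the K whose coupling coefficient estimates exhibit the smallest spread:

```python
# Possible selection rule (assumption): keep the K candidates whose c estimates are most similar.
from itertools import combinations
import numpy as np

def select_true_estimates(phi, theta, c, K: int):
    """Return the K (phi, theta, c) triples whose coupling estimates have the smallest spread."""
    c = np.asarray(c, dtype=float)
    best = min(combinations(range(c.size), K),
               key=lambda idx: np.ptp(c[list(idx)]))   # smallest (max - min) over the subset
    return [(phi[i], theta[i], c[i]) for i in best]
```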
Also note that the vector cross product estimator has been widely used for direction finding with a single vector sensor [2, 3, 7]. However, this estimator cannot be applied directly in the presence of mutual coupling among the vector sensor components. Obviously, with the estimate \(\hat c\), the vector sensor can be calibrated by using the calibration matrix defined as \(\hat {\mathbf {M}}~=~\mathbf {I}_{2} \otimes \hat {\mathbf {C}}^{-1}\). Therefore, the vector cross product estimator can be applied to the calibrated data matrix \(\hat {\mathbf {M}} \mathbf {Z}\) to extract the angle estimates of the incoming signals. Although the proposed method is designed for vector sensors with mutual coupling, it can also be applied to ideal vector sensors, where the measurement of each component is independent of the others.
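Under the assumptions above, this calibration remark can be sketched as follows: the coupling matrix is rebuilt from the estimated ratio as in (35), and its inverse is applied to each triad of components (an illustrative sketch, not the authors' implementation).

```python
# Sketch of the calibration remark; Z is a 6 x N vector-sensor data matrix, c_hat comes from (34).
import numpy as np

def calibrate(Z: np.ndarray, c_hat: float) -> np.ndarray:
    """Apply M_hat = I_2 kron inv(C_hat) to remove the estimated mutual coupling."""
    C_hat = (1.0 - c_hat) * np.eye(3) + c_hat * np.ones((3, 3))   # cf. (35)
    M_hat = np.kron(np.eye(2), np.linalg.inv(C_hat))
    return M_hat @ Z
```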
The proposed method shares all the advantages indicated in [3]. For example, it offers automatically paired azimuth and elevation angle estimates, does not require \(\Delta_{T}\) to satisfy the Nyquist sampling rate, does not need the signal frequencies to be known a priori, and suffers no frequency-DOA ambiguity. It should be noted that the method in [3] assumes CP signals, whereas the proposed method assumes IP ones.
Lastly, it should be pointed out that the application of the ESPRIT technique to vector-sensor mutual coupling calibration has been studied in [22] and [23]. However, the present work differs from these two in that (1) the former require a coupling-free auxiliary vector sensor and the design of a reference signal, whereas the proposed method does not; (2) the former do not apply the temporal smoothing technique to improve the identifiability limit of a vector sensor; and (3) the former assume completely polarized incoming signals, whereas the proposed method considers incompletely polarized ones.