Kalman filter for radio source power and direction of arrival estimation
EURASIP Journal on Advances in Signal Processing volume 2024, Article number: 66 (2024)
Abstract
Images are an important source of information for spacecraft navigation. Based on an image and a known attitude, triangulation techniques (intersection or resection) are often used for positioning and navigation. In the resection problem, an observer estimates its unknown location by using angle measurements to points at known locations (i.e., landmarks), and the localization performance depends on the accuracy of the angle measurements. As a contribution to resection for spacecraft navigation, we consider the dynamic image estimation problem based on radio interferometry, i.e., an image of radio source power, where the measurements are sample covariance matrices (SCMs). Considering the case where several measurements are available as well as a known dynamic linear model of image evolution, a.k.a. a linear state model, the minimum mean-squared error (MMSE) image estimator is given by the Kalman filter (KF) or one of its variants. However, standard Kalman-like filters are not a priori suitable for the problem at hand since the measurements (i.e., SCMs) cannot be formulated analytically as a function of the state parameters to be estimated (i.e., radio source powers). In fact, this lack of an analytical formulation can be circumvented by a statistical linear fitting allowing the SCMs to be expressed in terms of the state. This linear fitting introduces an additive residual noise, equivalent to a measurement noise, whose covariance matrix depends on the current state, a non-standard case for a measurement model. The covariance matrix of the residual noise is derived whatever the distributions of the radio sources and of the additive noise at the sample level, unveiling the contribution of their multivariate kurtosis. The proposed method is evaluated on simulated data representative of a dynamic radio interferometry framework. The results show that the proposed method is capable of effectively tracking moving radio sources in complex scenes with theoretical guarantees when the signal multivariate kurtosis is known.
1 Introduction
Cameras, telescopes, and similar tools provide essential information for today’s spacecraft. These devices serve various purposes, such as guidance, navigation and positioning. Relying on a known object attitude, digital images are used to associate known landmarks in the observed scene with their apparent pixel locations. In the context of guiding spacecraft, this is also known as “angles-only” optical navigation [1]. Triangulation (intersection or resection) from images is used for space navigation tasks such as positioning from known landmarks, angles-only object tracking, and even star-based guidance for interstellar journeys [1]. We place ourselves in the context of resection, i.e., the observer estimates its unknown location by using angle measurements to points at known locations (i.e., landmarks). For instance, using resection, one can determine the position of a spacecraft in cislunar space by observing known satellites, or during interplanetary missions by tracking asteroids and planets [2, 3]. As a contribution to resection for spacecraft navigation, we tackle the dynamic star image estimation problem based on radio interferometry, i.e., an image of radio source power, where the measurements are sample covariance matrices (SCMs), a.k.a. visibility matrices [4]. Indeed, such radio interferometric images estimate both the power and the direction of arrival (DOA) of a radio source, and allow one to identify the DOA of radio sources of known power, provided that the power estimation is accurate enough. We consider the case where several measurements are available as well as a known dynamic linear model of image evolution, a.k.a. a linear state model.
The design and use of state estimation techniques are fundamental in a plethora of applications, such as robotics, tracking, guidance and navigation systems [5,6,7]. For a linear dynamic system, the Kalman filter (KF) is the best linear minimum mean-squared error (MMSE) estimator. The most widespread solution for nonlinear systems is to resort to system linearization, leading to the so-called linearized or extended KF (EKF) [7]. In both cases, as well as for more advanced techniques such as sigma-point filters [8], the main assumption is perfect system knowledge [9]: (i) known process and measurement functions, including their parameters, (ii) known inputs and (iii) known noise statistics (i.e., first- and second-order moments for the KF and EKF). Thus, the KF may not be usable, or may perform poorly, if one (or more) of the above requirements is not met [10].
Consequently, at first sight, Kalman-like filters do not seem to be usable for the problem at hand since, although each individual sample may adhere to a linear parametric model, the finite-horizon SCMs cannot be formulated analytically as a function of the state parameters to be estimated (radio source powers). Fortunately, this lack of an analytical formulation can be circumvented by a statistical linear fitting, at least under the assumptions of: (a) a deterministic dynamic state model (no state noise), and (b) instantaneous linear observations from multiple radio sources in the presence of additive noise (stochastic observation model), when the radio sources are mutually independent and independent from the noise. The proposed linear fitting allows the SCMs to be expressed in terms of the state parameters and introduces an additive residual noise, equivalent to a measurement noise, whose covariance matrix depends on the current state parameters, a non-standard case for a measurement model. The covariance matrix of the residual noise is derived whatever the distributions of the radio sources and of the additive noise at the sample level, unveiling the contribution of their multivariate kurtosis [11], whose value depends on whether the radio source and noise distributions are heavy- or light-tailed. To support the discussion, the proposed method is evaluated on simulated data representative of a dynamic radio interferometric imaging framework.
2 Measurement model
Let us consider a network composed of M antennas receiving signals at consecutive short time integration (STI) intervals \(\left[ t_k, t_k+\epsilon \right]\), \(k\geqslant 0\). During the \(k-\)th STI interval, the observations are i.i.d. realizations of a random vector \({{\textbf{z}}}_k\in {\mathbb {C}}^{M\times 1}\) consisting of a linear mixture of the signals coming from Q radio sources, \({{\textbf{s}}}_k\in {\mathbb {C}}^{Q\times 1}\), in the presence of an additive noise \({\textbf{n}}_k\):
with \({\textbf{A}}_k\in {\mathbb {C}}^{M\times Q}\) the system response matrix, and \({\textbf{s}}_k\) and \({\textbf{n}}_k\) being independent and centered complex circular random vectors. In particular, we consider the case where we only have access to the sample covariance matrix (SCM)
where \({\textbf{z}}_k{(n)}\) is the \(n-\)th realization of \({\textbf{z}}_k\), N is the number of samples and the superscript H denotes the Hermitian (conjugate transpose) operator, i.e., \({\textbf{z}}_k^H\triangleq \left( {\textbf{z}}_k^*\right) ^T\). Asymptotically, i.e., when N tends to infinity, \(\hat{{\textbf{C}}}_{{\textbf{z}}_k}\) converges in probability to
In practice, \({\textbf{A}}_k\) is known and \({\textbf{C}}_{{\textbf{n}}_k}\in \mathbb {C}^{M\times M}\) is known or can be measured to the desired precision.
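For reference, the linear mixture, the SCM and its asymptotic limit described above can be summarized as follows (a reconstruction consistent with the surrounding definitions rather than the original displays):
\[
{\textbf{z}}_k={\textbf{A}}_k{\textbf{s}}_k+{\textbf{n}}_k,\qquad
\hat{{\textbf{C}}}_{{\textbf{z}}_k}=\frac{1}{N}\sum _{n=1}^{N}{\textbf{z}}_k(n){\textbf{z}}_k(n)^H,\qquad
{\textbf{C}}_{{\textbf{z}}_k}={\textbf{A}}_k{\textbf{C}}_{{\textbf{s}}_k}{\textbf{A}}_k^H+{\textbf{C}}_{{\textbf{n}}_k}.
\]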
Considering mutually independent source signals and denoting \({\textbf{x}}_k\in \mathbb {R}^{Q\times 1}_+\) the vector composed of individual source signal powers, a.k.a. intensities, the source signals covariance matrix is diagonal and writes
using the vector-to-matrix \(\text {diag}\) operator. Assuming independence among all \({\textbf{n}}_k\) and \({\textbf{s}}_l\) for \(k,\,l\geqslant 1\), the objective is to estimate the intensities \({\textbf{x}}_k\), under the state model
where the state-transition matrix \({\textbf{F}}_{k-1}{\in \mathbb {R}^{Q\times Q}}\) is known a priori.
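Consistently with the definitions above, the diagonal source covariance and the deterministic state model can be written in reconstructed form as
\[
{\textbf{C}}_{{\textbf{s}}_k}=\text {diag}\left( {\textbf{x}}_k\right) ,\qquad
{\textbf{x}}_k={\textbf{F}}_{k-1}{\textbf{x}}_{k-1}.
\]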
2.1 Linear discrete state-space model
One can construct a linear observation model in the asymptotic case, using the vectorized covariance matrix as observation, i.e.,
with \(*\) being the column-wise Kronecker product, also called the Khatri–Rao product [12]. However, such a linear model does not apply in the non-asymptotic case. In order to circumvent this limitation, we propose to apply a statistical linear fitting model based on (6), where the measurements are the vectorized SCMs, concatenated with their complex conjugates for the sake of optimality (see Appendix 1), i.e.,
At iteration k, the observation model is defined as
and the observation residual is defined as
The observation residual \({\textbf{v}}_k\) corresponds to the approximation error of the vectorized SCM by the linear model \(\textbf{H}_k\textbf{x}_k\).
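For clarity, a formulation consistent with this construction is sketched below (the known noise contribution is written explicitly; this is a sketch, not the paper's exact display):
\[
\text {vec}\left( {\textbf{C}}_{{\textbf{z}}_k}\right) =\left( {\textbf{A}}_k^{*}*{\textbf{A}}_k\right) {\textbf{x}}_k+\text {vec}\left( {\textbf{C}}_{{\textbf{n}}_k}\right) ,\qquad
{\textbf{y}}_k\triangleq \begin{pmatrix}\text {vec}\left( \hat{{\textbf{C}}}_{{\textbf{z}}_k}\right) \\ \text {vec}\left( \hat{{\textbf{C}}}_{{\textbf{z}}_k}\right) ^{*}\end{pmatrix},\qquad
{\textbf{y}}_k={\textbf{H}}_k{\textbf{x}}_k+{\textbf{v}}_k,\quad {\textbf{C}}_{{\textbf{v}}_k}={\textbf{C}}_{{\textbf{v}}_k}\left( {\textbf{x}}_k\right),
\]
with \({\textbf{H}}_k\) built from \({\textbf{A}}_k^{*}*{\textbf{A}}_k\) and its complex conjugate, and where the known term \(\text {vec}\left( {\textbf{C}}_{{\textbf{n}}_k}\right)\) can either be subtracted from the measurement (since \({\textbf{C}}_{{\textbf{n}}_k}\) is known) or absorbed in the mean of \({\textbf{v}}_k\). The distinguishing feature is that the covariance of the residual depends on the state to be estimated.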
In the following, a Kalman filter is derived for the estimation of \({{\textbf{x}}_k}\) based on SCM measurements. \({\textbf{v}}_k\) is considered as the observation noise according to the standard KF formalism. This corresponds to a non-standard linear discrete state space (LDSS) model in which the quantities of interest \({{\textbf{x}}_k}\) are involved in the observation noise. Section 3 designs the KF in accordance with the definition of \({\textbf{v}}_k\), which induces \({{\textbf{x}}_k}\) to appear in the KF algorithm. When needed, in particular in the covariance matrix of \({{\textbf{v}}_k}\), it is proposed to replace \({{\textbf{x}}_k}\) by an estimate in order to preserve the underlying structure.
3 Design of the Kalman filter
3.1 Kalman filter existence
For such an LDSS model, without state noise, one needs to verify that
is null for \(k\geqslant 2\) and \(l<k\) in order to prove the existence of a Kalman filter [9]. The verification is straightforward since \({\textbf{y}}_k\) and \({\textbf{y}}_l\) are independent and \({{\textbf{x}}_k}\) is deterministic, i.e.,
3.2 Identification of first and second order statistics
The next step consists in evaluating the first and second order statistics of the observation noise.
3.2.1 Mean of the observation noise
Since \(\hat{{\textbf{C}}}_{{\textbf{z}}_k}\) is an unbiased estimate of \({\textbf{C}}_{{\textbf{z}}_k}\), one has that
where
is the asymptotic measurement noise. From (9), one obtains that \({\mathbb {E}}[{\textbf{v}}_k]={\mathbb {E}}[{\textbf{y}}_k] - {\textbf{H}}_k{{\textbf{x}}_k}\) and therefore
3.2.2 Covariance of the observation noise
As \({{\textbf{x}}_k}\) is deterministic, one obtains \({\textbf{C}}_{{\textbf{v}}_k}={\textbf{C}}_{{\textbf{y}}_k}\), with
where \(\otimes\) is the usual Kronecker product, such that \({\textbf{z}}_k^{*}\otimes {\textbf{z}}_k={\text {vec}}({\textbf{z}}_k{\textbf{z}}_k^H)\), \({\textbf{z}}_k\otimes {\textbf{z}}_k^*={\text {vec}}({\textbf{z}}_k^*{\textbf{z}}_k^T)\) and \({\textbf{z}}_k(n)\) are N independent replicates of \({\textbf{z}}_k\). Finally, one has that
Based on the independence of the source signals and noise as well as their circularity, Appendix 2 develops the expression of \({\textbf{C}}_{{\textbf{z}}^*\otimes {\textbf{z}}}\) using Kronecker product properties. After dropping the index k in order to shorten expressions, one obtains
for source signals \((s_q)_{q=1,\ldots ,Q}\) and noises \((n_m)_{m=1,\ldots ,M}\) following any distributions, where \(x_q={\mathbb {E}\left[ \left| s_q\right| ^2\right] }\) is the \(q-\)th coordinate of \(\textbf{x}\), \({\textbf{a}}_q\) is the \(q-\)th column of \({\textbf{A}}\) and
is the normalized multivariate kurtosis of \(s_q\) [11].
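For a scalar circular signal, one definition consistent with the values used in the sequel (0 for a circular complex Gaussian variable, 3/2 for independent Laplace real and imaginary parts, see Sect. 4.1) is
\[
\rho _{s_q}=\frac{\mathbb {E}\left[ \left| s_q\right| ^{4}\right] }{\mathbb {E}\left[ \left| s_q\right| ^{2}\right] ^{2}}-2 ,
\]
given here as an indicative reconstruction of the quantity involved.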
Finally, note that for
with \(\left( {\textbf{e}}_1,\ldots ,{\textbf{e}}_M\right)\) the canonical basis of \({\mathbb {R}}^M\), one obtains that
from which the following equality holds:
for any distribution.
Independent noise components In particular, for a vector \(\textbf{n}\) composed of independent variables \(n_m\), Appendix 3 shows that the kurtosis of the noise also appears in \(\textbf{C}_{\textbf{n}^{*}\otimes \textbf{n}}\), which is a diagonal matrix such that
where
and \(\sigma _{n_m}^2\) are respectively the normalized multivariate kurtosis and the variance of \(n_m\).
Gaussian noise distribution For instance, for \(n_m\) such that \({\mathfrak {Re}}(n_m)\) and \({\mathfrak {Im}}(n_m)\) are independent and follow the same distribution \({\mathcal {D}}\),
Since the kurtosis of a univariate real-valued Gaussian random variable is 3, \(\rho _{n_m}=0\) for Gaussian noise, but in general \(\rho _{n_m}\ne 0\). Hence, for centered and Gaussian complex circular noise, one has that
from which
Gaussian noise and signal distributions The expression simplifies further for centered and Gaussian complex circular signal and noise, for which
The latter can be proven by remarking that \(\rho _{s_q}=0\) in (26), or following Appendix 3 since \(\textbf{z}\) is a vector of centered and Gaussian complex circular random variables.
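Presumably, this simplification relies on the standard fourth-order identity for a centered circular complex Gaussian vector \({\textbf{z}}\) (up to the \(1/N\) factor introduced by the SCM averaging and the conjugate-augmented block structure):
\[
\mathbb {E}\left[ \left( {\textbf{z}}^{*}\otimes {\textbf{z}}\right) \left( {\textbf{z}}^{*}\otimes {\textbf{z}}\right) ^{H}\right] -\text {vec}\left( {\textbf{C}}_{{\textbf{z}}}\right) \text {vec}\left( {\textbf{C}}_{{\textbf{z}}}\right) ^{H}={\textbf{C}}_{{\textbf{z}}}^{T}\otimes {\textbf{C}}_{{\textbf{z}}}.
\]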
3.3 Kalman filter recursion
At iteration k, the current estimate writes
where state and measurement predictions are
The observation noise covariance \(\textbf{C}_{\textbf{v}_k}\) is unknown since it depends on the current state \(\textbf{x}_k\) that has to be estimated. An estimate \(\widehat{{\textbf{C}}}_{{\textbf{v}}_k}\) is used instead, from which it follows that the innovation covariance matrix \({\textbf{S}}_{k|k-1}{\triangleq \textbf{C}_{\textbf{y}_k - \widehat{{\textbf{y}}}_{k|k-1}}}\), and hence the predicted and a posteriori estimate error covariance matrices, i.e., \({\textbf{P}}_{k|k-1}{\triangleq \textbf{C}_{\textbf{x}_k - \widehat{{\textbf{x}}}_{k|k-1}}}\) and \({\textbf{P}}_{k|k}{\triangleq \textbf{C}_{\textbf{x}_k - \widehat{{\textbf{x}}}_{k|k}}}\) respectively, are themselves estimated. The optimal Kalman gain \({\textbf{K}}_k\) is computed with the recursion
The measurement noise covariance estimate is constructed from the state prediction \(\widehat{{\textbf{x}}}_{k|k-1}\). Since there is no non-negativity constraint on the estimation, one expects \(\widehat{\textbf{x}}_{k|k}\) and \(\widehat{\textbf{x}}_{k|k-1}\) to fluctuate around the mean value \({\textbf{x}}_{k}\), so that the KF estimates can include negative intensities. Typically, this corresponds to actual low or null intensities, and those negative estimates are kept in the recursion. On the other hand, one needs to preserve the positiveness of the estimated measurement noise covariance matrix, that is, to use a projection of \(\widehat{{\textbf{x}}}_{k|k-1}\) onto a certain domain \(D\subseteq {\mathbb {R}}_+^{{Q\times 1}}\) (which amounts to applying a threshold to the coordinates of \(\widehat{{\textbf{x}}}_{k|k-1}\)). Hence, the considered noise covariance estimate is \(\widehat{{\textbf{C}}}_{{\textbf{v}}_k}(\pi _{D}\left( \widehat{{\textbf{x}}}_{k|k-1}\right) )\), where \(\pi _{D}\) is the projector onto D. Even though it is crucial that negative values of \(\widehat{\textbf{x}}_{k|k}\) and \(\widehat{\textbf{x}}_{k|k-1}\) are retained in the recursion (29) to fit with the analytic recursion (30) [5], the estimate that must be considered in physical applications is indeed its thresholded version \(\pi _D\left( \widehat{\textbf{x}}_{k|k}\right)\) (as in Sect. 4).
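As an illustration only, one iteration of such a recursion can be sketched as follows (Python/NumPy, real-valued for readability whereas the paper's measurements are complex and conjugate-augmented; `measurement_noise_cov` is a hypothetical placeholder and does not implement the actual expression (16)):

```python
import numpy as np

def project_positive(x):
    """Projection pi_D onto the nonnegative orthant (simple thresholding)."""
    return np.maximum(x, 0.0)

def measurement_noise_cov(x_proj, H, N):
    """Placeholder for the closed-form C_v(x) of Sect. 3.2 (Eq. (16)).
    This crude diagonal surrogate only mimics two facts: the covariance
    depends on the state and scales as 1/N with the sample size."""
    scale = (1.0 + float(np.sum(x_proj))) ** 2 / N
    return scale * np.eye(H.shape[0])

def kf_iteration(x_est, P_est, y, F, H, N):
    """One predict/update cycle with a state-dependent measurement noise covariance."""
    # Prediction (deterministic state model, no process noise)
    x_pred = F @ x_est
    P_pred = F @ P_est @ F.T
    # Noise covariance rebuilt at the *projected* prediction pi_D(x_pred)
    R_hat = measurement_noise_cov(project_positive(x_pred), H, N)
    # Innovation covariance, gain and update
    S = H @ P_pred @ H.T + R_hat
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_upd = x_pred + K @ (y - H @ x_pred)        # negative intensities are kept
    P_upd = (np.eye(len(x_pred)) - K @ H) @ P_pred
    return x_upd, P_upd
```

The design choice illustrated here is that the projection \(\pi _D\) is applied only inside the noise covariance estimate, while the state estimate itself is left unthresholded in the recursion, in accordance with the discussion above.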
3.4 Performance analysis
Since the measurement noise covariance matrix \({{\textbf{C}}}_{{\textbf{v}}_k}\) involves the actual state to be estimated and is thus unknown, the proposed KF is misspecified. The performance obtained by applying the designed KF to the actual signal model is suboptimal with respect to the MSE, i.e., compared to what the KF could ideally achieve with perfect knowledge of the system (i.e., when \({{\textbf{C}}}_{{\textbf{v}}_k}\) is known). Following the discussion in [13], the performance obtained by applying the designed KF to the actual signal model is
and can be estimated by Monte Carlo simulations (see Note 1). The error covariance matrix \(\widehat{\textbf{P}}_{k|k}\) computed by the KF recursion (30) is an estimate of \({\textbf{P}}_{k|k}^\text {p}\). Typically, a poor-quality estimator of \({{\textbf{C}}}_{{\textbf{v}}_k}\) leads to a poor-quality performance estimator \(\widehat{\textbf{P}}_{k|k}\), which degrades the true performance \({\textbf{P}}_{k|k}^\text {p}\).
An ideal KF, corresponding to the best linear unbiased estimator (BLUE) in the MSE sense, is constructed by considering the actual measurement noise covariance \(\widehat{{\textbf{C}}}_{{\textbf{v}}_k}({{\textbf{x}}_k}) \triangleq {\textbf{C}}_{{\textbf{v}}_k}\) (not feasible in practice). The error covariance matrix of the ideal KF, denoted \({\textbf{P}}_{k|k}^\text {a}\), is computed by KF recursion and verifies that
There are different sources of misspecification that can combine and degrade KF performance. Indeed, the very first hypotheses mentioned in Sect. 2 must be verified in order for the considered LDSS model to apply: the independence properties and the validity of the observation and state transition models (i.e., \(\textbf{A}_k\) and \(\textbf{F}_k\)). Systematic tests can be applied in order to verify the validity of these hypotheses. Besides this, the purpose of this work lies in the analytical expression of \(\textbf{C}_{\textbf{v}_k}\) and its usage in the KF: using \(\widehat{{\textbf{C}}}_{{\textbf{v}}_k}(\pi _{D}\left( \widehat{{\textbf{x}}}_{k|k-1}\right) )\) constructed from (16) improves performance compared with a default model of \(\textbf{C}_{\textbf{v}_k}\) such as \(\sigma ^2\textbf{I}_{2M^2}\) (it also improves the accuracy of \(\widehat{\textbf{P}}_{k|k}\)). This becomes especially true for smaller sample sizes, since estimation errors on \({\textbf{C}}_{{\textbf{z}}_k^{*}\otimes {\textbf{z}}_k}\) become more significant in the expression of \(\textbf{C}_{\textbf{v}_k}\) (\(\textbf{C}_{\textbf{v}_k}\approx \textbf{0}\) in the asymptotic regime). Once expression (16) is used, a second source of error concerns the misspecification of \(\textbf{C}_{\textbf{n}_k}\) (e.g., the validity of the independence assumption of the \(\textbf{n}_k\) coordinates and of the \(\left( \textbf{n}_k\right) _m\) kurtosis) and the misspecification of the signal kurtosis. The latter are also more significant for smaller sample sizes.
While \(\widehat{\textbf{x}}_{k|k}\) is not thresholded during the KF recursion, one may consider \(\pi _{D}\left( \widehat{\textbf{x}}_{k|k}\right)\) as a final estimator. The performance of \(\pi _{D}\left( \widehat{\textbf{x}}_{k|k}\right)\) is not given analytically, but it can be estimated by Monte Carlo simulations; it can be significantly better than that of \(\widehat{\textbf{x}}_{k|k}\), in particular for small or null intensities.
3.5 Kalman filter initialization
The KF formalism supposes that the first and second order moments of \({\textbf{x}}_0\) and of \({\textbf{v}}_k\) for \(k\geqslant 0\) are known, which in the case of a deterministic state vector \({{\textbf{x}}_k}\) amounts to knowing \({\textbf{x}}_0\). However, \(\textbf{x}_0\) is unknown and must be estimated (which is the aim of the present paper); therefore, the filter must be initialized with an estimate \(\widehat{{\textbf{x}}}_{0|0}\) and its corresponding mean square error \({\textbf{P}}_{0|0}={\textbf{C}}_{\widehat{{\textbf{x}}}_{0|0}}\).
A plethora of estimators can be used [14], provided that the associated covariance is known. Among them, a linear unbiased estimator \(\widehat{{\textbf{x}}}_{0|0}\) is of particular interest since the unbiased property is preserved by the KF. It can be obtained by a Distortionless Response Filter (DRF) \({\textbf{K}}_0\) verifying \({\textbf{K}}_0 {\textbf{H}}_0 = {\textbf{I}}_{Q}\), which would give
Among the DRFs, the Minimum Variance Distortionless Filter (MVDRF) given by
which minimizes the covariance (hence the MSE), is used in this work.
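When \({\textbf{C}}_{{\textbf{v}}_0}\) is invertible, the MVDRF and the associated error covariance take the familiar forms (recalled here for reference; the rank-deficient case is treated below):
\[
{\textbf{K}}_0=\left( {\textbf{H}}_0^{H}{\textbf{C}}_{{\textbf{v}}_0}^{-1}{\textbf{H}}_0\right) ^{-1}{\textbf{H}}_0^{H}{\textbf{C}}_{{\textbf{v}}_0}^{-1},\qquad
{\textbf{P}}_{0|0}=\left( {\textbf{H}}_0^{H}{\textbf{C}}_{{\textbf{v}}_0}^{-1}{\textbf{H}}_0\right) ^{-1}.
\]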
However, in some practical cases (including the one under consideration), \(\textbf{C}_{{\textbf{v}}_0}\) is not invertible, and a specific form must be used. Based on a compact eigenvalue decomposition (EVD) of the measurement noise covariance and a compact singular value decomposition (SVD) of the measurement model, i.e.,
for \(\mathbb {U}\) such that \(\left[ \textbf{U}~\mathbb {U}\right]\) forms an orthonormal basis of \(\mathbb {R}^{2M^{2}}\) and
Appendix 4 shows that the MVDRF writes
It is clear from (33) that any DRF implementation requires knowledge of the observation noise covariance matrix \({\textbf{C}}_{{\textbf{v}}_0}\), which, due to the unconventional construction of the KF, depends on the current state \({\textbf{x}}_0\). As for \(k\geqslant 1\), the KF must be computed by substituting an estimate \(\widehat{{\textbf{C}}}_{{\textbf{v}}_0}\) for \({\textbf{C}}_{{\textbf{v}}_0}\).
In the following results, the observation noise covariance matrix \(\widehat{{\textbf{C}}}_{{\textbf{v}}_0}\) injected in the MVDRF initialization is built with a beamforming estimator
where \(\odot\) is the Hadamard product. By construction
for \(q=1,\ldots ,Q\), and \(\left( \widehat{\textbf{x}}_{0|0}^{\text {BF}}\right) _q\) can be considered as an unbiased estimator of an upper bound of \(\left( {\textbf{x}_0}\right) _q\) and in practice \(\left( \widehat{\textbf{x}}_{0|0}^{\text {BF}}\right) _q\gg \left( {\textbf{x}_0}\right) _q\). Then from Appendix 5 one obtains
which implies that the performance of the KF is upper bounded by \(\widehat{\textbf{P}}_{0|0}\) [13] (i.e., the KF estimate \(\widehat{\textbf{P}}_{0|0}\) is pessimistic). In particular,
which is not necessarily verified for \(k>0\).
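As an indication only, one common matched-filter (beamforming) form consistent with the properties stated for \(\widehat{{\textbf{x}}}_{0|0}^{\text {BF}}\) above (an unbiased estimator of an upper bound of each intensity) would be, for each pixel \(q\),
\[
\left( \widehat{\textbf{x}}_{0|0}^{\text {BF}}\right) _q=\frac{{\textbf{a}}_q^{H}\hat{{\textbf{C}}}_{{\textbf{z}}_0}{\textbf{a}}_q}{\left( {\textbf{a}}_q^{H}{\textbf{a}}_q\right) ^{2}},\qquad
\mathbb {E}\left[ \left( \widehat{\textbf{x}}_{0|0}^{\text {BF}}\right) _q\right] =\left( {\textbf{x}}_0\right) _q+\sum _{l\ne q}\left( {\textbf{x}}_0\right) _l\frac{\left| {\textbf{a}}_q^{H}{\textbf{a}}_l\right| ^{2}}{\left( {\textbf{a}}_q^{H}{\textbf{a}}_q\right) ^{2}}+\frac{{\textbf{a}}_q^{H}{\textbf{C}}_{{\textbf{n}}_0}{\textbf{a}}_q}{\left( {\textbf{a}}_q^{H}{\textbf{a}}_q\right) ^{2}}\geqslant \left( {\textbf{x}}_0\right) _q,
\]
the excess terms vanishing only for orthogonal steering vectors and noiseless data; the estimator actually used in the paper may differ in its normalization.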
4 Application to dynamic imaging for spacecraft navigation
As a contribution to resection for spacecraft navigation, we tackle the dynamic star image estimation problem based on radio interferometry, i.e., an image of radio source power. Indeed, as highlighted below, such radio interferometric images estimate both the power (i.e., \(x_q\)) and the DOA of a radio source (i.e., \({\textbf{u}_q}\) in (42)), and allow one to identify the DOA of radio sources of known power, provided that the power estimation is accurate enough. For instance, in order to achieve a satisfactory DOA resolution, modern radio astronomy observatories are interferometers consisting of a network of M antennas, possibly scattered around the globe. Considering their respective positions \({\textbf{r}}_m\in {\mathbb {R}}^2\), the wavelength \(\lambda\) and the gain function \(g_m(.)\) for \(m=1,\ldots ,M\), the transfer function associated with a band-limited signal modulated by a carrier frequency \(f_c=c/\lambda\) is given by
where \({\textbf{u}}\in {\mathbb {R}}^2\) is the DOA, i.e., the unit vector oriented toward the source of the signal, and c is the speed of light in a vacuum. Given an image of Q pixels, there are Q directions of arrival \({\textbf{u}}_q\) defining the array response matrix \({\textbf{A}}=\left( {\textbf{a}}({\textbf{u}}_1),\ldots ,{\textbf{a}}({\textbf{u}}_Q)\right) {\in \mathbb {C}^{M\times Q}}\) [4]. During the \(k-\)th STI interval, the signals from the different radio sources are complex circular and mutually independent. Hereinafter, they are concatenated in a vector \({\textbf{s}}_k{\in \mathbb {C}^{Q\times 1}}\). The signal received by the network of antennas \({\textbf{z}}_k{\in \mathbb {C}^{M\times 1}}\) is modeled as (1), where \({\textbf{n}}_k{\in \mathbb {C}^{M\times 1}}\) is the measurement noise of the interferometer, which is complex circular, Gaussian distributed and independent of the source signals \({\textbf{s}}_k\). The observed visibility matrix, i.e., the SCM \(\hat{\textbf{C}}_{{\textbf{z}}_k}{\in \mathbb {C}^{M\times M}}\), is computed as in (2) from N independent and identically distributed realizations of \({\textbf{z}}_k\).
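A standard narrowband steering-vector model consistent with this description (the sign convention and normalization may differ from the paper's) is
\[
\left[ {\textbf{a}}({\textbf{u}})\right] _m=g_m({\textbf{u}})\,e^{-j\frac{2\pi }{\lambda }{\textbf{r}}_m^{T}{\textbf{u}}},\qquad m=1,\ldots ,M .
\]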
The proposed formalization of the KF provides an estimate of the time-varying power of the radio sources based on the \(\hat{\textbf{C}}_{{\textbf{z}}_{l}}\) measurements for \(0\leqslant l\leqslant k\), in the case of a deterministic state model (5), and can be computed iteratively. The source power estimation can be used to identify and track landmarks of known power, their DOAs being used for resection.
4.1 Results
The considered network has a “Y” shape, similar to the Karl G. Jansky Very Large Array (VLA) observatory [15], and is composed of 27 independent antennas divided into three branches. The method is evaluated on a synthetic image composed of \(Q=22\times 22\) pixels, with a state-transition model \({\textbf{F}}_{k}{\in \mathbb {R}^{Q\times Q}}\) being a rotation matrix which corresponds to a rotation of the image by a fixed angle of \(90\,\text {deg}\) between each STI interval. \({\textbf{F}}_{k}\) is supposed to be known from an inertial motion sensor. Without loss of generality, all the gains \(g_m\) are considered equal to 1. Different sample sizes are considered for the SCM construction, namely \(N=10^5\) and \(N=10^3\) samples of bivariate signals \({\textbf{s}}_k\) and Gaussian noises \({\textbf{n}}_k\). A constant normalized noise covariance matrix is considered, \({\textbf{C}}_{{\textbf{n}}_k}\triangleq {\textbf{I}}_M\). The signals have independent real and imaginary parts following a Laplace distribution, i.e., \(\rho _{s_q}=3/2\). The wavelength is taken such that \(\lambda \triangleq 1\), which amounts to expressing the antenna coordinates as functions of the wavelength.
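For illustration, the data generation described above can be sketched as follows (Python/NumPy; the array response `A` is a random placeholder rather than the actual Y-shaped geometry, and the source powers `x` are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
M, Q, N = 27, 22 * 22, 10**3

def complex_laplace(size, rng):
    """Complex circular samples with independent Laplace real/imaginary parts,
    scaled to unit power E[|s|^2] = 1 (each part has variance 1/2)."""
    re = rng.laplace(0.0, 0.5, size)  # Laplace(scale=0.5) has variance 2*0.5^2 = 1/2
    im = rng.laplace(0.0, 0.5, size)
    return re + 1j * im

# Placeholder array response and arbitrary source powers (stand-ins for A_k and x_k)
A = (rng.standard_normal((M, Q)) + 1j * rng.standard_normal((M, Q))) / np.sqrt(2)
x = rng.uniform(0.0, 1.0, Q)

# N snapshots z(n) = A s(n) + n(n), then the SCM (visibility matrix)
S = complex_laplace((Q, N), rng) * np.sqrt(x)[:, None]
noise = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
Z = A @ S + noise
C_hat = (Z @ Z.conj().T) / N

# Empirical check of the normalized kurtosis of a unit-power source: ~3/2
s0 = complex_laplace(10**6, rng)
rho = np.mean(np.abs(s0) ** 4) / np.mean(np.abs(s0) ** 2) ** 2 - 2
print(round(float(rho), 2))  # approximately 1.5
```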
Performances of the KF are illustrated in Figs. 1, 2 in terms of MSE, defined as \(\text {MSE}\left( \widehat{\textbf{x}}_{k|k}\right) \triangleq {\text {trace}}\left( \mathbb {E}\left[ \left( \widehat{\textbf{x}}_{k|k}-\textbf{x}_{k}\right) \left( \widehat{\textbf{x}}_{k|k}-\textbf{x}_{k}\right) ^T\right] \right)\).
The considered images and KF estimates are displayed in Fig. 3. The assumed KF performance obtained from the KF recursion \(\widehat{\text {MSE}}\left( \widehat{\textbf{x}}_{k|k}\right) ={\text {trace}}\left( \mathbb {E}\left[ \widehat{\textbf{P}}_{k|k}\right] \right)\), the true KF performance \(\text {MSE}^p\left( \widehat{\textbf{x}}_{k|k}\right) ={\text {trace}}\left( {\textbf{P}}_{k|k}^p\right)\) (computed by Monte Carlo simulations) and the lower bound \(\text {MSE}^a\left( \widehat{\textbf{x}}_{k|k}\right) ={\text {trace}}\left( {\textbf{P}}_{k|k}^a\right)\) (computed by the KF with \(\widehat{{\textbf{C}}}_{{\textbf{v}}_k}({{\textbf{x}}_k}) \triangleq {\textbf{C}}_{{\textbf{v}}_k}\)) are given with respect to the number of iterations. It is observed that, for such a reasonable configuration, \(\widehat{\text {MSE}}\left( \widehat{\textbf{x}}_{k|k}\right) \geqslant {\text {MSE}}^p\left( \widehat{\textbf{x}}_{k|k}\right) \geqslant {\text {MSE}}^a\left( \widehat{\textbf{x}}_{k|k}\right)\) for \(k\geqslant 0\), although this was proved only for \(k=0\). The three quantities (i.e., \(\widehat{\text {MSE}}\left( \widehat{\textbf{x}}_{k|k}\right)\), \({\text {MSE}}^p\left( \widehat{\textbf{x}}_{k|k}\right)\) and \({\text {MSE}}^a\left( \widehat{\textbf{x}}_{k|k}\right)\)) appear to converge toward each other over time. \(\text {MSE}^p\left( \pi _{D}\left( \widehat{\textbf{x}}_{k|k}\right) \right)\) is presented for \(D = \mathbb {R}_+^{{Q\times 1}}\), illustrating that the truncated estimator can achieve better performance than the ideal KF.
Figures 1 and 3 present the results for \(N=10^3\) and \(N=10^5\). Notably, the KF can reach the same performance with a smaller sample size, at the cost of a larger number of iterations. For \(N=10^5\), the true KF MSE is below \(-50\,\text {dB}\) for \(k\geqslant 3\), whereas this requires \(k\geqslant 120\) with \(N=10^3\). Indeed, performance improves with the number of samples, i.e., as the SCM \(\widehat{\textbf{C}}_{\textbf{z}_k}\) converges to \(\textbf{C}_{\textbf{z}_k}\). The effect of the sample size N is clearly visible in the expression of the observation noise covariance matrix (16): increasing N reduces the variance of the observation noise \(\textbf{v}_k\), and therefore reduces the estimation error.
The MVDRF is limited to a low number of pixels (it can handle up to \(M(M-2)=675\) pixels). Other estimators could be used in a more general context, e.g., \(\widehat{{\textbf{x}}}_{0|0}\triangleq \widehat{{\textbf{x}}}_{0|0}^\text {BF}\). The performance of the latter is illustrated in Figs. 2 and 4. The associated error covariance matrix expression is not known, and it is considered that \(\widehat{\textbf{P}}_{0|0}\triangleq 2{\text {diag}}\left( \widehat{{\textbf{x}}}_{0|0}\odot \widehat{{\textbf{x}}}_{0|0}\right)\). Two sets of directions of arrival are considered: the same as previously with \(Q=22\times 22\), and one with a better resolution of \(Q=30\times 30\). While there is no performance lower bound for \(Q=30\times 30\) since \(Q> M(M-2)\), it is shown that the KF still improves along the recursion.
A key model property, in order for the KF to converge, is the validity of \({\textbf{P}}_{k|k}\). For instance, the proposed filter may not work for high-resolution images, as the observation model \(\textbf{H}_k\) will be too badly conditioned, corrupting the estimation of \({\textbf{P}}_{k|k}\) at each step (the number of directions of arrival that can be considered depends on the number of antennas).
5 Conclusion and perspectives
Autonomous spacecraft navigation relies on the availability of trustworthy measurements from which reference elements are tracked. For instance, radio interferometric images can be used to identify known radio sources from their power, provided that the estimation is accurate enough. In this context, this work proposes a formalization of the Kalman filter for the dynamic estimation of radio source power based on empirical covariance measurements, considering any signal and noise distributions. The proposed filter is misspecified but preserves the structure of the observation noise covariance matrix during the recursion. As such, the state and the filter accuracy are estimated jointly. It is shown that the observation noise covariance matrix expression involves the multivariate kurtosis of the source signals and of the noise. An application on simulated data representative of a dynamic radio interferometric imaging framework was presented, highlighting the applicability of the proposed filter. Provided that the observation model is well conditioned, one can compute a lower performance bound toward which the predicted and true filter performances converge along the iterations. Future work will focus on state-transition models with additive state noise.
Availability of data and materials
Not applicable.
Code availability
The code used to generate the presented results is available from the corresponding author upon request.
Notes
Note that the recursive estimation procedure of \({\textbf{P}}_{k|k}^\text {p}\) presented in [13] is not tractable in practice since \({{\textbf{C}}}_{{\textbf{v}}_k}\) is unknown.
References
S. Henry, J.A. Christian, Absolute triangulation algorithms for space exploration. J. Guid. Control. Dyn. 46(1), 21–46 (2023). https://doi.org/10.2514/1.g006989
J.L. Poirot, G.V. McWilliams, Navigation by back triangulation. IEEE Trans. Aerosp. Electron. Syst. 12(2), 270–274 (1976). https://doi.org/10.1109/TAES.1976.308304
M. Driedger, P. Ferguson, Feasibility study of an orbital navigation filter using resident space object observations. J. Guid. Control. Dyn. 44(3), 622–628 (2021). https://doi.org/10.2514/1.G005210
A.-J. van der Veen, S.J. Wijnholds, A.M. Sardarabadi, Signal processing for radio astronomy, in: S.S. Bhattacharyya, E.F. Deprettere, R. Leupers, J. Takala (eds.) Handbook of Signal Processing Systems, pp. 311–360. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-91734-4_9
J.L. Crassidis, J.L. Junkins, Optimal Estimation of Dynamic Systems (2nd Ed.). Chapman and Hall/CRC (2011). https://doi.org/10.1201/b11154
P.S.R. Diniz, Adaptive Filtering: Algorithms and Practical Implementation (4th Ed.). Springer, Berlin (2013)
P.D. Groves, Principles of GNSS, Inertial, and Multisensor Integrated Navigation Systems (Second Edition, Artech House, 2013)
I. Arasaratnam, S. Haykin, Cubature Kalman filters. IEEE Trans. Autom. Control 54(6), 1254–1269 (2009). https://doi.org/10.1109/TAC.2009.2019800
E. Chaumette, J. Vilà-Valls, F. Vincent, On the general conditions of existence for linear MMSE filters: Wiener and Kalman. Signal Process. 184, 108052 (2021). https://doi.org/10.1016/j.sigpro.2021.108052
J. Vilà-Valls, E. Chaumette, F. Vincent, P. Closas, Robust linearly constrained Kalman filter for general mismatched linear state-space models. IEEE Trans. Autom. Control 67(12), 6794–6801 (2022). https://doi.org/10.1109/TAC.2021.3132890
K.V. Mardia, Measures of multivariate skewness and kurtosis with applications. Biometrika 57(3), 519–530 (1970). https://doi.org/10.1093/biomet/57.3.519
M.C. Vanderveen, B.C. Ng, C.B. Papadias, A. Paulraj, Joint angle and delay estimation (JADE) for signals in multipath environments, in: Conference Record of the Thirtieth Asilomar Conference on Signals, Systems and Computers, pp. 1250–1254 (1996). https://doi.org/10.1109/ACSSC.1996.599145
B.D.O. Anderson, J.B. Moore, Optimal Filtering (Prentice-Hall, Information and system sciences series, 1979)
H.L. Van Trees, Optimum Array Processing: Part IV of Detection, Estimation, and Modulation Theory. Wiley, New York (2002). https://doi.org/10.1002/0471221104.ch2
P.J. Napier, A.R. Thompson, R.D. Ekers, The very large array: Design and performance of a modern synthesis radio telescope. Proc. IEEE 71(11), 1295–1320 (1983). https://doi.org/10.1109/PROC.1983.12765
Acknowledgements
Not applicable
Funding
This work was partially supported by the DGA/AID project 2021.65.0070.
Author information
Authors and Affiliations
Contributions
C.C. and N.A. designed the algorithm; E.C. is at the origin of the conceptualization; C.C. and E.C. wrote the manuscript and developed the methodology; C.C. completed simulation experiments; N.A., P.L., N.E.K. and I.V. provided suggestions for code and article details.
Corresponding author
Ethics declarations
Competing interests
The authors declare that they have no conflict of interest.
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1: Estimation of real-valued state with complex measurements
Let us consider a linear observation model
A complex, affine estimator of \(\textbf{x}\) writes \(\widehat{\textbf{x}}={\textbf{K}}{\textbf{y}}+\textbf{a}\) with \(\widehat{ \textbf{x}}\in \mathbb {C} ^{P}\), \(\textbf{K}\in \mathbb {C} ^{P\times N}\), \(\textbf{a}\in \mathbb {C} ^{P}\) and
and thus
The mean square error matrix writes
Then \(\forall \textbf{w}\in \mathbb {R} ^{P}\)
On the other hand, one may consider the following observation model:
for which a real and affine estimator of \(\textbf{x}\) writes \(\widehat{\textbf{x}}=\left[ \textbf{K}_{r}~\textbf{K}_{j}\right] \left( \begin{array}{c} \textbf{y}_{r} \\ \textbf{y}_{j} \end{array} \right) +\textbf{a}_{r}=\textbf{K}_{r}\textbf{y}_{r}+\textbf{K}_{j}\textbf{y} _{j}+\textbf{a}_{r}\), with \(\widehat{\textbf{x}}\in \mathbb {R} ^{P}\), \(\textbf{K}_{r},\textbf{K}_{j}\in \mathbb {R} ^{P\times N}\), \(\textbf{a}_{r}\in \mathbb {R} ^{P}\). Its mean square error matrix is
Then, \(\forall \textbf{w}\in \mathbb {R} ^{P}\),
which leads to
Finally, it was proved that the performance of a linear estimator based on the concatenation of the real and imaginary parts of a complex-valued observation is upper bounded by the performance of an estimator based only on the complex measurements. The latter also applies to the concatenation of a complex-valued observation and its complex conjugate, as used in this work. The demonstration derives from the invertible linear relation between the augmented complex observation and the stacked real and imaginary parts.
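Explicitly, a sketch of this relation (the notation may differ from the original) is
\[
\begin{pmatrix}{\textbf{y}}\\ {\textbf{y}}^{*}\end{pmatrix}=\begin{pmatrix}{\textbf{I}}_{N} & j{\textbf{I}}_{N}\\ {\textbf{I}}_{N} & -j{\textbf{I}}_{N}\end{pmatrix}\begin{pmatrix}{\textbf{y}}_{r}\\ {\textbf{y}}_{j}\end{pmatrix},
\]
which is an invertible linear map, so that affine estimation from \(\left( {\textbf{y}},{\textbf{y}}^{*}\right)\) and from \(\left( {\textbf{y}}_{r},{\textbf{y}}_{j}\right)\) spans the same class of estimators.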
Appendix 2: Covariance of measurement’s vectorized SCM
This section provides the expression of \({\textbf{C}}_{{\textbf{z}}_k^{*}\otimes {\textbf{z}}_k}\) needed to implement (15). To proceed, one starts by writing
where the subscript k is dropped in order to simplify expressions, with
and
The latter is developed from the general form of \(\textbf{z}\)
as
where
The former is developed from
which, given that \({\textbf{s}},\,{\textbf{n}}\) are independent and composed of respectively Q and M circular random variables, leads to
where \(x_q\triangleq \mathbb {E}\left[ \left| s_q\right| ^2\right]\). Since
and
one has that
It results from (B9) that
where \(\rho _{s_q}\) is defined as (18). On the other hand, remarking that
leads to the final equation:
Appendix 3: Covariance of a vectorized SCM
Let \(\textbf{n}\) be a vector of centered and independent complex circular random variables \(n_m\) for \(m=1,\ldots ,M\); then, for \(\left( \textbf{e}_1,\ldots ,\textbf{e}_{M^2}\right)\) being the canonical basis of \(\mathbb {R}^{M^2}\), one has that
and thus
where \(m,\,m^\prime ,\,l,\,l^\prime \in \left\{ 1,\ldots ,M\right\}\).
On one hand, for \(\textbf{n}\) being a vector of centered and Gaussian complex circular random variables, the fourth-order cumulant
is null, i.e., \(\kappa _{i,j,k,l}=0\), which implies that
and then
from (C23). Finally, one has directly that
On the other hand, for \(\textbf{n}\) being a vector of independent and centered complex circular random variables following any distribution, one has from (C23) that
where \(\mathbb {E}\left[ \left| n_m\right| ^2\right] =\sigma _{n_m}^2,\, \mathbb {E}\left[ \left| n_{l^\prime }\right| ^2\right] =\sigma _{n_{l^\prime }}^2\) and
where
In particular, \(\rho _{n_m}=0\) for Gaussian random variables, which implies (C26).
In summary, it was proved that (C26) applies for a vector \(\textbf{n}\) of centered and Gaussian complex circular random variables, and that (C27) applies for centered and independent complex circular random variables following any distribution.
Appendix 4: Computation of the MVDRF
Given a linear observation model
such that
the general form of a DRF is (denoting \(\textbf{L}^{H}=\textbf{K}\))
with
The MVDRF (provided that it exists) is then the DRF such that
Considering that neither \(\textbf{H}\) nor \(\textbf{C}_{\textbf{v}}\) is full rank, one denotes
and \(\mathbb {U},\,\mathbb {W}\) such that \(\left[ \textbf{U}\,\mathbb {U}\right]\), \(\left[ \textbf{W}\,\mathbb {W}\right]\) are unitary matrices. The considered filter is of the form
with
Thus
from which the distortionless condition is equivalent to
One has that
with
Then, one applies a last decomposition
Given \(\textbf{g }\in \mathbb {C} ^{P}\), one obtains
where \(\left\| \textbf{z}\right\| _{\textbf{D}}^{2}=\textbf{z}^{H} {\textbf{D}}{\textbf{z}}\). Considering the oblique projector
then
from which one can conclude
i.e.,
The equality holds for
Let \(\underline{\mathbb {V}}\in \mathbb {C}^{\left( N-P\right) \times \left( \left( N-P\right) -Q\right) }\) be such that \(\left[ \underline{\textbf{V}}~\underline{\mathbb {V}}\right]\) is a unitary matrix, then since \(\mathbb {T}\in \mathbb {C} ^{\left( N-P\right) \times P}\),
which leads to
where
and thus
One deduces that there is an infinity of solutions which minimize \(\textbf{L}^{H}\textbf{C}_{\textbf{v}}\textbf{L}\) (D38i) : since there is no constraint on \(\underline{\mathbb {A}}\), the solution has \(((N-P)-Q)\times P\) degrees of freedom. One may consider the solution with minimal Frobenius norm
i.e.,
by setting \(\underline{\mathbb {A}}=\textbf{0}\) (i.e., no unnecessarily additional assumption is made).
Appendix 5: Upper bound of the measurement noise covariance
This section proves that for a given estimate \(\widehat{{\textbf{x}}}_{k|k}\) verifying \(\left( \widehat{{\textbf{x}}}_{k|k}\right) _q\geqslant \left( {\textbf{x}}_{k}\right) _q\,\forall q\in \left\{ 1,\ldots ,Q\right\}\), one has that \(\widehat{{\textbf{C}}}_{{\textbf{v}}_k}(\widehat{{\textbf{x}}}_{k|k})\geqslant \widehat{{\textbf{C}}}_{{\textbf{v}}_k}({{\textbf{x}}}_{k})\), where \(\widehat{{\textbf{C}}}_{{\textbf{v}}_k}({{\textbf{x}}}_{k})\triangleq {\textbf{C}}_{{\textbf{v}}_k}\).
Let \(\textbf{x},\,\textbf{x}^\prime \in \mathbb {R}^Q_+\) be such that \(x^\prime _q\geqslant x_q\,\forall q\in \left\{ 1,\ldots ,Q\right\}\), where the subscript k is dropped in order to simplify formulas. Since
one has that
with \(\left( \textbf{a}_q\textbf{a}_q^H\right) ^T\otimes \left( \textbf{a}_l\textbf{a}_l^H\right)\), \(\left( \textbf{a}_q\textbf{a}_q^H\right) ^T\otimes \textbf{C}_{\textbf{n}}\) and \(\textbf{C}_{\textbf{n}}^T\otimes \left( \textbf{a}_q\textbf{a}_q^H\right)\) being three Hermitian positive semidefinite matrices. Hence, \(\widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}^\prime })^T\otimes \widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}^\prime }) - \widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}})^T\otimes \widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}})\) is a linear combination of Hermitian positive semidefinite matrices with nonnegative coefficients, and thus
On the other hand, from (17), one has that
which (from the same argument) yields to
Finally, remarking that from (16), \(\textbf{C}_{\textbf{v}}(\textbf{x}^\prime ) - \textbf{C}_{\textbf{v}}(\textbf{x})\) is the sum of
and
which are Hermitian positive semidefinite matrices since
one can conclude that
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Cano, C., Arab, N., Chaumette, É. et al. Kalman filter for radio source power and direction of arrival estimation. EURASIP J. Adv. Signal Process. 2024, 66 (2024). https://doi.org/10.1186/s13634-024-01147-x
DOI: https://doi.org/10.1186/s13634-024-01147-x