
Kalman filter for radio source power and direction of arrival estimation

Abstract

Images are an important source of information for spacecraft navigation. Based on an image and a known attitude, triangulation techniques (intersection or resection) are often used for positioning and navigation. In the resection problem, an observer estimates its unknown location by using angle measurements to points at known locations (i.e., landmarks), the localization performance depending on the accuracy of the angle measurements. As a contribution to resection for spacecraft navigation, we consider the dynamic image estimation problem based on radio interferometry, i.e., imaging of radio source power, where the measurements are sample covariance matrices (SCMs). Considering the case where several measurements are available as well as a known dynamic linear model of image evolution, a.k.a. a linear state model, the minimum mean-squared error (MMSE) image estimator is given by the Kalman filter (KF) or one of its variants. However, standard Kalman-like filters are not a priori suitable for the problem at hand since the measurements (i.e., SCMs) cannot be formulated analytically as a function of the state parameters to be estimated (i.e., radio source powers). In fact, this lack of analytical formulation can be circumvented by a statistical linear fitting allowing the SCMs to be expressed in terms of the state. This linear fitting introduces an additive residual noise, equivalent to a measurement noise, whose covariance matrix depends on the current state, a non-standard case for a measurement model. The covariance matrix of the residual noise is derived whatever the distributions of the radio sources and of the additive noise at the sample level, unveiling the contribution of their multivariate kurtosis. The proposed method is evaluated on simulated data representative of a dynamic radio interferometry framework. The results show that the proposed method effectively tracks moving radio sources in complex scenes, with theoretical guarantees when the signal multivariate kurtosis is known.

1 Introduction

Cameras, telescopes, and similar tools provide essential information for today’s spacecraft. These devices serve various purposes, such as guidance, navigation and positioning. Relying on a known attitude, digital images are used to match known landmarks in the observed scene to their apparent pixel locations. In the context of spacecraft guidance, this is also known as “angles-only” optical navigation [1]. Triangulation (intersection or resection) from images is used in space navigation challenges such as localization from known landmarks, angles-only tracking of objects, and star-aided guidance for interstellar journeys [1]. We place ourselves in the context of resection, i.e., the observer estimates its unknown location by using angle measurements to points at known locations (i.e., landmarks). For instance, using resection, one can determine the position of a spacecraft in cislunar space by observing known satellites, or during interplanetary missions by tracking asteroids and planets [2, 3]. As a contribution to resection for spacecraft navigation, we tackle the dynamic star image estimation problem based on radio interferometry, i.e., imaging of radio source power, where the measurements are sample covariance matrices (SCMs), a.k.a. visibility matrices [4]. Indeed, such radio interferometric images estimate both the power and the direction of arrival (DOA) of a radio source, and allow one to identify the DOA of radio sources of known power, provided that the power estimation is accurate enough. We consider the case where several measurements are available as well as a known dynamic linear model of image evolution, a.k.a. a linear state model.

The design and use of state estimation techniques is fundamental in a plethora of applications, such as robotics, tracking, guidance and navigation systems [5,6,7]. For a linear dynamic system, the Kalman filter (KF) is the best linear minimum mean-squared error (MMSE) estimator. The most widespread solution for nonlinear systems is to resort to system linearization, leading to the so-called linearized or extended KF (EKF) [7]. In both cases, as well as for more advanced techniques such as sigma-point filters [8], the main assumption is a perfect knowledge of the system [9]: (i) known process and measurement functions, including their parameters, (ii) known inputs, and (iii) known noise statistics (i.e., first and second order moments for the KF and EKF). Thus, the KF may be unusable, or perform poorly, if one (or more) of the above requirements is not met [10].

Consequently, at first sight, Kalman-like filters do not seem to be usable for the problem at hand since, although each individual sample may adhere to a linear parametric model, the finite-horizon SCMs cannot be formulated analytically as a function of the state parameters to be estimated (radio source powers). Fortunately, this lack of analytical formulation can be circumvented by a statistical linear fitting, at least under the assumptions of: a) a deterministic dynamic state model (no state noise), and b) instantaneous linear observations from multiple radio sources in the presence of additive noise (stochastic observation model), when the radio sources are mutually independent and independent from the noise. The proposed linear fitting allows the SCMs to be expressed in terms of the state parameters and introduces an additive residual noise, equivalent to a measurement noise, whose covariance matrix depends on the current state parameters, a non-standard case for a measurement model. The covariance matrix of the residual noise is derived whatever the distributions of the radio sources and of the additive noise at the sample level, unveiling the contribution of their multivariate kurtosis [11], whose values depend on whether the radio source and noise distributions are heavy- or light-tailed. To support the discussion, the proposed method is evaluated on simulated data representative of a dynamic radio interferometric imaging framework.

2 Measurement model

Let us consider a network composed of M antennas receiving signals at consecutive short time integration (STI) intervals \(\left[ t_k, t_k+\epsilon \right]\), \(k\geqslant 0\). During the \(k-\)th STI interval, observations are i.i.d. realizations of a stochastic variable \({{\textbf{z}}}_k\in {\mathbb {C}}^{M\times 1}\) consisting of a linear mixture of signals coming from Q radio sources, \({{\textbf{s}}}_k\in {\mathbb {C}}^{Q\times 1}\), in the presence of an additive noise \({\textbf{n}}_k\):

$$\begin{aligned} {\textbf{z}}_k = {\textbf{A}}_k{\textbf{s}}_k + {\textbf{n}}_k, \end{aligned}$$
(1)

with \({\textbf{A}}_k\in {\mathbb {C}}^{M\times Q}\) the system response matrix, and \({\textbf{s}}_k\) and \({\textbf{n}}_k\) being independent and centered complex circular random vectors. In particular, we consider the case where we only have access to the sample covariance matrix (SCM)

$$\begin{aligned} \hat{{\textbf{C}}}_{{\textbf{z}}_k} = \frac{1}{N} \sum \limits _{n=1}^N {\textbf{z}}_k{(n)}{\textbf{z}}_k{(n)}^H{\in \mathbb {C}^{M\times M}}, \end{aligned}$$
(2)

where \({\textbf{z}}_k{(n)}\) is the \(n-\)th realization of \({\textbf{z}}_k\), N is the number of samples and the superscript H denotes the Hermitian (conjugate transpose) operator, i.e., \({\textbf{z}}_k^H\triangleq \left( {\textbf{z}}_k^*\right) ^T\). Asymptotically, i.e., when N tends to infinity, \(\hat{{\textbf{C}}}_{{\textbf{z}}_k}\) converges in probability to

$$\begin{aligned} \begin{aligned} {\textbf{C}}_{{\textbf{z}}_k}&= {\mathbb {E}}\left[ {\textbf{z}}_k{\textbf{z}}_k^H\right] \\&= {\textbf{A}}_k {\mathbb {E}}[{\textbf{s}}_k {\textbf{s}}_k^H]{\textbf{A}}_k^H + {\textbf{C}}_{{\textbf{n}}_k}. \end{aligned} \end{aligned}$$
(3)

In practice, \({\textbf{A}}_k\) is known and \({\textbf{C}}_{{\textbf{n}}_k}\in \mathbb {C}^{M\times M}\) is known or can be measured to the desired precision, as illustrated below.
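To fix ideas, the following minimal NumPy sketch simulates N snapshots of model (1) and forms the SCM (2), checking its convergence toward (3); the array response, source powers, and sample size below are hypothetical values chosen for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
M, Q, N = 8, 3, 100_000                    # antennas, sources, snapshots (illustrative)

# Hypothetical system response and source powers
A = rng.standard_normal((M, Q)) + 1j * rng.standard_normal((M, Q))
x = np.array([2.0, 1.0, 0.5])              # diagonal of E[s s^H], Eq. (4)

# Complex circular (Gaussian here) sources and unit-variance noise
s = np.sqrt(x)[:, None] * (rng.standard_normal((Q, N))
                           + 1j * rng.standard_normal((Q, N))) / np.sqrt(2)
n = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)

z = A @ s + n                              # snapshots z_k(n), Eq. (1)
C_hat = (z @ z.conj().T) / N               # sample covariance matrix, Eq. (2)

C = A @ np.diag(x) @ A.conj().T + np.eye(M)           # asymptotic covariance, Eq. (3)
print(np.linalg.norm(C_hat - C) / np.linalg.norm(C))  # small for large N
```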

Considering mutually independent source signals and denoting \({\textbf{x}}_k\in \mathbb {R}^{Q\times 1}_+\) the vector composed of individual source signal powers, a.k.a. intensities, the source signals covariance matrix is diagonal and writes

$$\begin{aligned} {\mathbb {E}}[{\textbf{s}}_k {\textbf{s}}_k^H]\triangleq {\text {diag}}\left( {{\textbf{x}}_k}\right) \in \mathbb {R}^{Q\times Q}, \end{aligned}$$
(4)

using the vector-to-matrix \(\text {diag}\) operator. Assuming independence among all \({\textbf{n}}_k\) and \({\textbf{s}}_l\) for \(k,\,l\geqslant 1\), the objective is to estimate the intensities \({\textbf{x}}_k\), under the state model

$$\begin{aligned} {{\textbf{x}}_k} = {\textbf{F}}_{k-1} {\textbf{x}}_{k-1}, \end{aligned}$$
(5)

where the state-transition matrix \({\textbf{F}}_{k-1}{\in \mathbb {R}^{Q\times Q}}\) is known a priori.

2.1 Linear discrete state-space model

One can construct a linear observation model in the asymptotic case, using the vectorized covariance matrix as observation, i.e.,

$$\begin{aligned} \text {vec}\left( \textbf{C}_{\textbf{z}_k}\right) = \left[ \textbf{A}_k^**\textbf{A}_k\right] \textbf{x}_k + \text {vec}\left( \textbf{C}_{\textbf{n}_k}\right) , \end{aligned}$$
(6)

with \(*\) being the column-wise Kronecker product, also called the Khatri–Rao product [12]. However, such a linear model does not hold in the non-asymptotic case. In order to circumvent this limitation, we propose to apply a statistical linear fitting model based on (6) where the measurements are the vectorized SCMs, concatenated with their complex conjugates for the sake of optimality (see Appendix 1), i.e.,

$$\begin{aligned} {\textbf{y}}_{k}\triangleq \left( \begin{array}{c} {\text {vec}}\left( \hat{{\textbf{C}}}_{{\textbf{z}} _{k}}\right) \\ {\text {vec}}\left( \hat{{\textbf{C}}}_{{\textbf{z}} _{k}}^{*}\right) \end{array} \right) \in \mathbb {C}^{2M^2\times 1}. \end{aligned}$$
(7)

At iteration k, the observation model is defined as

$$\begin{aligned} {\textbf{H}}_k \triangleq \left[ \begin{array}{c} {\textbf{A}}_k^* * {\textbf{A}}_k\\ {\textbf{A}}_k * {\textbf{A}}_k^{*} \end{array}\right] \in \mathbb {C}^{2M^2\times Q}, \end{aligned}$$
(8)

and the observation residual is defined as

$$\begin{aligned} {\textbf{v}}_k \triangleq {\textbf{y}}_k - {\textbf{H}}_k {{\textbf{x}}_k}. \end{aligned}$$
(9)

The observation residual \({\textbf{v}}_k\) corresponds to the approximation error of the vectorized SCM by the linear model \(\textbf{H}_k\textbf{x}_k\).
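As a numerical sanity check of (6)–(9), a short sketch (with a hypothetical A, x, and \(\textbf{C}_\textbf{n}=\textbf{I}\)) can verify that \(\text {vec}(\textbf{C}_{\textbf{z}})=[\textbf{A}^**\textbf{A}]\textbf{x}+\text {vec}(\textbf{C}_{\textbf{n}})\), with vec the column-stacking operator:

```python
import numpy as np

rng = np.random.default_rng(0)
M, Q = 8, 3
A = rng.standard_normal((M, Q)) + 1j * rng.standard_normal((M, Q))
x = np.array([2.0, 1.0, 0.5])
C_n = np.eye(M)

def khatri_rao(B, C):
    # Column-wise Kronecker product: q-th column is kron(b_q, c_q)
    return np.einsum('iq,jq->ijq', B, C).reshape(B.shape[0] * C.shape[0], -1)

vec = lambda X: X.reshape(-1, order='F')   # column-stacking, so vec(a b^H) = b^* kron a

C_z = A @ np.diag(x) @ A.conj().T + C_n
H_top = khatri_rao(A.conj(), A)            # A^* * A, the upper block of Eq. (8)
print(np.allclose(vec(C_z), H_top @ x + vec(C_n)))   # Eq. (6): True
```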

In the following, a Kalman filter is derived for the estimation of \({{\textbf{x}}_k}\) based on SCM measurements. \({\textbf{v}}_k\) is considered as the observation noise according to the standard KF formalism. This corresponds to a non-standard linear discrete state space (LDSS) model in which the quantities of interest \({{\textbf{x}}_k}\) are involved in the observation noise. Section 3 designs the KF in accordance with the definition of \({\textbf{v}}_k\), which induces \({{\textbf{x}}_k}\) to appear in the KF algorithm. When needed, in particular in the covariance matrix of \({{\textbf{v}}_k}\), it is proposed to replace \({{\textbf{x}}_k}\) by an estimate in order to preserve the underlying structure.

3 Design of the Kalman filter

3.1 Kalman filter existence

For such an LDSS model, without state noise, one needs to verify that

$$\begin{aligned} {\textbf{C}}_{{\textbf{v}}_{k},{\textbf{y}}_{l}}=\mathbb {E}\left[ \left( {\textbf{v}}_{k}-\mathbb {E}\left[ {\textbf{v}}_{k}\right] \right) \left( {\textbf{y}}_{l}-\mathbb {E}\left[ {\textbf{y}}_{l}\right] \right) ^H\right] \end{aligned}$$
(10)

is null for \(k\geqslant 2\) and \(l<k\) in order to prove the existence of a Kalman filter [9]. The verification is straightforward since \({\textbf{y}}_k\) and \({\textbf{y}}_l\) are independent and \({{\textbf{x}}_k}\) is deterministic, i.e.,

$$\begin{aligned} \begin{aligned} {\textbf{C}}_{{\textbf{v}}_{k},{\textbf{y}}_{l}}=&\mathbb {E}\left[ \left( {\textbf{y}}_k - {\textbf{H}}_k {{\textbf{x}}_k}-\mathbb {E}\left[ {\textbf{y}}_k - {\textbf{H}}_k {{\textbf{x}}_k}\right] \right) \left( {\textbf{y}}_{l}-\mathbb {E}\left[ {\textbf{y}}_{l}\right] \right) ^H\right] \\ =&\mathbb {E}\left[ \left( {\textbf{y}}_k -\mathbb {E}\left[ {\textbf{y}}_k \right] \right) \left( {\textbf{y}}_{l}-\mathbb {E}\left[ {\textbf{y}}_{l}\right] \right) ^H\right] \\ =&{\textbf{C}}_{{\textbf{y}}_{k},{\textbf{y}}_{l}}\\ =&\textbf{0}. \end{aligned} \end{aligned}$$
(11)

3.2 Identification of first and second order statistics

The next step consists in evaluating the first and second order statistics of the observation noise.

3.2.1 Mean of the observation noise

Since \(\hat{{\textbf{C}}}_{{\textbf{z}}_k}\) is an unbiased estimate of \({\textbf{C}}_{{\textbf{z}}_k}\), one has that

$$\begin{aligned} \begin{aligned} {\mathbb {E}}[{\textbf{y}}_k] =&\left[ {\text {vec}}\left( {\textbf{C}}_{{\textbf{z}}_k}\right) ^T, {\text {vec}}\left( {\textbf{C}}_{{\textbf{z}}_k}^{*} \right) ^T\right] ^{T}\\ =&\, {\textbf{H}}_k {{\textbf{x}}_k} + {\textbf{v}}_k^a, \end{aligned} \end{aligned}$$
(12)

where

$$\begin{aligned} {\textbf{v}}_k^a=\left[ {\text {vec}}\left( {\textbf{C}}_{{\textbf{n}}_k}\right) ^T, {\text {vec}}\left( {\textbf{C}}_{{\textbf{n}}_k}^{*} \right) ^T\right] ^{T} \end{aligned}$$
(13)

is the asymptotic measurement noise. From (9), one obtains that \({\mathbb {E}}[{\textbf{v}}_k]={\mathbb {E}}[{\textbf{y}}_k] - {\textbf{H}}_k{{\textbf{x}}_k}\) and therefore

$$\begin{aligned} {\mathbb {E}}[{\textbf{v}}_k] ={\textbf{v}}_k^a. \end{aligned}$$
(14)

3.2.2 Covariance of the observation noise

As \({{\textbf{x}}_k}\) is deterministic, one obtains \({\textbf{C}}_{{\textbf{v}}_k}={\textbf{C}}_{{\textbf{y}}_k}\), with

$$\begin{aligned} {\textbf{C}}_{{\textbf{y}}_k} =&\, {\mathbb {E}}\left[ \left( {\textbf{y}}_{k}-{\mathbb {E}}\left[ {\textbf{y}}_{k}\right] \right) \left( {\textbf{y}}_{k}-{\mathbb {E}}\left[ {\textbf{y}}_{k}\right] \right) ^{H}\right] \\ =&\, \frac{1}{N}{\mathbb {E}}\left[ \left( \begin{array}{c} {\text {vec}}\left( {\textbf{z}}_{k}{\textbf{z}}_{k}^{H}\right) -{\mathbb {E}}\left[ {\text {vec}}\left( {\textbf{z}}_{k}{\textbf{z}}_{k}^{H}\right) \right] \\ {\text {vec}}\left( {\textbf{z}}_{k}^{*}{\textbf{z}}_{k}^{T}\right) -{\mathbb {E}}\left[ {\text {vec}}\left( {\textbf{z}}_{k}^{*}{\textbf{z}}_{k}^{T}\right) \right] \end{array} \right) \left( \begin{array}{c} {\text {vec}}\left( {\textbf{z}}_{k}{\textbf{z}}_{k}^{H}\right) -{\mathbb {E}}\left[ {\text {vec}}\left( {\textbf{z}}_{k}{\textbf{z}}_{k}^{H}\right) \right] \\ {\text {vec}}\left( {\textbf{z}}_{k}^{*}{\textbf{z}}_{k}^{T}\right) -{\mathbb {E}}\left[ {\text {vec}}\left( {\textbf{z}}_{k}^{*}{\textbf{z}}_{k}^{T}\right) \right] \end{array} \right) ^{H}\right] \\ =&\, \frac{1}{N}{\textbf{C}}_{\left( \begin{array}{c} {\textbf{z}}_{k}^{*}\otimes {\textbf{z}}_{k} \\ {\textbf{z}}_{k}\otimes {\textbf{z}}_{k}^{*} \end{array} \right) }, \end{aligned}$$
(15)

where \(\otimes\) is the usual Kronecker product, such that \({\textbf{z}}_k^{*}\otimes {\textbf{z}}_k={\text {vec}}({\textbf{z}}_k{\textbf{z}}_k^H)\) and \({\textbf{z}}_k\otimes {\textbf{z}}_k^*={\text {vec}}({\textbf{z}}_k^*{\textbf{z}}_k^T)\); the factor 1/N follows from substituting the SCM (2) into (7), the \({\textbf{z}}_k(n)\) being N independent replicates of \({\textbf{z}}_k\). Finally, one has that

$$\begin{aligned} {\textbf{C}}_{{\textbf{v}}_k} =\frac{1}{N} \left[ \begin{array}{cc} {\textbf{C}}_{{\textbf{z}}_k^{*}\otimes {\textbf{z}}_k} &{} {\textbf{C}}_{{\textbf{z}}_k^{*}\otimes {\textbf{z}}_k,{\textbf{z}}_k\otimes {\textbf{z}}_k^{*}} \\ {\textbf{C}}_{{\textbf{z}}_k^{*}\otimes {\textbf{z}}_k,{\textbf{z}}_k\otimes {\textbf{z}}_k^{*}}^{*} &{} {\textbf{C}}_{{\textbf{z}}_k^{*}\otimes {\textbf{z}}_k}^{*} \end{array} \right] . \end{aligned}$$
(16)
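The block structure of (16) can also be estimated directly by Monte Carlo from snapshots, which is useful for validating the closed-form expression derived next; a sketch with hypothetical circular Gaussian snapshots:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 4, 200_000
z = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)

# Per-snapshot z^* kron z (as columns), then the conjugate stack of Eq. (15)
w = np.einsum('in,jn->ijn', z.conj(), z).reshape(M * M, N)
w_stack = np.vstack([w, w.conj()])          # (z^* kron z ; z kron z^*)

w_c = w_stack - w_stack.mean(axis=1, keepdims=True)
C_single = (w_c @ w_c.conj().T) / N         # covariance of one snapshot's stack
C_v = C_single / N                          # Eq. (15)-(16): averaging N snapshots divides by N
# C_v can now be compared block-wise with the closed-form Eq. (17)
```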

Based on the independence of the source signals and noise as well as their circularity, Appendix 2 develops the expression of \({\textbf{C}}_{{\textbf{z}}^*\otimes {\textbf{z}}}\) using Kronecker product properties. After dropping the index k in order to shorten expressions, one obtains

$$\begin{aligned} \begin{aligned} {\textbf{C}}_{{\textbf{z}}^*\otimes {\textbf{z}}} =&\, {\textbf{C}}_{{\textbf{z}}}^T\otimes {\textbf{C}}_{{\textbf{z}}}+\sum \limits _{q=1}^Q\left( {\textbf{a}}^*_q\otimes {\textbf{a}}_q\right) \left( {\textbf{a}}^*_q\otimes {\textbf{a}}_q\right) ^H\rho _{s_q}{x_q}^2 \\ {}&+ \textbf{C}_{\textbf{n}^{*}\otimes \textbf{n}}-\textbf{C}_{\textbf{n}}^{T}\otimes \textbf{C}_{\textbf{n}}, \end{aligned} \end{aligned}$$
(17)

for source signals \((s_q)_{q=1,\ldots ,Q}\) and noises \((n_m)_{m=1,\ldots ,M}\) following any distributions, where \(x_q={\mathbb {E}\left[ \left| s_q\right| ^2\right] }\) is the \(q-\)th coordinate of \(\textbf{x}\), \({\textbf{a}}_q\) is the \(q-\)th column of \({\textbf{A}}\) and

$$\begin{aligned} \rho _{s_q} = \frac{{\mathbb {E}}\left[ \left( s_q^*s_q\right) ^2\right] }{{x_q}^2} -2 \end{aligned}$$
(18)

is the normalized multivariate kurtosis of \(s_q\) [11].

Finally, note that for

$$\begin{aligned} {\textbf{P}}=\sum \limits ^M_{m=1}\sum \limits ^M_{m^\prime =1}\left( {\textbf{e}}_m\otimes {\textbf{e}}_{m^\prime }\right) \left( {\textbf{e}}_{m^\prime }\otimes {\textbf{e}}_{m}\right) ^T, \end{aligned}$$
(19)

with \(\left( {\textbf{e}}_1,\ldots ,{\textbf{e}}_M\right)\) the canonical basis of \({\mathbb {R}}^M\), one obtains that

$$\begin{aligned} \textbf{e}_{m+M(m^\prime -1)}^T \textbf{C}_{\textbf{z}^*\otimes \textbf{z}}\textbf{P} \textbf{e}_{l+M(l^\prime -1)} = \textbf{e}_{m+M(m^\prime -1)}^T \textbf{C}_{\textbf{z}^*\otimes \textbf{z}} \textbf{e}_{l^\prime +M(l-1)} \end{aligned}$$
(20)

from which the following equality holds:

$$\begin{aligned} \textbf{C}_{\textbf{z}^*\otimes \textbf{z}, \textbf{z}\otimes \textbf{z}^*} = \textbf{C}_{\textbf{z}^*\otimes \textbf{z}}\textbf{P}, \end{aligned}$$
(21)

for any distribution.
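The matrix P in (19) is the commutation (vec-permutation) matrix, whose defining property \({\textbf{P}}\left( {\textbf{u}}\otimes {\textbf{v}}\right) ={\textbf{v}}\otimes {\textbf{u}}\) underlies (20)–(21); a quick numerical check:

```python
import numpy as np

M = 4
I = np.eye(M)
# Eq. (19): P = sum over m, m' of (e_m kron e_m')(e_m' kron e_m)^T
P = sum(np.outer(np.kron(I[m], I[mp]), np.kron(I[mp], I[m]))
        for m in range(M) for mp in range(M))

rng = np.random.default_rng(1)
u = rng.standard_normal(M) + 1j * rng.standard_normal(M)
v = rng.standard_normal(M) + 1j * rng.standard_normal(M)
print(np.allclose(P @ np.kron(u, v), np.kron(v, u)))   # True: P swaps Kronecker factors
```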

Independent noise components. In particular, for a vector \(\textbf{n}\) composed of independent variables \(n_m\), Appendix 3 shows that the kurtosis of the noise also appears in \(\textbf{C}_{\textbf{n}^{*}\otimes \textbf{n}}\), which is a diagonal matrix such that

$$\begin{aligned} \left( \textbf{C}_{\textbf{n}^{*}\otimes \textbf{n}}\right) _{m+M(m^\prime -1),\,m+M(m^\prime -1)}= \left\{ \begin{aligned}&(\rho _{n_m}+1)\sigma ^4_{n_m},&m=m^\prime ;\\&\sigma ^2_{n_m}\sigma ^2_{n_{m^\prime }},&m\ne m^\prime , \end{aligned} \right. \end{aligned}$$
(22)

where

$$\begin{aligned} \rho _{n_m} = \frac{{\mathbb {E}}\left[ \left( n_m^*n_m\right) ^2\right] }{{\sigma ^4_{n_m}}}-2 \end{aligned}$$
(23)

and \(\sigma ^2_{n_m}\) are respectively the normalized multivariate kurtosis and the variance of \(n_m\).

Gaussian noise distribution. For instance, for \(n_m\) such that \({\mathfrak {Re}}(n_m)\) and \({\mathfrak {Im}}(n_m)\) are independent and follow the same distribution \({\mathcal {D}}\),

$$\begin{aligned} \rho _{n_m} = \frac{1}{2}\left( {\text {kurtosis}}\left( {{\mathcal {D}}}\right) -3\right) . \end{aligned}$$
(24)
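A quick Monte Carlo check of (23)–(24) with hypothetical scalar draws: Gaussian real and imaginary parts give \(\rho \approx 0\), while Laplace parts (kurtosis 6) give \(\rho \approx 3/2\), the value used in the simulations of Sect. 4:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000

def rho(n):
    # Normalized kurtosis of Eq. (23): E[|n|^4] / sigma^4 - 2
    p = np.mean(np.abs(n) ** 2)
    return np.mean(np.abs(n) ** 4) / p ** 2 - 2

n_gauss = rng.standard_normal(N) + 1j * rng.standard_normal(N)
n_laplace = rng.laplace(size=N) + 1j * rng.laplace(size=N)

print(rho(n_gauss))     # ~0.0, Eq. (24) with kurtosis(D) = 3
print(rho(n_laplace))   # ~1.5, Eq. (24) with kurtosis(D) = 6
```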

Since the kurtosis of a univariate real-valued Gaussian random variable is 3, \(\rho _{n_m}=0\) for Gaussian noise, but in general \(\rho _{n_m}\ne 0\). Hence, for centered and Gaussian complex circular noise, one has that

$$\begin{aligned} \textbf{C}_{\textbf{n}^{*}\otimes \textbf{n}}=\textbf{C}_{\textbf{n}}^{T}\otimes \textbf{C}_{\textbf{n}}, \end{aligned}$$
(25)

from which

$$\begin{aligned} {\textbf{C}}_{{\textbf{z}}^*\otimes {\textbf{z}}} =&\, {\textbf{C}}_{{\textbf{z}}}^T\otimes {\textbf{C}}_{{\textbf{z}}}+\sum \limits _{q=1}^Q\left( {\textbf{a}}^*_q\otimes {\textbf{a}}_q\right) \left( {\textbf{a}}^*_q\otimes {\textbf{a}}_q\right) ^H\rho _{s_q}{x_q}^2 . \end{aligned}$$
(26)

Gaussian noise and signal distributions. The expression simplifies further for centered and Gaussian complex circular signal and noise, for which

$$\begin{aligned} {\textbf{C}}_{{\textbf{z}}^*\otimes {\textbf{z}}} =&\, {\textbf{C}}_{{\textbf{z}}}^T\otimes {\textbf{C}}_{{\textbf{z}}}. \end{aligned}$$
(27)

The latter can be proven by remarking that \(\rho _{s_q}=0\) in (26), or following Appendix 3 since \(\textbf{z}\) is a vector of centered and Gaussian complex circular random variables.

3.3 Kalman filter recursion

At iteration k, the current estimate writes

$$\begin{aligned} \widehat{{\textbf{x}}}_{k|k}=\widehat{{\textbf{x}}}_{k|k-1}+{\textbf{K}} _{k}\left( {\textbf{y}}_{k}-\widehat{{\textbf{y}}}_{k|k-1}\right) , \end{aligned}$$
(28)

where state and measurement predictions are

$$\begin{aligned} \widehat{{\textbf{x}}}_{k|k-1}={\textbf{F}}_{k-1}\widehat{{\textbf{x}}}_{k-1|k-1}, \quad \widehat{{\textbf{y}}}_{k|k-1}={\textbf{H}}_{k}\widehat{{\textbf{x}}}_{k|k-1}+{\textbf{v}}_{k}^{a}. \end{aligned}$$
(29)

The observation noise covariance \(\textbf{C}_{\textbf{v}_k}\) is unknown since it depends on the current state \(\textbf{x}_k\) that has to be estimated. An estimate \(\widehat{{\textbf{C}}}_{{\textbf{v}}_k}\) is used instead; consequently, the innovation covariance matrix \({\textbf{S}}_{k|k-1}{\triangleq \textbf{C}_{\textbf{y}_k - \widehat{{\textbf{y}}}_{k|k-1}}}\), and hence the predicted and a posteriori estimate error covariance matrices, i.e., \({\textbf{P}}_{k|k-1}{\triangleq \textbf{C}_{\textbf{x}_k - \widehat{{\textbf{x}}}_{k|k-1}}}\) and \({\textbf{P}}_{k|k}{\triangleq \textbf{C}_{\textbf{x}_k - \widehat{{\textbf{x}}}_{k|k}}}\) respectively, are themselves estimated. The optimal Kalman gain \({\textbf{K}}_k\) is computed with the recursion

$$\begin{aligned} \left\{ \begin{array}{rl} \widehat{\textbf{P}}_{k|k-1} &{}={\textbf{F}}_{k-1}\widehat{\textbf{P}}_{k-1|k-1}{\textbf{F}}_{k-1}^{H};\\ \widehat{\textbf{S}}_{k|k-1} &{}={\textbf{H}}_{k}\widehat{\textbf{P}}_{k|k-1}{\textbf{H}}_{k}^{H}+\widehat{\textbf{C}}_{{\textbf{v}}_{k}}{(\pi _{D}\left( \widehat{{\textbf{x}}}_{k|k-1}\right) )};\\ {\textbf{K}}_{k} &{}=\widehat{\textbf{P}}_{k|k-1}{\textbf{H}}_{k}^{H}\left( \widehat{\textbf{S}}_{k|k-1}\right) ^{-1}; \\ \widehat{\textbf{P}}_{k|k} &{}=\left( {\textbf{I}}-{\textbf{K}}_{k}{\textbf{H}}_{k}\right) \widehat{\textbf{P}}_{k|k-1}. \end{array} \right. \end{aligned}$$
(30)

The measurement noise covariance estimate is constructed with the state prediction \(\widehat{{\textbf{x}}}_{k|k-1}\). Since there is no non-negativity constraint on the estimation, one expects \(\widehat{\textbf{x}}_{k|k}\) and \(\widehat{\textbf{x}}_{k|k-1}\) to fluctuate around the mean value \({\textbf{x}}_{k}\), so that the KF estimates can include negative intensities. Typically, these correspond to actual low or null intensities, and such negative estimates are kept in the recursion. On the other hand, one needs to preserve the positiveness of the estimated measurement noise covariance matrix, that is, to use a projection of \(\widehat{{\textbf{x}}}_{k|k-1}\) onto a certain domain \(D\subseteq {\mathbb {R}}_+^{{Q\times 1}}\) (which amounts to applying a threshold to the coordinates of \(\widehat{{\textbf{x}}}_{k|k-1}\)). Hence, the considered noise covariance estimate is \(\widehat{{\textbf{C}}}_{{\textbf{v}}_k}(\pi _{D}\left( \widehat{{\textbf{x}}}_{k|k-1}\right) )\), where \(\pi _{D}\) is the projector onto D. Even though it is crucial that negative values of \(\widehat{\textbf{x}}_{k|k}\) and \(\widehat{\textbf{x}}_{k|k-1}\) are retained in the recursion (29) to fit with the analytic recursion (30) [5], the estimate that must be considered in physical applications is its thresholded version \(\pi _D\left( \widehat{\textbf{x}}_{k|k}\right)\) (as in Sect. 4), as illustrated in the sketch below.
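A minimal sketch of one iteration of (28)–(30); `cov_v_fn` is a placeholder (hypothetical) routine building \(\widehat{{\textbf{C}}}_{{\textbf{v}}_k}\) from (16)–(17), and the pseudo-inverse is an implementation choice, not prescribed by the text, to accommodate the possible rank deficiency induced by the conjugate-stacked measurements (7):

```python
import numpy as np

def kf_step(x_prev, P_prev, y, F, H, v_a, cov_v_fn):
    """One iteration of Eqs. (28)-(30).

    cov_v_fn: hypothetical user routine mapping a thresholded state
    to the estimated noise covariance C_v of Eqs. (16)-(17).
    """
    x_pred = F @ x_prev                              # state prediction, Eq. (29)
    P_pred = F @ P_prev @ F.T                        # F is real (state model (5))
    C_v = cov_v_fn(np.maximum(x_pred, 0.0))          # pi_D(x_pred): threshold keeps C_v valid
    S = H @ P_pred @ H.conj().T + C_v                # innovation covariance
    K = P_pred @ H.conj().T @ np.linalg.pinv(S)      # Kalman gain (pinv for rank deficiency)
    innovation = y - (H @ x_pred + v_a)              # measurement prediction, Eq. (29)
    x_upd = x_pred + (K @ innovation).real           # real up to round-off by the stacking (7)
    P_upd = ((np.eye(x_prev.size) - K @ H) @ P_pred).real
    return x_upd, P_upd
```

Note that `x_upd` itself is kept unthresholded; the projection enters only through the noise covariance estimate, matching the discussion above.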

3.4 Performance analysis

Since the measurement noise covariance matrix \({{\textbf{C}}}_{{\textbf{v}}_k}\) involves the actual state to be estimated and is thus unknown, the proposed KF is misspecified. The performance obtained by applying the designed KF to the actual signal model is suboptimal with respect to the MSE, i.e., compared to what the KF could ideally achieve with a perfect knowledge of the system (i.e., when \({{\textbf{C}}}_{{\textbf{v}}_k}\) is known). Following the discussion in [13], the performance obtained by applying the designed KF to the actual signal model is

$$\begin{aligned} {\textbf{P}}_{k|k}^\text {p} = {\textbf{C}}_{{\textbf{x}}_k - \widehat{\textbf{x}}_{k|k}}, \end{aligned}$$
(31)

and can be estimated by Monte Carlo simulations.Footnote 1 The error covariance matrix \(\widehat{\textbf{P}}_{k|k}\) computed by the KF recursion (30) is an estimate of \({\textbf{P}}_{k|k}^\text {p}\). Typically, a poor-quality estimate of \({{\textbf{C}}}_{{\textbf{v}}_k}\) leads to a poor-quality performance estimate \(\widehat{\textbf{P}}_{k|k}\) and degrades the true performance \({\textbf{P}}_{k|k}^\text {p}\).

An ideal KF, corresponding to the best linear unbiased estimator (BLUE) in the MSE sense, is constructed by considering the actual measurement noise covariance \(\widehat{{\textbf{C}}}_{{\textbf{v}}_k}({{\textbf{x}}_k}) \triangleq {\textbf{C}}_{{\textbf{v}}_k}\) (not feasible in practice). The error covariance matrix of the ideal KF, denoted \({\textbf{P}}_{k|k}^\text {a}\), is computed by KF recursion and verifies that

$$\begin{aligned} {\textbf{P}}_{k|k}^\text {p}\geqslant {\textbf{P}}_{k|k}^\text {a}. \end{aligned}$$
(32)

There are different sources of misspecification that can combine and degrade KF performance. Indeed, the very first hypotheses mentioned in Sect. 2 must be verified in order for the considered LDSS model to apply: the independence properties and the validity of the observation and state-transition models (i.e., \(\textbf{A}_k\) and \(\textbf{F}_k\)). Systematic tests can be applied in order to verify the validity of these hypotheses. Besides this, the purpose of this work lies in the analytical expression of \(\textbf{C}_{\textbf{v}_k}\) and its usage in the KF: using \(\widehat{{\textbf{C}}}_{{\textbf{v}}_k}(\pi _{D}\left( \widehat{{\textbf{x}}}_{k|k-1}\right) )\) constructed from (16) improves performance compared with a default model of \(\textbf{C}_{\textbf{v}_k}\) such as \(\sigma ^2\textbf{I}_{2M^2}\) (it also improves the accuracy of \(\widehat{\textbf{P}}_{k|k}\)). This is especially true for smaller sample sizes, since estimation errors of \({\textbf{C}}_{{\textbf{z}}_k^{*}\otimes {\textbf{z}}_k}\) become more significant in the expression of \(\textbf{C}_{\textbf{v}_k}\) (\(\textbf{C}_{\textbf{v}_k}\approx \textbf{0}\) in the asymptotic regime). Once expression (16) is used, a second source of error concerns the misspecification of \(\textbf{C}_{\textbf{n}_k}\) (e.g., the validity of the independence assumption on the coordinates of \(\textbf{n}_k\) and the kurtosis of \(\left( \textbf{n}_k\right) _m\)) and the misspecification of the signal kurtosis. The latter are also more significant for smaller sample sizes.

Although \(\widehat{\textbf{x}}_{k|k}\) is not thresholded during the KF recursion, one may consider \(\pi _{D}\left( \widehat{\textbf{x}}_{k|k}\right)\) as the final estimator. The performance of \(\pi _{D}\left( \widehat{\textbf{x}}_{k|k}\right)\) is not available analytically, but it can be estimated by Monte Carlo simulations; it can be significantly better than that of \(\widehat{\textbf{x}}_{k|k}\), in particular for small or null intensities.

3.5 Kalman filter initialization

The KF formalism supposes that the first and second order moments of \({\textbf{x}}_0\) and of \({\textbf{v}}_k\) for \(k\geqslant 0\) are known, which in the case of a deterministic state vector \({{\textbf{x}}_k}\) amounts to knowing \({\textbf{x}}_0\). However, \(\textbf{x}_0\) is unknown and must be estimated (which is the aim of the present paper); therefore, the filter must be initialized with an estimate \(\widehat{{\textbf{x}}}_{0|0}\) and its corresponding mean square error \({\textbf{P}}_{0|0}={\textbf{C}}_{\widehat{{\textbf{x}}}_{0|0}}\).

A plethora of estimators can be used [14], provided that the associated covariance is known. Among them, a linear unbiased estimator \(\widehat{{\textbf{x}}}_{0|0}\) is of particular interest since unbiasedness is preserved by the KF. It can be obtained by a Distortionless Response Filter (DRF) \({\textbf{K}}_0\) verifying \({\textbf{K}}_0 {\textbf{H}}_0 = {\textbf{I}}_{Q}\), which gives

$$\begin{aligned} \widehat{{\textbf{x}}}_{0|0}={\textbf{K}}_{0}\left( {\textbf{y}}_0-{\mathbb {E}}\left[ {\textbf{y}}_0\right] \right) ~\text {and}~{\textbf{P}}_{0|0}={\textbf{K}}_{0}{\textbf{C}}_{{\textbf{v}}_0}{\textbf{K}}_{0}^H. \end{aligned}$$
(33)

Among the DRFs, the Minimum Variance Distortionless Response Filter (MVDRF), given by

$$\begin{aligned} {\textbf{K}}_0 = \left( {\textbf{H}}^H_0{\textbf{C}}^{-1}_{{\textbf{v}}_0}{\textbf{H}}_0\right) ^{-1}{\textbf{H}}^H_0{\textbf{C}}^{-1}_{{\textbf{v}}_0}, \end{aligned}$$
(34)

which minimizes the covariance (hence the MSE), is used in this work.
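For reference, a direct sketch of (34) when \({\textbf{C}}_{{\textbf{v}}_0}\) is invertible (the singular case is addressed right after); the distortionless property \({\textbf{K}}_0{\textbf{H}}_0={\textbf{I}}_Q\) provides a built-in check:

```python
import numpy as np

def mvdrf_gain(H0, C_v0):
    # Eq. (34): K0 = (H0^H C^{-1} H0)^{-1} H0^H C^{-1}, assuming C_v0 invertible
    Ci_H = np.linalg.solve(C_v0, H0)
    return np.linalg.solve(H0.conj().T @ Ci_H, Ci_H.conj().T)

rng = np.random.default_rng(0)
n, Q = 12, 4                                         # illustrative dimensions
H0 = rng.standard_normal((n, Q)) + 1j * rng.standard_normal((n, Q))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
C0 = B @ B.conj().T + np.eye(n)                      # Hermitian positive definite
K0 = mvdrf_gain(H0, C0)
print(np.allclose(K0 @ H0, np.eye(Q)))               # distortionless: True
```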

However, in some practical cases (including the one under consideration), \(\textbf{C}_{{\textbf{v}}_0}\) is not invertible, and a specific form must be used. Based on a compact eigenvalue decomposition (EVD) of the measurement noise covariance and a compact singular value decomposition (SVD) of the measurement model, i.e.,

$$\begin{aligned} \textbf{C}_{\textbf{v}_{0}}\overset{\text {compact EVD}}{=}\textbf{WDW}^{H} \quad \text {and}\quad \textbf{H}_{0}\overset{\text {compact SVD}}{=}\mathbf {U\Sigma V}^{H}; \end{aligned}$$
(35)

for \(\mathbb {U}\) such that \(\left[ \textbf{U}~\mathbb {U}\right]\) forms an orthonormal basis of \(\mathbb {R}^{2M^{2}}\) and

$$\begin{aligned} \textbf{W}^{H}\mathbb {U}\overset{\text {compact SVD}}{=}\underline{\textbf{U}} \underline{\mathbf {\Sigma }}\underline{\textbf{V}}^{H}, \end{aligned}$$
(36)

Appendix 4 shows that the MVDRF writes

$$\begin{aligned} \textbf{K}_{0}=\textbf{V}\mathbf {\Sigma }^{-1}\textbf{U}^{H} - \textbf{V}\mathbf {\Sigma }^{-1}\textbf{U}^H\textbf{W}\textbf{D}\underline{\textbf{U}}\left( \underline{\textbf{U}}^{H}\textbf{D}\underline{\textbf{U}}\right) ^{-1}\underline{\mathbf {\Sigma }}^{-1}\underline{\textbf{V}}^H\mathbb {U}^H. \end{aligned}$$
(37)

It is clear from (33) that any DRF implementation requires knowledge of the observation noise covariance matrix \({\textbf{C}}_{{\textbf{v}}_0}\), which, due to the unconventional construction of the KF, depends on the current state \({\textbf{x}}_0\). As for \(k\geqslant 1\), the KF must be computed by substituting an estimate \(\widehat{{\textbf{C}}}_{{\textbf{v}}_0}\) for \({\textbf{C}}_{{\textbf{v}}_0}\).

In the following results, the observation noise covariance matrix \(\widehat{{\textbf{C}}}_{{\textbf{v}}_0}\) injected into the MVDRF initialization is built from the beamforming estimate

$$\begin{aligned} \widehat{{\textbf{x}}}_{0|0}^\text {BF} = {\text {diag}}\left( {\textbf{A}}_k^H \left( \widehat{\textbf{C}}_{{\textbf{z}}_0} - {\textbf{C}}_{{\textbf{n}}_0}\right) {\textbf{A}}_k\right) \odot {\text {diag}}\left( {\textbf{A}}_k^* {\textbf{A}}_k\right) ^{-1} \end{aligned}$$
(38)

where \(\odot\) is the Hadamard product. By construction,

$$\begin{aligned} \mathbb {E}\left[ \left( \widehat{\textbf{x}}_{0|0}^{\text {BF}}\right) _q\right] =\left( {\textbf{x}_0}\right) _q+\sum \limits _{q^{\prime }=1,q^{\prime }\ne q}^{Q}\left( {\textbf{x}_0}\right) _{q^{\prime }}\frac{\left| \textbf{a}_{q}^{H}\textbf{a}_{q^{\prime }}\right| ^{2}}{\left| \textbf{a}_{q}^{H}\textbf{a}_{q}\right| ^{2}}\geqslant \left( {\textbf{x}_0}\right) _q \end{aligned}$$
(39)

for \(q=1,\ldots ,Q\); \(\left( \widehat{\textbf{x}}_{0|0}^{\text {BF}}\right) _q\) can thus be considered as an unbiased estimator of an upper bound of \(\left( {\textbf{x}_0}\right) _q\), and in practice \(\left( \widehat{\textbf{x}}_{0|0}^{\text {BF}}\right) _q\gg \left( {\textbf{x}_0}\right) _q\). Then, from Appendix 5, one obtains

$$\begin{aligned} \widehat{{\textbf{C}}}_{{\textbf{v}}_0}\left( \widehat{{\textbf{x}}}_{0|0}^\text {BF}\right) \geqslant {\textbf{C}}_{{\textbf{v}}_0}, \end{aligned}$$
(40)

which implies that the performance of the KF is upper bounded by \(\widehat{\textbf{P}}_{0|0}\) [13] (i.e., the KF estimate \(\widehat{\textbf{P}}_{0|0}\) is pessimistic). In particular,

$$\begin{aligned} \widehat{\textbf{P}}_{0|0}\geqslant {\textbf{P}}_{0|0}^\text {p}\geqslant {\textbf{P}}_{0|0}^\text {a}, \end{aligned}$$
(41)

which is not necessarily verified for \(k>0\).
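A sketch of the beamforming initializer follows, implementing the per-pixel reading of (38) that is consistent with the expectation (39), i.e., \((\widehat{x})_q = {\textbf{a}}_q^H(\hat{\textbf{C}}_{{\textbf{z}}_0}-{\textbf{C}}_{{\textbf{n}}_0}){\textbf{a}}_q / |{\textbf{a}}_q^H{\textbf{a}}_q|^2\); the helper name `beamforming_init` and this reading of (38) are assumptions for illustration:

```python
import numpy as np

def beamforming_init(A, C_hat, C_n):
    # Per-pixel beamformer: (x_BF)_q = a_q^H (C_hat - C_n) a_q / |a_q^H a_q|^2
    num = np.einsum('mq,mn,nq->q', A.conj(), C_hat - C_n, A).real
    den = np.sum(np.abs(A) ** 2, axis=0) ** 2        # |a_q^H a_q|^2
    return num / den

# Heuristic error covariance used with this initializer in Sect. 4.1:
# P_00 = 2 * np.diag(beamforming_init(A, C_hat, C_n) ** 2)
```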

4 Application to dynamic imaging for spacecraft navigation

As a contribution to resection for spacecraft navigation, we tackle the dynamic star image estimation problem based on radio interferometry, i.e., imaging of radio source power. Indeed, as highlighted below, such radio interferometric images estimate both the power (i.e., \(x_q\)) and the DOA of a radio source (i.e., \({\textbf{u}_q}\) in (42)), and allow one to identify the DOA of radio sources of known power, provided that the power estimation is accurate enough. For instance, in order to achieve a satisfactory DOA resolution, modern radio astronomy observatories are interferometers consisting of a network of M antennas, possibly scattered around the globe. Considering their respective positions \({\textbf{r}}_m\in {\mathbb {R}}^2\), the wavelength \(\lambda\) and the gain functions \(g_m(.)\) for \(m=1,\ldots ,M\), the transfer function associated with a band-limited signal modulated by a carrier frequency \(f_c=c/\lambda\) is given by

$$\begin{aligned} {\textbf{a}}({\textbf{u}}) = \left( g_1\left( {\textbf{u}}\right) e^{2j\pi \frac{{\textbf{r}}_1^T{\textbf{u}}}{\lambda }},\ldots ,g_M\left( {\textbf{u}}\right) e^{2j\pi \frac{{\textbf{r}}_M^T{\textbf{u}}}{\lambda }}\right) ^T, \end{aligned}$$
(42)

where \({\textbf{u}}\in {\mathbb {R}}^2\) is the DOA, i.e., the unit vector oriented toward the source of the signal, and c is the speed of light in a vacuum. Given an image of Q pixels, there are Q directions of arrival \({\textbf{u}}_q\) defining the array response matrix \({\textbf{A}}=\left( {\textbf{a}}({\textbf{u}}_1),\ldots ,{\textbf{a}}({\textbf{u}}_Q)\right) {\in \mathbb {C}^{M\times Q}}\) [4]. During the \(k-\)th STI interval, the signals from the different radio sources are complex circular and mutually independent. Hereinafter, they are concatenated in a vector \({\textbf{s}}_k{\in \mathbb {C}^{Q\times 1}}\). The signal received by the network of antennas \({\textbf{z}}_k{\in \mathbb {C}^{M\times 1}}\) is modeled as (1), where \({\textbf{n}}_k{\in \mathbb {C}^{M\times 1}}\) is the measurement noise of the interferometer, which is complex circular, Gaussian distributed and independent of the source signals \({\textbf{s}}_k\). The observed visibility matrix, i.e., the SCM \(\hat{\textbf{C}}_{{\textbf{z}}_k}{\in \mathbb {C}^{M\times M}}\), is computed as in (2) from N independent and identically distributed realizations of \({\textbf{z}}_k\).

The proposed formalization of the KF provides an estimate of the time-varying power of the radio sources based on the measurements \(\hat{\textbf{C}}_{{\textbf{z}}_{l}}\) for \(0\leqslant l\leqslant k\), in the case of the deterministic state model (5), and can be computed iteratively. The source power estimation can be used to identify and track landmarks of known power, their DOAs being used for resection.

4.1 Results

The considered network has a “Y” shape, similar to the Karl G. Jansky Very Large Array (VLA) observatory [15], and is composed of 27 independent antennas divided into three branches. The method is evaluated on a synthetic image composed of \(Q=22\times 22\) pixels, with a state-transition model \({\textbf{F}}_{k}{\in \mathbb {R}^{Q\times Q}}\) being a rotation matrix which corresponds to a rotation of the image by a fixed angle of \(90\,\text {deg}\) between consecutive STI intervals; \({\textbf{F}}_{k}\) is supposed to be known from an inertial motion sensor (a sketch of its construction is given below). Without loss of generality, all the gains \(g_m\) are set equal to 1. Different sample sizes are considered for the SCM construction, namely \(N=10^5\) and \(N=10^3\) samples of bivariate signals \({\textbf{s}}_k\) and Gaussian noises \({\textbf{n}}_k\). A constant normalized noise covariance matrix is considered, \({\textbf{C}}_{{\textbf{n}}_k}\triangleq {\textbf{I}}_M\). The signals have independent real and imaginary parts following a Laplace distribution, i.e., \(\rho _{s_q}=3/2\). The wavelength is set to \(\lambda \triangleq 1\), which amounts to expressing the antenna coordinates as a function of the wavelength.
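For a 90-degree rotation, F reduces to a permutation matrix acting on the vectorized image; a minimal sketch (row-major vectorization chosen for simplicity, whereas the paper's vec is column-stacking):

```python
import numpy as np

n_side = 22                                # image side, Q = n_side^2 pixels
Q = n_side * n_side
flat = np.arange(Q).reshape(n_side, n_side)
perm = np.rot90(flat).ravel()              # destination pixel -> source pixel index

F = np.zeros((Q, Q))
F[np.arange(Q), perm] = 1.0                # F @ vec(img) = vec(rot90(img))

img = np.arange(Q, dtype=float)
assert np.allclose(F @ img, np.rot90(img.reshape(n_side, n_side)).ravel())
assert np.allclose(F @ F.T, np.eye(Q))     # permutation matrix: orthogonal, invertible
```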

Fig. 1: MSE lower bound \(\text {MSE}^a\left( \widehat{\textbf{x}}_{k|k}\right)\) (black circles), MSE predicted by the KF \(\widehat{\text {MSE}}\left( \widehat{\textbf{x}}_{k|k}\right)\) (pink boxes), MSE actually achieved \(\text {MSE}^p\left( \widehat{\textbf{x}}_{k|k}\right)\) (blue crosses) by the proposed KF, and MSE achieved by the thresholded estimates \(\text {MSE}^p\left( \pi _{\mathbb {R}^{{Q\times 1}}_+}\left( \widehat{\textbf{x}}_{k|k}\right) \right)\) (green plus signs), for a \(N=10^5\) and b \(N=10^3\) samples. Initialization was performed with an MVDRF

The performance of the KF is illustrated in Figs. 1 and 2 in terms of the MSE, defined as

$$\begin{aligned} \text {MSE}^p\left( \widehat{\textbf{x}}_{k|k}\right) =\mathbb {E}\left( \left\| \textbf{x}_k - \widehat{\textbf{x}}_{k|k} \right\| _2^2\right) . \end{aligned}$$
(43)

Considered images and KF estimates are displayed in Fig. 3. The assumed KF performance obtained from the KF recursion, \(\widehat{\text {MSE}}\left( \widehat{\textbf{x}}_{k|k}\right) ={\text {trace}}\left( \mathbb {E}\left[ \widehat{\textbf{P}}_{k|k}\right] \right)\), the true KF performance \(\text {MSE}^p\left( \widehat{\textbf{x}}_{k|k}\right) ={\text {trace}}\left( {\textbf{P}}_{k|k}^p\right)\) (computed by Monte Carlo simulations) and the lower bound \(\text {MSE}^a\left( \widehat{\textbf{x}}_{k|k}\right) ={\text {trace}}\left( {\textbf{P}}_{k|k}^a\right)\) (computed by the KF with \(\widehat{{\textbf{C}}}_{{\textbf{v}}_k}({{\textbf{x}}_k}) \triangleq {\textbf{C}}_{{\textbf{v}}_k}\)) are given with respect to the number of iterations. It is observed that for such a reasonable configuration, \(\widehat{\text {MSE}}\left( \widehat{\textbf{x}}_{k|k}\right) \geqslant {\text {MSE}}^p\left( \widehat{\textbf{x}}_{k|k}\right) \geqslant {\text {MSE}}^a\left( \widehat{\textbf{x}}_{k|k}\right)\) for \(k\geqslant 0\), although this was proved only for \(k=0\). The three quantities appear to converge toward each other over time. \(\text {MSE}^p\left( \pi _{D}\left( \widehat{\textbf{x}}_{k|k}\right) \right)\) is presented for \(D = \mathbb {R}_+^{{Q\times 1}}\), illustrating that the truncated estimator can achieve better performance than the ideal KF.

Fig. 2: KF performances with a beamforming initialization for different image sizes: \(Q=22\times 22\) (a) and \(Q=30\times 30\) (b). The MSE lower bound \(\text {MSE}^a\left( \widehat{\textbf{x}}_{k|k}\right)\) (black circles), the MSE predicted by the KF \(\widehat{\text {MSE}}\left( \widehat{\textbf{x}}_{k|k}\right)\) (pink boxes), the MSE actually achieved \(\text {MSE}^p\left( \widehat{\textbf{x}}_{k|k}\right)\) (blue crosses) by the proposed KF, and the MSE achieved by the thresholded estimates \(\text {MSE}^p\left( \pi _{\mathbb {R}^{{Q\times 1}}_+}\left( \widehat{\textbf{x}}_{k|k}\right) \right)\) (green plus signs) are given for \(N=10^3\)

Figures 1 and 3 present the results for \(N=10^3\) and \(N=10^5\). Notably, the KF can reach the same performance with a smaller sample size at the cost of a larger number of iterations. For \(N=10^5\), the true KF MSE is below \(-50\,\text {dB}\) for \(k\geqslant 3\), while this level is only reached for \(k\geqslant 120\) with \(N=10^3\). Indeed, performance improves with the number of samples, i.e., as the SCM \(\widehat{\textbf{C}}_{\textbf{z}_k}\) converges to \(\textbf{C}_{\textbf{z}_k}\). The effect of the sample size N is clearly visible in the expression of the observation noise covariance matrix (16): increasing N reduces the variance of the observation noise \(\textbf{v}_k\), and therefore reduces the estimation error.

Fig. 3: Actual images (left) and KF estimates \(\pi _{{\mathbb {R}}_+^Q}\left( \widehat{{\textbf{x}}}_{k|k}\right)\) at \(k=0,1,5\) and 30 (b, c, d and e respectively) for \(N=10^5\) (middle) and \(N=10^3\) (right) samples. The beamforming estimation (a) is used to initialize \(\widehat{\textbf{C}}_{{\textbf{v}}_0}\) for the MVDRF estimation (b)

The MVDRF is limited to a low number of pixels (it can handle up to \(M(M-2)=675\) pixels). Other estimators could be used in a more general context, e.g., \(\widehat{{\textbf{x}}}_{0|0}\triangleq \widehat{{\textbf{x}}}_{0|0}^\text {BF}\). Performances of the latter are illustrated in Figs. 2 and 4. The associated error covariance matrix expression is not known, and \(\widehat{\textbf{P}}_{0|0}\triangleq 2{\text {diag}}\left( \widehat{{\textbf{x}}}_{0|0}\odot \widehat{{\textbf{x}}}_{0|0}\right)\) is considered instead. Two sets of directions of arrival are considered: the same as previously with \(Q=22\times 22\), and one with a better resolution of \(Q=30\times 30\). While there is no performance lower bound for \(Q=30\times 30\) since \(Q> M(M-2)\), it is shown that the KF still improves along the recursion.

A key property for the KF to converge is the validity of \({\textbf{P}}_{k|k}\). For instance, the proposed filter may not work for high resolution images, as the observation model \(\textbf{H}_k\) becomes too badly conditioned, corrupting the estimation of \({\textbf{P}}_{k|k}\) at each step (the number of directions of arrival that can be considered depends on the number of antennas).

Fig. 4: Actual images (left) and KF estimates \(\pi _{{\mathbb {R}}_+^Q}\left( \widehat{{\textbf{x}}}_{k|k}\right)\) (right) at \(k=0,1,5\) and 30 (a–d respectively) for \(N=10^3\) samples

5 Conclusion and perspectives

Autonomous spacecraft navigation relies on the availability of trustworthy measurements from which reference elements are tracked. For instance, radio interferometric images can be used to identify known radio sources from their power, provided that the estimation is accurate enough. In this context, this work proposes a formalization of the Kalman filter for the dynamic estimation of radio source power based on empirical covariance measurements, considering arbitrary signal and noise distributions. The proposed filter is misspecified but preserves the structure of the observation noise covariance matrix during the recursion. As such, the state and the filter accuracy are estimated jointly. It is shown that the expression of the observation noise covariance matrix involves the multivariate kurtosis of the source signals and noise. An application to simulated data representative of a dynamic radio interferometric imaging framework was presented, highlighting the applicability of the proposed filter. Provided that the observation model is well conditioned, one can compute a lower performance bound toward which the predicted and true filter performances converge along the iterations. Future work will focus on state-transition models with additive state noise.

Availability of data and materials

Not applicable.

Code availability

The code used to generate the presented results is available from the corresponding author upon request.

Notes

  1. Note that the recursive estimation procedure of \({\textbf{P}}_{k|k}^\text {p}\) presented in [13] is not tractable in practice since \({{\textbf{C}}}_{{\textbf{v}}_k}\) is unknown.

References

  1. S. Henry, J.A. Christian, Absolute triangulation algorithms for space exploration. J. Guid. Control. Dyn. 46(1), 21–46 (2023). https://doi.org/10.2514/1.g006989


  2. J.L. Poirot, G.V. McWilliams, Navigation by back triangulation. IEEE Trans. Aerosp. Electron. Syst. 12(2), 270–274 (1976). https://doi.org/10.1109/TAES.1976.308304


  3. M. Driedger, P. Ferguson, Feasibility study of an orbital navigation filter using resident space object observations. J. Guid. Control. Dyn. 44(3), 622–628 (2021). https://doi.org/10.2514/1.G005210


  4. A.-J. van der Veen, S.J. Wijnholds, A.M. Sardarabadi, Signal processing for radio astronomy, in Handbook of Signal Processing Systems, ed. by S.S. Bhattacharyya, E.F. Deprettere, R. Leupers, J. Takala (Springer, Cham, 2019), pp. 311–360. https://doi.org/10.1007/978-3-319-91734-4_9

  5. J.L. Crassidis, J.L. Junkins, Optimal Estimation of Dynamic Systems (2nd Ed.). Chapman and Hall/CRC (2011). https://doi.org/10.1201/b11154

  6. P.S.R. Diniz, Adaptive Filtering: Algorithms and Practical Implementation, 4th edn. (Springer, Berlin, 2013)

  7. P.D. Groves, Principles of GNSS, Inertial, and Multisensor Integrated Navigation Systems (Second Edition, Artech House, 2013)


  8. I. Arasaratnam, S. Haykin, Cubature Kalman filters. IEEE Trans. Autom. Control 54(6), 1254–1269 (2009). https://doi.org/10.1109/TAC.2009.2019800


  9. E. Chaumette, J. Vilá-Valls, F. Vincent, On the general conditions of existence for linear MMSE filters: Wiener and Kalman. Signal Process. 184, 108052 (2021). https://doi.org/10.1016/j.sigpro.2021.108052


  10. J. Vilá-Valls, E. Chaumette, F. Vincent, P. Closas, Robust linearly constrained Kalman filter for general mismatched linear state-space models. IEEE Trans. Autom. Control 67(12), 6794–6801 (2022). https://doi.org/10.1109/TAC.2021.3132890


  11. K.V. Mardia, Measures of multivariate skewness and kurtosis with applications. Biometrika 57(3), 519–530 (1970). https://doi.org/10.1093/biomet/57.3.519


  12. M.C. Vanderveen, B.C. Ng, C.B. Papadias, A. Paulraj, Joint angle and delay estimation (JADE) for signals in multipath environments, in Conference Record of the Thirtieth Asilomar Conference on Signals, Systems and Computers, pp. 1250–1254 (1996). https://doi.org/10.1109/ACSSC.1996.599145

  13. B.D.O. Anderson, J.B. Moore, Optimal Filtering (Prentice-Hall, Information and system sciences series, 1979)


  14. H.L. Van Trees, Optimum Array Processing: Part IV of Detection, Estimation, and Modulation Theory. Wiley, New York (2002). https://doi.org/10.1002/0471221104.ch2

  15. P.J. Napier, A.R. Thompson, R.D. Ekers, The Very Large Array: design and performance of a modern synthesis radio telescope. Proc. IEEE 71(11), 1295–1320 (1983). https://doi.org/10.1109/PROC.1983.12765



Acknowledgements

Not applicable

Funding

This work was partially supported by the DGA/AID project 2021.65.0070.

Author information


Contributions

C.C. and N.A. designed the algorithm; E.C. is at the origin of the conceptualization; C.C. and E.C. wrote the manuscript and developed the methodology; C.C. completed simulation experiments; N.A., P.L., N.E.K. and I.V. provided suggestions for code and article details.

Corresponding author

Correspondence to Cyril Cano.

Ethics declarations

Competing interests

The authors declare that they have no conflict of interest.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Estimation of real-valued state with complex measurements

Let us consider a linear observation model

$$\begin{aligned} \textbf{y}={\textbf{H}}{\textbf{x}}+\textbf{v},\quad \textbf{y},\textbf{v}\in \mathbb {C} ^{N},\quad \textbf{x}\in \mathbb {R} ^{P}. \end{aligned}$$
(A1)

A complex, affine estimator of \(\textbf{x}\) writes \(\widehat{\textbf{x}}={\textbf{K}}{\textbf{y}}+\textbf{a}\) with \(\widehat{ \textbf{x}}\in \mathbb {C} ^{P}\), \(\textbf{K}\in \mathbb {C} ^{P\times N}\), \(\textbf{a}\in \mathbb {C} ^{P}\) and

$$\begin{aligned} {\textbf{K}}{\textbf{y}} =\left( \textbf{K}_{r}+j\textbf{K}_{j}\right) \left( \textbf{y} _{r}+j\textbf{y}_{j}\right) =\left( \textbf{K}_{r}\textbf{y}_{r}-\textbf{K} _{j}\textbf{y}_{j}\right) +j\left( \textbf{K}_{r}\textbf{y}_{j}+\textbf{K} _{j}\textbf{y}_{r}\right) \end{aligned}$$

and thus

$$\begin{aligned} \left( {\textbf{K}}{\textbf{y}}+\textbf{a}\right) -\textbf{x} =\left( \textbf{K}_{r} \textbf{y}_{r}-\textbf{K}_{j}\textbf{y}_{j}+\textbf{a}_{r}-\textbf{x}\right) +j\left( \textbf{K}_{r}\textbf{y}_{j}+\textbf{K}_{j}\textbf{y}_{r}+\textbf{a} _{j}\right) . \end{aligned}$$
(A2)

The mean square error matrix writes

$$\begin{aligned} \textbf{P}\left( \textbf{K}\right)= & {} E\left[ \left( \left( {\textbf{K}}{\textbf{y}}+ \textbf{a}\right) -\textbf{x}\right) \left( \left( {\textbf{K}}{\textbf{y}}+\textbf{a} \right) -\textbf{x}\right) ^{H}\right] \nonumber \\= & {} E\left[ \begin{array}{l} \left( \left( \textbf{K}_{r}\textbf{y}_{r}-\textbf{K}_{j}\textbf{y}_{j}+ \textbf{a}_{r}-\textbf{x}\right) +j\left( \textbf{K}_{r}\textbf{y}_{j}+ \textbf{K}_{j}\textbf{y}_{r}+\textbf{a}_{j}\right) \right) \\ \times \left( \left( \textbf{K}_{r}\textbf{y}_{r}-\textbf{K}_{j}\textbf{y} _{j}+\textbf{a}_{r}-\textbf{x}\right) +j\left( \textbf{K}_{r}\textbf{y}_{j}+ \textbf{K}_{j}\textbf{y}_{r}+\textbf{a}_{j}\right) \right) ^{H} \end{array} \right] \nonumber \\= & {} E\left[ \begin{array}{l} \left( \left( \textbf{K}_{r}\textbf{y}_{r}-\textbf{K}_{j}\textbf{y}_{j}+ \textbf{a}_{r}-\textbf{x}\right) +j\left( \textbf{K}_{r}\textbf{y}_{j}+ \textbf{K}_{j}\textbf{y}_{r}+\textbf{a}_{j}\right) \right) \\ \times \left( \left( \textbf{K}_{r}\textbf{y}_{r}-\textbf{K}_{j}\textbf{y} _{j}+\textbf{a}_{r}-\textbf{x}\right) ^{T}-j\left( \textbf{K}_{r}\textbf{y} _{j}+\textbf{K}_{j}\textbf{y}_{r}+\textbf{a}_{j}\right) ^{T}\right) \end{array} \right] \nonumber \\= & {} \begin{array}{l} E\left[ \left( \textbf{K}_{r}\textbf{y}_{r}-\textbf{K}_{j}\textbf{y}_{j}+ \textbf{a}_{r}-\textbf{x}\right) \left( \textbf{K}_{r}\textbf{y}_{r}-\textbf{ K}_{j}\textbf{y}_{j}+\textbf{a}_{r}-\textbf{x}\right) ^{T}\right] \\ +E\left[ \left( \textbf{K}_{r}\textbf{y}_{j}+\textbf{K}_{j}\textbf{y}_{r}+ \textbf{a}_{j}\right) \left( \textbf{K}_{r}\textbf{y}_{j}+\textbf{K}_{j} \textbf{y}_{r}+\textbf{a}_{j}\right) ^{T}\right] \\ +j\left( \begin{array}{l} E\left[ \left( \textbf{K}_{r}\textbf{y}_{j}+\textbf{K}_{j}\textbf{y}_{r}+ \textbf{a}_{j}\right) \left( \textbf{K}_{r}\textbf{y}_{r}-\textbf{K}_{j} \textbf{y}_{j}+\textbf{a}_{r}-\textbf{x}\right) ^{T}\right] \\ -E\left[ \left( \textbf{K}_{r}\textbf{y}_{r}-\textbf{K}_{j}\textbf{y}_{j}+ \textbf{a}_{r}-\textbf{x}\right) \left( \textbf{K}_{r}\textbf{y}_{j}+\textbf{ K}_{j}\textbf{y}_{r}+\textbf{a}_{j}\right) ^{T}\right] \end{array} \right) \end{array} \end{aligned}$$
(A3)

Then \(\forall \textbf{w}\in \mathbb {R} ^{P}\)

$$\begin{aligned} \begin{aligned} \textbf{w}^{T}\textbf{P}\left( \textbf{K}\right) \textbf{w} \mathbf {=}&~~\textbf{w}^{T}E\left[ \left( \textbf{K}_{r}\textbf{y}_{r}-\textbf{K}_{j} \textbf{y}_{j}+\textbf{a}_{r}-\textbf{x}\right) \left( \textbf{K}_{r}\textbf{ y}_{r}-\textbf{K}_{j}\textbf{y}_{j}+\textbf{a}_{r}-\textbf{x}\right) ^{T} \right] \textbf{w} \\&+\textbf{w}^{T}E\left[ \left( \textbf{K}_{r}\textbf{y}_{j}+\textbf{K}_{j} \textbf{y}_{r}+\textbf{a}_{j}\right) \left( \textbf{K}_{r}\textbf{y}_{j}+ \textbf{K}_{j}\textbf{y}_{r}+\textbf{a}_{j}\right) ^{T}\right] \textbf{w} \\&+j\Big ( \textbf{w}^{T}E\left[ \left( \textbf{K}_{r}\textbf{y}_{j}+\textbf{K}_{j} \textbf{y}_{r}+\textbf{a}_{j}\right) \left( \textbf{K}_{r}\textbf{y}_{r}- \textbf{K}_{j}\textbf{y}_{j}+\textbf{a}_{r}-\textbf{x}\right) ^{T}\right] \textbf{w}\\&- \textbf{w}^{T}E\left[ \left( \textbf{K}_{r}\textbf{y}_{r}-\textbf{K}_{j} \textbf{y}_{j}+\textbf{a}_{r}-\textbf{x}\right) \left( \textbf{K}_{r}\textbf{ y}_{j}+\textbf{K}_{j}\textbf{y}_{r}+\textbf{a}_{j}\right) ^{T}\right] \textbf{w} \Big ) \\ =&E\left[ \left( \textbf{w}^{T}\left( \textbf{K}_{r}\textbf{y}_{r}-\textbf{K}_{j} \textbf{y}_{j}+\textbf{a}_{r}-\textbf{x}\right) \right) ^{2}\right] +E\left[ \left( \textbf{w}^{T}\left( \textbf{K}_{r}\textbf{y}_{j}+\textbf{K}_{j} \textbf{y}_{r}+\textbf{a}_{j}\right) \right) ^{2}\right] . \end{aligned} \end{aligned}$$
(A4)

On the other hand, one may consider the following observation model:

$$\begin{aligned} \left( \begin{array}{c} \textbf{y}_{r} \\ \textbf{y}_{j} \end{array} \right) =\left[ \begin{array}{c} \textbf{H}_{r} \\ \textbf{H}_{j} \end{array} \right] \textbf{x}+\left( \begin{array}{c} \textbf{v}_{r} \\ \textbf{v}_{j} \end{array} \right) \end{aligned}$$
(A5)

for which a real affine estimator of \(\textbf{x}\) writes \(\widehat{\textbf{x}}=\left[ \textbf{K}_{r}~\textbf{K}_{j}\right] \left( \begin{array}{c} \textbf{y}_{r} \\ \textbf{y}_{j} \end{array} \right) +\textbf{a}_{r}=\textbf{K}_{r}\textbf{y}_{r}+\textbf{K}_{j}\textbf{y}_{j}+\textbf{a}_{r}\), with \(\widehat{\textbf{x}}\in \mathbb {R} ^{P}\), \(\textbf{K}_{r},\textbf{K}_{j}\in \mathbb {R} ^{P\times N}\), \(\textbf{a}_{r}\in \mathbb {R} ^{P}\). Its mean square error matrix is

$$\begin{aligned} \textbf{P}\left( \left[ \textbf{K}_{r}~\textbf{K}_{j}\right] \right) =E\left[ \left( \textbf{K}_{r}\textbf{y}_{r}+\textbf{K}_{j}\textbf{y}_{j}+\textbf{a} _{r}-\textbf{x}\right) \left( \textbf{K}_{r}\textbf{y}_{r}+\textbf{K}_{j} \textbf{y}_{j}+\textbf{a}_{r}-\textbf{x}\right) ^{T}\right] . \end{aligned}$$

Then, \(\forall \textbf{w}\in \mathbb {R} ^{P}\),

$$\begin{aligned} \textbf{w}^{T}\textbf{P}\left( \left[ \textbf{K}_{r}~-\textbf{K}_{j}\right] \right) \textbf{w}=E\left[ \left( \textbf{w}^{T}\left( \textbf{K}_{r}\textbf{ y}_{r}-\textbf{K}_{j}\textbf{y}_{j}+\textbf{a}_{r}-\textbf{x}\right) \right) ^{2}\right] \le \textbf{w}^{T}\textbf{P}\left( \textbf{K}_{r}+j\textbf{K} _{j}\right) \textbf{w} \end{aligned}$$
(A6)

which leads to

$$\begin{aligned} \forall \textbf{w}\in \mathbb {R} ^{P},~\textbf{w}^{T}\textbf{P}\left( \left[ \textbf{K}_{r}~\textbf{K}_{j} \right] ^{b}\right) \textbf{w}\le \textbf{w}^{T}\textbf{P}\left( \left[ \textbf{K}_{r}^{b}~-\textbf{K}_{j}^{b}\right] \right) \textbf{w}\le \textbf{ w}^{T}\textbf{P}\left( \textbf{K}^{b}=\textbf{K}_{r}^{b}+j\textbf{K} _{j}^{b}\right) \textbf{w}. \end{aligned}$$
(A7)

Finally, it was proved that the performance of a linear estimator based on the concatenation of the real and imaginary parts of a complex-valued observation is upper bounded by the performance of an estimator based only on the complex measurements. The same applies to the concatenation of a complex-valued observation and its complex conjugate, as presented in this work. The demonstration derives from the relation

$$\begin{aligned} \left[ \begin{array}{c} \textbf{y}_{r} \\ \textbf{y}_{j} \end{array}\right] =\frac{1}{2}\left[ \begin{array}{cc} \mathbf {I_N}&{} \mathbf {I_N}\\ -j\mathbf {I_N}&{}j\mathbf {I_N} \end{array}\right] \left[ \begin{array}{c} \textbf{y} \\ \textbf{y}^* \end{array}\right] . \end{aligned}$$
(A8)

Appendix 2: Covariance of measurement’s vectorized SCM

This section provides the expression of \({\textbf{C}}_{{\textbf{z}}_k^{*}\otimes {\textbf{z}}_k}\) needed to implement (15). To proceed, one starts by writing

$$\begin{aligned} {\textbf{C}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}}= {\textbf{R}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}}-{\textbf{m}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}}{\textbf{m}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}}^{H}, \end{aligned}$$
(B9)

where the subscript k is dropped in order to simplify expressions, with

$$\begin{aligned} {\textbf{R}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}}={\mathbb {E}}\left[ \left( {\textbf{z}}^{*}\otimes {\textbf{z}}\right) \left( {\textbf{z}}^{*}\otimes {\textbf{z}}\right) ^H\right] , \end{aligned}$$
(B10)

and

$$\begin{aligned} {\textbf{m}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}}={\mathbb {E}}\left[ {\textbf{z}}^{*}\otimes {\textbf{z}}\right] . \end{aligned}$$
(B11)

The latter is developed from the general form of \(\textbf{z}\)

$$\begin{aligned} {\textbf{z}} =\sum _{q=1}^{Q}{\textbf{a}}_{q}s_{q}+{\textbf{n}} \end{aligned}$$
(B12)

as

$$\begin{aligned} {\textbf{m}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}} = {\text {vec}}\left( {\textbf{C}}_{\textbf{z}}\right) , \end{aligned}$$
(B13)

where

$$\begin{aligned} {\text {vec}}\left( {\textbf{C}}_{\textbf{z}}\right)&= {\text {vec}}\left( \sum \limits ^Q_{q=1}{x}_q {\textbf{a}}_q{\textbf{a}}_q^H + {\textbf{C}}_{\textbf{n}} \right) \nonumber \\&= \sum \limits ^Q_{q=1}{x}_q {\textbf{a}}_q^*\otimes {\textbf{a}}_q + {\text {vec}}\left( {\textbf{C}}_{\textbf{n}}\right) . \end{aligned}$$
(B14)
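
As a quick illustration of (B13)–(B14), one may compare a Monte Carlo estimate of \({\mathbb {E}}\left[ {\textbf{z}}^{*}\otimes {\textbf{z}}\right]\) with \({\text {vec}}\left( {\textbf{C}}_{\textbf{z}}\right)\). The sketch below assumes NumPy and uses toy steering vectors, source powers and noise covariance (none of these values come from the paper):

```python
# Monte Carlo check of m_{z* (x) z} = vec(C_z), cf. (B13)-(B14); NumPy assumed.
import numpy as np

rng = np.random.default_rng(1)
M, Q, T = 3, 2, 200_000
A = rng.standard_normal((M, Q)) + 1j * rng.standard_normal((M, Q))
A /= np.linalg.norm(A, axis=0)              # unit-norm toy steering vectors a_q
x = np.array([1.5, 0.7])                    # source powers x_q = E[|s_q|^2]
Cn = 0.1 * np.eye(M)                        # white noise covariance C_n

# Circular Gaussian sources and noise with the prescribed powers.
s = (rng.standard_normal((Q, T)) + 1j * rng.standard_normal((Q, T))) * np.sqrt(x / 2)[:, None]
n = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) * np.sqrt(0.1 / 2)
z = A @ s + n

# Samples of z* (x) z: entry i*M+j of column t is conj(z_i[t]) * z_j[t].
K = (z.conj()[:, None, :] * z[None, :, :]).reshape(M * M, T)
Cz = (A * x) @ A.conj().T + Cn              # C_z = sum_q x_q a_q a_q^H + C_n
assert np.allclose(K.mean(axis=1), Cz.flatten(order="F"), atol=5e-2)
```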

The former is developed from

$$\begin{aligned} {\textbf{z}}^{*}\otimes {\textbf{z}} =\sum \limits _{q=1}^{Q}\sum \limits _{q^{\prime }=1}^{Q}{\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q^{\prime }}\left( s_{q}^{*}s_{q^{\prime }}\right) +{\textbf{n}}^{*}\otimes {\textbf{n}}\\ +\sum \limits _{q^{\prime }=1}^{Q}\left( s_{q^{\prime }}\textbf{n}^{*}\right) \otimes {\textbf{a}}_{q^{\prime }}+\sum \limits _{q=1}^{Q}{\textbf{a}}_{q}^{*}\otimes \left( s_{q}^{*}{\textbf{n}}\right) , \end{aligned}$$
(B15)

which, given that \({\textbf{s}}\) and \({\textbf{n}}\) are independent and composed of Q and M circular random variables, respectively, leads to

$$\begin{aligned} \begin{aligned} \textbf{R}_{\textbf{z}^{*}\otimes \textbf{z}}=&\sum \limits _{l=1}^{Q}\sum \limits _{l^{\prime }=1}^{Q}\sum \limits _{q=1}^{Q}\sum \limits _{q^{\prime }=1}^{Q}\left( \textbf{a}_{q}^{*}\otimes \textbf{a}_{q^{\prime }}\right) \left( \textbf{a}_{l}^{*}\otimes \textbf{a}_{l^{\prime }}\right) ^{H}E\left[ s_{q}^{*}s_{l}s_{q^{\prime }}s_{l^{\prime }}^{*}\right] +\left( \sum \limits _{q=1}^{Q}x_{q}\,\textbf{a}_{q}^{*}\otimes \textbf{a}_{q}\right) E\left[ \textbf{n}^{*}\otimes \textbf{n}\right] ^{H} \\&+ \sum \limits _{q=1}^{Q}x_{q}E\left[ \left( \textbf{n}^{*}\otimes \textbf{a}_{q}\right) \left( \textbf{n}^{*}\otimes \textbf{a} _{q}\right) ^{H}\right] +\sum \limits _{q=1}^{Q}x_{q}E\left[ \left( \textbf{n}^{*}\otimes \textbf{a}_{q}\right) \left( \textbf{a} _{q}^{*}\otimes \textbf{n}\right) ^{H}\right] \\&+\sum \limits _{q=1}^{Q}x_{q}E\left[ \left( \textbf{a}_{q}^{*}\otimes \textbf{n}\right) \left( \textbf{n}^{*}\otimes \textbf{a} _{q}\right) ^{H}\right] +\sum \limits _{q=1}^{Q}x_{q}E\left[ \left( \textbf{a}_{q}^{*}\otimes \textbf{n}\right) \left( \textbf{a}_{q}^{*}\otimes \textbf{n}\right) ^{H}\right] \\&+ E\left[ \textbf{n}^{*}\otimes \textbf{n}\right] \left( \sum \limits _{q=1}^{Q}x_{q}\,\textbf{a}_{q}^{*}\otimes \textbf{a}_{q}\right) ^{H}+E\left[ \left( \textbf{n}^{*}\otimes \textbf{n} \right) \left( \textbf{n}^{*}\otimes \textbf{n}\right) ^{H}\right] \end{aligned} \end{aligned}$$
(B16)

where \(x_q\triangleq \mathbb {E}\left[ \left| s_q\right| ^2\right]\). Since

$$\begin{aligned} E\left[ s_{q}^{*}s_{l}s_{q^{\prime }}s_{l^{\prime }}^{*}\right] = \left\{ \begin{array}{ll} x_q x_{q^\prime } &{},\,\text {if}\; q=l\ne q^\prime =l^\prime ;\\ x_q x_l &{},\,\text {if}\; q=q^\prime \ne l=l^\prime ;\\ \mathbb {E}\left[ \left| s_q\right| ^4\right] &{},\,\text {if}\; q=l=q^\prime =l^\prime ;\\ 0 &{},\,\text {elsewhere}, \end{array} \right. \end{aligned}$$
(B17)

and

$$\begin{aligned} E\left[ \left( \textbf{n}^{*}\otimes \textbf{a}_{q}\right) \left( \textbf{n}^{*}\otimes \textbf{a}_{q}\right) ^{H}\right]&= E\left[ {\textbf{n}}{\textbf{n}}^{H}\right] ^{*}\otimes \textbf{a}_{q}\textbf{a}_{q}^{H} = \textbf{C}_{\textbf{n}}^{*}\otimes \textbf{a}_{q}\textbf{a}_{q}^{H}, \\ E\left[ \left( \textbf{n}^{*}\otimes \textbf{a}_{q}\right) \left( \textbf{a}_{q}^{*}\otimes \textbf{n}\right) ^{H}\right]&= E\left[ \left( \textbf{n}\otimes \textbf{a}_{q}^{*}\right) \left( \textbf{a} _{q}^{*}\otimes \textbf{n}\right) ^{T}\right] ^{*} = \textbf{0}, \\ E\left[ \left( \textbf{a}_{q}^{*}\otimes \textbf{n}\right) \left( \textbf{n}^{*}\otimes \textbf{a}_{q}\right) ^{H}\right]&= E\left[ \left( \textbf{a}_{q}^{*}\otimes \textbf{n}\right) \left( \textbf{n} \otimes \textbf{a}_{q}^{*}\right) ^{T}\right] = \textbf{0}, \\ E\left[ \left( \textbf{a}_{q}^{*}\otimes \textbf{n}\right) \left( \textbf{a}_{q}^{*}\otimes \textbf{n}\right) ^{H}\right]&= \textbf{a} _{q}^{*}\textbf{a}_{q}^{T}\otimes E\left[ {\textbf{n}}{\textbf{n}}^{H}\right] = \left( \textbf{a}_{q}\textbf{a}_{q}^{H}\right) ^{*}\otimes \textbf{C}_{\textbf{n}}, \end{aligned}$$

one has that

$$\begin{aligned} \begin{aligned} {\textbf{R}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}}=&\sum \limits _{q=1}^{Q}\sum \limits _{q^{\prime }=1}^{Q}\left( {\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q^{\prime }}\right) \left( {\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q^{\prime }}\right) ^{H}{x_q}x_{q^{\prime }}\\&+\sum \limits _{q=1}^{Q}\sum \limits _{q^{\prime }=1}^{Q}\left( {\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q}\right) \left( {\textbf{a}}_{q^{\prime }}^{*}\otimes {\textbf{a}}_{q^{\prime }}\right) ^{H}x_{q^{\prime }}{x_q}\\&+ \sum \limits _{q=1}^{Q}\left( {\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q}\right) \left( {\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q}\right) ^{H}\left( {\mathbb {E}}\left[ \left( s_{q}^{*}s_{q}\right) ^{2}\right] -2{x_q} ^{2}\right) \\&+ \left( \sum \limits _{q=1}^{Q}{x_q}{\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q}\right) {\text {vec}}\left( {\textbf{C}}_{{\textbf{n}}}\right) ^{H} +\left( \left( \sum \limits _{q=1}^{Q}{x_q}{\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q}\right) {\text {vec}}\left( {\textbf{C}}_{{\textbf{n}}}\right) ^{H}\right) ^{H}\\&+ {\textbf{C}}_{{\textbf{n}}}^{*}\otimes \left( \sum \limits _{q=1}^{Q}{x_q}{\textbf{a}}_{q}{\textbf{a}}_{q}^{H}\right) +\left( \sum \limits _{q=1}^{Q}{x_q}{\textbf{a}}_{q}{\textbf{a}}_{q}^{H}\right) ^{*}\otimes {\textbf{C}}_{{\textbf{n}}}\\&+{\mathbb {E}}\left[ \left( {\textbf{n}}^{*}\otimes {\textbf{n}}\right) \left( {\textbf{n}}^{*}\otimes {\textbf{n}}\right) ^{H}\right] . \end{aligned} \end{aligned}$$
(B18)

It results from (B9) that

$$\begin{aligned} \begin{aligned} {\textbf{C}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}} =&\sum \limits _{q=1}^{Q}\sum \limits _{q^{\prime }=1}^{Q}\left( {\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q^{\prime }}\right) \left( {\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q^{\prime }}\right) ^{H}{x_{q} }x_{q^{\prime }}\\&+ \sum \limits _{q=1}^{Q}\left( {\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q}\right) \left( {\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q}\right) ^{H}\rho _{s_q}{x_{q} } ^{2} \\&+ {\textbf{C}}_{{\textbf{n}}}^{*}\otimes \left( \sum \limits _{q=1}^{Q}x_{q}{\textbf{a}}_{q}{\textbf{a}}_{q}^{H}\right) +\left( \sum \limits _{q=1}^{Q}x _{q}{\textbf{a}}_{q}{\textbf{a}}_{q}^{H}\right) ^{*}\otimes {\textbf{C}}_{{\textbf{n}}} + {\textbf{C}}_{{\textbf{n}^*\otimes \textbf{n}}} \end{aligned} \end{aligned}$$
(B19)

where \(\rho _{s_q}\) is defined in (18). On the other hand, remarking that

$$\begin{aligned} \begin{aligned} {\textbf{C}}_{{\textbf{z}}}^{T}\otimes {\textbf{C}}_{{\textbf{z}}} =&\sum \limits _{q=1}^{Q}\sum \limits _{q^{\prime }=1}^{Q}{x_{q} }x_{q^{\prime }}\left( {\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q^{\prime }}\right) \left( {\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q^{\prime }}\right) ^{H}\\&+\left( \sum \limits _{q=1}^{Q}{x_{q} }{\textbf{a}}_{q}^{*}{\textbf{a}}_{q}^{T}\right) \otimes {\textbf{C}}_{{\textbf{n}}}\\&+{\textbf{C}}_{{\textbf{n}}}^{T}\otimes \left( \sum \limits _{q=1}^{Q}{x_{q} }{\textbf{a}}_{q}{\textbf{a}}_{q}^{H}\right) +{\textbf{C}}_{{\textbf{n}}}^{T}\otimes {\textbf{C}}_{{\textbf{n}}} \end{aligned} \end{aligned}$$
(B20)

leads to the final equation:

$$\begin{aligned} \begin{aligned} {\textbf{C}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}} =&{\textbf{C}}_{{\textbf{z}}}^{T}\otimes {\textbf{C}}_{{\textbf{z}}} + \sum \limits _{q=1}^{Q}\left( {\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q}\right) \left( {\textbf{a}}_{q}^{*}\otimes {\textbf{a}}_{q}\right) ^{H}\rho _{s_q}{x_{q} } ^{2}\\&+ \left( \textbf{C}_{\textbf{n}^{*}\otimes \textbf{n}}-\textbf{C}_{\textbf{n}}^{T}\otimes \textbf{C}_{\textbf{n}}\right) . \end{aligned} \end{aligned}$$
(B21)
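
When both the sources and the noise are circular Gaussian, \(\rho _{s_q}=0\) and (C26) below gives \({\textbf{C}}_{{\textbf{n}}^{*}\otimes {\textbf{n}}}={\textbf{C}}_{{\textbf{n}}}^{T}\otimes {\textbf{C}}_{{\textbf{n}}}\), so that (B21) reduces to \({\textbf{C}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}}={\textbf{C}}_{{\textbf{z}}}^{T}\otimes {\textbf{C}}_{{\textbf{z}}}\). A Monte Carlo sketch of this special case (NumPy assumed; dimensions, powers and sample size are toy values, not from the paper):

```python
# Monte Carlo check of (B21) in the all-Gaussian case; NumPy assumed.
import numpy as np

rng = np.random.default_rng(2)
M, Q, T = 3, 2, 500_000
A = rng.standard_normal((M, Q)) + 1j * rng.standard_normal((M, Q))
A /= np.linalg.norm(A, axis=0)                 # unit-norm toy steering vectors
x = np.array([1.0, 0.5])                       # source powers
s = (rng.standard_normal((Q, T)) + 1j * rng.standard_normal((Q, T))) * np.sqrt(x / 2)[:, None]
n = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) * np.sqrt(0.05)
z = A @ s + n                                  # complex noise power 0.1 per element

K = (z.conj()[:, None, :] * z[None, :, :]).reshape(M * M, T)   # samples of z* (x) z
C_emp = np.cov(K)                              # empirical C_{z* (x) z}
Cz = (A * x) @ A.conj().T + 0.1 * np.eye(M)    # theoretical C_z
assert np.allclose(C_emp, np.kron(Cz.T, Cz), atol=5e-2)
```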

Appendix 3: Covariance of a vectorized SCM

Let \(\textbf{n}\) be a vector of centered and independent complex circular random variables \(n_m\) for \(m=1,\ldots ,M\); then for \(\left( \textbf{e}_1,\ldots ,\textbf{e}_{M^2}\right)\) being the canonical basis of \(\mathbb {R}^{M^2}\), one has that

$$\begin{aligned} \begin{aligned} \textbf{e}_{m+M\left( m^{\prime }-1\right) }^{T}\textbf{C}_{\textbf{n}^{*}\otimes \textbf{n}}\textbf{e}_{l+M\left( l^{\prime }-1\right) }&=\textbf{C}_{\textbf{e}_{m+M\left( m^{\prime }-1\right) }^{T}\left( \textbf{n}^{*}\otimes \textbf{n}\right) ,\textbf{e} _{l+M\left( l^{\prime }-1\right) }^{T}\left( \textbf{n}^{*}\otimes \textbf{n}\right) } \\&=\textbf{C}_{\left( \textbf{e}_{m^{\prime }}\otimes \textbf{e}_{m}\right) ^{T}\left( \textbf{n}^{*}\otimes \textbf{n}\right) ,\left( \textbf{e}_{l^{\prime }}\otimes \textbf{e}_{l}\right) ^{T}\left( \textbf{n}^{*}\otimes \textbf{n}\right) }\\&=\textbf{C}_{\left( \textbf{e}_{m^{\prime }}^{T}\textbf{n}^{*}\right) \otimes \left( \textbf{e}_{m}^{T}\textbf{n}\right) ,\left( \textbf{e}_{l^{\prime }}^{T}\textbf{n}^{*}\right) \otimes \left( \textbf{e}_{l}^{T}\textbf{n}\right) } \end{aligned} \end{aligned}$$
(C22)

and thus

$$\begin{aligned} \textbf{e}_{m+M(m^\prime -1)}^T \textbf{C}_{\textbf{n}^*\otimes \textbf{n}} \textbf{e}_{l+M(l^\prime -1)} = \mathbb {E}\left[ n_{m^\prime }^*n_{m}n_{l}^*n_{l^\prime }\right] - \mathbb {E}\left[ n_{m}n_{m^\prime }^*\right] \mathbb {E}\left[ n_{l}^*n_{l^\prime }\right] , \end{aligned}$$
(C23)

where \(m,\,m^\prime ,\,l,\,l^\prime \in \left\{ 1,\ldots ,M\right\}\).

On the one hand, for \(\textbf{n}\) being a vector of centered and Gaussian complex circular random variables, the fourth-order cumulant

$$\begin{aligned} \kappa _{i,j,k,l}=E\left[ n_{i}n_{j}^{*}n_{k}n_{l}^{*}\right] -E \left[ n_{i}n_{j}^{*}\right] E\left[ n_{k}n_{l}^{*}\right] -E\left[ n_{i}n_{l}^{*}\right] E\left[ n_{k}n_{j}^{*}\right] \end{aligned}$$
(C24)

is null, i.e., \(\kappa _{i,j,k,l}=0\), which implies that

$$\begin{aligned} E\left[ n_{i}n_{j}^{*}n_{k}n_{l}^{*}\right] =E\left[ n_{i}n_{j}^{*}\right] E\left[ n_{k}n_{l}^{*}\right] +E\left[ n_{i}n_{l}^{*}\right] E\left[ n_{k}n_{j}^{*}\right] \end{aligned}$$
(C25)

and then

$$\begin{aligned} \textbf{e}_{m+M\left( m^{\prime }-1\right) }^{T}\textbf{C}_{\textbf{n}^{*}\otimes \textbf{n}}\textbf{e}_{l+M\left( l^{\prime }-1\right) } =&\mathbb {E}\left[ n_{m}n_{l}^{*}\right] \mathbb {E}\left[ n_{l^{\prime }}n_{m^{\prime }}^{*}\right] \\ =&\left( \textbf{C}_{\textbf{n}}\right) _{m,l}\left( \textbf{C}_{\textbf{n}}\right) _{l^{\prime },m^{\prime }} \end{aligned}$$

from (C23). Finally, one has directly that

$$\begin{aligned} \textbf{C}_{\textbf{n}^{*}\otimes \textbf{n}}=\textbf{C}_{\textbf{n} }^{T}\otimes \textbf{C}_{\textbf{n}}. \end{aligned}$$
(C26)

On the other hand, for \(\textbf{n}\) being a vector of independent and centered complex circular random variables following any distribution, one has from (C23) that

$$\begin{aligned} \textbf{e}_{m+M(m^\prime -1)}^T \textbf{C}_{\textbf{n}^*\otimes \textbf{n}} \textbf{e}_{l+M(l^\prime -1)} = \left\{ \begin{array}{ll} \mathbb {E}\left[ \left| n_m\right| ^4\right] - \mathbb {E}\left[ \left| n_m\right| ^2\right] ^2 &{},\,\text {if}\; m=m^\prime =l=l^\prime ;\\ \mathbb {E}\left[ \left| n_m\right| ^2\right] \mathbb {E}\left[ \left| n_{l^\prime }\right| ^2\right] &{},\,\text {if}\; m=l\ne l^\prime =m^\prime ;\\ 0 &{},\,\text {elsewhere}, \end{array} \right. \end{aligned}$$
(C27)

where \(\mathbb {E}\left[ \left| n_m\right| ^2\right] =\sigma _{n_m}^2,\, \mathbb {E}\left[ \left| n_{l^\prime }\right| ^2\right] =\sigma _{n_{l^\prime }}^2\) and

$$\begin{aligned} \mathbb {E}\left[ \left| n_m\right| ^4\right] - \mathbb {E}\left[ \left| n_m\right| ^2\right] ^2 = \left( \rho _{n_m}+1\right) \sigma _{n_m}^4 \end{aligned}$$
(C28)

where

$$\begin{aligned} \rho _{n_m} = \frac{\mathbb {E}\left[ \left| n_m\right| ^4\right] }{\mathbb {E}\left[ \left| n_m\right| ^2\right] ^2} - 2. \end{aligned}$$
(C29)

In particular, \(\rho _{n_m}=0\) for Gaussian random variables, which implies (C26).
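
The following sketch (NumPy assumed; the sample size and constellations are illustrative) estimates \(\rho _{n_m}\) of (C29) from samples, for a circular Gaussian variable (\(\rho =0\)) and for a constant-modulus QPSK-like variable, for which \(\mathbb {E}\left[ \left| n\right| ^4\right] =\mathbb {E}\left[ \left| n\right| ^2\right] ^2\) and hence \(\rho =-1\):

```python
# Empirical kurtosis parameter rho of (C29); NumPy assumed.
import numpy as np

rng = np.random.default_rng(3)
T = 1_000_000

def rho(samples):
    p2 = np.mean(np.abs(samples) ** 2)
    p4 = np.mean(np.abs(samples) ** 4)
    return p4 / p2 ** 2 - 2

gauss = (rng.standard_normal(T) + 1j * rng.standard_normal(T)) / np.sqrt(2)
qpsk = np.exp(0.5j * np.pi * rng.integers(0, 4, T))   # constant modulus

print(rho(gauss))   # ~ 0, so (C26) applies
print(rho(qpsk))    # = -1 exactly, since |n| = 1
```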

In summary, it was proved that (C26) applies for a vector \(\textbf{n}\) of centered and Gaussian complex circular random variables, and that (C27) applies for centered and independent complex circular random variables following any distribution.

Appendix 4: Computation of the MVDRF

Given a linear observation model

$$\begin{aligned} \textbf{y}={\textbf{H}}{\textbf{x}}+\textbf{v},\quad \textbf{y},\textbf{v}\in \mathbb {C} ^{N\times 1},~\textbf{x}\in \mathbb {C} ^{P\times 1},~\textbf{H}\in \mathbb {C} ^{N\times P}, \end{aligned}$$
(D30a)

such that

$$\begin{aligned} \text {rank}\left( \textbf{H}\right) =P\le N,\quad \text {rank}\left( \textbf{C}_{\textbf{v} }\right) =R\le N, \end{aligned}$$
(D30b)

the general form of a DRF is (denoting \(\textbf{L}^{H}=\textbf{K}\))

$$\begin{aligned} \begin{aligned} \widehat{\textbf{x}}&=\textbf{m}_{\textbf{x}}+\textbf{L}^{H}\left( \mathbf {y-m}_{\textbf{y}}\right) \\&=\left( \textbf{I}_{P}-\textbf{L}^{H}\textbf{H}\right) \textbf{m}_{\textbf{x}}+\textbf{L}^{H}\left( \mathbf {y-m}_{\textbf{v}}\right) \end{aligned} \end{aligned}$$
(D31)

with

$$\begin{aligned} \left\{ \begin{array}{l} \widehat{\textbf{x}}=\textbf{x}+\textbf{L}^{H}\left( \textbf{v}-\textbf{m}_{\textbf{v}}\right) ; \\ \textbf{I}_{P}=\textbf{L}^{H}\textbf{H}; \\ E\left[ \left( \widehat{\textbf{x}}-\textbf{x}\right) \left( \widehat{\textbf{x}}-\textbf{x}\right) ^{H}\right] =\textbf{L}^{H}\textbf{C}_{\textbf{v}}\textbf{L}. \end{array} \right. \end{aligned}$$

The MVDRF (provided that it exists) is then the DRF such that

$$\begin{aligned} \textbf{L}^{b}=\arg \underset{\textbf{L}}{\min }\left\{ \textbf{L}^{H}\textbf{C}_{\textbf{v}}\textbf{L}\text { s.t. }\textbf{L}^{H} \textbf{H}=\textbf{I}_{P}\right\} . \end{aligned}$$
(D32)

Considering that neither \(\textbf{H}\) nor \(\textbf{C}_{\textbf{v}}\) is necessarily full rank, one denotes

$$\begin{aligned} \textbf{H}\overset{\text {compact SVD}}{=}\mathbf {U\Sigma V}^{H}, \end{aligned}$$
(D33a)
$$\begin{aligned} \textbf{C}_{\textbf{v}}\overset{\text {compact EVD}}{=}\textbf{WDW}^{H} \end{aligned}$$
(D33b)

and \(\mathbb {U},\,\mathbb {W}\) such that \(\left[ \textbf{U}\,\mathbb {U}\right]\), \(\left[ \textbf{W}\,\mathbb {W}\right]\) are unitary matrices. The considered filter is of the form

$$\begin{aligned} \textbf{L}={\textbf{U}}{\textbf{T}}+{\mathbb {U}}{\mathbb {T}} \end{aligned}$$
(D33c)

with

$$\begin{aligned} \textbf{T}\in \mathbb {C} ^{P\times P},~\mathbb {T}\in \mathbb {C} ^{\left( N-P\right) \times P}. \end{aligned}$$
(D33d)

Thus

$$\begin{aligned} \textbf{L}^{H}\textbf{H}=\textbf{T}^{H}\mathbf {\Sigma V}^{H} \end{aligned}$$
(D34)

from which the distortionless condition is equivalent to

$$\begin{aligned} \textbf{T}=\mathbf {\Sigma }^{-1}\textbf{V}^{H}. \end{aligned}$$
(D35)

One has that

$$\begin{aligned} \textbf{L}^{H}\textbf{C}_{\textbf{v}}\textbf{L}= & {} \left( \mathbf {U\Sigma } ^{-1}\textbf{V}^{H}+{\mathbb {U}}{\mathbb {T}}\right) ^{H}\textbf{C}_{\textbf{v}}\left( \mathbf {U\Sigma }^{-1}\textbf{V}^{H}+{\mathbb {U}}{\mathbb {T}}\right) \nonumber \\= & {} \left( \mathbf {U\Sigma }^{-1}\textbf{V}^{H}+{\mathbb {U}}{\mathbb {T}}\right) ^{H} \textbf{WDW}^{H}\left( \mathbf {U\Sigma }^{-1}\textbf{V}^{H}+{\mathbb {U}}{\mathbb {T}} \right) \nonumber \\= & {} \left( \textbf{W}^{H} \mathbf {U\Sigma }^{-1}\textbf{V}^{H}+\textbf{W}^{H}{\mathbb {U}}{\mathbb {T}}\right) ^{H} \textbf{D}\left( \textbf{W}^{H}\mathbf {U\Sigma }^{-1}\textbf{V}^{H}+\textbf{W }^{H}{\mathbb {U}}{\mathbb {T}}\right) \end{aligned}$$
(D36)

with

$$\begin{aligned} \textbf{W}^{H}\mathbb {U}\in \mathbb {C}^{R\times \left( N-P\right) },\quad \text {rank}\left( \textbf{W}^{H}\mathbb {U}\right) =Q\le \min \left( R,N-P\right) . \end{aligned}$$
(D37)

Then, one applies a last decomposition

$$\begin{aligned} \textbf{W}^{H}\mathbb {U}\overset{\text {compact SVD}}{=}\underline{\textbf{U}}\underline{ \mathbf {\Sigma }}\underline{\textbf{V}}^{H}. \end{aligned}$$
(D38a)

Given \(\textbf{g }\in \mathbb {C} ^{P}\), one obtains

$$\begin{aligned} \textbf{g }^{H}\left( \textbf{L}^{H}\textbf{C}_{\textbf{v}}\textbf{L} \right) \textbf{g }= & {} \left\| \left( \textbf{W}^{H}\mathbf {U\Sigma }^{-1}\textbf{V}^{H}+\underline{\textbf{U}}\left( \underline{\mathbf {\Sigma } }\underline{\textbf{V}}^{H}\mathbb {T}\right) \right) \textbf{g } \right\| _{\textbf{D}}^{2} \end{aligned}$$
(D38b)

where \(\left\| \textbf{z}\right\| _{\textbf{D}}^{2}=\textbf{z}^{H} {\textbf{D}}{\textbf{z}}\). Considering the oblique projector

$$\begin{aligned} \mathbf {\Pi }_{\underline{\textbf{U}}}^{\textbf{D}}=\underline{\textbf{U}} \left( \underline{\textbf{U}}^{H}\textbf{D}\underline{\textbf{U}}\right) ^{-1}\underline{\textbf{U}}^{H}\textbf{D}\in \mathbb {C} ^{R\times R}, \end{aligned}$$
(D38c)

then

$$\begin{aligned} \textbf{g }^{H}\left( \textbf{L}^{H}\textbf{C}_{\textbf{v}}\textbf{L}\right) \textbf{g }= & {} \left\| \mathbf {\Pi }_{\underline{\textbf{U}}}^{\textbf{D}}\left( \textbf{W}^{H}\mathbf {U\Sigma }^{-1}\textbf{V}^{H}+\underline{\textbf{U}} \left( \underline{\mathbf {\Sigma }}\underline{\textbf{V}}^{H}\mathbb {T} \right) \right) \textbf{g }\right\| _{\textbf{D}}^{2}\nonumber \\{} & {} +\left\| \left( \textbf{I}_{R}-\mathbf {\Pi }_{\underline{\textbf{U}}}^{\textbf{D} }\right) \left( \textbf{W}^{H}\mathbf {U\Sigma }^{-1}\textbf{V}^{H}+ \underline{\textbf{U}}\left( \underline{\mathbf {\Sigma }}\underline{\textbf{V }}^{H}\mathbb {T}\right) \right) \textbf{g }\right\| _{\textbf{D} }^{2} \nonumber \nonumber \\= & {} \left\| \underline{\textbf{U}}\left( \left( \underline{\textbf{U}}^{H}\textbf{D}\underline{\textbf{U}}\right) ^{-1} \underline{\textbf{U}}^{H}{\textbf{D}}{\textbf{W}}^{H}\mathbf {U\Sigma }^{-1}\textbf{V} ^{H}+\underline{\mathbf {\Sigma }}\underline{\textbf{V}}^{H}\mathbb {T}\right) \textbf{g }\right\| _{\textbf{D}}^{2}\nonumber \\{} & {} +\left\| \left( \textbf{I} _{R}-\mathbf {\Pi }_{\underline{\textbf{U}}}^{\textbf{D}}\right) \textbf{W} ^{H}\mathbf {U\Sigma }^{-1}\textbf{V}^{H}\textbf{g }\right\| _{ \textbf{D}}^{2}\qquad \quad \end{aligned}$$
(D38d)

from which one can conclude

$$\begin{aligned} \textbf{g }^{H}\left( \textbf{L}^{H}\textbf{C}_{\textbf{v}}\textbf{L} \right) \textbf{g }\ge \left\| \left( \textbf{I}_{R}-\mathbf {\Pi } _{\underline{\textbf{U}}}^{\textbf{D}}\right) \textbf{W}^{H}\mathbf {U\Sigma } ^{-1}\textbf{V}^{H}\textbf{g }\right\| _{\textbf{D}}^{2}, \end{aligned}$$

i.e.,

$$\begin{aligned} \textbf{L}^{H}\textbf{C}_{\textbf{v}}\textbf{L}\ge \mathbf {V\Sigma }^{-1} \textbf{U}^{H}\textbf{W}\left( \textbf{I}_{R}-\mathbf {\Pi }_{\underline{ \textbf{U}}}^{\textbf{D}}\right) ^{H}\textbf{D}\left( \textbf{I}_{R}-\mathbf { \Pi }_{\underline{\textbf{U}}}^{\textbf{D}}\right) \textbf{W}^{H}\mathbf { U\Sigma }^{-1}\textbf{V}^{H}. \end{aligned}$$
(D38e)

The equality holds for

$$\begin{aligned} \left( \underline{\textbf{U}}^{H}\textbf{D}\underline{\textbf{U}}\right) ^{-1}\underline{\textbf{U}}^{H}{\textbf{D}}{\textbf{W}}^{H}\mathbf {U\Sigma }^{-1}\textbf{V }^{H}+\underline{\mathbf {\Sigma }}\underline{\textbf{V}}^{H}\mathbb {T}= & {} \textbf{0} \nonumber \\&\Updownarrow&\nonumber \\ -\underline{\mathbf {\Sigma }}^{-1}\left( \underline{\textbf{U}}^{H}\textbf{D} \underline{\textbf{U}}\right) ^{-1}\underline{\textbf{U}}^{H}{\textbf{D}}{\textbf{W}}^{H} \mathbf {U\Sigma }^{-1}\textbf{V}^{H}= & {} \underline{\textbf{V}}^{H}\mathbb {T}. \end{aligned}$$
(D38f)

Let \(\underline{\mathbb {V}}\in \mathbb {C}^{\left( N-P\right) \times \left( \left( N-P\right) -Q\right) }\) be such that \(\left[ \underline{\textbf{V}}~\underline{\mathbb {V}}\right]\) is a unitary matrix, then since \(\mathbb {T}\in \mathbb {C} ^{\left( N-P\right) \times P}\),

$$\begin{aligned} \mathbb {T}=\left( \underline{\textbf{V}}\underline{\textbf{V}}^{H}+ \underline{\mathbb {V}}\underline{\mathbb {V}}^{H}\right) \mathbb {T}= \left[ \underline{\textbf{V}}~\underline{\mathbb {V}}\right] \left[ \begin{array}{c} \underline{\textbf{V}}^{H}\mathbb {T} \\ \underline{\mathbb {V}}^{H}\mathbb {T} \end{array} \right] , \end{aligned}$$

which leads to

$$\begin{aligned} \mathbb {T}=-\underline{\textbf{V}}\left( \underline{\mathbf {\Sigma }} ^{-1}\left( \underline{\textbf{U}}^{H}\textbf{D}\underline{\textbf{U}} \right) ^{-1}\underline{\textbf{U}}^{H}{\textbf{D}}{\textbf{W}}^{H}\mathbf {U\Sigma }^{-1} \textbf{V}^{H}\right) +\underline{\mathbb {V}}\underline{\mathbb {A}}, \end{aligned}$$
(D38g)

where

$$\begin{aligned} \underline{\mathbb {A}}\triangleq \underline{\mathbb {V}}^{H}\mathbb {T}\in \mathbb {C} ^{\left( \left( N-P\right) -Q\right) \times P} \end{aligned}$$
(D38h)

and thus

$$\begin{aligned} \textbf{L}=\textbf{U}\left( \mathbf {\Sigma }^{-1}\textbf{V}^{H}\right) - \mathbb {U}\underline{\textbf{V}}\left( \underline{\mathbf {\Sigma }} ^{-1}\left( \underline{\textbf{U}}^{H}\textbf{D}\underline{\textbf{U}} \right) ^{-1}\underline{\textbf{U}}^{H}{\textbf{D}}{\textbf{W}}^{H}\mathbf {U\Sigma }^{-1} \textbf{V}^{H}\right) +\mathbb {U}\underline{\mathbb {V}}\underline{\mathbb {A}}. \end{aligned}$$
(D38i)

One deduces that there are infinitely many solutions minimizing \(\textbf{L}^{H}\textbf{C}_{\textbf{v}}\textbf{L}\) in (D38i): since \(\underline{\mathbb {A}}\) is unconstrained, the solution has \(\left( \left( N-P\right) -Q\right) \times P\) degrees of freedom. One may consider the solution with minimal Frobenius norm. Writing (D38i) as \(\textbf{L}={\textbf{U}}{\textbf{B}}-\mathbb {U}\underline{\textbf{V}}\textbf{A}+\mathbb {U}\underline{\mathbb {V}}\underline{\mathbb {A}}\), with \(\textbf{B}\triangleq \mathbf {\Sigma }^{-1}\textbf{V}^{H}\) and \(\textbf{A}\triangleq \underline{\mathbf {\Sigma }}^{-1}\left( \underline{\textbf{U}}^{H}\textbf{D}\underline{\textbf{U}}\right) ^{-1}\underline{\textbf{U}}^{H}{\textbf{D}}{\textbf{W}}^{H}\mathbf {U\Sigma }^{-1}\textbf{V}^{H}\), one obtains

$$\begin{aligned} tr\left( \textbf{L}^{H}\textbf{L}\right)= & {} tr\left( \left( {\textbf{U}}{\textbf{B}}- \mathbb {U}\underline{\textbf{V}}\textbf{A}+\mathbb {U}\underline{\mathbb {V}} \underline{\mathbb {A}}\right) ^{H}\left( {\textbf{U}}{\textbf{B}}-\mathbb {U}\underline{ \textbf{V}}\textbf{A}+\mathbb {U}\underline{\mathbb {V}}\underline{\mathbb {A}} \right) \right) \\= & {} tr\left( \textbf{B}^{H}\textbf{B} \right) +tr\left( \textbf{A}^{H}\textbf{A}\right) +tr\left( \underline{\mathbb {A}}^{H} \underline{\mathbb {A}}\right) , \end{aligned}$$

the cross terms vanishing since \(\textbf{U}^{H}\mathbb {U}=\textbf{0}\) and \(\underline{\textbf{V}}^{H}\underline{\mathbb {V}}=\textbf{0}\); the trace is therefore minimized,

i.e.,

$$\begin{aligned} \textbf{L}=\textbf{U}\left( \mathbf {\Sigma }^{-1}\textbf{V}^{H}\right) - \mathbb {U}\underline{\textbf{V}}\underline{\mathbf {\Sigma }}^{-1}\left( \underline{\textbf{U}}^{H}\textbf{D}\underline{\textbf{U}}\right) ^{-1} \underline{\textbf{U}}^{H}{\textbf{D}}{\textbf{W}}^{H}\mathbf {U\Sigma }^{-1}\textbf{V} ^{H}, \end{aligned}$$
(D39a)

by setting \(\underline{\mathbb {A}}=\textbf{0}\) (i.e., no unnecessary additional assumption is made).
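
The chain (D33a)–(D39a) maps directly onto standard matrix decompositions. The sketch below (NumPy assumed; the function name `mvdrf`, the toy inputs and the rank threshold `tol` are ours, not from the paper) implements the minimal-Frobenius-norm solution; the usage example only checks the distortionless constraint \(\textbf{L}^{H}\textbf{H}=\textbf{I}_{P}\) on a random rank-deficient noise covariance:

```python
# Sketch of the minimal-Frobenius-norm MVDRF (D39a); NumPy assumed.
import numpy as np

def mvdrf(H, Cv, tol=1e-8):
    N, P = H.shape
    # (D33a): compact SVD of H; trailing columns span the complement of U.
    Ufull, s, Vh = np.linalg.svd(H)
    U, Uc = Ufull[:, :P], Ufull[:, P:]          # Uc plays the role of the U-complement
    SinvVh = (1.0 / s)[:, None] * Vh            # Sigma^{-1} V^H
    # (D33b): compact EVD of Cv (rank R).
    d, Wfull = np.linalg.eigh(Cv)
    keep = d > tol
    W, D = Wfull[:, keep], np.diag(d[keep])
    # (D38a): compact SVD of W^H Uc (rank Q).
    u, sig, vh = np.linalg.svd(W.conj().T @ Uc)
    Qr = int(np.sum(sig > tol))
    Uu, Su, Vu = u[:, :Qr], sig[:Qr], vh[:Qr, :].conj().T
    # (D39a): L = U Sigma^{-1} V^H
    #           - Uc Vu Su^{-1} (Uu^H D Uu)^{-1} Uu^H D W^H U Sigma^{-1} V^H.
    Mfac = np.linalg.solve(Uu.conj().T @ D @ Uu,
                           Uu.conj().T @ D @ W.conj().T @ U @ SinvVh)
    return U @ SinvVh - Uc @ Vu @ ((1.0 / Su)[:, None] * Mfac)

# Toy usage: the filter satisfies the distortionless constraint L^H H = I_P.
rng = np.random.default_rng(4)
N, P, R = 6, 2, 4
H = rng.standard_normal((N, P)) + 1j * rng.standard_normal((N, P))
G = rng.standard_normal((N, R)) + 1j * rng.standard_normal((N, R))
Cv = G @ G.conj().T                              # rank-R noise covariance
L = mvdrf(H, Cv)
assert np.allclose(L.conj().T @ H, np.eye(P), atol=1e-8)
```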

Appendix 5: Upper bound of the measurement noise covariance

This section proves that for a given estimate \(\widehat{{\textbf{x}}}_{k|k}\) verifying \(\left( \widehat{{\textbf{x}}}_{k|k}\right) _q\geqslant \left( {\textbf{x}}_{k}\right) _q\,\forall q\in \left\{ 1,\ldots ,Q\right\}\), one has that \(\widehat{{\textbf{C}}}_{{\textbf{v}}_k}(\widehat{{\textbf{x}}}_{k|k})\geqslant \widehat{{\textbf{C}}}_{{\textbf{v}}_k}({{\textbf{x}}}_{k})\), where \(\widehat{{\textbf{C}}}_{{\textbf{v}}_k}({{\textbf{x}}}_{k})\triangleq {\textbf{C}}_{{\textbf{v}}_k}\).

Let \(\textbf{x},\,\textbf{x}^\prime \in \mathbb {R}^Q_+\) be such that \(x^\prime _q\geqslant x_q\,\forall q\in \left\{ 1,\ldots ,Q\right\}\), where the subscript k is dropped in order to simplify formulas. Since

$$\begin{aligned} \widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}}) = \sum \limits _{q=1}^Qx_q\textbf{a}_q\textbf{a}_q^H + \textbf{C}_{\textbf{n}}, \end{aligned}$$
(E40)

one has that

$$\begin{aligned} \begin{aligned} \widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}^\prime })^T\otimes \widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}^\prime }) - \widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}})^T\otimes \widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}}) =&\sum \limits _{q=1}^Q\sum \limits _{l=1}^Q\left( x_q^\prime x_l^\prime - x_qx_l\right) \left( \textbf{a}_q\textbf{a}_q^H\right) ^T\otimes \left( \textbf{a}_l\textbf{a}_l^H\right) \\&+ \sum \limits _{q=1}^Q\left( x_q^\prime - x_q\right) \left( \textbf{a}_q\textbf{a}_q^H\right) ^T\otimes \textbf{C}_{\textbf{n}}\\&+ \sum \limits _{q=1}^Q\left( x_q^\prime - x_q\right) \textbf{C}_{\textbf{n}}^T\otimes \left( \textbf{a}_q\textbf{a}_q^H\right) , \end{aligned} \end{aligned}$$
(E41)

with \(\left( \textbf{a}_q\textbf{a}_q^H\right) ^T\otimes \left( \textbf{a}_l\textbf{a}_l^H\right)\), \(\left( \textbf{a}_q\textbf{a}_q^H\right) ^T\otimes \textbf{C}_{\textbf{n}}\) and \(\textbf{C}_{\textbf{n}}^T\otimes \left( \textbf{a}_q\textbf{a}_q^H\right)\) being three Hermitian positive semidefinite matrices. Hence, \(\widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}^\prime })^T\otimes \widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}^\prime }) - \widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}})^T\otimes \widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}})\) is a linear combination of Hermitian positive semidefinite matrices with nonnegative coefficients, and thus

$$\begin{aligned} \widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}^\prime })^T\otimes \widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}^\prime }) \geqslant \widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}})^T\otimes \widehat{\textbf{C}}_{\textbf{z}} ({\textbf{x}}). \end{aligned}$$
(E42)
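
A quick numerical illustration of (E42) (NumPy assumed; the steering vectors, noise level and powers are toy values) checks the Loewner ordering through the spectrum of the Hermitian difference:

```python
# Numerical illustration of the Loewner ordering (E42); NumPy assumed.
import numpy as np

rng = np.random.default_rng(5)
M, Q = 3, 2
A = rng.standard_normal((M, Q)) + 1j * rng.standard_normal((M, Q))
Cn = 0.1 * np.eye(M)

def Cz(x):
    # C_z(x) = sum_q x_q a_q a_q^H + C_n, cf. (E40).
    return (A * x) @ A.conj().T + Cn

x = np.array([0.4, 1.0])
xp = x + np.array([0.3, 0.2])                  # x' >= x elementwise
diff = np.kron(Cz(xp).T, Cz(xp)) - np.kron(Cz(x).T, Cz(x))
# All eigenvalues of the Hermitian difference are (numerically) nonnegative.
assert np.linalg.eigvalsh(diff).min() >= -1e-10
```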

On the other hand, from (17), one has that

$$\begin{aligned} \begin{aligned} \widehat{\textbf{C}}_{{\textbf{z}}^*\otimes {\textbf{z}}}({\textbf{x}^\prime }) - \widehat{\textbf{C}}_{{\textbf{z}}^*\otimes {\textbf{z}}}({\textbf{x}}) =&\, \widehat{\textbf{C}}_{{\textbf{z}}}^T({\textbf{x}^\prime })\otimes \widehat{\textbf{C}}_{{\textbf{z}}}({\textbf{x}^\prime }) - \widehat{\textbf{C}}_{{\textbf{z}}}^T({\textbf{x}})\otimes \widehat{\textbf{C}}_{{\textbf{z}}}({\textbf{x}})\\&+\sum \limits _{q=1}^Q\left( {\textbf{a}}^*_q\otimes {\textbf{a}}_q\right) \left( {\textbf{a}}^*_q\otimes {\textbf{a}}_q\right) ^H\rho _{s_q}\left( {x_q^\prime }^2-{x_q}^2\right) \end{aligned} \end{aligned}$$
(E43)

which (by the same argument) yields

$$\begin{aligned} \widehat{\textbf{C}}_{{\textbf{z}}^*\otimes {\textbf{z}}}({\textbf{x}^\prime }) \geqslant \widehat{\textbf{C}}_{{\textbf{z}}^*\otimes {\textbf{z}}}({\textbf{x}}). \end{aligned}$$
(E44)

Finally, remarking that from (16), \(\textbf{C}_{\textbf{v}}(\textbf{x}^\prime ) - \textbf{C}_{\textbf{v}}(\textbf{x})\) is the sum of

$$\begin{aligned} \frac{1}{N} \left[ \begin{array}{cc} {\textbf{C}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}}(\textbf{x}^\prime ) - {\textbf{C}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}}(\textbf{x}) &{} \textbf{0} \\ \textbf{0} &{} {\textbf{C}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}}^{*}(\textbf{x}^\prime ) - {\textbf{C}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}}^{*}(\textbf{x}) \end{array} \right] \end{aligned}$$
(E45)

and

$$\begin{aligned} \frac{1}{N} \left[ \begin{array}{cc} \textbf{0} &{} {\textbf{C}}_{{\textbf{z}}^{*}\otimes {\textbf{z}},{\textbf{z}}\otimes {\textbf{z}}^{*}}(\textbf{x}^\prime ) - {\textbf{C}}_{{\textbf{z}}^{*}\otimes {\textbf{z}},{\textbf{z}}\otimes {\textbf{z}}^{*}}(\textbf{x})\\ {\textbf{C}}^*_{{\textbf{z}}^{*}\otimes {\textbf{z}},{\textbf{z}}\otimes {\textbf{z}}^{*}}(\textbf{x}^\prime ) - {\textbf{C}}^*_{{\textbf{z}}^{*}\otimes {\textbf{z}},{\textbf{z}}\otimes {\textbf{z}}^{*}}(\textbf{x})&{} \textbf{0}\\ \end{array} \right] \end{aligned}$$
(E46)

which are positive Hermitian matrices since

$$\begin{aligned} \begin{aligned} {\textbf{C}}^H_{{\textbf{z}}^{*}\otimes {\textbf{z}},{\textbf{z}}\otimes {\textbf{z}}^{*}}(\textbf{x})&= \textbf{P}^T{\textbf{C}}^H_{{\textbf{z}}^{*}\otimes {\textbf{z}}}(\textbf{x})\\&= \textbf{P}^T{\textbf{C}}_{{\textbf{z}}^{*}\otimes {\textbf{z}}}(\textbf{x})\\&= {\textbf{C}}_{{\textbf{z}}\otimes {\textbf{z}}^*,{\textbf{z}}^{*}\otimes {\textbf{z}}}(\textbf{x})\\&= {\textbf{C}}_{{\textbf{z}}\otimes {\textbf{z}}^*}(\textbf{x})\textbf{P}\\&= \left( {\textbf{C}}_{{\textbf{z}^*}\otimes {\textbf{z}}}(\textbf{x})\textbf{P}\right) ^*\\&= {\textbf{C}}^*_{{\textbf{z}}^{*}\otimes {\textbf{z}},{\textbf{z}}\otimes {\textbf{z}}^{*}}(\textbf{x}), \end{aligned} \end{aligned}$$
(E47)

one can conclude that

$$\begin{aligned} \textbf{C}_{\textbf{v}}(\textbf{x}^\prime ) \geqslant \textbf{C}_{\textbf{v}}(\textbf{x}). \end{aligned}$$
(E48)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.



Cite this article

Cano, C., Arab, N., Chaumette, É. et al. Kalman filter for radio source power and direction of arrival estimation. EURASIP J. Adv. Signal Process. 2024, 66 (2024). https://doi.org/10.1186/s13634-024-01147-x
