Open Access

NLOS mitigation in indoor localization by marginalized Monte Carlo Gaussian smoothing

EURASIP Journal on Advances in Signal Processing 2017, 2017:62

Received: 24 March 2017

Accepted: 18 August 2017

Published: 29 August 2017


One of the main challenges in indoor time-of-arrival (TOA)-based wireless localization systems is to mitigate non-line-of-sight (NLOS) propagation conditions, which degrade the overall positioning performance. The positively skewed, non-Gaussian nature of TOA observations under LOS/NLOS conditions can be modeled as a heavy-tailed skew t-distributed measurement noise. The main goal of this article is to provide a robust Bayesian inference framework to deal with target localization under NLOS conditions. A key point is to take advantage of the conditionally Gaussian formulation of the skew t-distribution, thus being able to use computationally light Gaussian filtering and smoothing methods as the core of the new approach. The unknown non-Gaussian noise latent variables are marginalized using Monte Carlo sampling. Numerical results are provided to show the performance improvement of the proposed approach.


Robust Bayesian inference; Gaussian filtering and smoothing; NLOS mitigation; Skew t-distributed measurement noise; Indoor localization; Monte Carlo integration

1 Introduction

Knowledge of position plays an important role in many applications and services. The widely deployed Global Navigation Satellite System (GNSS) offers worldwide service coverage thanks to a network of dedicated satellites [1]. GNSS is recognized as the de facto system in outdoor environments when it is available. Under the assumption that its reception is not obstructed or jammed [2–4], there is no doubt that GNSS is the main enabler for location-based services (LBS). One situation where this assumption does not hold is indoor positioning and tracking, where satellite signals are hardly useful (unless extremely large integration times are considered). In indoor scenarios, a plethora of alternative and complementary technologies can be considered [1, 5, 6].

We are interested in a particular propagation phenomenon encountered in most positioning technologies (both outdoor and indoor), known as non-line-of-sight (NLOS) propagation, which is one of the most challenging problems for tracking. In particular, when time-of-arrival (TOA) measurements are used as range estimates, the measured distance can be severely degraded. These ranges are typically positively biased with respect to the true distances and are therefore seen as outliers at the receiver. It is thus of interest to develop NLOS mitigation techniques that provide enhanced robustness to tracking methods based on TOA measurements [6].

In general, the problem under study concerns the derivation of new robust methods to solve the Bayesian filtering and smoothing problem in challenging applications such as the LOS/NLOS propagation conditions in indoor localization systems. The state-space models (SSM) of interest are expressed as
$$\begin{array}{@{}rcl@{}} \mathbf{x}_{k} &=& \mathbf{f}_{k-1} \left(\mathbf{x}_{k-1} \right) + \mathbf{u}_{k}, \qquad \mathbf{u}_{k} \sim \mathcal{N}\left(0,\mathbf{Q}_{k}\right), \end{array} $$
$$\begin{array}{@{}rcl@{}} \mathbf{y}_{k} &=& \mathbf{h}_{k} \left(\mathbf{x}_{k} \right) + \mathbf{n}_{k}, \qquad\qquad\! \mathbf{n}_{k} \sim \mathcal{ST}\left(\boldsymbol{\phi}_{k}\right), \end{array} $$
where \(\mathbf {x}_{k} \in \mathbb {R}^{n_{x}}\) and \(\mathbf {y}_{k} \in \mathbb {R}^{n_{y}}\) are the hidden states of the system and the measurements at time k, respectively; \(\mathbf{f}_{k-1}(\cdot)\) and \(\mathbf{h}_{k}(\cdot)\) are known, possibly nonlinear, functions of the state; and the process and observation noises, \(\mathbf{u}_{k}\) and \(\mathbf{n}_{k}\), are assumed to be mutually independent. In real-life applications, we may not have complete knowledge of the system conditions, so the measurement noise statistics are assumed to be unknown to a certain extent. In contrast, we consider a known process noise covariance \(\mathbf{Q}_{k}\). Regarding the measurement noise, we assume that it is distributed according to a parametric heavy-tailed skew t-distribution, \(\mathbf {n}_{k} \sim \mathcal {ST}\left (\boldsymbol {\phi }_{k}\right)\), with \(\boldsymbol{\phi}_{k}\) representing the set of possibly unknown parameters of the non-Gaussian distribution. The probability density function (pdf) of the univariate skew t-distribution of interest can be written as [19]
$$\mathcal{ST}\left(z;\mu,\sigma^{2},\lambda,\nu\right) = 2 \mathcal{T}\left(z; \mu, \lambda^{2}+\sigma^{2}, \nu\right) \mathrm{T}(\tilde{z}; 0,1,\nu+1), $$
with \(\mu \in \mathbb {R}\), \(\sigma ^{2} \in \mathbb {R}^{+}\), \(\lambda \in \mathbb {R}\), and \(\nu \in \mathbb {R}^{+}\), referring to the distribution location, scale, skewness, and degrees of freedom, respectively. \(\mathcal {T}\left (z; \mu,\sigma ^{2}, \nu \right)\) is the pdf of the Student’s t distribution,
$$\begin{array}{*{20}l} &\mathcal{T}\left(z; \mu,\sigma^{2}, \nu\right)=\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sigma \sqrt{\nu \pi} \Gamma\left(\frac{\nu}{2}\right)} \left(1+\frac{(z-\mu)^{2}}{\nu \sigma^{2}}\right)^{-\frac{\nu+1}{2}}, \end{array} $$
with Γ(·) the gamma function. \(\mathrm {T}(\tilde {z}; 0,1,\nu)\) is the cumulative distribution function (CDF) of the Student’s t distribution with ν degrees of freedom and
$$\begin{array}{*{20}l} &\tilde{z}= \frac{(z-\mu)\lambda}{\sigma}\sqrt{\frac{\nu+1}{\nu \left(\lambda^{2}+\sigma^{2}\right)+(z-\mu)^{2}}}. \end{array} $$

Notice that the standard Student t distribution is \(\mathcal {T}\left (z;\mu, \sigma ^{2}, \nu \right)= \mathcal {ST}\left (z;\mu,\sigma ^{2}, \lambda =0, \nu \right)\), the skew normal pdf is \(\mathcal {SN}\left (z;\mu, \sigma ^{2}, \lambda \right)= \mathcal {ST}\left (z;\mu,\sigma ^{2}, \lambda, \nu \rightarrow \infty \right)\), and \(\mathcal {N}\left (z;\mu, \sigma ^{2} \right) = \mathcal {ST}\left (z;\mu,\sigma ^{2}, \lambda = 0, \nu \rightarrow \infty \right)\) the normal distribution.
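As an illustration of the pdf defined above, the following minimal sketch evaluates it numerically. The function names (`student_t_pdf`, `student_t_cdf`, `skew_t_pdf`) are hypothetical, and the Student's t CDF is approximated here by Gauss-Legendre quadrature, which is only one of several possible choices and is not taken from the article.

```python
import math
import numpy as np


def student_t_pdf(z, mu, sigma2, nu):
    """Student's t pdf with location mu, squared scale sigma2, nu degrees of freedom."""
    z = np.asarray(z, dtype=float)
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi * sigma2))
    return np.exp(log_norm) * (1 + (z - mu) ** 2 / (nu * sigma2)) ** (-(nu + 1) / 2)


def student_t_cdf(x, nu, n_nodes=96):
    """Standard Student's t CDF, approximated by quadrature of the pdf on [0, x].

    Assumption of this sketch: a fixed-order Gauss-Legendre rule is accurate
    enough, which is reasonable because the argument used below is bounded.
    """
    x = np.asarray(x, dtype=float)
    u, w = np.polynomial.legendre.leggauss(n_nodes)  # nodes and weights on [-1, 1]
    t = 0.5 * x[..., None] * (1.0 + u)               # nodes mapped to [0, x]
    integral = np.sum(w * student_t_pdf(t, 0.0, 1.0, nu), axis=-1) * 0.5 * x
    return 0.5 + integral


def skew_t_pdf(z, mu, sigma2, lam, nu):
    """Skew t pdf: 2 * T_pdf(z; mu, lam^2 + sigma2, nu) * T_cdf(z_tilde; nu + 1)."""
    z = np.asarray(z, dtype=float)
    d = z - mu
    z_tilde = (d * lam / math.sqrt(sigma2)
               * np.sqrt((nu + 1) / (nu * (lam ** 2 + sigma2) + d ** 2)))
    return 2 * student_t_pdf(z, mu, lam ** 2 + sigma2, nu) * student_t_cdf(z_tilde, nu + 1)


# Example: a positively skewed, heavy-tailed density (illustrative values only).
grid = np.linspace(-100.0, 100.0, 20001)
density = skew_t_pdf(grid, mu=0.0, sigma2=0.25, lam=2.0, nu=4.0)
total_mass = np.sum(density) * (grid[1] - grid[0])
```

Setting `lam=0` in this sketch recovers the symmetric Student's t pdf, in agreement with the special cases listed above.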

1.1 State-of-the-art

The skew t-distribution has recently been shown to provide a reasonable fit to realistic indoor TOA measurements, for instance, when characterizing range measurements under NLOS conditions in ultra-wideband (UWB) localization [7] or in multipath channels when ranging is computed with long-term evolution (LTE) networks [8]. Interestingly, this distribution allows a Gaussian mean-scale mixture (GMSM) representation, which implies that the distribution can be reformulated as hierarchically (conditionally) Gaussian [9, 10]. Mathematically, if we have the skew t-distributed random variable \(z \sim \mathcal {ST}(z;\mu,\sigma ^{2},\lambda,\nu)\), then we can write [41]
$$\begin{array}{@{}rcl@{}} z|\gamma, \tau \sim \mathcal{N}\left(z; \mu+\lambda \gamma, \tau^{-1} \sigma^{2}\right), \end{array} $$

with \(\tau \sim \mathcal {G}\left (\tau ; \frac {\nu }{2},\frac {\nu }{2}\right)\), \(\gamma |\tau \sim \mathcal {N}_{+} \left (\gamma ; 0, \tau ^{-1}\right)\), and \(\mathcal {N}_{+}\left (\cdot \right) \) and \(\mathcal {G}(\cdot)\) denoting the positive truncated normal and gamma distributions, respectively. This is a key point in our problem formulation because, given the noise parameters (i.e., \(\mu, \sigma^{2}, \lambda, \nu, \gamma\), and \(\tau\) in (3)), both the conditional marginal filtering and smoothing posterior distributions of the states, \(p(\mathbf{x}_{k}|\mathbf{y}_{1:k})\) and \(p(\mathbf{x}_{k}|\mathbf{y}_{1:N})\), turn out to be Gaussian, and thus we are able to use computationally light Gaussian smoothing methods to infer the states of the system.
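The hierarchical representation above also gives a direct way to draw skew t-distributed samples. The sketch below is a minimal illustration under two assumptions: the gamma distribution is parameterized by shape and rate (so NumPy's `scale` argument is the inverse rate), and the positive truncated normal with zero location is sampled as the absolute value of a zero-mean normal. The function name `sample_skew_t` and the numerical values are hypothetical.

```python
import numpy as np


def sample_skew_t(rng, size, mu, sigma2, lam, nu):
    """Draw skew t samples through the conditionally Gaussian hierarchy."""
    # tau ~ Gamma(shape = nu/2, rate = nu/2); NumPy expects scale = 1 / rate.
    tau = rng.gamma(shape=nu / 2, scale=2 / nu, size=size)
    # gamma | tau ~ N_+(0, 1/tau): a zero-mean normal folded onto the positive axis.
    gamma = np.abs(rng.standard_normal(size)) / np.sqrt(tau)
    # z | gamma, tau ~ N(mu + lam * gamma, sigma2 / tau).
    z = mu + lam * gamma + np.sqrt(sigma2 / tau) * rng.standard_normal(size)
    return z, gamma, tau


rng = np.random.default_rng(0)
z, gamma, tau = sample_skew_t(rng, 200_000, mu=0.0, sigma2=0.25, lam=2.0, nu=5.0)
sample_mean, sample_median = np.mean(z), np.median(z)
```

With a positive skewness parameter, the sample mean in this example lies above the sample median, which reflects the positive bias expected in NLOS range measurements.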

In the literature, some contributions dealing with conditionally Gaussian SSMs corrupted by both heavy-tailed symmetric and skewed noise distributions have been proposed. A particle filter (PF) solution for linear SSMs in symmetric α-stable (SαS) noise was presented in [11]. This idea was further explored in [12] for nonlinear systems and generalized to other symmetric distributions in [13]. The key idea was to take advantage of the conditionally Gaussian form and use a sigma-point Gaussian filter [14, 15] for the nonlinear state estimation. A robust filtering variational Bayesian (VB) approach was considered for linear systems in [16] and further extended to nonlinear SSMs in [17] considering a symmetric Student t measurement noise. However, symmetric distributions may not always be appropriate to characterize the system noise. Recently, two interesting approaches to deal with linear SSMs under skewed noise were proposed: the first one uses a marginalized PF [18] and the other considers a VB solution [7, 19]. It is important to point out that (i) these contributions deal with either nonlinear systems corrupted by symmetrically distributed noise or linear SSMs under skewed noise, and (ii) the core of these methods relies on standard Bayesian filtering algorithms, so the smoothing problem needs to be further analyzed within this context.

Related to the problem under study, it is worth saying that several contributions deal with the filtering problem in nonlinear/non-Gaussian SSM under model uncertainty using sequential Monte Carlo (SMC) methods, for instance, joint state and parameter estimation solutions [20], model selection strategies using interacting parallel PFs [21, 22], or model information fusion within the SMC formulation [23]. The main drawback of SMC methods is their high computational complexity and the curse-of-dimensionality [24]. That is the reason why we propose to take advantage of the underlying conditional Gaussian nature of the problem and use more efficient methods in this context.

1.2 Contributions

The main contributions of the article, which generalize the preliminary results in [25], are summarized as:
  • New Bayesian filtering and smoothing-based solutions for SSMs corrupted by parametric heavy-tailed skewed measurement noise

  • Marginalization of the unknown non-Gaussian noise latent variables by Monte Carlo integration

  • TOA-based robust target tracking, where the LOS/NLOS propagation is modeled using a skew t-distributed measurement noise. Whereas a Gaussian filter and smoother deals with the nonlinear state estimation problem, the time-varying skew t distribution parameters are marginalized via Monte Carlo sampling

The article is organized as follows. First, we discuss Gaussian filtering and smoothing in nonlinear/Gaussian systems, together with the sigma-point-based approximation of the multidimensional integrals in the conceptual solution, which is computationally more efficient than SMC methods under the Gaussian assumption. Then, we provide the conditionally Gaussian formulation of the measurement noise and a method to deal with the unknown non-Gaussian noise latent variables. Finally, we propose an NLOS indoor localization solution based on the Gaussian smoother and the sequential marginalization of the noise latent variables. Numerical results are provided in realistic scenarios using UWB signals.

2 Gaussian filtering and smoothing

This section reviews the general filtering and smoothing solutions for nonlinear/Gaussian systems in Sections 2.1 and 2.2, respectively. Then, in Section 2.3, we provide the implementation details when sigma-points are used to solve the filtering/smoothing equations. Notice that when the system is linear/Gaussian, the optimal solutions are given by the standard Kalman filter (KF) [35] and Kalman smoother (KS) [36], whereas for general nonlinear/non-Gaussian systems, one should consider more sophisticated SMC techniques [29].

For the formal derivation of the Gaussian filter/smoother, we assume that the measurement noise in (2) is zero-mean Gaussian with known covariance R k . Later, in Section 3, we discuss how the method can be used in the context of conditionally Gaussian models.

2.1 Bayesian Gaussian filtering

From a theoretical point of view, all the information needed to infer the unknown states resides in the marginal posterior distribution of the states, \(p(\mathbf{x}_{k}|\mathbf{y}_{1:k})\). Thus, the Bayesian filtering problem is one of evaluating this distribution. It can be recursively computed [26] in two steps: (1) prediction of \(p(\mathbf{x}_{k}|\mathbf{y}_{1:k-1})\) using the prior information and the previous filtering distribution and (2) update with the new measurement \(\mathbf{y}_{k}\) to obtain \(p(\mathbf{x}_{k}|\mathbf{y}_{1:k})\).
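For reference, the two steps of this recursion take the following standard form, written here under the usual assumptions that the state is Markovian and that each measurement depends only on the current state. The prediction step is the Chapman-Kolmogorov equation,
$$\begin{array}{@{}rcl@{}} p(\mathbf{x}_{k}|\mathbf{y}_{1:k-1}) &=& \int p(\mathbf{x}_{k}|\mathbf{x}_{k-1})\, p(\mathbf{x}_{k-1}|\mathbf{y}_{1:k-1})\, d\mathbf{x}_{k-1}, \end{array} $$
and the update step follows from Bayes' rule,
$$\begin{array}{@{}rcl@{}} p(\mathbf{x}_{k}|\mathbf{y}_{1:k}) &=& \frac{p(\mathbf{y}_{k}|\mathbf{x}_{k})\, p(\mathbf{x}_{k}|\mathbf{y}_{1:k-1})}{\int p(\mathbf{y}_{k}|\mathbf{x}_{k})\, p(\mathbf{x}_{k}|\mathbf{y}_{1:k-1})\, d\mathbf{x}_{k}}. \end{array} $$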

The recursive solution provides an estimation framework that is optimal in the Bayesian sense; that is, the characterization of the posterior distribution allows us to compute the minimum mean-squared error (MMSE), the maximum a posteriori (MAP), or the median of the posterior (minimax) estimators, addressing optimality in many senses. The multidimensional integrals in the prediction and update steps are analytically intractable in the general case. In fact, there are few cases where the optimal Bayesian recursion can be solved analytically. One of them is the linear/Gaussian model, where the KF yields the optimal solution [27]. In more general models, one must resort to suboptimal algorithms, and a plethora of methods can be found in the literature [28]. A popular tool is the particle filter (PF) [29–32], a family of simulation-based methods that are applicable in nonlinear/non-Gaussian setups. Under the Gaussian assumption of interest, the quadrature KF (QKF) [14, 15, 33] and cubature KF (CKF) [34] are typically the methods of choice. In this case, the marginal predictive and posterior distributions are
$$\begin{array}{@{}rcl@{}} p(\mathbf{x}_{k}|\mathbf{y}_{1:k-1}) &=& \mathcal{N}\left(\mathbf{x}_{k}; \hat{\mathbf{x}}_{k|k-1},\mathbf{\Sigma}_{x,k|k-1}\right), \end{array} $$
$$\begin{array}{@{}rcl@{}} p(\mathbf{x}_{k}|\mathbf{y}_{1:k}) &=& \mathcal{N}\left(\mathbf{x}_{k}; \hat{\mathbf{x}}_{k|k},\mathbf{\Sigma}_{x,k|k}\right). \end{array} $$
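In the prediction step, we compute the mean and covariance of the marginal predictive distribution, using the shorthand \(\mathbf{f}^{2}(\mathbf{x}) \triangleq \mathbf{f}(\mathbf{x})\mathbf{f}^{T}(\mathbf{x})\) (and likewise for \(\mathbf{h}^{2}\)), as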
$$\begin{array}{@{}rcl@{}} {} \hat{\mathbf{x}}_{k|k-1} &=& \int \mathbf{f}(\mathbf{x}_{k-1})p(\mathbf{x}_{k-1}|\mathbf{y}_{1:k-1}) d\mathbf{x}_{k-1}, \end{array} $$
$$\begin{array}{@{}rcl@{}} {} \mathbf{\Sigma}_{x,k|k-1} &=& \int \mathbf{f}^{2}(\mathbf{x}_{k-1}) p(\mathbf{x}_{k-1}|\mathbf{y}_{1:k-1})d\mathbf{x}_{k-1}\\ {} && - \hat{\mathbf{x}}_{k|k-1} \hat{\mathbf{x}}_{k|k-1}^{T} + \mathbf{Q}_{k}. \end{array} $$
In the update step, the mean and covariance of the marginal posterior are given by the KF equations [35]
$$\begin{array}{@{}rcl@{}} \hat{\mathbf{x}}_{k|k} &=& \hat{\mathbf{x}}_{k|k-1} + \mathbf{K}_{k} \left(\mathbf{y}_{k} - \hat{\mathbf{y}}_{k|k-1}\right), \end{array} $$
$$\begin{array}{@{}rcl@{}} \mathbf{\Sigma}_{x,k|k} &=& \mathbf{\Sigma}_{x,k|k-1} - \mathbf{K}_{k} \boldsymbol\Sigma_{y,k|k-1} \mathbf{K}_{k}^{T}, \end{array} $$
where the Kalman gain is \(\mathbf {K}_{k} = \boldsymbol \Sigma _{xy,k|k-1} \boldsymbol \Sigma _{y,k|k-1}^{-1}.\) The predicted measurement and both innovation and cross-covariance matrices are computed as
$$\begin{array}{@{}rcl@{}} \hat{\mathbf{y}}_{k|k-1} &=& \int \mathbf{h}\left(\mathbf{x}_{k}\right) p(\mathbf{x}_{k}|\mathbf{y}_{1:k-1}) d\mathbf{x}_{k}, \end{array} $$
$$\begin{array}{@{}rcl@{}} \boldsymbol\Sigma_{y,k|k-1} &=& \int \mathbf{h}^{2}(\mathbf{x}_{k}) p(\mathbf{x}_{k} |\mathbf{y}_{1:k-1}) d\mathbf{x}_{k}\\ && -\hat{\mathbf{y}}_{k|k-1} \hat{\mathbf{y}}_{k|k-1}^{T} + \mathbf{R}_{k}, \end{array} $$
$$\begin{array}{@{}rcl@{}} \boldsymbol\Sigma_{xy,k|k-1} &=& \int \mathbf{x}_{k} \mathbf{h}^{T}(\mathbf{x}_{k}) p(\mathbf{x}_{k} |\mathbf{y}_{1:k-1}) d\mathbf{x}_{k} \\ && -\hat{\mathbf{x}}_{k|k-1} \hat{\mathbf{y}}_{k|k-1}^{T}. \end{array} $$

The problem reduces to the approximation of these integrals.
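Once the predicted measurement, the innovation covariance, and the cross-covariance are available (by whatever integral approximation is chosen), the update step above is a few lines of linear algebra. The following sketch assumes these moments are given as NumPy arrays; the function name `gaussian_update` and its argument names are hypothetical.

```python
import numpy as np


def gaussian_update(x_pred, P_pred, y, y_pred, S_y, S_xy):
    """Measurement update of a Gaussian filter from precomputed moments.

    x_pred, P_pred : predicted state mean and covariance
    y, y_pred      : measurement and predicted measurement
    S_y, S_xy      : innovation covariance and state-measurement cross-covariance
    """
    # Kalman gain K = S_xy @ inv(S_y), computed with a linear solve instead of an inverse.
    K = np.linalg.solve(S_y.T, S_xy.T).T
    x_filt = x_pred + K @ (y - y_pred)
    P_filt = P_pred - K @ S_y @ K.T
    return x_filt, P_filt, K


# Scalar linear example: unit prior variance, unit noise variance, h(x) = x.
x_filt, P_filt, K = gaussian_update(
    x_pred=np.array([0.0]), P_pred=np.array([[1.0]]),
    y=np.array([2.0]), y_pred=np.array([0.0]),
    S_y=np.array([[2.0]]), S_xy=np.array([[1.0]]),
)
```

In this scalar example the gain is 0.5, so the filtered mean lies halfway between the prediction and the measurement.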

2.2 Gaussian smoothing

In the previous section, we summarized the general Gaussian Bayesian filtering solution, but in some applications the quantity of interest is the smoothing posterior rather than its filtering counterpart. In the problem under study, we consider a forward-backward smoother formulation [36] to obtain the marginal smoothing posterior, \(p(\mathbf{x}_{k}|\mathbf{y}_{1:N})\),
$$\begin{array}{@{}rcl@{}} p(\mathbf{x}_{k} | \mathbf{y}_{1:N}) &=& \int p(\mathbf{x}_{k}, \mathbf{x}_{k+1} | \mathbf{y}_{1:N}) d\mathbf{x}_{k+1} \\ &=& \int p(\mathbf{x}_{k} | \mathbf{x}_{k+1}, \mathbf{y}_{1:k}) p(\mathbf{x}_{k+1} | \mathbf{y}_{1:N}) d\mathbf{x}_{k+1}\\ &=& \underbrace{p(\mathbf{x}_{k} | \mathbf{y}_{1:k})}_{\text{filtering pdf}} \int \frac{p(\mathbf{x}_{k+1} | \mathbf{y}_{1:N})p(\mathbf{x}_{k+1} | \mathbf{x}_{k})}{\underbrace{p(\mathbf{x}_{k+1} | \mathbf{y}_{1:k})}_{\text{predictive pdf}}} d\mathbf{x}_{k+1}, \end{array} $$
where we used the fact that the state is Markovian, and thus
$$\begin{array}{@{}rcl@{}} p(\mathbf{x}_{k} | \mathbf{x}_{k+1}, \mathbf{y}_{1:N}) &=& p(\mathbf{x}_{k} | \mathbf{x}_{k+1}, \mathbf{y}_{1:k}) \\ &=& \frac{p(\mathbf{x}_{k}, \mathbf{x}_{k+1} | \mathbf{y}_{1:k})}{p(\mathbf{x}_{k+1} | \mathbf{y}_{1:k})}\\ &=& \frac{p(\mathbf{x}_{k+1} | \mathbf{x}_{k})p(\mathbf{x}_{k} | \mathbf{y}_{1:k})}{p(\mathbf{x}_{k+1} | \mathbf{y}_{1:k})}. \end{array} $$
The forward-backward smoother [36] performs two passes: first, a standard forward filtering pass from time k=1 to N, and then a backward pass from k=N to 1. Notice that the predictive and filtering distributions are available from the standard Bayesian filtering solution. At time k, assume that we know the filtering distribution, \(\mathcal {N}\left (\mathbf {x}_{k} ; \hat {\mathbf {x}}_{k|k}, \mathbf {\Sigma }_{x,k|k} \right)\), and the predictive distribution, \(\mathcal {N}\left (\mathbf {x}_{k+1} ; \hat {\mathbf {x}}_{k+1|k}, \mathbf {\Sigma }_{x,k+1|k} \right)\), from the forward filtering, together with the smoothed density at k+1, \(\mathcal {N} \left (\mathbf {x}_{k+1}; \hat {\mathbf {x}}_{k+1|N}, \mathbf {\Sigma }_{x,k+1|N} \right)\), which is available because the smoother runs backwards. Then, the analytical solution to the marginal smoothing posterior is obtained as follows: using the Markovian property of the states, we have that \(p(\mathbf{x}_{k}|\mathbf{x}_{k+1},\mathbf{y}_{1:N})=p(\mathbf{x}_{k}|\mathbf{x}_{k+1},\mathbf{y}_{1:k})\), and we can obtain the conditional smoothing distribution of \(\mathbf{x}_{k}\) as
$$\begin{array}{@{}rcl@{}} p(\mathbf{x}_{k} | \mathbf{x}_{k+1}, \mathbf{y}_{1:N}) = \frac{p\left(\mathbf{x}_{k}, \mathbf{x}_{k+1} | \mathbf{y}_{1:k}\right)}{p(\mathbf{x}_{k+1} | \mathbf{y}_{1:k})}, \end{array} $$
$$\begin{array}{@{}rcl@{}} p(\mathbf{x}_{k},\mathbf{x}_{k+1} | \mathbf{y}_{1:k}) &=& p(\mathbf{x}_{k+1} | \mathbf{x}_{k}) p(\mathbf{x}_{k} | \mathbf{y}_{1:k}),\\ p(\mathbf{x}_{k+1} | \mathbf{y}_{1:k}) &=& \int p(\mathbf{x}_{k+1} | \mathbf{x}_{k}) p(\mathbf{x}_{k} | \mathbf{y}_{1:k}) d\mathbf{x}_{k}. \end{array} $$
The joint smoothing distribution \(p(\mathbf{x}_{k},\mathbf{x}_{k+1}|\mathbf{y}_{1:N})\) is
$$\begin{array}{@{}rcl@{}} p(\mathbf{x}_{k} | \mathbf{x}_{k+1}, \mathbf{y}_{1:N}) p(\mathbf{x}_{k+1} | \mathbf{y}_{1:N}), \end{array} $$
which can be used to obtain the smoothing distribution by marginalization over \(\mathbf{x}_{k+1}\),
$$\begin{array}{@{}rcl@{}} p(\mathbf{x}_{k} | \mathbf{y}_{1:N}) &=& \int p(\mathbf{x}_{k} | \mathbf{x}_{k+1}, \mathbf{y}_{1:N}) \\ && \times\, p(\mathbf{x}_{k+1} | \mathbf{y}_{1:N}) d\mathbf{x}_{k+1}. \end{array} $$
Under the Gaussian assumption, the problem is to recursively obtain the mean and covariance of the Gaussian marginal smoothing posterior distribution, which is given by [37, 38]
$$\begin{array}{@{}rcl@{}} p(\mathbf{x}_{k} | \mathbf{y}_{1:N}) &=& \mathcal{N} \left(\mathbf{x}_{k}; \hat{\mathbf{x}}_{k|N}, \mathbf{\Sigma}_{x,k|N} \right), \end{array} $$
$$\begin{array}{@{}rcl@{}} \hat{\mathbf{x}}_{k|N} &=& \hat{\mathbf{x}}_{k|k} + \mathbf{D}_{k} \left(\hat{\mathbf{x}}_{k+1|N} - \hat{\mathbf{x}}_{k+1|k}\right), \\ \mathbf{\Sigma}_{x,k|N} &=& \mathbf{\Sigma}_{x,k|k} + \mathbf{D}_{k} \left(\mathbf{\Sigma}_{x,k+1|N} - \mathbf{\Sigma}_{x,k+1|k} \right) \mathbf{D}^{\top}_{k}, \end{array} $$

with \(\mathbf {D}_{k} = \mathbf {\Sigma }_{k,k+1|k} \mathbf {\Sigma }^{-1}_{x,k+1|k}\) and \(\mathbf{\Sigma}_{k,k+1|k}\) referring to the cross-covariance between \(\mathbf{x}_{k}\) and \(\mathbf{x}_{k+1}\). Note that in practice the smoothing estimation error covariance, \(\mathbf{\Sigma}_{x,k|N}\), is not required for the recursion of the smoothed mean; however, it is useful as a measure of the smoothing uncertainty. The smoother gain \(\mathbf{D}_{k}\) can be easily obtained from the standard forward filtering pass, therefore adding very little extra computation.
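A single step of this backward recursion can be sketched as follows, assuming the forward pass has stored the filtered moments, the predicted moments, and the cross-covariance at each time step. The function name `smoother_step` and its argument names are hypothetical.

```python
import numpy as np


def smoother_step(x_filt, P_filt, x_pred_next, P_pred_next, C_cross,
                  x_smooth_next, P_smooth_next):
    """One backward step of the Gaussian smoother.

    x_filt, P_filt               : filtered moments at time k
    x_pred_next, P_pred_next     : predicted moments at time k+1 given y_{1:k}
    C_cross                      : cross-covariance between x_k and x_{k+1} given y_{1:k}
    x_smooth_next, P_smooth_next : smoothed moments at time k+1
    """
    # Smoother gain D = C_cross @ inv(P_pred_next), computed with a linear solve.
    D = np.linalg.solve(P_pred_next.T, C_cross.T).T
    x_smooth = x_filt + D @ (x_smooth_next - x_pred_next)
    P_smooth = P_filt + D @ (P_smooth_next - P_pred_next) @ D.T
    return x_smooth, P_smooth, D


# Scalar random-walk example: f(x) = x, Q = 1, filtered variance 1 at time k.
x_s, P_s, D = smoother_step(
    x_filt=np.array([0.0]), P_filt=np.array([[1.0]]),
    x_pred_next=np.array([0.0]), P_pred_next=np.array([[2.0]]),
    C_cross=np.array([[1.0]]),
    x_smooth_next=np.array([1.0]), P_smooth_next=np.array([[1.0]]),
)
```

If the smoothed and predicted moments at k+1 coincide, this step returns the filtered moments unchanged, which is the expected behavior when future data carry no additional information.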

2.3 Sigma-point Gaussian filtering and smoothing

An appealing class of filters and smoothers within the nonlinear Gaussian framework are the sigma-point Gaussian filters (SPGF) [14, 15, 34, 39, 40] and smoothers (SPGS) [37, 38], a family of derivative-free algorithms based on a weighted sum of function values at specified (i.e., deterministic) points within the domain of integration, as opposed to the stochastic sampling performed by particle filtering methods. The idea is to use a set of so-called sigma-points to efficiently characterize the propagation of the normal distribution through the nonlinear system. In the sequel, we detail the formulation of this approximation and how it can be used to perform filtering or smoothing.

2.3.1 Filtering

Consider a set of sigma-points and weights, \(\{ \boldsymbol {\xi }_{i}, \omega _{i}\}_{i=1,\ldots,L_{M}}\). Then, construct the transformed set that captures the mean and covariance of the posterior distribution, \(\mathbf {x}_{i,k-1|k-1}=\mathbf {S}_{x,k-1|k-1} \boldsymbol {\xi }_{i}+\hat {\mathbf {x}}_{k-1|k-1}\), with \(\mathbf {\Sigma }_{x,k-1|k-1} = \mathbf {S}_{x,k-1|k-1}\mathbf {S}_{x,k-1|k-1}^{\top }\). The integrals in the prediction step can be approximated as
$$\begin{array}{@{}rcl@{}} \hat{\mathbf{x}}_{k|k-1} &=&\sum_{i=1}^{L_{M}} \omega_{i} \mathbf{f}(\mathbf{x}_{i,k-1|k-1}),\\ \mathbf{\Sigma}_{x,k|k-1} &=& \sum_{i=1}^{L_{M}} \omega_{i} \mathbf{f}^{2}(\mathbf{x}_{i,k-1|k-1}) - \left(\hat{\mathbf{x}}_{k|k-1} \right)^{2} + \mathbf{Q}_{k}. \end{array} $$
In the subsequent update step, first compute the transformed set that captures the mean and covariance of the predictive marginal distribution, \(\mathbf {x}_{i,k|k-1}=\mathbf {S}_{x,k|k-1} \boldsymbol {\xi }_{i}+\hat {\mathbf {x}}_{k|k-1}\), with \(\mathbf {\Sigma }_{x,k|k-1} = \mathbf {S}_{x,k|k-1} \mathbf {S}_{x,k|k-1}^{\top }\). Then, we approximate the integrals of interest as
$$\begin{array}{@{}rcl@{}} \hat{\mathbf{y}}_{k|k-1} &=& \sum_{i=1}^{L_{M}} \omega_{i} \mathbf{h}(\mathbf{x}_{i,k|k-1}),\\ \mathbf{\Sigma}_{y,k|k-1} &=& \sum_{i=1}^{L_{M}} \omega_{i} \mathbf{h}^{2}\left(\mathbf{x}_{i,k|k-1}\right) - \left(\hat{\mathbf{y}}_{k|k-1}\right)^{2} + \mathbf{R}_{k}, \\ \mathbf{\Sigma}_{xy,k|k-1} &=& \sum_{i=1}^{L_{M}} \omega_{i} \mathbf{x}_{i,k|k-1} \mathbf{h}(\mathbf{x}_{i,k|k-1})^{\top}\\ & & - \hat{\mathbf{x}}_{k|k-1}\hat{\mathbf{y}}_{k|k-1}^{\top}. \end{array} $$
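The weighted sums above can be sketched with one particular sigma-point rule. The example below assumes the third-degree spherical-radial cubature rule used by the CKF (2n points at \(\pm\sqrt{n}\,\mathbf{e}_{i}\) with equal weights), which is only one member of the SPGF family, and a Cholesky factor for \(\mathbf{S}_{x}\). The function names are hypothetical.

```python
import numpy as np


def cubature_points(n):
    """Unit sigma-points and weights of the third-degree cubature rule (an assumed choice)."""
    xi = np.sqrt(n) * np.hstack([np.eye(n), -np.eye(n)])  # shape (n, 2n)
    w = np.full(2 * n, 1.0 / (2 * n))
    return xi, w


def sigma_point_moments(func, mean, cov, noise_cov):
    """Approximate the mean, covariance, and cross-covariance of func(x) + noise.

    x ~ N(mean, cov); func maps a state vector to a 1-D array.
    """
    mean = np.asarray(mean, dtype=float)
    xi, w = cubature_points(mean.size)
    S = np.linalg.cholesky(cov)                  # cov = S @ S.T
    X = S @ xi + mean[:, None]                   # transformed sigma-points, (n, 2n)
    Z = np.column_stack([np.atleast_1d(func(X[:, i])) for i in range(X.shape[1])])
    z_mean = Z @ w
    P_z = (Z * w) @ Z.T - np.outer(z_mean, z_mean) + noise_cov
    P_xz = (X * w) @ Z.T - np.outer(mean, z_mean)
    return z_mean, P_z, P_xz


# Sanity check with a linear model, for which the rule is exact.
F = np.array([[1.0, 1.0], [0.0, 1.0]])
m0 = np.array([1.0, 2.0])
P0 = np.array([[2.0, 0.5], [0.5, 1.0]])
Q = 0.1 * np.eye(2)
x_pred, P_pred, P_cross = sigma_point_moments(lambda x: F @ x, m0, P0, Q)
```

The same function can serve the prediction step (with `func` set to the state transition and `noise_cov` to the process noise covariance) and the update step (with `func` set to the measurement function and `noise_cov` to the measurement noise covariance).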

2.3.2 Smoothing

The smoothed state (17) is obtained using the predicted and smoothed states at time k+1, \(\hat {\mathbf {x}}_{k+1|k}\) and \(\hat {\mathbf {x}}_{k+1|N}\), respectively. Define again a set of sigma-points and weights, \(\{ \boldsymbol {\xi }_{i}, \omega _{i}\}_{i=1,\ldots,L_{M}}\), and the transformed set that captures the corresponding mean and covariance, \(\mathbf {x}_{i,k|k}=\mathbf {S}_{x,k|k}\boldsymbol {\xi }_{i}+\hat {\mathbf {x}}_{k|k}\). Use these transformed sigma-points to estimate the predicted state, its prediction error covariance, and the cross-covariance as
$$\begin{array}{@{}rcl@{}} \hat{\mathbf{x}}_{k+1|k} &=& \sum_{i=1}^{L_{M}} \omega_{i} \mathbf{f}(\mathbf{x}_{i,k|k}),\\ \mathbf{\Sigma}_{x,k+1|k} &=& \sum_{i=1}^{L_{M}} \omega_{i} \mathbf{f}^{2}(\mathbf{x}_{i,k|k}) - \left(\hat{\mathbf{x}}_{k+1|k}\right)^{2} + \mathbf{Q}_{k+1},\\ \mathbf{\Sigma}_{k,k+1|k} &=& \sum_{i=1}^{L_{M}} \omega_{i} \mathbf{x}_{i,k|k} \mathbf{f}(\mathbf{x}_{i,k|k})^{\top} - \hat{\mathbf{x}}_{k|k}\hat{\mathbf{x}}_{k+1|k}^{\top}. \end{array} $$
Finally, estimate the smoothed state and covariance as
$$\begin{array}{@{}rcl@{}} \hat{\mathbf{x}}_{k|N} &=& \hat{\mathbf{x}}_{k|k} + \mathbf{D}_{k} \left(\hat{\mathbf{x}}_{k+1|N} - \hat{\mathbf{x}}_{k+1|k}\right),\\ \mathbf{\Sigma}_{x,k|N} &=& \mathbf{\Sigma}_{x,k|k} + \mathbf{D}_{k} \left(\mathbf{\Sigma}_{x,k+1|N} - \mathbf{\Sigma}_{x,k+1|k} \right) \mathbf{D}_{k}^{\top}, \end{array} $$

with \(\mathbf {D}_{k} = \mathbf {\Sigma }_{k,k+1|k} \mathbf {\Sigma }_{x,k+1|k}^{-1}\). Notice that the smoother gain can be computed within the prediction step of the forward filtering, so that only the last step is performed in the backward recursion. At time k=N, the filtering and smoothing estimates coincide, so the backward pass runs from time N−1 to 1. Compared to the filtering process, implementing the smoothing solution only requires two additional steps in Algorithm 1, where we use the notation \(\mathbf {\Sigma }_{x} = \mathbf {S}_{x}\mathbf {S}_{x}^{\top }\) for the factorized covariances.

3 Hierarchically Gaussian measurement noise formulation

In Section 2, we assumed a Gaussian measurement noise with known covariance matrix. However, in challenging applications such as the NLOS propagation conditions of interest here, the Gaussian assumption does not hold and the noise parameters may be unknown to a certain extent. In such scenarios, outliers or impulsive behaviors may produce biased estimates; for instance, under NLOS conditions, the receiver is likely to estimate distances to the anchors that are larger than the true ones [6]. Therefore, we must consider more accurate observation models.

In general, these non-Gaussian behaviors can be effectively characterized by parametric heavy-tailed and positive-skewed noise distributions. It has been recently shown experimentally that TOA-based positioning under NLOS conditions [7] and multipath ranging error distributions in LTE networks [8] can be well approximated by a skew t-distribution [9]. Taking into account the problem at hand, we are interested in measurement models with independent observation components and measurement noise models where the noise components are independently univariate skew t-distributed
$$ \mathbf{y}_{k} = [y_{k,1},\ldots, y_{k,n_{y}}]^{\top}, $$
$$\begin{array}{@{}rcl@{}} y_{k,i} &=& \mathbf{h}_{k,i}(\mathbf{x}_{k}) + n_{k,i}, \end{array} $$
$$\begin{array}{@{}rcl@{}} \mathbf{n}_{k} &=& [n_{k,1},\ldots, n_{k,n_{y}} ]^{\top}, \end{array} $$
$$\begin{array}{@{}rcl@{}} n_{k,i} &\sim& \mathcal{ST}\left(n_{k,i}; \mu, \sigma^{2}, \lambda, \nu\right), \end{array} $$

with \(\mathcal {ST}\left (n_{k,i}; \mu, \sigma ^{2}, \lambda, \nu \right)\) defined in Section 1.

A key point on the problem formulation is to take advantage of the hierarchically (conditionally) Gaussian formulation of the measurement noise distribution. The hierarchical Gaussian representation of the skew t-distribution is written as [41]
$$\begin{array}{@{}rcl@{}} n_{k,i}| \gamma_{k,i}, \tau_{k,i} &\sim& \mathcal{N}\left({n}_{k,i}; \mu+\lambda \gamma_{k,i}, \tau_{k,i}^{-1} \sigma^{2} \right), \end{array} $$
$$\begin{array}{@{}rcl@{}} \gamma_{k,i} | \tau_{k,i} &\sim& \mathcal{N}_{+} \left(\gamma_{k,i}; 0, \tau_{k,i}^{-1}\right), \end{array} $$
$$\begin{array}{@{}rcl@{}} \tau_{k,i} &\sim& \mathcal{G}\left(\tau_{k,i}; \frac{\nu}{2},\frac{\nu}{2}\right). \end{array} $$

While \(\tau_{k,i}\) controls the heavy-tailed behavior, \(\gamma_{k,i}\) controls the skewness of the distribution.

We can define the vector with \(2 \times n_{y}\) noise distribution latent variables, \(\boldsymbol {\phi }_{k} = \left.\{ \gamma _{k,i}, \tau _{k,i} \}\right |_{i=1,\ldots,n_{y}}\), where we omit the dependence on the hyperparameters (i.e., \(\mu, \sigma^{2}, \lambda, \nu\)) for the sake of clarity.

The measurement noise in (24) can be written as
$$\begin{array}{@{}rcl@{}} \mathbf{n}_{k} | \boldsymbol{\phi}_{k} \sim \mathcal{N}\left(\mathbf{n}_{k}; \mathbf{m}_{k}(\boldsymbol{\phi}_{k}), \mathbf{R}_{k}(\boldsymbol{\phi}_{k}) \right), \end{array} $$

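where \([\mathbf{m}_{k}(\boldsymbol{\phi}_{k})]_{i} = \mu + \lambda \gamma_{k,i}\) and \([\mathbf {R}_{k}(\boldsymbol {\phi }_{k})]_{i,i} = \tau _{k,i}^{-1} \sigma ^{2}\), with zero off-diagonal entries because the noise components are independent.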

The distribution hyperparameters are application dependent and are typically assumed to be known a priori. The standard Gaussian filter/smoother in charge of the state estimation assumes a zero-mean Gaussian measurement noise with known parameters. In the skew t-distributed case, at every time step, the filter requires an estimate of the corresponding mean and covariance, \(\mathbf{m}_{k}(\boldsymbol{\phi}_{k})\) and \(\mathbf{R}_{k}(\boldsymbol{\phi}_{k})\), respectively. In the following, we consider the marginalization of the noise latent variables in the general filter/smoother formulation.
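For given values of the latent variables, the conditional mean and covariance of the measurement noise follow directly from the expressions above. The sketch below builds them for independent noise components; the function name `conditional_noise_moments` and the numerical values are hypothetical.

```python
import numpy as np


def conditional_noise_moments(gamma, tau, mu, sigma2, lam):
    """Mean vector m_k and diagonal covariance R_k of n_k given the latent variables.

    gamma, tau : arrays with one entry per measurement component
    mu, sigma2, lam : skew t hyperparameters (location, squared scale, skewness)
    """
    gamma = np.asarray(gamma, dtype=float)
    tau = np.asarray(tau, dtype=float)
    m = mu + lam * gamma            # [m_k]_i = mu + lam * gamma_{k,i}
    R = np.diag(sigma2 / tau)       # [R_k]_{i,i} = sigma2 / tau_{k,i}
    return m, R


# Two range measurements with illustrative latent values.
m_k, R_k = conditional_noise_moments(
    gamma=[0.5, 1.0], tau=[2.0, 0.5], mu=0.1, sigma2=0.25, lam=2.0
)
```

A filter that assumes zero-mean noise could, under this conditional model, be applied to the shifted measurement `y - m_k` with covariance `R_k`; this is a possible usage suggested here, not a step stated in the text.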

4 Noise latent variables marginalization

In the problem of interest, the measurement noise is conditionally Gaussian with unknown noise latent variables. Therefore, the filtering/smoothing formulation in Section 2 must be modified to take this uncertainty into account. We assume known measurement noise distribution hyperparameters, and thus we want to marginalize the state estimation with respect to the noise latent variables, \(\gamma_{k,i}\) and \(\tau_{k,i}\). We can write the marginalized posterior distribution as
$$\begin{array}{@{}rcl@{}} p(\mathbf{x}_{k}| \mathbf{y}_{1:k}) = \int p(\mathbf{x}_{k}|\boldsymbol{\phi}_{k}, \mathbf{y}_{1:k}) p(\boldsymbol{\phi}_{k} | \mathbf{y}_{1:k}) d\boldsymbol{\phi}_{k}. \end{array} $$
Notice that the measurement noise parameters only affect the computation of the innovation in the measurement update of the filter/smoother. Within this context, the predicted measurement is reformulated as
$$\begin{array}{@{}rcl@{}} \hat{\mathbf{y}}_{k|k-1} &=& \int (\mathbf{h}(\mathbf{x}_{k})+\mathbf{n}_{k}(\boldsymbol{\phi}_{k})) p(\mathbf{x}_{k}| \mathbf{y}_{1:k-1}) d\mathbf{x}_{k},\\ &=& \int \mathbf{h}(\mathbf{x}_{k}) p(\mathbf{x}_{k}| \mathbf{y}_{1:k-1}) d\mathbf{x}_{k}\\ && + \int \mathbf{n}_{k}(\boldsymbol{\phi}_{k}) p(\boldsymbol{\phi}_{k}|\mathbf{y}_{1:k-1}) d\boldsymbol{\phi}_{k}, \end{array} $$
where the first term corresponds to the standard Gaussian case and the second is a marginalization of the noise term over the last available distribution (we write \(\mathbf{n}_{k}(\boldsymbol{\phi}_{k})\) to explicitly emphasize the dependency on the noise latent variables, \(\boldsymbol{\phi}_{k}\)). Using a similar formulation, we can rewrite the innovation covariance as
$$\begin{array}{@{}rcl@{}} \boldsymbol\Sigma_{y,k|k-1} &=& \int \mathbf{h}^{2}(\mathbf{x}_{k}) p(\mathbf{x}_{k} |\mathbf{y}_{1:k-1}) d\mathbf{x}_{k}\\ && - \left(\hat{\mathbf{y}}_{k|k-1}\right)^{2} \,+\, \int \mathbf{n}_{k}(\boldsymbol{\phi}_{k})\mathbf{n}_{k}^{\top}(\boldsymbol{\phi}_{k})p(\boldsymbol{\phi}_{k}|\mathbf{y}_{1:k-1})d\boldsymbol{\phi}_{k}. \end{array} $$
For the marginalization of the noise latent variables, a key point is to obtain the posterior distributions of \(\gamma_{k,i}\) and \(\tau_{k,i}\). The joint posterior is given by
$$\begin{array}{@{}rcl@{}} && p(\gamma_{k,i},\tau_{k,i} | y_{1:k,i})=p(\gamma_{k,i},\tau_{k,i} | y_{k,i}) \\ && \propto p(y_{k,i} | \gamma_{k,i}, \tau_{k,i}) p(\gamma_{k,i} | \tau_{k,i}) p(\tau_{k,i}). \end{array} $$
As the observation components are assumed to be independent, the likelihood function is
$$ y_{k,i}|\gamma_{k,i},\tau_{k,i} \sim \mathcal{N}\left(y_{k,i};\mathbf{h}_{k,i}({\mathbf x}_{k}) + \mu + \lambda \gamma_{k,i}, \tau_{k,i}^{-1} \sigma^{2}\right). $$
We can define a normalized observation
$$ \tilde{y}_{k,i} \triangleq \frac{y_{k,i} - \mathbf{h}_{k,i}(\hat{\mathbf{x}}_{k|k-1}) - \mu}{\sigma}, $$
and \(\tilde {\gamma }_{k,i} \triangleq \lambda \gamma _{k,i}/\sigma \). Then, the normalized likelihood is given by
$$ p(\tilde{y}_{k,i} | \gamma_{k,i}, \tau_{k,i}) = \mathcal{N} \left(\tilde{y}_{k,i}; \tilde{\gamma}_{k,i}, \tau_{k,i}^{-1} \right). $$
Using the conjugate nature of the prior distributions [42], it is possible to obtain the analytical solution for the posterior of \(\gamma_{k,i}\) and \(\tau_{k,i}\). In this case, we have from (28) that the a priori distributions are
$$\begin{array}{@{}rcl@{}} p(\tilde{\gamma}_{k,i} | \tau_{k,i}) &=& \mathcal{N}\left(\tilde{\gamma}_{k,i}; \gamma_{k-1}, (\kappa_{k-1} \tau_{k,i})^{-1} \right), \end{array} $$
$$\begin{array}{@{}rcl@{}} p(\tau_{k,i}) &=& \mathcal{G}\left(\tau_{k,i}; \alpha_{k-1}, \beta_{k-1} \right), \end{array} $$
with \(\gamma_{0} = 0\), \(\kappa_{0} = \sigma^{2}/\lambda^{2}\), and \(\alpha_{0} = \beta_{0} = \nu/2\). We are interested in updating these priors with the new measurement to get the posterior distributions
$$\begin{array}{@{}rcl@{}} p(\tilde{\gamma}_{k,i} | \tau_{k,i}, \tilde{y}_{k,i}) &=& \mathcal{N} \left(\tilde{\gamma}_{k,i}; \gamma_{k}, (\kappa_{k} \tau_{k,i})^{-1} \right), \end{array} $$
$$\begin{array}{@{}rcl@{}} p(\tau_{k,i} | \tilde{y}_{k,i}) &=& \mathcal{G}\left(\tau_{k,i}; \alpha_{k}, \beta_{k} \right), \end{array} $$
$$\begin{array}{@{}rcl@{}} \gamma_{k} &=& \frac{\kappa_{k-1} \gamma_{k-1} + \tilde{y}_{k,i}}{\kappa_{k-1} + 1}, \end{array} $$
$$\begin{array}{@{}rcl@{}} \kappa_{k} &=& \kappa_{k-1} + 1, \end{array} $$
$$\begin{array}{@{}rcl@{}} \alpha_{k} &=& \alpha_{k-1} + \frac{1}{2}, \end{array} $$
$$\begin{array}{@{}rcl@{}} \beta_{k} &=& \beta_{k-1} + \frac{\kappa_{k-1} (\tilde{y}_{k,i} - \gamma_{k-1})^{2}}{2(\kappa_{k-1} + 1)}, \end{array} $$

from basic conjugate analysis results. Interestingly, the posteriors at k in (38) and (39) can be used as the priors at k+1 instead of (36) and (37). In this way, the algorithm learns the environment as it progresses over time. However, given the assumed model, it is more meaningful to reset the prior at each time instant instead of sequentially using the latest posterior. The reason is that measurements are assumed independent, so there is no benefit in carrying information over from one time instant to the next. Under these conditions, we therefore suggest using the values \(\gamma_{0}\), \(\kappa_{0}\), \(\alpha_{0}\), and \(\beta_{0}\) at k−1 before updating the distribution with \(\tilde {y}_{k,i}\). Sequential use of the posterior becomes interesting when the generative model is known to have some memory.
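As an illustration of the conjugate update above, the following minimal Python sketch applies the four hyperparameter recursions to a single normalized observation and resets the prior to the initial values at every time instant, as suggested in the text. The function names `reset_prior` and `normal_gamma_update` are hypothetical and chosen only for this example.

```python
def reset_prior(sigma, lam, nu):
    """Initial hyperparameters (gamma_0, kappa_0, alpha_0, beta_0) given in the text."""
    return 0.0, sigma**2 / lam**2, nu / 2.0, nu / 2.0


def normal_gamma_update(y_tilde, gamma_prev, kappa_prev, alpha_prev, beta_prev):
    """Single-observation Normal-Gamma update of the noise latent variables.

    y_tilde is the normalized observation (y - h(x_pred) - mu) / sigma.
    Returns the posterior hyperparameters (gamma_k, kappa_k, alpha_k, beta_k).
    """
    kappa_new = kappa_prev + 1.0
    gamma_new = (kappa_prev * gamma_prev + y_tilde) / kappa_new
    alpha_new = alpha_prev + 0.5
    beta_new = beta_prev + kappa_prev * (y_tilde - gamma_prev) ** 2 / (2.0 * kappa_new)
    return gamma_new, kappa_new, alpha_new, beta_new


# Example: prior reset at each time instant, then updated with one observation.
prior = reset_prior(sigma=0.3, lam=0.6, nu=4.0)
posterior = normal_gamma_update(2.0, *prior)
```

Because the prior is reset before each update, the function is stateless; a sequential variant would simply feed `posterior` back in as the next prior.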

In [25], we proposed to use a point estimate for \(\gamma_{k,i}\) and \(\tau_{k,i}\) from a single observation using their posterior marginals. The corresponding modes of these distributions were used as point estimates \(\hat {\gamma }_{k,i} = \frac {| \tilde {y}_{k,i} | }{2} \frac {\sigma }{\lambda }\) and \(\hat {\tau }_{k,i} = \frac {\alpha - 1}{\beta }\), where we took into account that \(\hat {\gamma }_{k,i} \in \mathbb {R}^{+}\) by construction. In this contribution, instead of using a point estimate, we consider a Monte Carlo-based marginalization, drawing L samples from the joint posterior given by (38) and (39). Using these samples, we propose to compute the two integrals of interest as
$$\begin{array}{@{}rcl@{}} \tilde{\mathbf{m}}_{k} &=& \int \mathbf{n}_{k}(\boldsymbol{\phi}_{k}) p(\boldsymbol{\phi}_{k}|\mathbf{y}_{1:k-1}) d\boldsymbol{\phi}_{k} \\ & & \approx \frac{1}{L} \sum_{j=1}^{L} \mathbf{m}_{k}\left(\boldsymbol{\phi}^{(j)}_{k}\right), \\ \tilde{\mathbf{R}}_{k} &=& \int \mathbf{n}_{k}(\boldsymbol{\phi}_{k})\mathbf{n}_{k}^{\top}(\boldsymbol{\phi}_{k})p(\boldsymbol{\phi}_{k}|\mathbf{y}_{1:k-1})d\boldsymbol{\phi}_{k}\\ & & \approx \frac{1}{L} \sum_{j=1}^{L} \left(\mathbf{R}_{k}\left(\boldsymbol{\phi}^{(j)}_{k}\right) + \mathbf{m}_{k}\left(\boldsymbol{\phi}^{(j)}_{k}\right)\mathbf{m}^{\top}_{k}\left(\boldsymbol{\phi}^{(j)}_{k}\right)\right), \end{array} $$
with \(\boldsymbol {\phi }^{(j)}_{k}\) being random samples drawn from the joint posterior distribution of the noise latent variables, \(\boldsymbol{\phi}_{k}\). In practice, this can be easily implemented by first drawing a sample from (39) and then, using that sample, drawing from (38). These expressions can be further expanded as follows
$$ \tilde{\mathbf{m}}_{k} \approx \left[\begin{array}{c} \frac{1}{L} \sum_{j=1}^{L} \left(\mu + \lambda {\gamma}^{(j)}_{k,1} \right)\\ \vdots \\ \frac{1}{L} \sum_{j=1}^{L} \left(\mu + \lambda {\gamma}^{(j)}_{k,n_{y}}\right) \end{array}\right], $$
where \({\gamma }^{(j)}_{k,i}\) are random samples drawn from the posterior of \(\gamma_{k,i}\), and \(\tilde {\mathbf {R}}_{k}\) is approximated by a diagonal matrix, whose p-th diagonal element is
$$ \frac{1}{L} \sum_{j=1}^{L} \left(\left(\tau^{(j)}_{k,p}\right)^{-1}\sigma^{2} + \left(\mu + \lambda {\gamma}^{(j)}_{k,p} \right)^{2}\right) $$
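The two Monte Carlo approximations above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `marginalize_noise` and the array layout of `post` are assumptions. The Gamma distribution is taken in its shape/rate form, which is consistent with the mode \((\alpha-1)/\beta\) quoted earlier, and the normalized latent variable is mapped back through \(\lambda\gamma_{k,i} = \sigma\tilde{\gamma}_{k,i}\). The draw from (38) is used as written, without any truncation to positive values.

```python
import numpy as np


def marginalize_noise(post, mu, sigma, L=1000, rng=None):
    """Monte Carlo estimates of m_tilde (vector) and R_tilde (diagonal matrix).

    post: array of shape (n_y, 4) holding the posterior hyperparameters
          (gamma_k, kappa_k, alpha_k, beta_k) of each observation component.
    """
    rng = np.random.default_rng() if rng is None else rng
    g, kappa, alpha, beta = np.asarray(post, dtype=float).T  # each (n_y,)

    # Step 1: tau ~ Gamma(alpha, rate=beta); NumPy expects scale = 1 / rate.
    tau = rng.gamma(shape=alpha, scale=1.0 / beta, size=(L, g.size))
    # Step 2: gamma_tilde | tau ~ N(gamma_k, 1 / (kappa_k * tau)).
    gamma_tilde = rng.normal(loc=g, scale=1.0 / np.sqrt(kappa * tau))

    # Conditional noise mean and variance for each sample phi^(j):
    # mu + lambda * gamma = mu + sigma * gamma_tilde, and sigma^2 / tau.
    mean_j = mu + sigma * gamma_tilde
    var_j = sigma**2 / tau

    m_tilde = mean_j.mean(axis=0)
    R_tilde = np.diag((var_j + mean_j**2).mean(axis=0))
    return m_tilde, R_tilde
```

Drawing all L samples for all components in one vectorized call keeps the cost of this step small compared with the sigma-point propagation.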
Finally, the marginalized Monte Carlo sigma-point Gaussian filter and smoother (MSPGF/S) proposed in this contribution is given by Algorithm 1 with step ?? modified as
$$\begin{array}{@{}rcl@{}} \hat{\mathbf{y}}_{k|k-1} &=& \sum_{i=1}^{L_{M}} \omega_{i} \mathbf{h}(\mathbf{x}_{i,k-1|k-1}) + \tilde{\mathbf{m}}_{k},\\ \mathbf{\Sigma}_{y,k|k-1} &=& \sum_{i=1}^{L_{M}} \omega_{i} \mathbf{h}^{2}(\mathbf{x}_{i,k-1|k-1}) - (\hat{\mathbf{y}}_{k|k-1})^{2} + \tilde{\mathbf{R}}_{k}. \end{array} $$

A further improvement of standard SPGF/S schemes comes from the fact that the filter should preserve the properties of a covariance matrix, namely, its symmetry and positive-definiteness. In practice, however, due to limited arithmetic precision, numerical errors may lead to a loss of these properties. To circumvent this problem, a square-root filter can be considered to propagate the square root of the covariance matrix instead of the covariance itself [33, 34]. We propose to use square-root cubature and quadrature Kalman filters/smoothers (named SCKF/S and SQKF/S, respectively) [38, 43] as the core implementation of the new square-root MSPGF/S. These methods resort to cubature [34] and Gauss-Hermite quadrature rules [15] to approximate the integrals in the optimal solution. While the SCKF/S uses \(L_{c} = 2n_{x}\) sigma-points, in the SQKF/S we have \(L_{q} = \alpha^{n_{x}}\), where \(\alpha\) determines the number of sigma-points per dimension and is typically set to \(\alpha = 3\). A straightforward solution to avoid the exponential increase in computational complexity of the standard QKF in high-dimensional systems is the use of sparse-grid quadrature rules, which reduce the computational complexity with negligible penalty in numerical accuracy [44, 45].
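To make the sigma-point counts concrete, the sketch below generates both point sets for a standard normal density: the third-degree spherical-radial cubature rule with \(2n_{x}\) points and a tensor-product Gauss-Hermite rule with \(\alpha^{n_{x}}\) points. The function names are hypothetical, and the points would still have to be shifted and scaled by the state mean and the covariance square root inside an actual filter.

```python
import itertools

import numpy as np


def cubature_points(n_x):
    """Third-degree spherical-radial rule for N(0, I): 2 * n_x equally weighted points."""
    points = np.sqrt(n_x) * np.vstack([np.eye(n_x), -np.eye(n_x)])
    weights = np.full(2 * n_x, 1.0 / (2 * n_x))
    return points, weights


def gauss_hermite_points(n_x, alpha=3):
    """Tensor-product Gauss-Hermite rule for N(0, I): alpha ** n_x points."""
    # hermegauss uses the weight function exp(-x^2 / 2) (probabilists' Hermite).
    x1, w1 = np.polynomial.hermite_e.hermegauss(alpha)
    w1 = w1 / w1.sum()  # normalize so that the weights integrate a pdf
    points = np.array(list(itertools.product(x1, repeat=n_x)))
    weights = np.array([np.prod(c) for c in itertools.product(w1, repeat=n_x)])
    return points, weights


# Example with a four-dimensional state and alpha = 3 points per dimension.
pc, wc = cubature_points(4)
pq, wq = gauss_hermite_points(4, alpha=3)
```

The exponential growth of the tensor-product grid is visible directly in `len(pq)`, which is the motivation for the sparse-grid rules mentioned above.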

5 Application to indoor localization

5.1 SSM for the TOA-based localization problem

To show the performance of the proposed approach, we consider a TOA-based localization problem, where a set of N anchor nodes at known locations, \(\mathbf{x}_{k,i} = [x_{k,i}, y_{k,i}]\), provide range information. If we define the state to be inferred as the position and velocity components of the target, \(\mathbf {p}_{k} \triangleq (x_{k}, y_{k})^{\top }\) and \(\mathbf {v}_{k} \triangleq (\dot {x}_{k}, \dot {y}_{k})^{\top }\), respectively, the observed range from each node i to the target is modeled as \(\hat {\rho }_{k,i} = \rho _{i}(\mathbf {x}_{k}) + n_{k,i}\), \(i \in \{1,\ldots,N\}\), with \(n_{k,i}\) denoting the ranging error and \(\rho _{i}(\mathbf {x}_{k}) \triangleq \rho _{k,i} = \|\mathbf {x}_{k} - \mathbf {x}_{k,i}\|\) the true distance from the i-th node to the target node at time k. The complete measurement equation is given by
$$\begin{array}{@{}rcl@{}} {\boldsymbol \rho}_{k} &=& \left[ \rho_{k,1}, \cdots, \rho_{k,N} \right]^{\top} \\ {} &=& \underbrace{\left(\begin{array}{c} \|\mathbf{x}_{k} - \mathbf{x}_{k,1}\| \\ \vdots \\ \|\mathbf{x}_{k} - \mathbf{x}_{k,N}\| \end{array} \right)}_{\mathbf{h}_{k}({\mathbf x}_{k})} + \underbrace{\left(\begin{array}{c} n_{k,1} \\ \vdots \\ n_{k,N} \end{array}\right)}_{{\mathbf n}_{k}}. \end{array} $$
In standard localization applications, the measurement noise is nominally distributed according to \(\mathbf {n}_{k} \sim \mathcal {N}\left (\mathbf {n}_{k} ; \mathbf {0}, \sigma ^{2} \cdot \mathbf {I}_{N}\right)\), where \(\sigma\) depends on the technology used to obtain the ranging estimates. In the case of UWB devices, this is typically on the order of 0.1 to 1 m. However, the Gaussian distribution does not capture the NLOS propagation conditions [7]; thus, we must resort to more accurate measurement models such as the skew t-distribution introduced in the previous sections. Using the noise \(\mathbf{n}_{k}\) defined in (29), the measurement equation is defined as
$$\begin{array}{@{}rcl@{}} {\boldsymbol \rho}_{k} = \mathbf{h}_{k} \left(\mathbf{x}_{k} \right) + \mathbf{n}_{k}, \; {n}_{k,i} \sim \mathcal{ST}\left({n}_{k,i}; \mu, \sigma^{2}, \lambda, \nu\right). \end{array} $$
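For simulation purposes, skew t-distributed ranging errors can be drawn through the conditionally Gaussian form used in Section 4. The sketch below assumes the hierarchical representation implied by the likelihood and priors given there: \(\tau \sim \mathcal{G}(\nu/2, \nu/2)\) in shape/rate form, \(\gamma\) given \(\tau\) distributed as a zero-mean Gaussian with variance \(\tau^{-1}\) restricted to positive values, and the noise Gaussian with mean \(\mu + \lambda\gamma\) and variance \(\tau^{-1}\sigma^{2}\) given both. The function name `sample_skew_t` is hypothetical.

```python
import numpy as np


def sample_skew_t(mu, sigma, lam, nu, size, rng=None):
    """Draw skew-t noise samples via its conditionally Gaussian representation."""
    rng = np.random.default_rng() if rng is None else rng
    # tau ~ Gamma(nu/2, rate=nu/2); NumPy expects scale = 1 / rate.
    tau = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=size)
    # gamma | tau: half-normal with scale tau^(-1/2), so gamma >= 0.
    gamma = np.abs(rng.standard_normal(size)) / np.sqrt(tau)
    # n | gamma, tau ~ N(mu + lam * gamma, sigma^2 / tau).
    eps = rng.standard_normal(size)
    return mu + lam * gamma + sigma * eps / np.sqrt(tau)


# Example with the hyperparameters used later in the numerical results.
noise = sample_skew_t(mu=-0.1, sigma=0.3, lam=0.6, nu=4.0, size=10000)
```

With a positive \(\lambda\) the samples are skewed toward positive ranging errors, which is the behavior expected under NLOS propagation.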
Considering the state \(\mathbf {x}_{k} = [x_{k},y_{k},\dot {x}_{k},\dot {y}_{k}]^{\top }\), the process equation is defined as a linear nearly constant velocity model driven by random accelerations
$$\begin{array}{@{}rcl@{}} \mathbf{x}_{k} = \mathbf{F} \mathbf{x}_{k-1} + \mathbf{G} \mathbf{u}_{k}, \; \; \mathbf{u}_{k} \sim \mathcal{N}(\mathbf{u}_{k}; \mathbf{0},\mathbf{Q}), \end{array} $$
$$ \mathbf{F} = \left(\begin{array}{cccc} 1 & 0 & T & 0 \\ 0 & 1 & 0 & T \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right) \; \text{and} \; \mathbf{G} = \left(\begin{array}{cc} \frac{T^{2}}{2} & 0 \\ 0 & \frac{T^{2}}{2} \\ T & 0 \\ 0 & T \end{array} \right). $$

The Gaussian process noise \(\mathbf {u}_{k} \sim \mathcal {N}\left (\mathbf {u}_{k};\mathbf {0},\sigma ^{2}_{u} \mathbf {I}_{2}\right)\) models an acceleration of \(\sigma_{u}\) m/s\(^{2}\).
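The state-space model of this subsection can be sketched in a few lines of Python. The function names are hypothetical, and the sampling period T is left as an argument because its value is not stated here. Note that the range function uses only the position part of the state when computing distances to the anchors.

```python
import numpy as np


def motion_matrices(T):
    """Transition matrix F and noise gain G for the state [x, y, vx, vy]."""
    F = np.array([[1, 0, T, 0],
                  [0, 1, 0, T],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    G = np.array([[T**2 / 2, 0],
                  [0, T**2 / 2],
                  [T, 0],
                  [0, T]], dtype=float)
    return F, G


def ranges(x, anchors):
    """Noise-free measurement h(x): distances from the target to N anchors (N, 2)."""
    return np.linalg.norm(anchors - x[:2], axis=1)


def simulate(x0, n_steps, T, sigma_u, rng=None):
    """Simulate x_k = F x_{k-1} + G u_k with u_k ~ N(0, sigma_u^2 I_2)."""
    rng = np.random.default_rng() if rng is None else rng
    F, G = motion_matrices(T)
    traj = np.empty((n_steps + 1, 4))
    traj[0] = x0
    for k in range(1, n_steps + 1):
        u = sigma_u * rng.standard_normal(2)
        traj[k] = F @ traj[k - 1] + G @ u
    return traj
```

Noisy range observations are then obtained by adding a ranging-error vector to `ranges(traj[k], anchors)` at each time step.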

5.2 Numerical results

For the numerical evaluation of the proposed method, the root mean square error (RMSE) of position is used as the measure of performance, which is obtained from 1000 Monte Carlo runs. The new method was validated in a realistic scenario composed of N=6 anchor nodes, circularly deployed in a 40×40 m\(^{2}\) area, and considering \(\sigma_{u} = 0.03\) m/s\(^{2}\). We compare the tracking performance obtained with four methods:
  1. SQKF/S operating under the Gaussian assumption without accounting for the non-Gaussian nature of the measurement noise (SQKF/S-G).

  2. SQKF/S using point estimates of the noise latent variables \(\psi_{k,i}\) as proposed in [25] (SQKF/S-P).

  3. New square-root SPGF/S-based solution with marginalized noise latent variables \(\boldsymbol{\phi}_{k}\) within the filter/smoother formulation via Monte Carlo sampling (MSPGF/S).

  4. A clairvoyant SQKF/S that knows exactly the realization of the latent variables \(\boldsymbol{\phi}_{k}\) at each instant k and thus can use \(\mathbf{m}_{k}(\boldsymbol{\phi}_{k})\) and \(\mathbf{R}_{k}(\boldsymbol{\phi}_{k})\). This is the performance benchmark for the new methodology (SQKF/S-K).
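For reference, the position RMSE used to compare these methods can be computed from the Monte Carlo runs as in the sketch below. The function name and the array layout (runs, time steps, coordinates) are assumptions made for this example.

```python
import numpy as np


def position_rmse(estimates, truth):
    """RMSE of position per time step, averaged over Monte Carlo runs.

    estimates, truth: arrays of shape (n_runs, n_steps, 2) with (x, y) positions.
    Returns an array of shape (n_steps,).
    """
    squared_error = np.sum((estimates - truth) ** 2, axis=-1)  # (n_runs, n_steps)
    return np.sqrt(squared_error.mean(axis=0))
```

Averaging the returned vector over time gives a single mean RMSE figure per method.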


We also considered a sampling importance resampling PF with 81 particles (i.e., equal to the number of sigma-points in the SQKF/S), but as already shown in [46], the filter is in general not able to correctly localize the target (i.e., the filter diverges). Moreover, to obtain the same performance as the clairvoyant SQKF/S, a much larger number of particles must be considered. For this reason, these results are not shown in the figures: in a fair comparison in terms of number of particles, the PF does not provide a convergent result.

The proposed MSPGF/S can be implemented using cubature [34] and Gauss-Hermite approximations [15], which use \(L_{c} = 2n_{x} = 8\) and \(L_{q} = \alpha^{n_{x}} = 81\) deterministic samples, respectively, to approximate the integrals of the general solution. In the proposed indoor localization scenario, we tested both cubature and quadrature approximations and found their performance to be strictly equivalent. In practice, the method of choice is the cubature-based solution, which has the lowest computational complexity.

Notice that all the methods consider known distribution hyperparameters, which are application dependent. We consider a realistic UWB TOA-based indoor localization scenario, with hyperparameters given in [7] and adjusted to match real data: \(\mu = -0.1\) m, \(\sigma = 0.3\) m, \(\lambda = 0.6\) m, and \(\nu = 4\). The corresponding Gaussian approximation is given by \(\mu_{G} = 1.3\) and \(\sigma_{G} = 1.6\).

Figures 1 and 2 show the filtering and smoothing RMSE of position, respectively, for the different methods, with L=1000 Monte Carlo samples for the MSPGF/S. Both cases lead to similar conclusions. Although the clairvoyant filter/smoother (SQKF/S-K) with fully known measurement noise parameters outperforms the rest, the proposed methodology, which treats the noise latent variables as unknown, incurs only a small performance loss. The new marginalized approach is more robust and outperforms the SQKF/S-P using point estimates, first proposed in the filtering context in [25]. The SQKF/S-G operating under the full Gaussian assumption shows the worst performance, even if the parameters of the underlying Gaussian noise (i.e., \(\mu_{G}\) and \(\sigma_{G}\)) are correctly fitted to the real data. This is because this filter/smoother does not take into account the NLOS-induced outliers in the measurement noise.

For the sake of completeness, we assess the impact of the Monte Carlo sample size on the MSPGF/S performance. The mean RMSE of position and velocity, for the different methods and several representative values of L, is given in Table 1. The performance of the proposed approach is not degraded when using a sample size as low as L=50 samples, which makes it possible to keep the overall computational complexity low.
Fig. 1 Filtering RMSE of position with ν=4

Fig. 2 Smoothing RMSE of position with ν=4

Table 1 Mean RMSE of position and velocity versus Monte Carlo sample size L in the MSPGS

Columns: RMSE position, RMSE velocity. Rows: SQKS-K (known noise), MSPGS L=1000.

Notice that the parameter ν of the skew t-distribution controls the tails of the distribution: a lower ν implies heavier tails, and thus more outliers and impulsive behavior. To fully characterize the new method, Figs. 3 and 4 show the performance obtained in the same UWB TOA-based localization scenario but now with ν=2, which induces more outliers in the measurements. The proposed method correctly deals with the non-Gaussian noise and approaches the optimal solution. In this case, the performance of the filter/smoother under the Gaussian assumption (SQKF/S-G) is very poor, because the underlying noise distribution is more heavy-tailed and the Gaussian approximation is no longer valid.
Fig. 3 Filtering RMSE of position with ν=2

Fig. 4 Smoothing RMSE of position with ν=2

6 Conclusions

This article presented a new Bayesian filtering and smoothing framework to deal with nonlinear systems corrupted by parametric heavy-tailed skew t-distributed measurement noise. The new method takes advantage of the conditionally Gaussian form of the skew t-distribution, which allows the use of a computationally light Gaussian filter and smoother for state estimation. The unknown non-Gaussian noise latent variables are marginalized from the general filtering/smoothing solution via Monte Carlo sampling. The performance of the new solution was evaluated in a representative TOA-based localization scenario, where the positively skewed behavior of NLOS propagation conditions is typically modeled using such non-Gaussian distributions.

7 Endnote

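1 We write \((\mathbf{x})^{2}\), \((\mathbf{y})^{2}\), \(\mathbf{f}^{2}(\cdot)\), and \(\mathbf{h}^{2}(\cdot)\) as shorthand for \(\mathbf{x}\mathbf{x}^{\top}\), \(\mathbf{y}\mathbf{y}^{\top}\), \(\mathbf{f}(\cdot)\mathbf{f}^{\top}(\cdot)\), and \(\mathbf{h}(\cdot)\mathbf{h}^{\top}(\cdot)\), respectively. We omit the time dependence of \(\mathbf{f}_{k-1}(\cdot)\) and \(\mathbf{h}_{k}(\cdot)\) for the sake of clarity.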



This work has been supported by the Spanish Ministry of Economy and Competitiveness through project TEC2015-69868-C2-2-R (ADVENTURE) and by the Government of Catalonia under Grant 2014–SGR–1567.

Competing interests

The authors declare that they have no competing interests.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors’ Affiliations

Centre Tecnològic de Telecomunicacions de Catalunya (CTTC/CERCA)
Northeastern University


  1. D Dardari, E Falletti, M Luise, Satellite and terrestrial radio positioning techniques: a signal processing perspective (Elsevier Academic Press, Oxford, 2011).
  2. MG Amin, P Closas, A Broumandan, JL Volakis, Vulnerabilities, threats, and authentication in satellite-based navigation systems [scanning the issue]. Proc. IEEE. 104(6), 1169–1173 (2016).
  3. JT Curran, M Navarro, M Anghileri, P Closas, S Pfletschinger, Coding aspects of secure GNSS receivers. Proc. IEEE. 104(6), 1271–1287 (2016).
  4. R Ioannides, T Pany, G Gibbons, Known vulnerabilities of global navigation satellite systems, status, and potential mitigation techniques. Proc. IEEE. 104(6), 1174–1194 (2016).
  5. H Liu, H Darabi, P Banerjee, J Liu, Survey of wireless indoor positioning techniques and systems. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 37(6), 1067–1080 (2007).
  6. D Dardari, P Closas, PM Djuric, Indoor tracking: theory, methods, and technologies. IEEE Trans. Veh. Technol. 64(4), 1263–1278 (2015).
  7. H Nurminen, T Ardeshiri, R Piché, F Gustafsson, in Proc. of the 2015 International Conference on Indoor Positioning and Indoor Navigation (IPIN). A NLOS-robust TOA positioning filter based on a skew-t measurement noise model (IEEE, Banff, AB, Canada, 2015).
  8. P Müller, et al, Statistical trilateration with skew-t distributed errors in LTE networks. IEEE Trans. Wirel. Commun. 15(10), 7114–7127 (2016).
  9. T-I Lin, JC Lee, WJ Hsieh, Robust mixture modeling using the skew t distribution. Stat. Comput. 17, 81–92 (2007).
  10. S Zozor, C Vignat, Some results on the denoising problem in the elliptically distributed context. IEEE Trans. Sig. Process. 58(1), 134–150 (2010).
  11. MJ Lombardi, SJ Godsill, On–line Bayesian estimation of signals in symmetric alpha-stable noise. IEEE Trans. Sig. Process. 54(2), 775–779 (2006).
  12. J Vilà-Valls, C Fernández-Prades, P Closas, JA Fernández-Rubio, in Proc. of the European Signal Processing Conference, Eusipco 2011. Bayesian filtering for nonlinear state–space models in symmetric α–stable measurement noise (IEEE, Barcelona, 2011), pp. 674–678.
  13. J Vilà-Valls, P Closas, C Fernández-Prades, JA Fernández-Rubio, in Proc. of the European Signal Processing Conference (Eusipco). Nonlinear Bayesian filtering in the Gaussian scale mixture context (IEEE, Bucharest, 2012), pp. 529–533.
  14. K Ito, K Xiong, Gaussian filters for nonlinear filtering problems. IEEE Trans. Autom. Control. 45(5), 910–927 (2000).
  15. I Arasaratnam, S Haykin, RJ Elliott, Discrete-time nonlinear filtering algorithms using Gauss-Hermite quadrature. Proc. IEEE. 95(5), 953–977 (2007).
  16. G Agamennoni, JI Nieto, EM Nebot, Approximate inference in state-space models with heavy-tailed noise. IEEE Trans. Sig. Process. 60(10), 5024–5037 (2012).
  17. R Piché, S Särkkä, J Hartikainen, in Proc. of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP). Recursive outlier-robust filtering and smoothing for nonlinear systems using the multivariate Student-t distribution (IEEE, Bucharest, 2012).
  18. S Saha, Noise robust online inference for linear dynamic systems (2015). Accessed 24 Aug 2017.
  19. H Nurminen, T Ardeshiri, R Piché, F Gustafsson, Robust inference for state-space models with skewed measurement noise. IEEE Sig. Process. Lett. 22(11), 1898–1902 (2015).
  20. N Chopin, PE Jacob, O Papaspiliopoulos, SMC2: an efficient algorithm for sequential analysis of state-space models. J. R. Statist. Soc. B. 75(3), 397–426 (2013).
  21. L Martino, J Read, V Elvira, F Louzada, Cooperative parallel particle filters for on-line model selection and applications to urban mobility. Digit. Sig. Process. 60, 172–185 (2017).
  22. CC Drovandi, J McGree, AN Pettitt, A sequential Monte Carlo algorithm to incorporate model uncertainty in Bayesian sequential design. J. Comput. Graph. Stat. 23(1), 3–24 (2014).
  23. I Urteaga, MF Bugallo, PM Djurić, in Proc. of the IEEE Statistical Signal Processing Workshop (SSP). Sequential Monte Carlo methods under model uncertainty (IEEE, Palma de Mallorca, 2016).
  24. F Daum, J Huang, in Proc. of IEEE Aerospace Conference. Curse of dimensionality and particle filters (IEEE, Big Sky, 2003).
  25. P Closas, J Vilà-Valls, in Proc. of the IEEE Statistical Signal Processing Workshop (SSP). NLOS mitigation in TOA-based indoor localization by nonlinear filtering under skew t-distributed measurement noise (IEEE, Palma de Mallorca, 2016).
  26. H Sorenson, in Recursive estimation for nonlinear dynamic systems, ed. by JC Spall (Marcel Dekker, New York, 1988).
  27. RE Kalman, A new approach to linear filtering and prediction problems. J. Basic Eng. Trans. ASME. 82(1), 35–45 (1960).
  28. Z Chen, Bayesian filtering: from Kalman filters to particle filters, and beyond (2003). Tech. Rep. Adaptive Syst. Lab. McMaster University. Ontario.
  29. A Doucet, N de Freitas, N Gordon (eds.), Sequential Monte Carlo Methods in Practice (Springer, New York, 2001).
  30. S Arulampalam, S Maskell, N Gordon, T Clapp, A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing. 50(2), 174–188 (2002).
  31. PM Djurić, JH Kotecha, J Zhang, Y Huang, T Ghirmai, MF Bugallo, J Míguez, Particle filtering. IEEE Signal Processing Magazine. 20(5), 19–38 (2003).
  32. B Ristic, S Arulampalam, N Gordon (eds.), Beyond the Kalman Filter: Particle Filters for Tracking Applications (Artech House, Boston, 2004).
  33. I Arasaratnam, S Haykin, Square-root quadrature Kalman filtering. IEEE Trans. Sig. Process. 56(6), 2589–2593 (2008).
  34. I Arasaratnam, S Haykin, Cubature Kalman filters. IEEE Trans. Autom. Control. 54(6), 1254–1269 (2009).
  35. B Anderson, JB Moore, Optimal Filtering (Prentice-Hall, Englewood Cliffs, 1979).
  36. HE Rauch, F Tung, CT Striebel, Maximum likelihood estimates of linear dynamic systems. AIAA J. 3(8), 1445–1450 (1965).
  37. S Särkkä, Unscented Rauch-Tung-Striebel smoother. IEEE Trans. Autom. Control. 53(3), 845–849 (2008).
  38. I Arasaratnam, S Haykin, Cubature Kalman smoothers. Automatica. 47, 2245–2250 (2011).
  39. SJ Julier, JK Uhlmann, HF Durrant-Whyte, A new method for nonlinear transformation of means and covariances in filters and estimators. IEEE Trans. Autom. Control. 45(3), 472–482 (2000).
  40. M Norgaard, NK Poulsen, O Ravn, New developments in state estimation of nonlinear systems. Automatica. 36, 1627–1638 (2000).
  41. T-I Lin, Robust mixture modeling using multivariate skew t distributions. Stat. Comput. 20, 343–356 (2010).
  42. JM Bernardo, AFM Smith, Bayesian Theory, vol. 405 (Wiley, New York, 2009).
  43. S Särkkä, Bayesian Filtering and Smoothing (Cambridge University Press, Cambridge, 2013).
  44. P Closas, J Vilà-Valls, C Fernández-Prades, in Proc. of the CAMSAP’15. Computational complexity reduction techniques for quadrature Kalman filters (IEEE, Cancun, 2015).
  45. B Jia, M Xin, Y Cheng, Sparse-grid quadrature nonlinear filtering. Automatica. 48(2), 327–341 (2012).
  46. J Vilà-Valls, P Closas, AF García-Fernández, Uncertainty exchange through multiple quadrature Kalman filtering. IEEE Sig. Process. Lett. 23(12), 1825–1829 (2016).


© The Author(s) 2017