Robust error estimation based on factor-graph models for non-line-of-sight localization

This paper presents a method to estimate the covariances of the inputs in a factor-graph formulation for localization under non-line-of-sight conditions. A general solution based on covariance estimation and M-estimators in linear regression problems, is presented that is shown to give unbiased estimators of multiple variances and are robust against outliers. An iteratively re-weighted least squares algorithm is proposed to jointly compute the proposed variance estimators and the state estimates for the nonlinear factor graph optimization. The efficacy of the method is illustrated in a simulation study using a robot localization problem under various process and measurement models and measurement outlier scenarios. A case study involving a Global Positioning System based localization in an urban environment and data containing multipath problems demonstrates the application of the proposed technique.

Many data fusion applications require combining measurements from heterogenous sensors or matching features of multiple measurements, both tasks requiring an accurate characterization of the noise covariance matrices of measurements.In particular, GPS-based localization in urban or contested environments relies on fusion of measurements from inertial measurement unit (IMU) or light detection and ranging (LiDAR) systems in the navigation solution [7] to mitigate multipath problems.In vision-based localization, combining sequential measurements requires landmarks identified in these measurements to be matched, which assumes accurate knowledge of covariances [8,9].One of the common assumptions made in Bayesian state estimation of dynamic processes is that the covariance matrices for the noise sources are known or some reliable estimates exist a priori.Unfortunately, accurate a priori knowledge or estimates of the covariance matrices in practice may not be available and inaccurate covariance estimates can lead to a significant degradation in the estimation quality of the system states.In addition, as the applications of Bayesian estimation move from high-quality, expensive systems (e.g., the Apollo mission) to lower-cost systems with lower-quality sensors (e.g., cell phones), the ability to accurately characterize the uncertainty of all inputs to the system become both less repeatable across a class of sensors and less economically feasible.
In this paper, we propose new unbiased estimators of the noise covariances in a factor graph formulation when sensor data is contaminated with outliers and Gaussianity of data is not satisfied, particularly applicable to the GPS-based localization problem under non-line-of-sight conditions.The contributions of the paper are the extending of the unbiased variance estimators for factor-graph problem proposed in [10] to the multipath problem, the study of both linear and nonlinear vehicle motion models, and the investigation of the performance of the approach with real GPS data.A general solution, based on nonlinear regression and robust estimation is proposed that is shown to give unbiased estimators of the multiple variances in factor graph formulation.To jointly compute the proposed variance estimators and the state estimates, an iteratively re-weighted least squares (IRLS) algorithm is presented.By contrast to the existing approaches that rely on the maximum likelihood principle and use sample variances of the residuals to estimate noise variances, the primary contribution of the paper is to incorporate unbiased estimate of the noise variances, which, as will be illustrated, can achieve significant improvement in localization accuracy over the existing approaches. 1

Review of relevant literature
Factor graphs have become a popular smoothing-based alternative to sequential estimators of dynamic systems.Figure 1 illustrates the factor graph representation of a robot localization problem, in which states x 1 , . . ., x n form a sequence of n unknown robot

… Factor Types
Prior Dynamics Measurements Fig. 1 A simple factor graph expression of Eq. ( 4) 1 The software for this study can be obtained at https:// github.com/ avanli/ Factor-Graph-Covar.
locations.The factor graph provides a convenient way to represent the multiplicative relation of the "factors" (shown as boxes with different colors), or the probabilistic representations of states and measurements based on prior beliefs, process dynamics and measurement equations, and construct the optimization problem from which to solve for the robot locations.
In the simultaneous localization and mapping (SLAM) problem, where the factor graph formulation is commonly applied, it was shown that sequential state estimators may provide inconsistent solutions due to the inherent model nonlinearities and the accumulating effects of the linearization choices [11].There has been a recent interest in smoothing as a fast alternative to sequential estimation, with its advantage of retaining the entire robot trajectory that significantly helps the performance [12].Based off prior smoothing work [13,14], the factor graph formulation of the estimation problem enables batch optimization of the system state estimates leading to better performance with nonlinear systems.In addition, the factor graph formulation easily handles situations where the type, frequency or quantity of measurements may change over time.In the SLAM problem, the subset of landmarks being tracked changes over time, meaning that the portions of the state that affect the measurement differ.This complex interaction of the state with the measurements is something easily handled in the factor graph formulation.Despite factor graph implementations being batch approaches, the sparsity of the matrix used to represent the factor graph leads to highly efficient implementations, including many that can be performed in real-time (e.g., [15]).
To estimate the states of the factor graph formulation, several Bayesian estimation algorithms have been introduced, including the sequential approaches of (extended, unscented, ensemble) Kalman filter and particle filter and smoothing-based techniques [16].For non-line-of-sight localization, robust estimation and weighted least squares methods were investigated to mitigate multipath effects [17,18].Recently, methods based on Gaussian mixture models (GMM) and expectation-maximization (EM) framework have been proposed for robust estimation of both the states and noise covariances in a factor graph framework [19][20][21][22].Pfeifer et al. [20] formulated the problem as a maximum likelihood problem and used an EM to achieve robustness against non-Gaussianity.By contrast [19,21,22] formulated the problem as a Bayesian estimation using maximum a posteriori (MAP) estimator and non-informative priors.As it is well known, a Bayesian MAP estimation with non-informative priors is equivalent to maximum likelihood inference of the joint conditional probability of the measurements.
As the review outlined, most of the existing robust estimation methods relied on sample covariances of the residuals, which maximizes the likelihood given the state estimates, as the direct estimators of the noise covariance matrices.As we show in the paper, these estimators are biased [23, p. 103] and may lead to a significant under-estimation of the covariance matrices, adversely impacting the localization accuracy.Other prior efforts focused on estimating the state covariance for factor graphs (see e.g., [8]) all assume accurate input noise covariances are known a-priori.Our focus in this paper, by contrast, is on the estimation of input noise (measurement and process) covariance [24] for the factor graph.

Review of factor graph formulation
We first review the existing factor graph formulation of the state estimation (localization) and the noise variance estimation problems.Consider a discrete-time dynamic system: where x t ∈ IR d x and u t ∈ IR d u are the state and control input vectors at time t, respec- tively, z t ∈ IR d z is the sensor measurement vector, with d x , d u and d z representing the dimensions of the vectors, f(x, u) is the process dynamics model representing what the state is at the next timestep given the current state and inputs, g(x) is the measurement model giving the sensor measurement at the current time given the current state, and v t and w t are white, zero-mean Gaussian noise sources with covariance matrices Q and R, respectively.For the GPS-based localization problem, the states x t may comprise the location coordinates of the vehicle and the measurements z t may comprise vehicle odometry data or pseudorange measurements acquired from visible satellites.The measurements are conditionally independent.Localization problem consists of estimating the states, that is, solving for the sequence of vehicle locations x 1 , . . ., x n on the basis of a sequence of measurements z 1 , . . ., z n where n is the number of measurements available.

State estimation
To find the estimators of the system states, the following optimization problem is solved [15]: where x 0 is the initial pose, and X = {x t } n t=0 is the sequence of system states to be solved.This optimization problem can also be expressed as a factor graph as shown in Fig. 1, where the factors (shown as squares) connect only with the state (or portion of the state) that are required for that measurement/dynamic equation to be evaluated.
Based on the Gaussian process dynamics and measurement models, the probability densities shown in the factorization are p( , where the sign ∝ means "proportional to" and implies the density functions are given up to a proportionality constant.Note that because Q and R are assumed constant across time, the proportionality constants can be safely ignored.Because log(x) is a strictly monotonic function in x, we can rewrite Eq. (3) as: assuming p(x 0 ) ∼ N (µ 0 , � 0 ) , where µ 0 and 0 are the mean and covariance of the initial pose.Since the model equations can be nonlinear in general, the states are solved for by linearizing the process and measurement equations using a first-order Taylor series expansion around an operating point {x 0 t } n t=0 as (1) where F (x t−1 ) = ∂f (x t−1 , u t )/∂x t−1 and G(x t ) = ∂g(x t )/∂x t are the Jacobian matri- ces of the process and measurement equations, respectively, and Defining the vectors a 0 t x 0 t − f (x 0 t−1 , u t ) and c 0 t z t − g(x 0 t ) and letting x 0 = x 0 − x 0 0 , a linear least squares representation of the problem in ( 4) is obtained as: , and G t G(x 0 t ) .Each iteration of the state esti- mation problem can be written as that of finding the solution of the following system of linear equations: where A ∈ IR m×p is the coefficient matrix, b ∈ IR m is the standardized measurement vector, ǫ ∈ IR m is the vector of least squares solution errors, m = (n + 1)d x + nd z is the number of observations and p = (n + 1)d x is the number of states.If the initial pose x 0 is known and not to be included in the estimation problem then X = {x t } n t=1 are the states to be solved for, m = nd x + nd z and p = nd x .
The usual practice in factor graph models is to follow sparse linear algebra and solve for X using a matrix factorization algorithm such as QR or Cholesky factorization fol- lowed by forward or back substitution [12].The state estimates for the next iteration are found as X ← X + X , and the new matrices A and b are found at the linearization point of X.These iterations are continued until convergence.However, to facilitate stating the problem and proving the unbiasedness of the variance estimates we will show in the proposed method (see Sect. 4.1) the solution of X in terms of a generalized inverse, such as the Moore-Penrose pseudo-inverse (typically (A T A) −1 A T ).
As an example, for n = 4 poses, the factor-graph model given in Fig. 1 is written as arg min

Noise variance estimation
In the existing smoothing-based factor graph methods [19][20][21][22], the measurement noise covariances are estimated iteratively in conjunction with the state estimates, following a Gaussian mixture model (GMM) and an expectation and maximization (EM) approach.The maximization step in EM maximizes the likelihood, which minimizes the negative log likelihood (4), given current state estimates, to find the variances.This maximum likelihood (ML) estimator of variances estimates the variances as the sample variance of the measured state estimation error sequences e t = x t − xt [25, p. 79] as A separate maximum likelihood estimate with the residuals of the initial pose, process dynamics and measurements are calculated, respectively, and these estimates form the constant diagonal elements of the covariance matrices 0 , Q and R. To make the estimates robust against outliers, Sunderhauf et al. [1,4] introduced the concept of switchable constraints (SC), which utilizes an augmented optimization problem to concurrently solve for states and weighting values of loop closure or prior constraints.The SC approach, however, adopts ML estimates of the noise variances.
A drawback of the ML estimators, such as (10), in the context of localization is that they are asymptotically (as the data size grows) unbiased estimates of the true variances [24].However, as we show in our derivations and results below, the ML estimators are biased [25] and loose their effectiveness when data size is relatively small compared to the size of the state vector, a common situation in factor graph models where the number of states grow with data size.In control theory, correlation methods, which analyze the innovation sequence of a linear estimator, have been used to provide unbiased estimates of the noise variances of state space models for small data sizes [26].However, such methods do not make any assumptions about the noise probability density functions and do not consider robust estimation to handle outliers and hence are not applicable to non-line-of sight conditions.In the present paper, we develop unbiased estimators of the multiple noise variances of the factor graph representation of dynamic systems, with a particular focus on GPS-based localization under non-line of sight conditions.Another important contrast of the proposed method over the existing methods [1,4,[19][20][21][22] is that, while these methods present methodology to estimate the measurement noise covariance R, none provide explicit estimates of the covariances of the prior 0 or of the process dynamics Q.By contrast, the proposed method is that it provides estimates of all these parameters, in addition to ensuring the unbiasedness of the resulting estimators.

Methods
In this section, we present the proposed unbiased estimators for the noise variances of the factor graph problem and an iteratively reweighted least squares algorithm for jointly finding the variance and state estimates in the robust M-estimation framework. (10) e T t e t .

Unbiased noise variance estimation in factor graph problem
In a linear regression problem, a set of parameters for the model y = B� + ǫ are esti- mated, where y ∈ IR m is the measurement vector containing m observations, B ∈ IR m×p is the regressor matrix for p regressor variables and ǫ ∈ IR m is the vector of Gaussian dis- tributed errors with 0 mean and a common variance of σ 2 .It is clear that the linearized factor graph formulation ( 8) is a linear regression problem, where b, A and X corre- spond to y, B and , respectively.
In linear regression problems, the variance of the measurements y is estimated based on the residuals of the fitted regression model, expressed as [25] where ˆ is the estimated parameter vector and H = B(B T B) −1 B T is an orthogonal pro- jection matrix and the superscript T denotes the matrix transpose.From this equation, the sum of squared residuals r T r is The expectation of the random quantity r T r with respect to the Gaussian probability dis- tribution of the measurements can be found as [25, pp. 554-555] where p = rank( H ) .Following the "method of moments approach" for estimation [27], an unbiased estimate of the variance is [25, p. 77] There are two assumptions in linear regression work, however, that differ from the factor graph formulation.First, linear regression assumes that a there is a single covariance for all inputs.The factor graph formulation, however, has at least two different covariances (R and Q), and often many more.Second, the residuals computed in a factor graph are pre-weighted by the current covariance estimates.In our proposed method, we extend the linear regression based method for finding variances to the factor graph formulation of the system state estimation problem by incorporating these two important aspects.
Assume a reordering of the factor graph linear system (8) and divide the equations in to s partitions as where b 1 , . . ., b s correspond to the standardized measurements of the partitions contain- ing n 1 , . . ., n s elements such that s i=1 n i = m and ǫ 1 , . . ., ǫ s are mutually un-correlated random errors of the partitions with 0 mean and unit variances.Each partition has its own covariance matrix before standardization.For example, the system described in Eq. ( 9) can be split into s = 3 partitions corresponding to unstandardized measurement (11) noise covariances 0 , Q and R, where σ 2 1 , σ 2 2 and σ 2 3 are the constant diagonal elements of the respective 3 matrices.
The proposed method will provide unbiased estimators of the (unstandardized) measurement variances σ 2 1 , . . ., σ 2 s .The standardized measurement vector b is a random variable with an identity covariance matrix, that is, � b = diag(I n 1 , . . ., I n s ) , where I n i is an n i dimensional identity matrix.The unit variances of the standardized partitions are satisfied if the estimated noise variances σ 2 1 , . . ., σ 2 s are correct.However, before convergence is achieved (when noise variances are incorrect) the standardized measurements has the covariance matrix Therefore, if k i , i = 1, . . ., s converge to 1 then the standardized noise variances converge to unity and the unstandardized noise variances converge to the true values σ 2 1 , . . ., σ 2 s .In our approach, to accomplish convergence, the variance for the next iteration is found by scaling the variance in the current iteration with k i as To do this scaling, we must estimate k i in each iteration.Let us define H = I − H where H ∈ IR m×m and H = A(A T A) −1 A T is the projection matrix found by the coefficient matrix A obtained with the standardized measurements and the covariance estimates in the current iteration.Partition H as where H ij ∈ IR n i ×n j and i, j = 1, 2, . . ., s .Note that because H is symmetric, H ji = H T ij .The residual vector for the ith partition, r i ∈ IR n i ×1 , is obtained as leading to the sum of squared residuals of the i-th partition: Because this is a scalar, we can take the "trace" of the right-hand side, leading to: Applying expectation operation, E[.], we obtain: (14) Appendix shows the calculations to obtain Eq. ( 20) from Eq. (19).Equation ( 21) is obtained by defining Equation (19) shows how the expected sum of squared residuals E[r T i r i ] is related to the covariance b of the standardized measurements and therefore can be used to find its unbiased estimator.Similarly, Eq. ( 21) can be used to find unbiased estimates of the scaling factors k j , j = 1, 2, . . ., s (diagonals of b before convergence) needed to drive the unstandardized variances to their true values via Eq.(15).Following the method of moments approach, the unbiased estimators of the scaling factors for all partitions i = 1, 2, . . ., s can be solved by writing out a matrix equation and setting equal the observed values r T i r i to their expected values, i.e., the right hand side of ( 21): The solution k 1 , . . ., k s of the system ( 22) is a set of unbiased estimates for the scaling factors in each iteration.The unbiased variance estimators for the unstandardized partitions σ 2 1 , . . ., σ 2 s are obtained by using Eq. ( 15), given the scaling factors and the state estimates ("maximization" step).The states are then estimated by solving (7), given the variance estimates ("expectation" step), and the iterations continue until convergence.The following two subsections discuss the state estimation ("expectation") procedure and the IRLS procedure that implements the expectation and maximization steps.

Robust state estimation
To make the state estimates robust to outliers that arise in non-line-of sight conditions, we present an approach to incorporate the proposed factor graph unbiased variance estimation method with M-estimators.M-estimators [25] are a class of popular robust estimators in linear regression that make use of a weight matrix W that assigns a weight of 1 to inlying observations (which are consistent with the assumed distribution) and smaller weights to outlying observations.We implement the M-estimator by incorporating a weight matrix W within the coefficient matrix A and the measurement vector b of the system of equations (8) as follows: in which the weight matrix is defined as (20) where τ i is the weight found as a function of the ith scaled measurement defined as z i = b i /γ and using the definitions given in Table 1, where γ is a nonparametric estimate of the standard deviation, defined as: where the constant 0.6745 is chosen so that γ is an unbiased estimator based on Gauss- ian error distribution [25, pp. 373-374].The proposed variance estimators are unbiased within the robust M-estimation framework.The Appendix proves this result, by extending the proof shown in Sect.4.1 for the least squares estimation problem.
While there is a large number of weight functions for M-estimators available, we focus here on two of the most commonly utilized choices, Huber and Cauchy functions, to illustrate the main features of the weight and loss functions.We refer the reader to [28], for an extensive review of other M-estimators in regression.Huber and Cauchy weight and loss functions are plotted in Fig. 2 and defined in Table 1, along with the ordinary least squares (OLS) estimator (3), which is a quadratic function of the standardized residuals and does not provide any robustness.For a given loss function, the tuning parameter ( a C in Cauchy, a H in Huber), to be specified by the user, determines the strength of downweighting applied to outliers.For the factor graph formulation, the weights are only applied to measurement residuals and weights for process residuals are unity, that is: τ i for i = nd x + 1, . . ., m is found from Table 1 and τ i = 1 for i = 1, . . ., nd x .
From Fig. 2b, the Huber weight function decreases for residuals larger than a H , while Cauchy is smooth.Barron [29] proposed a family of robust loss functions to generalizes Loss fnc several popular robust kernels such as Huber, Cauchy, Geman-McClure and Welsh.The choice of the tuning parameter is typically made based on the measurement noise for the inlier observations or can be treated as an additional unknown parameter determined by minimizing a generalized loss function [2,30].In this paper, we illustrated the use of proposed variance estimators following the first approach of manually determining the tuning parameter based on the measurement residuals of the inlier data.However, the optimal tuning approaches [2,30] can easily be adopted with the proposed estimators.
The manual tuning is accomplished by overlaying the weight function (Fig. 2b) with the histogram of the residuals and adjusting the tuning parameter until the residuals greater than about 3 standard deviations are weighed down almost completely (i.e., τ i is almost 0).For the Cauchy weight function, this corresponds to setting a C equal to the standard deviation of the inlying data residuals, such that residuals larger than about 3a C receive a very small weight.

Iteratively reweighted least squares
In implementing the robust estimators in regression [31], an iteratively reweighted least squares (IRLS) algorithm is often used to solve iteratively, for the best state estimate for a given set of noise variance values and the best variance values for a given state solution until convergence.These steps are analogous to the expectation and maximization steps of the methods in [19][20][21][22] where the residuals are used to compute a covariance (the M-step), and the estimated covariances are used to compute new residuals (the E-step).
The proposed IRLS algorithm to implement the proposed unbiased variance estimators of the factor graph formulation and the M-estimators for robust state estimation following this iterative framework is described in Algorithm 1.
Note that, the measurement vector b w and matrix A w used in the algorithm already incorporates the M-estimator weight matrix W, as shown in Eq. (23).The iterative approach solves for the best state estimate in the factor graph framework using the current unbiased variance estimates σ 2 1 , σ 2 2 , . . ., σ 2 s and the scaling factors k 1 , k 2 , . . ., k s of the partitions.This state estimate is then used to scale the variance estimates for the next iteration and the process is repeated until convergence.

Experiments
This section illustrates the efficacy of the proposed method using a simulation experiment involving a mobile robot and a case study experiment based on real GPS data.

Simulation experiment
The mobile robot simulation was based on a two-dimensional constant velocity model as shown in Fig. 3. Figure 3a, b shows, respectively, the simulations without and with outliers introduced due to non-line-of-sight conditions.
Two different motion models are considered.(i) A linear model where the state vector x t = (x t , ẋt , y t , ẏt ) T gives the position and velocity of the robot, the measurement vector z t = (x t , y t ) gives the position of the robot in two dimensions and the control vector u t = (u 1t , u 2t ) T affects the velocities ẋt and ẏt . (ii) A nonlinear model, with state vector x t = (x t , y t , θ t , t ) T where the θ t is rotational velocity and t is forward velocity; measurement vector z t = (x t , y t ) is the position of the robot, and control vector u t = (u 1t , u 2t ) T affects the velocities θ t and t .Figure 3 shows simulations with n = 20 timesteps.The performance of robust estimators is studied by considering sce- narios involving outliers for the linear robot motion model.
Table 2 shows the definitions of the terms of the dynamics and measurement equations (1) and (2) of the linear and nonlinear motion models.The sampling period was taken as T = 1 and the initial pose was x 0 = (0, 2, 0, 0) T .According to these models, variances correspond to process dynamics and measurements as follows.For linear motion σ 2 Q1 is for the dynamics of (x t , ẋt ) ; σ 2 Q2 is for the dynamics of (y t , ẏt ) and σ 2 R is for the measurements of (x t , y t ) .For nonlinear motion σ 2 Q1 is for the dynamics of (x t , y t ) ; σ 2 Q2 is for the dynamics of (θ t , t ) and σ 2 R is for the measurements of (x t , y t ).The proposed variance estimation method was applied by performing the partitioning shown in Eq. ( 13) on the factor graph formulation with s = 3 partitions.The par- titioned matrices A 1 , A 2 , A 3 are formed such that they correspond to the rows of A containing σ 2 Q1 , σ 2 Q2 and σ 2 R , respectively, and the partitions and b 1 , b 2 , b 3 are similarly obtained from the rows of b .See Table 2 for the definitions of Q and R and Eq. ( 9) for how A and b are defined for an example with n = 4 .In the simulations, the initial pose x 0 is assumed to be known and therefore, it treated as a constant in the factori- zation (7) and x 0 and 0 are not estimated.The control input for the linear motion case was defined as: which has the effect of (without process and measurement noises, i.e., v t = 0, w t = 0 ) making the robot move forward 10 units, then up 10 units, forward 10 units, then back down 10 units and as this pattern repeats, a "square wave" would be created.Figure 3a shows an example of running this system for 20 timesteps.The control input for the nonlinear motion case was defined as: Similar to the linear motion model, this control input will make the robot to move forward 10 units, then up 10 units, forward 10 units, then back down 10 units.The control trajectory of the nonlinear motion is identical to that of linear motion shown in Fig. 3a.

Linear motion
Nonlinear motion The motion and measurement simulations were generated by using σ 2 Q1 = 0.5, σ 2 Q2 = 0.2 and σ 2 R = 1.5 as the true variances of the models.In order to realistically represent the non-line-of-sight conditions, we used a two component mixture process for the measurements of the linear motion model in Table 2 where α is the mixture proportion and the second mixture component with variance 10 2 represents outliers with large variance (compared to the variance σ 2 R = 1.5 2 of the inlying measurements).The mixture model injects unusually large or small deviations in randomly selected measurements, mimicking the effect of reflections of satellite signal from buildings.Figure 3b shows one simulation with the mixture parameter α = 0.1 and n = 20 poses that resulted in only one outlier.

Case study experiment: GPS-based localization
In this section, we illustrate the application of the proposed method in a GPS-based localization under non-line-of-sight conditions using the real data set presented in Pfeifer and Protzel The data set was collected in the city center of Chemnitz, Germany, by driving an instrumented vehicle several times over a road network, as shown in Fig. 4. The urban setting and the road layout consist of several tall buildings that cause a large number of outliers due to satellite blockages or reflections from the buildings.The data set contains high-precision inertial measurements of the vehicle position with a precision of 2 cm used as ground truth and GPS pseudorange measurements that will be used in the estimation procedure.The data set contained (30) w t ∼ (1 − α)N (0, σ 2 R I 2 ) + αN (0, 10 2 ) Fig. 4 Chemnitz City urban environment and ground truth pseudorange measurements at 8570 timesteps.At each timestep pseudorange measurements are gathered from 7 to 12 satellites that are visible to the vehicle.The proposed unbiased variance estimation method with M-estimators was applied to find the location of the vehicle while making the solution robust to outliers.Our results are compared to those of [1,19,20] who used the same data set.
To develop the factor graph model we define the state vector as x t = (x t , y t , z t , b t , d t ) where b t is the satellite clock error and d t = ḃt is clock error drift.The pro- cess dynamics model, incorporating only the satellite clock error and drift, is where the sampling period in the data set was T = 0.25 s and v t ∼ N (0, Q) is the process error, with covariance matrix where σ 2 b and σ 2 d are the error variances of clock offset and clock drift.The Jacobian matrix F = ∂f /∂x t−1 of the process equation is The measurement equation consists of the pseudorange from the vehicle to the ith satellite at time t where ns t is the number of satellites visible at time t, w it is the measurement error of the ith satellite.The measurement errors of all satellites are Gaussian distributed w t ∼ N (0, R) with covariance matrix is R = σ 2 R I ns t .The noiseless measurement equation is (omitting the time index t in the state variables and satellite coordinates): where (S xi , S yi , S zi ) are the coordinates of the ith satellite, c = 3 × 10 8 m/s is the speed of light, and γ = 7.3 × 10 −5 rad/s is the earth's rotation speed.The measurement equation for the complete set of satellites is where g = (g 1 , . . ., g ns t ) T .The Jacobian matrix G t = ∂g/∂x t of the measurement equa- tions is (31)

Mobile robot experiments
Using the example mobile robot systems, we evaluate the performance of the factor graph noise variance and localization estimates.1000 Monte Carlo replications of the robot simulation were generated for n = 20 timesteps and in each replication the loca- tion sequence of the robot and the noise variances are estimated.The proposed unbiased noise variance estimator [specified by Eqs. ( 22) and ( 15)] is compared to the to the existing ML estimator [specified by Eq. ( 10)].Both estimators are implemented within the IRLS scheme given in Algorithm 1.
We compared the performance of the estimators based on (i) bias in the variance estimates and (ii) Mahalanobis distance of the location solution.Bias of the estimates (of all 3 variances) measures the difference between the computed variances and the true variances.Since we have a variance estimate in each simulation, the average bias from N simulations is calculated as where σ 2 Ti is the true value of ith variance, σ 2 ij is its estimated value in jth simulation, i = 1, 2, 3 is the index of the variance component, j = 1, . . ., N is the index of the simula- tion run and N = 1000.
Total squared Mahalonabis distance of the estimated locations from the truth over all timesteps is used to quantify the localization error.The total Mahalanobis localization error is found (in each simulation) as the sum of the errors over n poses as in which x t is the state and xt is the state estimate of tth pose, P t ∈ IR d x ×d x is the covari- ance matrix of the state estimate xt that can be obtained from the corresponding entries of the matrix (A T A) −1 , which is an m × m matrix (see Sect. 3 for definition of m and d x ).The quantity (36) is also known as average normalized (estimation) error squared, or ANEES, in the tracking literature [32].If the predicted uncertainty on the state estimate is correct, then G should follow a Chi-squared distribution with degree of freedom nd x , with an expected value nd x [23,33].
The estimated poses and the 95% confidence ellipses using the two approaches superimposed with the true poses from a single simulation of the linear motion problem is shown in Fig. 5.The location estimates from both methods are very similar.However, the confidence ellipses, which are the direct result of the covariance estimates, from the methods result in two very different conclusions about the true (unknown) location of the robot.The localization with the proposed unbiased variance estimates has the true location within its confidence interval at every timestep, while the ML method variance estimates frequently has the true location either outside or very close to the confidence ellipse.As the robot moves, the uncertainty in position estimates grows due to the increasing variance estimates, as indicated by uncertainty ellipses growing in diameter over time.Further, the variance estimates from the proposed method (Fig. 5a) are larger than those from ML method (Fig.  For all variances, the (existing) ML estimates are significantly smaller than the true value (underestimate), while the proposed variance estimates are centered around the true value (unbiased).See Fig. 6.The smaller ML variance estimates result in Mahalanobis errors being too large (Fig. 7).Further, the proposed variance estimates have improved precision with less variability while remaining unbiased with larger   n (Fig. 8).For both motion models, the existing ML estimates have far more bias and the resulting average Mahalanobis error is significantly larger than it should be, which is nd x = 20 × 2 = 40 (Table 3).In comparison, the proposed variance estimates agree much more closely with true values (significantly smaller bias) resulting in about 50% smaller Mahalanobis localization errors.
Performance under non-line-of-sight conditions: Monte Carlo simulations of the linear robot motion model with the mixture meausurement noise distribution equation (30) and n = 20 poses were conducted to study the performance of the proposed variance estimator with robust M-estimators.Outlier mixture proportions α = 0.10 and 0.25 were considered, in which the latter case contains more outliers than the former.The proposed method was implemented with a Cauchy M-estimator with parameter a C = 1.645 .Figure 9 shows the estimates of robot location for one realization with α = 0.10 , with and without an M estimator.It can be seen that the localization estimates are similar with both methods; however, due to the presence of outliers, the confidence ellipses are significantly different depending on whether or not an M-estimator is utilized.The confidence ellipses of the position solution with the M-estimator (Fig. 9a), by virtue of downweighting the unusual observations, are smaller than those of the nonrobust method (Fig. 9b) which are much larger (more uncertain).
Table 4 summarizes the means and biases of the variance estimates and the Mahalanobis distances from 1000 Monte Carlo simulations.The benefit of using an M-estimator in conjunction with the unbiased variance estimators on the quality of the estimates is evident, in particular when the outliers are more frequent (i.e., α = 0.25 versus 0.10).With the presence of more outliers, the variance estimates deviate more drastically from the true values if M-estimator is not employed (bias increases from 94.3 to 570.3 due to outliers).By contrast, when an M-estimator is used, in conjunction with the unbiased variance estimators, the bias in the estimates is much smaller (bias increases only from 0.05 to 4.31 due to outliers).The average Mahalanobis localization error is not drastically different from the expected value of 2n = 40 due to the outliers, regardless of whether or not an M-estimator is used, since unbiased variance estimators are used in both results.However, when M-estimator is used, the localization error is smaller, due to the downweighting of the unusual observations.

GPS experiments
The proposed unbiased variance estimation and the existing ML variance estimation were applied to the GPS data set of the vehicle driving experiment shown in Fig. 4. The methods were used to estimate the variances σ 2 b , σ 2 d and σ 2 R of the clock error offset [seconds 2 ], drift (unitless) and measurement error [meters 2 ], respectively, and the vehicle trajectory by incorporating M-estimators following the procedure described in Sect. 4. The Cauchy M-estimator with parameter a C = 3.5 was used to achieve robustness against outliers with both methods.
We compare the Mahalanobis localization errors of the proposed method to those reported by the existing approaches [1,19,20] that used the same data set.Reference [1] considered several combinations of the multiple factors of (i) pseudo range, (ii) prior, (iii) state transition and (iv) dependence between successive measurements.In our comparisons, we considered the results reported for the combinations (i & ii) and (i, ii, iii, & iv), which we refer to as SC method 1 and 2, respectively.Reference [19] used a Gaussian mixture model (GMM).Reference [20] improves the GMM by using an expectationmaximization (EM) algorithm.Both [19,20] use a max-mixture approach as an efficient approximation of the Gaussian sum-mixture, as discussed in [3].Self-tuning GMM approach, by contrast to the static GMM approach, includes the weights, means and variances of mixtures as additional variables in the optimization problem.
Figure 10 shows the localization results obtained with M-estimation and the proposed variance estimates and the ML variance estimates and Table 5 summarizes the variance estimates.It is clear that the localization with the proposed (unbiased) variance estimates is much closer to the ground truth than the ML (biased) estimates.From Table 5, it can be seen that, similar to the simulation study, the ML variance estimates are much smaller than the proposed variance estimates, which indicates that it is likely that the ML estimates are underestimating the actual variances.
Table 6 summarizes the Mahalanabis localization errors from the truth (the mean, median and max over all poses) computed using the proposed unbiased estimators, ML variance estimators and the existing approaches [1,19,20].It can be seen that the proposed variance estimators, in conjunction with an M-estimator, achieve smaller median localization error than the existing SC method 1 and static GMM method, while the median error is slightly higher than those of SC method 2 and self-tuning GMM counterparts.The median and maximum errors are not reported in [20] and could not be included in Table 6.While the existing approaches with some more complex configurations have better performance in localization error, a significant benefit of the proposed method is that of providing unbiased estimates of the prior and noise covariances.As discussed previously, and in literature [8], having good variance estimates is crucial for localization applications such as matching observations with landmarks and loop closure detection.
To get a better understanding of the quality and reasonableness of the variance estimates, the uncertainty of the localization results is visualized by plotting the 95% confidence ellipses of the true vehicle positions obtained with both the proposed and ML variance estimates in Fig. 11.In order to maintain clarity of the display, we showed only the initial 2870   poses out of the entire 8570 poses and plot the confidence ellipse of 1 pose for every 10 poses.The size and shape of the ellipse for each true position is proportional to the variance estimate and quantifies the uncertainty of the localization estimate.The estimated locations and their uncertainty ellipses with the proposed variance estimates more closely follow the ground truth than with the ML variance estimates.To summarize the performance of both methods, we compare the number of times the ground truth positions fall within the confidence ellipses for the entire data set.Using the proposed variance estimates, the ground truth is captured correctly for 7283 out of the 8570 of the vehicle poses (85.0% of the time), while using the ML variance estimates, the ground truth is captured correctly for the 6,832 poses (79.7% of the time).This demonstrates that the proposed variance estimates achieves a 6.6% improvement in quantifying the uncertainty for GPS-based localization over the ML variance estimates.

Conclusions
This paper proposed a new methodology to estimate multiple noise variances in a factorgraph based formulation of the GPS-based localization problem under non-line-of-sight conditions.Unbiased variance estimators of the factor graph formulation are developed following the method of moments approach in linear regression theory.An iteratively reweighted least squares (IRLS) approach was presented for jointly estimating the system state and the noise variances.The existing smoothing-based state estimation approaches have largely relied on the maximum likelihood (or residual sample variance) estimators of variances in an expectation-maximization framework.While the ML estimators are easy to compute and have good performance in many applications, they can result in significant underestimation of the true variances which may in turn impact the localization accuracy.
A simulated mobile robot motion analysis showed that using the proposed variance estimators, the localization errors can be significantly improved over the ML estimators, considering model nonlinearities and measurement outliers.In addition, a case study involving real GPS data compared the performance of the method to an existing switching constraints method for a localization problem in an urban environment with significant multipath problems.A major assumption in the current study was that the noise components are mutually uncorrelated and covariance matrix we estimate was diagonal.A future work of practical interest is to extend the presented approach to unbiased estimators of non-diagonal covariance matrices.Further, in our evaluations we demonstrated the benefits of proposed unbiased variance estimators over maximum likelihood variance estimators in an M-estimation framework.It is important to note that the approach is general and can be implemented with other weighting based robust estimators, such as switchable constraints and dynamic covariance scaling.As most of the existing robust estimation frameworks use ML-based variance estimators, if the unbiased variance estimators were to be implemented within these frameworks it is reasonable to expect a similar improvement in localization accuracy to that observed within the M-estimator framework.

Appendix: Unbiasedness of the Variance Estimators
In this appendix, we present the calculations needed to show unbiasedness of proposed variance estimators under both least squares (non-robust) and robust M-estimator based state estimation.First we prove how Eq. ( 20) follows from Eq. ( 19) for the least squares estimation problem.
To prove unbiasedness when robust M-estimators are used, we need to consider the weight matrix.When a weight matrix W is used, the system coefficient and observation matrices are where A is an m × p matrix, W is an m × m diagonal matrix, with elements τ i for i = 1, 2 . . ., m .Define H w = I − Hw where H w is the projection matrix with the weighted system (37) Analogous to the way matrix H is partitioned in Eq. ( 16), partition H w to find H w ij , which is an n i × n j matrix for i, j = 1, 2, . . ., s , and H w i , which is an n i × m matrix for i = 1, 2, . . ., s .The sum of squared residuals with M-estimators for the ith partition is then written as Note that the sum of squared measurements is Therefore, its expected value can be found as Applying some matrix algebra, similar to Eqs. ( 19)-( 21), we find where and D ij is an n i × n j matrix obtained by partitioning the n i × m matrix D i as Equation (40) proves that, when this system of equations, with the observed residuals replacing the expected values on the left hand side (based on the method of moments approach), is solved for k i , i = 1, 2, . . ., s , one obtains unbiased estimators of the scaling factors needed for each linearizing iteration.Equation (42) follows from Eq. (41) using the steps shown in Eqs.(37)-(39) for the least squares case, but by replacing T ij with D ij .The system of equations for (42) is obtained, analogous to (22), but by replacing T ij with D ij .This completes the proof that the proposed variance estimators are unbiased within the M-estimation framework.(40)

Fig. 2
Fig. 2 OLS, Huber's t and Cauchy M-estimators: a loss functions, b weight functions

Fig. 3
Fig. 3 Mobile robot simulation for n = 20 poses showing control ( v t = 0, w t = 0 ), true trajectories ( w t = 0 ) and measured trajectories a with no outliers b with outliers

− 2 t
= 5, 15, 25, . . .+2 t = 10, 20, 30, . . .+2 t = 5, 20, 25, 40, 45, . . .−2 t = 10, 15, 30, 35, 50, 55, . . .t = 5, 15, 25, . . .−π/2 t = 10, 20, 30, . . .0 otherwise (29) u 2t = 0 for all 5b) as evidenced by the larger ellipses of the former, indicating larger uncertainty about the true location.This demonstrates some of the practical concerns of underestimation with the existing ML-based covariance estimation techniques.The variance estimates σ 2 R , σ 2 Q1 and σ 2 Q2 obtained across the Monte Carlo simulations (of robot motion with n = 20 steps) are shown in Fig. 6 (for linear and nonlinear mod- els) and the distributions of the Mahalanobis distances (for linear motion model) are shown in Fig. 7.The cases of n = 20 and n = 100 timesteps are considered to study the influence of larger data sets on the quality of the noise variance estimates.The variance estimates obtained with n = 20 and 100 are shown in Fig. 8.The summaries of the dis- tributions are given in Table 3 which reports the average variance estimate, the bias of the variance estimates C and the average Mahalanobis localization error G, defined in Eqs.(35) and (36), respectively, computed in the simulations of both linear and nonlinear models with n = 20.

Fig. 5 Fig. 6
Fig. 5 Estimated poses and 95% confidence ellipses for the linear robot motion using a proposed covariance estimates and b existing ML covariance estimates

Fig. 7
Fig. 7 Mahalanobis localization errors for linear motion.Vertical dashed line is the true mean 2n = 40 .Based on variance estimates from a proposed method and b ML method

Fig. 8
Fig. 8 Effect of data size n used on the quality of the proposed variance estimates with the linear vehicle motion model.Horizontal dashed lines show the true variances

Fig. 9
Fig. 9 Estimated poses and 95% confidence ellipses using proposed method for linear motion with outliers with α = 0.1 .a With M-estimator and b without M-estimator

Fig. 10
Fig. 10 GPS-based localization results with robust M-estimation and using the proposed variance estimator and the ML variance estimator

Fig. 11
Fig. 11 Estimated poses and 95% confidence ellipses for the GPS data using a proposed variance estimators and b maximum likelihood variance estimators
r T i r i =b wT H wT i H w i b w =trace(b w b wT H wT i H w i ).b w b wT = W 1/2 bb T W 1/2 .E[b w b wT ] = W 1/2 E[bb T ]W 1/2 = W 1/2 � b W 1/2 .

Table 1
Huber, Cauchy and OLS loss and weight functions xi,t , S yi,t , S zi,t ) are the coordinates of the ith satellite at time t and gi = (S xi,t − x t ) 2 + (S yi,t − y t ) 2 + (S zi,t − z t ) 2.

Table 3
Average variance estimates and localization errors, from simulations with n = 20

Table 4
Variance estimates and localization errors, linear model with outliers, n = 20

Table 5
GPS data variance estimates using the proposed method