Approximate Kalman filtering by both M-robustified dynamic stochastic approximation and statistical linearization methods
EURASIP Journal on Advances in Signal Processing volume 2023, Article number: 69 (2023)
Abstract
The problem of designing a robustified Kalman filtering technique, insensitive to spiky observations, or outliers, contaminating the Gaussian observations is presented in the paper. Firstly, a class of M-robustified dynamic stochastic approximation algorithms is derived by minimizing at each stage a specific time-varying M-robust performance index that is general for the family of algorithms considered. The gain matrix of a particular algorithm is calculated at each stage by minimizing an additional criterion of the approximate minimum variance type, with the aid of the statistical linearization method. By combining the proposed M-robust estimator with the one-stage optimal prediction, in the minimum mean-square error sense, a new statistically linearized M-robustified Kalman filtering technique is derived. Two simple practical versions of the proposed M-robustified state estimator are obtained by approximating the mean-square optimal statistical linearization coefficient with fixed and time-varying factors, respectively. The feasibility of the approaches has been analysed by simulations, using a manoeuvring target radar tracking example, and by real data, related to object video tracking using a shortwave infrared camera.
1 Introduction
One of the most important contributions to estimation theory is the optimal linear Kalman filter. The simplicity of the optimal Kalman filter lies in its linear predictor–corrector structure, making this result attractive from a practical point of view [1,2,3,4,5,6,7]. The Kalman filter is optimal on average, minimizing the expectation of a scalar-valued penalty, score or loss function having the random estimation error as its argument. Such a criterion function, symmetric, convex and equal to zero for the zero-valued argument, is known as an admissible one [3, 5]. The Kalman filter is the optimal state estimator within the class of admissible score functions and, as a consequence, it also represents the optimal estimator in the minimum variance sense [1,2,3,4,5,6,7]. To obtain the optimal performance of the Kalman filter, it is necessary to provide a correct a priori description of the system state dynamics and the statistics of the random observations. In this sense, if the system state dynamics and the associated observations contain severe nonlinearities that cannot be described properly by linearization, and/or if the underlying stochastic sequences are not Gaussian, the Kalman filter performance may degrade [8, 9]. In general, under nonlinear state dynamics and/or non-Gaussian observations, the design of an optimal state estimator can be rather cumbersome [1,2,3,4,5]. Therefore, there is interest in a class of estimation procedures that is not optimal with respect to some statistical performance measure but produces a bounded total estimation error. The family of dynamic stochastic approximation procedures offers a reasonable choice, since it produces fairly good results in many applications, including parameter and state estimation, optimization, pattern classification and signal processing [10,11,12,13,14].
In this sense, any Kalman filter with an erroneous gain sequence, owing to departures from the theoretically optimal conditions in practice, may be considered as a dynamic stochastic approximation algorithm. On the other hand, it is commonly assumed that real measurements are approximately Gaussian distributed, due to the central limit theorem of statistics [5]. Moreover, statistical analysis of numerous industrial and scientific observations has shown that these contain, as a rule, five to ten percent of outliers [15]. Therefore, in many practical situations, the real probability distribution function (pdf) of the random observations is similar in the middle to the assumed Gaussian one but differs from it by heavier tails, generating the spiky observations, or outliers, contaminating the mainly Gaussian distributed observations [15,16,17,18,19,20]. In particular, the optimal Kalman filter is sensitive to outliers, due to its linear dependence upon the observations; that is, it is non-robust. Therefore, there is also additional practical interest in designing a class of robust filtering techniques that can cope with outliers.
A simple concept of robustness is the so-called censoring of data, where measurement data that differ sufficiently from the predicted values are discarded [15]. This type of robust procedure suffers from several faults, the principal one being that it is often hard to distinguish an outlier from a large, but not unnatural, deviation. Therefore, to handle outliers more efficiently, several robust procedures have been proposed in the statistical literature [15,16,17,18,19]. In particular, Huber's M-robust estimator is frequently applied, because it approximates the optimal maximum likelihood (ML) estimator [17]. Thus, it can be understood and implemented easily by practitioners. In this sense, many combinations of the M-robust estimator and the optimal Kalman filter, or the linear least-squares estimator, have been proposed in the literature [20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37]. In general, any estimation procedure is a combination of the criterion to be minimized, the model of the variables to be estimated and an estimation algorithm [7]. In this sense, the robust estimators proposed in the selected literature may be classified into two groups. The first one is a family of non-recursive, or offline, robust schemes, where the Kalman filtering problem is recast as a linear regression problem, which is solved by the M-robust estimator [20,21,22,23,24,25,26,27]. The posed optimization problem is nonlinear, and an iterative numerical method is required to solve it. Thus, the standard or the simplified Newton's method, as well as the iteratively reweighted least-squares method, are recommended [22, 24, 27, 38]. A robust estimator derived in this way has a batch-mode regression form, processing the observations and the predictions simultaneously, which makes it very effective in suppressing outliers. However, the robustness of these estimators is achieved at the cost of increased computational requirements.
In general, a non-recursive, or offline, estimator may also be used in a real-time application by introducing a one-step rectangular sliding window of proper length [20]. The basic problems in choosing the window length are related to time-varying parameter changes, together with the influence of outliers contaminating the observations. In general, a smaller parameter estimate variance is obtained with a longer window length, as a consequence of a larger averaging of the measurement data. However, this conflicts with the requirement to follow possible time-varying changes in the parameters to be estimated. Moreover, a short window length may result in unreliable parameter estimates, because of a high order of the underlying parameter regression model. Furthermore, a bias, or shift, in the parameter estimates is unavoidable, since the sliding window permanently encompasses observations contaminated by outliers. In this sense, the M-robust procedures are efficient in suppressing the influence of outliers, thus significantly reducing the bias and the variance of the robust estimates. Finally, as mentioned above, a non-recursive robust estimator is rather computationally complex, and the increase in computational complexity basically depends on the number of iterations necessary to solve the parameter regression problem. Therefore, to solve the posed problems, it is more natural to use a recursive robust procedure than a non-recursive one. In this sense, starting from computational considerations, the second group represents a family of estimators that calculate an estimate recursively, because of the practical requirements of online, or real-time, signal processing. A robust recursive estimator derived in this way represents an acceptable balance between the computational effort and the practical robustness performance [20, 31,32,33,34,35]. A new member of this family is proposed in this article.
The mentioned recursive robust estimators differ from the newly proposed one in the level of the models used for model-based signal processing. In this sense, the above estimators are based on black-box models that have a parametric or polynomial form (FIR, AR, ARMA, etc.) [5,6,7]. Moreover, black-box models are basically used as a data prediction mechanism, and the estimated parameters can be used to extract only limited physical information. However, the newly proposed recursive robust estimator is derived from a true model-based technique, using a lumped physical model structure characterized by a state-space representation. Such a true model-based approach incorporates the mathematical models of both the physical phenomenology, or system state dynamics, and the measurement process, including noise, into the estimation process to extract the desired information [1,2,3,4,5,6,7]. This, in turn, produces better estimator performance than the black-box model-based estimation techniques. In general, the computational requirements depend on the order of the underlying state-space model, and for a not too large number of dynamic system states there are no significant additional demands on the computational resources. Moreover, a recursive weighted least-squares-type estimator, representing a combination of Huber's M-robust estimator with a specific linear form of the dynamic stochastic approximation procedure, has been proposed recently to redesign the measurement-update recursion in the optimal Kalman filter [36, 37]. Here, the resulting state update recursion is still linear in the observations, but insensitivity to outliers is achieved by using a nonlinear weighting factor in the Kalman gain calculation. Such a quasi-linear robust state estimator produces worse estimation performance than the true nonlinear robust estimator proposed in this article.
The latter treats the outliers more severely, by both the nonlinear residual processing and the Kalman gain calculation using the nonlinear weighting factor. In addition, many suboptimal nonlinear state estimators have been designed by applying the Taylor series expansion to describe a nonlinear system state dynamics [1,2,3,4,5,6,7]. Another frequently used method is statistical approximation, generally producing a better nonlinearity approximation than the Taylor series method [1,2,3]. The simplest form of this method is known as statistical linearization. Here, a linear approximation of the nonlinearity is used and, analogously to the estimation problem, the mean-square error (MSE) criterion is minimized to calculate the underlying coefficients. This, in turn, assumes that the pdf of the random argument of the nonlinearity is known in advance, and the Gaussian one is adopted frequently. Moreover, the statistical linear approximation can often be made, for an adopted pdf, in such a manner that the calculated coefficients provide a more accurate result, in the statistical sense, than a truncated Taylor series of high order. Therefore, the statistical linearization method has a potential advantage for designing a suboptimal nonlinear filter [1,2,3]. Also, Huber's M-robust approach has been proposed to make a suboptimal nonlinear filter more robust [28,29,30].
In this article, a new combination of Huber's M-robust estimator and the nonlinear dynamic stochastic approximation algorithm of the approximate minimum variance type is proposed. In this sense, Huber's M-robust concept is utilized to design a family of M-robustified dynamic stochastic approximation procedures, by minimizing at each stage a general time-varying M-robust performance index, based on Huber's M-robust score function. To produce fast convergence, the gain matrix of a particular algorithm is derived by step-by-step minimization of an approximate minimum variance-type criterion. The posed nonlinear optimization problem is solved approximately, using the statistical linearization method. Furthermore, by approximating, at each stage, the mean-square optimal statistical linearization coefficient by the average slope of Huber's M-robust influence function, representing the first derivative of the underlying score function, a new feasible statistically linearized M-robustified dynamic stochastic approximation procedure is derived. Moreover, by approximating the average slope of Huber's influence function with the current sample, an adaptive version of the proposed robust recursive state estimator is obtained. Starting from the optimal Kalman filter structure, in which the prediction and correction phases can be designed independently, the derived robust recursive state estimator is used to redesign the correction phase, making the Kalman filter more robust. The practical robustness of the designed versions of the statistically linearized M-robustified Kalman filter has been analysed by both simulations, using an example of single-target radar tracking in an impulsive noise environment, and real data, concerning object tracking in a video sequence generated by a shortwave infrared camera.
The paper is organized as follows. A brief description of the Kalman filtering technique, and some discussion of the robustness issues, are presented in Sect. 2. Section 3 is devoted to the synthesis of a new statistically linearized M-robustified Kalman filtering technique, using both the M-robustified dynamic stochastic approximation algorithm of the approximate minimum variance type and the statistical linearization method. Moreover, both fixed and time-varying suitable approximations of the mean-square optimal statistical linearization coefficient are considered in Sect. 3. Experimental results obtained by both simulations, using a manoeuvring target radar tracking scenario, and real data, related to object video tracking using the shortwave infrared camera, are presented in Sect. 4. The concluding remarks are given in Sect. 5. The complete derivation of the proposed statistically linearized M-robustified Kalman filtering technique is given in Appendix 1, while the derivation of the optimal statistical linearization coefficients is presented in Appendix 2.
2 Problem formulation
Let us consider a linear dynamic stochastic system represented by the first-order linear difference state vector equation
$$x_{k + 1} = F_{k} x_{k} + G_{k} w_{k}$$(1)
and the linear algebraic measurement vector equation
$$y_{k} = H_{k} x_{k} + v_{k}$$(2)
where \(x_{k}\) is the state vector, \(y_{k}\) is the observation vector, \(w_{k}\) is the zero-mean state noise, or disturbance, vector with covariance matrix \(Q_{k}\), and \(v_{k}\) is the zero-mean observation noise vector with the covariance matrix \(R_{k}\), at the discrete time index, \(k\). Moreover, the time-varying matrices \(F\), \(G\) and \(H\) are also known in advance for each discrete time index, \(k\).
Here, the initial random state vector, \(x_{0}\), is Gaussian, with both the mean value, \(m_{0}\), and the corresponding covariance matrix, \(P_{0}\), known. Also, it is assumed that the zero-mean white Gaussian noise sequences, \(\left\{ {w_{k} } \right\}\) and \(\left\{ {v_{k} } \right\}\), are mutually uncorrelated, and uncorrelated with the initial state, \(x_{0}\), for all discrete time indices, \(k\).
Let \(\hat{x}_{k|l} = E\left\{ {x_{k} \mid Y^{l} } \right\}\), \(\left( {l = k - 1,k} \right)\), denote the optimal linear least-squares estimates of the state, \(x_{k}\), given the observations \(Y^{l} = \left\{ {y_{j} ,j \le l} \right\}\), where \(E\left\{ { \cdot \mid \cdot } \right\}\) is the underlying conditional expectation, and let \(P_{k|l} = E\left\{ {\tilde{x}_{k|l} \tilde{x}_{k|l}^{T} } \right\}\) denote the corresponding covariance matrix of the estimation error, \(\tilde{x}_{k|l} = x_{k} - \hat{x}_{k|l}\). Then, the standard Kalman filter recursions are given by [1,2,3,4,5,6,7]:

1) Time update (prediction phase):
$$\hat{x}_{k + 1|k} = E\left\{ {x_{k + 1} \mid Y^{k} } \right\} = F_{k} \hat{x}_{k|k}$$(3)$$P_{k + 1|k} = E\left\{ {\tilde{x}_{k + 1|k} \tilde{x}^{T}_{k + 1|k} } \right\} = F_{k} P_{k|k} F_{k}^{T} + G_{k} Q_{k} G_{k}^{T}$$(4)
2) Measurement update (correction, estimation or filtering phase):
$$\hat{x}_{k + 1|k + 1} = E\left\{ {x_{k + 1} \mid Y^{k + 1} } \right\} = \hat{x}_{k + 1|k} + K_{k + 1} \varepsilon_{k + 1} ;\quad \varepsilon_{k + 1} = y_{k + 1} - H_{k + 1} \hat{x}_{k + 1|k}$$(5)$$\begin{gathered} K_{k + 1} = P_{k + 1|k} H_{k + 1}^{T} S_{k + 1}^{ - 1} \hfill \\ P_{k + 1|k + 1} = \left[ {I - K_{k + 1} H_{k + 1} } \right]P_{k + 1|k} \hfill \\ \end{gathered}$$(6)$$S_{k + 1} = E\left\{ {\varepsilon_{k + 1} \varepsilon_{k + 1}^{T} } \right\} = H_{k + 1} P_{k + 1|k} H_{k + 1}^{T} + R_{k + 1}$$(7)
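For reference, the recursions (3)–(7) admit a direct sketch (a minimal NumPy illustration, not the authors' code; the function names are ours):

```python
import numpy as np

def kf_predict(x, P, F, G, Q):
    """Time update (3)-(4): one-step prediction and its error covariance."""
    x_pred = F @ x
    P_pred = F @ P @ F.T + G @ Q @ G.T
    return x_pred, P_pred

def kf_update(x_pred, P_pred, y, H, R):
    """Measurement update (5)-(7): innovation, Kalman gain, correction."""
    S = H @ P_pred @ H.T + R                        # innovation covariance (7)
    K = P_pred @ H.T @ np.linalg.inv(S)             # Kalman gain (6)
    eps = y - H @ x_pred                            # innovation (5)
    x = x_pred + K @ eps                            # corrected estimate (5)
    P = (np.eye(P_pred.shape[0]) - K @ H) @ P_pred  # covariance update (6)
    return x, P
```

A single cycle alternates `kf_predict` and `kf_update`, starting from the initial mean \(m_{0}\) and covariance \(P_{0}\).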
The Kalman filter is initialized with \(\hat{x}_{0|0} = m_{0}\), \(P_{0|0} = P_{0}\). The Kalman filter optimality is contained in its suitable predictor–corrector form and the associated calculation of the gain matrix, \(K\) [1, 5]. However, as mentioned before, the Kalman filter is non-robust, in the sense of its sensitivity to spiky observations, bad data or outliers. In the statistical literature, there exist at least four definitions of robustness [15,16,17,18,19,20]. Two of them, named the qualitative and the min–max robustness, respectively, are based on rigorous mathematical treatments [17, 19]. The other two, the so-called resistant and efficiency robustness, are primarily oriented towards data and are based on empirical reasoning [15, 16, 18]. Roughly speaking, resistant robustness means that an estimator successfully eliminates the influence of outliers, while efficiency robustness denotes that an estimator provides an acceptable estimation quality under both purely Gaussian observations and Gaussian observations contaminated by outliers. Both robustness features designate the practical robustness and are emphasized by practitioners. Also, although there exist several robust estimation procedures in the statistical literature, Huber's M-robust approach is preferable, since it originates from the optimal maximum likelihood (ML) concept, making it more natural and easier to implement [17]. In this sense, an estimator need not be exactly the optimal ML estimator but has to approximate the optimal one in such a manner as to achieve the practical robustness goals. It should be noted that min–max robust estimation is exactly the optimal ML estimation based on the loss, or score, function \(\rho \left( \cdot \right) = - \ln p_{0} \left( \cdot \right)\), named the likelihood function, with \(p_{0} \left( \cdot \right)\) being the worst-case pdf within the given pdf class.
The worst-case pdf contains the minimal information about the variable to be estimated and maximizes the Cramér–Rao lower bound. This represents a non-classical variational problem that can be solved exactly only for static models, when the posed problem reduces to minimizing the Fisher information [17, 20]. In addition, qualitative robustness is based on Hampel's definition of the influence function, as a suitable measure of the robustness capacity [19]. In this sense, the influence function represents the first derivative, or the slope, of the robust score function, \(\rho\), used to define Huber's M-robust performance index [17, 19].
As mentioned before, any Kalman filter whose gain differs from the optimal one, owing to errors in the presumed noise statistics or to an inadequate representation of the system state dynamics, can be viewed as a dynamic stochastic approximation algorithm [1, 10,11,12,13]. Therefore, this algorithm may represent a suitable substitute for an optimal estimation technique when the assumptions on which the latter is based are not fulfilled in practice. Starting from the practical limitations of the linear optimal Kalman filter, this approach can be applied further to make the optimal Kalman filter more robust.
3 Statistically linearized M-robustified Kalman filtering
As mentioned above, Huber's M-robust approach combined with the dynamic stochastic approximation method may be used to compute robust recursive state estimates of the dynamic stochastic system represented by (1), assuming scalar observations in (2). Also, the case of multidimensional measurements in (2) may be considered in the same manner, by processing the individual observations one at a time. This approach assumes that the components of the measurement vector in (2) can be processed sequentially, as uncorrelated scalar observations. In this sense, one has to redefine the measurement vector in (2) so as to make the corresponding measurement errors, or the noise vector components, mutually uncorrelated. This, in turn, results in a diagonal form of the measurement uncertainty covariance matrix, \(R_{k}\). A suitable, numerically stable decomposition method that is frequently used in practice is the Cholesky factorization, or its modification named the UD-decomposition [2, 3].
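The decorrelation step described above can be sketched as follows (an illustrative NumPy fragment; the function name is ours). Whitening by the Cholesky factor \(L\), with \(R = LL^{T}\), is one standard way to obtain mutually uncorrelated scalar observations:

```python
import numpy as np

def decorrelate_measurements(y, H, R):
    """Whiten a correlated measurement vector so that its components can be
    processed one at a time as uncorrelated scalar observations.
    R = L L^T (Cholesky); the transformed noise covariance is the identity."""
    L = np.linalg.cholesky(R)
    Linv = np.linalg.inv(L)
    # Transformed observation vector and observation matrix.
    return Linv @ y, Linv @ H
```

After the transformation, the measurement noise covariance becomes the identity (hence diagonal), so the rows of the transformed pair may be fed to the scalar recursions sequentially.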
Huber's M-robust estimator minimizes an empirical average loss, defined by the nonlinear score function, \(\rho\), to estimate the constant parameters in a linear regression problem [17]. To apply this robust approach to recursive state estimation for a dynamic system, a time-varying M-robust performance index is introduced, instead of the M-robust performance measure in the form of the empirical average loss, that is
$$J_{k} \left( {\overline{x}} \right) = E\left\{ {\rho \left( {\frac{{y - H_{k} \overline{x}}}{{s_{k} }}} \right)\mid \overline{x},Y^{k} } \right\}$$(8)
with \(E\left\{ {\, \cdot \mid \overline{x},Y^{k} } \right\}\) being the conditional expectation given the known predicted state, \(\overline{x}\), at the present stage, \(k\), as well as the known observations up to the current stage, \(Y^{k}\), where \(y\) is the scalar system output in (2) [12, 20]. Starting from (8), one can define a family of dynamic stochastic approximation recursive estimators, minimizing the M-robust performance index (8) at each stage, \(k\),
where \(\Gamma_{k}\) is the matrix gain, and \(\overline{x}_{k}\) is a onestep prediction, to take into account for the changes in the current state, \(x_{k}\), in (1). The term, \(\nabla_{{\overline{x}}} J_{k} \left( \cdot \right)\) in (9), designates the gradient vector of the scalarvalued deterministic Mrobust criterion in (8). Taking into account (8), one obtains
with \(\psi \left( \cdot \right)\) being the first derivative, named the influence function, of the robust score function, \(\rho \left( \cdot \right),\) in (8). Moreover, the term, \(\partial \left( \cdot \right)/\partial x = \left\{ {\partial \left( \cdot \right)/\partial x_{1} \cdots \partial \left( \cdot \right)/\partial x_{n} } \right\}^{T}\), denotes the partial derivative operator, where \(x\) is the \(n \times 1\) column vector.
Analogously to (5), the measurement prediction residual, or the innovation, is defined by
$$\varepsilon_{k} = y_{k} - H_{k} \overline{x}_{k}$$(11)
where \(y_{k}\) is the scalar system output, and \(H_{k}\) is the observation vector in (2), at the stage, \(k\). In addition, \(s_{k}\) is the normalizing (scaling) factor that provides scale-invariant state estimates, and represents an estimate of the corresponding standard deviation.
In general, the conditional expectation in (10) is indeterminable and, analogously to the dynamic stochastic approximation approach, can be approximated by the current sample, [1, 3]. Thus, the unknown expectation, in (10), can be estimated at each stage, \(k\), by the current realization of the underlying random argument. This, in turn, results in the stochastic gradient vector representation
Furthermore, by substituting (12) into (9), a family of M-robustified dynamic stochastic approximation recursive state estimators takes the form
In words, the posed optimization problem (8) reduces to finding the solution of the equation, \(g_{k} \left( \cdot \right) = 0\), at each stage, \(k\), with \(g_{k}\), in (9), being the socalled regression function. Since this function is unknown, it is replaced by the random sample realization, (12), and the resulting estimation scheme (13) is known as the dynamic stochastic approximation algorithm, [1, 10,11,12].
The role of an admissible score function, \(\rho\), in (8) is to provide the practical robustness of the estimation procedure (13). To achieve such performance, the M-robust influence function, \(\psi = \rho^{\prime}\), has to be a bounded and continuous function [15,16,17,18,19]. This, in turn, ensures that neither single nor grouped outliers have a significant impact on the state estimates (13), satisfying the resistant robustness requirement. Additionally, to obey the efficiency robustness feature, the estimation procedure (13) has to perform fairly well under both purely Gaussian observations and Gaussian observations contaminated by outliers. Huber's M-robust score function, \(\rho_{H}\), which is quadratic in the middle but increases more slowly than the quadratic one in the tails, obeys both practical robustness requirements [17]. The corresponding M-robust influence function is the monotonically non-decreasing saturation-type nonlinearity, given by
$$\psi_{H} \left( z \right) = \left\{ {\begin{array}{*{20}c} {z,} & {\left| z \right| \le \Delta } \\ {\Delta \,{\text{sgn}}\left( z \right),} & {\left| z \right| > \Delta } \\ \end{array} } \right.$$(14)
where \(\Delta\) is the tuning constant that controls the efficiency robustness. The choice \(\Delta = 1.5\) often produces an acceptable result, and such a procedure is known as Huber's 1.5 M-robust approach [17]. Nonlinear data processing using the saturation function (14) is known in the statistical literature as winsorization [15,16,17,18,19]. As mentioned before, statistical analysis has shown that various measurement data contain, as a rule, 5 to 10 percent of outliers [15]. In this sense, it is frequently assumed that the observations are generated by the Gaussian mixture pdf
$$p\left( z \right) = \left( {1 - \delta } \right)N\left( {z\mid 0,\sigma_{n}^{2} } \right) + \delta N\left( {z\mid 0,\sigma_{o}^{2} } \right);\quad \sigma_{n}^{2} = 1$$(15)
where \(\delta\) is the contamination degree, and \(\sigma_{n}^{2}\) is the unit variance of the majority of observations, generated by the standard zero-mean Gaussian pdf with unit variance, \(N\left( { \cdot \mid 0,1} \right)\), while \(\sigma_{o}^{2}\) is the large variance of the outliers, generated by the zero-mean normal pdf, \(N\left( { \cdot \mid 0,\sigma_{o}^{2} } \right)\). Such a pdf is also known as the \(\delta\)-contaminated normal one [15,16,17,18,19,20].
In particular, for the pdf class (15), with an arbitrary zero-mean symmetric contaminating pdf instead of the Gaussian one, \(N\left( { \cdot \mid 0,\sigma_{o}^{2} } \right)\), the worst-case pdf, \(p_{0}\), in the sense of minimal Fisher information, is Gaussian in the middle and Laplacian, or double exponential, in the tails. The influence function, \(\psi = \rho^{\prime}\), of the associated likelihood function, \(\rho \left( \cdot \right) = - \ln p_{0} \left( \cdot \right)\), is the saturation-type nonlinearity in (14) [17]. Examples of pdf classes commonly used in engineering problems, and the derivation of the worst-case pdf within a prespecified class, are presented in the literature [17, 20].
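The saturation nonlinearity (14) and the \(\delta\)-contaminated model (15) described above can be illustrated as follows (a sketch under the assumed values \(\Delta = 1.5\), \(\delta = 0.1\) and a large outlier deviation \(\sigma_{o} = 5\); the function names are ours):

```python
import random

def psi_huber(z, delta=1.5):
    """Huber's influence function (14): identity in the middle,
    saturated at +/- delta in the tails (winsorization)."""
    return max(-delta, min(delta, z))

def contaminated_sample(delta_c=0.1, sigma_o=5.0):
    """Draw one sample from the delta-contaminated normal pdf (15):
    N(0, 1) with probability 1 - delta_c, N(0, sigma_o^2) otherwise."""
    sigma = sigma_o if random.random() < delta_c else 1.0
    return random.gauss(0.0, sigma)
```

Passing a contaminated sample through `psi_huber` bounds its contribution by \(\Delta\), which is exactly how single spiky observations lose their influence on the estimates.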
The role of the matrix gain, \(\Gamma_{k}\), in (13) is to control the convergence speed. At this point, the gain, \(\Gamma_{k}\), is not connected to any assumption about the random state to be estimated or the corresponding noise sequences. Therefore, to link the optimal Kalman filter with the recursive robust state estimator (13), an additional optimization criterion of the approximate minimum variance type is introduced,
$$J\left( {\Gamma_{k} } \right) = {\text{Trace}}\left( {P_{k} } \right);\quad P_{k} = E\left\{ {\tilde{x}_{k} \tilde{x}_{k}^{T} } \right\},\;\tilde{x}_{k} = x_{k} - \hat{x}_{k}$$(16)
where the matrix \(P_{k}\) is the estimation error covariance at the stage \(k\), with Trace being the matrix trace. Minimization of the scalar, deterministic criterion in (16), at each stage, \(k\), with respect to the gain matrix \(\Gamma_{k}\) represents a complex nonlinear problem, and an approximate optimal solution can be obtained by using the statistical linearization technique [1,2,3]. Starting from the odd \(\psi\)-function in (14), whose random argument, \(z\), is a sample from the zero-mean white scaled measurement residual sequence, \(\left\{ {\varepsilon_{k} /s_{k} } \right\}\), in (13), with a symmetric pdf belonging to the class (15), the application of the statistical linearization method results in the following approximation of the influence function
$$\psi \left( z \right) \approx \alpha z;\quad \alpha = \frac{{E\left\{ {z\psi \left( z \right)} \right\}}}{{\sigma_{z}^{2} }}$$(17)
with \(\alpha\) being the meansquare optimal statistical linearization coefficient, while \(\sigma_{z}^{2}\) is the variance of the random argument \(z\), (for more details, see Appendix 2).
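Under a zero-mean Gaussian assumption on the argument \(z\), the coefficient \(\alpha\) for the saturation (14) can be evaluated in closed form, since for Gaussian \(z\) one has \(E\{z\psi(z)\} = \sigma_{z}^{2}\,E\{\psi^{\prime}(z)\}\) (Stein's identity) and \(E\{\psi_{H}^{\prime}(z)\} = P(|z| \le \Delta)\). A small numerical sketch (our illustration, not taken from the paper's Appendix 2):

```python
import math

def alpha_opt(sigma_z, delta=1.5):
    """Mean-square optimal linearization coefficient of the saturation (14)
    for a zero-mean Gaussian argument with standard deviation sigma_z:
    alpha = E{z psi(z)} / sigma_z^2 = P(|z| <= delta)
          = erf(delta / (sigma_z * sqrt(2)))."""
    return math.erf(delta / (sigma_z * math.sqrt(2.0)))
```

The values behave as discussed below: `alpha_opt` approaches one for small \(\sigma_{z}\) and drops well below one when \(\sigma_{z}\) dominates \(\Delta\).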
In particular, if \(\psi \left( \cdot \right)\) in (17) is the saturation function (14), the coefficient, \(\alpha\), depends on the linear segment of \(\psi \left( \cdot \right)\), the saturation threshold, \(\Delta\), and the variance, \(\sigma_{z}^{2}\). In general, for small \(\sigma_{z}\) values, in comparison with the \(\Delta\) value, the probability of saturation occurrence is low, resulting in \(\alpha\) values close to one. Moreover, for higher \(\sigma_{z}\) values, the \(\alpha\) values are significantly smaller than one, due to a larger probability of saturation. Therefore, for the prespecified \(\psi\)-function in (14) and different \(\sigma_{z}\) values, a set of \(\alpha\) coefficients from the interval [0, 1] is obtained. Furthermore, since the normalized residual in (17) has unit standard deviation, which is smaller than the threshold, \(\Delta\), the corresponding coefficient, \(\alpha\), is close to one. By substituting (17) into (13), one obtains the following relation for the statistically linearized M-robustified Kalman state estimates, instead of the recursions (5)–(7),
Here, the standard deviation, \(s_{k}\), of the measurement residual, \(\varepsilon_{k}\), in (18) may be defined by the associated variance in (7), yielding
$$s_{k}^{2} = H_{k} M_{k} H_{k}^{T} + \sigma^{2}$$(19)
with the matrix, \(M_{k}\), being the prediction error covariance defined by (4), that is
$$M_{k} = F_{k - 1} P_{k - 1} F_{k - 1}^{T} + G_{k - 1} Q_{k - 1} G_{k - 1}^{T}$$(20)
In particular, for the measurement noise model in (15), the underlying observation noise variance in (19) is given by
$$\sigma^{2} = \left( {1 - \delta } \right)\sigma_{n}^{2} + \delta \sigma_{o}^{2}$$(21)
In general, the contamination degree, \(\delta\), is not exactly known in practice and cannot be determined adequately from the measurement residuals [15,16,17,18,19,20]. As mentioned before, a reasonable choice in practice is to adopt the \(\delta\) value in advance, within the interval from 0.05 to 0.1, corresponding to 5 to 10 percent of outliers in the Gaussian distributed measurement data. Furthermore, the standard deviation of the outliers, \(\sigma_{o}\), is also unknown, but it is significantly greater than the unit nominal standard deviation, \(\sigma_{n}\), of the mainly zero-mean Gaussian noise samples in (15). Taking into account (1), (2), (11), (18)–(20), together with convenient simplifications, one obtains an approximate optimal solution by minimizing the adopted criterion in (16) (for more details, see Appendix 1)
In addition, starting from (13)–(22), the statistically linearized M-robustified dynamic stochastic approximation recursive state estimator is defined by
Here, the residual, \(\varepsilon_{k}\), the scaling factor, \(s_{k}\), Huber's influence function, \(\psi_{H}\), and the coefficient, \(\alpha\), are defined by (11), (19), (14) and (17), respectively, while the estimation error covariance matrix, \(P_{k}\), is given in (22). The recurrent relations (20), (22) and (23) are similar to the measurement-update recursions (5)–(7) in the filtering stage of the linear optimal Kalman filter. Since the designs of the prediction and the estimation processes in the optimal Kalman filter are independent, the latter may be robustified by combining the recursive robust estimation process in (22) and (23), instead of the measurement-update recursions (5)–(7), with the one-step mean-square optimal prediction in (3), (4), to derive a new M-robustified version of the optimal Kalman filter. In this sense, the time-update recurrent relations (3) and (4) in the prediction stage of the linear optimal Kalman filter also define the recursive prediction process in the M-robustified statistically linearized Kalman filter. Thus, for the one-step prediction, \(\overline{x}_{k}\), in (23), and the corresponding prediction error covariance matrix, \(M_{k}\), in (20) and (23), the same recursions as in (3) and (4) are obtained, that is
$$\overline{x}_{k + 1} = F_{k} \hat{x}_{k} ;\quad M_{k + 1} = F_{k} P_{k} F_{k}^{T} + G_{k} Q_{k} G_{k}^{T}$$(24)
Here, \(F_{k}\) and \(G_{k}\) are the system state transition matrix and the state noise matrix, in (1), with the state noise covariance matrix, \(Q_{k}\).
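Since the time-update recursions coincide with those of the standard Kalman filter, the prediction stage can be sketched directly. The following is a minimal NumPy illustration of the one-step prediction and its error covariance, with matrix names taken from the text:

```python
import numpy as np

def time_update(x_est, P, F, G, Q):
    """Kalman time update (prediction stage), as in the standard
    recursions (3), (4): one-step state prediction and the
    corresponding prediction error covariance matrix."""
    x_pred = F @ x_est                 # one-step state prediction
    M = F @ P @ F.T + G @ Q @ G.T      # prediction error covariance
    return x_pred, M
```

The same routine serves both the optimal Kalman filter and its M-robustified version, since only the measurement update differs between the two.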
Unfortunately, the measurement noise statistics are not exactly known in many applications, and in such circumstances the mean-square optimal linearization coefficient \(\alpha\) in (17) is indeterminable. Therefore, the optimal coefficient, in (17), may be approximated by the fixed coefficient, \(\alpha_{f}\), defined by the relation
where \(\psi^{\prime}\left( \cdot \right)\) is the first derivative, or the slope, of the \(\psi\)-function.
Particularly, for Huber’s \(\psi_{H}\)-function in (14), the relation (25) may serve to explain the physical meaning of the fixed coefficient, \(\alpha_{f}\), and to estimate its value. Starting from (14) and (25), one gets
where \(p\left( \cdot \right)\) is the unknown measurement noise pdf, in (15). Here it is assumed that the real pdf of the scaled residual, \(\varepsilon /s\) in (23), also belongs to the given pdf class, in (15) (for more details, see Appendix 2). The integral, in (26), is equal to the probability that the observations are generated by the nominal standard Gaussian pdf, corresponding to the linear part of the \(\psi_{H}\)-function in (14), with the slope, \(\psi^{\prime}_{H}\), equal to one. In accordance with (15), the underlying probability may be estimated by (26), using the assumed contamination degree, \(\delta\). It should be noted that the calculation of the fixed coefficient, \(\alpha_{f}\), in (26) may also be based on the worst-case pdf, \(p_{0}\), within the given class (15), resulting in
where \(erf\) is the error function, [17, 20]. This solution is asymptotically equal to (26), since the value of the \(erf\) function is close to 0.5 for a large enough argument.
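As a rough numerical sketch of this estimate, \(\alpha_{f}\) may be read as \(\left( 1 - \delta \right)\) times the probability that a standard Gaussian sample falls in the linear region of \(\psi_{H}\). The threshold name `k_huber` and the use of the standard error-function convention are assumptions here (the erf normalization in (27) may differ):

```python
import math

def alpha_fixed(delta, k_huber):
    """Fixed statistical linearization coefficient, interpreted as the
    probability of the linear regime of psi_H under the contaminated
    pdf: (1 - delta) * P(|Z| <= k_huber) for a standard Gaussian Z."""
    p_linear = math.erf(k_huber / math.sqrt(2.0))  # P(|Z| <= k_huber)
    return (1.0 - delta) * p_linear
```

For example, with the commonly used Huber threshold 1.5 and a contamination degree of 0.05, this gives a coefficient of roughly 0.82, shrinking the gain accordingly.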
Thus, the relations (26) and (27) define fixed, feasible approximations of the optimal statistical linearization coefficient, in (17); each represents an approximation of the average slope of Huber’s M-robust influence function, \(\psi_{H}\), defined by (14).
Thus, in the absence of outliers, corresponding to the zero-valued contamination degree, \(\delta\), the \(\alpha\)-value, in (26) or (27), is equal to one, reducing the robust gain, \(K_{k}\) in (23), to the optimal Kalman gain, in (6). Since the robust influence function, \(\psi_{H}\), in (14), operates in its linear regime, corresponding to the linear influence function of the optimal Kalman filter, (3)–(7), the robust recursive estimator, (23), (24), performs as the optimal Kalman filter. However, in the presence of outliers, the fixed factor, \(\alpha_{f}\) in (26) or (27), decreases as the contamination degree increases, further decreasing the values of the robust gain matrix, \(K_{k}\), in (23). Since the influence function, \(\psi_{H}\) in (14), now operates in its saturation regime, the combination of these two effects suppresses the influence of outliers on the robust recursive estimates, in (23).
On the other hand, the variable approximation of the optimal statistical linearization coefficient, in (25), is given by
where the expectation, in (26), is replaced by the current sample. This approximation represents the current slope of Huber’s M-robust influence function, in (14), being approximately equal to zero or one. Thus, in the absence of outliers, the variable coefficient, \(\alpha_{k}\) in (28), has the unit value, so the robust gain matrix, \(K_{k}\) in (23), is reduced to the optimal Kalman gain, in (6). Since Huber’s influence function, in (14), operates in the linear regime, the robust recursive estimator, in (23), performs as the optimal linear Kalman filter. On the other hand, in the presence of outliers, the variable \(\alpha_{k}\)-coefficient, in (28), is close to zero, decreasing significantly the robust gain, \(K_{k}\) in (23), while the influence function, \(\psi_{H}\), is now confined to its saturation regime, thus suppressing the influence of outliers more efficiently than the fixed coefficient, in (26) or (27).
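Read as the current slope of Huber's \(\psi_{H}\)-function, the variable coefficient reduces to a simple indicator of the linear regime; a minimal sketch, with the Huber threshold `k_huber` as an assumed parameter:

```python
def alpha_variable(residual, scale, k_huber):
    """Variable linearization coefficient in the sense of (28): the
    current slope of Huber's psi-function at the scaled residual,
    equal to one in the linear regime and zero in saturation."""
    return 1.0 if abs(residual / scale) <= k_huber else 0.0
```

A nominal residual thus leaves the gain at its Kalman value, while a spiky residual (an outlier) drives the coefficient, and hence the gain, towards zero.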
In summary, the proposed statistically linearized M-robustified Kalman filtering algorithm consists of the time update, in (24), and the measurement update given by (11), (14), (19), (22), (23) and (26), or (28). It belongs to a class of recursive stochastic procedures, and a theoretical analysis of the estimate convergence is very difficult, due to the nonlinear form of the robust recursive state estimator itself and the time-varying system state dynamics. Therefore, further analysis is based on both the simulations, using a manoeuvring target radar tracking example, and the real data, related to an object tracking in a video sequence generated by the shortwave infrared camera.
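To illustrate how these pieces fit together, the sketch below combines a time update of the form (24) with a heuristic scalar-measurement robust update. The gain and covariance expressions are plausible stand-ins for (22) and (23), not the paper's exact formulas; they are chosen so that a unit coefficient recovers the standard Kalman recursion and a zero coefficient rejects the measurement entirely:

```python
import numpy as np

def psi_huber(z, k):
    """Huber's influence function (14): linear up to the threshold k,
    saturated beyond it."""
    return max(-k, min(k, z))

def robust_kf_step(x_est, P, y, F, G, Q, H, R, k_huber):
    # time update, as in (24)
    x_pred = F @ x_est
    M = F @ P @ F.T + G @ Q @ G.T
    # measurement update (scalar observation y)
    eps = y - (H @ x_pred).item()               # residual
    s = np.sqrt((H @ M @ H.T).item() + R)       # assumed residual scale
    alpha = 1.0 if abs(eps / s) <= k_huber else 0.0   # variable coefficient
    # assumed robust gain: reduces to the Kalman gain for alpha = 1
    # and vanishes for alpha = 0, as described in the text
    K = alpha * (M @ H.T).ravel() / (alpha * (H @ M @ H.T).item() + R)
    x_new = x_pred + K * s * psi_huber(eps / s, k_huber)
    P_new = M - np.outer(K, (H @ M).ravel())
    return x_new, P_new
```

With a nominal observation the step behaves exactly as the optimal Kalman filter; a spiky observation leaves the estimate at its predicted value.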
4 Experimental results and discussion
As mentioned above, the simulation example is related to the radar tracking problem under conditions close to reality. In this sense, the radar measurements usually consist of range, azimuth and elevation angles, since the observation noises are uncoupled in the spherical coordinate system (SCS). However, a requirement for simple filtering implies the desirability of uncoupled filtering in the Cartesian coordinate system (CCS) [40]. Thus, the CCS would employ three independent Kalman filters, one in each of the coordinates, \(\left( {x,y,z} \right)\). In addition, if the sampling period is larger than the target manoeuvre time constant, the computationally convenient reduction to three independent two-state (position and velocity components) Kalman filters in the \(\left( {x,y,z} \right)\) directions is recommended, because the Kalman gains associated with the acceleration terms are rather small [40]. As a consequence, the state noise covariance, \(Q\), has to be chosen so as to compensate for the missing acceleration terms. Bearing in mind that the measurements, when transformed from the SCS to the CCS, are no longer uncoupled, the proposed approach represents a trade-off between a potential performance loss and computational feasibility. Since the obtained simulation results are very similar in each of the \(\left( {x,y,z} \right)\) CCS directions, only the simulation results related to the \(x\)-CCS axis are presented in the sequel.
The first simulation task is to model the system state dynamics, using the kinematic equations of motion [40]. Thus, let \(x_{k} ,\) \(v_{k}\) and \(a_{k}\) denote the target position, velocity and acceleration along the \(x\)-CCS axis, respectively, at the discrete time, \(t_{k} = kT\), \(k = 0,1,...\), with \(T\) being the uniform sampling period. Assuming that the acceleration is constant over the sampling interval, \(t_{k} \le t \le t_{k + 1}\), one obtains, by integrating the acceleration twice over the given interval, the following set of equations (the equivalent equations may be written for the \(y\)- and \(z\)-CCS directions)
A particular target position trajectory may be obtained from (29) by defining in advance a piecewise constant acceleration profile. The model (29) with the zero-valued acceleration term is known as the constant velocity (CV) model. Thus, any target movement that cannot be represented by the CV model may be considered as a target manoeuvre [40]. An example of such a target position trajectory, used in the simulations, is presented in Fig. 1. The measurement sequence is simulated using the linear position sensor, represented by
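For instance, the kinematic recursion described above can generate a position trajectory from any piecewise constant acceleration profile; a minimal sketch:

```python
def simulate_trajectory(x0, v0, accel_profile, T):
    """Target position trajectory from the constant-acceleration
    kinematic recursion, driven by a piecewise constant
    acceleration profile (one value per sampling interval)."""
    x, v = x0, v0
    positions = [x]
    for a in accel_profile:
        x = x + T * v + 0.5 * T * T * a   # position over one period
        v = v + T * a                     # velocity over one period
        positions.append(x)
    return positions
```

A zero-valued profile reproduces the CV model, while any nonzero segment of the profile represents a manoeuvre.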
where the zero-mean white measurement noise sequence, \(\left\{ {v_{k} } \right\}\), is confined to the pdf (15).
In a monopulse radar, such a heavy-tailed feature of the underlying observation noise pdf is associated with the large target glint spikes, representing the outliers [40]. A sample of the random variable, \(v_{k}\), with such a pdf may be generated by first taking a sample, \(u\), belonging to the (0,1)-uniform pdf. Thus, if the sample, \(u\), is greater than the \(\delta\)-value, the sample, \(v_{k}\), is generated from the standard zero-mean Gaussian pdf with the unit variance; otherwise, the sample, \(v_{k}\), is generated from the contaminating zero-mean Gaussian pdf with the assumed large variance \(\sigma_{o}^{2} \gg 1\). Observations are generated in a separate computer program from a set of the true kinematic equations of the target motion, (29) and (30), and the previously obtained noise sample, \(v_{k}\). A typical observation noise record is depicted in Fig. 2. Besides, the filter world is represented by the two-dimensional, discrete, time-invariant state-space model in the form (1), (2), given by
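The described sampling procedure for the contaminated Gaussian mixture can be sketched directly:

```python
import random

def contaminated_noise_sample(delta, sigma_o):
    """Draw one measurement noise sample from the contaminated
    Gaussian mixture pdf (15), following the procedure in the text:
    with probability 1 - delta a nominal standard Gaussian sample,
    otherwise a zero-mean Gaussian outlier with sigma_o >> 1."""
    u = random.random()                  # u from the (0,1)-uniform pdf
    if u > delta:
        return random.gauss(0.0, 1.0)    # nominal standard Gaussian
    return random.gauss(0.0, sigma_o)    # contaminating (outlier) pdf
```

Repeated calls with, say, `delta = 0.1` and `sigma_o = 10` produce a record qualitatively similar to the one in Fig. 2: mostly unit-variance samples with occasional large spikes.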
The state transition matrix, \(F\), follows directly from (29) by taking \(t = t_{k + 1}\) and neglecting the acceleration term, while the observation or information vector, \(H\), follows from (30). The zero-mean white state noise sequence, \(\left\{ {w_{k} } \right\}\), is introduced artificially to compensate for the unmodelled system dynamics, associated with the unknown target manoeuvre. The variances of the noise sequences, \(\left\{ {w_{k} } \right\}\) and \(\left\{ {v_{k} } \right\}\), are given by \(Q = 0.1\) and \(R = 1\), respectively. Moreover, the uniform time step \(T = 0.02\,s\) is used. The following algorithms have been compared: the linear optimal Kalman filter (3)–(7), designated as A1; the statistically linearized M-robustified Kalman filter (11), (14), (17), (19), (22)–(24) with the variable statistical linearization coefficient in (28), designated as A2; the statistically linearized M-robustified Kalman filter (11), (14), (17), (19), (22)–(24) with the fixed statistical linearization coefficient in (26), designated as A3; and the quasi-linear approximation of the algorithm A2, based on the linear residual transformation, in (23), instead of the nonlinear one in (14), together with the application of the same nonlinear residual processing in computing the adaptive gain matrix, \(K_{k}\), as in A2, designated as A4.
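Under the stated parameters, the filter-world matrices can be written down as follows; the particular form of the state-noise input matrix `G` is an assumption here, since the text does not spell it out:

```python
import numpy as np

T = 0.02                               # uniform sampling period, seconds
F = np.array([[1.0, T],
              [0.0, 1.0]])             # CV state transition matrix, from (29)
H = np.array([[1.0, 0.0]])             # position-only observation, from (30)
G = np.array([[0.5 * T * T],
              [T]])                    # assumed state-noise input matrix
Q = np.array([[0.1]])                  # state noise variance
R = np.array([[1.0]])                  # measurement noise variance
```

These matrices feed both the optimal filter A1 and its robustified versions A2-A4, which differ only in the measurement update.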
Here, the initial state estimate, \(\hat{x}_{0}\), and the corresponding covariance, \(P_{0}\), are calculated using a suboptimal procedure based on the first two observations, [40]
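A common two-point differencing scheme of this kind is sketched below; the exact expressions of the suboptimal procedure in [40] may differ:

```python
import numpy as np

def two_point_init(z0, z1, T, R):
    """Initialize a two-state (position, velocity) filter from the
    first two position observations: position from the newer sample,
    velocity by finite differencing, with the covariance reflecting
    how the measurement noise variance R propagates into both."""
    x0 = np.array([z1, (z1 - z0) / T])
    P0 = np.array([[R,     R / T],
                   [R / T, 2.0 * R / (T * T)]])
    return x0, P0
```

The large velocity variance, \(2R/T^{2}\), reflects the fact that differencing two noisy positions amplifies the measurement noise.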
The performances of the analysed filters are compared both in terms of the estimated and the true position profiles and in terms of the cumulative estimation error criterion
with \(\left\| \cdot \right\|\) being the Euclidean norm, where the true target position trajectory, \(x_{k}\), is depicted in Fig. 1. The \(CEE\) criterion values obtained for different algorithms and different measurement noise realizations in (15) are presented in Figs. 3 and 4. The results plotted in Fig. 3 show that the robustified Kalman filters A2–A4 satisfy the efficiency robustness requirement, since the obtained values of the criterion (33) for these algorithms are not significantly larger than those of the optimal Kalman filter, A1, under the pure Gaussian observations. In addition, the state estimators A2–A4 also satisfy the resistant robustness requirement, producing significantly smaller values of the criterion (33) than the optimal Kalman state estimator, A1, in the presence of outliers within the Gaussian observations, as depicted in Fig. 4. The parts of the true and the estimated target trajectories, generated by the algorithms A1 and A2, are depicted in Fig. 5. Similar results are obtained for the algorithms A3 and A4. However, an analysis of the estimator performances using the true and the estimated profiles is not suitable, since the target positions on the trajectories are expressed in much larger units than the values of the underlying estimation errors. Therefore, the adopted \(CEE\) criterion in (33) is a more suitable figure of merit, concerning the estimation quality. In this sense, the simulation results presented in Figs. 3, 4, 5 show that the proposed robust filters A2–A4 obey the practical robustness requirements.
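Interpreting (33) as a running sum of Euclidean norms of the state estimation errors, the criterion can be computed as:

```python
import numpy as np

def cumulative_estimation_error(estimates, truths):
    """Cumulative estimation error: a running sum of Euclidean norms
    of the state estimation errors over the trajectory."""
    cee = 0.0
    history = []
    for x_hat, x_true in zip(estimates, truths):
        cee += float(np.linalg.norm(np.asarray(x_hat) - np.asarray(x_true)))
        history.append(cee)
    return history
```

Plotting such a history for each algorithm against the time index yields curves of the kind shown in Figs. 3 and 4, where a flatter curve indicates a better estimator.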
Moreover, extensive Monte Carlo simulations have shown that the robustified versions A2–A4 of the optimal linear Kalman filter, A1, perform fairly well for the contamination degree \(\delta \le 0.3\), since for greater \(\delta\)-values the observation noise model (15) is no longer adequate. Furthermore, the best performance is obtained by the algorithm A2, owing to the combined effects of the nonlinear residual transformation, in (14), and the calculation of the gain matrix, in (23), using an adaptive robustifying linearization coefficient, in (28). In other words, these effects result in gain matrix values large enough to produce a good tracking feature, but also small enough to provide for noise reduction. The algorithm A4 produces a slightly worse result than A2. A disadvantage of the algorithm A3 is the application of the fixed linearization coefficient, in (26), depending on the unknown contamination degree, \(\delta\), in (15), which cannot be estimated properly from the residuals, [15, 17]. Additionally, in the presence of outliers, the fixed linearization coefficient, in (26), reduces the gain matrix values, in (23), but the so-obtained gain values are larger than the gain values generated by the adaptive factor, in (28). This, in turn, makes the underlying state estimates more sensitive to impulsive noise, or outliers, in comparison with the algorithms A2 and A4. In this sense, although the algorithm A4 is linear in the observations, as A1, it utilizes the nonlinear robust data processing in calculating the gain matrix, as in A2, suppressing efficiently the influence of outliers.
The second part of the experimental results is devoted to the real data, concerning object tracking in a video sequence using the shortwave infrared camera. In this sense, the goal of video tracking is the estimation of the location of a moving object in the video sequence. For the experimental analysis of single moving object tracking in the video sequence, a kernelized correlation filter (KCF), [41], is used as a basic tracker, since it is one of the fastest trackers that does not require a graphics processing unit for real-time processing, [42]. In video tracking, occlusions are among the most challenging problems, [43]. Although the KCF algorithm performs very well under regular conditions, its performance decreases in the presence of occlusions. In this sense, when the tracked object disappears, due to a full occlusion, the KCF tracker will get stuck at the position of occlusion and continue to track the background as the object of interest. To overcome this problem, the prediction and the estimation of the object’s motion dynamics are required. Thus, the KCF tracker is combined with the Kalman filter to improve the tracking performance when the object is occluded. Furthermore, during the occlusion period, the tracked object may perform a manoeuvre, and in such a case it may happen that the object is not redetected after the occlusion, [44]. In this sense, the search area (window) used by the KCF tracker is not sufficient for object redetection after occlusion, [41]. Therefore, when an occlusion is detected, an extended search area is used for possible object redetection after occlusion. Here, for occlusion detection, the peak-to-sidelobe ratio (PSR) metric is used, [44]. The extended object search area is implemented by replicating the search windows around the central one, [45].
The dynamics of the tracked object, and those of the central search window position, are estimated by the Kalman filter, defining at the same time the central position of the extended search area. In this way, by estimating the dynamics of the object’s motion, and by expanding the object search area, it is possible to overcome occlusions, and redetect the object after occlusion, providing for continued tracking.
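The PSR metric mentioned above is commonly defined in the correlation tracking literature as the response peak minus the sidelobe mean, divided by the sidelobe standard deviation; a sketch, with the size of the excluded window around the peak as an assumed parameter:

```python
import numpy as np

def psr(response, peak_excl=5):
    """Peak-to-sidelobe ratio of a correlation response map:
    (peak - sidelobe mean) / sidelobe std, where a small window
    around the peak is excluded from the sidelobe region."""
    r = np.asarray(response, dtype=float)
    py, px = np.unravel_index(np.argmax(r), r.shape)
    mask = np.ones_like(r, dtype=bool)
    mask[max(0, py - peak_excl):py + peak_excl + 1,
         max(0, px - peak_excl):px + peak_excl + 1] = False
    sidelobe = r[mask]
    return (r[py, px] - sidelobe.mean()) / (sidelobe.std() + 1e-12)
```

A sharp, well-localized peak yields a high PSR; when the object becomes occluded, the response flattens, the PSR drops below a chosen threshold, and the occlusion-handling logic is triggered.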
The object being tracked is represented by a bounding box, defined by its centre (\(x_{c} ,y_{c}\)) in the image plane, and the corresponding height and width. To approximate the inter-frame object position (bounding box centre) displacements, the linear constant velocity model, in (31), is applied in the two directions, \(x\) and \(y\), yielding the state-space model in the form (1), (2). Thus, the system state vector, \(X\), and the corresponding system matrices, \(F\), \(G\), and \(H,\) are given by
with \(\dot{x}\), \(\dot{y}\) being the first derivatives, or the velocities, of the state vector components, \(x\) and \(y\), respectively, and \(T\) being equal to one. Thus, the two independent two-state (position and velocity components) Kalman filters in the \(x\) and \(y\) directions are used, in (34). The Kalman filter, defined by (34), is initialized on the first video sequence frame with the ground-truth object position. The corresponding estimation error covariance matrix, \(P_{0}\), the state noise covariance matrix, \(Q\), and the observation noise covariance matrix, \(R\), are defined as follows:
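The per-axis constant velocity block structure described above can be sketched as follows, assuming the state ordering \([x, \dot{x}, y, \dot{y}]\) (the ordering in (34) may differ):

```python
import numpy as np

T = 1.0                                   # one video frame per step
F2 = np.array([[1.0, T],
               [0.0, 1.0]])               # per-axis CV transition block
F = np.kron(np.eye(2), F2)                # block-diagonal 4x4 transition matrix
G = np.kron(np.eye(2), np.array([[0.5 * T * T],
                                 [T]]))   # assumed state-noise input matrix
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])      # bounding-box centre is observed
```

Because `F` and `H` are block-diagonal in the two axes, this single four-state model is equivalent to the two independent two-state filters used in the text.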
Particularly, the object tracking in infrared imagery is considered in the sequel. For this purpose, two characteristic video sequences are recorded, using the “Vlatacom electro-optical surveillance system,” [46]. The video sequences are recorded using the shortwave infrared camera, with a resolution of 576 × 504 pixels, at 25 frames per second. The recorded scenarios cover typical urban scenes in real-life surveillance applications, including a single moving object and various types of occlusions, such as static or moving, partial or complete, and short-term or long-term occlusions. The objects of interest for tracking are pedestrians. The first video sequence contains 825 frames with the partial and the full static occlusions. The second one has 225 frames, and the tracked object is occluded by the partial moving and the full static occlusions.
Starting from the two recorded sequences, and analysing the moving object tracking using the combination of the improved KCF algorithm, with an extended search area, and the standard Kalman filter, it can be concluded that the appearance of different types of occlusions occasionally results in large position errors, which may be treated as outliers. Thus, Fig. 6 shows the position errors for the vertical, \(y\), and the horizontal, \(x\), directions. The position measurement errors that deviate significantly from the majority of the population in the central cluster, in Fig. 6, represent the outliers, caused by various occlusions. As mentioned before, the standard Kalman filter has the linear influence function, so that it is sensitive to outliers, or non-robust. As a consequence, these errors in the measurement data can lead to object loss and tracking failure. Therefore, the M-robustified statistically linearized Kalman filter with the variable linearization coefficient, denoted above as A2, is proposed to suppress the influence of outliers in the video tracking applications.
The position errors of the different algorithms, based on the combination of the improved KCF tracker with the standard and the M-robustified adaptive Kalman filters, are shown in Fig. 7 (the first recorded video sequence) and in Fig. 8 (the second recorded video sequence). To clearly demonstrate the tracking performance on real data, Fig. 9 (the first video sequence) and Fig. 10 (the second video sequence) show frames from these sequences with the ground-truth bounding boxes, and the bounding boxes generated by the standard and the robust Kalman filters. Figures 9 and 10 show that, in the case of occlusions, the algorithm using the standard Kalman filter significantly deviates from the ground-truth position. In the scenario in Fig. 9, there is a complete loss of the object of interest, which can also be seen in Fig. 7. In the scenario in Fig. 10, although there is no loss of the object, at the moment of occlusion the error of the algorithm is very large, which is confirmed by the plot in Fig. 8. In a system with a camera on the pan-tilt [46], such large position errors may lead to a sudden movement of the system and the loss of the object from the field of view. On the other hand, the robust Kalman filter approach successfully overcomes occlusions and continues tracking more smoothly. The obtained results, based on the real data, confirm the conclusions derived earlier from the simulation results. The presented experimental results also indicate the possibility of designing an efficient robust tracking system in video surveillance applications, as a combination of the KCF tracker and the proposed adaptive M-robustified version of the optimal Kalman filter.
5 Conclusion
The Kalman filter produces the optimal state estimates of a linear dynamic stochastic system when both the random input, the so-called state noise, and the additive measurement noise are Gaussian distributed. The optimality of the Kalman filter is related to its predictor-corrector recursive form and the computation of the gain sequence. However, erroneous noise statistics and/or mismodelling may cause significant deviations from the theoretically optimal performance.
Starting from these practical limitations of the optimal Kalman filter, a new class of statistically linearized M-robustified Kalman filtering algorithms has been proposed in this article. The proposed robust algorithms are feasible and provide for recursive dynamic system state estimation. The article also gives the complete derivation of the algorithms to be handled. In this sense, the time-update recursion is designed according to the optimal Kalman filter, and the measurement-update recursion is designed as a nonlinear dynamic stochastic approximation procedure, generated by minimizing at each stage the generalized time-varying Huber’s M-robust performance index. Thus, the optimal Kalman filter robustification is obtained by the nonlinear transformation of the scaled residuals through Huber’s M-robust influence function. Analogously to the standard Kalman filter, the robust recursive estimator gain matrix is computed from an additional optimization procedure of the minimum variance type. The posed nonlinear optimization process applies the statistical linearization technique to provide for a suboptimal robust version of the Kalman gain matrix. Since the determination of the mean-square optimal statistical linearization coefficient assumes the exact knowledge of the observation noise statistics, both the fixed and the variable approximations of the optimal coefficient are proposed. Thus, the fixed approximation of this coefficient represents an approximation of the average slope of Huber’s M-robust influence function, estimated further by the assumed probability of outlier occurrence. A variable version of such a fixed coefficient is obtained by approximating the expectation by the current sample, resulting in the current slope of the M-robust influence function.
Theoretical convergence analysis of the proposed robust algorithms is difficult, due to both their nonlinear forms and a time-varying multidimensional system dynamics. Therefore, the practical robustness of the derived state estimators, including the resistant and the efficiency robustness, is analysed by simulations, using a single manoeuvring target tracking example. The experimental results also allow an understanding of the algorithms’ operation, with and without outliers, where each case is accompanied by an adequate robust gain matrix. Additionally, they point to the conclusion that both the nonlinear transformation of the scaled measurement residuals, using Huber’s M-robust influence function, and the robustified computation of the gain matrix, applying an adaptive statistical linearization coefficient, provide for a good compromise between the tracking performance and the noise immunity. In other words, the variable coefficient is adapted properly to the nonlinear form of the M-robust influence function, reducing the effects of outliers. Moreover, the fixed statistical linearization coefficient results in a slower decrease in the gain matrix, in comparison with the variable one. This, in turn, eliminates the effects of outliers less effectively than in the case of the variable statistical linearization coefficient. A quasi-linear approximation of the proposed statistically linearized M-robustified Kalman state estimator, based on the linear residual transformation, together with the adaptive nonlinear residual processing in calculating the robust gain matrix, produces a slightly worse performance than the starting nonlinear robust estimator. Moreover, the experimental results based on real data, concerning video tracking using the shortwave infrared camera, are also analysed.
The real data consist of the two recorded video sequences, representing typical urban scenes in real-life surveillance applications, including a pedestrian as the object of interest in scenarios with various types of occlusions (static or moving, partial or complete, short- or long-term). The application of the proposed video tracker, being the combination of the improved kernelized correlation filter and the M-robustified statistically linearized Kalman filter, provides an efficient, robust method for tracking a manoeuvring object in the presence of occlusions. These results are in accordance with the conclusions derived from the simulations and indicate the possibility of designing an efficient robust video tracking algorithm.
The proposed statistically linearized M-robustified filtering technique can also be applied with some redescending influence function that may be better at eliminating the influence of outliers. However, the robust score function associated with such an influence function, determining the M-robust performance measure to be minimized, is not convex. Therefore, there could be convergence problems during the robust filter initialization. The problem may be circumvented by applying a two-step estimation procedure, where in the first step the proposed M-robust version of the Kalman filter, based on Huber’s monotonically non-decreasing influence function, is applied. This, in turn, generates good initial guesses for the M-robust version of the Kalman filter, based on a redescending M-robust influence function, in the second step.
Finally, an approximation of the nonlinearities in the motion and measurement equations of a nonlinear stochastic dynamic system by the statistical linearization technique can be combined with the proposed statistically linearized M-robustified Kalman filter to obtain a robust recursive state estimator of the approximate minimum variance type. Although these equations look like those of the proposed statistically linearized M-robustified Kalman filter, they are much more complex. In this sense, much auxiliary computation is needed to obtain the various vectors and matrices of the corresponding expectations, defining the statistical linearization coefficients. Thus, the degree of difficulty in calculating the underlying coefficients is a significant argument in deciding whether to apply this method in practice. However, the calculated values of the required quantities can be stored for “look-up” during the state estimation, greatly reducing the computational effort.
In summary, since most industrial and scientific data contain unavoidable outliers, owing to metering and communication errors, incomplete measurements, errors in mathematical models, etc., the proposed recursive robust estimator may be applied to different problems, including system identification, state estimation, signal processing and adaptive control.
Availability of data and materials
Data are available on request from the authors.
Abbreviations
 A1:

Linear optimal Kalman filter algorithm
 A2:

Statistically linearized Mrobustified Kalman filter algorithm with the variable statistical linearization coefficient
 A3:

Statistically linearized Mrobustified Kalman filter algorithm with the fixed statistical linearization coefficient
 A4:

Quasi-linear approximation of algorithm A2, based on the linear residual transformation, together with the application of the nonlinear residual processing in computing the gain matrix
 CCS:

Cartesian coordinate system
 CEE:

Cumulative estimation error criterion
 CV:

Constant velocity
 KCF:

Kernelized correlation filter
 ML:

Maximum likelihood
 MSE:

Mean square error
 pdf:

Probability density function
 PSR:

Peaktosidelobe ratio
 SCS:

Spherical coordinate system
 SWIR:

Shortwave infrared
References
A. Gelb (ed.), Applied optimal estimation, Analytic Sciences Corporation (MIT Press, Cambridge, MA, 2010)
M.S. Grewal, A.P. Andrews, Kalman filtering theory and practice using Matlab (Wiley, Hoboken, NJ, 2015)
R. Stengel, Stochastic optimal control (Wiley, New York, 1986)
F. van der Heijden, B. Lei, G. Xu, F. Ming, Y. Zou, D. de Ridder, D.M. Tax, Classification, parameter estimation, and state estimation: an engineering approach using Matlab (Wiley, Hoboken, NJ, 2017)
B. Kovačević, Ž. Đurović, Fundamentals of stochastic signals, systems and estimation theory with worked examples (Springer, Berlin, 2011)
M. Verhaegen, V. Verdult, Filtering and system identification: a least squares approach (Cambridge University Press, Cambridge, 2012)
J.V. Candy, Modelbased signal processing (Wiley, Hoboken, NJ, 2006)
C. Price, An analysis of the divergence problem in the Kalman filter. IEEE Trans. Autom. Control 13(6), 699–702 (1968). https://doi.org/10.1109/tac.1968.1099031
P. Hanlon, P. Maybeck, Characterization of Kalman filter residuals in the presence of mismodeling. IEEE Trans. Aerosp. Electron. Syst. 36(1), 114–131 (2000). https://doi.org/10.1109/7.826316
A.E. Albert, L.A. Gardner, Stochastic approximation and nonlinear regression (The MIT Press, Cambridge, MA, 1967)
G. Saridis, Z. Nikolic, K. Fu, Stochastic approximation algorithms for system identification, estimation, and decomposition of mixtures. IEEE Trans. Syst. Sci. Cybernet. 5(1), 8–15 (1969). https://doi.org/10.1109/tssc.1969.300238
S. Stanković, B. Kovačević, Analysis of robust stochastic approximation algorithms for process identification. Automatica 22(4), 483–488 (1986). https://doi.org/10.1016/00051098(86)900531
Y.Z. Tsypkin, Adaptation and learning in automatic systems (Academic Press, New York, 1971)
J.M. Mendel, Adaptive, learning and pattern recognition systems: theory and applications (Acad. Press, New York, 2012)
V. Barnett, T. Lewis, Outliers in statistical data (Wiley, Chichester, 2000)
W.N. Venables, B.D. Ripley, Modern applied statistics with S (Springer, New York, 2011)
P.J. Huber, E.M. Ronchetti, Robust statistics (Wiley, Hoboken, NJ, 2009)
R.R. Wilcox, Introduction to robust estimation and hypothesis testing (Academic Press, Amsterdam, 2017)
F.R. Hampel, E.N. Ronchetti, P.J. Rousseeuw, W.A. Stahel, Robust statistics: the approach based on influence functions (Wiley, Hoboken, NJ, 1986)
B. Kovačević, M. Milosavljević, M. Veinović, M. Marković, Robust digital processing of speech signals (Springer, 2017)
C. Boncelet, B. Dickinson, An approach to robust Kalman filtering, in The 22nd IEEE Conference on Decision and Control (1983). https://doi.org/10.1109/cdc.1983.269847
B. Kovačević, Ž Đurović, S. Glavaški, On robust Kalman filtering. Int. J. Control 56(3), 547–562 (1992). https://doi.org/10.1080/00207179208934328
Ž Đurović, B. Kovačević, Robust estimation with unknown noise statistics. IEEE Trans. Autom. Control 44(6), 1292–1296 (1999). https://doi.org/10.1109/9.769393
G. Chang, M. Liu, Mestimatorbased robust Kalman filter for systems with process modeling errors and rank deficient measurement models. Nonlinear Dyn. 80(3), 1431–1449 (2015). https://doi.org/10.1007/s1107101519530
L. Chang, B. Hu, G. Chang, A. Li, Robust derivativefree Kalman filter based on Huber’s Mestimation methodology. J. Process Control 23(10), 1555–1561 (2013). https://doi.org/10.1016/j.jprocont.2013.05.004
D. de Menezes, D. Prata, A. Secchi, J. Pinto, A review on robust M-estimators for regression analysis. Comput. Chem. Eng. 147, 107254 (2021). https://doi.org/10.1016/j.compchemeng.2021.107254
M.A. Gandhi, L. Mili, Robust Kalman filter based on a generalized maximum-likelihood-type estimator. IEEE Trans. Signal Process. 58(5), 2509–2520 (2010). https://doi.org/10.1109/tsp.2009.2039731
J. Valluru, S. Patwardhan, L. Biegler, Development of robust extended Kalman filter and moving window estimator for simultaneous state and parameter/disturbance estimation. J. Process Control 69, 158–178 (2018). https://doi.org/10.1016/j.jprocont.2018.05.008
M. Murata, H. Nagano, K. Kashino, Unscented statistical linearization and robustified Kalman filter for nonlinear systems with parameter uncertainties, in 2014 American Control Conference (2014). https://doi.org/10.1109/acc.2014.6858583
K. Li, B. Hu, L. Chang, Y. Li, Robust square-root cubature Kalman filter based on Huber’s M-estimation methodology. Proc. Inst. Mech. Eng. Part G J. Aerosp. Eng. 229(7), 1236–1245 (2014). https://doi.org/10.1177/0954410014548698
Y. Zou, S. Chan, T. Ng, Robust M-estimate adaptive filtering. IEE Proc. Vis. Image Signal Process. 148(4), 289 (2001). https://doi.org/10.1049/ipvis:20010316
Z.D. Banjac, B.D. Kovacevic, M.M. Milosavljevic, M.D. Veinovic, Local echo canceler with optimal input for true full-duplex speech scrambling system. IEEE Trans. Signal Process. 50(8), 1877–1882 (2002). https://doi.org/10.1109/tsp.2002.800415
Z. Banjac, B. Kovačević, M. Veinović, M. Milosavljević, Robust least mean square adaptive FIR filter algorithm. IEE Proc. Vis. Image Signal Process. 148(5), 332–336 (2001). https://doi.org/10.1049/ipvis:20010594
B. Kovačević, Z. Banjac, M. Milosavljević, Adaptive digital filters (Springer, Berlin, 2013)
B. Kovačević, Z. Banjac, I.K. Kovačević, Robust adaptive filtering using recursive weighted least squares with combined scale and variable forgetting factors. EURASIP J. Adv. Signal Process. (2016). https://doi.org/10.1186/s13634-016-0341-3
Z. Banjac, Z. Durovic, B. Kovacevic, Approximate Kalman filtering using robustified dynamic stochastic approximation method, in 2018 26th Telecommunications Forum (TELFOR) (2018). https://doi.org/10.1109/telfor.2018.8611817
Z. Banjac, B. Kovačević, Robustified Kalman filtering using both dynamic stochastic approximation and M-robust performance index. Tehnicki Vjesnik - Technical Gazette 29(3), 907–914 (2022). https://doi.org/10.17559/tv20200929143934
S.C. Chapra, R.P. Canale, Numerical methods for engineers (McGrawHill Education, New York, NY, 2015)
T. Young, R. Westerberg, Error bounds for stochastic estimation of signal parameters. IEEE Trans. Inf. Theory 17(5), 549–557 (1971). https://doi.org/10.1109/tit.1971.1054696
S.S. Blackman, R. Popoli, Design and analysis of modern tracking systems (Artech House, Norwood, MA, 1999)
J.F. Henriques, R. Caseiro, P. Martins, J. Batista, Highspeed tracking with kernelized correlation filters. IEEE Trans. Pattern Anal. Mach. Intell. 37(3), 583–596 (2015). https://doi.org/10.1109/tpami.2014.2345390
S. Javed, M. Danelljan, F.S. Khan, M.H. Khan, M. Felsberg, J. Matas, Visual object tracking with discriminative filters and siamese networks: a survey and outlook. IEEE Trans. Pattern Anal. Mach. Intell. 45(5), 6552–6574 (2022). https://doi.org/10.1109/TPAMI.2022.3212594
R. Xia, Y. Chen, B. Ren, Improved anti-occlusion object tracking algorithm using Unscented Rauch-Tung-Striebel smoother and kernel correlation filter. J. King Saud Univ. Comput. Inf. Sci. 34(8), 6008–6018 (2022). https://doi.org/10.1016/j.jksuci.2022.02.004
M. Zolfaghari, H. Ghanei-Yakhdan, M. Yazdi, Real-time object tracking based on an adaptive transition model and extended Kalman filter to handle full occlusion. Vis. Comput. 36(4), 701–715 (2020). https://doi.org/10.1007/s00371-019-01652-3
J. Shin, H. Kim, D. Kim, J. Paik, Fast and robust object tracking using tracking failure detection in kernelized correlation filter. Appl. Sci. 10(2), 713 (2020). https://doi.org/10.3390/app10020713
N. Latinović, I. Popadić, B. Tomić, A. Simić, P. Milanović, S. Nijemčević, M. Perić, M. Veinović, Signal processing platform for long-range multispectral electro-optical systems. Sensors 22(3), 1294 (2022). https://doi.org/10.3390/s22031294
Acknowledgements
The authors are grateful for the valuable comments and suggestions of the anonymous reviewers that improved the final version of the manuscript.
The authors are also grateful to the Vlatacom Institute of High Technologies for providing the real infrared video surveillance datasets.
Funding
This paper is an outcome of activities under project #157 VMSIS3_Advanced supported by the Vlatacom Institute of High Technologies, Belgrade, Serbia.
Contributions
MP designed the work, analysed and interpreted the data and drafted the manuscript. ZB participated in the design of the study, performed the experiments and analysis and helped to draft the manuscript. BK performed the theoretical analysis and helped to produce the final version of the manuscript. All the authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1 Derivation of the statistically linearized M-robustified versions of the Kalman filter (algorithms A2, A3 and A4)
Starting from (1) and (2), one obtains for the prediction error in (20) the relation
where the estimation error, \(\tilde{x}_{k} \left( + \right)\), is defined by (16), while the variable, \(w_{k}\), is a sample from the zero-mean white state noise in (1). Furthermore, taking into account (1), (2), (11) and (18), the estimation error in (16) is given by
where the variable, \(v_{k}\), is a sample from the zero-mean white measurement noise in (2). Analogously to the linear optimal Kalman filter, the initial condition defined in Sect. 2 guarantees that the prediction and the estimation errors are unbiased at each stage, \(k\), resulting in
The proof is based on mathematical induction [1,2,3,4,5]. In addition, under the assumptions in Sect. 2, the random errors, \(\tilde{x}_{k} \left( + \right)\) and \(\tilde{x}_{k} \left( - \right)\), are uncorrelated with the state and the measurement noises, producing the underlying cross-covariances
where 0 denotes a zero matrix. Starting from (A1), (A2) and (A4), one obtains for the prediction error covariance in (20) the relation (24), which is identical to the time update in (3), (4). Moreover, due to (A2) and (A4), the estimation error covariance in (16) can be calculated at each stage by the recursive relation
where \(R_{k}\) is the scalar measurement noise variance. By substituting the gain, \(K_{k}\), from (18) into (A5), and applying the matrix trace operation, one obtains for the approximate minimum variance criterion, in (16), the following expression
where the scaling factor, \(s_{k}\), is defined by (19). In deriving (A6), use has been made of the fact that the third matrix term in (A5) is the transpose of the second one, so both have the same trace. The next step of the derivation relies on the rules for the partial derivative of the trace of a product of matrices [1,2,3,4,5].
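The trace differentiation rules invoked here are easy to check numerically. The sketch below (Python with NumPy; the matrix names are our own, chosen to echo the gain derivation) verifies two standard identities consistent with the rules used in this appendix, \(\partial \, tr\left( {\Gamma C} \right)/\partial \Gamma = C^{T}\) and \(\partial \, tr\left( {\Gamma B\Gamma^{T} } \right)/\partial \Gamma = \Gamma \left( {B + B^{T} } \right)\), against central finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
Gamma = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))

def num_grad(f, X, h=1e-6):
    """Central finite-difference gradient of a scalar function f w.r.t. matrix X."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            G[i, j] = (f(X + E) - f(X - E)) / (2.0 * h)
    return G

# d/dGamma tr(Gamma C) should equal C^T
g1 = num_grad(lambda X: np.trace(X @ C), Gamma)
# d/dGamma tr(Gamma B Gamma^T) should equal Gamma (B + B^T)
g2 = num_grad(lambda X: np.trace(X @ B @ X.T), Gamma)
```

Such a finite-difference check is a convenient safeguard when applying the rules to expressions like (A6).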
By comparing (A7) with (A6), one concludes that \(B = I\), \(A = \Gamma_{k}\) and \(C = H_{k}^{T} H_{k} M_{k}\) for the second term in (A6), while \(A = \Gamma_{k}\) and \(B = H_{k}^{T} H_{k}\) for the third one. Taking into account these equivalences, one obtains, by partially differentiating (A6) and equating the resulting matrix equation with the zero matrix, the following algebraic equation
The matrix equation (A8) requires further simplifications, to generate a feasible suboptimal solution for \(\Gamma_{k}\). Firstly, since the value of the optimal statistical linearization coefficient, \(\alpha\), in (17) is close to one, one can replace \(\alpha^{2}\) in the second term of (A8) by \(\alpha\). Moreover, by using (19), it further follows
Namely, the first term in the brackets of (A9) is proportional to the uncertainty in the prediction, expressed by the covariance, \(M_{k}\), in (20) and (24), but inversely proportional to the measurement noise average power, \(R_{k}\). Moreover, the proposed nonlinear filter, based on the winsorization technique in (14) and (23), obeys the efficiency robustness requirement. In this sense, it is almost as efficient as the optimal linear Kalman filter under purely Gaussian observations, but retains a good efficiency in the presence of outliers within the Gaussian observations. Therefore, the estimation error covariance, \(P\), in (24) is rather small, so that the uncertainty in the prediction, \(M_{k}\), is directly proportional to the state noise average power, \(Q\). Moreover, the measurement noise variance, \(R\), determined by (21), is significantly larger than the state noise variance, \(Q\). As a consequence, the left-hand side of (A9) reduces approximately to the unit value on its right-hand side. Under the adopted two approximations, the relation (A8) takes the form
Bearing in mind (18) and (A10), it further follows
Finally, by replacing (A11) into (A5), one obtains the estimation error covariance expressed by
By applying the two approximations mentioned above, \(\alpha^{2} \approx \alpha\) and (A9), as well as by substituting the gain, \(K\), from (A11) into (A12), the last relation reduces to (22).
Finally, starting from the approximation of \(\alpha^{2}\) by \(\alpha\) in (18), the relation (23) is obtained by substituting \(\Gamma_{k}\) from (22) into (18), and including in the so-obtained equation the \(\psi\)-function in (14), instead of its statistical linear approximation in (17). Thus, if one uses the variable \(\alpha\) in (28), having a value close to zero or one, the above approximation of \(\alpha^{2}\) with \(\alpha\) is reasonable. On the other hand, the fixed approximation of \(\alpha\), in (26) or (27), is equal to the probability of the regular observations; for a small or moderate contamination degree, \(\delta\) in (15), this probability is close to one, justifying the approximation. Of course, this value decreases as the contamination degree grows, reducing the gain values, \(K_{k}\), in (23) and suppressing the influence of outliers.
A class of statistically linearized M-robustified Kalman filtering algorithms, with \(\alpha\) being the free parameter, is defined by the prediction recursions (24) and the estimation, or filtering, recursions defined by (11), (14), (19), (22) and (23). A particular robust algorithm is obtained by choosing a suitable approximation of the indeterminable mean-square optimal statistical linearization coefficient, \(\alpha\), in (17). Thus, the choice of the fixed approximation, in (26) or (27), of the optimal coefficient \(\alpha\) in (17) results in the algorithm A3. Furthermore, one obtains the algorithm A2 by choosing the variable approximation, in (28), of the optimal coefficient \(\alpha\) in (17). The algorithm A4 follows from the algorithm A2 by replacing the nonlinear M-robust influence function, \(\psi\), in (23) with the linear one, while the robust mechanism for generating the gain matrix, \(K\), in (23) remains unchanged, as in A2. The derivation of the proposed recursive robust algorithms is thus completed.
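For illustration, one complete estimation-plus-prediction cycle of such a robustified filter can be sketched as follows (Python/NumPy). The function names, the scalar-measurement restriction and the use of the standard Kalman gain are our own simplifications; the code follows the cited equations only loosely, as a sketch rather than the exact A2 recursions.

```python
import numpy as np

def huber_psi(x, k=1.5):
    """Huber's M-robust influence function: linear inside [-k, k], clipped outside."""
    return np.clip(x, -k, k)

def robust_kf_cycle(x_pred, M, y, F, H, Q, R, k=1.5):
    """One cycle of a statistically linearized M-robustified Kalman filter for a
    scalar measurement (a hedged sketch, not the paper's exact relations).
    x_pred: predicted state (n,); M: prediction error covariance (n, n);
    y: scalar measurement; H: (n,) measurement row; Q, R: noise covariances."""
    H = np.asarray(H, dtype=float)
    s = np.sqrt(H @ M @ H + R)          # residual scale, cf. (19)
    eps = y - H @ x_pred                # measurement residual, cf. (11)
    K = M @ H / (H @ M @ H + R)         # gain vector (standard form), cf. (18)
    # Winsorized update: the correction is bounded by |K| * s * k, cf. (14), (23)
    x_est = x_pred + K * s * huber_psi(eps / s, k)
    P = M - np.outer(K, H @ M)          # estimation error covariance, cf. (22)
    x_next = F @ x_est                  # one-stage prediction
    M_next = F @ P @ F.T + Q            # time update, cf. (24)
    return x_est, P, x_next, M_next
```

Note that a spiky observation cannot drive the state correction beyond the clipped value \(K s_{k} k\), which is precisely the source of the outlier insensitivity discussed above.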
Appendix 2 Derivation of the optimal statistical linearization coefficients
Statistical linearization is a statistical approximation technique whose basic principle is to approximate a given vector-valued function, \(\psi \left( z \right)\), of a random vector argument, \(z\), by the linear matrix form
Here, the parameters, \(\alpha\) and \(\beta\), are some matrix coefficients that have to be determined. Analogously to the estimation problem, by defining the function representation error
these coefficients may be calculated by minimizing the meansquare error (MSE) criterion
Here, \(Trace\) is the matrix trace, \(\left\| \cdot \right\|\) is the Euclidean norm, and \(E\left\{ \cdot \right\}\) denotes the mathematical expectation [1,2,3,4,5]. Substituting (B2) into (B3), and changing the order of the linear operators, \(Trace\) and \(E\left\{ \cdot \right\}\), together with the application of the rules in (A7), one obtains
By setting (B4) equal to zero, it further follows
Moreover, by substituting the coefficient, \(\beta\), from (B5) into (B2) and (B3), and differentiating again the soobtained equation with respect to the matrix \(\alpha\), using the rules (A7), one obtains
Setting (B6) equal to zero, and solving the resulting equation, it further follows
If \(\psi \left( z \right)\) is a vector-valued function of the multidimensional argument, \(z\), then the statistically linearized solutions in (B5) and (B7) require the evaluation of multidimensional integrals, following from the definitions of the corresponding multidimensional mathematical expectations. This assumes that the joint pdf of the components of the random vector, \(z\), is given in advance; most frequently, the adopted joint pdf is the multidimensional Gaussian one. The computation is much simplified for nonlinearities with a small number of argument variables. In particular, if both the random variable \(\psi\) and the random variable \(z\) are scalars with zero-mean values, then \(\beta\) in (B5) and \(\alpha\) in (B7) reduce to the scalar-valued deterministic quantities given by
with \(\sigma_{z}^{2}\) being the variance of the random argument, \(z\). More precisely, the equation (B8) assumes that the argument, \(z\), is a zeromean random variable with a symmetric pdf, while \(\psi\) is an odd real function of the scalar argument, \(z\).
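As a numerical check, the scalar coefficient \(\alpha\) in (B8) can be evaluated for Huber's \(\psi\) with a zero-mean Gaussian argument (a sketch; the clipping threshold \(k\) is an assumed illustrative value). By Stein's lemma, \(E\left\{ {z\psi \left( z \right)} \right\} = \sigma_{z}^{2} E\left\{ {\psi^{\prime}\left( z \right)} \right\}\), so the result should equal \(P\left( {\left| z \right| \le k} \right) = erf\left( {k/\sigma_{z} \sqrt 2 } \right)\), a value close to one for moderate \(k\), in line with the approximation \(\alpha^{2} \approx \alpha\) used in Appendix 1. For an odd \(\psi\) and a symmetric pdf, the companion coefficient \(\beta\) vanishes.

```python
import numpy as np

def huber_psi(x, k=1.5):
    """Huber's influence function: identity inside [-k, k], clipped outside."""
    return np.clip(x, -k, k)

def alpha_coeff(sigma=1.0, k=1.5):
    """alpha = E{z * psi(z)} / sigma^2 for z ~ N(0, sigma^2), cf. (B8),
    evaluated by trapezoidal integration on a wide grid."""
    z = np.linspace(-10.0 * sigma, 10.0 * sigma, 200001)
    pdf = np.exp(-z**2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))
    f = z * huber_psi(z, k) * pdf
    integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(z))
    return integral / sigma**2
```

For the unit-variance case with \(k = 1.5\), the coefficient comes out close to one, as the text anticipates.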
In particular, the second assumption is fulfilled for Huber's robust influence function, \(\psi \left( \cdot \right)\), in (14). Moreover, the random argument, \(z\), in (B9) corresponds to the normalized scaled measurement residual, \(\tilde{\varepsilon }_{k} = \varepsilon_{k} /s_{k}\), in (17), where the residual, \(\varepsilon_{k}\), is given by (11), with \(s_{k}\) being its standard deviation calculated in (19). Therefore, by substituting the current measurement, \(y_{k}\), from (2) into (11), one gets
where the prediction error, \(\tilde{x}_{k} \left( - \right)\), in (A1) is a zero-mean random variable with the covariance \(M_{k}\), computed by (24). Analogously to the optimal Kalman filter, the zero-mean error, \(\tilde{x}_{k} \left( - \right)\), in (B9) is Gaussian distributed if the zero-mean white noises, \(w_{k}\) and \(v_{k}\), as well as the initial state vector, \(x_{0}\), are Gaussian distributed. In this sense, one can write
where \(N\left( { \cdot ,a,b} \right)\) denotes the Gaussian pdf with the mean \(a\) and the covariance \(b\). Moreover, starting from (B10), the random variable, \(H_{k} \tilde{x}_{k} \left( - \right)/s_{k}\), in (B9) has the zero-mean Gaussian pdf, defined by
Furthermore, the zero-mean observation noise, \(v_{k}\), is confined to the Gaussian mixture pdf in (15), yielding the unit nominal variance, \(R_{k}\), in (B9). This, in turn, implies that the scaled random variable, \(v_{k} /s_{k}\), in (B9) has the following Gaussian mixture pdf
Thus, the normalized residual, \(\tilde{\varepsilon }_{k}\), in (B9) is defined by a sum of zero-mean Gaussian random variables, so that its conditional pdf given the past observations, \(p\left( {\tilde{\varepsilon }_{k} |Y^{k - 1} } \right)\), is defined by the convolution of the underlying Gaussian pdfs to which the random addends in (B9) are confined. Additionally, since the convolution of Gaussian pdfs yields again a Gaussian one, with the corresponding mean and covariance, one gets from (B9), (B11) and (B12) the conditional pdf of the scaled residual given the past measurements
where \(\otimes\) denotes the convolution integral [5]. Since the convolution is a linear operator, the relation (B13) can be rewritten as
Taking into account (19), the variance of the second normal pdf in (B14) can be approximated by
The right-hand side approximation in (B15) follows from the approximate relation (A9), yielding
By substituting (B15) into (B14), one obtains
Since (B17) is a symmetric pdf, the first assumption under which the expression (B8) is derived is also fulfilled, justifying the application of (B8) to the relation (17). Thus, the derivation is completed.
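The closure of the Gaussian family under convolution, used in passing from (B13) to (B14), is easy to verify numerically. The sketch below (Python/NumPy, with illustrative variances of our own choosing) convolves two zero-mean Gaussian densities on a grid and compares the result with the Gaussian density whose variance is the sum. Since convolution is linear, the same check extends component-wise to the Gaussian mixture pdf of the scaled noise in (B12), which is how the two-component mixture form of the residual pdf in (B17) arises.

```python
import numpy as np

def gauss_pdf(z, var):
    """Zero-mean Gaussian density N(z; 0, var)."""
    return np.exp(-z**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

# On a symmetric grid, 'same'-mode discrete convolution aligns with the grid itself.
z = np.linspace(-20.0, 20.0, 4001)
dz = z[1] - z[0]

# Convolve N(0, 1) with N(0, 2); the result should match N(0, 3).
conv = np.convolve(gauss_pdf(z, 1.0), gauss_pdf(z, 2.0), mode="same") * dz
max_err = np.max(np.abs(conv - gauss_pdf(z, 3.0)))
```

The pointwise error is at the level of the grid discretization, confirming that the convolution of two Gaussians is the Gaussian with summed variances.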
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Pavlović, M., Banjac, Z. & Kovačević, B. Approximate Kalman filtering by both M-robustified dynamic stochastic approximation and statistical linearization methods. EURASIP J. Adv. Signal Process. 2023, 69 (2023). https://doi.org/10.1186/s13634-023-01030-1
DOI: https://doi.org/10.1186/s13634-023-01030-1
Keywords
 Impulsive noise
 Kalman filtering
 Non-Gaussian noise
 Nonlinear filtering
 Outliers
 Robust estimation
 Statistical linearization
 Stochastic approximation
 Tracking