

Open Access

Iterative unbiased FIR state estimation: a review of algorithms

EURASIP Journal on Advances in Signal Processing 2013, 2013:113

Received: 6 November 2012

Accepted: 28 April 2013

Published: 30 May 2013


In this paper, we develop in part and review various iterative unbiased finite impulse response (UFIR) algorithms (both direct and two‐stage) for the filtering, smoothing, and prediction of time‐varying and time‐invariant discrete state‐space models in white Gaussian noise environments. The distinctive property of UFIR algorithms is that noise statistics are completely ignored. Instead, an optimal window size is required for optimal performance. We show that the optimal window size can be determined via measurements with no reference. UFIR algorithms are computationally more demanding than Kalman filters, but this extra computational effort can be alleviated with parallel computing, and the extra memory that is required is not a problem for modern computers. Under real‐world operating conditions with uncertainties, non‐Gaussian noise, and unknown noise statistics, the UFIR estimator generally demonstrates better robustness than the Kalman filter, even with suboptimal window size. In applications requiring large window size, the UFIR estimator is also superior to the best previously known optimal FIR estimators.


Keywords: Unbiased FIR estimator; Kalman filter; Iterative algorithm; Filtering; Smoothing; Prediction

1 Review

1.1 Introduction

In optimal estimation theory, unbiasedness is a key condition that is used to derive linear and nonlinear estimators. A classical example is the ordinary least squares (OLS) estimator proposed by Gauss in 1795 [1]. The Gauss-Markov theorem states that if the noise is white and has the same variance at each time step, the OLS estimator is also the best linear unbiased estimator (BLUE) [2]. In convolution-based optimal filtering, the unbiasedness constraint [2] leads to the unbiased finite impulse response (UFIR) estimator [3, 4]. An extremely useful property of the BLUE and UFIR is that noise statistics are not required. Another example is the maximum likelihood estimator (MLE), which obtains the estimate at an extremum of the density function of the state conditioned on the measurements [5]. Like the BLUE and UFIR, the MLE is suboptimal for finite data. However, as the sample size (memory) increases to infinity, each of them becomes optimal.

For finite data, unbiasedness does not guarantee the minimum mean square error (MSE), which is composed of the error variance and the squared bias:
$$\mathrm{MSE} = \mathrm{Var} + \mathrm{Bias}^2, \tag{1}$$

where ‘Var’ is the error variance. Since the minimum MSE is required by many applications, a minimization of (1) is often desired at the expense of a small increase in bias. That leads to different kinds of optimal solutions such as the minimum variance unbiased estimator (MVUE), the recursive Kalman filter [6], and the optimal FIR (OFIR) filter [7, 8]. The common disadvantage of these filters is that noise statistics and initial errors are required. In view of the fact that noise statistics and initial errors are commonly not well known, especially for time-variant models, theoretically optimal estimators end up being suboptimal in practical applications. In this regard, engineering experience says the following [9]:

Practical implementation of the Kalman filter is often difficult due to the inability in getting a good estimate of the noise covariance matrices.

That means that, owing to insufficient knowledge about noise statistics, optimal estimators that minimize (1) may be less accurate than unbiased ones that are derived under the invariant $E\{x_n - \hat{x}_n\} = 0$, which leads to the unbiasedness condition
$$E\{\hat{x}_n\} = E\{x_n\}, \tag{2}$$

where $x_n$ indicates a state variable at discrete time step $n$, $\hat{x}_n$ is its estimate, and $E\{x\}$ is the expected value of $x$. Note that the cost of equipment that is required for the characterization of noise statistics cannot commonly be afforded by users, and methods for the estimation of noise covariance matrices via measurements are not well developed. On the other hand, noise statistics are not always necessary to get a good estimate, as the following example illustrates.

Example 1

A linear signal $x_n$ is measured as $z_n = x_n + v_n$ in the presence of zero mean white Gaussian noise $v_n$ having variance $\sigma_v^2 = 1$. The $p$-shift ramp UFIR estimator matches this signal and is given by the convolution-based estimate
$$\hat{x}_{n+p} = \sum_{i=0}^{N-1} h_{1i}(N,p)\, z_{n-i}, \tag{3}$$
where the impulse response function is (eq. (89) of [17])
$$h_{1n}(N,p) = \frac{2(2N-1) - 6n}{N(N+1)} + \frac{6p(N-1-2n)}{N(N^2-1)}, \tag{4}$$
$p = 0$ corresponds to filtering, $p < 0$ corresponds to $|p|$-lag smoothing, and $p > 0$ corresponds to $p$-step prediction. The estimation variance is defined as $\sigma^2 = G(N,p)\,\sigma_v^2 = G(N,p)$, where $G(N,p) = \sum_{i=0}^{N-1} h_{1i}^2(N,p)$ is the noise power gain (NPG) [11],
$$G(N,p) = \frac{2(2N-1)}{N(N+1)} + \frac{12p(N-1+p)}{N(N^2-1)}. \tag{5}$$
The estimation variance is sketched in Figure 1 as a function of $N$. Here, the $1/N$ bound is obtained by simple averaging that is optimal in the sense of error variance, although with a 50% bias for a linear $x_n$. The case of $p = 0$ corresponds to the ramp UFIR filter with $h_{1n}(N,0)$ given by (4), and we notice that denoising is inherently less efficient in this case. If we set $p = -N/2$, the $(N/2)$-lag smoother estimation error variance rapidly converges to that of simple averaging. A similar effect can be observed in the one-step prediction error variance with $p = 1$. Here, a large error variance for small values of $N$ reduces to that of the ramp filter as $N$ increases. A common feature of these plots is that the error variances decrease with the reciprocal of $N$. That means that noise in UFIR estimators with large memory $N \gg 1$ may be very low and the estimate may be almost optimal. This leads to the following statement:
Figure 1. Estimation error variance $\sigma^2$ for different UFIR structures (Example 1). $1/N$ corresponds to simple averaging, $p = 0$ to the ramp filter, $p = 1$ to the one-step predictor, and $p = -N/2$ to the $(N/2)$-lag smoother.

There is no need to use optimal estimators in many applications. UFIR structures that ignore noise statistics and initial estimation error statistics are able to produce acceptable suboptimal estimates.
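The quantities in Example 1 can be checked numerically. The following is a minimal sketch, assuming NumPy; the function names are illustrative and not taken from the paper. It evaluates the impulse response (4), confirms that the gain sums to one, and compares the sum of squared coefficients with the closed-form NPG (5).

```python
import numpy as np

def ramp_ufir_response(N, p=0):
    """Impulse response h_{1i}(N, p), i = 0..N-1, of the p-shift ramp UFIR estimator (4)."""
    i = np.arange(N, dtype=float)
    return ((2 * (2 * N - 1) - 6 * i) / (N * (N + 1))
            + 6 * p * (N - 1 - 2 * i) / (N * (N**2 - 1)))

def npg_closed_form(N, p=0):
    """Closed-form noise power gain G(N, p) from (5)."""
    return 2 * (2 * N - 1) / (N * (N + 1)) + 12 * p * (N - 1 + p) / (N * (N**2 - 1))

def ramp_ufir_estimate(z, N, p=0):
    """Estimate x_{n+p} from the N most recent samples of z via the convolution (3)."""
    window = np.asarray(z, dtype=float)[::-1][:N]   # z_n, z_{n-1}, ..., z_{n-N+1}
    return float(ramp_ufir_response(N, p) @ window)

# Compare the numerical and closed-form NPG for the three cases of Figure 1
N = 20
for p in (0, 1, -N // 2):
    h = ramp_ufir_response(N, p)
    print(p, h.sum(), np.sum(h**2), npg_closed_form(N, p))
```

On a noise-free ramp the estimator reproduces the shifted signal value exactly, which is the unbiasedness condition (2) in its simplest form.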

UFIR estimators have attracted researchers' attention for decades, beginning with the work of Johnson [12] and others, who extended the Wiener filter theory to discrete finite time. Further, the ability of UFIR estimators to produce nearly optimal estimates while ignoring noise statistics was highly valued in the development of estimators for polynomial signals [13]-[15]. Most recently, UFIR methods were extended to state space in batch form [3, 4, 16]-[18] and in an iterative Kalman-like form [19, 20]. The latter has made the UFIR estimator a significant rival of the Kalman filter, and its applications can be found in [10, 21]-[24]. Even so, UFIR estimators still remain somewhat beyond the typical range of traditional signal processing techniques.

The basic operating principles of the optimal Kalman and UFIR filters are summarized in Figure 2. At time $n$, the Kalman filter requires the noise statistics at time $n-1$, such as the process and measurement noise covariance matrices $Q_{n-1}$ and $R_{n-1}$, respectively, as well as the estimation error covariance $P_{n-1}$. The optimal UFIR filter ignores these statistics. Instead, it requires the optimal averaging interval of $N_{\text{opt}}$ points in order to be optimal.

Figure 2. Basic operating diagrams of the optimal Kalman and UFIR filters.

In this paper, we develop in part the results achieved in the field of UFIR filtering and review a family of iterative UFIR algorithms for filtering, smoothing, and prediction of time-varying (TV) and time-invariant (TI) discrete state-space models in white Gaussian noise environments. The following definitions will be used: the UFIR estimator satisfies the unbiasedness condition (2), the OFIR estimator minimizes the MSE (1), and the optimal UFIR (OUFIR) estimator minimizes the MSE in the UFIR estimator by using a window size $N_{\text{opt}}$.

Section 1.2 presents the linear state-space model, formulates the problem, and considers the batch $p$-shift UFIR estimator along with the generalized NPG. Section 1.3 presents two forms of the $p$-shift iterative UFIR algorithm. Section 1.4 discusses the estimation errors of the UFIR estimators. Sections 1.5, 1.6, and 1.7 give the reader a number of practical algorithms for filtering, smoothing, and prediction. Section 1.8 considers an extension to nonlinear systems. Section 1.9 discusses methods for the determination of the optimal memory size $N_{\text{opt}}$. Finally, Section 1.10 concludes with some useful generalizations.

1.2 Linear model and batch UFIR estimator

Consider a class of discrete TV linear models represented in state space with the state and observation equations as follows:
$$x_n = F_n x_{n-1} + B_n w_n, \tag{6}$$
$$z_n = H_n x_n + v_n, \tag{7}$$
where $x_n \in \mathbb{R}^K$ and $z_n \in \mathbb{R}^M$ are the state and observation vectors, respectively. Here, $F_n \in \mathbb{R}^{K \times K}$, $B_n \in \mathbb{R}^{K \times P}$, and $H_n \in \mathbb{R}^{M \times K}$. Let us suppose that the state noise vector $w_n \in \mathbb{R}^P$ and measurement noise vector $v_n \in \mathbb{R}^M$ have zero mean white Gaussian components, $E\{w_n\} = 0$ and $E\{v_n\} = 0$. We also assume that these vectors are mutually uncorrelated, $E\{w_i v_j^T\} = 0$ for all $i$ and $j$, and have the following covariances:
$$Q_n = E\{w_n w_n^T\}, \tag{8}$$
$$R_n = E\{v_n v_n^T\}, \tag{9}$$

where $Q_n$ and $R_n$ may be unknown to the engineer.

Now suppose that the $p$-shift estimate$^{a}$ $\hat{x}_{n+p|n}$ of $x_n$ is provided at time $n+p$ with the UFIR estimator proposed in [19, 20]. We would like to modify this estimator and review engineering algorithms for different kinds of filtering, $q$-lag smoothing, and $p$-step prediction. We also wish to estimate the estimation errors and generalize the properties to facilitate a comparison with the OFIR and Kalman algorithms.

1.2.1 Time‐variant models

In convolution-based filtering (3), we suppose that measurements $z_n$ are available on a time horizon of $N$ points (memory$^{b}$), from time $m = n-N+1$ to time $n$, that the estimator is causal, and that $m \geqslant 0$. In order to find $\hat{x}_{n+p}$ in state space, the batch $p$-shift UFIR estimator [8, 20] can be applied. For TV models, the $p$-shift UFIR estimator was derived in [8], assuming that the negative shift $p$ is no smaller than $-N+1$. Below, we modify this estimator for arbitrary $p$, which is needed for one of the smoother forms.

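Let $p = -N+1$ and consider the estimate (eq. (21) of [19]) at the initial point $m$ that gives us
$$\hat{x}_m = H_{n,m}^{-1} Z_{n,m}, \tag{10}$$
where $H_{n,m}^{-1} = (H_{n,m}^T H_{n,m})^{-1} H_{n,m}^T$ is the generalized left inverse, and
$$H_{n,m} = \bar{H}_{n,m} F_{n,m}, \tag{11}$$
$$Z_{n,m} = \left[ z_n^T \;\; z_{n-1}^T \;\; \cdots \;\; z_m^T \right]^T, \tag{12}$$
$$F_{n,m} = \Big[ \underbrace{(F_{n,0}^{m+1})^T \;\; (F_{n,1}^{m+1})^T \;\; \cdots \;\; F_{m+1}^T \;\; I}_{n-m+1} \Big]^T, \tag{13}$$
$$\bar{H}_{n,m} = \operatorname{diag}\big( \underbrace{H_n \;\; H_{n-1} \;\; \cdots \;\; H_m}_{n-m+1} \big), \tag{14}$$
$$F_{r,h}^{r-g} = \prod_{i=h}^{g} F_{r-i} = F_{r-h} F_{r-h-1} \cdots F_{r-g}. \tag{15}$$

One can notice that (10) is reminiscent of the familiar OLS or BLUE, although the matrices are different.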

To provide the estimate for any $p$, we find the state transition matrix $B_{n,m}(N,p)$ by writing (10) as $\hat{x}_{n+p|n} = B_{n,m}(N,p)\,\hat{x}_{m|n}$. By combining the forward-time and backward-time solutions [26], $B_{n,m}(p) \triangleq B_{n,m}(N,p)$ becomes
$$B_{n,m}(p) = \begin{cases} F_{n+p,0}^{m+1} = \prod_{i=0}^{N-2+p} F_{n+p-i}, & p > N_1, \\ I, & p = N_1, \\ (F_{m,0}^{n+p+1})^{-1} = \prod_{i=0}^{|p|-N} F_{m-i}^{-1}, & p < N_1, \end{cases} \tag{16}$$
where $N_1 = -N+1$. The most general batch form of the $p$-shift UFIR estimator for TV models is thus
$$\hat{x}_{n+p} = A_{n,m}(p) Z_{n,m}, \tag{17a}$$
$$\hat{x}_{n+p} = B_{n,m}(p) H_{n,m}^{-1} Z_{n,m}, \tag{17b}$$

where $A_{n,m}(p)$ is the UFIR estimator gain, and $p$ can be arbitrary, $-\infty < p < \infty$. In the case of $-N+1 < p < 0$, one may also use a particular form of (17b) shown in [19], eq. (21), with $B_{n,m}(p) = F_{n+p,0}^{m+1}$.

If we now observe that the filter estimate with $p = 0$ is
$$\hat{x}_n = F_{n,0}^{m+1} H_{n,m}^{-1} Z_{n,m}, \tag{18}$$
then (17b) can alternatively be written as
$$\hat{x}_{n+p} = B_{n,m}(p) (F_{n,0}^{m+1})^{-1} \hat{x}_n. \tag{19}$$

This suggests that prediction and smoothing can be organized based on the filtering estimate (18) if we use an auxiliary p‐shift gain matrix. We will show below that (19) plays an important part in the design of UFIR algorithms.
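The batch TV estimator can be sketched in a few lines. The example below is a minimal illustration, assuming NumPy and assuming that the model matrices are stored in lists indexed by time (for $p > 0$ the transition matrices up to time $n+p$ must be known); the function names and the test model are hypothetical. It stacks the block rows of $H_{n,m}$, solves for the initial state $\hat{x}_m$ by least squares, and projects the result with the transition matrix $B_{n,m}(p)$.

```python
import numpy as np

def transition_product(F, hi, lo, K):
    """F[hi] @ F[hi-1] @ ... @ F[lo], cf. (15); the identity if hi < lo."""
    P = np.eye(K)
    for j in range(hi, lo - 1, -1):
        P = P @ F[j]
    return P

def batch_ufir_tv(F, H, z, n, N, p=0):
    """Batch p-shift UFIR estimate (17b) from z[m..n], m = n - N + 1."""
    m = n - N + 1
    if m < 0 or n + p < 0:
        raise ValueError("window or target time index is negative")
    K = F[n].shape[0]
    # Block rows of H_{n,m} in (11): H_j F_j ... F_{m+1} for j = n, n-1, ..., m
    Hnm = np.vstack([H[j] @ transition_product(F, j, m + 1, K)
                     for j in range(n, m - 1, -1)])
    Znm = np.concatenate([np.atleast_1d(z[j]) for j in range(n, m - 1, -1)])
    x_m = np.linalg.lstsq(Hnm, Znm, rcond=None)[0]      # (10) via the left inverse
    if p >= -N + 1:                                     # first two cases of (16)
        B = transition_product(F, n + p, m + 1, K)
    else:                                               # third case of (16)
        B = np.linalg.inv(transition_product(F, m, n + p + 1, K))
    return B @ x_m

# Hypothetical two-state TV model with a time-varying step, noise-free data
T = 20
F = [np.array([[1.0, 1.0 + 0.1 * j], [0.0, 1.0]]) for j in range(T)]
H = [np.array([[1.0, 0.0]]) for _ in range(T)]
x = [np.array([1.0, 0.5])]
for j in range(1, T):
    x.append(F[j] @ x[-1])
z = [H[j] @ x[j] for j in range(T)]
print(batch_ufir_tv(F, H, z, n=12, N=6, p=2), x[14])
```

With noise-free measurements the estimate coincides with the true state for filtering, prediction, and smoothing, including lags that reach before the window.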

1.2.2 Time‐invariant models

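In the special TI case, we have $B_{n,m}(p) = F^{n-m+p} = F^{N-1+p}$ and the estimator becomes
$$\hat{x}_{n+p} = A(N,p) Z_{n,m} \tag{20a}$$
$$\hat{x}_{n+p} = F^{N-1+p} \bar{H}_{N-1}^{-1} Z_{n,m}, \tag{20b}$$
where $\bar{H}_{N-1}^{-1} = (\bar{H}_{N-1}^T \bar{H}_{N-1})^{-1} \bar{H}_{N-1}^T$ and
$$\bar{H}_{N-1} = \hat{H}_{N-1} \bar{F}_{N-1}, \tag{21}$$
$$\bar{F}_{N-1} = \Big[ \underbrace{(F^{N-1})^T \;\; \cdots \;\; F^T \;\; I}_{N} \Big]^T, \tag{22}$$
$$\hat{H}_{N-1} = \operatorname{diag}\big( \underbrace{H \;\; H \;\; \cdots \;\; H}_{N} \big). \tag{23}$$
Following (19), the estimate (20b) can alternatively be written as follows:
$$\hat{x}_{n+p} = F^{p} \hat{x}_n, \tag{24}$$
where the TI filter estimate is given by
$$\hat{x}_n = F^{N-1} \bar{H}_{N-1}^{-1} Z_{n,m}. \tag{25}$$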

A distinctive feature of both TV and TI batch UFIR estimators is that they can be applied to models with noise having arbitrary distributions and covariances. They can also be represented in fast iterative Kalman‐like forms using an auxiliary matrix called the generalized NPG (GNPG), which will be discussed next.

1.2.3 Generalized noise power gain

It follows from (5) that the NPG is a measure of how much the measurement noise is suppressed at the FIR estimator output. In state space, the NPG is defined via the MSE [27], with the assumption that $B_n = 0$:
$$\begin{aligned} \bar{P}_{n+p} &= E\{(x_{n+p} - \hat{x}_{n+p})(\cdots)^T\} \\ &= E\{[x_{n+p} - A_{n,m}(p) Z_{n,m}][\cdots]^T\} \\ &= E\{[x_{n+p} - A_{n,m}(p) \bar{H}_{n,m} X_{n,m} - A_{n,m}(p) V_{n,m}][\cdots]^T\}, \end{aligned} \tag{26}$$
$$X_{n,m} = \left[ x_n^T \;\; x_{n-1}^T \;\; \cdots \;\; x_m^T \right]^T, \tag{27}$$
$$V_{n,m} = \left[ v_n^T \;\; v_{n-1}^T \;\; \cdots \;\; v_m^T \right]^T. \tag{28}$$
In view of the fact that the estimate is unbiased, the first two terms in the brackets of (26) sum to zero by (2), which gives
$$\bar{P}_{n+p} = A_{n,m} E\{V_{n,m} V_{n,m}^T\} A_{n,m}^T, \tag{29}$$
where $E\{V_{n,m} V_{n,m}^T\}$ is the measurement noise covariance on the averaging interval. A simplification follows instantly if one lets $p = 0$ and supposes that the model is one-state and time-invariant. That leads to the estimation variance as follows:
$$\sigma_{\text{est}}^2 = \sigma_v^2 A(N) A^T(N), \tag{30}$$

where the product $A(N) A^T(N)$ is known as the NPG [11].

More generally, the GNPG can thus be written as
$$G_{n,m}(p) = A_{n,m}(p) A_{n,m}^T(p) \tag{31}$$

to characterize the noise strength at the estimator output. In particular, if the GNPG is an identity matrix, then no noise reduction is provided by the estimator. If the GNPG has components that are equal to zero, then the noise is fully suppressed by the estimator.

By transforming (31) and utilizing (17b), (18), and (19), one can find two equivalent GNPG forms corresponding to TV models:
$$G_{n,m}(p) = B_{n,m}(p) (H_{n,m}^T H_{n,m})^{-1} B_{n,m}^T(p), \tag{32a}$$
$$G_{n,m}(p) = B_{n,m}(p) (F_{n,0}^{m+1})^{-1} G_{n,m} (F_{n,0}^{m+1})^{-T} B_{n,m}^T(p), \tag{32b}$$
where the GNPG $G_{n,m} = G_{n,m}(0)$ associated with filtering is given as follows:
$$G_{n,m} = F_{n,0}^{m+1} (H_{n,m}^T H_{n,m})^{-1} (F_{n,0}^{m+1})^T. \tag{33}$$
Similarly, using (20a), (24), and (25), the GNPG for TI models can be represented as follows:
$$G(N,p) = F^{N-1+p} (\bar{H}_{N-1}^T \bar{H}_{N-1})^{-1} (F^{N-1+p})^T, \tag{34a}$$
$$G(N,p) = F^{p} G(N) (F^{p})^T, \tag{34b}$$
where the GNPG $G(N) = G(N,0)$ for filtering is
$$G(N) = F^{N-1} (\bar{H}_{N-1}^T \bar{H}_{N-1})^{-1} (F^{N-1})^T. \tag{35}$$

Summarizing the generalizations provided for the batch UFIR estimator, we notice again that this estimator ignores noise statistics and initial errors in solving the problems of smoothing, filtering, and prediction in a unified scheme. Its important applied property is that the estimate becomes virtually optimal when $N \gg 1$ [20]. On the other hand, large $N$ leads to computational problems owing to the large dimensions of the augmented matrices and vectors. For fast computation, iterative Kalman-like UFIR forms can be used, which will be discussed next.
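The TI GNPG is easy to evaluate directly from its batch definition. The sketch below assumes NumPy and, purely for illustration, a two-state polynomial model with unit time step, $F = [[1, 1], [0, 1]]$ and $H = [1\;\; 0]$; the function name is hypothetical. For this model the upper-left component of the GNPG reproduces the scalar NPG of the ramp estimator in Example 1, and the two TI forms of the GNPG agree.

```python
import numpy as np
from numpy.linalg import inv, matrix_power

def gnpg_ti(F, H, N, p=0):
    """G(N, p) = F^{N-1+p} (Hbar^T Hbar)^{-1} (F^{N-1+p})^T for a TI model."""
    # Hbar_{N-1} has block rows H F^{N-1}, ..., H F, H (newest measurement first)
    Hbar = np.vstack([H @ matrix_power(F, k) for k in range(N - 1, -1, -1)])
    W = inv(Hbar.T @ Hbar)
    Fp = matrix_power(F, N - 1 + p)      # a negative exponent inverts F first
    return Fp @ W @ Fp.T

# Assumed illustrative model: two-state ramp with unit time step
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
N = 10
for p in (0, 1, -5):
    G = gnpg_ti(F, H, N, p)
    scalar_npg = 2 * (2 * N - 1) / (N * (N + 1)) + 12 * p * (N - 1 + p) / (N * (N**2 - 1))
    print(p, G[0, 0], scalar_npg)
```

The inverse of $F$ is required for negative exponents, so the sketch applies only to invertible transition matrices.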

1.3 Iterative Kalman‐like UFIR estimation

Similar to the recursive OLS [28], the UFIR estimator can also be represented in a fast iterative form similar to the Kalman filter as shown in [19, 20]. The iterative UFIR estimator requires that we start with initial values that are available from the batch algorithm, which typically requires matrix computations on the order of K×K dimensions, and then we iteratively update the estimator output. The state estimate is taken when an iterative variable reaches the current time n.

1.3.1 Time‐varying models

For TV models, the estimates (17b) and (19) suggest two forms of iterative UFIR computation.

The direct form

Following the derivations given in Appendices I and II of [20], the direct form of the iterative algorithm corresponding to (17b) is the following:
$$\hat{x}_{l+p} = F_{l+p} \hat{x}_{l+p-1} + K_l (z_l - H_l Y_l \hat{x}_{l+p-1}), \tag{36}$$
where $Y_l \triangleq Y_l(p) = \bar{Y}_l(p) F_{l+p}$ and
$$\bar{Y}_l = \begin{cases} F_{l,0}^{l+p+1} = \prod_{i=0}^{|p|-1} F_{l-i}, & p < 0, \\ I, & p = 0, \\ (F_{l+p,0}^{l+1})^{-1} = \prod_{i=0}^{p-1} F_{l+p-i}^{-1}, & p > 0. \end{cases} \tag{37}$$
The bias correction gain $K_l \triangleq K_{l,m}(p)$ is given here as
$$K_l = G_l \bar{Y}_l^T H_l^T, \tag{38}$$
where the GNPG $G_l \triangleq G_{l,m}(p)$ is computed iteratively by
$$G_l = F_{l+p} (Y_l^T H_l^T H_l Y_l + G_{l-1}^{-1})^{-1} F_{l+p}^T, \tag{39a}$$
$$G_l = [\bar{Y}_l^T H_l^T H_l \bar{Y}_l + (F_{l+p} G_{l-1} F_{l+p}^T)^{-1}]^{-1}, \tag{39b}$$
where $\bar{Y}_l = Y_l F_{l+p}^{-1}$. The initial values, $\hat{x}_{s+p}$ and $G_s$, are computed in short batch forms as
$$\hat{x}_{s+p} = B_{s,m}(p) H_{s,m}^{-1} Z_{s,m}, \tag{40}$$
$$G_s = B_{s,m}(p) (H_{s,m}^T H_{s,m})^{-1} B_{s,m}^T(p), \tag{41}$$

where $s = m+K-1$, and the iterative variable $l$ ranges from $m+K$ to $n$. The estimator output is taken when $l = n$. Since the UFIR estimate does not require initial conditions, one may approximately set (40) to zero and let (41) be the identity when $N \gg 1$. However, this simplification may not always lead to good estimates for smaller values of $N$.

The estimate at time $n+p$ appears in (36) from an iterative update beginning with time step $m+K+p-1$. Its flaw is that $|p|$ past points before the $N$-point estimator window are formally required for smoothing. This disadvantage is overcome in the two-stage form, which will be discussed next.

The two-stage form

The batch estimate (19) suggests that another iterative UFIR form can be found if we first set $p = 0$ in (36) and find the filter estimate:
$$\hat{x}_l = F_l \hat{x}_{l-1} + K_l (z_l - H_l F_l \hat{x}_{l-1}), \tag{42}$$
in which
$$K_l = G_l H_l^T, \tag{43}$$
$$G_l = [H_l^T H_l + (F_l G_{l-1} F_l^T)^{-1}]^{-1}, \tag{44}$$
and the initial values are given as follows:
$$\hat{x}_s = F_{s,0}^{m+1} H_{s,m}^{-1} Z_{s,m}, \tag{45}$$
$$G_s = F_{s,0}^{m+1} (H_{s,m}^T H_{s,m})^{-1} (F_{s,0}^{m+1})^T. \tag{46}$$

Here, s=m+K−1 and l ranges from m+K to n, as before.

Given $\hat{x}_n$ from (42) with $l = n$, the $p$-shift estimate can then be computed utilizing (19) as
$$\hat{x}_{n+p} = B_{n,m}(p) (F_{n,0}^{m+1})^{-1} \hat{x}_n. \tag{47}$$

As can be seen, this second form does not require extra data points before the filtering window. However, it requires two computational steps, unlike the direct form (36).

1.3.2 Time‐invariant models

Employing (20b) and (24), the $p$-shift estimate for TI models can also be found in two equivalent iterative forms.

The direct form

If all of the model matrices are TI, we have $Y_l = F^{1-p}$. Accordingly, (36) becomes
$$\hat{x}_{l+p} = F \hat{x}_{l+p-1} + K_l (z_l - H F^{1-p} \hat{x}_{l+p-1}) \tag{48}$$
and the bias correction gain (38) attains the form
$$K_l = G_l (F^{-p})^T H^T, \tag{49}$$
where $G_l$ is computed iteratively as
$$G_l = [(F^{-p})^T H^T H F^{-p} + (F G_{l-1} F^T)^{-1}]^{-1}. \tag{50}$$
The initial values are computed as
$$\hat{x}_{s+p} = F^{s-m+p} \bar{H}_{s-m}^{-1} Z_{s,m}, \tag{51}$$
$$G_s = F^{s-m+p} (\bar{H}_{s-m}^T \bar{H}_{s-m})^{-1} (F^{s-m+p})^T, \tag{52}$$

where $s = m+K-1$ and $l$ ranges from $m+K$ to $n$. The desired state estimate is taken at $l = n$.

The two-stage form

Provided the TI filtering estimate
$$\hat{x}_l = F \hat{x}_{l-1} + K_l (z_l - H F \hat{x}_{l-1}), \tag{53}$$
$$K_l = G_l H^T, \tag{54}$$
$$G_l = [H^T H + (F G_{l-1} F^T)^{-1}]^{-1}, \tag{55}$$
and the initial conditions
$$\hat{x}_s = F^{s-m} H_{s,m}^{-1} Z_{s,m}, \tag{56}$$
$$G_s = F^{s-m} (H_{s,m}^T H_{s,m})^{-1} (F^{s-m})^T, \tag{57}$$
where $s = m+K-1$ and the iterative variable $l$ ranges from $m+K$ to $n$, the $p$-shift estimate can alternatively be computed by (24) as follows:
$$\hat{x}_{n+p} = F^{p} \hat{x}_n. \tag{58}$$

One may conclude that the algorithm of (53) and (58) is very simple from a programming perspective. As was shown in [8, 19, 20] and in many other studies, the UFIR estimator is a strong rival to the Kalman filter if the noise covariances are not known exactly.
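Both TI iterative forms can be sketched compactly and compared against the batch estimate. The code below is a minimal illustration, assuming NumPy, an invertible $F$, and measurements stored as an array of shape (T, M); the function names and the simulated ramp data are hypothetical and not from the paper. Because the recursions are algebraically equivalent to the batch form, the three results should agree to rounding error.

```python
import numpy as np
from numpy.linalg import inv, matrix_power

def batch_init(F, H, z, m, s, p=0):
    """Short batch values on z[m..s]: (51)-(52), or (56)-(57) when p = 0."""
    Hbar = np.vstack([H @ matrix_power(F, k) for k in range(s - m, -1, -1)])
    Z = np.concatenate([z[j] for j in range(s, m - 1, -1)])
    W = inv(Hbar.T @ Hbar)
    Fp = matrix_power(F, s - m + p)
    return Fp @ W @ Hbar.T @ Z, Fp @ W @ Fp.T

def ufir_ti_two_stage(F, H, z, n, N, p=0):
    """Filter with (53)-(57), then project the estimate by F^p as in (58)."""
    m = n - N + 1
    s = m + F.shape[0] - 1
    x, G = batch_init(F, H, z, m, s)
    for l in range(s + 1, n + 1):
        G = inv(H.T @ H + inv(F @ G @ F.T))              # (55)
        x = F @ x + G @ H.T @ (z[l] - H @ F @ x)         # (53) with gain (54)
    return matrix_power(F, p) @ x

def ufir_ti_direct(F, H, z, n, N, p=0):
    """Direct p-shift iteration (48)-(52)."""
    m = n - N + 1
    s = m + F.shape[0] - 1
    Fmp = matrix_power(F, -p)                            # F^{-p}
    x, G = batch_init(F, H, z, m, s, p)
    for l in range(s + 1, n + 1):
        G = inv(Fmp.T @ H.T @ H @ Fmp + inv(F @ G @ F.T))  # (50)
        Kl = G @ Fmp.T @ H.T                               # (49)
        x = F @ x + Kl @ (z[l] - H @ F @ Fmp @ x)          # (48), Y_l = F^{1-p}
    return x

# Simulated noisy ramp (illustrative values only)
rng = np.random.default_rng(0)
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
T = 40
z = (0.3 * np.arange(T) + rng.normal(0.0, 0.5, T)).reshape(-1, 1)
for p in (-3, 0, 2):
    print(p, ufir_ti_direct(F, H, z, T - 1, 15, p), ufir_ti_two_stage(F, H, z, T - 1, 15, p))
```

The two-stage function performs the recursion once and reuses it for any shift, whereas the direct function must be rerun for each $p$.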

1.4 Estimation errors

Next, we discuss errors in UFIR estimators assuming white Gaussian noise. Given the instantaneous error
$$\varepsilon_{n+p} = x_{n+p} - \hat{x}_{n+p}, \tag{59}$$
where the estimate $\hat{x}_{n+p}$ comes from either a TV or TI model, the MSE $P_{n+p}$ at time $n+p$ can be defined as
$$P_{n+p} = E\{\varepsilon_{n+p} \varepsilon_{n+p}^T\}. \tag{60}$$

In spite of the fact that the UFIR estimator has two equivalent forms (batch and iterative), the MSE can rigorously be determined only via the batch form. Finding closed analytical solutions for the MSE via (19) and (25) implies a large mathematical burden and is still a challenging problem. On the other hand, a rigorous error computation may be unnecessary since estimation error covariances are not used in the UFIR algorithms, and so reasonable approximations can serve us well in practical applications. Such an approximation provided following [23] is given in the Appendix.

The MSE upper bound (UB) $P_{n+p}^{\text{UB}}$ can be obtained from an iterative computation of (114) for the general TV model. Equation (114) implies that process noise covariances are accumulated at each iteration. Therefore, the predicted value from (114) is a bit larger than the actual estimation error covariance for small $N$ and significantly larger for $N \gg 1$. For the same reason, the estimate of (114) also diverges as $p$ increases. The UB can thus be very useful for filtering ($p = 0$) when $N$ is not large and for smoothing with small lags. In the case of prediction, the future noise is neglected in (114) so it can serve as a tight upper bound even for very large $p$.

The MSE lower bound (LB) can be found if we take into consideration the fact that if NN opt the UFIR estimator order fits the system order. Therefore the system noise can be neglected in (114) and the LB P n + p LB can be found by iterating

$$P_{l+p}^{\text{LB}} = (I - K_l H_l \bar{Y}_l) F_{l+p} P_{l+p-1}^{\text{LB}} F_{l+p}^T (\cdots)^T + K_l R_l K_l^T \tag{61}$$
until $l$ reaches $n$. For TI models, (61) becomes
$$P_{l+p}^{\text{LB}} = (I - K_l H F^{-p}) F P_{l+p-1}^{\text{LB}} F^T (\cdots)^T + K_l R_l K_l^T. \tag{62}$$

Equations (61) and (62) correspond to the direct estimator forms of (36) and (48) respectively.

If one employs the two-stage forms (47) and (58) and defines the MSE LB for filtering ($p = 0$) using (61) as
$$P_l^{\text{LB}} = (I - K_l H_l) F_l P_{l-1}^{\text{LB}} F_l^T (\cdots)^T + K_l R_l K_l^T, \tag{63}$$
then the $p$-shift LB for TV and TI models can be computed, respectively, as follows:
$$P_{n+p}^{\text{LB}} = B_{n,m}(p) (F_{n,0}^{m+1})^{-1} P_n^{\text{LB}} (F_{n,0}^{m+1})^{-T} B_{n,m}^T(p), \tag{64}$$
$$P_{n+p}^{\text{LB}} = F^{p} P_n^{\text{LB}} (F^{p})^T, \tag{65}$$

where $P_n^{\text{LB}}$ is provided from (63) with $l = n$. Note that the LB is associated with the NPG and serves well in the three-sigma sense [27].

1.5 Filtering

Filtering is used when the state estimate is required at the current time point $n$. It can also be projected to the future (prediction), or smoothed by combining several filtering estimates from the past. By letting $p = 0$ in (36), the UFIR filtering estimate becomes
$$\hat{x}_l = F_l \hat{x}_{l-1} + K_l (z_l - H_l F_l \hat{x}_{l-1}), \tag{66}$$
where the bias correction gain is
$$K_l = G_l H_l^T \tag{67}$$
and the GNPG $G_l$ is computed iteratively by
$$G_l = [H_l^T H_l + (F_l G_{l-1} F_l^T)^{-1}]^{-1}. \tag{68}$$
The initial values for (66) and (68) are respectively specified in short batch forms as follows:
$$\hat{x}_s = F_{s,0}^{m+1} H_{s,m}^{-1} Z_{s,m}, \tag{69}$$
$$G_s = F_{s,0}^{m+1} (H_{s,m}^T H_{s,m})^{-1} (F_{s,0}^{m+1})^T, \tag{70}$$

where $s = m+K-1$, $F_{s,0}^{m+1} = \prod_{i=0}^{K-2} F_{s-i}$, the iteration index $l$ ranges from $m+K$ to $n$, and the estimate of the current state is taken when $l = n$.

The MSE UB can be found for (66) by setting $p = 0$ in (114), which gives
$$P_l^{\text{UB}} = (I - K_l H_l) P_l^{-} (I - K_l H_l)^T + K_l R_l K_l^T, \tag{71}$$
where the predicted (a priori) estimate covariance $P_l^{-} \triangleq P_{l|l-1}$ is given by
$$P_l^{-} = F_l P_{l-1}^{\text{UB}} F_l^T + B_l Q_l B_l^T \tag{72}$$

with the initial value $P_{l-1}^{\text{UB}}$ specified as for the Kalman filter.

The LB appears from (72) by neglecting $Q_l$, which gives
$$P_l^{\text{LB}} = (I - K_l H_l) F_l P_{l-1}^{\text{LB}} F_l^T (I - K_l H_l)^T + K_l R_l K_l^T, \tag{73}$$

where the initial value $P_{l-1}^{\text{LB}}$ can also be specified as in the Kalman filter.

It can easily be shown that (71) is the Kalman a posteriori estimate covariance, if we substitute $K_l$ with the Kalman gain. However, unlike the Kalman filter, (66) can be applied to deterministic models. If that is the case ($R_l = 0$ and $Q_l = 0$), then the estimation error is zero.
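The filtering bounds can be iterated alongside the GNPG recursion. The following sketch assumes NumPy and a TI two-state model chosen only for illustration; the function name, the noise values, and the choice of the initial covariance $P_0 = \sigma_v^2 G_s$ are assumptions of this example rather than prescriptions of the paper. With that starting value and $R = \sigma_v^2$, the lower bound stays equal to $\sigma_v^2 G_l$ at every step, and the upper bound exceeds it by a positive semidefinite term contributed by $Q$.

```python
import numpy as np
from numpy.linalg import inv

def ufir_mse_bounds(F, H, B, Q, R, G0, P0, steps):
    """Iterate the GNPG (68) together with the MSE bounds (71)-(73) for a TI model."""
    I = np.eye(F.shape[0])
    G, P_ub, P_lb = G0.copy(), P0.copy(), P0.copy()
    for _ in range(steps):
        G = inv(H.T @ H + inv(F @ G @ F.T))               # (68)
        K = G @ H.T                                       # (67)
        J = I - K @ H
        P_prior = F @ P_ub @ F.T + B @ Q @ B.T            # (72)
        P_ub = J @ P_prior @ J.T + K @ R @ K.T            # (71)
        P_lb = J @ F @ P_lb @ F.T @ J.T + K @ R @ K.T     # (73), Q neglected
    return G, P_ub, P_lb

# Assumed illustrative model and noise levels
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
B = np.eye(2)
Q = 1e-4 * np.eye(2)
R = np.array([[0.25]])
Hbar = np.vstack([H @ F, H])                 # short batch over the first K = 2 points
G0 = F @ inv(Hbar.T @ Hbar) @ F.T            # (70)
P0 = R[0, 0] * G0                            # assumed initial covariance
G, P_ub, P_lb = ufir_mse_bounds(F, H, B, Q, R, G0, P0, steps=18)
print(np.diag(P_lb), np.diag(P_ub))
```

Setting $Q = 0$ makes the two bounds coincide, which matches the way the LB is obtained from (72).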

Several particular filtering solutions can now be discussed, which will be done in the following sections.

1.5.1 Fixed‐horizon filtering

The fixed‐horizon (fixed memory size N) iterative UFIR filtering algorithm is summarized for TV models in Table 1.
Table 1. Fixed-horizon TV UFIR filtering algorithm

$K$, $N$, $m = n-N+1 \geqslant 0$, $s = m+K-1$,

$\hat{x}_s$ by (69) and $G_s$ by (70).

$$G_l = [H_l^T H_l + (F_l G_{l-1} F_l^T)^{-1}]^{-1},$$
$$\hat{x}_l = F_l \hat{x}_{l-1} + G_l H_l^T (z_l - H_l F_l \hat{x}_{l-1}).$$

Use the estimate when $l = n$.

It is implied that measurements are available beginning at time index 0, and the time index $n$ starts at $N-1$. The initial values $\hat{x}_s$ and $G_s$ are computed using (69) and (70), respectively. For each $n$, the iterative variable $l$ increments from $m+K$ to $n$, and the desired estimate is taken when $l = n$. Note that the estimation error computed by (71) is minimal if one sets $N = N_{\text{opt}}$. A simplification for the TI model is straightforward: one simply lets all of the matrices in Table 1 be TI.

1.5.2 Full‐horizon filtering

Full‐horizon filtering can be applied to highly stable or highly predictable systems such as those in astronomy, precise clocks and oscillators [27], and others associated with near deterministic state‐space models. The full‐horizon TV algorithm is given in Table 2.
Table 2. Full-horizon TV UFIR filtering algorithm

$K$, $n \geqslant K$.

$\hat{x}_{K-1}$ by (69) and $G_{K-1}$ by (70) for $m = 0$.

$$G_n = [H_n^T H_n + (F_n G_{n-1} F_n^T)^{-1}]^{-1},$$
$$\hat{x}_n = F_n \hat{x}_{n-1} + G_n H_n^T (z_n - H_n F_n \hat{x}_{n-1}).$$

This algorithm is the simplest. It requires only the number of states $K$, since the filter memory window size changes with time so that $N = n+1$. No additional information is needed, and the algorithm thus has extremely desirable engineering features. A natural extension of the TV algorithm (Table 2) to the TI case is provided by removing the time dependencies from the matrices.

The MSE UB and LB can be computed by (71) and (73) if we substitute l with n. Note that the full‐horizon UFIR filter may demonstrate substantial decrease in the output noise as n becomes large.
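A full-horizon filter reduces to a single loop over the data. The sketch below assumes NumPy and uses TI matrices for brevity (the TI extension of Table 2); the function name and the ramp model are illustrative assumptions. It returns the estimate and the GNPG at every step, so the decrease in output noise with growing $n$ can be read directly from the upper-left GNPG component, which for this model follows the ramp NPG with $N = n+1$.

```python
import numpy as np
from numpy.linalg import inv, matrix_power

def full_horizon_ufir_ti(F, H, z):
    """Full-horizon UFIR filter with TI matrices; returns x_n and G_n for n >= K-1."""
    K = F.shape[0]
    # Short batch start on the first K points (m = 0, s = K - 1), cf. (69)-(70)
    Hbar = np.vstack([H @ matrix_power(F, k) for k in range(K - 1, -1, -1)])
    Z = np.concatenate([z[j] for j in range(K - 1, -1, -1)])
    W = inv(Hbar.T @ Hbar)
    Fs = matrix_power(F, K - 1)
    x, G = Fs @ W @ Hbar.T @ Z, Fs @ W @ Fs.T
    xs, Gs = [x], [G]
    for n in range(K, len(z)):
        G = inv(H.T @ H + inv(F @ G @ F.T))          # GNPG update
        x = F @ x + G @ H.T @ (z[n] - H @ F @ x)     # state update
        xs.append(x)
        Gs.append(G)
    return xs, Gs

# Assumed illustrative model: noise-free ramp with unit time step
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
T = 50
z = (1.0 + 0.2 * np.arange(T)).reshape(-1, 1)
xs, Gs = full_horizon_ufir_ti(F, H, z)
print(xs[-1], Gs[-1][0, 0])
```

Since the window never slides, the cost per time step is constant, but the filter also never forgets old data, which is why the text restricts this form to highly predictable systems.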

1.5.3 Tricky‐horizon filtering

The tricky-horizon (time-variant memory size $N$) algorithm can be used in adaptive filtering [29, 30] and whenever some reference information about the process behavior is available. It implies an individual $N_{\text{opt}}$ at each time index $n$. Such flexibility allows better system tracking with minimum residuals [19]. To implement tricky-horizon filtering, the algorithm (Table 1) can be used if we allow $N$ to be variable.

1.6 Smoothing

Smoothing is commonly associated with a lag q>0 relating the estimate at a given time index to measurements up to and including some past index. By combining ‘future’ and past estimates, it becomes possible to obtain better noise reduction for many practical applications. Note that an infinity of smoother solutions exists [31]. We will discuss two basic schemes for UFIR smoothers in this section.

The direct form

By letting $q = -p > 0$ in (36), the smoothing estimate at $n-q$ can be found iteratively as follows [23]:
$$\hat{x}_{l-q} = F_{l-q} \hat{x}_{l-q-1} + K_l (z_l - H_l Y_l \hat{x}_{l-q-1}), \tag{74}$$

where $Y_l = F_{l,0}^{l-q} = \prod_{i=0}^{q} F_{l-i}$ and $l$ ranges from $m+K$ to $n$. The estimate $\hat{x}_{n-q}$ is traditionally taken at $l = n$ in each iterative cycle.

The bias correction gain can be computed here using the following:
$$K_l = \bar{Y}_l^{-1} G_l H_l^T,$$
where $\bar{Y}_l = Y_l F_{l-q}^{-1}$ and $G_l$ is given by (39b). The initial values $\hat{x}_s$ and $G_s$ can be defined at $s = m+K-1$ as, respectively,
$$\hat{x}_s = \bar{B}_{s,m}(q) H_{s,m}^{-1} Z_{s,m}, \tag{75}$$
$$G_s = \bar{B}_{s,m}(q) (H_{s,m}^T H_{s,m})^{-1} \bar{B}_{s,m}^T(q), \tag{76}$$
where $\bar{B}_{s,m}(q) \triangleq \bar{B}_{s,m}(K,q)$ is given by
$$\bar{B}_{n,m}(N,q) = \begin{cases} \prod_{i=0}^{N-2-q} F_{n-q-i}, & q < N-1, \\ I, & q = N-1, \\ \prod_{i=0}^{q-N} F_{m-i}^{-1}, & q > N-1. \end{cases} \tag{77}$$
Following (114), the MSE UB for (74) can be found to be
$$P_{l-q}^{\text{UB}} = (I - K_l H_l \bar{Y}_l) P_{l-q}^{-} (\cdots)^T + K_l R_l K_l^T + K_l H_l \tilde{Q}_l H_l^T K_l^T \tag{78}$$
where $P_{l-q}^{-}$ is given by (72) and
$$\tilde{Q}_l = \breve{Q}_l + B_l Q_l B_l^T, \tag{79}$$
$$\breve{Q}_l = \sum_{i=2}^{q} Y_l(q-i) B_{l-q-1+i} Q_{l-q-1+i} B_{l-q-1+i}^T Y_l^T(q-i). \tag{80}$$
The MSE LB is obtained by neglecting $Q_n$ as
$$P_{l-q}^{\text{LB}} = (I - K_l H_l \bar{Y}_l) F_{l-q} P_{l-q-1}^{\text{LB}} F_{l-q}^T (\cdots)^T + K_l R_l K_l^T. \tag{81}$$

The two‐stage form

Provided the filtering estimate (66), the second forms of the TV and TI UFIR smoothers follow from (19) and (24), respectively, as
$$\hat{x}_{n-q} = \bar{B}_{n,m}(q) (F_{n,0}^{m+1})^{-1} \hat{x}_n, \tag{82}$$
$$\hat{x}_{n-q} = F^{-q} \hat{x}_n, \tag{83}$$

where $\bar{B}_{n,m}(q) \triangleq \bar{B}_{n,m}(N,q)$ is given by (77).

The relevant estimation error covariance LBs become, respectively,
$$P_{n-q}^{\text{LB}} = \bar{B}_{n,m}(q) (F_{n,0}^{m+1})^{-1} P_n^{\text{LB}} (F_{n,0}^{m+1})^{-T} \bar{B}_{n,m}^T(q), \tag{84}$$
$$P_{n-q}^{\text{LB}} = F^{-q} P_n^{\text{LB}} (F^{-q})^T, \tag{85}$$

where $P_n^{\text{LB}}$ is provided by (63) at $l = n$. As in filtering, the LB can serve well here in the three-sigma sense [27].
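The second stage of the TI smoother is a single matrix operation. The sketch below assumes NumPy, an invertible $F$, and a ramp model chosen for illustration; the function names are hypothetical, and the filtering LB is taken as $P_n^{\text{LB}} = \sigma_v^2 G(N)$ with $\sigma_v^2 = 1$, which is an assumption of this example. It projects a filtering estimate and its bound $q$ steps back and shows that a lag of $N/2$ lowers the first-state variance bound from the ramp-filter value toward the $1/N$ level discussed in Example 1.

```python
import numpy as np
from numpy.linalg import inv, matrix_power

def filter_gnpg_ti(F, H, N):
    """GNPG G(N) of the TI UFIR filter, iterated from the short batch start."""
    K = F.shape[0]
    Hbar = np.vstack([H @ matrix_power(F, k) for k in range(K - 1, -1, -1)])
    Fs = matrix_power(F, K - 1)
    G = Fs @ inv(Hbar.T @ Hbar) @ Fs.T
    for _ in range(N - K):
        G = inv(H.T @ H + inv(F @ G @ F.T))
    return G

def two_stage_smooth(F, x_n, P_n, q):
    """Project a filtering estimate and its LB covariance q steps back, TI case."""
    Fq = matrix_power(F, -q)                 # F^{-q}
    return Fq @ x_n, Fq @ P_n @ Fq.T

# Assumed illustrative model and values
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
N, q = 20, 10
P_filter = 1.0 * filter_gnpg_ti(F, H, N)     # assumed LB with unit noise variance
x_smooth, P_smooth = two_stage_smooth(F, np.array([10.0, 0.5]), P_filter, q)
print(x_smooth, P_filter[0, 0], P_smooth[0, 0])
```

Only the filtering recursion depends on the data, so several lags can be evaluated from one filter run by calling the projection with different values of $q$.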

1.6.1 Fixed‐interval smoothing

Among various smoothing problems, the fixed-interval one is basic and often associated with smoothing [25, 32]-[34]. The fixed-interval UFIR smoother is intended to provide an estimate $\hat{x}_{n-q|n}$ with any lag $0 < q < M$ utilizing measurements on a fixed interval of $M$ points, from time index $n-M+1$ to $n$. Although $M$ may not be equal to $N_{\text{opt}}$, UFIR smoothing is most efficient when $M = N_{\text{opt}}$. In fact, if $M > N_{\text{opt}}$, smoothing is inefficient when $N_{\text{opt}} < q < M$, because $q$ exceeds the length of the averaging interval and smoothing virtually provides the backward prediction, which has an estimation error larger than in filtering. On the other hand, $N_{\text{opt}}$ should not be larger than $M$, because $M$ is commonly associated with an available database.

Provided $M = N_{\text{opt}}$, two traditional forms can be suggested for fixed-interval UFIR smoothing.

The direct form
This form implements (74) as listed in Table 3. A special peculiarity is that n starts at N−1+q in order for the smoother to process only nonnegative time indices. For TI models, the modification of this algorithm is straightforward.
Table 3. Direct fixed-interval TV OUFIR smoothing algorithm

$K$, $N = \text{constant}$, $q$, $m = n-N+1 \geqslant 0$, $s = m+K-1$, $m+K \leqslant l \leqslant n$.

$\hat{x}_s$ by (75) and $G_s$ by (76).

$$G_l = [H_l^T H_l + (F_l G_{l-1} F_l^T)^{-1}]^{-1},$$
$$K_l = \Big( \prod_{i=0}^{q-1} F_{l-i} \Big)^{-1} G_l H_l^T,$$
$$\hat{x}_{l-q} = F_{l-q} \hat{x}_{l-q-1} + K_l \Big( z_l - H_l \prod_{i=0}^{q} F_{l-i}\, \hat{x}_{l-q-1} \Big).$$

The algorithm is valid for any $n \geqslant N-1+q$. Use the smoothed estimate when $l = n$. The fixed interval of $M = N = N_{\text{opt}}$ points is from time index $m$ to $n$.

The two‐stage form
The two-stage form implementing (82) can be used as shown in Table 4. To apply this algorithm to TI models, one must compute $\hat{x}_{n-q} = F^{-q} \hat{x}_n$. A common feature of this algorithm is that two stages are required: first, filtering must be provided to get the estimate at time $n$; then, the obtained filter estimate is projected to time $n-q$.

Table 4. Two-stage fixed-interval TV OUFIR smoothing algorithm

$K$, $N = \text{constant}$, $q$, $m = n-N+1 \geqslant 0$, $s = m+K-1$, $m+K \leqslant l \leqslant n$.

$\hat{x}_s$ by (75) and $G_s$ by (76).

$$G_l = [H_l^T H_l + (F_l G_{l-1} F_l^T)^{-1}]^{-1},$$
$$\hat{x}_l = F_l \hat{x}_{l-1} + G_l H_l^T (z_l - H_l F_l \hat{x}_{l-1}).$$

Use $\hat{x}_n$ when $l = n$ and compute
$$\hat{x}_{n-q} = \bar{B}_{n,m}(q) (F_{n,0}^{m+1})^{-1} \hat{x}_n.$$

This algorithm is valid for any $n \geqslant N-1$. The fixed interval of $M = N = N_{\text{opt}}$ points is from time index $m$ to $n$.

1.6.2 Fixed‐lag smoothing

Fixed‐lag smoothing is commonly used for denoising if a time delay of q points is allowed [31, 32, 35, 36]. Two basic fixed‐lag algorithms can be designed based on the UFIR technique.

Fixed‐lag OUFIR smoothing

Provided $N_{\text{opt}}$, the fixed lag $q$ can be specified based on the process properties to obtain the best denoising. Intuition indicates that smoothing is best if the estimation time is the center of the observation interval. This holds true if the polynomial describing the process behavior on the observation interval is of odd degree. Otherwise, if the degree is even, denoising may be better with shorter lags as shown in Figure 8 in [17]. The fixed-lag OUFIR smoothing algorithm is listed in Table 4 if one sets $N = N_{\text{opt}}$ and $q = \text{constant}$. Its extension to the TI case can be provided by replacing the $\hat{x}_{n-q}$ equation in Table 4 with $\hat{x}_{n-q} = F^{-q} \hat{x}_n$.

Fixed‐lag full‐horizon UFIR smoothing
This approach implies that the filter window includes all the available data, but the lag is fixed. Examples can be found in highly predictable or quasi‐deterministic slow dynamics, for which the estimates at times $n$ and $n-q$ do not differ significantly in terms of bias. The relevant algorithm for TV models is listed in Table 5. Its extension to the TI case can be obtained by replacing the $\hat{x}_{n-q}$ equation in Table 5 with $\hat{x}_{n-q} = F^{-q}\hat{x}_n$.
Table 5

Fixed-lag full-horizon TV UFIR smoothing algorithm

$K$, $q$ = constant, $n \geq K$.

$\hat{x}_{K-1}$ by (75) and $G_{K-1}$ by (76) for $m=0$.

$$G_n = \left[ H_n^T H_n + (F_n G_{n-1} F_n^T)^{-1} \right]^{-1},$$

$$\hat{x}_n = F_n \hat{x}_{n-1} + G_n H_n^T (z_n - H_n F_n \hat{x}_{n-1}).$$

Compute $\hat{x}_{n-q}$ for $n \geq q$ as

$$\hat{x}_{n-q} = \bar{B}_{n,m}(q) \left( F_{n,0}^{m+1} \right)^{-1} \hat{x}_n.$$

1.6.3 Fixed‐point smoothing

Fixed‐point smoothing implies that measurements are available from time index 0 up to the current time point $n$, but the estimate is required at some fixed past point $0 \leq v < n$, where $v$ is a constant [32, 37]. The time‐varying lag is thus $q = n-v$. In such a formulation, the UFIR smoother is always full‐horizon, as shown by the TV algorithm in Table 6. By replacing the $\hat{x}_{n-q}$ equation with $\hat{x}_{n-q} = F^{-q}\hat{x}_n$, it becomes applicable to TI models. Probably the most interesting application for such algorithms is initial state estimation with $v=0$. Note that the same problem can be solved using the fixed‐interval smoother (Table 4) if we set the initial interval point to zero.
Table 6

Fixed-point TV UFIR smoothing algorithm

$K$, $v$ = constant $\geq 0$, $q = n-v$, $n \geq K$.

$\hat{x}_{K-1}$ by (75) and $G_{K-1}$ by (76) for $m=0$.

$$G_n = \left[ H_n^T H_n + (F_n G_{n-1} F_n^T)^{-1} \right]^{-1},$$

$$\hat{x}_n = F_n \hat{x}_{n-1} + G_n H_n^T (z_n - H_n F_n \hat{x}_{n-1}).$$

Compute $\hat{x}_{n-q}$ for $n > v$ as follows:

$$\hat{x}_{n-q} = \bar{B}_{n,m}(q) \left( F_{n,0}^{m+1} \right)^{-1} \hat{x}_n.$$

1.7 Prediction

State prediction plays a key role in many applications. The one‐step predictor is fundamental for digital feedback control systems [38]. It is also commonly provided when measurements are unavailable at some points [39] and when estimates of long‐term future behavior are required [40]. Predictive estimation is necessary for global positioning system (GPS)‐based applications when the GPS receiver temporarily fails or when a signal is temporarily unavailable [27]. Predictive estimation is required for holdover in digital communication networks [41], for maintaining normal functioning of certain systems during down time [42, 43], and for astronomy and climate forecasting. The predictor goal is thus to compensate for unavailable information. In many cases, linear predictors do perform better than nonlinear ones [44].

To develop UFIR prediction, two algorithms [27] can be used, as shown in Figure 3. It is supposed that the measurement at each point is either available or unavailable (×). Utilizing $N_{\mathrm{opt}}$ available points from the immediate past, the estimator provides a one‐step ahead projection (•) from each point of this interval: from point 1 to 2a, from 2 to 3a, etc. At point 4, the measurement is unavailable. Therefore, the predicted value 4a is utilized at point 4. Further predictor equations can be organized either with fixed steps or variable steps in the direct and two‐stage forms. It has been shown in [27] that the variable‐step approach is more precise in the short term, and that there is not a large difference between the estimates in the long term.
Figure 3

Basic UFIR prediction algorithms: (a) fixed‐step and (b) variable‐step, after [27].

1.7.1 Fixed‐step prediction

In the fixed‐step case shown in Figure 3a, p is often unity, but in general may be arbitrary (p>0). With p=1, prediction can permanently substitute for unavailable measurements with predicted values.

The direct form
The one‐step predictor appears from (36) by setting $p=1$:
$$\hat{x}_{l+1} = F_{l+1}\hat{x}_l + G_l F_{l+1}^{-T} H_l^T (z_l - H_l \hat{x}_l),$$
where $G_l$ is computed iteratively by
$$G_l = F_{l+1} \left( H_l^T H_l + G_{l-1}^{-1} \right)^{-1} F_{l+1}^T$$
and the initial values are determined as
$$\hat{x}_{s+1} = F_{s+1,0}^{m+1} H_{s,m}^{-1} Z_{s,m},$$
$$G_s = F_{s+1,0}^{m+1} \left( H_{s,m}^T H_{s,m} \right)^{-1} \left( F_{s+1,0}^{m+1} \right)^T,$$

where $F_{s+1,0}^{m+1} = \prod_{i=0}^{K-1} F_{s+1-i}$, $s=m+K-1$, and $l$ ranges from $m+K$ to $n$. The desired estimate is obtained when $l=n$.

For TI models, the one‐step predictor becomes:
$$\hat{x}_{l+1} = F\hat{x}_l + G_l F^{-T} H^T (z_l - H\hat{x}_l),$$
$$G_l = F \left( H^T H + G_{l-1}^{-1} \right)^{-1} F^T,$$
$$\hat{x}_{s+1} = F^K H_{s,m}^{-1} Z_{s,m},$$
$$G_s = F^K \left( H_{s,m}^T H_{s,m} \right)^{-1} \left( F^K \right)^T.$$

Both predictors can be implemented in the algorithm (Figure 3a) to satisfy the following condition: if $z_n$ is unavailable at time $n$, then set $z_n = H_n\hat{x}_n$ for a TV model and $z_n = H\hat{x}_n$ for a TI one.
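
A minimal sketch of the direct‐form TI predictor, including the substitution rule for unavailable measurements, is given below. The function name `ufir_one_step_predict`, the use of `None` to mark a missing measurement, and the least-squares start-up (written for a general, not necessarily square, stacked observation matrix) are assumptions of this example rather than part of the reviewed algorithms.

```python
import numpy as np

def ufir_one_step_predict(z, F, H, N):
    """Direct-form one-step UFIR predictor for a TI model.

    z : list of measurement vectors z_0..z_n; an entry is None when the
        measurement is unavailable (hypothetical convention).
    Returns x_hat_{n+1} computed from the last N points.
    """
    K = F.shape[0]
    n = len(z) - 1
    m = n - N + 1
    s = m + K - 1
    assert m >= 0 and N >= K
    assert all(z[i] is not None for i in range(m, s + 1))  # start-up needs data
    Finv = np.linalg.inv(F)

    # Start-up (assumption): batch least squares on z_m..z_s referred to
    # x_{s+1}; G plays the role of G_s in the text.
    C = np.vstack([H @ np.linalg.matrix_power(Finv, s + 1 - i)
                   for i in range(m, s + 1)])
    G = np.linalg.inv(C.T @ C)
    x = G @ C.T @ np.concatenate(z[m:s + 1])      # x_hat_{s+1}

    for l in range(m + K, n + 1):
        # Missing z_l is replaced by H x_hat_l, which zeroes the residual.
        z_l = H @ x if z[l] is None else z[l]
        G = F @ np.linalg.inv(H.T @ H + np.linalg.inv(G)) @ F.T
        x = F @ x + G @ Finv.T @ H.T @ (z_l - H @ x)
    return x
```

When a measurement is missing, the correction term vanishes and the step reduces to a pure projection $F\hat{x}_l$, while the gain matrix is still updated.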

The two‐stage form
By (47) and (58), the second form of the one‐step predictor for TV and TI models is, respectively,
$$\hat{x}_{n+1} = F_{n+1}\hat{x}_n,$$
$$\hat{x}_{n+1} = F\hat{x}_n,$$

where $\hat{x}_n$ is the filter estimate. This is the most widely used prediction scheme.

1.7.2 Variable‐step prediction

In the variable‐step case illustrated in Figure 3b, the predicted estimates still compensate for unavailable measurements (points 4, 5, 6), but, unlike in Figure 3a, they are not used to produce further predictions. Instead, $p$ continues to increment until a measurement becomes available. At point 7, all measured and predicted values on a horizon of $N_{\mathrm{opt}}$ past points are used to produce a prediction at point 8a. There are no other differences between fixed‐step and variable‐step prediction, and the estimates (36), (47), (48), and (58) can be used in a straightforward manner, along with the relevant error bounds.

1.8 Nonlinear models and extended filtering

For many applications, nonlinear systems can be modeled in additive white Gaussian noise environments with the state and observation equations as follows:
$$x_n = f_n(x_{n-1}) + B_n w_n,$$
$$z_n = h_n(x_n) + v_n,$$

where $f_n(x_{n-1})$ and $h_n(x_n)$ are time‐varying nonlinear vector functions and all other notations are given in (6) and (7). If $f_n(x_{n-1})$ and $h_n(x_n)$ are smooth enough, then the first‐order Taylor series expansion is often applied to make the model approximately linear between two neighboring points.

An expansion of $f_n(x_{n-1})$ around $\hat{x}_{n-1}$ and $h_n(x_n)$ around the prior estimate $\hat{x}_n^- = \hat{x}_{n|n-1}$ leads to
$$f_n(x_{n-1}) \cong f_n(\hat{x}_{n-1}) + F_n \epsilon_{n-1},$$
$$h_n(x_n) \cong h_n(\hat{x}_n^-) + H_n \epsilon_n^-,$$
where $F_n = \left.\frac{\partial f_n}{\partial x}\right|_{\hat{x}_{n-1}}$ and $H_n = \left.\frac{\partial h_n}{\partial x}\right|_{\hat{x}_n^-}$ are Jacobians and $\epsilon_n^- = x_n - \hat{x}_n^-$ is the prior estimation error. Unbiasedness implies $E\{\epsilon_n^-\}=0$, and a first‐order approximation of the expectation of $f_n(x_{n-1})$ leads to the prior estimate
$$\hat{x}_n^- = \bar{f}_n(x_{n-1}) = f_n(\hat{x}_{n-1}).$$
The expectation of the prior error can be written as $E\{\epsilon_n^-\} = E\{x_n - \hat{x}_n^-\} = E\{F_n \epsilon_{n-1} + B_n w_n\} = 0$. A first‐order approximation of the average of $h_n(x_n)$ is
$$\bar{h}_n(x_n) = h_n(\hat{x}_n^-).$$

With (96) and (97) linearized in this way, UFIR filtering can be applied as shown below.

1.8.1 Iterative EFIR filtering

Following the Kalman filter extension [45], the extended unbiased FIR (EUFIR) filter estimate is shown in [46] to be
$$\hat{x}_l = \hat{x}_l^- + K_l \left[ z_l - h_l(\hat{x}_l^-) \right],$$
where the prior estimate is $\hat{x}_n^- = f_n(\hat{x}_{n-1})$, the bias correction gain is $K_l = G_l H_l^T$, and $G_l$ is computed iteratively as
$$G_l = \left[ H_l^T H_l + (F_l G_{l-1} F_l^T)^{-1} \right]^{-1}.$$
The iterative EUFIR filtering algorithm is given in Table 7.
Table 7

EUFIR filtering algorithm for TV models

$K$, $N$, $m = n-N+1$,

$s = m+K-1$, $m+K \leq l \leq n$.

$\hat{x}_s$ by (69) and $G_s$ by (70)

$F_l$ by (100), $H_l$ by (101),

$$G_l = \left[ H_l^T H_l + (F_l G_{l-1} F_l^T)^{-1} \right]^{-1},$$

$$K_l = G_l H_l^T,$$

$$\hat{x}_l = f_l(\hat{x}_{l-1}) + K_l \left[ z_l - h_l(\hat{x}_l^-) \right].$$

Use the estimate when $l=n$.
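
One pass of the loop in Table 7 can be sketched as a single function. The name `eufir_step` and the convention of passing the model functions and their Jacobians as callables are assumptions of this illustration; the start-up values $\hat{x}_s$ and $G_s$ are taken as given.

```python
import numpy as np

def eufir_step(x_prev, G_prev, z, f, h, jac_f, jac_h):
    """One EUFIR iteration; returns (x_hat_l, G_l).

    f, h         : nonlinear state and observation functions (callables).
    jac_f, jac_h : callables returning their Jacobian matrices.
    """
    F = jac_f(x_prev)          # Jacobian of f at the previous estimate
    x_prior = f(x_prev)        # prior estimate
    H = jac_h(x_prior)         # Jacobian of h at the prior estimate
    G = np.linalg.inv(H.T @ H + np.linalg.inv(F @ G_prev @ F.T))
    K = G @ H.T                # bias correction gain
    return x_prior + K @ (z - h(x_prior)), G
```

For linear `f` and `h`, the step reduces to the linear UFIR update, which gives a simple way to check an implementation before applying it to a nonlinear model.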

As can be seen, it has the same structure as Table 1, except for the nonlinear functions in the estimate. Although the EUFIR algorithm traditionally does not use noise statistics or initial error statistics, the estimation error covariance may be required to characterize the performance. An analysis of error covariances is given in [46]. Note that, in contrast to the first‐order expansion, (98) and (99), the second‐order expansion involves noise statistics. However, as in the extended Kalman filter [28], the higher order expansion typically does not lead to a definitive advantage [46].

1.9 Memory for OUFIR estimators

The window size $N$ plays an important role in UFIR estimators. If $N<N_{\mathrm{opt}}$, denoising is inefficient: the random error dominates, although bias is negligible. If $N>N_{\mathrm{opt}}$, the random error is small, but bias affects the estimate.

Estimation of $N_{\mathrm{opt}}$ is still a challenging mathematical problem that requires finding the derivative of the estimate with respect to the convolution length $N$. Even so, there are several available approaches. For low‐degree polynomial models, $N_{\mathrm{opt}}$ was found analytically in [47]. A more general approach employing the bandlimited property of signals was developed in [20]. Finally, an efficient algorithm exploiting measurements was recently proposed in [48]. In any case, it is much simpler to estimate a scalar $N_{\mathrm{opt}}$ than to accurately estimate the matrices $Q_n$ and $R_n$, as is required in the Kalman filter.

1.9.1 Bandlimited signals

In real applications, a measured signal is causal and bandlimited with some maximum frequency $W$. By the Shannon theorem, the maximum sampling interval that prevents aliasing is $T=1/(2W)$. If measurements are obtained with sampling interval $T$, then only $N=K$ points are available for averaging by the $K$‐state FIR estimator. If we use a larger $N$, then the estimate will be biased. In order to avoid bias, we would need the model to be represented with a larger number of states, and such a model may not be acceptable or available.

Typically, measurements are provided at time steps $\tau<T$ or even $\tau \ll T$, and $N_{\mathrm{opt}}$ can be calculated as follows [20]:
$$N_{\mathrm{opt}} = \left\lfloor (2W\tau)^{-1} \right\rfloor + 1,$$

where $\lfloor x \rfloor$ means the maximum integer that is less than or equal to $x$. A simple analysis shows that if $N>N_{\mathrm{opt}}$, aliasing causes a bias. If $N<N_{\mathrm{opt}}$, noise reduction is inefficient.

1.9.2 Known reference model

If the reference model for $x_n$ is known, then the full‐horizon UFIR filter (Table 2) with window size $N=n+1$ can be applied to produce the estimate $\hat{x}_n \triangleq \hat{x}_{n|n}$ via measurements taken from time index 0 to $n$. Following (60), the $N$‐variant MSE $P_n$ can thus be defined by
$$P_n = E\{\varepsilon_n \varepsilon_n^T\},$$
where $\varepsilon_n = x_n - \hat{x}_n$. Minimizing the MSE (105) over $n$ yields $n_{\mathrm{opt}}$, and the optimal window size is $N_{\mathrm{opt}} = n_{\mathrm{opt}}+1$. In doing so, one can either minimize the trace $\operatorname{tr}(P_n)$ if $N_{\mathrm{opt}}$ needs to be applied to all of the states, or the $(kk)$th component $P_{(kk)n}$ of $P_n$ corresponding to the $k$th state, respectively,
$$N_{\mathrm{opt}} = \arg\min_n \left( \operatorname{tr} P_n \right) + 1,$$
$$N_{k\,\mathrm{opt}} = \arg\min_n \left( P_{(kk)n} \right) + 1.$$

It has been shown in [48] that, as $n$ increases, the first minimum in both (106) and (107) corresponds to $N_{\mathrm{opt}}$. The problem, however, arises when the reference model $x_n$ is unknown, as it usually is.

1.9.3 Unknown reference model

The case of an unknown model for $x_n$ is typical. In this case, we estimate $N_{\mathrm{opt}}$ via the mean square value (MSV):
$$V_n = E\{ (z_n - H\hat{x}_n)(z_n - H\hat{x}_n)^T \},$$
in which $z_n$ and $\hat{x}_n$ are both known. It has been shown in [48] that the estimate $\hat{N}_{\mathrm{opt}}$ of $N_{\mathrm{opt}}$ can be found via (108) to minimize the estimation error of all of the states or the $k$th one as, respectively,
$$\hat{N}_{\mathrm{opt}} \cong \arg\min_n \frac{\partial}{\partial n} \operatorname{tr} V_n + 1,$$
$$\hat{N}_{k\,\mathrm{opt}} \cong \arg\min_n \frac{\partial}{\partial n} V_{(kk)n} + 1,$$

where $V_{(kk)n}$ is the $(kk)$th component of $V_n$. The minimization is performed by increasing $n$, starting with $K-1$, until the first minimum. To avoid ambiguities when minimizing these functions, the number of points used in the expected value must be sufficiently large, and smoothing of the objective function may be desirable.
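
The search for the first minimum can be sketched as follows. The sketch assumes that the values of $\operatorname{tr} V_n$ have already been estimated (for example, by averaging squared residuals over many points) and stored for consecutive $n$; the function name `estimate_n_opt` and the use of a forward difference in place of the derivative are choices made for this illustration.

```python
def estimate_n_opt(tr_V, n_start):
    """Estimate N_opt from samples of tr V_n.

    tr_V[i] holds tr V_n for n = n_start + i (n_start = K - 1 in the text).
    Returns n* + 1, where n* is the first interior local minimum of the
    forward-difference approximation of d(tr V_n)/dn.
    """
    d = [tr_V[i + 1] - tr_V[i] for i in range(len(tr_V) - 1)]
    for i in range(1, len(d) - 1):
        if d[i] <= d[i - 1] and d[i] < d[i + 1]:
            return n_start + i + 1
    raise ValueError("no interior minimum found; extend the range of n")
```

With noisy estimates of $V_n$, the difference sequence can have spurious local minima, so the input would normally be smoothed before the search, as noted above.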

1.10 Some generalizations and conclusions

Based on extensive investigations provided by many authors, now we provide some generalizations; compare the trade‐off between the OUFIR, OFIR, and Kalman filters; and summarize the results in Table 8.
Table 8

Critical evaluation of the Kalman, OFIR, and OUFIR filters

Batch OFIR | Iterative OFIR | Iterative OUFIR

[7, 8, 20] | [7, 49]

Initial conditions: A priori | A posteriori | A posteriori | A posteriori

Noise statistics:

Noise characteristics:

System model:

Filter memory (points): $N_{\mathrm{opt}}$ | $N_{\mathrm{opt}}$ | $N_{\mathrm{opt}}$ | $N_{\mathrm{opt}}$

May diverge

Approximately $N_{\mathrm{opt}}$ times slower than Kalman; fast with parallel computing

Memory consumption: Approximately $N_{\mathrm{opt}}$ times more than

Computational complexity:

1.10.1 OUFIR vs. OFIR

Beginning with the early limited memory filter of Jazwinski [5], OFIR filtering has been under development for several decades. In control theory, fundamental progress was achieved by Kwon et al. and their followers [7, 35, 49]–[53]. In signal processing, solutions were found by Shmaliy et al. [8, 20, 27]. It was shown in [52] that different kinds of limited memory filters [5, 54] are equivalent to the OFIR one. The most serious flaws of this technique are high computational complexity and high memory consumption. With such poor engineering features, OFIR estimators still have not gained currency and their development remains mostly at a theoretical level.

On the other hand, OFIR estimators do not result in estimation errors that are substantially smaller than OUFIR ones, especially when $N \gg 1$. The rule of thumb here is as shown in Figure 4: the error difference between the OFIR and OUFIR estimates diminishes as $N$ increases. Note that the boundary value 10…30 in Figure 4 is flexible and depends on the model. However, recalling that FIR filters require a large order (window size $N \gg 1$) to guarantee good performance, we obtain the following conclusion:
Figure 4

Effect of the estimator window size N on the error difference between OUFIR and OFIR estimators. Threshold A indicates where the difference becomes visually indistinguishable.

Fast, low‐complexity iterative OUFIR algorithms that ignore noise statistics and initial error statistics are practically superior to the best‐known OFIR ones.

Note that this deduction often holds even if N is small. But in some applications, OFIR filters can be more appropriate because of their better accuracy.

1.10.2 OUFIR vs. Kalman filter

The well‐known features of the Kalman filter are optimality, fast computation, and low memory consumption. However, the Kalman filter requires a priori initial condition and noise statistics, and this is recognized as the most annoying flaw of the Kalman filter. Because of this requirement, the Kalman filter is suboptimal for all practical purposes. Moreover, its optimality is guaranteed only if the noise sources are white, which is not the case for many applications. Finally, the Kalman filter applies only to stochastic models.

In turn, the iterative OUFIR filter ignores noise statistics (except for the zero‐mean assumption), allows the noise to have any distribution and covariance, exhibits BIBO stability, and serves for both stochastic and deterministic models. However, it does not guarantee optimality, especially when $N_{\mathrm{opt}}$ is small. It requires $(N_{\mathrm{opt}}-1)$ times more computational time and needs about $N_{\mathrm{opt}}$ times more memory than the Kalman filter.

The Kalman filter is thus best when the noise is white and its statistics are exactly known. Otherwise, one may follow the rule of thumb sketched in Figure 5. As can be seen, it is only within a narrow range around the actual noise covariances that the OUFIR filter falls a bit short of the Kalman filter. Otherwise, the OUFIR filter demonstrates smaller errors. The Kalman filter is also the best filter under the ideal conditions. Otherwise, its error grows more rapidly than the OUFIR, meaning that the latter is more robust in real‐world applications (Figure 6).
Figure 5

Effect of errors in the noise covariances on the Kalman and OUFIR estimates. The value Δ depends on $N$ and becomes insignificant when $N_{\mathrm{opt}} \gg 1$.

Figure 6

Effect of operating conditions on the Kalman and OUFIR estimates. The value Δ depends on $N$ and becomes insignificant when $N_{\mathrm{opt}} \gg 1$.

Note that the error difference Δ between the two filters decreases with increasing $N_{\mathrm{opt}}$. These observations by diverse authors who have investigated uncertainties, different kinds of noise sources, and other irregular perturbations result in the following important inference:

Under real‐world operating conditions, and when noise statistics and initial error statistics are not known exactly, the OUFIR estimator is able to outperform the Kalman filter even if $N_{\mathrm{opt}}$ is not large.

Simulation results confirming these observations can be found in [19, 23, 46].

2 Conclusions

The UFIR algorithms discussed in this paper cover many applied problems associated with filtering, smoothing, and prediction of discrete‐time state‐space models. The most general conclusions one may arrive at by analyzing these estimators are the following: 1) UFIR algorithms are able to provide suboptimal estimates that are acceptable for many applications; 2) the optimal window size $N_{\mathrm{opt}}$ can easily be estimated experimentally; 3) the extra time required by the UFIR iterations can be alleviated with parallel computing; and 4) the extra memory required by the UFIR estimators is not a problem for modern computers. So, we conclude that UFIR algorithms are strong rivals to the Kalman filter for real‐world applications. The iterative UFIR estimator commonly outperforms the OFIR one even if $N_{\mathrm{opt}}$ is not large, and it is able to outperform the Kalman filter under real‐world operating conditions and when the noise statistics are not known exactly. That makes UFIR algorithms highly attractive for applications.

We see the following main trends in further developments of FIR estimators. Optimal FIR filtering and smoothing strongly require fast Kalman‐like algorithms similar to those developed for UFIR estimators and considered in this paper. Such algorithms are required for small $N_{\mathrm{opt}}$. In turn, iterative UFIR algorithms need further optimization and robustification in non‐Gaussian environments and under uncertainties. Special attention should also be paid to fast algorithms for the determination of $N_{\mathrm{opt}}$. Provided such modifications, one may expect new efficient FIR solutions.


a $\hat{x}_{n+p|n}$ means the estimate at time $n+p$ given measurements up to and including time $n$. Here, $p=0$ corresponds to filtering, $p>0$ corresponds to $p$‐step prediction, and $p<0$ corresponds to $q$‐lag smoothing, where $q=-p$. We simplify notation by using $\hat{x}_{n+p} \triangleq \hat{x}_{n+p|n}$.

b In different applications, the FIR estimator memory is also called the receding horizon [53], sliding window [55], averaging interval [56], etc.


The covariance upper bound for TV models

Consider the MSE $P_{l+p} = E\{\varepsilon_{l+p}\varepsilon_{l+p}^T\}$ in which we substitute the estimate $\hat{x}_{l+p}$ with (36),
$$P_{l+p} = E\left\{ \left[ F_{l+p}\varepsilon_{l+p-1} + B_{l+p}w_{l+p} - K_l (z_l - H_l Y_l \hat{x}_{l+p-1}) \right] [\ldots]^T \right\}.$$
To find an iterative computation of (111), measurement $z_l$ needs to be expressed via $x_{l+p-1}$. That can be done by combining the forward and backward solutions as follows:
$$\begin{array}{ll}
p>1,\ x_v: & z_l = H_l Y_l x_v + v_l - H_l \sum_{j=0}^{p-2} Y_l(p-j) B_{v-j} w_{v-j} \\
p=2,\ x_{l+1}: & z_l = H_l F_{l+1}^{-1} (x_{l+1} - B_{l+1} w_{l+1}) + v_l \\
p=1,\ x_l: & z_l = H_l x_l + v_l \\
p=0,\ x_{l-1}: & z_l = H_l (F_l x_{l-1} + B_l w_l) + v_l \\
p=-1,\ x_{l-2}: & z_l = H_l F_l (F_{l-1} x_{l-2} + B_{l-1} w_{l-1}) + H_l B_l w_l + v_l \\
p<0,\ x_v: & z_l = H_l Y_l x_v + v_l + H_l B_l w_l + H_l \sum_{j=1}^{|p|} Y_l(p+j) B_{v+j} w_{v+j},
\end{array}$$
where $v=l+p-1$. Then deductive reasoning gives us
$$z_l = H_l (Y_l x_{l+p-1} + M_l) + v_l,$$
where $M_l \triangleq M_l(p)$ depends on $p$ as
$$M_l = \begin{cases}
\sum_{j=1}^{|p|} Y_l(p+j) B_{l+p-1+j} w_{l+p-1+j} + B_l w_l, & p<0, \\
B_l w_l, & p=0, \\
0, & p=1, \\
-\sum_{j=0}^{p-2} Y_l(p-j) B_{l+p-1-j} w_{l+p-1-j}, & p>1.
\end{cases}$$
By (112), the MSE becomes
$$P_{l+p} = E\left\{ \left[ (F_{l+p} - K_l H_l Y_l)\varepsilon_{l+p-1} + B_{l+p}w_{l+p} - K_l (H_l M_l + v_l) \right] [\ldots]^T \right\}.$$
Taking into account that $P_{l+p-1} = E\{\varepsilon_{l+p-1}\varepsilon_{l+p-1}^T\}$ and $Y_l(p+1) = \bar{Y}_l(p)$, and analyzing products of the noise terms, leads to the following:
$$\begin{aligned}
P_{l+p} ={}& (I - K_l H_l \bar{Y}_l) F_{l+p} P_{l+p-1} F_{l+p}^T (I - K_l H_l \bar{Y}_l)^T + B_{l+p} Q_{l+p} B_{l+p}^T + K_l R_l K_l^T \\
& + K_l H_l E\{M_l M_l^T\} H_l^T K_l^T - B_{l+p} E\{w_{l+p} M_l^T\} H_l^T K_l^T - K_l H_l E\{M_l w_{l+p}^T\} B_{l+p}^T \\
={}& (I - K_l H_l \bar{Y}_l) F_{l+p} P_{l+p-1} F_{l+p}^T (I - K_l H_l \bar{Y}_l)^T + B_{l+p} Q_{l+p} B_{l+p}^T + K_l R_l K_l^T \\
& - \hat{Q}_{l+p} H_l^T K_l^T - K_l H_l \hat{Q}_{l+p}^T + K_l H_l \bar{Q}_l H_l^T K_l^T,
\end{aligned}$$
where
$$\hat{Q}_{l+p} = \begin{cases} B_{l+p} Q_{l+p} B_{l+p}^T \bar{Y}_l^T, & p \leq 0, \\ 0, & p>0, \end{cases}$$
and, in view of the fact that future noise is unknown and is commonly estimated as 0, $\bar{Q}_l \triangleq \bar{Q}_l(p)$ can be written as
$$\bar{Q}_l = \begin{cases} \sum_{j=2}^{|p|} Y_l(p+j) B_{l+p-1+j} Q_{l+p-1+j} B_{l+p-1+j}^T Y_l^T(p+j) + B_l Q_l B_l^T, & p \leq 0, \\ 0, & p>0. \end{cases}$$
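
As a sanity check of the recursion, consider the filtering case $p=0$. Since $Y_l(1)=I$ (the $p=1$ line of the measurement expansion) and $\bar{Y}_l(p)=Y_l(p+1)$, we have $\bar{Y}_l(0)=I$ and $\hat{Q}_l=\bar{Q}_l=B_l Q_l B_l^T$, so the bound should collapse to the Joseph‐type form $(I-K_lH_l)(F_lP_{l-1}F_l^T+B_lQ_lB_l^T)(I-K_lH_l)^T+K_lR_lK_l^T$. The NumPy sketch below evaluates the general expression term by term for $p=0$; the function name `ufir_mse_filter_step` and the numerical values used to exercise it are assumptions of this example.

```python
import numpy as np

def ufir_mse_filter_step(P_prev, K, F, H, B, Q, R):
    """MSE recursion of the appendix specialised to p = 0 (filtering)."""
    I = np.eye(F.shape[0])
    Qb = B @ Q @ B.T          # equals both Q_hat and Q_bar when p = 0
    A = I - K @ H             # uses Y_bar_l(0) = I
    return (A @ F @ P_prev @ F.T @ A.T + Qb + K @ R @ K.T
            - Qb @ H.T @ K.T - K @ H @ Qb.T + K @ H @ Qb @ H.T @ K.T)

# Illustrative numbers only: compare with the Joseph-type form.
F = np.array([[1.0, 0.2], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
B, Q, R = np.eye(2), np.diag([0.1, 0.05]), np.array([[0.4]])
P0 = np.array([[1.0, 0.1], [0.1, 2.0]])
K = np.array([[0.6], [0.3]])
P1 = ufir_mse_filter_step(P0, K, F, H, B, Q, R)
A = np.eye(2) - K @ H
joseph = A @ (F @ P0 @ F.T + B @ Q @ B.T) @ A.T + K @ R @ K.T
print(np.allclose(P1, joseph))
```

The gain `K` is arbitrary here; in the UFIR filter it would be $K_l = G_l H_l^T$, and the noise covariances are needed only to evaluate the error bound, not to compute the estimate.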


Authors’ Affiliations

Department of Electronics Engineering, Universidad de Guanajuato, Salamanca, Mexico
Department of Electrical and Computer Engineering, Cleveland State University, Cleveland, USA


  1. Gauss CF: Theory of the Combination of Observations Least Subject to Errors. Philadelphia: SIAM Publ; 1995. Transl. by Stewart, GW
  2. Kay SM: Fundamentals of Statistical Signal Processing. New York: Prentice Hall; 2001.
  3. Kim PS, Lee ME: A new FIR filter for state estimation and its applications. J. Comput. Sc. Techn 2007, 22: 779-784. 10.1007/s11390-007-9085-8
  4. Shmaliy YS: An unbiased FIR filter for TIE model of a local clock in applications to GPS‐based timekeeping. IEEE Trans. Ultrason., Ferroel. Freq. Contr 2006, 53: 862-870.
  5. Jazwinski AH: Stochastic Processes and Filtering Theory. New York: Academic Press; 1970.
  6. Kalman RE: A new approach to linear filtering and prediction problems. J. Basic Eng 1960, 82: 35-45. 10.1115/1.3662552
  7. Kwon OK, Kwon WH, Lee KS: FIR filters and recursive forms for discrete‐time state‐space models. Automatica 1989, 25: 715-728. 10.1016/0005-1098(89)90027-7
  8. Shmaliy YS, Ibarra‐Manzano O: Time‐variant linear optimal finite impulse response estimator for discrete‐time state‐space models. Int. J. Adapt. Contr. Signal Process 2012, 26: 95-104. 10.1002/acs.1274
  9. Gibbs B: Advanced Kalman Filtering, Least‐Squares and Modeling. New York: Wiley; 2011.
  10. Ferrari P, Flammini A, Rinaldi S, Bondavalli A, Brancati F: Experimental characterization of uncertainty sources in a software‐only synchronization system. IEEE Trans. Instrum. Meas 2012, 61: 1512-1521.
  11. Blum M: On the mean square noise power of an optimum linear discrete filter operating on polynomial plus white noise input. IRE Trans. Inform. Theory 1957, 3: 225-231. 10.1109/TIT.1957.1057423
  12. Johnson KR: Optimum, linear, discrete filtering of signals containing a nonrandom component. IRE Trans. Inform. Theory 1956, 2: 49-55. 10.1109/TIT.1956.1056784
  13. Heinonen P, Neuvo Y: FIR‐median hybrid filters with predictive FIR structures. IEEE Trans. Acoust. Speech Signal, Process 1988, 36: 892-899. 10.1109/29.1600
  14. Ovaska SJ, Vainio O, Laakso TI: Design of predictive IIR filters via feedback extension of FIR forward predictors. IEEE Trans. Instrum. Meas 1997, 46: 1196-1201. 10.1109/19.676741
  15. Samadi S, Nishihara A: Explicit formula for predictive FIR filters and differentiators using Hahn orthogonal polynomials. IEICE Trans. Fundamentals 2007, E90‐A: 1511-1518.
  16. Kim PS: An alternative FIR filter for state estimation in discrete‐time systems. Digital Signal Process 2010, 20: 935-943. 10.1016/j.dsp.2009.10.033
  17. Shmaliy YS, Morales‐Mendoza LJ: FIR smoothing of discrete‐time polynomial signals in state space. IEEE Trans. Signal Process 2010, 58: 2544-2555.
  18. Shmaliy YS: Unbiased FIR filtering of discrete‐time polynomial state‐space models. IEEE Trans. Signal Process 2009, 57: 1241-1249.
  19. Shmaliy YS: An iterative Kalman‐like algorithm ignoring noise and initial conditions. IEEE Trans. Signal Process 2011, 59: 2465-2473.
  20. Shmaliy YS: Linear optimal FIR estimation of discrete time‐invariant state‐space models. IEEE Trans. Signal Process 2010, 58: 3086-3096.
  21. Balbuena DH, Sergiyenko O, Tyrsa V, Burtseva L, Lopez MR: Signal frequency measurement by rational approximations. Measurement 2009, 42: 136-144. 10.1016/j.measurement.2008.04.009
  22. Levine J: Invited Review Article: The statistical modeling of atomic clocks and the design of time scales. Rev. Sc. Instr 2012, 83: 021101-1–021101‐28. 10.1063/1.3681448
  23. Simon D, Shmaliy YS: Unified forms for Kalman and finite impulse response filtering and smoothing. Automatica 2013, 49: 1892-1899. 10.1016/j.automatica.2013.02.026
  24. Wang WQ: Phase noise suppression in GPS‐disciplined frequency synchronization systems. Fluctuation Noise L 2011, 10: 303-313. 10.1142/S0219477511000582
  25. Rauch HE, Tung F, Striebel CT: Maximum likelihood estimates of linear dynamic systems. AIAA J 1965, 3: 1445-1450. 10.2514/3.3166
  26. Stark H, Woods JW: Probability, Random Processes, and Estimation Theory for Engineers. Upper Saddle River: Prentice Hall; 1994.
  27. Shmaliy YS: GPS‐based Optimal FIR Filtering of Clock Models. New York: Nova Science Publ.; 2009.
  28. Simon D: Optimal State Estimation: Kalman, H∞ and Nonlinear Approaches. New York: Wiley; 2006.
  29. Gustafsson F: Adaptive Filtering and Change Detection. New York: Wiley; 2000.
  30. Haykin S: Lessons on adaptive systems for signal processing, communications, and control. IEEE Signal Process. Mag 1999, 16: 39-48.
  31. Moore JB: Discrete‐time fixed lag smoothing algorithms. Automatica 1973, 9: 163-173. 10.1016/0005-1098(73)90071-X
  32. Bar‐Shalom Y, Li XR, Kirubarajan T: Estimation with Applications to Tracking and Navigation. New York: Wiley; 2001.
  33. Helmick RE, Blair WD, Hoffman SA: Fixed‐interval smoothing for Markovian switching systems. IEEE Trans. Inform. Theory 1995, 41: 1845-1855. 10.1109/18.476310
  34. Kim JH, Lyou J: Target state estimator design using FIR filter and smoother. Trans. Control, Automation, and Syst. Eng 2002, 4: 305-310.
  35. Ahn CK, Kim PS: Fixed‐lag maximum likelihood FIR smoother for state‐space models. IEICE Electonics Express 2008, 5: 11-16. 10.1587/elex.5.11
  36. Biswas KK, Mahalanabis AK: An approach to fixed‐lag smoothing problems. IEEE Trans. Aerospace Electron. Syst 1972, AES‐8: 676-682.
  37. Theodor Y, Shaked U: Game theory approach to H∞‐optimal discrete‐time fixed‐point and fixed‐lag smoothing. IEEE Trans. Autom. Contr 1994, 39: 1944-1948. 10.1109/9.317131
  38. Vaccaro RJ: Digital Control: A State‐Space Approach. New York: McGraw‐Hill; 1995.
  39. Carlin BP, Polson NG, Stoffer DS: A Monte‐Carlo approach to nonnormal and nonlinear state‐space modeling. J. Am. Statist. Assoc 1992, 87: 493-500. 10.1080/01621459.1992.10475231
  40. Lamothe M, Auclair M, Hamzaoui C, Huot S: Towards a prediction of long‐term anomalous fading of feldspar IRSL. Radiat. Meas 2003, 37: 493-498. 10.1016/S1350-4487(03)00016-7
  41. ITU: Timing Characteristics of Primary Reference Clocks. : ITU‐T Recommendation G.811; 1997.
  42. Levine J: Time synchronization over the Internet using an adaptive frequency‐locked loop. IEEE Trans. Ultrason., Ferroelect., Freq. Contr 1999, 46: 888-896.
  43. Iwata T, Imae M, Suzuyama T, Murakami H, Kawasaki Y: Simulation and ground experiments of remote synchronization system for on‐board crystal oscillator of Quazi‐Zenith Satelite System. J. Inst. Navig 2006, 53: 231-235.
  44. Makhoul J: Linear prediction: A tutorial review. Proc. IEEE 1975, 63: 561-580.
  45. Cox H: On the estimation of state variables and parameters for noisy dynamic systems. IEEE Trans. Autom. Contr 1964, 9: 5-12. 10.1109/TAC.1964.1105635
  46. Shmaliy Y S: Suboptimal FIR filtering of nonlinear models in additive white Gaussian noise. IEEE Trans. Signal Process 2012, 60: 5519-5527.
  47. Shmaliy YS, Muñoz‐Diaz J, Arceo‐Miquel L: Optimal horizons for a one‐parameter family of unbiased FIR filters. Digital Signal Process 2008, 18: 739-750. 10.1016/j.dsp.2007.10.002
  48. Ramirez‐Echeverria F, Sarr A, Shmaliy YS: Optimal memory for discrete‐time FIR filters in state‐space. IEEE Trans. Signal Process 2013. in press
  49. Han SH, Kwon WH, Kim PS: Quasi‐deadbeat minimax filters for deterministic state‐space models. IEEE Trans. Automat. Contr 2002, 47: 1904-1908.
  50. Ahn CK: Strictly passive FIR filtering for state‐space models with external disturbance. Int. J. Electron. Commun 2012, 66: 944-948. 10.1016/j.aeue.2012.04.002
  51. Ahn CK, Han SH: New H∞ FIR smoother for linear discrete‐time state‐space models. IEICE Trans. Commun 2010, E91.B: 896-899.
  52. Kwon WH, Suh YS, Lee YI, Kwon OK: Equivalence of finite memory filters. IEEE Trans. Aerosp. Electron. Syst 1994, 30: 968-972. 10.1109/7.303774
  53. Han S, Kwon W H: Receding Horizon, Control: Model Predictive Control for State Models. London: Springer; 2005.
  54. Schweppe FC: Uncertain Dynamic Systems. New York: Prentice‐Hall; 1973.
  55. Papadias CB, Slock DTM: Normalized sliding window constant modulus and decision‐directed algorithm: a link between blind equalization and classical adaptive filtering. IEEE Trans. Signal Process 1997, 45: 231-235. 10.1109/78.552221
  56. Treicher J, Larimore M: New processing technique based on the constant modulus adaptive algorithm. IEEE Trans. Acoust., Speech, Signal Process 1985, 33: 420-431. 10.1109/TASSP.1985.1164567


© Shmaliy and Simon; licensee Springer. 2013

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.