
Family of state space least mean power of two-based algorithms

Abstract

In this work, a novel family of state space adaptive algorithms is introduced. The proposed family is derived using a stochastic gradient approach with a generalized least mean cost function \(J[k]=E[\|\boldsymbol{\varepsilon}[k]\|^{2L}]\) for any positive integer L. Because this generalized cost function has power 2L, different values of L yield the whole family of power-of-two-based algorithms. The novelty of the work lies in the fact that such a cost function has never been used in the framework of a state space model. It is well known that knowledge of the state space model improves the estimation of the state parameters of a system. Hence, by employing the state space model with a generalized cost function, we provide an efficient way to estimate the state parameters. The proposed family of algorithms has a simple structure owing to the use of the stochastic gradient approach, in contrast to other model-based algorithms such as the Kalman filter and its variants. This is supported by a comparison of the computational complexities of these algorithms. More specifically, the proposed family of algorithms has a far lower computational complexity than the Kalman filter. The stability of the proposed family is analysed through a convergence analysis. Extensive simulations are presented to justify the approach and to compare the performance of the proposed family of algorithms with that of the Kalman filter.

Introduction

Over the past few decades, adaptive filters have gained wide recognition in applications across many fields. Adaptive filtering is an important part of statistical signal processing, and adaptive filters have been successfully applied to problems such as equalization, noise cancellation, linear prediction, and system identification [1,2]. Adaptive filters are preferred over conventional filters because of their accuracy, which stems from their ability to adapt to the problem domain in which they are used. Adaptive filters automatically adjust their weights according to an adaptive algorithm, which is usually based on minimizing a function of the difference between the desired signal and the observed signal [1,2]. The most widely used algorithms for adaptive filters are the least mean squares (LMS) algorithm [1,2] and the recursive least squares (RLS) algorithm [1,2].

It is observed that adaptive filters designed by incorporating knowledge of the state space (SS) model of the system perform better than those designed without it. Among the extensive literature on adaptive filtering, there are many algorithms that deal with the SS model. One example is the well-known Kalman filter (KF) [3], which gives the linear optimal solution in the minimum mean square error (MMSE) sense while utilizing the system model. It estimates optimally on the basis of observations that are subject to noise and other disturbances. In the nonlinear filtering domain, there are the extended Kalman filter (EKF) [3], unscented Kalman filter (UKF) [4], cubature Kalman filter (CKF) [5], quadrature Kalman filter (QKF) [6], and many other variants of the KF [3]. These provide suboptimal solutions to the filtering problem. Although numerous techniques exist in the literature for estimating state parameters using a state space model, most of them are either computationally complex or less accurate in estimating the state parameters. Recently, SS versions of the LMS and RLS algorithms were developed in [7-10] with the aim of providing an alternative to the highly complex KF techniques. This is the motivation for our investigation.

It is found in the literature that all of the state space-based adaptive filtering estimation algorithms can be formulated using the generalized form [3]:

$$ \boldsymbol{\hat{x}}[k]=\boldsymbol{\bar{x}}[k]+\boldsymbol{K}[k]\boldsymbol{\varepsilon}[k] $$
((1))

where:

$$ \boldsymbol{\varepsilon}[k]=\boldsymbol{y}[k]-\boldsymbol{\bar{y}}[k] $$
((2))

Equation 1 forms the basis of most state-space-based estimation algorithms [3-6]. The usual practice for deriving the gain K[k] is to employ the least squares solution [1]. Usually the square of the error is minimized; however, non-mean-square error criteria have also been studied [11]. In this paper, we propose to minimize a more general cost function given by \(J=E[\|\boldsymbol{\varepsilon}\|^{2L}]\), where \(\|\cdot\|\) denotes the Euclidean norm and L is a positive integer, L=1,2,3,..., with L=1 corresponding to the basic state space least mean square (SSLMS) algorithm. Our main contribution in this work is to develop a family of adaptive algorithms which has a much lower computational cost than the existing state space model-based adaptive algorithms [3-6]. We provide a detailed comparison of computational cost to support this argument in Section 6.

The paper is organized as follows. Section 2 introduces the state space model. In Section 3, the proposed general SSLM algorithm is derived. Section 4 presents the convergence analysis, followed by simulation results and a comparison of the different algorithms in Section 5. Section 6 presents the computational complexity of the algorithms, and finally, we conclude the paper in Section 7.

State space model

We begin by defining the general state space model of a linear time varying system.

$$\begin{array}{*{20}l} \textbf{x}[k+1]&=\textbf{A}[k]\textbf{x}[k]+\textbf{B}[k]\textbf{u}[k]+\textbf{w}[k], \end{array} $$
((3a))
$$\begin{array}{*{20}l} \textbf{y}[k]&=\textbf{C}[k]\textbf{x}[k]+\textbf{D}[k]\textbf{u}[k]+\textbf{v}[k] \end{array} $$
((3b))

where \(\textbf{x}\in\mathbb{R}^{n}\) are the process states and \(\textbf{y}\in\mathbb{R}^{m}\) are the measured outputs, such that \(m\leq n\). A[k] is the state transition matrix, B[k] is the input matrix, \(\textbf{u}[k]\in\mathbb{R}^{p}\) is the input vector, \(\textbf{w}\in\mathbb{R}^{n}\) is the process noise vector, and \(\textbf{v}\in\mathbb{R}^{m}\) is the measurement noise vector. C[k] is the output matrix with dim[C[k]]=m×n, and D[k] is the feedthrough matrix with dim[D[k]]=m×p. It is assumed that the above system is observable. A special case is the unforced (autonomous) linear time varying system, represented as:

$$\begin{array}{*{20}l} \textbf{x}[k+1]&=\textbf{A}[k]\textbf{x}[k]+\textbf{w}[k], \end{array} $$
((4a))
$$\begin{array}{*{20}l} \textbf{y}[k]&=\textbf{C}[k]\textbf{x}[k]+\textbf{v}[k] \end{array} $$
((4b))
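As a rough illustration of Equation 4, the autonomous model can be simulated in a few lines. This is only a sketch under stated assumptions: the function name `simulate_autonomous`, the constant matrices A and C, the Gaussian noise, and the small 2-state rotation example are all hypothetical choices made here, not part of the original work.

```python
import numpy as np

def simulate_autonomous(A, C, x0, sigma_w, sigma_v, steps, rng):
    """Simulate x[k+1] = A x[k] + w[k], y[k] = C x[k] + v[k] (Equation 4).

    A and C are held constant here for brevity; the model in the text
    allows time-varying A[k] and C[k]. Noise is assumed white Gaussian.
    """
    n, m = A.shape[0], C.shape[0]
    xs = np.zeros((steps, n))
    ys = np.zeros((steps, m))
    x = np.asarray(x0, dtype=float)
    for k in range(steps):
        xs[k] = x
        ys[k] = C @ x + sigma_v * rng.standard_normal(m)  # Eq. 4b
        x = A @ x + sigma_w * rng.standard_normal(n)      # Eq. 4a
    return xs, ys

# Hypothetical example: a slow rotation observed through its first state,
# run without noise so that the trajectory is fully deterministic.
theta = 0.05
A = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
C = np.array([[1.0, 0.0]])
xs, ys = simulate_autonomous(A, C, [0.1, 0.1], 0.0, 0.0, 5,
                             np.random.default_rng(0))
```

With both noise levels set to zero, the stored states follow x[k] = A^k x[0] and the output is simply the first state, which gives a quick sanity check before noise is switched on.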

The state space representation for a nonlinear continuous time system is:

$$\begin{array}{*{20}l} \boldsymbol{\dot{x}}&=\mathit{\boldsymbol{f}}(\boldsymbol{x},\boldsymbol{u},\boldsymbol{w}), \end{array} $$
((5a))
$$\begin{array}{*{20}l} \boldsymbol{y}&=\boldsymbol{h}(\boldsymbol{x},\boldsymbol{u},\boldsymbol{v}) \end{array} $$
((5b))

where f and h are nonlinear functions and the parameters are as defined before.

Derivation of the proposed general SSLM algorithm

Considering the system described by Equation 4 above, a model-based adaptive estimation process can be divided into the following two steps. Step 1 is the time update which is given by:

$$ \boldsymbol{\bar{x}}[k]=\boldsymbol{A}[k-1]\boldsymbol{\hat{x}}[k-1] $$
((6))

Step 2 is the measurement update which is given by:

$$ \boldsymbol{\hat{x}}[k]=\boldsymbol{\bar{x}}[k]+\boldsymbol{K}[k]{\boldsymbol{\varepsilon}}[k] $$
((7))

where ε[k] is the prediction error defined as:

$$ \boldsymbol{\varepsilon}[k] = \boldsymbol{y}[k] - \boldsymbol{\bar{y}}[k] $$
((8))

Here, y[k] is as given in Equation 4, K[k] is the gain matrix, and:

$$ \boldsymbol{\bar{y}}[k] = \boldsymbol{C}[k] \boldsymbol{\bar{x}}[k] $$
((9))

Equations 6 to 9 constitute the basic structure employed in all KF techniques [3-6]. From Equations 6, 8, and 9, Equation 7 can be written as:

$$ \boldsymbol{\hat{x}}[k] = [\boldsymbol{I}-\boldsymbol{K}[k]\boldsymbol{C}[k]]\boldsymbol{A}[k-1]\boldsymbol{\hat{x}}[k-1] + \boldsymbol{K}[k]\boldsymbol{y}[k] $$
((10))

The measurement update of the class of adaptive filter governed by Equation 7 can be set up in a more generalized form as follows:

$$ \boldsymbol{\hat{x}}[k] = \boldsymbol{\bar{x}}[k] - \mu\nabla\boldsymbol{J}[k] $$
((11))

where J[k] is the cost function to be minimized, \(\nabla\textbf{J}[k]\) is the gradient of the cost function with respect to the predicted states, and μ is the step size parameter. To derive the proposed SSLM algorithms, we start by defining the general cost function as:

$$ \textbf{J}[k] = \textbf{E}\left[\|\boldsymbol{\varepsilon}[k]\|^{2L}\right]~\text{for}~L = 1,2,3,... $$
((12))

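Differentiating the cost function J[k] with respect to the predicted states (\(\bar {\boldsymbol {x}}[k]\)), with the expectation replaced by its instantaneous value as in the stochastic gradient approach, results in: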

$$ \nabla\textbf{J}[k] = -2L\|\boldsymbol{\varepsilon}[k]\|^{2L-2}\textbf{C}^{T}[k]\boldsymbol{\varepsilon}[k] $$
((13))
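The intermediate steps behind Equation 13 can be sketched as follows. This worked derivation is an addition for illustration; it assumes the expectation in Equation 12 is dropped in favour of the instantaneous value, and it uses \(\partial\boldsymbol{\varepsilon}[k]/\partial\bar{\boldsymbol{x}}[k] = -\textbf{C}[k]\), which follows from Equations 8 and 9:

$$\begin{aligned} \nabla\|\boldsymbol{\varepsilon}[k]\|^{2L} &= \nabla\left(\boldsymbol{\varepsilon}^{T}[k]\boldsymbol{\varepsilon}[k]\right)^{L}\\ &= L\left(\boldsymbol{\varepsilon}^{T}[k]\boldsymbol{\varepsilon}[k]\right)^{L-1}\cdot 2\left(\frac{\partial\boldsymbol{\varepsilon}[k]}{\partial\bar{\boldsymbol{x}}[k]}\right)^{T}\boldsymbol{\varepsilon}[k]\\ &= -2L\|\boldsymbol{\varepsilon}[k]\|^{2L-2}\textbf{C}^{T}[k]\boldsymbol{\varepsilon}[k] \end{aligned}$$

For L=1 this reduces to \(-2\textbf{C}^{T}[k]\boldsymbol{\varepsilon}[k]\), and for L=2 it becomes \(-4\|\boldsymbol{\varepsilon}[k]\|^{2}\textbf{C}^{T}[k]\boldsymbol{\varepsilon}[k]\).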

Substituting Equation 13 into Equation 11, absorbing the constant factor 2L into the step size μ, and introducing a matrix G, the update can be written as:

$$ \boldsymbol{\hat{x}}[k] = \boldsymbol{\bar{x}}[k] + \mu\|\boldsymbol{\varepsilon}[k]\|^{2L-2}\boldsymbol{G}\boldsymbol{C}^{T}[k]\boldsymbol{\varepsilon}[k] $$
((14))

which is our general estimator algorithm. The matrix G is imposed to satisfy the controllability condition [7], which is required by the dynamics of the system to which the algorithm is applied. Comparing Equations 7 and 14 yields:

$$ \textbf{K}[k] = \mu\|\boldsymbol{\varepsilon}[k]\|^{2L-2}\textbf{G}\textbf{C}^{T}[k] $$
((15))

It should be noted that for L=1, the algorithm results in the basic SSLMS [7].
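One iteration of the recursion defined by Equations 6, 9, 8, 15, and 7 might look like the sketch below. The function name `sslm_step`, the argument order, and the small numerical example are hypothetical and chosen here only for illustration.

```python
import numpy as np

def sslm_step(x_hat_prev, y, A_prev, C, G, mu, L=1):
    """One SSLM iteration; L=1 gives the basic SSLMS update."""
    x_bar = A_prev @ x_hat_prev                         # Eq. 6: time update
    y_bar = C @ x_bar                                   # Eq. 9: predicted output
    eps = y - y_bar                                     # Eq. 8: prediction error
    K = mu * np.dot(eps, eps) ** (L - 1) * (G @ C.T)    # Eq. 15: gain
    x_hat = x_bar + K @ eps                             # Eq. 7: measurement update
    return x_hat, eps, K

# Hypothetical 2-state, 1-output system used only to exercise the function.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
G = np.eye(2)

x_hat, eps, K = sslm_step(np.zeros(2), np.array([1.0]), A, C, G, mu=0.5, L=1)
# x_hat is [0.5, 0.0]: the unit prediction error is scaled by mu * G C^T.
```

The only difference between members of the family is the scalar factor \(\|\boldsymbol{\varepsilon}[k]\|^{2L-2}\) in the gain, so a larger L weights large prediction errors more heavily and small ones less.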

Convergence in the mean analysis

Before proceeding to the convergence analysis of the proposed algorithm, we state the following assumptions.

  1. The noise vectors w[k] and v[k] are zero-mean white processes with covariance matrices \(\textbf {Q}_{w}={\sigma ^{2}_{w}}\textbf {I}\) and \(\textbf {Q}_{v} = {\sigma ^{2}_{v}}\textbf {I}\), respectively. Moreover, they are independent of the input and state variables of the system.

  2. The system matrices A[k] and C[k] are independent of the state variables to be estimated. Hence, they can be treated as deterministic variables.

  3. The filter’s length is long enough to apply the law of long adaptive filters [2].

The first assumption is well known and also holds in practice. The second assumption is true for linear separable systems but not for nonlinear and inseparable systems. However, we employ it to make the analysis tractable; moreover, it is valid in most practical scenarios. The third assumption is also well known in the literature and is often used in the analysis of adaptive filters [2].

Consider the general SSLM algorithm given in Equation 14, which is reproduced here for convenience:

$$ \boldsymbol{\hat{x}}[k] = \boldsymbol{\bar{x}}[k] + \mu\|\boldsymbol{\varepsilon}[k]\|^{2L-2}\boldsymbol{G}\boldsymbol{C}^{T}[k]\boldsymbol{\varepsilon}[k] $$
((16))

After substituting the expression for predicted state from Equation 6 in the above equation and using the definition of vector norm, we can rewrite Equation 16 as:

$$\begin{array}{*{20}l} \boldsymbol{\hat{x}}[k] &= \boldsymbol{A}[k-1]\boldsymbol{\hat{x}}[k-1] + \mu(\boldsymbol{\varepsilon}^{T}[k]\boldsymbol{\varepsilon}[k])^{L-1}\boldsymbol{G}\boldsymbol{C}^{T}[k]\boldsymbol{\varepsilon}[k] \end{array} $$
((17a))
$$\begin{array}{*{20}l} &= \boldsymbol{A}[k-1]\boldsymbol{\hat{x}}[k-1] + \mu((\boldsymbol{y}[k]-\boldsymbol{\bar y}[k])^{T}(\boldsymbol{y}[k]-\boldsymbol{\bar{y}}[k]))^{L-1}\\ &~~~~\times\boldsymbol{G}\boldsymbol{C}^{T}[k](\boldsymbol{y}[k]-\boldsymbol{\bar{y}}[k]) \end{array} $$
((17b))

To proceed further, we express the prediction error ε[k] in terms of the actual states and the estimated states. To do so, we substitute the values of y[k] from Equation 4 and \(\boldsymbol {\bar y}[k]\) from Equation 9 in the expression of ε[k] to obtain:

$$\begin{array}{*{20}l} \boldsymbol{y}[k]-\boldsymbol{\bar{y}}[k] &= \boldsymbol{C}[k](\boldsymbol{x}[k]-\boldsymbol{\bar{x}}[k])+\boldsymbol{v}[k] \end{array} $$
((18a))
$$\begin{array}{*{20}l} &= \boldsymbol{C}[k](\boldsymbol{A}[k-1]\boldsymbol{x}[k-1]+\boldsymbol{w}[k-1]\\ &\quad-\boldsymbol{A}[k-1]\boldsymbol{\hat{x}}[k-1])+\boldsymbol{v}[k] \end{array} $$
((18b))
$$\begin{array}{*{20}l} &=\boldsymbol{C}[k]\boldsymbol{A}[k-1](\boldsymbol{x}[k-1]-\boldsymbol{\hat{x}}[k-1])\\ &\quad+\boldsymbol{C}[k]\boldsymbol{w}[k-1]+\boldsymbol{v}[k] \end{array} $$
((18c))

Defining the state estimation error vector \(\boldsymbol {\tilde {x}}[k]\) as:

$$ \boldsymbol{\tilde{x}}[k] = \boldsymbol{\hat{x}}[k] - \boldsymbol{x}[k] $$
((19))

enables Equation 18 to be rewritten as:

$$\begin{array}{*{20}l} \boldsymbol{y}[k]-\boldsymbol{\bar{y}}[k] &= -\boldsymbol{C}[k]\boldsymbol{A}[k-1]\boldsymbol{\tilde{x}}[k-1]\\ &\quad+\boldsymbol{C}[k]\boldsymbol{w}[k-1]+\boldsymbol{v}[k] \end{array} $$
((20))

Now, the substitution of the above result in Equation 17 and subtraction of x[k] from both sides results in:

$$ {\small\begin{aligned} \boldsymbol{\hat{x}}[k]-\boldsymbol{x}[k] &= \boldsymbol{A}[k-1]\boldsymbol{\hat{x}}[k-1] - \boldsymbol{x}[k]\\ &\quad+ \mu[(-\boldsymbol{C}[k]\boldsymbol{A}[k-1]\boldsymbol{\tilde{x}}[k-1]\\ &\quad+\boldsymbol{C}[k]\boldsymbol{w}[k-1]+\boldsymbol{v}[k])^{T}(-\boldsymbol{C}[k]\\ &\quad\times\boldsymbol{A}[k-1]\boldsymbol{\tilde{x}}[k-1]+\boldsymbol{C}[k]\boldsymbol{w}[k-1]\\ &\quad+\boldsymbol{v}[\!k])]^{L-1}\boldsymbol{G}\boldsymbol{C}^{T}[\!k](-\boldsymbol{C}[\!k]\boldsymbol{A}[k\,-\,1]\boldsymbol{\tilde x}[\!k\,-\,1]\\ &\quad+\boldsymbol{C}[k]\boldsymbol{w}[k-1]+\boldsymbol{v}[k])\\[-14pt] \end{aligned}} $$
((21))

Next, substituting x[k]=A[k−1]x[k−1]+w[k−1] on the right-hand side of Equation 21 and simplifying gives:

$$ {\small\begin{aligned} \boldsymbol{\tilde x}[k] &=\boldsymbol{A}[k-1]\boldsymbol{\tilde{x}}[k-1]-\boldsymbol{w}[k-1]\\ &\quad+\mu[(-\boldsymbol{C}[k]\boldsymbol{A}[k-1]\boldsymbol{\tilde{x}} [k-1]\\ &\quad+\boldsymbol{C}[k]\boldsymbol{w}[k-1]+\boldsymbol{v}[k])^{T}(-\boldsymbol{C}[k]\boldsymbol{A}[k-1]\\ &\quad\times\boldsymbol{\tilde{x}}[k-1]+\boldsymbol{C}[k]\boldsymbol{w}[k-1]+\boldsymbol{v}[k])]^{L-1}\boldsymbol{G}\boldsymbol{C}^{T}[k]\\ &\quad\times(-\boldsymbol{C}[k] \boldsymbol{A}[k\,-\,1]\boldsymbol{\tilde{x}}[k\,-\,1]+\boldsymbol{C}[k]\boldsymbol{w}[k\,-\,1]+\boldsymbol{v}[k]) \end{aligned}} $$
((22))

The complete mean convergence analysis for the generalized case with arbitrary value of L is quite involved. Therefore, we present here special scenarios of L=1 and L=2. Thus, simplifying Equation 22 for the case when L=1 results in:

$$\begin{array}{*{20}l} \boldsymbol{\tilde x}[k] &=\textbf{A}[k-1]\boldsymbol{\tilde{x}}[k-1]-\textbf{w}[k-1]\\ &\quad+\mu\textbf{G}\textbf{C}^{T}[k] (-\textbf{C}[k]\textbf{A}[k-1]\boldsymbol{\tilde{x}}[k-1]\\ &\quad+\textbf{C}[k]\textbf{w}[k-1]+\textbf{v}[k]) \end{array} $$
((23a))
$$\begin{array}{*{20}l} &=\textbf{A}[k-1]\boldsymbol{\tilde{x}}[k-1]\\ &\quad-\textbf{w}[k-1]-\mu\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k]\textbf{A}[k-1] \boldsymbol{\tilde{x}}[\!k-1]\\ &\quad+\mu\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k]\textbf{w}[k-1] \end{array} $$
((23b))
$$\begin{array}{*{20}l} &\quad+\mu\textbf{G}\textbf{C}^{T}[k]\textbf{v}[k] \end{array} $$
((23c))
$$\begin{array}{*{20}l} &=(\textbf{I}-\mu\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k])\textbf{A}[k-1]\boldsymbol{\tilde{x}}[k-1]\\ &\quad+\mu\textbf{G} \textbf{C}^{T}[k]\textbf{C}[k]\textbf{w}[k-1]\\ &\quad+\mu\textbf{G}\textbf{C}^{T}[k]\textbf{v}[k]-\textbf{w}[k-1] \end{array} $$
((23d))

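Now, taking the expectation of both sides of the above equation and employing Assumptions 1 and 2, under which the zero-mean noise terms vanish and the system matrices can be taken outside the expectation, results in: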

$$ \textbf{E}[\boldsymbol{\tilde{x}}[k]] = (\textbf{I} - \mu \textbf{G}\textbf{C}^{T}[k]\textbf{C}[k])\textbf{A}[k-1]\textbf{E}[\boldsymbol{\tilde{x}}[k-1]] $$
((24))

Hence, it can be observed that the state estimation error vector will converge in the mean sense provided that the following conditions are fulfilled:

  1. The matrices A[k], A[k−1], ... should have bounded entries, i.e.:

    $$ |\textbf{A}[k]\{i,j\}|<1,~~~\forall~k $$
    ((25))

    for i=1,2,3,...,n and j=1,2,3,...,n.

  2. \(|\textbf{I}-\mu\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k]|<1\), which implies that the step size μ of the algorithm is bounded by:

    $$ 0<\mu<\frac{2}{\lambda_{\text{max}}(\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k])}, ~~~\forall~k $$
    ((26))

    where \(\lambda_{\text{max}}(\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k])\) represents the largest eigenvalue of \(\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k]\).

  3. The system states should be observable, i.e. the matrix C[k] is full rank.
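The step size bound of Equation 26 can be evaluated numerically, as in the sketch below. The helper name `sslms_step_size_bound` is hypothetical, and taking the largest real part of the eigenvalues is an assumption made here because \(\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k]\) is not symmetric in general. The matrices are those of Example 1 in the simulation section.

```python
import numpy as np

def sslms_step_size_bound(G, C):
    """Upper bound 2 / lambda_max(G C^T C) from Equation 26 (L = 1)."""
    M = G @ C.T @ C
    # M need not be symmetric, so its eigenvalues can be complex in general;
    # using the largest real part is an assumption of this sketch.
    lam_max = np.max(np.linalg.eigvals(M).real)
    return 2.0 / lam_max

# Example 1 configuration: C = [1 0 1 0], G zero except a first column of ones.
C = np.array([[1.0, 0.0, 1.0, 0.0]])
G = np.zeros((4, 4))
G[:, 0] = 1.0
mu_max = sslms_step_size_bound(G, C)
```

For this configuration \(\textbf{G}\textbf{C}^{T}\textbf{C}\) is a rank-one matrix whose only nonzero eigenvalue is 2, so the bound evaluates to μ < 1. This is only the bound of Condition 2, not the tuned step size used in the simulations.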

The above conditions correspond to the bounds for the basic SSLMS algorithm (L=1). Thus, our analysis provides the mean convergence bounds of the existing SSLMS algorithm, which were not provided in the original work [7]. Similarly, the mean convergence analysis can be performed for L=2. Substituting L=2 into Equation 22, simplifying, and taking the expectation results in:

$$\begin{array}{*{20}l} \textbf{E}[\boldsymbol{\tilde{x}}[k]] =~&\{\textbf{I}-\mu(\gamma\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k]\\ &+\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k]\textbf{Q}_{w}\textbf{C}^{T} [k]\textbf{C}[k]\\ &+\textbf{G}\textbf{C}^{T}[k]\textbf{Q}_{v}\textbf{C}[k]+\textbf{G} \end{array} $$
((27a))
$$\begin{array}{*{20}l} ~&\times\textbf{C}^{T}[k]\textbf{C}[k]\textbf{Q}_{w}\textbf{C}^{T}[k]\textbf{C}[k]\\ &+{\sigma^{2}_{w}}Tr(\textbf{C}^{T} [k]\textbf{C}[k])\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k]\\ &+\textbf{G}\textbf{C}^{T}[k]\textbf{Q}_{v}\textbf{C}[k] \end{array} $$
((27b))
$$\begin{array}{*{20}l} ~&+{\sigma^{2}_{v}}\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k])\}\textbf{A}[k\,-\,1]\textbf{E}[\boldsymbol{\tilde{x}}[k\,-\,1]] \end{array} $$
((27c))

where \(\gamma = \|\boldsymbol {\tilde x}\|^{2}_{\zeta }\), which is treated as a constant by virtue of Assumption 3. Here, the term ζ is given by:

$$ \zeta = \textbf{A}^{T}[k-1]\textbf{C}^{T}[k]\textbf{C}[k]\textbf{A}[k-1] $$
((28))

Let us define a matrix Z[k] such that:

$$\begin{array}{*{20}l} \textbf{Z}[k] =&~(\gamma\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k]+\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k]\textbf{Q}_{w}\textbf{C}^{T}[k]\textbf{C}[k]\\ &+\textbf{G}\textbf{C}^{T}[k]\textbf{Q}_{v}\textbf{C}[k]+\textbf{G} \end{array} $$
((29a))
$$\begin{array}{*{20}l} ~&\times\textbf{C}^{T}[k]\textbf{C}[k]\textbf{Q}_{w}\textbf{C}^{T}[k]\textbf{C}[k]\\ &+{\sigma^{2}_{w}}Tr(\textbf{C}^{T}[k]\textbf{C}[k])\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k]\\ &+\textbf{G}\textbf{C}^{T}[k]\textbf{Q}_{v}\textbf{C}[k] \end{array} $$
((29b))
$$\begin{array}{*{20}l} ~&+{\sigma^{2}_{v}}\textbf{G}\textbf{C}^{T}[k]\textbf{C}[k]) \end{array} $$
((29c))

Therefore, Equation 27 can be written as:

$$ \textbf{E}[\boldsymbol{\tilde{x}}[k]]=~(\textbf{I}-\mu\textbf{Z}[k])\textbf{A}[k-1]\textbf{E}[\boldsymbol{\tilde{x}}[k-1]] $$
((30))

Consequently, mean convergence of the state estimation error vector for L=2 requires Conditions 1 and 3 reported for L=1, while the bound on the step size μ becomes:

$$ 0<\mu<\frac{2}{\lambda_{\text{max}}(\textbf{Z}[k])}, ~~~\forall k $$
((31))

The analysis for the higher values of L can be similarly carried out.

Simulation results and discussion

In this section, we present simulation results to validate the performance of the proposed family and to compare its performance with that of the standard KF. More specifically, we aim to compare mean square error performance and the convergence speed of the aforementioned estimators. For this purpose, we investigate the following three experiments:

  • Estimation of the state parameters of a noisy sinusoid tracking problem.

  • Estimation of the state parameters of a Van der Pol oscillator.

  • Estimation of the state parameters of a symmetrical three-phase induction machine.

For all of the above experiments, we investigate three different noise environments, namely Gaussian, uniform, and Laplacian. We also investigate the effect of the exponent L in these noise environments. More specifically, we set L=1, 2, 3, and 4, which correspond to the SSLMS, state space least mean fourth (SSLMF), state space least mean sixth (SSLMSi), and state space least mean eighth (SSLME) algorithms, respectively. The results are reported after averaging over 100 independent simulation runs.
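The three noise environments could be generated as in the sketch below. It assumes, since the text does not say how the noises were scaled, that each distribution is zero-mean and scaled to the same variance σ²; the function name `draw_noise` and the sample size are hypothetical.

```python
import numpy as np

def draw_noise(kind, sigma, size, rng):
    """Zero-mean noise with variance sigma**2 for the three environments."""
    if kind == "gaussian":
        return rng.normal(0.0, sigma, size)
    if kind == "uniform":
        a = sigma * np.sqrt(3.0)      # Uniform(-a, a) has variance a^2 / 3
        return rng.uniform(-a, a, size)
    if kind == "laplacian":
        b = sigma / np.sqrt(2.0)      # Laplace(0, b) has variance 2 b^2
        return rng.laplace(0.0, b, size)
    raise ValueError(f"unknown noise kind: {kind}")

rng = np.random.default_rng(0)
samples = {kind: draw_noise(kind, 0.01, 200_000, rng)
           for kind in ("gaussian", "uniform", "laplacian")}
```

Matching the variances in this way means that differences between the environments come from the shape of the distribution (light-tailed uniform versus heavy-tailed Laplacian) rather than from the noise power.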

Example 1. Tracking sinusoids

In the first example, we consider the system reported in [7]. More specifically, we investigate a second order transversal filter with known frequency and unknown phase and amplitude of sinusoids which produces a 4th order system given by:

$$\begin{array}{*{20}l} \boldsymbol{A}[k] &= \boldsymbol{\text{diag}}\left\{\left[\begin{array}{ll} \cos(\omega_{i} T)&\; \sin(\omega_{i} T)\\ -\sin(\omega_{i} T)&\; \cos(\omega_{i} T)\end{array}\right]\right\},~i = 1,2 \end{array} $$
((32a))
$$\begin{array}{*{20}l} \boldsymbol{C}[k] &= \left[\begin{array}{llll} 1&0&1&0\end{array}\right] \end{array} $$
((32b))

where the \(\omega_{i}\) are the frequencies of the sinusoids, which are known and constant. For the purpose of our study, we set the frequencies \(\omega_{1}\) and \(\omega_{2}\) to 0.5 and 0.25, respectively. T is the sampling time, which is taken to be 0.1 s. The step size μ for the family of algorithms is adjusted to achieve optimum performance. The variance of the observation noise is \({{\sigma ^{2}_{v}}} = 0.001^{2}\), while the process noise has variance \({\sigma ^{2}_{w}} = 0.0001^{2}\). The G matrix has all zero entries except for the first column, which contains 1’s. The actual initial system states are \(\mathbf{x}[0] = [0.1~0.1~0.1~0.1]^{T}\), and the initial estimate for tracking is \(\hat {\mathbf {x}}[0] = [0.15~0.2~0.05~0.16]^{T}\).

In Figure 1, the true values of all four states are compared with the estimates from the considered algorithms. It can be seen from Figure 1 that all of the SSLM algorithms perform well in terms of estimation, although the KF performs better than the proposed family. This is further illustrated in Figure 2, where the mean square errors of these estimators are compared. However, this superiority of the KF is achieved at the expense of high computational cost: the KF performs approximately six times as many operations as the proposed family of algorithms (see Section 6). The mean square observation error in the presence of Gaussian noise is presented in Figure 3. A closer look at Figure 2 reveals that SSLME converges faster. Moreover, Table 1 verifies that SSLME performs the best in comparison to the other algorithms in the family.
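A minimal setup of this experiment might look as follows. The matrices, noise levels, and initial conditions are taken from the text; the step size `mu = 0.05`, the choice L=1, the Gaussian noise, the run length of 500 samples, and the helper name `rotation_block` are assumptions made here, so the sketch does not reproduce the tuned results reported in the figures.

```python
import numpy as np

def rotation_block(omega, T):
    """2x2 diagonal block of Equation 32a for one sinusoid."""
    c, s = np.cos(omega * T), np.sin(omega * T)
    return np.array([[c, s], [-s, c]])

T = 0.1
A = np.zeros((4, 4))
A[:2, :2] = rotation_block(0.5, T)
A[2:, 2:] = rotation_block(0.25, T)
C = np.array([[1.0, 0.0, 1.0, 0.0]])           # Eq. 32b
G = np.zeros((4, 4))
G[:, 0] = 1.0                                  # first column of ones

x = np.array([0.1, 0.1, 0.1, 0.1])             # true initial state
x_hat = np.array([0.15, 0.2, 0.05, 0.16])      # initial estimate
mu, L = 0.05, 1                                # hypothetical, not the tuned value
sigma_w, sigma_v = 0.0001, 0.001
rng = np.random.default_rng(0)

sq_err = []
for k in range(500):
    x = A @ x + sigma_w * rng.standard_normal(4)       # Eq. 4a
    y = C @ x + sigma_v * rng.standard_normal(1)       # Eq. 4b
    x_bar = A @ x_hat                                  # Eq. 6
    eps = y - C @ x_bar                                # Eqs. 8 and 9
    x_hat = x_bar + mu * np.dot(eps, eps) ** (L - 1) * (G @ C.T) @ eps  # Eq. 14
    sq_err.append((x_hat - x) ** 2)
sq_err = np.array(sq_err)                              # squared error per state
```

Averaging `sq_err` over many independent runs would give curves of the kind shown in Figure 2; other members of the family are obtained by changing `L` and retuning `mu`.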

Figure 1 State \(x_{1}[k]\) (a), \(x_{2}[k]\) (b), \(x_{3}[k]\) (c), and \(x_{4}[k]\) (d).

Figure 2 Mean square error of state \(x_{1}[k]\) (a), \(x_{2}[k]\) (b), \(x_{3}[k]\) (c), and \(x_{4}[k]\) (d).

Figure 3 Mean square error of y[k] (Example 1).

Table 1 Root mean square error of Example 1

Example 2. Van der Pol oscillator

In this example, we consider the commonly used Van der Pol oscillator, a highly nonlinear system that can exhibit both stable and unstable limit cycles [12]. We consider the case of an unstable limit cycle; therefore, as time proceeds, the system states approach zero. The system is represented by the following differential equations:

$$\begin{array}{*{20}l} \dot x_{1} &= -x_{2} \end{array} $$
((33a))
$$\begin{array}{*{20}l} \dot x_{2} &= x_{1} -\alpha(1-{x_{1}}^{2})x_{2} \end{array} $$
((33b))

where α is a constant which is set to 0.2. The state space representation for the system shown above takes the following form:

$$\begin{array}{*{20}l} \dot{\mathbf{x}}(t) &= \left[\begin{array}{lc} 0 &\!\quad -1\\ 1 &\!\quad -\alpha(1-{x_{1}}^{2})\end{array}\right]\textbf{x}(t) \end{array} $$
((34a))
$$\begin{array}{*{20}l} \textbf{y}(t) &= \left[\begin{array}{lr} 1 &\! \quad 0\\ 0 &\! \quad 1\end{array}\right]\textbf{x}(t) \end{array} $$
((34b))

The system is discretized with a sampling time of 0.1 s. The observation is subject to noise of variance \({{\sigma ^{2}_{v}}} = 0.01^{2}\), and the process noise has variance \({\sigma ^{2}_{w}} = 0.001^{2}\). The true initial system states are \(\mathbf{x}[0] = [0.2~0.2]^{T}\), and the initial estimate for the algorithms is \(\hat {\mathbf {x}}[0] = [0.4~0.1]^{T}\). The G matrix has all zero entries except for the first column, which contains 1’s. For this experiment, the states, their mean square error, and the observation mean square error are plotted only for the uniform noise scenario in Figures 4, 5, and 6, respectively. However, the root mean square error (RMSE) results for all the noise environments are reported in Table 2. It can be seen from Figures 4, 5, and 6 that the KF performs better and converges faster than the SSLM algorithms. However, this comes at the cost of high computational complexity, as discussed in Section 6. The SSLM algorithms converge more slowly, with SSLMSi and SSLME performing best within the family.
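The text does not state which discretization method was used. As one possibility, a forward-Euler discretization of Equation 34a gives a state-dependent transition matrix A[k] = I + T·A_c(x_1), sketched below. The Euler scheme and the names `vdp_transition`, `ALPHA`, and `T` are assumptions made here for illustration.

```python
import numpy as np

ALPHA = 0.2   # value of alpha given in the text
T = 0.1       # sampling time in seconds

def vdp_transition(x1, alpha=ALPHA, T=T):
    """Forward-Euler transition matrix I + T * A_c(x1) for Equation 34a."""
    A_c = np.array([[0.0, -1.0],
                    [1.0, -alpha * (1.0 - x1 ** 2)]])
    return np.eye(2) + T * A_c

C = np.eye(2)                      # Eq. 34b: both states are measured
x = np.array([0.2, 0.2])           # true initial state from the text
trajectory = [x]
for _ in range(100):
    x = vdp_transition(x[0]) @ x   # noise-free propagation of the true state
    trajectory.append(x)
trajectory = np.array(trajectory)
```

Because the transition matrix depends on the current value of \(x_{1}\), an estimator built on this model would have to evaluate it at the latest state estimate, which is one way the nonlinearity enters the SSLM recursion.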

Figure 4 State \(x_{1}[k]\) (a) and \(x_{2}[k]\) (b).

Figure 5 Mean square error of \(x_{1}[k]\) (a) and \(x_{2}[k]\) (b).

Figure 6 Mean square error of (a) \(y_{1}[k]\) and (b) \(y_{2}[k]\) (Example 2).

Table 2 Root mean square error of Example 2

Example 3. Symmetrical three phase induction machine

Example 3 presents estimation of the states of a nonlinear symmetrical three phase induction machine, more specifically, estimation of the flux and the angular velocity [13-15]. The state space representation of the induction machine is as follows:

$$\begin{array}{*{20}l} \textbf{x}(t) =& [x_{1}(t)~x_{2}(t)~x_{3}(t)~x_{4}(t)~x_{5}(t)]^{T} \end{array} $$
((35a))
$$\begin{array}{*{20}l} \textbf{u}(t) =& [z_{1}(t)~z_{2}(t)~z_{3}(t)]^{T} \end{array} $$
((35b))
$$\begin{array}{*{20}l} \textbf{y}(t) =& [y_{1}(t)~y_{2}(t)]^{T} \end{array} $$
((35c))

where:

$$\begin{array}{*{20}l} {\dot x_{1}}(t) &= k_{1}x_{1}(t)+z_{1}(t)x_{2}(t)+k_{2}x_{3}(t)+z_{2}(t) \end{array} $$
((36a))
$$\begin{array}{*{20}l} {\dot x_{2}}(t) &= -z_{1}(t)x_{1}(t)+k_{1}x_{2}(t)+k_{2}x_{4}(t) \end{array} $$
((36b))
$$\begin{array}{*{20}l} {\dot x_{3}}(t) &= k_{3}x_{1}(t)+k_{4}x_{3}(t)+(z_{1}(t)-x_{5}(t))x_{4}(t) \end{array} $$
((36c))
$$\begin{array}{*{20}l} {\dot x_{4}}(t) &= k_{3}x_{2}(t)-(z_{1}(t)-x_{5}(t))x_{3}(t)+k_{4}x_{4}(t) \end{array} $$
((36d))
$$\begin{array}{*{20}l} {\dot x_{5}}(t) &= k_{5}(x_{4}(t)x_{1}(t)-x_{2}(t)x_{3}(t))+k_{6}z_{3}(t) \end{array} $$
((36e))
$$\begin{array}{*{20}l} {y_{1}}(t) & = k_{7}x_{1}(t)+k_{8}x_{3}(t) \end{array} $$
((36f))
$$\begin{array}{*{20}l} {y_{2}}(t) &=k_{7}x_{2}(t)+k_{8}x_{4}(t) \end{array} $$
((36g))

The normalized state variables \(x_{1}(t)\), \(x_{2}(t)\) and \(x_{3}(t)\), \(x_{4}(t)\) are the components of the stator flux and the rotor flux, respectively, and \(x_{5}(t)\) is the angular velocity. The inputs \(z_{1}(t)\), \(z_{2}(t)\), and \(z_{3}(t)\) are the frequency of the stator voltage, the amplitude of the stator voltage, and the load torque, respectively. The constants are \(k_{1}=-0.186\), \(k_{2}=0.178\), \(k_{3}=0.225\), \(k_{4}=-0.234\), \(k_{5}=-0.081\), \(k_{6}=4.643\), \(k_{7}=-4.448\), and \(k_{8}=1\); their values depend on the induction machine considered. The outputs \(y_{1}(t)\) and \(y_{2}(t)\) are the normalized stator currents. For the simulation, the system is discretized with a sampling interval of 0.01 s, and \(z_{1}(t)\), \(z_{2}(t)\), and \(z_{3}(t)\) are chosen as 1, 1, and 0, respectively. The system was simulated in the presence of all three types of noise, and the step sizes μ were tuned to obtain optimum performance. The G matrix has all zero entries except for the first column, whose entries are all 10. The process noise variance is \({{\sigma ^{2}_{w}}} = 0.0001^{2}\), and the observation noise variance is \({{\sigma ^{2}_{v}}} = 0.001^{2}\). The initial true state is \(\mathbf{x}[0] = [0.0147~-0.9~0.0136~-0.9616~0.9]^{T}\), and the initial estimate for the SSLM algorithms is \(\hat {\mathbf {x}}[0] = [0.02~-0.7~0.0136~-0.8~0.7]^{T}\).

Figure 7 presents the five states of the induction machine in the presence of Laplacian noise, while Figure 8 presents the observations in the presence of Laplacian noise. The mean square estimation error and the mean square observation error in the presence of Laplacian noise are presented in Figure 9 and Figure 10, respectively. Figure 9 shows that the algorithms perform well in terms of estimating the states and have performance comparable to that of the KF. Figure 10 shows that the observation estimated by the KF is close to the true observation; however, the corresponding state estimates are not close to the true values. The RMSE for this example is presented in Table 3.
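Equations 36a to 36g translate directly into code, as sketched below. The function names and the forward-Euler step are assumptions made here (the text gives the sampling interval of 0.01 s but not the discretization method); the constants are those listed above.

```python
import numpy as np

K1, K2, K3, K4 = -0.186, 0.178, 0.225, -0.234
K5, K6, K7, K8 = -0.081, 4.643, -4.448, 1.0

def induction_dynamics(x, z):
    """Right-hand sides of Equations 36a to 36e."""
    x1, x2, x3, x4, x5 = x
    z1, z2, z3 = z
    return np.array([
        K1 * x1 + z1 * x2 + K2 * x3 + z2,            # Eq. 36a
        -z1 * x1 + K1 * x2 + K2 * x4,                # Eq. 36b
        K3 * x1 + K4 * x3 + (z1 - x5) * x4,          # Eq. 36c
        K3 * x2 - (z1 - x5) * x3 + K4 * x4,          # Eq. 36d
        K5 * (x4 * x1 - x2 * x3) + K6 * z3,          # Eq. 36e
    ])

def induction_output(x):
    """Normalized stator currents, Equations 36f and 36g."""
    return np.array([K7 * x[0] + K8 * x[2],
                     K7 * x[1] + K8 * x[3]])

def euler_step(x, z, T=0.01):
    """One forward-Euler step; the integration scheme is an assumption."""
    return np.asarray(x, dtype=float) + T * induction_dynamics(x, z)

z = np.array([1.0, 1.0, 0.0])                        # inputs used in the text
x0 = np.array([0.0147, -0.9, 0.0136, -0.9616, 0.9])  # true initial state
x1_next = euler_step(x0, z)
```

Note that the output map is linear in the states, so the matrix C needed by the SSLM gain can be read off Equations 36f and 36g, whereas the state transition depends on the current state and input.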

Figure 7 Induction machine state \(x_{1}\) (a), \(x_{2}\) (b), \(x_{3}\) (c), \(x_{4}\) (d), and \(x_{5}\) (e).

Figure 8 Observation \(y_{1}\) (a) and \(y_{2}\) (b).

Figure 9 Mean square error of state \(x_{1}\) (a), \(x_{2}\) (b), \(x_{3}\) (c), \(x_{4}\) (d), and \(x_{5}\) (e).

Figure 10 Mean square error of (a) \(y_{1}[k]\) and (b) \(y_{2}[k]\) (Example 3).

Table 3 Root mean square error of Example 3

An overview of the RMSE of the states and observations is given in Tables 1, 2, and 3.
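The text does not spell out how the RMSE values in the tables are computed. One common definition, given here only as an assumption, averages the squared error over all runs and time steps before taking the square root; the function name `rmse` and the array layout are hypothetical.

```python
import numpy as np

def rmse(estimates, truth):
    """Root mean square error per state, averaged over runs and time steps.

    estimates and truth are arrays of shape (runs, steps, n); truth may also
    have shape (steps, n) if the same trajectory is shared by all runs.
    """
    err = np.asarray(estimates, dtype=float) - np.asarray(truth, dtype=float)
    return np.sqrt(np.mean(err ** 2, axis=(0, 1)))

# Tiny hypothetical example: one run, three time steps, one state.
est = np.array([[[1.0], [2.0], [5.0]]])
ref = np.array([[[1.0], [2.0], [3.0]]])
value = rmse(est, ref)   # squared errors 0, 0, 4 -> sqrt(4 / 3)
```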

It is clear from our investigation that as the value of L increases, better performance is observed from the SSLM algorithms. Although the algorithms do not outperform the KF, they have a much lower computational requirement, as discussed in Section 6.

Computational complexity

When dealing with real-time applications, it is essential to know the computational complexity of an algorithm. For this purpose, we compare the computational complexity of the proposed family of SSLM algorithms with that of the KF. To evaluate the computational complexity of the proposed SSLM family, we compute the total number of operations required by Equations 6, 9, 8, 15, and 7, which define its implementation and are listed here in order of execution. The details of the operations required by the SSLM family of algorithms and by the KF are presented in Tables 4 and 5, respectively. Note that in evaluating the computational cost of inverting a matrix of dimension n×n, we assume a total of \(n^{3}\) multiplications and \(n^{3}\) additions.

Table 4 Computational complexity of SSLM algorithm
Table 5 Computational complexity of KF algorithm

To give more insight into the computational complexity, Tables 6, 7, and 8 compare the complexities of the algorithms for the three examples investigated in the simulation experiments.

Table 6 Computational complexity summary of Example 1
Table 7 Computational complexity summary of Example 2
Table 8 Computational complexity summary of Example 3

The results clearly suggest that the proposed algorithms have a much lower complexity than the standard KF algorithm. According to our investigation, the KF algorithm performs approximately 6 times as many operations in Example 1, 3.5 times as many in Example 2, and 6 times as many in Example 3 when compared to our algorithms.

Conclusions

In this work, a general family of SSLM algorithms is proposed. The proposed family is based on minimizing the general least mean cost function via stochastic gradient optimization. To assess the performance of the proposed family, simulations are carried out for three different examples with different types of noise environments. In these simulations, the effects of the noise and of the exponent L on state estimation are investigated. The simulation results show that the performance of the proposed family is comparable to that of the Kalman filter, while its computational complexity is far lower. More specifically, the Kalman filter performs approximately 3.5 to 6 times as many operations as the proposed family in the reported examples. This motivates the use of the proposed family of algorithms in real-time applications where computational complexity is a major concern. For future research, an adaptive μ could be proposed and investigated, with a focus on the effect of varying the sampling time and the noise. Moreover, different variants of the KF and other linear estimation algorithms should be investigated alongside the proposed SSLM family and compared to gain further insight.

References

  1. S Haykin, Adaptive Filter Theory, 3rd edn. (Prentice-Hall, Upper-Saddle River, NJ, 1996).

  2. AH Sayed, Fundamentals of Adaptive Filtering (Wiley-Interscience, New York, 2003).

  3. CK Chui, G Chen, Kalman Filtering With Real-Time Applications, 4th edn. (Springer, Berlin, Heidelberg, 2009).

  4. EA Wan, R Van der Merwe, in IEEE Proceedings of Symposium on Adaptive Systems for Signal Processing, Communication and Control. The unscented Kalman filter for nonlinear estimation (Lake Louise, Alberta, Canada, 2000).

  5. I Arasaratnam, S Haykin, Cubature Kalman filters. IEEE Trans. Autom. Control 54(6), 1254–1269 (2009).

  6. I Arasaratnam, S Haykin, RJ Elliott, Discrete-time nonlinear filtering algorithms using Gauss-Hermite quadrature. Proc. IEEE 95(5), 953–977 (2007).

  7. MB Malik, M Salman, State-space least mean square. Digital Signal Process 18(3), 334–345 (2008).

  8. MB Malik, RA Bhatti, Tracking of linear time-varying systems using state-space least mean square. IEEE Int. Symp. Commun. Inform. Technol 1, 582–585 (2004).

  9. MB Malik, State-space recursive least squares: part I. Signal Process 84(9), 1709–1718 (2004).

  10. MB Malik, State-space recursive least squares: part II. Signal Process 84(9), 1719–1728 (2004).

  11. E Walach, B Widrow, The least-mean fourth (LMF) adaptive algorithm and its family. IEEE Trans. Inform. Theory IT-30, 275–283 (1984).

  12. HK Khalil, Nonlinear Systems, 3rd edn. (Prentice Hall, Upper-Saddle River, NJ, 2000).

  13. K Reif, F Sonnemann, R Unbehauen, An EKF-based nonlinear observer with a prescribed degree of stability. Automatica 34(9), 1119–1123 (1998).

  14. R Kandepu, B Foss, L Imsland, Applying the unscented Kalman filter for nonlinear state estimation. J. Process Control 18, 753–768 (2008).

  15. L Salvatore, S Stasi, L Tarchioni, A new EKF based algorithm for flux estimation in induction machines. IEEE Trans. Ind. Electron 40(5), 496–504 (1993).

Acknowledgments

The authors acknowledge the support provided by the Centre of Excellence in Intelligent Engineering Systems (CEIES, http://ceies.kau.edu.sa/), King Abdulaziz University, Jeddah, Saudi Arabia, and by King Abdulaziz University, Jeddah, Saudi Arabia, in carrying out this work.

Author information

Corresponding author

Correspondence to Muhammad Moinuddin.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Moinuddin, M., Al-Saggaf, U.M. & Ahmed, A. Family of state space least mean power of two-based algorithms. EURASIP J. Adv. Signal Process. 2015, 39 (2015). https://doi.org/10.1186/s13634-015-0219-9

  • DOI: https://doi.org/10.1186/s13634-015-0219-9

Keywords

  • Adaptive filters
  • State space least mean algorithms
  • State space estimation algorithms
  • Convergence and stability analysis