A variable step-size strategy for distributed estimation over adaptive networks

Bin Saeed, Muhammad O; Zerguine, Azzedine; Zummo, Salam A

doi:10.1186/1687-6180-2013-135

Research
Open access
Published: 06 August 2013

Muhammad O Bin Saeed¹,
Azzedine Zerguine¹ &
Salam A Zummo¹

EURASIP Journal on Advances in Signal Processing volume 2013, Article number: 135 (2013)

Abstract

A lot of work has been done recently to develop algorithms that utilize the distributed structure of an ad hoc wireless sensor network to estimate a certain parameter of interest. One such algorithm is called diffusion least-mean squares (DLMS). This algorithm estimates the parameter of interest using the cooperation between neighboring sensors within the network. The present work proposes an improvement on the DLMS algorithm by using a variable step-size LMS (VSSLMS) algorithm. In this work, first, the well-known variants of VSSLMS algorithms are compared with each other in order to select the most suitable algorithm which provides the best trade-off between performance and complexity. Second, the detailed convergence and steady-state analyses of the selected VSSLMS algorithm are performed. Finally, extensive simulations are carried out to test the robustness of the proposed algorithm under different scenarios. Moreover, the simulation results are found to corroborate the theoretical findings very well.

1 Introduction

The advent of ad hoc wireless sensor networks has created renewed interest in distributed computing and opened up new avenues for research in the estimation and tracking of parameters of interest where a robust, scalable, and low-cost solution is required. To illustrate this point, consider a set of N sensor nodes spread over a wide geographic area, as shown in Figure 1. Each node takes sensor measurements at every sampling instant. The goal is to estimate a certain unknown parameter of interest using these measurements. In a centralized network, all nodes transmit their readings to a fusion center where the processing takes place. However, such a system fails if the fusion center fails. Furthermore, large amounts of energy and communication resources may be required to carry out the complete signal processing task between the center and the entire network. These resource requirements can become considerable as the distance between the nodes and the center increases [1].

An ad hoc network, on the other hand, depends on distributed processing with the nodes communicating only with their neighbors and the processing taking place at the nodes themselves. As no hierarchical structure is involved, any node failure would not result in the failure of the entire network. Extensive research has been done in the field of consensus-based distributed signal processing and resulted in a variety of algorithms[2].

In this work, a brief overview of the virtues and limitations of these algorithms [3–10] is given, providing the background against which our contribution is justified. Incremental algorithms organize the network in a Hamiltonian cycle [3]. The estimation is completed by passing the estimate from node to node, improving the accuracy of the estimate with each new set of data per node. This algorithm is termed the incremental least mean squares (ILMS) algorithm as it uses the LMS algorithm [11] with incremental steps within each iteration [3]. A general incremental stochastic gradient descent algorithm was developed in [4], of which [3] can be considered a special case. These algorithms depend heavily on the Hamiltonian cycle, and in case of a node failure, a new cycle has to be constructed. The problem of reconstructing the cycle is non-deterministic polynomial-time hard (NP-hard) [12]. A quantized version of the ILMS algorithm was suggested in [5]. A fully distributed algorithm, called the diffusion least mean squares (DLMS) algorithm, was proposed in [6]. In the DLMS algorithm, neighbor nodes share their estimates in order to improve the overall performance; see also [7]. This algorithm is robust to node failure as the network does not depend on any single node. The authors in [8] introduced a scheme that adapts the combination weights at each iteration for each node instead of using fixed weights for the shared data. In this case, the performance improves if more weight is given to the estimates of the neighbor nodes that provide more improvement per iteration.

All the previously mentioned algorithms are generally based on a non-constrained optimization technique, except in[8] which uses constrained optimization to adapt the weights only. However, the authors in[9] use the constraint that all nodes converge to the same steady-state estimate to derive the distributed LMS algorithm. This algorithm is unfortunately hierarchical, thus making it complex and not completely distributed. To remedy this situation, a fully distributed algorithm based on the same constraint was suggested in[10]. The algorithms in[9, 10] have been shown to be robust to inter-sensor noise, a property not possessed by the diffusion-based algorithms. However, it has been shown in[7] that diffusion-based algorithms perform better in general.

All the above-mentioned algorithms use a fixed step-size. In general, the step-size is kept the same for all nodes. As a result, the nodes with a better signal-to-noise ratio (SNR) may converge more quickly and provide reasonably good performance. However, the nodes with poor SNR will not provide similar performance despite cooperation from neighboring nodes. A distributed algorithm may therefore improve the average performance while some individual nodes still perform worse than the others. One solution to this problem was provided in [8], which gives a computationally complex method to improve the performance of the DLMS algorithm. For every iteration, an error correlation matrix is formed for each node. A decision is then made, based on this matrix, as to the weights of the neighbors. Thus, the combiner weights are adapted at every iteration according to the performance of the neighbors of each node. Simulation results from [8] have shown a slight improvement in performance, but this has been achieved at the cost of high computational complexity.

In comparison, a much simpler solution was suggested in[13], using a variable step-size LMS algorithm. This resulted in the variable step-size diffusion LMS (VSSDLMS) algorithm. Preliminary results showed remarkable improvement in performance at a reasonably low computational cost. The idea is to vary the step-size at each iteration based on the error performance. Each node will alter its step-size according to its own individual performance. As a result, not only the average performance improves remarkably but the individual performances of the nodes also get better.

A different diffusion-based algorithm was proposed in[14] using the recursive least squares (RLS) algorithm to obtain the diffusion RLS (DRLS) algorithm. This DRLS algorithm provided exceptional results in both speed and performance. Another RLS-based distributed estimation algorithm has been studied in[15, 16]. The latter algorithm is hierarchical in nature, which makes its complexity higher than that of the DRLS algorithm. The RLS algorithm is inherently far more complex compared with the LMS algorithm. In this work, it is shown that despite the LMS algorithm being inferior to the RLS algorithm, using a variable step-size allows the LMS algorithm to achieve performance very close to that of the RLS algorithm.

In order to achieve better performance, various other algorithms have been proposed in the literature, such as in[17–19]. The works in[17, 18] propose distributed Kalman filtering algorithms that provide efficient solutions for several applications. A survey of distributed particle filtering is provided in[19]. This work takes a look at several solutions proposed for nonlinear distributed estimation. However, the focus of this work is primarily on LMS-based algorithms, so these algorithms will not be included in any further discussion.

Our work extends that of [13] and investigates in detail the performance of the VSSDLMS algorithm. Here, the most popular variable step-size LMS (VSSLMS) algorithms are first investigated and compared with each other. Based on the best complexity-performance trade-off, one variant of the VSSLMS algorithms is chosen for the proposed algorithm. Next, the incorporation of the selected VSSLMS algorithm in the diffusion scheme is described, and complete convergence and steady-state analyses are carried out. The stability of the algorithm is also analyzed. Finally, simulations are carried out, first to determine which of the selected VSSLMS algorithms provides the best trade-off between performance and complexity, and then to compare the proposed algorithm with similar existing algorithms. The performance of the proposed algorithm is assessed under different conditions, and the theoretical findings are found to agree closely with the simulation results. Moreover, a sensitivity analysis is performed on the proposed algorithm.

The paper is organized as follows. Section 2 describes the problem statement and briefly introduces the DLMS algorithm. Section 3 incorporates the VSSLMS algorithm into the DLMS algorithm, and then complete convergence and stability analyses are carried out. Simulation results are given in Section 4 followed by a thorough discussion of the results. Finally, Section 5 concludes the work.

Notation. Boldface letters are used for vectors/matrices and normal font for scalar quantities. Matrices are defined by capital letters and small letters are used for vectors. The notation (.)^T stands for transposition operation for vectors and matrices and expectation operation is denoted by E[.]. Any other mathematical operators that have been used will be defined as they are introduced in the paper.

2 Problem statement

Consider a network of N sensor nodes deployed over a geographical area for estimating an unknown parameter vector $w^{o}$ of size (M × 1), as shown in Figure 1. Each node k has access to a time realization of a known regressor vector $u_{k}(i)$ of size (1 × M) and a scalar measurement $d_{k}(i)$ that are related by

d_{k} (i) = u_{k} (i) w^{o} + v_{k} (i), 1 \leq k \leq N

(1)

where $v_{k}(i)$ is a spatially uncorrelated zero-mean additive white Gaussian noise with variance $σ_{v_{k}}^{2}$ and i denotes the time index. The measurements, $d_{k}(i)$ and $u_{k}(i)$, are used to generate an estimate, $w_{k}(i)$ of size (M × 1), of the unknown vector $w^{o}$. Assuming that each node cooperates only with its neighbors, each node k has access to the updates $w_{l}(i)$ from its $N_{k}$ neighbor nodes at every time instant i, where $\underset{l \neq k}{l \in N_{k}}$, in addition to its own estimate, $w_{k}(i)$. Two different schemes have been introduced in the literature for the diffusion algorithm. The adapt-then-combine (ATC) scheme [7] first updates the local estimate using the adaptive algorithm and then fuses the intermediate estimates from the neighbors. The second scheme, called combine-then-adapt (CTA) [6], reverses the order. It is found that the ATC scheme outperforms the CTA scheme [7]; therefore, this work uses the ATC scheme.

The objective of the adaptive algorithm is to minimize the following global cost function given by

J (w) = \sum_{k = 1}^{N} J_{k} (w) = \sum_{k = 1}^{N} E [{|d_{k} - u_{k} w|}^{2}] .

(2)

The steepest descent algorithm is given as

w_{k} (i + 1) = w_{k} (i) + μ \sum_{k = 1}^{N} (r_{d u, k} - R_{u, k} w_{k} (i)),

(3)

where $r_{d u, k} = E [d_{k} u_{k}^{T}]$ is the cross-correlation between $d_{k}$ and $u_{k}$, $R_{u, k} = E [u_{k}^{T} u_{k}]$ is the auto-correlation of $u_{k}$, and μ is the step-size. The recursion defined in (3) requires full knowledge of the statistics of the entire network. A more practical solution utilizes the distributed nature of the network. The work in [6] gives a fully distributed solution, which has been modified and improved in [7]. The update equation is divided into two steps. The first step performs the adaptation, while the second step combines the intermediate updates from neighboring nodes. The resulting scheme is called adapt-then-combine or ATC. Using the ATC scheme, the diffusion LMS algorithm is given as

\begin{array}{l} Ψ_{k} (i + 1) = w_{k} (i) + μ_{k} u_{k}^{T} (i) [d_{k} (i) - u_{k} (i) w_{k} (i)] \\ w_{k} (i + 1) = \sum_{l \in N_{k}} c_{lk} Ψ_{l} (i + 1), \end{array}

(4)

where $Ψ_{k}(i + 1)$ is the intermediate update, $μ_{k}$ is the step-size for node k, and $c_{lk}$ is the weight connecting node k to its neighboring node $l \in N_{k}$, where $N_{k}$ includes node k, and $\sum c_{lk} = 1$. Further details on the formation of the weights $c_{lk}$ can be found in [6, 7].
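The two steps of (4) can be sketched in Python as follows. This is only a minimal illustration, not the authors' implementation: the function names, the three-node fully connected network, the uniform combination weights, and all numerical values are assumptions chosen for the example.

```python
import random


def atc_dlms_step(w, u, d, mu, neighbors, c):
    """One adapt-then-combine iteration of (4) for every node.

    w[k], u[k]: length-M lists (estimate and regressor of node k)
    d[k]: scalar measurement, mu[k]: step-size of node k
    neighbors[k]: indices l in N_k, including k itself
    c[l][k]: combination weight, summing to 1 over l in N_k
    """
    n_nodes, m = len(w), len(w[0])
    psi = []
    for k in range(n_nodes):
        # Adaptation: local LMS update driven by the error d_k - u_k w_k.
        e_k = d[k] - sum(uj * wj for uj, wj in zip(u[k], w[k]))
        psi.append([wj + mu[k] * uj * e_k for uj, wj in zip(u[k], w[k])])
    # Combination: weighted average of the neighbors' intermediate estimates.
    return [
        [sum(c[l][k] * psi[l][j] for l in neighbors[k]) for j in range(m)]
        for k in range(n_nodes)
    ]


def run_demo(iterations=2000, seed=0):
    """Hypothetical 3-node, fully connected network with uniform weights."""
    rng = random.Random(seed)
    w_o = [1.0, -0.5]  # unknown vector (illustrative values)
    n_nodes = 3
    neighbors = [[0, 1, 2] for _ in range(n_nodes)]
    c = [[1.0 / n_nodes] * n_nodes for _ in range(n_nodes)]
    w = [[0.0] * len(w_o) for _ in range(n_nodes)]
    for _ in range(iterations):
        u = [[rng.gauss(0.0, 1.0) for _ in w_o] for _ in range(n_nodes)]
        d = [sum(a * b for a, b in zip(u[k], w_o)) + rng.gauss(0.0, 0.01)
             for k in range(n_nodes)]
        w = atc_dlms_step(w, u, d, [0.05] * n_nodes, neighbors, c)
    return w_o, w
```

Calling `run_demo()` returns the assumed true vector together with the per-node estimates, which can then be compared entry by entry.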

3 Variable step-size diffusion LMS algorithm

The VSSLMS algorithms show marked improvement over the LMS algorithm at a low computational complexity [20–25]. Therefore, this variation is inserted in the distributed algorithm to inherit the improved performance of the VSSLMS algorithm. Different variations have their own advantages and disadvantages. A complex step-size adaptation algorithm would not be suitable because of the physical limitations of the sensor node. As shown in [23], the algorithm proposed in [20] offers the best performance together with low complexity. Therefore, it is well suited for this application. A further comparison of the performance of these variants in the present scenario confirms our choice of the VSSLMS algorithm.

The proposed algorithm simply incorporates the VSSLMS algorithm into the diffusion scheme given by (4). Using a VSSLMS algorithm, the step-size will also become a variable in this system of equations defining the proposed distributed algorithm. Then the VSSDLMS algorithm is governed by the following:

\begin{align} Ψ_{k} (i + 1) = w_{k} (i) + μ_{k} (i) u_{k}^{T} (i) (d_{k} (i) - u_{k} (i) w_{k} (i)), \\ μ_{k} (i + 1) = f [μ_{k} (i)], \\ w_{k} (i + 1) = \sum_{l \in N_{k}} c_{lk} Ψ_{l} (i + 1), \end{align}

(5)

where $f [μ_{k} (i)]$ is the step-size adaptation function, defined using the VSSLMS adaptation given in [20], for which the update equation is

\begin{array}{rl} μ_{k} (i + 1) & = α μ_{k} (i) + γ {(d_{k} (i) - u_{k} (i) w_{k} (i))}^{2} \\ & = α μ_{k} (i) + γ e_{k}^{2} (i), \end{array}

(6)

where $e_{k} (i) = d_{k} (i) - u_{k} (i) w_{k} (i)$, 0 < α < 1, and γ > 0.
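The adaptation part of (5) together with the step-size recursion (6) can be sketched as below. The function names and the numerical values are illustrative assumptions. The sketch applies (6) exactly as written, without any upper or lower limit on the step-size, since none is stated here.

```python
def vss_step_size(mu, e, alpha, gamma):
    """Step-size recursion (6): mu(i+1) = alpha * mu(i) + gamma * e(i)^2."""
    return alpha * mu + gamma * e * e


def vssdlms_adapt(w_k, mu_k, u_k, d_k, alpha, gamma):
    """Adaptation part of (5) for one node; returns (psi_k, next mu_k).

    The intermediate estimate uses the current step-size mu_k(i); the
    step-size for the next iteration is driven by the same error e_k(i).
    """
    e_k = d_k - sum(uj * wj for uj, wj in zip(u_k, w_k))
    psi_k = [wj + mu_k * uj * e_k for uj, wj in zip(u_k, w_k)]
    return psi_k, vss_step_size(mu_k, e_k, alpha, gamma)


def step_size_trajectory(mu0, errors, alpha, gamma):
    """Step-sizes produced by (6) for a given sequence of errors."""
    mus = [mu0]
    for e in errors:
        mus.append(vss_step_size(mus[-1], e, alpha, gamma))
    return mus
```

For a constant error magnitude e, the recursion settles at γe²/(1 − α), which is one way to see why the step-size shrinks as the error energy falls.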

Since nodes exchange data amongst themselves, their current update will then be affected by the weighted average of the previous estimates. Therefore, to account for this inter-node dependence, it is suitable to study the performance of the whole network. Hence, some new variables need to be introduced and the local ones are transformed into global variables as follows:

\begin{array}{rl} w (i) & = col \{w_{1} (i), \dots, w_{N} (i)\}, \\ Ψ (i) & = col \{Ψ_{1} (i), \dots, Ψ_{N} (i)\}, \\ U (i) & = diag \{u_{1} (i), \dots, u_{N} (i)\}, \\ D (i) & = diag \{μ_{1} (i) I_{M}, \dots, μ_{N} (i) I_{M}\}, \\ d (i) & = col \{d_{1} (i), \dots, d_{N} (i)\}, \\ v (i) & = col \{v_{1} (i), \dots, v_{N} (i)\} . \end{array}

From these new variables, a completely new set of equations representing the entire network is formed, starting with the relation between the measurements

d (i) = U (i) w^{(o)} + v (i),

(7)

where $w^{(o)} = Q w^{o}$ and $Q = col \{I_{M}, I_{M}, \dots, I_{M}\}$ is an MN × M matrix. Similarly, the update equations can be remodeled to represent the entire network

\begin{array}{l} Ψ (i + 1) = w (i) + D (i) U^{T} (i) (d (i) - U (i) w (i)), \\ D (i + 1) = α D (i) + γ E (i), \\ w (i + 1) = G Ψ (i + 1), \end{array}

(8)

where $G = C \otimes I_{M}$; C is an N × N weighting matrix with $\{C\}_{lk} = c_{lk}$; ⊗ is the Kronecker product; D(i) is the diagonal step-size matrix; and the error energy matrix, E(i), is given by

E (i) = diag \{e_{1}^{2} (i) I_{M}, e_{2}^{2} (i) I_{M}, \dots, e_{N}^{2} (i) I_{M}\} .

(9)
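The role of G = C ⊗ I_M in (8) can be checked numerically with a small sketch. The helper names and the 3 × 3 weighting matrix are assumptions made for illustration. The matrix is deliberately chosen symmetric, so that the result does not depend on the index order of c_lk.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def identity(m):
    return [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]


def combine_global(c, psi_nodes):
    """Combination step of (8): w = G psi with G = C (x) I_M."""
    m = len(psi_nodes[0])
    g = kron(c, identity(m))
    stacked = [x for psi in psi_nodes for x in psi]  # col{psi_1, ..., psi_N}
    w = [sum(gij * xj for gij, xj in zip(row, stacked)) for row in g]
    return [w[k * m:(k + 1) * m] for k in range(len(psi_nodes))]


def combine_local(c, psi_nodes):
    """Node-wise combination step of (5): w_k = sum_l c_lk psi_l."""
    n, m = len(psi_nodes), len(psi_nodes[0])
    return [[sum(c[l][k] * psi_nodes[l][j] for l in range(n)) for j in range(m)]
            for k in range(n)]


# Hypothetical symmetric weights; every row and column sums to 1.
C_EXAMPLE = [[0.5, 0.5, 0.0], [0.5, 0.25, 0.25], [0.0, 0.25, 0.75]]
PSI_EXAMPLE = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
```

With these values, `combine_global(C_EXAMPLE, PSI_EXAMPLE)` and `combine_local(C_EXAMPLE, PSI_EXAMPLE)` give the same per-node estimates.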

Considering the above set of equations, the mean and mean-square analyses and the steady-state behavior of the VSSDLMS algorithm are carried out as shown next. The mean analysis considers the stability of the algorithm and derives a bound for the step-size which would guarantee convergence. The mean-square analysis also derives transient and steady-state expressions for the mean square deviation (MSD) and the excess mean square error (EMSE). The MSD is defined as the error in the estimate of the unknown vector. The weight-error vector for node k is given by

{\tilde{w}}_{k} (i) = w^{o} - w_{k} (i),

(10)

then the MSD can be simply defined as

MSD = E [{∥{\tilde{w}}_{k} (i)∥}^{2}] = E [{∥w^{o} - w_{k} (i)∥}^{2}] .

(11)

Similarly, the EMSE is derived from the error equation as follows:

\begin{array}{lcr} EMSE = E [{|e_{k} (i)|}^{2}] - σ_{vk}^{2} = E [{|d_{k} (i) - u_{k} (i) w_{k} (i)|}^{2}] - σ_{vk}^{2}, \end{array}

which can be solved further to get the following expression for the EMSE:

EMSE = E [{∥{\tilde{w}}_{k} (i)∥}_{R_{k}}^{2}],

(12)

where $R_{k}$ is the autocorrelation matrix for node k.
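The step from the error equation to (12) can be spelled out as follows. This is a sketch under two assumptions: the noise $v_{k}(i)$ is zero-mean and independent of both $u_{k}(i)$ and $\tilde{w}_{k}(i)$, and the regressor $u_{k}(i)$ is independent of $\tilde{w}_{k}(i)$ (the data independence assumption invoked later in the analysis). $R_{k}$ is taken to be $E [u_{k}^{T} (i) u_{k} (i)]$.

```latex
\begin{aligned}
e_{k}(i) &= d_{k}(i) - u_{k}(i) w_{k}(i)
          = u_{k}(i) \tilde{w}_{k}(i) + v_{k}(i), \\
E[|e_{k}(i)|^{2}]
  &= E[\tilde{w}_{k}^{T}(i) u_{k}^{T}(i) u_{k}(i) \tilde{w}_{k}(i)]
     + 2 E[v_{k}(i) u_{k}(i) \tilde{w}_{k}(i)] + E[v_{k}^{2}(i)] \\
  &= E[\|\tilde{w}_{k}(i)\|_{R_{k}}^{2}] + \sigma_{v,k}^{2}.
\end{aligned}
```

The cross term vanishes because the noise is zero-mean and independent of the other two factors. Subtracting the noise variance then leaves the weighted norm in (12).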

3.1 Mean analysis

To begin with, let us introduce the global weight-error vector defined in[6, 26] as

\tilde{w} (i) = w^{(o)} - w (i) .

(13)

Since $G w^{(o)} = w^{(o)}$, by incorporating the global weight-error vector into (8), we get

\begin{array}{rl} \tilde{w} (i + 1) & = G \tilde{Ψ} (i + 1) \\ & = G \tilde{w} (i) - GD (i) U^{T} (i) (U (i) \tilde{w} (i) + v (i)) \\ & = G (I_{MN} - D (i) U^{T} (i) U (i)) \tilde{w} (i) - GD (i) U^{T} (i) v (i) . \end{array}

(14)

Here we use the assumption that the step-size matrix D(i) is independent of the regressor matrix U(i)[20]. Accordingly, for small values of γ in (6), the following relation holds true asymptotically

E [D (i) U^{T} (i) U (i)] \approx E [D (i)] E [U^{T} (i) U (i)],

(15)

where $E [U^{T} (i) U (i)] = R_{U}$ is the auto-correlation matrix of U(i). Taking the expectation on both sides of (14) then gives

E [\tilde{w} (i + 1)] = G (I_{MN} - E [D (i)] R_{U}) E [\tilde{w} (i)],

(16)

where the expectation of the second term of the right-hand side of (14) is 0 since the measurement noise is spatially uncorrelated with the regressor and zero-mean, as explained earlier.

From (16), we see that for stability in the mean we must have $|λ_{max} (GB)| < 1$, where $B = (I_{MN} - E [D (i)] R_{U})$. Since G comes from C and we know that $∥GB∥_{2} \leq ∥G∥_{2} \cdot ∥B∥_{2}$, we can safely infer that

|λ_{max} (GB)| \leq {∥C∥}_{2} . |λ_{max} (B)| .

(17)

Since there is already a condition that $∥C∥_{2} = 1$, and since $G = I_{MN}$ for noncooperative schemes, we can safely conclude that

|λ_{max} (GB)| \leq |λ_{max} (B)| .

(18)

So we can see that the cooperation mode only enhances the stability of the system (for further details, refer to[6, 7]). Since stability is also dependent on the step-size, then the algorithm will be stable in the mean if

\prod_{i = 0}^{n} (I - E [μ_{k} (i)] R_{u, k}) \to 0, as n \to \infty

(19)

which holds true if the mean of the step-size is governed by

0 < E [μ_{k} (i)] < \frac{2}{λ_{max} (R_{u, k})}, 1 \leq k \leq N,

(20)

where $λ_{max} (R_{u, k})$ is the maximum eigenvalue of the auto-correlation matrix $R_{u, k}$. This scenario differs from the fixed step-size case in that stability here is guaranteed only when the mean of the step-size lies within the limits defined by (20).
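The bound in (20) can be evaluated numerically for a given auto-correlation matrix. The sketch below uses plain power iteration to estimate the largest eigenvalue. The function names, the random starting vector, and the example matrix are assumptions, and the routine presumes that the matrix is symmetric and positive semidefinite, as an auto-correlation matrix is.

```python
import math
import random


def lambda_max(r, iterations=500, seed=0):
    """Largest eigenvalue of a symmetric PSD matrix by power iteration."""
    rng = random.Random(seed)
    # A random positive start makes an exactly orthogonal start unlikely.
    x = [rng.uniform(0.5, 1.5) for _ in r]
    for _ in range(iterations):
        y = [sum(a * b for a, b in zip(row, x)) for row in r]
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            return 0.0
        x = [v / norm for v in y]
    # Rayleigh quotient of the normalized iterate.
    rx = [sum(a * b for a, b in zip(row, x)) for row in r]
    return sum(a * b for a, b in zip(x, rx))


def is_mean_stable(mean_mu, r):
    """Check condition (20): 0 < E[mu_k] < 2 / lambda_max(R_u,k)."""
    lam = lambda_max(r)
    return lam > 0.0 and 0.0 < mean_mu < 2.0 / lam
```

For the hypothetical matrix `[[2.0, 1.0], [1.0, 2.0]]`, whose largest eigenvalue is 3, the bound on the mean step-size is 2/3.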

3.2 Mean-square analysis

In this section, the mean-square analysis of the VSSDLMS algorithm is investigated. Here, the weighted norm has been used instead of the regular norm. The motivation for using a weighted norm is that, even though the MSD does not require one, the evaluation of the EMSE does. In order to accommodate both measures, a general analysis is conducted using a weighted norm; for the MSD, where no weighting is required, the weighting matrix is simply replaced by an identity matrix [26].

We take the weighted norm of (14) and then apply the expectation operator to both of its sides. This yields the following:

\begin{array}{rl} E [{∥\tilde{w} (i + 1)∥}_{Σ}^{2}] & = E [{∥G (I_{MN} - D (i) U^{T} (i) U (i)) \tilde{w} (i) - GD (i) U^{T} (i) v (i)∥}_{Σ}^{2}] \\ & = E [{∥\tilde{w} (i)∥}_{G^{T} Σ G}^{2}] - E [{∥\tilde{w} (i)∥}_{G^{T} Σ Y (i) U (i)}^{2}] \\ & \quad - E [{∥\tilde{w} (i)∥}_{U^{T} (i) Y^{T} (i) Σ G}^{2}] \\ & \quad + E [{∥\tilde{w} (i)∥}_{U^{T} (i) Y^{T} (i) Σ Y (i) U (i)}^{2}] \\ & \quad + E [v^{T} (i) Y^{T} (i) Σ Y (i) v (i)] \\ & = E [{∥\tilde{w} (i)∥}_{\hat{Σ}}^{2}] + E [v^{T} (i) Y^{T} (i) Σ Y (i) v (i)], \end{array}

(21)

where

Y (i) = GD (i) U^{T} (i)

(22)

\begin{array}{rl} \hat{Σ} & = G^{T} Σ G - G^{T} Σ Y (i) U (i) - U^{T} (i) Y^{T} (i) Σ G \\ & \quad + U^{T} (i) Y^{T} (i) Σ Y (i) U (i) . \end{array}

(23)

Using the data independence assumption[26] and applying the expectation operator gives

\begin{array}{rl} E [\hat{Σ}] & = G^{T} Σ G - G^{T} Σ E [Y (i) U (i)] - E [U^{T} (i) Y^{T} (i)] Σ G \\ & \quad + E [U^{T} (i) Y^{T} (i) Σ Y (i) U (i)] \\ & = G^{T} Σ G - G^{T} Σ G E [D (i)] E [U^{T} (i) U (i)] \\ & \quad - E [U^{T} (i) U (i)] E [D (i)] G^{T} Σ G \\ & \quad + E [U^{T} (i) Y^{T} (i) Σ Y (i) U (i)] . \end{array}

(24)

For ease of notation, we denote $E [\hat{Σ}] = Σ^{'}$ for the remaining analysis.

3.2.1 Mean-square analysis for Gaussian data

The evaluation of the expectations in (24) is quite tedious for non-Gaussian data. Therefore, it is assumed here that the data is Gaussian in order to evaluate (24). The auto-correlation matrix can be decomposed as $R_{U} = T Λ T^{T}$, where Λ is a diagonal matrix containing the eigenvalues for the entire network and T is a matrix containing the eigenvectors corresponding to these eigenvalues. Using this eigenvalue decomposition, we define the following relations

\begin{array}{lll} \bar{w} (i) = T^{T} \tilde{w} (i), & \bar{U} (i) = U (i) T, & \bar{G} = T^{T} GT, \\ \bar{Σ} = T^{T} Σ T, & {\bar{Σ}}^{'} = T^{T} Σ^{'} T, & \bar{D} (i) = T^{T} D (i) T = D (i), \end{array}

where the input regressors are considered independent of each other at each node and the step-size matrix D(i) is block-diagonal, so it does not transform since $T^{T} T = I$. Using these relations, (21) and (24) can be rewritten, respectively, as

E [{∥\bar{w} (i + 1)∥}_{\bar{Σ}}^{2}] = E [{∥\bar{w} (i)∥}_{{\bar{Σ}}^{'}}^{2}] + E [v^{T} (i) {\bar{Y}}^{T} (i) \bar{Σ} \bar{Y} (i) v (i)],

(25)

and

\begin{array}{rl} {\bar{Σ}}^{'} & = {\bar{G}}^{T} \bar{Σ} \bar{G} - {\bar{G}}^{T} \bar{Σ} \bar{G} E [D (i)] E [{\bar{U}}^{T} (i) \bar{U} (i)] \\ & \quad - E [{\bar{U}}^{T} (i) \bar{U} (i)] E [D (i)] {\bar{G}}^{T} \bar{Σ} \bar{G} \\ & \quad + E [{\bar{U}}^{T} (i) {\bar{Y}}^{T} (i) \bar{Σ} \bar{Y} (i) \bar{U} (i)], \end{array}

(26)

where $\bar{Y} (i) = \bar{G} D (i) {\bar{U}}^{T} (i)$ .

It can be seen that $E [{\bar{U}}^{T} (i) \bar{U} (i)] = Λ$. Also, using the bvec operator [27], we have $\bar{σ} = bvec \{\bar{Σ}\}$, where the bvec operator divides the matrix into smaller blocks and then applies the vec operator to each of the smaller blocks. Now, let $R_{v} = Λ_{v} ⊙ I_{M}$ denote the block diagonal noise covariance matrix for the entire network, where ⊙ denotes the block Kronecker product [27] and $Λ_{v}$ is a diagonal noise variance matrix for the network. Hence, the second term of the right-hand side of (25) is

E [v^{T} (i) {\bar{Y}}^{T} (i) \bar{Σ} \bar{Y} (i) v (i)] = b^{T} (i) \bar{σ},

(27)

where $b (i) = bvec \{R_{v} E [D^{2} (i)] Λ\}$.
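The bvec operator described above can be sketched as follows. The ordering of the blocks is an assumption: the blocks are visited one block column at a time, from top to bottom, and each M × M block is vectorized by stacking its columns. Reference [27] should be consulted for the exact convention used in the analysis.

```python
def vec(block):
    """Stack the columns of a matrix (list of rows) into one list."""
    rows, cols = len(block), len(block[0])
    return [block[i][j] for j in range(cols) for i in range(rows)]


def bvec(sigma, m):
    """Block vectorization of an (N*m x N*m) matrix with m x m blocks.

    Assumed ordering: block columns are visited left to right, blocks within
    a block column top to bottom, and each block is passed through vec.
    """
    n = len(sigma) // m
    out = []
    for k in range(n):        # block column index
        for l in range(n):    # block row index
            block = [row[k * m:(k + 1) * m] for row in sigma[l * m:(l + 1) * m]]
            out.extend(vec(block))
    return out
```

For a 4 × 4 matrix with M = 2, the first four entries of the result come from the top-left block and the next four from the block directly below it.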

The fourth-order moment $E [{\bar{U}}^{T} (i) {\bar{Y}}^{T} (i) \bar{Σ} \bar{Y} (i) \bar{U} (i)]$ in (26) remains to be evaluated. Using the step-size independence assumption and the ⊙ operator, we get

bvec \{E [{\bar{U}}^{T} (i) {\bar{Y}}^{T} (i) \bar{Σ} \bar{Y} (i) \bar{U} (i)]\} = (E [D (i) ⊙ D (i)]) A (G^{T} ⊙ G^{T}) \bar{σ},

(28)

where we have from[6]

A = diag \{A_{1}, A_{2}, \dots, A_{N}\},

(29)

and each matrix A _k is given by

A_{k} = diag \{Λ_{1} \otimes Λ_{k}, \dots, λ_{k} λ_{k}^{T} + 2 Λ_{k} \otimes Λ_{k}, \dots, Λ_{N} \otimes Λ_{k}\},

(30)

where $Λ_{k}$ defines the diagonal eigenvalue matrix and $λ_{k}$ is the eigenvalue vector for node k.

The output of the matrix E[D(i) ⊙ D(i)] can be written as

\begin{array}{rl} {(E [D (i) ⊙ D (i)])}_{kk} & = E [diag \{μ_{k} (i) I_{M} \otimes μ_{1} (i) I_{M}, \dots, μ_{k} (i) I_{M} \otimes μ_{k} (i) I_{M}, \dots, μ_{k} (i) I_{M} \otimes μ_{N} (i) I_{M}\}] \\ & = E [diag \{μ_{k} (i) μ_{1} (i) I_{M^{2}}, \dots, μ_{k}^{2} (i) I_{M^{2}}, \dots, μ_{k} (i) μ_{N} (i) I_{M^{2}}\}] \\ & = diag \{E [μ_{k} (i)] E [μ_{1} (i)] I_{M^{2}}, \dots, E [μ_{k}^{2} (i)] I_{M^{2}}, \dots, E [μ_{k} (i)] E [μ_{N} (i)] I_{M^{2}}\} . \end{array}

(31)

Now applying the bvec operator to the weighting matrix ${\bar{Σ}}^{'}$ using the relation $bvec \{{\bar{Σ}}^{'}\} = {\bar{σ}}^{'}$, where the original ${\bar{Σ}}^{'}$ can be recovered from ${\bar{σ}}^{'}$ by reversing the bvec operation, we get

\begin{array}{rl} bvec \{{\bar{Σ}}^{'}\} = {\bar{σ}}^{'} & = [I_{M^{2} N^{2}} - (I_{MN} ⊙ Λ E [D (i)]) - (Λ E [D (i)] ⊙ I_{MN}) \\ & \quad + (E [D (i) ⊙ D (i)]) A] (G^{T} ⊙ G^{T}) \bar{σ} \\ & = F (i) \bar{σ}, \end{array}

(32)

where

\begin{array}{rl} F (i) & = [I_{M^{2} N^{2}} - (I_{MN} ⊙ Λ E [D (i)]) - (Λ E [D (i)] ⊙ I_{MN}) \\ & \quad + (E [D (i) ⊙ D (i)]) A] (G^{T} ⊙ G^{T}) . \end{array}

(33)

Then (21) will take on the following form:

E [{∥\bar{w} (i + 1)∥}_{\bar{σ}}^{2}] = E [{∥\bar{w} (i)∥}_{F (i) \bar{σ}}^{2}] + b^{T} (i) \bar{σ},

(34)

which characterizes the transient behavior of the network. Although (34) does not explicitly show the effect of the variable step-size (VSS) algorithm on the network’s performance, this effect is subsumed in the weighting matrix F(i), which varies at each iteration. In the fixed step-size LMS algorithm, by contrast, the analysis shows that this weighting matrix remains fixed for all iterations. Also, (33) clearly shows the effect of the VSS algorithm on the performance through the presence of the diagonal step-size matrix D(i).

3.2.2 Learning behavior of the proposed algorithm

In this section, the learning behavior of the VSSDLMS algorithm, which shows how the algorithm evolves with time, is evaluated. Starting with ${\bar{w}}_{0} = w^{(o)}$ and $D_{0} = μ_{0} I_{MN}$, we have for iteration (i + 1)

E (i - 1) = diag \{(E [{∥\bar{w} (i - 1)∥}_{λ}^{2}] + σ_{v, 1}^{2}) I_{M}, \dots, (E [{∥\bar{w} (i - 1)∥}_{λ}^{2}] + σ_{v, N}^{2}) I_{M}\}

(35)

E [D (i)] = α E [D (i - 1)] + γ E (i - 1)

(36)

\begin{array}{rl} E [D^{2} (i)] & = α^{2} E [D^{2} (i - 1)] + 2 α γ E [D (i - 1)] E (i - 1) \\ & \quad + γ^{2} E^{2} (i - 1) \end{array}

(37)

\begin{array}{rl} F (i) & = [I_{M^{2} N^{2}} - (I_{MN} ⊙ Λ E [D (i)]) - (Λ E [D (i)] ⊙ I_{MN}) \\ & \quad + (E [D (i) ⊙ D (i)]) A] (G^{T} ⊙ G^{T}) \end{array}

(38)

b (i) = bvec \{R_{v} E [D^{2} (i)] Λ\};

(39)

then incorporating the above relations in (34) gives

\begin{array}{rl} E [{∥\bar{w} (i + 1)∥}_{\bar{σ}}^{2}] & = E [{∥\bar{w} (i)∥}_{F (i) \bar{σ}}^{2}] + b^{T} (i) \bar{σ} \\ & = {∥{\bar{w}}^{(o)}∥}_{(\prod_{m = 0}^{i} F (m)) \bar{σ}}^{2} \\ & \quad + [\sum_{m = 0}^{i - 1} b^{T} (m) (\prod_{n = m + 1}^{i} F (n)) + b^{T} (i) I_{MN}] \bar{σ} . \end{array}

(40)

Now, after subtracting the results of iteration i from those of iteration (i + 1) and simplifying them, we get

\begin{array}{rl} E [{∥\bar{w} (i + 1)∥}_{\bar{σ}}^{2}] & = E [{∥\bar{w} (i)∥}_{\bar{σ}}^{2}] + {∥{\bar{w}}^{(o)}∥}_{F^{'} (i) (F (i) - I_{MN}) \bar{σ}}^{2} \\ & \quad + [F^{''} (i) (F (i) - I_{MN}) + b^{T} (i) I_{MN}] \bar{σ}, \end{array}

(41)

where

\begin{align} F^{'} (i) = \prod_{m = 0}^{i - 1} F (m), \end{align}

(42)

\begin{align} F^{''} (i) = \sum_{m = 0}^{i - 2} b^{T} (m) (\prod_{n = m + 1}^{i - 1} F (n)) + b^{T} (i - 1) I_{MN}, \end{align}

(43)

which can be defined iteratively as

F^{'} (i + 1) = F^{'} (i) F (i),

(44)

F^{''} (i + 1) = F^{''} (i) F (i) + b^{T} (i) I_{MN} .

(45)

In order to evaluate the MSD and EMSE, we need to define the corresponding weighting matrix for each of them. Taking $\bar{σ} = (1 / N) bvec \{I_{MN}\} = q_{η}$ and $η (i) = (1 / N) E [{∥\bar{w} (i)∥}^{2}]$ for the MSD, we get

\begin{array}{rl} η (i) & = η (i - 1) + {∥{\bar{w}}^{(o)}∥}_{F^{'} (i) (F (i) - I_{MN}) q_{η}}^{2} \\ & \quad + [F^{''} (i) (F (i) - I_{MN}) + b^{T} (i) I_{MN}] q_{η} . \end{array}

(46)

Similarly, taking $\bar{σ} = (1 / N) bvec \{Λ\} = λ_{ζ}$ and $ζ (i) = (1 / N) E [{∥\bar{w} (i)∥}_{Λ}^{2}]$ , the EMSE behavior is governed by

\begin{array}{rl} ζ (i) & = ζ (i - 1) + {∥{\bar{w}}^{(o)}∥}_{F^{'} (i) (F (i) - I_{MN}) λ_{ζ}}^{2} \\ & \quad + [F^{''} (i) (F (i) - I_{MN}) + b^{T} (i) I_{MN}] λ_{ζ} . \end{array}

(47)

The relations in (46) and (47) govern the transient behavior of the MSD and EMSE of the proposed algorithm. Because the weighting matrix F(i) changes at every iteration, its effect on the transient behavior also changes from one iteration to the next. This is not the case in the fixed step-size DLMS algorithm in [6], where the weighting matrix remains constant for all iterations. Since the weighting matrix depends on the step-size matrix, which becomes very small asymptotically, both the norm and the influence of the weighting matrix also become asymptotically small. From the above relations, it is seen that both the MSD and EMSE become very small at steady-state because the weighting matrix itself becomes small at steady-state, and these relations then depend only on the product of the weighting matrices at each iteration.

3.3 Steady-state analysis

From the second relation in (8), it is seen that the step-size for each node is independent of the data received from other nodes. Even though the connectivity matrix, G, does not permit the weighting matrix, F(i), to be evaluated separately for each node, this is not the case for the determination of the step-size at any node. Here, we define the misadjustment as the ratio of the EMSE to the minimum mean square error. The misadjustment value is used in determining the steady-state performance of the algorithm [11]. Therefore, taking the approach of [20], we first find the misadjustment, as given by

ℳ_{k} = \frac{1 - {[1 - 2 \frac{(3 - α) γ σ_{v, k}^{2}}{1 - α^{2}} tr (Λ_{k})]}^{1 / 2}}{1 + {[1 - 2 \frac{(3 - α) γ σ_{v, k}^{2}}{1 - α^{2}} tr (Λ_{k})]}^{1 / 2}} .

(48)

Then solving (36) and (37) along with (48) leads to the steady-state values for the step-size and its square for each node

μ_{ss, k} = \frac{γ σ_{v, k}^{2} (1 + ℳ_{k})}{1 - α},

(49)

μ_{ss, k}^{2} = \frac{2 α γ μ_{ss, k} σ_{v, k}^{2} (1 + ℳ_{k}) + γ^{2} σ_{v, k}^{4} {(1 + ℳ_{k})}^{2}}{1 - α^{2}} .

(50)
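The steady-state quantities in (48) to (50) are straightforward to evaluate for one node. The following sketch does so; the function names and the guard against a negative square-root argument are assumptions added for the example, and the numerical values used to exercise it are illustrative only.

```python
import math


def misadjustment(alpha, gamma, sigma_v2, trace_lambda):
    """Misadjustment of node k as given by (48)."""
    x = 2.0 * (3.0 - alpha) * gamma * sigma_v2 * trace_lambda / (1.0 - alpha ** 2)
    if x > 1.0:
        # Assumed guard: (48) has no real value for these parameters.
        raise ValueError("parameters give a negative square-root argument")
    root = math.sqrt(1.0 - x)
    return (1.0 - root) / (1.0 + root)


def steady_state_step_size(alpha, gamma, sigma_v2, trace_lambda):
    """Return (M_k, mu_ss, mu2_ss) following (48), (49), and (50)."""
    m_k = misadjustment(alpha, gamma, sigma_v2, trace_lambda)
    energy = sigma_v2 * (1.0 + m_k)          # steady-state error energy
    mu_ss = gamma * energy / (1.0 - alpha)   # (49)
    mu2_ss = (2.0 * alpha * gamma * mu_ss * energy
              + gamma ** 2 * energy ** 2) / (1.0 - alpha ** 2)  # (50)
    return m_k, mu_ss, mu2_ss
```

Substituting (49) into (50) shows that the two printed expressions satisfy μ²_ss,k = (μ_ss,k)², which the sketch can be used to confirm numerically, for example with α = 0.95, γ = 0.001, a noise variance of 0.01, and tr(Λ_k) = 4.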

Incorporating these two steady-state relations in (33) yields the steady-state weighting matrix as

\begin{array}{rl} F_{ss} & = [I_{M^{2} N^{2}} - (I_{MN} ⊙ Λ E [D_{ss}]) - (Λ E [D_{ss}] ⊙ I_{MN}) \\ & \quad + (E [D_{ss} ⊙ D_{ss}]) A] (G^{T} ⊙ G^{T}), \end{array}

(51)

where $D_{ss} = diag \{μ_{ss, k} I_{M}\}$.

Thus, the steady-state mean-square behavior is given by

E [{∥{\bar{w}}_{ss}∥}_{\bar{σ}}^{2}] = E [{∥{\bar{w}}_{ss}∥}_{F_{ss} \bar{σ}}^{2}] + b_{ss}^{T} \bar{σ},

(52)

where $b_{ss} = bvec \{R_{v} D_{ss}^{2} Λ\}$ and $D_{ss}^{2} = diag \{μ_{ss, k}^{2} I_{M}\}$. Now solving (52), we get

E [{∥{\bar{w}}_{ss}∥}_{\bar{σ}}^{2}] = b_{ss}^{T} {[I_{M^{2} N^{2}} - F_{ss}]}^{- 1} \bar{σ} .

(53)

This equation gives the steady-state performance measure for the entire network. In order to solve for the steady-state values of MSD and EMSE, we take $\bar{σ} = q_{η}$ and $\bar{σ} = λ_{ζ}$ , respectively, as in (46) and (47). This gives us the steady-state values for MSD and EMSE as follows:

\begin{align} η_{ss} = b_{ss}^{T} {[I_{M^{2} N^{2}} - F_{ss}]}^{- 1} q_{η}, \end{align}

(54)

\begin{align} ζ_{ss} = b_{ss}^{T} {[I_{M^{2} N^{2}} - F_{ss}]}^{- 1} λ_{ζ} . \end{align}

(55)

Both steady-state expressions depend on the steady-state weighting matrix, which, as explained before, becomes very small at steady state. As a result, the steady-state MSD and EMSE of the proposed algorithm are very small compared to those of the fixed step-size DLMS algorithm.
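
Expressions (54) and (55) share the form b^T [I - F]^{-1} σ. One possible way to evaluate such a quantity without forming the inverse explicitly is sketched below. It is only an illustration: the matrix, vectors, and function name are toy placeholders rather than the actual F_ss, b_ss, q_η, or λ_ζ of the paper, and the fixed-point iteration is assumed to converge, which requires the spectral radius of F to be below one.

```python
def weighted_steady_state(F, b, sigma, tol=1e-12, max_iter=100000):
    """Return b^T (I - F)^{-1} sigma by iterating s <- sigma + F s.

    F is a square matrix given as a list of rows; b and sigma are lists.
    The fixed point of the iteration satisfies (I - F) s = sigma.
    """
    n = len(b)
    s = list(sigma)
    for _ in range(max_iter):
        s_new = [sigma[r] + sum(F[r][c] * s[c] for c in range(n))
                 for r in range(n)]
        converged = max(abs(a - c) for a, c in zip(s_new, s)) < tol
        s = s_new
        if converged:
            break
    else:
        raise RuntimeError("iteration did not converge; is F stable?")
    return sum(bi * si for bi, si in zip(b, s))

# Toy 2x2 example (placeholder numbers, not derived from the paper).
F_toy = [[0.5, 0.1], [0.0, 0.4]]
value = weighted_steady_state(F_toy, [1.0, 2.0], [1.0, 1.0])
```

For matrices of realistic size (M²N² × M²N²), a dedicated linear solver such as `numpy.linalg.solve` would normally be preferable to this pure-Python loop.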

4 Numerical results

In this section, several simulation scenarios are considered and discussed to assess the performance of the proposed VSSDLMS algorithm. Simulations have been conducted for different average SNR values. The performance measure used throughout these simulations is the MSD. The length of the unknown vector is taken as M = 4. The size of the network is N = 20 nodes. The sensors are randomly placed in a unit-square area. The input regressor vector is assumed to be white Gaussian, with an auto-correlation matrix having the same variance for all nodes. Results averaged over 100 experiments are shown for an SNR of 20 dB, a normalized communication range of 0.3, and a Gaussian environment.
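
The network setup described above can be sketched as follows. This is a minimal illustration under stated assumptions: node positions are drawn uniformly over the unit square, two nodes are treated as neighbors when their Euclidean distance does not exceed the communication range, and the MSD is averaged over the nodes. The function names and the seed are hypothetical, and the combination weights used in the paper are not reproduced here.

```python
import math
import random

def random_network(num_nodes=20, comm_range=0.3, seed=0):
    """Place nodes uniformly in the unit square and list each node's neighbors."""
    rng = random.Random(seed)
    positions = [(rng.random(), rng.random()) for _ in range(num_nodes)]
    neighbors = []
    for xk, yk in positions:
        # Each node is included in its own neighborhood (distance zero).
        nbrs = [l for l, (xl, yl) in enumerate(positions)
                if math.hypot(xk - xl, yk - yl) <= comm_range]
        neighbors.append(nbrs)
    return positions, neighbors

def msd_db(estimates, w_true):
    """Network MSD in dB: node-averaged squared error of the estimates."""
    total = sum(sum((a - b) ** 2 for a, b in zip(w_k, w_true))
                for w_k in estimates)
    return 10.0 * math.log10(total / len(estimates))

positions, neighbors = random_network()
```

Averaging `msd_db` curves over independent runs (100 in the experiments above) would then give learning curves of the kind reported in the figures.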

First, the VSSLMS algorithms discussed above are compared with each other for an SNR of 20 dB, and the results of this comparison are reported in Figure 2. As can be seen from Figure 2, the algorithm of [20] performs the best and is therefore chosen for our proposed VSSDLMS algorithm.

The sensitivity analysis for the selected VSSDLMS algorithm operating in the scenario explained above is discussed next. Since the VSSDLMS algorithm depends upon the choice of α and γ, these values are varied to check their effect on the performance of the algorithm. As can be seen from Figure 3, the performance of the VSSDLMS algorithm degrades as α gets larger. On the other hand, the performance of the proposed algorithm improves as γ increases, as depicted in Figure 4. This investigation therefore allows for a proper choice of α and γ to be made.
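
To experiment with the roles of α and γ, a single-node sketch of a variable step-size LMS recursion is given below. It assumes the step-size update μ(i+1) = αμ(i) + γe²(i), which is the form consistent with the steady-state relation (49), and it adds an upper clip on μ as a safeguard. The unknown vector, noise level, initial step-size, and function name are hypothetical choices for illustration, and no diffusion (combination) step is included.

```python
import random

def vss_lms(alpha, gamma, mu0=0.1, mu_max=0.1, num_iter=2000,
            noise_std=0.1, seed=0):
    """Run a single-node variable step-size LMS and return (final MSD, final mu)."""
    rng = random.Random(seed)
    w_true = [0.5, -0.3, 0.2, 0.1]    # hypothetical unknown vector, M = 4
    w = [0.0] * len(w_true)
    mu = mu0
    for _ in range(num_iter):
        u = [rng.gauss(0.0, 1.0) for _ in w_true]        # white Gaussian regressor
        d = sum(a * b for a, b in zip(u, w_true)) + rng.gauss(0.0, noise_std)
        e = d - sum(a * b for a, b in zip(u, w))           # a priori error
        w = [wi + mu * e * ui for wi, ui in zip(w, u)]     # LMS weight update
        mu = min(alpha * mu + gamma * e * e, mu_max)       # step-size update, clipped
    msd = sum((a - b) ** 2 for a, b in zip(w, w_true))
    return msd, mu

# Parameter pairs of the kind examined in the sensitivity analysis.
results = {(a, g): vss_lms(a, g) for a in (0.95, 0.995) for g in (0.001,)}
```

A single run like this is noisy; any comparison across α and γ would need averaging over many independent runs before drawing conclusions.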

In order to show the importance of varying the step-size, two experiments were run separately. In the first experiment, the DLMS algorithm was simulated with a high step-size, while the proposed algorithm was run with both a low and a high initial step-size. In the second experiment, the step-size of the DLMS algorithm was changed to a low value. As can be seen from Figures 5 and 6, the proposed algorithm converges to the same error floor in both scenarios. In contrast, the DLMS algorithm converges fast but to a higher error floor in Figure 5. With the low step-size, the DLMS algorithm converges to the same error floor as the proposed algorithm, but very slowly. Thus, the proposed algorithm provides fast convergence as well as better performance.

Next, the proposed algorithm is compared with some key existing algorithms: the no-cooperation LMS, the distributed LMS [10], the DLMS [6], the DLMS with adaptive combiners (DLMSAC) [8], and the DRLS [14]. Figure 7 reports the performance of these algorithms. As can be seen from this figure, the performance of the proposed VSSDLMS algorithm is second only to that of the DRLS algorithm, and the gap in performance is narrow. These results show that, when compared with other algorithms of similar complexity, the VSSDLMS algorithm exhibits a significant improvement in performance. A similar trend is observed in the steady-state behavior of the proposed VSSDLMS algorithm, as shown in Figure 8. As expected, the DRLS algorithm performs better than all other algorithms included in this comparison, but the proposed algorithm remains second only to the DRLS algorithm at steady state. Also, the diffusion process can be viewed as an efficient and indirect way of adjusting the step-size in all neighboring nodes, which keeps the steady-state MSD nearly the same for all nodes in all cases.

Next, the results predicted by the theoretical analysis of the proposed algorithm are compared with the simulation results in Figures 9 and 10. As can be seen from these figures, the simulation results corroborate the theoretical findings very well. This comparison is done for a network of 15 nodes with M = 2 and a communication range of 0.35. Two values of α, namely α = 0.995 and α = 0.95, are chosen, whereas γ = 0.001.

An important aspect of working with sensor nodes is the possibility of a node switching off. In such a case, the network may be required to adapt itself. The diffusion scheme is robust to such a change; this scenario has been considered here, and the results are shown in Figure 11. A network of 50 nodes is chosen so that enough nodes can be switched off to study the performance of the proposed algorithm in this scenario. Two cases are considered. In the first case, 15 nodes are turned off after 50 iterations and a further 15 nodes are switched off after 300 iterations. In the second case, 15 nodes are switched off after 250 iterations and the next 15 nodes are switched off after 750 iterations. In both cases, the performance degrades initially but recovers to give a performance similar to the case where no nodes are switched off. The difference between the best and worst case scenarios is only about 2 dB. For the DLMS algorithm, however, the earlier the nodes are switched off, the worse the performance. Its difference between the best and worst case scenarios is almost 9 dB, which further demonstrates the robustness of the proposed algorithm.

Finally, the comparison between the theoretical and simulated steady-state values for the MSD and EMSE for two different input regressor auto-correlation matrices is given in Table 1. As can be seen from this table, a close agreement between theory and simulations has been obtained.

Table 1 Steady-state values for MSD and EMSE


5 Conclusions

The proposed variable step-size diffusion LMS (VSSDLMS) algorithm has been discussed in detail. Several popular VSSLMS algorithms were investigated, and the one providing the best trade-off between complexity and performance was chosen as the basis of the proposed VSSDLMS algorithm. Complete convergence and steady-state analyses were carried out to assess the performance of the proposed algorithm. Simulations were carried out under different scenarios and with different SNR values. A sensitivity analysis was performed through extensive simulations and, based on its results, the values for the parameters of the VSSLMS algorithm were chosen. The proposed algorithm was compared with existing algorithms of similar complexity and was shown to perform significantly better. Theoretical results were also compared with simulation results, and the two were found to be in close agreement. The proposed algorithm was then tested under different scenarios to assess its robustness. Finally, a steady-state comparison between theoretical and simulated results was carried out and tabulated, and these results were also found to be in close agreement.

References

[1] Estrin D, Girod L, Pottie G, Srivastava M: Instrumenting the world with wireless sensor networks. In Proceedings of the ICASSP. Salt Lake City; 7–11 May 2001:2033-2036.
[2] Olfati-Saber R, Fax JA, Murray RM: Consensus and cooperation in networked multi-agent systems. Proc. IEEE 2007, 95(1):215-233.
[3] Lopes CG, Sayed AH: Incremental adaptive strategies over distributed networks. IEEE Trans. Signal Process 2007, 55(8):4064-4077.
[4] Ram SS, Nedic A, Veeravalli VV: Stochastic incremental gradient descent for estimation in sensor networks. In Proceedings of the 41st Asilomar Conference on Signals, Systems and Computers. Pacific Grove; 4–7 November 2007:582-586.
[5] Rastegarnia A, Tinati MA, Khalili A: Performance analysis of quantized incremental LMS algorithm for distributed adaptive estimation. Signal Process 2010, 90(8):2621-2627. 10.1016/j.sigpro.2010.02.019
[6] Lopes CG, Sayed AH: Diffusion least-mean squares over adaptive networks: formulation and performance analysis. IEEE Trans. Signal Process 2008, 56(7):3122-3136.
[7] Cattivelli F, Sayed AH: Diffusion LMS strategies for distributed estimation. IEEE Trans. Signal Process 2010, 58(3):1035-1048.
[8] Takahashi N, Yamada I, Sayed AH: Diffusion least mean squares with adaptive combiners: formulation and performance analysis. IEEE Trans. Signal Process 2010, 58(9):4795-4810.
[9] Schizas ID, Mateos G, Giannakis GB: Distributed LMS for consensus-based in-network adaptive processing. IEEE Trans. Signal Process 2009, 57(6):2365-2382.
[10] Mateos G, Schizas ID, Giannakis GB: Performance analysis of the consensus-based distributed LMS algorithm. EURASIP J. Adv. Signal Process 2009. doi: 10.1155/2009/981030
[11] Haykin S: Adaptive Filter Theory. Englewood Cliffs: Prentice-Hall; 2000.
[12] Papadimitriou CH: Computational Complexity. Reading, MA: Addison-Wesley; 1993.
[13] Bin Saeed MO, Zerguine A, Zummo SA: Variable step size least mean square algorithms over adaptive networks. In Proceedings of ISSPA 2010. Kuala Lumpur; 10–13 May 2010:381-384.
[14] Cattivelli FS, Lopes CG, Sayed AH: Diffusion recursive least-squares for distributed estimation over adaptive networks. IEEE Trans. Signal Process 2008, 56(5):1865-1877.
[15] Mateos G, Schizas ID, Giannakis GB: Distributed recursive least-squares for consensus-based in-network adaptive estimation. IEEE Trans. Signal Process 2009, 57(11):4583-4588.
[16] Mateos G, Giannakis GB: Distributed recursive least-squares: stability and performance analysis. IEEE Trans. Signal Process 2012, 60(7):3740-3754.
[17] Cattivelli F, Sayed AH: Diffusion strategies for distributed Kalman filtering and smoothing. IEEE Trans. Automatic Control 2010, 55(9):2069-2084.
[18] Olfati-Saber R: Distributed Kalman filtering for sensor networks. In Proceedings of IEEE CDC 2007. New Orleans; 12–14 December 2007:5492-5498.
[19] Hlinka O, Hlawatsch F, Djuric PM: Distributed particle filtering in agent networks: a survey, classification, and comparison. IEEE Signal Proc. Mag. 2013, 30(1):61-81.
[20] Kwong RH, Johnston EW: A variable step size LMS algorithm. IEEE Trans. Signal Process 1992, 40(7):1633-1642. 10.1109/78.143435
[21] Aboulnasr T, Mayyas K: A robust variable step-size LMS-type algorithm: analysis and simulations. IEEE Trans. Signal Process 1997, 45(3):631-639. 10.1109/78.558478
[22] Costa MH, Bermudez JCM: A robust variable step-size algorithm for LMS adaptive filters. In Proceedings of the ICASSP. Toulouse; 14–19 May 2006:93-96.
[23] Lopes CG, Bermudez JCM: Evaluation and design of variable step size adaptive algorithms. In Proceedings of the ICASSP. Salt Lake City; 7–11 May 2001:3845-3848.
[24] Mathews VJ, Xie Z: A stochastic gradient adaptive filter with gradient adaptive step size. IEEE Trans. Signal Process 1993, 41(6):2075-2087. 10.1109/78.218137
[25] Sulyman AI, Zerguine A: Convergence and steady-state analysis of a variable step-size NLMS algorithm. Signal Process 2003, 83(6):1255-1273. 10.1016/S0165-1684(03)00044-6
[26] Sayed AH: Fundamentals of Adaptive Filtering. New York: Wiley; 2003.
[27] Koning RH, Neudecker H, Wansbeek T: Block Kronecker products and the vecb operator. Economics Department, Institute of Economics Research, University of Groningen, Groningen, The Netherlands, Research Memo no. 351, 1990.


Acknowledgements

This research work is funded by King Fahd University of Petroleum and Minerals (KFUPM) under research grants FT100012 and RG1112-1&2.

Author information

Authors and Affiliations

Electrical Engineering Department, King Fahd University of Petroleum and Minerals, Dhahran, 31261, Saudi Arabia
Muhammad O Bin Saeed, Azzedine Zerguine & Salam A Zummo


Corresponding author

Correspondence to Azzedine Zerguine.

Additional information

Competing interests

The authors declare that they have no competing interests.


Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Bin Saeed, M.O., Zerguine, A. & Zummo, S.A. A variable step-size strategy for distributed estimation over adaptive networks. EURASIP J. Adv. Signal Process. 2013, 135 (2013). https://doi.org/10.1186/1687-6180-2013-135


Received: 15 July 2013
Accepted: 26 July 2013
Published: 06 August 2013
DOI: https://doi.org/10.1186/1687-6180-2013-135
