The VSSLMS algorithms show marked improvement over the LMS algorithm at low computational complexity [20–25]. Therefore, this variation is incorporated into the distributed algorithm so that it inherits the improved performance of the VSSLMS algorithm. Different variants have their own advantages and disadvantages; a complex step-size adaptation algorithm would not be suitable because of the physical limitations of a sensor node. As shown in [23], the algorithm proposed in [20] gives the best performance while having low complexity, so it is well suited for this application. A further comparison of the performance of these variants in the present scenario confirms our choice of the VSSLMS algorithm.

The proposed algorithm simply incorporates the VSSLMS algorithm into the diffusion scheme given by (4). With a VSSLMS algorithm, the step-size also becomes a variable in the system of equations defining the proposed distributed algorithm. The VSSDLMS algorithm is then governed by the following:

\begin{array}{l}{\mathbf{\Psi}}_{k}(i+1)={\mathbf{w}}_{k}(i)+{\mu}_{k}(i){\mathbf{u}}_{k}^{\mathrm{T}}(i)\left({d}_{k}\left(i\right)-{\mathbf{u}}_{k}(i){\mathbf{w}}_{k}(i)\right),\\ {\mu}_{k}(i+1)=f\left[{\mu}_{k}(i)\right],\\ {\mathbf{w}}_{k}(i+1)=\sum _{l\in {\mathcal{N}}_{k}}{c}_{\mathit{\text{lk}}}{\mathbf{\Psi}}_{l}(i+1),\end{array}

(5)

where *f*[*μ*_{k}(*i*)] is the step-size adaptation function, defined here using the VSSLMS adaptation given in [20], whose update equation is

\begin{array}{ll}\phantom{\rule{5pt}{0ex}}{\mu}_{k}(i+1)& =\alpha {\mu}_{k}(i)+\gamma {\left({d}_{k}\left(i\right)-{\mathbf{u}}_{k}(i){\mathbf{w}}_{k}(i)\right)}^{2}\\ =\alpha {\mu}_{k}(i)+\gamma {e}_{k}^{2}\left(i\right),\end{array}

(6)

where *e*_{k}(*i*) = *d*_{k}(*i*) − **u**_{k}(*i*)**w**_{k}(*i*), 0 < *α* < 1, and *γ* > 0.
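To make the recursions concrete, the following is a minimal numpy sketch of one VSSDLMS iteration over all nodes, implementing (5) and (6) directly. The function name `vssdlms_step` and the dense array layout are illustrative conventions, not part of the original work.

```python
import numpy as np

def vssdlms_step(w, mu, u, d, C, alpha, gamma):
    """One VSSDLMS iteration over all N nodes, following (5)-(6).

    w  : (N, M) current estimates w_k(i)
    mu : (N,)   current step-sizes mu_k(i)
    u  : (N, M) regressor rows u_k(i)
    d  : (N,)   desired responses d_k(i)
    C  : (N, N) combination weights c_lk (column k sums to 1)
    """
    e = d - np.einsum("km,km->k", u, w)      # e_k(i) = d_k(i) - u_k(i) w_k(i)
    psi = w + (mu * e)[:, None] * u          # adaptation step of (5)
    mu_next = alpha * mu + gamma * e ** 2    # VSS update (6)
    w_next = C.T @ psi                       # combination: sum_l c_lk psi_l(i+1)
    return w_next, mu_next, e
```

A non-cooperative scheme is recovered by passing `C = np.eye(N)`, in which case the combination step leaves each node's intermediate estimate unchanged.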

Since nodes exchange data amongst themselves, each node's current update is affected by the weighted average of its neighbors' previous estimates. Therefore, to account for this inter-node dependence, it is appropriate to study the performance of the network as a whole. Hence, some new variables need to be introduced, and the local variables are transformed into global ones as follows:

\begin{array}{ll}\phantom{\rule{5pt}{0ex}}\mathbf{w}(i)& =\text{col}\left\{{\mathbf{w}}_{1}(i),\dots ,{\mathbf{w}}_{N}(i)\right\},\\ \phantom{\rule{5pt}{0ex}}\mathbf{\Psi}(i)& =\text{col}\left\{{\mathbf{\Psi}}_{1}(i),\dots ,{\mathbf{\Psi}}_{N}(i)\right\},\\ \phantom{\rule{5pt}{0ex}}\mathbf{U}(i)& =\text{diag}\left\{{\mathbf{u}}_{1}(i),\dots ,{\mathbf{u}}_{N}(i)\right\},\\ \phantom{\rule{5pt}{0ex}}\mathbf{D}(i)& =\text{diag}\left\{{\mu}_{1}(i){\mathbf{I}}_{M},\dots ,{\mu}_{N}(i){\mathbf{I}}_{M}\right\},\\ \phantom{\rule{7pt}{0ex}}\mathbf{d}(i)& =\text{col}\left\{{d}_{1}(i),\dots ,{d}_{N}(i)\right\},\\ \phantom{\rule{7pt}{0ex}}\mathbf{v}(i)& =\text{col}\left\{{v}_{1}(i),\dots ,{v}_{N}(i)\right\}.\end{array}

From these new variables, a completely new set of equations representing the entire network is formed, starting with the relation between the measurements

\mathbf{d}(i)=\mathbf{U}(i){\mathbf{w}}^{\left(o\right)}+\mathbf{v}(i),

(7)

where **w**^{(o)} = **Q** **w**^{o}, and **Q** = col{**I**_{M}, **I**_{M}, …, **I**_{M}} is an *MN* × *M* matrix. Similarly, the update equations can be remodeled to represent the entire network

\begin{array}{l}\mathbf{\Psi}(i+1)=\mathbf{w}(i)+\mathbf{D}(i){\mathbf{U}}^{\mathrm{T}}(i)\left(\mathbf{d}(i)-\mathbf{U}(i)\mathbf{w}(i)\right),\\ \mathbf{D}(i+1)=\alpha \mathbf{D}(i)+\gamma \mathbf{E}(i),\\ \mathbf{w}(i+1)=\mathbf{G}\mathbf{\Psi}(i+1),\end{array}

(8)

where **G** = **C** ⊗ **I**_{M}; **C** is an *N* × *N* weighting matrix with {**C**}_{lk} = *c*_{lk}; ⊗ denotes the Kronecker product; **D**(*i*) is the diagonal step-size matrix; and the error energy matrix **E**(*i*) is given by

\mathbf{E}(i)=\text{diag}\left\{{e}_{1}^{2}\left(i\right){\mathbf{I}}_{M},{e}_{2}^{2}\left(i\right){\mathbf{I}}_{M},\dots ,{e}_{N}^{2}\left(i\right){\mathbf{I}}_{M}\right\}.

(9)
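As a sanity check on the global notation, the sketch below (with a hypothetical helper `build_globals`) assembles the network-wide quantities from per-node data following the col/diag definitions above; with **Q** = col{**I**_{M}, …, **I**_{M}}, the measurement relation (7) then reduces to a single matrix product.

```python
import numpy as np

def build_globals(u_list, mu_list, w_list):
    """Stack per-node quantities into the network-wide variables.

    u_list : list of N regressor rows, each of shape (M,)
    mu_list: list of N scalar step-sizes mu_k
    w_list : list of N estimates, each of shape (M,)
    Returns w (MN,), U (N, MN) block-diagonal, D (MN, MN) diagonal.
    """
    N = len(u_list)
    M = u_list[0].size
    w = np.concatenate(w_list)                 # w = col{w_1, ..., w_N}
    U = np.zeros((N, M * N))                   # U = diag{u_1, ..., u_N}
    for k, u in enumerate(u_list):
        U[k, k * M:(k + 1) * M] = u
    D = np.kron(np.diag(mu_list), np.eye(M))   # D = diag{mu_k I_M}
    return w, U, D
```

With `Q = np.tile(np.eye(M), (N, 1))`, evaluating `U @ (Q @ w_o)` reproduces the noiseless part of (7), i.e., each node's measurement **u**_{k}**w**^{o}.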

Considering the above set of equations, the mean and mean-square analyses and the steady-state behavior of the VSSDLMS algorithm are carried out next. The mean analysis considers the stability of the algorithm and derives a bound on the step-size that guarantees convergence. The mean-square analysis derives transient and steady-state expressions for the mean-square deviation (MSD) and the excess mean-square error (EMSE). The MSD measures the error in the estimate of the unknown vector. The weight-error vector for node *k* is given by

{\stackrel{~}{\mathbf{w}}}_{k}(i)={\mathbf{w}}^{o}-{\mathbf{w}}_{k}(i),

(10)

then the MSD can be simply defined as

\text{MSD}=\mathrm{E}\left[{\u2225{\stackrel{~}{\mathbf{w}}}_{k}(i)\u2225}^{2}\right]=\mathrm{E}\left[{\u2225{\mathbf{w}}^{o}-{\mathbf{w}}_{k}(i)\u2225}^{2}\right].

(11)

Similarly, the EMSE is derived from the error equation as follows:

\begin{array}{l}\text{EMSE}\phantom{\rule{0.3em}{0ex}}=\phantom{\rule{0.3em}{0ex}}\mathrm{E}\left[{\left|{e}_{k}(i)\right|}^{2}\right]-{\sigma}_{\mathit{\text{vk}}}^{2}\phantom{\rule{0.3em}{0ex}}=\phantom{\rule{0.3em}{0ex}}\mathrm{E}\left[{\left|{d}_{k}(i)-{\mathbf{u}}_{k}(i){\mathbf{w}}_{k}(i)\right|}^{2}\right]\phantom{\rule{0.3em}{0ex}}-\phantom{\rule{0.3em}{0ex}}{\sigma}_{\mathit{\text{vk}}}^{2},\end{array}

which can be solved further to get the following expression for the EMSE:

\text{EMSE}=\mathrm{E}\left[{\u2225{\stackrel{~}{\mathbf{w}}}_{k}(i)\u2225}_{{\mathbf{R}}_{k}}^{2}\right],

(12)

where **R**_{k} is the autocorrelation matrix for node *k*.

### 3.1 Mean analysis

To begin with, let us introduce the global weight-error vector, defined in [6, 26] as

\stackrel{~}{\mathbf{w}}(i)={\mathbf{w}}^{\left(o\right)}-\mathbf{w}(i).

(13)

Since \mathbf{G}{\mathbf{w}}^{\left(o\right)}={\mathbf{w}}^{\left(o\right)}, incorporating the global weight-error vector into (8) gives

\begin{array}{ll}\phantom{\rule{5pt}{0ex}}\stackrel{~}{\mathbf{w}}(i+1)& =\mathbf{G}\stackrel{~}{\mathbf{\Psi}}(i+1)\\ =\mathbf{G}\stackrel{~}{\mathbf{w}}(i)-\mathbf{GD}(i){\mathbf{U}}^{\mathrm{T}}(i)\left(\mathbf{U}(i)\stackrel{~}{\mathbf{w}}(i)+\mathbf{v}(i)\right)\\ =\mathbf{G}\left({\mathbf{I}}_{\mathit{\text{MN}}}-\mathbf{D}(i){\mathbf{U}}^{\mathrm{T}}(i)\mathbf{U}(i)\right)\stackrel{~}{\mathbf{w}}(i)-\mathbf{GD}(i){\mathbf{U}}^{\mathrm{T}}(i)\mathbf{v}(i).\end{array}

(14)

Here we use the assumption that the step-size matrix **D**(*i*) is independent of the regressor matrix **U**(*i*) [20]. Accordingly, for small values of *γ* in (6), the following relation holds asymptotically

\mathrm{E}\left[\mathbf{D}(i){\mathbf{U}}^{\mathrm{T}}(i)\mathbf{U}(i)\right]\approx \mathrm{E}\left[\mathbf{D}(i)\right]\mathrm{E}\left[{\mathbf{U}}^{\mathrm{T}}(i)\mathbf{U}(i)\right],

(15)

where E[**U**^{T}(*i*)**U**(*i*)] = **R**_{U} is the autocorrelation matrix of **U**(*i*). Taking the expectation on both sides of (14) then gives

\mathrm{E}\left[\stackrel{~}{\mathbf{w}}(i+1)\right]=\mathbf{G}\left({\mathbf{I}}_{\mathit{\text{MN}}}-\mathrm{E}\left[\mathbf{D}(i)\right]{\mathbf{R}}_{\mathbf{U}}\right)\mathrm{E}\left[\stackrel{~}{\mathbf{w}}(i)\right],

(16)

where the expectation of the second term on the right-hand side of (14) is zero since the measurement noise is zero-mean and spatially uncorrelated with the regressors, as explained earlier.

From (16), we see that for stability in the mean we must have |*λ*_{max}(**G** **B**)| < 1, where **B** = **I**_{MN} − E[**D**(*i*)]**R**_{U}. Since **G** is derived from **C**, and ∥**G** **B**∥_{2} ≤ ∥**G**∥_{2}·∥**B**∥_{2}, we can safely infer that

\left|{\lambda}_{max}(\mathbf{GB})\right|\le {\u2225\mathbf{C}\u2225}_{2}.\left|{\lambda}_{max}(\mathbf{B})\right|.

(17)

Since there is already the condition that ∥**C**∥_{2} = 1, and for noncooperative schemes we have **G** = **I**_{MN}, we can safely conclude that

\left|{\lambda}_{max}(\mathbf{GB})\right|\le \left|{\lambda}_{max}(\mathbf{B})\right|.

(18)

So we can see that the cooperation mode only enhances the stability of the system (for further details, refer to [6, 7]). Since stability also depends on the step-size, the algorithm will be stable in the mean if

\prod _{i=0}^{n}\left(\mathbf{I}-\mathrm{E}\left[{\mu}_{k}(i)\right]{\mathbf{R}}_{\mathbf{u},k}\right)\to 0,\mathit{\text{as}}\phantom{\rule{0.3em}{0ex}}n\to \infty

(19)

which holds true if the mean of the step-size is governed by

0<\mathrm{E}\left[{\mu}_{k}(i)\right]<\frac{2}{{\lambda}_{max}\left({\mathbf{R}}_{\mathbf{u},k}\right)},1\le k\le N,

(20)

where *λ*_{max}(**R**_{u,k}) is the maximum eigenvalue of the autocorrelation matrix **R**_{u,k}. This scenario differs from that of the fixed step-size: here the system is stable in the mean only when the mean of the step-size remains within the limits defined by (20).

### 3.2 Mean-square analysis

In this section, the mean-square analysis of the VSSDLMS algorithm is investigated using a weighted norm instead of the regular norm. The motivation is that, although the MSD does not require a weighted norm, the evaluation of the EMSE does. To accommodate both measures, a general analysis is conducted using a weighted norm; for the MSD, where no weighting is required, the weighting matrix is simply replaced by the identity matrix [26].

We take the weighted norm of (14) and then apply the expectation operator to both of its sides. This yields the following:

\begin{array}{ll}\mathrm{E}\left[{\u2225\stackrel{~}{\mathbf{w}}(i+1)\u2225}_{\mathbf{\Sigma}}^{2}\right]&=\mathrm{E}\left[{\u2225\mathbf{G}\left({\mathbf{I}}_{\mathit{\text{MN}}}-\mathbf{D}(i){\mathbf{U}}^{\mathrm{T}}(i)\mathbf{U}(i)\right)\stackrel{~}{\mathbf{w}}(i)-\mathbf{G}\mathbf{D}(i){\mathbf{U}}^{\mathrm{T}}(i)\mathbf{v}(i)\u2225}_{\mathbf{\Sigma}}^{2}\right]\\ &=\mathrm{E}\left[{\u2225\stackrel{~}{\mathbf{w}}(i)\u2225}_{\widehat{\mathbf{\Sigma}}}^{2}\right]+\mathrm{E}\left[{\mathbf{v}}^{\mathrm{T}}(i){\mathbf{Y}}^{\mathrm{T}}(i)\mathbf{\Sigma}\mathbf{Y}(i)\mathbf{v}(i)\right],\end{array}

(21)

where

\mathbf{Y}(i)=\mathbf{GD}(i){\mathbf{U}}^{\mathrm{T}}(i)

(22)

\begin{array}{ll}\phantom{\rule{5pt}{0ex}}\widehat{\mathbf{\Sigma}}& ={\mathbf{G}}^{\mathrm{T}}\mathbf{\Sigma}\mathbf{G}-{\mathbf{G}}^{\mathrm{T}}\mathbf{\Sigma}\mathbf{Y}(i)\mathbf{U}(i)-{\mathbf{U}}^{\mathrm{T}}(i){\mathbf{Y}}^{\mathrm{T}}(i)\mathbf{\Sigma}\mathbf{G}\\ \phantom{\rule{1em}{0ex}}+\phantom{\rule{0.3em}{0ex}}{\mathbf{U}}^{\mathrm{T}}(i){\mathbf{Y}}^{\mathrm{T}}(i)\mathbf{\Sigma}\mathbf{Y}(i)\mathbf{U}(i).\end{array}

(23)

Using the data independence assumption[26] and applying the expectation operator gives

\begin{array}{ll}\phantom{\rule{6pt}{0ex}}\mathrm{E}\left[\widehat{\mathbf{\Sigma}}\right]& ={\mathbf{G}}^{\mathrm{T}}\mathbf{\Sigma}\mathbf{G}\phantom{\rule{0.3em}{0ex}}-\phantom{\rule{0.3em}{0ex}}{\mathbf{G}}^{\mathrm{T}}\mathbf{\Sigma}\mathrm{E}\left[\mathbf{Y}(i)\mathbf{U}(i)\right]\phantom{\rule{0.3em}{0ex}}-\phantom{\rule{0.3em}{0ex}}\mathrm{E}\left[{\mathbf{U}}^{\mathrm{T}}(i){\mathbf{Y}}^{\mathrm{T}}(i)\right]\mathbf{\Sigma}\mathbf{G}\\ \phantom{\rule{1em}{0ex}}+\phantom{\rule{0.3em}{0ex}}\mathrm{E}\left[{\mathbf{U}}^{\mathrm{T}}(i){\mathbf{Y}}^{\mathrm{T}}(i)\mathbf{\Sigma}\mathbf{Y}(i)\mathbf{U}(i)\right]\\ ={\mathbf{G}}^{\mathrm{T}}\mathbf{\Sigma}\mathbf{G}-{\mathbf{G}}^{\mathrm{T}}\mathbf{\Sigma}\mathbf{G}\mathrm{E}\left[\mathbf{D}(i)\right]\mathrm{E}\left[{\mathbf{U}}^{\mathrm{T}}(i)\mathbf{U}(i)\right]\\ \phantom{\rule{1em}{0ex}}-\phantom{\rule{0.3em}{0ex}}\mathrm{E}\left[{\mathbf{U}}^{\mathrm{T}}(i)\mathbf{U}(i)\right]\mathrm{E}\left[\mathbf{D}(i)\right]{\mathbf{G}}^{\mathrm{T}}\mathbf{\Sigma}\mathbf{G}\\ \phantom{\rule{1em}{0ex}}+\phantom{\rule{0.3em}{0ex}}\mathrm{E}\left[{\mathbf{U}}^{\mathrm{T}}(i){\mathbf{Y}}^{\mathrm{T}}(i)\mathbf{\Sigma}\mathbf{Y}(i)\mathbf{U}(i)\right].\end{array}

(24)

For ease of notation, we denote \mathrm{E}\left[\widehat{\mathbf{\Sigma}}\right]={\mathbf{\Sigma}}^{\prime} for the remaining analysis.

#### 3.2.1 Mean-square analysis for Gaussian data

The evaluation of the expectations in (24) is quite tedious for non-Gaussian data. Therefore, it is assumed here that the data is Gaussian in order to evaluate (24). The autocorrelation matrix can be decomposed as **R**_{U} = **T** **Λ** **T**^{T}, where **Λ** is a diagonal matrix containing the eigenvalues for the entire network and **T** is a matrix containing the corresponding eigenvectors. Using this eigenvalue decomposition, we define the following relations

\begin{array}{ll}\phantom{\rule{5pt}{0ex}}\stackrel{\u0304}{\mathbf{w}}(i)& ={\mathbf{T}}^{\mathrm{T}}\stackrel{~}{\mathbf{w}}(i)\phantom{\rule{1em}{0ex}}\stackrel{\u0304}{\mathbf{U}}(i)=\mathbf{U}(i)\mathbf{T}\phantom{\rule{1em}{0ex}}\stackrel{\u0304}{\mathbf{G}}={\mathbf{T}}^{\mathrm{T}}\mathbf{GT}\\ \phantom{\rule{13.5pt}{0ex}}\stackrel{\u0304}{\mathbf{\Sigma}}& ={\mathbf{T}}^{\mathrm{T}}\mathbf{\Sigma}\mathbf{T}\phantom{\rule{1em}{0ex}}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}{\stackrel{\u0304}{\mathbf{\Sigma}}}^{\prime}={\mathbf{T}}^{\mathrm{T}}{\mathbf{\Sigma}}^{\prime}\mathbf{T}\phantom{\rule{1em}{0ex}}\phantom{\rule{0.3em}{0ex}}\stackrel{\u0304}{\mathbf{D}}(i)={\mathbf{T}}^{\mathrm{T}}\mathbf{D}(i)\mathbf{T}\phantom{\rule{0.3em}{0ex}}=\phantom{\rule{0.3em}{0ex}}\mathbf{D}(i),\end{array}

where the input regressors at different nodes are considered independent of each other, and the step-size matrix **D**(*i*) is block-diagonal with scaled identity blocks, so it is unchanged by the transformation since **T**^{T}**T** = **I**. Using these relations, (21) and (24) can be rewritten, respectively, as

\mathrm{E}\phantom{\rule{0.3em}{0ex}}\left[{\u2225\stackrel{\u0304}{\mathbf{w}}(i+1)\u2225}_{\stackrel{\u0304}{\mathbf{\Sigma}}}^{2}\right]\phantom{\rule{0.3em}{0ex}}=\phantom{\rule{0.3em}{0ex}}\mathrm{E}\left[{\u2225\stackrel{\u0304}{\mathbf{w}}(i)\u2225}_{{\stackrel{\u0304}{\mathbf{\Sigma}}}^{\prime}}^{2}\right]+\mathrm{E}\left[{\mathbf{v}}^{\mathrm{T}}(i){\stackrel{\u0304}{\mathbf{Y}}}^{\mathrm{T}}(i)\stackrel{\u0304}{\mathbf{\Sigma}}\stackrel{\u0304}{\mathbf{Y}}(i)\mathbf{v}(i)\right]\phantom{\rule{0.3em}{0ex}},

(25)

and

\begin{array}{ll}\phantom{\rule{7pt}{0ex}}{\stackrel{\u0304}{\mathbf{\Sigma}}}^{\prime}& ={\stackrel{\u0304}{\mathbf{G}}}^{\mathrm{T}}\stackrel{\u0304}{\mathbf{\Sigma}}\stackrel{\u0304}{\mathbf{G}}-{\stackrel{\u0304}{\mathbf{G}}}^{\mathrm{T}}\stackrel{\u0304}{\mathbf{\Sigma}}\stackrel{\u0304}{\mathbf{G}}\mathrm{E}\left[\mathbf{D}(i)\right]\mathrm{E}\left[{\stackrel{\u0304}{\mathbf{U}}}^{\mathrm{T}}(i)\stackrel{\u0304}{\mathbf{U}}(i)\right]\\ \phantom{\rule{1em}{0ex}}-\phantom{\rule{0.3em}{0ex}}\mathrm{E}\left[{\stackrel{\u0304}{\mathbf{U}}}^{\mathrm{T}}(i)\stackrel{\u0304}{\mathbf{U}}(i)\right]\mathrm{E}\left[\mathbf{D}(i)\right]{\stackrel{\u0304}{\mathbf{G}}}^{\mathrm{T}}\stackrel{\u0304}{\mathbf{\Sigma}}\stackrel{\u0304}{\mathbf{G}}\\ \phantom{\rule{1em}{0ex}}+\phantom{\rule{0.3em}{0ex}}\mathrm{E}\left[{\stackrel{\u0304}{\mathbf{U}}}^{\mathrm{T}}(i)\stackrel{\u0304}{\mathbf{Y}}(i)\stackrel{\u0304}{\mathbf{\Sigma}}\stackrel{\u0304}{\mathbf{Y}}(i)\stackrel{\u0304}{\mathbf{U}}(i)\right],\end{array}

(26)

where \stackrel{\u0304}{\mathbf{Y}}(i)=\stackrel{\u0304}{\mathbf{G}}\mathbf{D}(i){\stackrel{\u0304}{\mathbf{U}}}^{\mathrm{T}}(i).

It can be seen that \mathrm{E}\left[{\stackrel{\u0304}{\mathbf{U}}}^{\mathrm{T}}(i)\stackrel{\u0304}{\mathbf{U}}(i)\right]=\mathbf{\Lambda}. Also, using the bvec operator [27], we have \stackrel{\u0304}{\sigma}=\text{bvec}\left\{\stackrel{\u0304}{\mathbf{\Sigma}}\right\}, where the bvec operator divides the matrix into smaller blocks and then applies the vec operator to each of these blocks. Now, let **R**_{v} = **Λ**_{v} ⊙ **I**_{M} denote the block-diagonal noise covariance matrix for the entire network, where ⊙ denotes the block Kronecker product [27] and **Λ**_{v} is the diagonal noise variance matrix for the network. Hence, the second term on the right-hand side of (25) is

\mathrm{E}\left[{\mathbf{v}}^{\mathrm{T}}(i){\stackrel{\u0304}{\mathbf{Y}}}^{\mathrm{T}}(i)\stackrel{\u0304}{\mathbf{\Sigma}}\stackrel{\u0304}{\mathbf{Y}}(i)\mathbf{v}(i)\right]={\mathbf{b}}^{\mathrm{T}}(i)\stackrel{\u0304}{\mathit{\sigma}},

(27)

where **b**(*i*) = bvec{**R**_{v}E[**D**^{2}(*i*)]**Λ**}.
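Since the remainder of the analysis leans on the bvec operator, a small sketch of one common convention is given below: partition an *MN* × *MN* matrix into *M* × *M* blocks and apply the column-stacking vec operator to each block, proceeding block-column by block-column. The exact block ordering used in [27] may differ; this implementation is an illustrative assumption.

```python
import numpy as np

def bvec(S, M):
    """Block-vectorize an (MN x MN) matrix: vec each M x M block and
    stack the results, traversing blocks column-by-column."""
    N = S.shape[0] // M
    pieces = []
    for j in range(N):                  # block columns
        for i in range(N):              # block rows within a block column
            blk = S[i * M:(i + 1) * M, j * M:(j + 1) * M]
            pieces.append(blk.reshape(-1, order="F"))  # vec = column stacking
    return np.concatenate(pieces)
```

Applied to an *MN* × *MN* matrix, the result is a vector of length *M*²*N*², matching the dimension of **σ̄** used throughout.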

The fourth-order moment \mathrm{E}\left[{\stackrel{\u0304}{\mathbf{U}}}^{\mathrm{T}}(i){\stackrel{\u0304}{\mathbf{Y}}}^{\mathrm{T}}(i)\stackrel{\u0304}{\mathbf{\Sigma}}\stackrel{\u0304}{\mathbf{Y}}(i)\stackrel{\u0304}{\mathbf{U}}(i)\right] in (26) remains to be evaluated. Using the step-size independence assumption and the ⊙ operator, we get

\begin{array}{ll}\phantom{\rule{10pt}{0ex}}\text{bvec}\left\{\mathrm{E}\left[{\stackrel{\u0304}{\mathbf{U}}}^{\mathrm{T}}(i){\stackrel{\u0304}{\mathbf{Y}}}^{\mathrm{T}}(i)\stackrel{\u0304}{\mathbf{\Sigma}}\stackrel{\u0304}{\mathbf{Y}}(i)\stackrel{\u0304}{\mathbf{U}}(i)\right]\right\}& =\left(\mathrm{E}\left[\mathbf{D}(i)\odot \mathbf{D}(i)\right]\right)\\ \phantom{\rule{1em}{0ex}}\times \mathbf{A}\left({\mathbf{G}}^{\mathrm{T}}\odot {\mathbf{G}}^{\mathrm{T}}\right)\stackrel{\u0304}{\mathit{\sigma}},\end{array}

(28)

where we have from [6]

\mathbf{A}=\text{diag}\left\{{\mathbf{A}}_{1},{\mathbf{A}}_{2},\dots ,{\mathbf{A}}_{N}\right\},

(29)

and each matrix **A**_{k} is given by

{\mathbf{A}}_{k}\phantom{\rule{0.3em}{0ex}}=\phantom{\rule{0.3em}{0ex}}\text{diag}\phantom{\rule{0.3em}{0ex}}\left\{\phantom{\rule{0.3em}{0ex}}{\mathbf{\Lambda}}_{1}\otimes {\mathbf{\Lambda}}_{k},\dots ,{\lambda}_{k}{\lambda}_{k}^{\mathrm{T}}\phantom{\rule{0.3em}{0ex}}+\phantom{\rule{0.3em}{0ex}}2{\mathbf{\Lambda}}_{k}\otimes {\mathbf{\Lambda}}_{k},\dots ,{\mathbf{\Lambda}}_{N}\otimes {\mathbf{\Lambda}}_{k}\phantom{\rule{0.3em}{0ex}}\right\}\phantom{\rule{0.3em}{0ex}},

(30)

where **Λ**_{k} is the diagonal eigenvalue matrix and *λ*_{k} is the eigenvalue vector for node *k*.

The *k*th diagonal block of the matrix E[**D**(*i*) ⊙ **D**(*i*)] can be written as

{\left(\mathrm{E}\left[\mathbf{D}(i)\odot \mathbf{D}(i)\right]\right)}_{\mathit{\text{kk}}}=\mathrm{E}\left[\text{diag}\left\{{\mu}_{k}(i){\mathbf{I}}_{M}\otimes {\mu}_{1}(i){\mathbf{I}}_{M},\dots ,{\mu}_{k}(i){\mathbf{I}}_{M}\otimes {\mu}_{k}(i){\mathbf{I}}_{M},\dots ,{\mu}_{k}(i){\mathbf{I}}_{M}\otimes {\mu}_{N}(i){\mathbf{I}}_{M}\right\}\right].

(31)

Now, applying the bvec operator to the weighting matrix {\stackrel{\u0304}{\mathbf{\Sigma}}}^{\prime} through the relation \text{bvec}\left\{{\stackrel{\u0304}{\mathbf{\Sigma}}}^{\prime}\right\}={\stackrel{\u0304}{\sigma}}^{\prime}, from which the original {\stackrel{\u0304}{\mathbf{\Sigma}}}^{\prime} can be recovered by the inverse operation, we get

\text{bvec}\left\{{\stackrel{\u0304}{\mathbf{\Sigma}}}^{\prime}\right\}={\stackrel{\u0304}{\sigma}}^{\prime}=\left[{\mathbf{I}}_{{M}^{2}{N}^{2}}-\left({\mathbf{I}}_{\mathit{\text{MN}}}\odot \mathbf{\Lambda}\mathrm{E}\left[\mathbf{D}(i)\right]\right)-\left(\mathbf{\Lambda}\mathrm{E}\left[\mathbf{D}(i)\right]\odot {\mathbf{I}}_{\mathit{\text{MN}}}\right)+\left(\mathrm{E}\left[\mathbf{D}(i)\odot \mathbf{D}(i)\right]\right)\mathbf{A}\right]\left({\mathbf{G}}^{\mathrm{T}}\odot {\mathbf{G}}^{\mathrm{T}}\right)\stackrel{\u0304}{\sigma},

(32)

where

\mathbf{F}(i)=\left[{\mathbf{I}}_{{M}^{2}{N}^{2}}-\left({\mathbf{I}}_{\mathit{\text{MN}}}\odot \mathbf{\Lambda}\mathrm{E}\left[\mathbf{D}(i)\right]\right)-\left(\mathbf{\Lambda}\mathrm{E}\left[\mathbf{D}(i)\right]\odot {\mathbf{I}}_{\mathit{\text{MN}}}\right)+\left(\mathrm{E}\left[\mathbf{D}(i)\odot \mathbf{D}(i)\right]\right)\mathbf{A}\right]\left({\mathbf{G}}^{\mathrm{T}}\odot {\mathbf{G}}^{\mathrm{T}}\right).

(33)

Then (21) will take on the following form:

\mathrm{E}\left[{\u2225\stackrel{\u0304}{\mathbf{w}}(i+1)\u2225}_{\stackrel{\u0304}{\sigma}}^{2}\right]=\mathrm{E}\left[{\u2225\stackrel{\u0304}{\mathbf{w}}(i)\u2225}_{\mathbf{F}(i)\stackrel{\u0304}{\sigma}}^{2}\right]+{\mathbf{b}}^{\mathrm{T}}(i)\stackrel{\u0304}{\sigma},

(34)

which characterizes the transient behavior of the network. Although (34) does not explicitly show the effect of the variable step-size (VSS) mechanism on the network's performance, this effect is in fact subsumed in the weighting matrix **F**(*i*), which varies at each iteration, unlike in the fixed step-size LMS algorithm, for which the analysis shows that the weighting matrix remains fixed over all iterations. Moreover, (33) clearly shows the effect of the VSS mechanism on the performance of the algorithm through the presence of the diagonal step-size matrix **D**(*i*).

#### 3.2.2 Learning behavior of the proposed algorithm

In this section, the learning behavior of the VSSDLMS algorithm, which shows how the algorithm evolves with time, is evaluated. Starting with {\stackrel{\u0304}{\mathbf{w}}}_{0}={\mathbf{w}}^{\left(o\right)} and **D**_{0} = *μ*_{0}**I**_{MN}, we have for iteration (*i* + 1)

\mathcal{E}\left(i-1\right)=\text{diag}\left\{\left(\mathrm{E}\left[{\u2225\stackrel{\u0304}{\mathbf{w}}\left(i-1\right)\u2225}_{\mathit{\lambda}}^{2}\right]+{\sigma}_{v,1}^{2}\right){\mathbf{I}}_{M},\dots ,\left(\mathrm{E}\left[{\u2225\stackrel{\u0304}{\mathbf{w}}\left(i-1\right)\u2225}_{\mathit{\lambda}}^{2}\right]+{\sigma}_{v,N}^{2}\right){\mathbf{I}}_{M}\right\}

(35)

\phantom{\rule{1.95em}{0ex}}\mathrm{E}\left[\mathbf{D}(i)\right]=\alpha \mathrm{E}\left[\mathbf{D}\left(i-1\right)\right]+\gamma \mathcal{E}\left(i-1\right)

(36)

\begin{array}{ll}\phantom{\rule{6pt}{0ex}}\mathrm{E}\left[{\mathbf{D}}^{2}(i)\right]& ={\alpha}^{2}\mathrm{E}\left[{\mathbf{D}}^{2}\left(i-1\right)\right]+2\alpha \gamma \mathcal{E}\left(i-1\right)\\ \phantom{\rule{1em}{0ex}}+{\gamma}^{2}{\mathcal{E}}^{2}\left(i-1\right)\end{array}

(37)

\mathbf{F}(i)=\left[{\mathbf{I}}_{{M}^{2}{N}^{2}}-\left({\mathbf{I}}_{\mathit{\text{MN}}}\odot \mathbf{\Lambda}\mathrm{E}\left[\mathbf{D}\left(i\right)\right]\right)-\left(\mathbf{\Lambda}\mathrm{E}\left[\mathbf{D}\left(i\right)\right]\odot {\mathbf{I}}_{\mathit{\text{MN}}}\right)+\left(\mathrm{E}\left[\mathbf{D}\left(i\right)\odot \mathbf{D}\left(i\right)\right]\right)\mathbf{A}\right]\left({\mathbf{G}}^{\mathrm{T}}\odot {\mathbf{G}}^{\mathrm{T}}\right)

(38)

\phantom{\rule{3.5em}{0ex}}\mathbf{b}(i)=\text{bvec}\left\{{\mathbf{R}}_{\mathbf{v}}\mathrm{E}\left[{\mathbf{D}}^{2}(i)\right]\mathbf{\Lambda}\right\};

(39)

then incorporating the above relations in (34) gives

\begin{array}{ll}\mathrm{E}\left[{\u2225\stackrel{\u0304}{\mathbf{w}}(i+1)\u2225}_{\stackrel{\u0304}{\sigma}}^{2}\right]&=\mathrm{E}\left[{\u2225\stackrel{\u0304}{\mathbf{w}}(i)\u2225}_{\mathbf{F}(i)\stackrel{\u0304}{\sigma}}^{2}\right]+{\mathbf{b}}^{\mathrm{T}}(i)\stackrel{\u0304}{\sigma}\\ &={\u2225{\stackrel{\u0304}{\mathbf{w}}}^{\left(o\right)}\u2225}_{\left(\prod _{m=0}^{i}\mathbf{F}(m)\right)\stackrel{\u0304}{\sigma}}^{2}+\left[\sum _{m=0}^{i-1}{\mathbf{b}}^{\mathrm{T}}(m)\left(\prod _{n=m+1}^{i}\mathbf{F}(n)\right)+{\mathbf{b}}^{\mathrm{T}}(i){\mathbf{I}}_{\mathit{\text{MN}}}\right]\stackrel{\u0304}{\sigma}.\end{array}

(40)

Now, subtracting the result of iteration *i* from that of iteration (*i* + 1) and simplifying, we get

\begin{array}{ll}\phantom{\rule{6pt}{0ex}}\mathrm{E}\left[{\u2225\stackrel{\u0304}{\mathbf{w}}(i+1)\u2225}_{\stackrel{\u0304}{\sigma}}^{2}\right]& =\mathrm{E}\left[{\u2225\stackrel{\u0304}{\mathbf{w}}(i)\u2225}_{\stackrel{\u0304}{\sigma}}^{2}\right]+{\u2225{\stackrel{\u0304}{\mathbf{w}}}^{\left(o\right)}\u2225}_{{\mathbf{F}}^{\prime}(i)\left(\mathbf{F}(i)-{\mathbf{I}}_{\mathit{\text{MN}}}\right)\stackrel{\u0304}{\sigma}}^{2}\\ \phantom{\rule{1em}{0ex}}+\left[{\mathbf{F}}^{\prime \prime}(i)\left(\mathbf{F}(i)-{\mathbf{I}}_{\mathit{\text{MN}}}\right)+{\mathbf{b}}^{\mathrm{T}}(i){\mathbf{I}}_{\mathit{\text{MN}}}\right]\stackrel{\u0304}{\sigma},\end{array}

(41)

where

\begin{array}{l}\phantom{\rule{.3em}{0ex}}{\mathbf{F}}^{\prime}(i)=\prod _{m=0}^{i-1}\mathbf{F}(m),\end{array}

(42)

\begin{array}{l}{\mathbf{F}}^{\prime \prime}(i)=\sum _{m=0}^{i-2}{\mathbf{b}}^{\mathrm{T}}(m)\left(\prod _{n=m+1}^{i-1}\mathbf{F}(n)\right)+{\mathbf{b}}^{\mathrm{T}}(i){\mathbf{I}}_{\mathit{\text{MN}}},\end{array}

(43)

which can be defined iteratively as

\phantom{\rule{.3em}{0ex}}{\mathbf{F}}^{\prime}(i+1)={\mathbf{F}}^{\prime}(i)\mathbf{F}(i),

(44)

\phantom{\rule{-2.0pt}{0ex}}{\mathbf{F}}^{\prime \prime}(i+1)={\mathbf{F}}^{\prime \prime}(i)\mathbf{F}(i)+{\mathbf{b}}^{\mathrm{T}}(i){\mathbf{I}}_{\mathit{\text{MN}}}.

(45)

In order to evaluate the MSD and EMSE, we need to define the corresponding weighting matrix for each of them. Taking \stackrel{\u0304}{\sigma}=\left(1/N\right)\text{bvec}\left\{{\mathbf{I}}_{\mathit{\text{MN}}}\right\}={\mathbf{q}}_{\eta} and \eta (i)=\left(1/N\right)\mathrm{E}\left[{\u2225\stackrel{\u0304}{\mathbf{w}}(i)\u2225}^{2}\right] for the MSD, we get

\begin{array}{ll}\phantom{\rule{6pt}{0ex}}\eta \left(i\right)& =\eta \left(i-1\right)+{\u2225{\stackrel{\u0304}{\mathbf{w}}}^{\left(o\right)}\u2225}_{{\mathbf{F}}^{\prime}(i)\left(\mathbf{F}(i)-{\mathbf{I}}_{\mathit{\text{MN}}}\right){\mathbf{q}}_{\eta}}^{2}\\ \phantom{\rule{1em}{0ex}}+\left[{\mathbf{F}}^{\prime \prime}(i)\left(\mathbf{F}(i)-{\mathbf{I}}_{\mathit{\text{MN}}}\right)+{\mathbf{b}}^{\mathrm{T}}(i){\mathbf{I}}_{\mathit{\text{MN}}}\right]{\mathbf{q}}_{\eta}.\end{array}

(46)

Similarly, taking \stackrel{\u0304}{\sigma}=\left(1/N\right)\text{bvec}\left\{\mathbf{\Lambda}\right\}={\lambda}_{\zeta} and \zeta \left(i\right)=\left(1/N\right)\mathrm{E}\left[{\u2225\stackrel{\u0304}{\mathbf{w}}(i)\u2225}_{\mathbf{\Lambda}}^{2}\right], the EMSE behavior is governed by

\begin{array}{ll}\phantom{\rule{6pt}{0ex}}\zeta \left(i\right)& =\zeta \left(i-1\right)+{\u2225{\stackrel{\u0304}{\mathbf{w}}}^{\left(o\right)}\u2225}_{{\mathbf{F}}^{\prime}(i)\left(\mathbf{F}(i)-{\mathbf{I}}_{\mathit{\text{MN}}}\right){\lambda}_{\zeta}}^{2}\\ \phantom{\rule{1em}{0ex}}+\left[{\mathbf{F}}^{\prime \prime}(i)\left(\mathbf{F}(i)-{\mathbf{I}}_{\mathit{\text{MN}}}\right)+{\mathbf{b}}^{\mathrm{T}}(i){\mathbf{I}}_{\mathit{\text{MN}}}\right]{\lambda}_{\zeta}.\end{array}

(47)

The relations in (46) and (47) govern the transient behavior of the MSD and EMSE of the proposed algorithm. They show how the weighting matrix's effect on the transient behavior varies from one iteration to the next, since the weighting matrix itself changes at each iteration. This is not the case for the simple fixed step-size DLMS algorithm in [6], where the weighting matrix remains constant for all iterations. Since the weighting matrix depends on the step-size matrix, which becomes very small asymptotically, both the norm and the influence of the weighting matrix also become asymptotically small. Consequently, both the MSD and the EMSE become very small at steady-state, because the weighting matrix itself becomes small at steady-state, and the relations then depend only on the product of the weighting matrices over the iterations.

### 3.3 Steady-state analysis

From the second relation in (8), it is seen that the step-size for each node is independent of the data received from other nodes. Even though the connectivity matrix **G** does not permit the weighting matrix **F**(*i*) to be evaluated separately for each node, this is not the case for the determination of the step-size at any node. Here, we define the misadjustment as the ratio of the EMSE to the minimum mean square error; the misadjustment is used in determining the steady-state performance of the algorithm [11]. Therefore, following the approach of [20], we first find the misadjustment, given by

{\mathcal{\mathcal{M}}}_{k}=\frac{1-{\left[1-2\frac{\left(3-\alpha \right)\gamma {\sigma}_{v,k}^{2}}{1-{\alpha}^{2}}\text{tr}\left({\mathbf{\Lambda}}_{k}\right)\right]}^{1/2}}{1+{\left[1-2\frac{\left(3-\alpha \right)\gamma {\sigma}_{v,k}^{2}}{1-{\alpha}^{2}}\text{tr}\left({\mathbf{\Lambda}}_{k}\right)\right]}^{1/2}}.

(48)

Then solving (36) and (37) along with (48) leads to the steady-state values for the step-size and its square for each node

\phantom{\rule{-14.0pt}{0ex}}{\mu}_{\mathit{\text{ss}},k}=\frac{\gamma {\sigma}_{v,k}^{2}\left(1+{\mathcal{\mathcal{M}}}_{k}\right)}{1-\alpha},

(49)

\phantom{\rule{-14.0pt}{0ex}}{\mu}_{\mathit{\text{ss}},k}^{2}=\frac{2\alpha \gamma {\mu}_{\mathit{\text{ss}},k}{\sigma}_{v,k}^{2}\left(1+{\mathcal{\mathcal{M}}}_{k}\right)+{\gamma}^{2}{\sigma}_{v,k}^{4}{\left(1+{\mathcal{\mathcal{M}}}_{k}\right)}^{2}}{1-{\alpha}^{2}}.

(50)
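Equations (48)-(50) can be evaluated directly from the algorithm parameters. The sketch below (function name illustrative) computes the misadjustment and the steady-state step-size moments for a single node. Note that, as printed, substituting (49) into (50) makes the two moments satisfy *μ*²_{ss,k} = (*μ*_{ss,k})², which serves as a handy consistency check on an implementation.

```python
import numpy as np

def steady_state_stepsize(alpha, gamma, sigma_v2, tr_Lambda_k):
    """Misadjustment (48) and steady-state step-size moments (49)-(50)
    for node k, given the VSS parameters and tr(Lambda_k)."""
    r = 1.0 - 2.0 * (3.0 - alpha) * gamma * sigma_v2 * tr_Lambda_k / (1.0 - alpha ** 2)
    root = np.sqrt(r)                        # requires r >= 0, i.e. small gamma
    Mk = (1.0 - root) / (1.0 + root)         # misadjustment (48)
    mu_ss = gamma * sigma_v2 * (1.0 + Mk) / (1.0 - alpha)             # (49)
    mu2_ss = (2.0 * alpha * gamma * mu_ss * sigma_v2 * (1.0 + Mk)
              + gamma ** 2 * sigma_v2 ** 2 * (1.0 + Mk) ** 2) / (1.0 - alpha ** 2)  # (50)
    return Mk, mu_ss, mu2_ss
```

For typical values (e.g., *α* = 0.97, *γ* = 10⁻³, *σ*²_{v,k} = 10⁻², tr(**Λ**_{k}) = *M*), the steady-state step-size is orders of magnitude below the stability bound (20), which is what drives the low steady-state error of the proposed algorithm.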

Incorporating these two steady-state relations in (33) yields the steady-state weighting matrix as

{\mathbf{F}}_{\mathit{\text{ss}}}=\left[{\mathbf{I}}_{{M}^{2}{N}^{2}}-\left({\mathbf{I}}_{\mathit{\text{MN}}}\odot \mathbf{\Lambda}\mathrm{E}\left[{\mathbf{D}}_{\mathit{\text{ss}}}\right]\right)-\left(\mathbf{\Lambda}\mathrm{E}\left[{\mathbf{D}}_{\mathit{\text{ss}}}\right]\odot {\mathbf{I}}_{\mathit{\text{MN}}}\right)+\left(\mathrm{E}\left[{\mathbf{D}}_{\mathit{\text{ss}}}\odot {\mathbf{D}}_{\mathit{\text{ss}}}\right]\right)\mathbf{A}\right]\left({\mathbf{G}}^{\mathrm{T}}\odot {\mathbf{G}}^{\mathrm{T}}\right),

(51)

where **D**_{ss} = diag{*μ*_{ss,k}**I**_{M}}.

Thus, the steady-state mean-square behavior is given by

\mathrm{E}\left[{\u2225{\stackrel{\u0304}{\mathbf{w}}}_{\mathit{\text{ss}}}\u2225}_{\overline{\sigma}}^{2}\right]=\mathrm{E}\left[{\u2225{\stackrel{\u0304}{\mathbf{w}}}_{\mathit{\text{ss}}}\u2225}_{{\mathbf{F}}_{\mathit{\text{ss}}}\overline{\sigma}}^{2}\right]+{\mathbf{b}}_{\mathit{\text{ss}}}^{\mathrm{T}}\overline{\sigma},

(52)

where {\mathbf{b}}_{\mathit{\text{ss}}}=\text{bvec}\left\{{\mathbf{R}}_{\mathbf{v}}{\mathbf{D}}_{\mathit{\text{ss}}}^{2}\mathbf{\Lambda}\right\} and {\mathbf{D}}_{\mathit{\text{ss}}}^{2}=\text{diag}\left\{{\mu}_{\mathit{\text{ss}},k}^{2}{\mathbf{I}}_{M}\right\}. Now, solving (52), we get

\mathrm{E}\left[{\u2225{\stackrel{\u0304}{\mathbf{w}}}_{\mathit{\text{ss}}}\u2225}_{\overline{\sigma}}^{2}\right]={\mathbf{b}}_{\mathit{\text{ss}}}^{\mathrm{T}}{\left[{\mathbf{I}}_{{M}^{2}{N}^{2}}-{\mathbf{F}}_{\mathit{\text{ss}}}\right]}^{-1}\overline{\sigma}.

(53)

This equation gives the steady-state performance measure for the entire network. In order to solve for the steady-state values of the MSD and EMSE, we take \stackrel{\u0304}{\sigma}={\mathbf{q}}_{\eta} and \stackrel{\u0304}{\sigma}={\lambda}_{\zeta}, respectively, as in (46) and (47). This gives the steady-state values of the MSD and EMSE as follows:

\begin{array}{l}{\eta}_{\mathit{\text{ss}}}={\mathbf{b}}_{\mathit{\text{ss}}}^{\mathrm{T}}{\left[{\mathbf{I}}_{{M}^{2}{N}^{2}}-{\mathbf{F}}_{\mathit{\text{ss}}}\right]}^{-1}{\mathbf{q}}_{\eta},\end{array}

(54)

\begin{array}{l}{\zeta}_{\mathit{\text{ss}}}={\mathbf{b}}_{\mathit{\text{ss}}}^{\mathrm{T}}{\left[{\mathbf{I}}_{{M}^{2}{N}^{2}}-{\mathbf{F}}_{\mathit{\text{ss}}}\right]}^{-1}{\lambda}_{\zeta}.\end{array}

(55)

The above two steady-state relations depend on the steady-state weighting matrix, which becomes very small at steady-state, as explained before. As a result, the steady-state error measures of the proposed algorithm become very small compared with those of the fixed step-size DLMS algorithm.