
Adaptive link selection algorithms for distributed estimation

Abstract

This paper presents adaptive link selection algorithms for distributed estimation and considers their application to wireless sensor networks and smart grids. In particular, we consider exhaustive search-based least mean squares (LMS) / recursive least squares (RLS) link selection algorithms and sparsity-inspired LMS / RLS link selection algorithms that can exploit the topology of networks with poor-quality links. The proposed link selection algorithms are then analyzed in terms of their stability, steady-state and tracking performance, and computational complexity. In comparison with existing centralized or distributed estimation strategies, the key features of the proposed algorithms are as follows: (1) more accurate estimates and faster convergence can be obtained and (2) the network is equipped with a link selection capability that can circumvent link failures and improve the estimation performance. The performance of the proposed algorithms for distributed estimation is illustrated via simulations in applications of wireless sensor networks and smart grids.

1 Introduction

Distributed signal processing algorithms have become a key approach for statistical inference in wireless networks and applications such as wireless sensor networks and smart grids [1–5]. It is well known that distributed processing techniques deal with the extraction of information from data collected at nodes that are distributed over a geographic area [1]. In this context, a set of neighbor nodes collect and process their local information and transmit their estimates to a given node, which then combines the collected estimates with its own local estimate to generate an improved estimate.

1.1 Prior and related work

Several works in the literature have proposed strategies for distributed processing, which include incremental [1, 6–8], diffusion [2, 9], sparsity-aware [3, 10], and consensus-based strategies [11]. With the incremental strategy, the processing follows a Hamiltonian cycle, i.e., the information flows through the nodes in one direction, with each node passing its estimate to the adjacent node along the cycle. However, in order to determine an optimum cyclic path that covers all nodes (considering the noise, interference, path loss, and channels between neighbor nodes), this method needs to solve an NP-hard problem. In addition, when any of the nodes fails, data communication through the cycle is interrupted and the distributed processing breaks down [1].

In distributed diffusion strategies [2, 10], the neighbors of each node are fixed and the combining coefficients are calculated after the network topology is deployed and starts its operation. One potential risk of this approach is that the estimation procedure may be affected by poorly performing links. More specifically, the fixed neighbors and the pre-calculated combining coefficients may not provide an optimized estimation performance for each node because some links are more severely affected by noise or fading. Moreover, when the number of neighbor nodes is large, each node requires a large bandwidth and high transmit power. In [12, 13], the idea of partial diffusion was introduced to reduce the communication between neighbor nodes. Prior work on topology design and adjustment techniques includes the studies in [14–16] and [17], which are not dynamic in the sense that they cannot track changes in the network and mitigate the effects of poor links.

1.2 Contributions

The adaptive link selection algorithms for distributed estimation problems are proposed and studied in this paper. Specifically, we develop adaptive link selection algorithms that can exploit the knowledge of poor links by selecting a subset of data from neighbor nodes. The first approach consists of exhaustive search-based least mean squares (LMS)/recursive least squares (RLS) link selection (ES-LMS/ES-RLS) algorithms, whereas the second technique is based on sparsity-inspired LMS/RLS link selection (SI-LMS/SI-RLS) algorithms. With both approaches, the distributed processing can be divided into two steps. The first step is called the adaptation step, in which each node employs LMS or RLS to perform the adaptation through its local information. Following the adaptation step, each node combines the estimates collected from its neighbors with its local estimate, through the proposed adaptive link selection algorithms. The proposed algorithms result in improved estimation performance in terms of the mean square error (MSE) associated with the estimates. In contrast to previously reported techniques, a key feature of the proposed algorithms is that the combination step involves only a subset of the data, namely that associated with the best performing links.

In the ES-LMS and ES-RLS algorithms, we consider all possible combinations for each node with its neighbors and choose the combination associated with the smallest MSE value. In the SI-LMS and SI-RLS algorithms, we incorporate a reweighted zero attraction (RZA) strategy into the adaptive link selection algorithms. The RZA approach is often employed in applications dealing with sparse systems in such a way that it shrinks the small values in the parameter vector to zero, which results in better convergence and steady-state performance. Unlike prior work with sparsity-aware algorithms [3, 18–20], the proposed SI-LMS and SI-RLS algorithms exploit the possible sparsity of the MSE values associated with each of the links in a different way. In contrast to existing methods that shrink the signal samples to zero, SI-LMS and SI-RLS shrink to zero the links that have poor performance or high MSE values. By using the SI-LMS and SI-RLS algorithms, the data associated with unsatisfactory performance will be discarded, which means the effective network topology used in the estimation procedure will change as well. Although the physical topology is not changed by the proposed algorithms, the choice of the data coming from the neighbor nodes for each node is dynamic, which changes the combination weights and results in improved performance. We also remark that the topology could be altered with the aid of the proposed algorithms and a feedback channel that informs the nodes whether they should be switched off or not. The proposed algorithms are considered for wireless sensor networks and also as a tool for distributed state estimation that could be used in smart grids.

In summary, the main contributions of this paper are the following:

  • We present adaptive link selection algorithms for distributed estimation that are able to achieve significantly better performance than existing algorithms.

  • We devise distributed LMS and RLS algorithms with link selection capabilities to perform distributed estimation.

  • We analyze the MSE convergence and tracking performance of the proposed algorithms and their computational complexities, and we derive analytical formulas to predict their MSE performance.

  • A simulation study of the proposed and existing distributed estimation algorithms is conducted along with applications in wireless sensor networks and smart grids.

This paper is organized as follows. Section 2 describes the system model and the problem statement. In Section 3, the proposed link selection algorithms are introduced. We analyze the proposed algorithms in terms of their stability, steady-state, and tracking performance and computational complexity in Section 4. The numerical simulation results are provided in Section 5. Finally, we conclude the paper in Section 6.

Notation: We use boldface upper case letters to denote matrices and boldface lower case letters to denote vectors. We use \((\cdot)^{T}\) and \((\cdot)^{-1}\) to denote the transpose and inverse operators, respectively, \((\cdot)^{H}\) for conjugate transposition, and \((\cdot)^{*}\) for complex conjugate.

2 System model and problem statement

We consider a set of N nodes, which have limited processing capabilities, distributed over a given geographical area as depicted in Fig. 1. The nodes are connected and form a network, which is assumed to be partially connected because nodes can exchange information only with neighbors determined by the connectivity topology. We call a network with this property a partially connected network whereas a fully connected network means that data broadcast by a node can be captured by all other nodes in the network in one hop [21]. We can think of this network as a wireless network, but our analysis also applies to wired networks such as power grids. In our work, in order to perform link selection strategies, we assume that each node has at least two neighbors.

Fig. 1 Network topology with N nodes

The aim of the network is to estimate an unknown parameter vector ω_0, which has length M. At every time instant i, each node k takes a scalar measurement d_k(i) according to

$$ {d_{k}(i)} = {\boldsymbol {\omega}}_{0}^{H}{\boldsymbol x_{k}(i)} +{n_{k}(i)},~~~ i=1,2, \ldots, I, $$
((1))

where x_k(i) is the M×1 random regression input signal vector and n_k(i) denotes the Gaussian noise at each node with zero mean and variance \(\sigma _{n,k}^{2}\). This linear model is able to capture or approximate well many input-output relations for estimation purposes [22], and we assume I>M. To compute an estimate of ω_0 in a distributed fashion, we need each node to minimize the MSE cost function [2]

$$ {J_{k}\left({\boldsymbol \omega_{k}(i)}\right)} = {\mathbb{E} \left|{ d_{k}(i)}- {\boldsymbol {\omega_{k}^{H}}(i)}{\boldsymbol x_{k}(i)}\right|^{2}}, $$
((2))

where \(\mathbb {E}\) denotes expectation and ω_k(i) is the estimated vector generated by node k at time instant i. Equation (2) is also the definition of the MSE, and the global network cost function can be described as

$$ {J({\boldsymbol \omega})} = \sum_{k=1}^{N}{\mathbb{E} \left|{ d_{k}(i)}- {\boldsymbol \omega}^{H}{\boldsymbol x_{k}(i)}\right|^{2}}. $$
((3))
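To make the data model concrete, the following minimal NumPy sketch simulates the measurements in (1) and evaluates a sample estimate of the global cost (3). The network size, parameter length, noise levels, and the helper name global_cost are illustrative choices, not values or notation from the paper.

```python
# Minimal sketch of the measurement model in (1) for a small network.
# All sizes and variances below are illustrative, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)
N, M, I = 5, 4, 200                         # nodes, parameter length, time instants
omega_0 = rng.standard_normal(M) + 1j * rng.standard_normal(M)
sigma_n = rng.uniform(0.01, 0.1, size=N)    # per-node noise standard deviations

# x[k, i] is the M x 1 regression vector of node k at time instant i
x = (rng.standard_normal((N, I, M)) + 1j * rng.standard_normal((N, I, M))) / np.sqrt(2)
noise = (rng.standard_normal((N, I)) + 1j * rng.standard_normal((N, I))) / np.sqrt(2) * sigma_n[:, None]
d = np.einsum('m,nim->ni', omega_0.conj(), x) + noise   # d_k(i) = omega_0^H x_k(i) + n_k(i)

def global_cost(omega, x, d):
    """Sample estimate of the global cost J(omega) in (3)."""
    err = d - np.einsum('m,nim->ni', omega.conj(), x)
    return np.sum(np.mean(np.abs(err) ** 2, axis=1))

print(global_cost(omega_0, x, d))           # roughly the sum of the noise variances
```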

To solve this problem, diffusion strategies have been proposed in [2, 9] and [23]. In these strategies, the estimate for each node is generated through a fixed combination strategy given by

$$ {\boldsymbol {\omega}}_{k}(i)= \sum\limits_{l\in \mathcal{N}_{k}} c_{kl} \boldsymbol\psi_{l}(i), $$
((4))

where \(\mathcal {N}_{k}\) denotes the set of neighbors of node k including node k itself, c_kl ≥ 0 is the combining coefficient, and ψ_l(i) is the local estimate generated by node l through its local information.

There are many ways to calculate the combining coefficient c_kl, which include the Hastings [24], the Metropolis [25], the Laplacian [26], and the nearest neighbor [27] rules. In this work, due to its simplicity and good performance, we adopt the Metropolis rule [25] given by

$$ c_{kl}=\left\{\begin{array}{ll} \frac{1} {max\{|\mathcal{N}_{k}|,|\mathcal{N}_{l}|\}},\ \ \text{if }k\neq l\text{ are linked}\\ 1 - \sum\limits_{l\in \mathcal{N}_{k} / k} c_{kl}, \ \ \text{for } k = l. \end{array} \right. $$
((5))

where \(|\mathcal {N}_{k}|\) denotes the cardinality of \(\mathcal {N}_{k}\). The set of coefficients c_kl should satisfy [2]

$$ \sum\limits_{l\in \mathcal{N}_{k} \ \forall k} c_{kl} =1. $$
((6))
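As an illustration, the sketch below computes the Metropolis weights in (5) for a topology given as an adjacency list and checks condition (6) by construction. The five-node example topology and the helper name metropolis_weights are illustrative assumptions, not part of the paper.

```python
# Minimal sketch of the Metropolis rule in (5); the topology is an arbitrary example.
import numpy as np

def metropolis_weights(neighbors):
    """neighbors[k] = set of neighbors of node k, *excluding* k itself."""
    N = len(neighbors)
    C = np.zeros((N, N))
    for k in range(N):
        for l in neighbors[k]:
            # |N_k| in (5) includes node k itself, hence the +1 on each degree
            C[k, l] = 1.0 / max(len(neighbors[k]) + 1, len(neighbors[l]) + 1)
        C[k, k] = 1.0 - C[k].sum()           # enforces condition (6)
    return C

# Example: 5 nodes, each with two neighbors; node 3 (index 2) is linked to nodes 2 and 5
neighbors = [{1, 3}, {0, 2}, {1, 4}, {0, 4}, {2, 3}]
C = metropolis_weights(neighbors)
print(C.sum(axis=1))                         # every row sums to one, as required by (6)
```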

For the combination strategy mentioned in (4), the choice of neighbors for each node is fixed, which results in some problems and limitations, namely:

  • Some nodes may face high levels of noise or interference, which may lead to inaccurate estimates.

  • When the number of neighbors for each node is high, large communication bandwidth and high transmit power are required.

  • Some nodes may shut down or collapse due to network problems. As a result, the local estimates transmitted to their neighbors may be affected.

Under such circumstances, a performance degradation is likely to occur when the network cannot discard the contribution of poorly performing links and their associated data in the estimation procedure. In the next section, the proposed adaptive link selection algorithms are presented, which equip a network with the ability to improve the estimation procedure. In the proposed scheme, each node is able to dynamically select the data coming from its neighbors in order to optimize the performance of distributed estimation techniques.

3 Proposed adaptive link selection algorithms

In this section, we present the proposed adaptive link selection algorithms. The goal of the proposed algorithms is to optimize the distributed estimation and improve the performance of the network by dynamically changing the topology. These algorithmic strategies give the nodes the ability to choose their neighbors based on their MSE performance. We develop two categories of adaptive link selection algorithms; the first one is based on an exhaustive search, while the second is based on a sparsity-inspired relaxation. The details will be illustrated in the following subsections.

3.1 Exhaustive search–based LMS/RLS link selection

The proposed ES-LMS and ES-RLS algorithms employ an exhaustive search to select the links that yield the best performance in terms of MSE. First, we describe how we define the adaptation step for these two strategies. In the ES-LMS algorithm, we employ the adaptation strategy given by

$$ {\boldsymbol {\psi}}_{k}(i)= {\boldsymbol {\omega}}_{k}(i)+{\mu}_{k} {\boldsymbol x_{k}(i)}\left[{ d_{k}(i)}- {\boldsymbol \omega}_{k}^{H}(i){\boldsymbol x_{k}(i)}\right]^{*}, $$
((7))

where μ_k is the step size for each node. In the ES-RLS algorithm, we employ the following steps for the adaptation:

$$\begin{array}{*{20}l} \boldsymbol\Phi_{k}^{-1}(i)& = \lambda^{-1} \boldsymbol\Phi_{k}^{-1}(i-1)\\ & \quad- \frac {\lambda^{-2}\boldsymbol\Phi_{k}^{-1}(i-1) \boldsymbol x_{k}(i) \boldsymbol {x_{k}^{H}}(i)\boldsymbol\Phi_{k}^{-1}(i-1)} {1+\lambda^{-1} \boldsymbol {x_{k}^{H}}(i) \boldsymbol\Phi_{k}^{-1}(i-1)\boldsymbol x_{k}(i)}, \end{array} $$
((8))

where λ is the forgetting factor. Then, we let

$$ \boldsymbol P_{k}(i)=\boldsymbol\Phi_{k}^{-1}(i) $$
((9))

and

$$ \boldsymbol k_{k}(i)= \frac {\lambda^{-1}\boldsymbol P_{k}(i) \boldsymbol x_{k}(i) } {1+\lambda^{-1} \boldsymbol {x_{k}^{H}}(i)\boldsymbol P_{k}(i)\boldsymbol x_{k}(i)}. $$
((10))
$$ {\boldsymbol {\psi}}_{k}(i)= {\boldsymbol {\omega}}_{k}(i)+ {\boldsymbol k_{k}(i)}\left[{ d_{k}(i)}- {\boldsymbol \omega}_{k}^{H}(i){\boldsymbol x_{k}(i)}\right]^{*}, $$
((11))
$$ \boldsymbol P_{k}(i+1)=\lambda^{-1}\boldsymbol P_{k}(i)-\lambda^{-1}\boldsymbol k_{k}(i)\boldsymbol {x_{k}^{H}}(i)\boldsymbol P_{k}(i). $$
((12))
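For reference, the following is a minimal sketch of the per-node adaptation step: the LMS update (7) and the RLS recursions (8)–(12), with P_k playing the role of \(\boldsymbol\Phi_{k}^{-1}\). The step size, forgetting factor, initialization of P_k, and the toy single-node usage are illustrative choices.

```python
# Minimal sketch of the adaptation step: LMS update (7) and RLS recursions (8)-(12).
import numpy as np

def lms_adapt(omega_k, x_ki, d_ki, mu_k=0.05):
    """psi_k(i) = omega_k(i) + mu_k x_k(i) [d_k(i) - omega_k^H(i) x_k(i)]^*   -- eq. (7)"""
    e = d_ki - omega_k.conj() @ x_ki
    return omega_k + mu_k * x_ki * np.conj(e)

def rls_adapt(omega_k, P_k, x_ki, d_ki, lam=0.98):
    """One pass through (8)-(12); P_k plays the role of Phi_k^{-1}."""
    Px = P_k @ x_ki
    k_gain = (Px / lam) / (1.0 + (x_ki.conj() @ Px) / lam)          # eq. (10)
    e = d_ki - omega_k.conj() @ x_ki
    psi_k = omega_k + k_gain * np.conj(e)                           # eq. (11)
    P_next = (P_k - np.outer(k_gain, x_ki.conj() @ P_k)) / lam      # eq. (12)
    return psi_k, P_next

# Toy single-node usage (real-valued data for simplicity)
rng = np.random.default_rng(1)
M = 4
omega_0 = rng.standard_normal(M)
omega, P = np.zeros(M), 10.0 * np.eye(M)
for i in range(500):
    x_i = rng.standard_normal(M)
    d_i = omega_0 @ x_i + 0.01 * rng.standard_normal()
    omega, P = rls_adapt(omega, P, x_i, d_i)
print(np.linalg.norm(omega - omega_0))       # should be small after convergence
```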

Following the adaptation step, we introduce the combination step for both the ES-LMS and ES-RLS algorithms, based on an exhaustive search strategy. At first, we introduce a tentative set Ω_k using a combinatorial approach described by

$$ {\Omega_{k}}\in 2^{|\mathcal{N}_{k}|}\backslash\varnothing, $$
((13))

where Ω_k is a nonempty subset drawn from the power set of \(\mathcal {N}_{k}\), which has \(2^{|\mathcal {N}_{k}|}\) elements. After the tentative set Ω_k is defined, we write the cost function (2) for each node as

$$ {J_{k}\left({\boldsymbol \psi(i)}\right)} \triangleq {\mathbb{E} \left|{ d_{k}(i)}-{\boldsymbol \psi}^{H}(i){\boldsymbol x_{k}(i)}\right|^{2}}, $$
((14))

where

$$ {\boldsymbol \psi}(i) \triangleq \sum\limits_{l\in \Omega_{k}} c_{kl}(i) \boldsymbol\psi_{l}(i) $$
((15))

is the local estimator and ψ_l(i) is calculated through (7) or (11), depending on the algorithm, i.e., ES-LMS or ES-RLS. With different choices of the set Ω_k, the combining coefficients c_kl will be re-calculated through (5) to ensure that condition (6) is satisfied.

Then, we introduce the error pattern for each node, which is defined as

$$ {e_{\Omega_{k}}(i)} \triangleq { d_{k}(i)}-{\left[\sum\limits_{l\in \Omega_{k}} c_{kl}(i) \boldsymbol\psi_{l}(i)\right]}^{H}{\boldsymbol x_{k}(i)}. $$
((16))

For each node k, the strategy that finds the best set Ω_k(i) must solve the following optimization problem:

$$ \hat{\Omega}_{k}(i) = \arg\min_{\Omega_{k} \in 2^{N_{k}}\setminus\varnothing} | e_{\Omega_{k}}(i)|. $$
((17))

After all steps have been completed, the combination step in (4) is performed as described by

$$ {\boldsymbol {\omega}}_{k}(i+1)= \sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i) \boldsymbol\psi_{l}(i). $$
((18))

At this stage, the main steps of the ES-LMS and ES-RLS algorithms have been completed. The proposed ES-LMS and ES-RLS algorithms find the set \(\widehat {\Omega }_{k}(i)\) that minimizes the error pattern in (16) and (17) and then use this set of nodes to obtain ω_k(i) through (18).

The ES-LMS/ES-RLS algorithms are briefly summarized as follows:

Step 1: Each node performs the adaptation through its local information based on the LMS or RLS algorithm.

Step 2: Each node finds the best set \(\widehat {\Omega }_{k}(i)\), which satisfies (17).

Step 3: Each node combines the information obtained from its best set of neighbors through (18).
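The sketch below illustrates the ES-type combination step (13)–(18) for a single node: it enumerates the tentative sets Ω_k (node k plus a nonempty subset of its neighbors, as in the example of Section 4.2.1), re-normalizes the Metropolis weights over each tentative set (one reasonable reading of how (5) is re-applied), and keeps the set with the smallest instantaneous error magnitude per (17). The helper es_combination, the topology, and the toy data are illustrative and are not the paper's Algorithms 1 and 2.

```python
# Minimal sketch of the ES combination step (13)-(18) for one node k.
import numpy as np
from itertools import combinations

def es_combination(k, neighbors, psi, x_ki, d_ki):
    """neighbors[k] excludes k; psi[l] is the local estimate from (7) or (11)."""
    others = sorted(neighbors[k])
    best_err, best_omega = np.inf, None
    # Omega_k always contains node k plus a nonempty subset of its neighbors
    for r in range(1, len(others) + 1):
        for subset in combinations(others, r):
            omega_set = (k,) + subset
            # re-normalized Metropolis-style weights over the tentative set (one reading of (5))
            c = {l: 1.0 / max(len(omega_set), len(neighbors[l]) + 1) for l in subset}
            c[k] = 1.0 - sum(c.values())                  # keeps condition (6)
            omega_trial = sum(c[l] * psi[l] for l in omega_set)
            err = abs(d_ki - omega_trial.conj() @ x_ki)   # error pattern (16)
            if err < best_err:
                best_err, best_omega = err, omega_trial   # arg min in (17)
    return best_omega, best_err

# Toy usage: 5 nodes, M = 3, random local estimates psi_l(i)
rng = np.random.default_rng(2)
neighbors = [{1, 3}, {0, 2}, {1, 4}, {0, 4}, {2, 3}]
M = 3
psi = [rng.standard_normal(M) for _ in range(5)]
x_i, d_i = rng.standard_normal(M), rng.standard_normal()
omega_k, err = es_combination(2, neighbors, psi, x_i, d_i)
```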

The details of the proposed ES-LMS and ES-RLS algorithms are shown in Algorithms 1 and 2. When the ES-LMS and ES-RLS algorithms are implemented in networks with a large number of small and low-power sensors, the computational cost may become high, as the exhaustive search in (17) needs to examine all the possible sets Ω_k(i) at each time instant.

3.2 Sparsity–inspired LMS/RLS link selection

The ES-LMS/ES-RLS algorithms previously outlined need to examine all possible sets to find a solution at each time instant, which might result in high computational complexity for large networks operating in time-varying scenarios. To solve the combinatorial problem with reduced complexity, we propose the SI-LMS and SI-RLS algorithms, which are as simple as standard diffusion LMS or RLS algorithms and are suitable for adaptive implementations and scenarios where the parameters to be estimated are slowly time-varying. The zero-attracting (ZA), RZA, and zero-forcing (ZF) strategies are reported in [3] and [28] as sparsity-aware techniques. These approaches are usually employed in applications dealing with sparse systems, where they shrink the small values in the parameter vector to zero, which results in a better convergence rate and steady-state performance. Unlike existing methods that shrink the signal samples to zero, the proposed SI-LMS and SI-RLS algorithms shrink to zero the links that have poor performance or high MSE values. To detail the novelty of the proposed sparsity-inspired LMS/RLS link selection algorithms, we illustrate the processing in Fig. 2.

Fig. 2 a, b Sparsity-aware signal processing strategies

Figure 2a shows a standard type of sparsity-aware processing. We can see that, after being processed by a sparsity-aware algorithm, the nodes with small MSE values will be shrunk to zero. In contrast, the proposed SI-LMS and SI-RLS algorithms will keep the nodes with lower MSE values and reduce the combining weight of the nodes with large MSE values as illustrated in Fig. 2b. When compared with the ES-type algorithms, the SI-LMS/RLS algorithms do not need to consider all possible combinations of nodes, which means that the SI-LMS/RLS algorithms have lower complexity. In the following, we will show how the proposed SI-LMS/SI-RLS algorithms are employed to realize the link selection strategy automatically.

In the adaptation step, we follow the same procedure in (7)–(11) as that of the ES-LMS and ES-RLS algorithms for the SI-LMS and SI-RLS algorithms, respectively. Then, we reformulate the combination step. First, we introduce the log-sum penalty into the combination step in (4). Different penalty terms have been considered for this task. We have adopted a heuristic approach [3, 29] known as reweighted zero-attracting strategy into the combination step in (4) because this strategy has shown an excellent performance and is simple to implement. The log-sum penalty is defined as:

$$ {f_{1}(e_{k}(i))}= \sum\limits_{l\in \mathcal{N}_{k}} \log\left(1+\varepsilon|e_{kl}(i)|\right), $$
((19))

where the error e_kl(i), with \(l\in \mathcal {N}_{k}\) denoting a neighbor node of node k (including node k itself), is defined as

$$ {e_{kl}(i)} \triangleq { d_{k}(i)}-{ {\boldsymbol\psi_{l}^{H}}(i)}{\boldsymbol x_{k}(i)} $$
((20))

and ε is the shrinkage magnitude. Then, we introduce the vector and matrix quantities required to describe the combination step. We first define a vector c_k that contains the combining coefficients for each neighbor of node k including node k itself, as described by

$$ {\boldsymbol c_{k}}\triangleq\left[c_{kl}\right],\ \ \ \ {l\in \mathcal{N}_{k}}. $$
((21))

Then, we define a matrix Ψ_k that includes all the estimated vectors, which are generated after the adaptation step of SI-LMS and SI-RLS, for each neighbor of node k including node k itself, as given by

$$ {\boldsymbol \Psi_{k}}\triangleq[\!\boldsymbol \psi_{l}(i)], \ \ \ \ {l\in \mathcal{N}_{k}}. $$
((22))

Note that the adaptation steps of SI-LMS and SI-RLS are identical to (7) and (11), respectively. An error vector \(\hat {\boldsymbol e}_{k}\) that contains all error values calculated through (20) for each neighbor of node k including node k itself is expressed by

$$ {\hat{\boldsymbol e}_{k}}\triangleq[\!e_{kl}(i)], \ \ \ \ {l\in \mathcal{N}_{k}}. $$
((23))

To devise the sparsity-inspired approach, we have modified the vector \(\hat {\boldsymbol e}_{k}\) in the following way:

  1. The element with the largest absolute value |e_kl(i)| in \(\hat {\boldsymbol e}_{k}\) will be kept as |e_kl(i)|.

  2. The element with the smallest absolute value will be set to −|e_kl(i)|. This ensures that the node with the smallest error pattern is rewarded with a larger combining coefficient.

  3. The remaining entries will be set to zero.

At this point, the combination step can be defined as [29]

$$ \boldsymbol \omega_{k}(i)={\sum\limits_{j=1}^{|\mathcal{N}_{k}|} \left[ c_{k,j}-\rho \frac {\partial f_{1}(\hat{e}_{k,j})}{\partial \hat{e}_{k,j}}\right] \boldsymbol\psi_{k,j}}, $$
((24))

where c_{k,j} and \(\hat {e}_{k,j}\) stand for the jth elements of c_k and \(\hat {\boldsymbol e}_{k}\), respectively, and ψ_{k,j} stands for the jth column of Ψ_k. The parameter ρ is used to control the algorithm’s shrinkage intensity. We then calculate the partial derivative with respect to \(\hat {e}_{k,j}\):

$$\begin{array}{*{20}l} \frac{\partial{f_{1}(\hat{e}_{k,j})}}{\partial \hat{e}_{k,j}}&= \frac{\partial\left(\log(1+\varepsilon|e_{kl}(i)|)\right)}{\partial\left(e_{kl}(i)\right)}\\ &=\varepsilon\frac{{\text{sign}}(e_{kl}(i)) }{1+\varepsilon|e_{kl}(i)|}\ \ \ \ {l\in\mathcal{N}_{k}}\\ &=\varepsilon\frac{{\text{sign}}(\hat{e}_{k,j}) }{1+\varepsilon|\hat{e}_{k,j}|}. \end{array} $$
((25))

To ensure that \(\sum \limits _{j=1}^{|\mathcal {N}_{k}|} \left (c_{k,j}-\rho \frac {\partial f_{1}(\hat {e}_{k,j})}{\partial \hat {e}_{k,j}}\right)=1\), we replace \(\hat {e}_{k,j}\) with ξ_min in the denominator of (25), where the parameter ξ_min stands for the minimum absolute value of e_kl(i) in \(\hat {\boldsymbol e}_{k}\). Then, (25) can be rewritten as

$$ \frac{\partial{f_{1}(\hat{e}_{k,j})}}{\partial \hat{e}_{k,j}}\approx\varepsilon\frac{{\text{sign}}(\hat{e}_{k,j}) }{1+\varepsilon|\xi_{\text{min}}|}. $$
((26))

At this stage, the log-sum penalty performs shrinkage and selects the set of estimates from the neighbor nodes with the best performance, at the combination step. The function sign(a) is defined as

$$ {{\text{sign}}(a)}= \left\{\begin{array}{ll} {a/|a|}\ \ \ \ \ {a\neq 0}\\ 0\ \ \ \ \ \ \ \ \ \ {a= 0}. \end{array} \right. $$
((27))

Then, by inserting (26) into (24), the proposed combination step is given by

$$ \boldsymbol \omega_{k}(i)={\sum\limits_{j=1}^{|\mathcal{N}_{k}|} \left[c_{k,j}-\rho \varepsilon\frac{{\text{sign}}({\hat{e}_{k,j}}) }{1+\varepsilon|\xi_{\text{min}}|}\right] \boldsymbol\psi_{k,j}}. $$
((28))

Note that the condition \(c_{k,j}-\rho \varepsilon \frac {{\text {sign}}({\hat {e}_{k,j}}) }{1+\varepsilon |\xi _{\text {min}}|}\geq 0\) is enforced in (28). When \(c_{k,j}-\rho \varepsilon \frac {{\text {sign}}({\hat {e}_{k,j}}) }{1+\varepsilon |\xi _{\text {min}}|}= 0\), it means that the corresponding node has been discarded from the combination step. In the following time instant, if this node still has the largest error, there will be no changes in the combining coefficients for this set of nodes.

To guarantee stability, the parameter ρ is assumed to be sufficiently small, and the penalty takes effect only on the elements of \({\hat {\boldsymbol e}_{k}}\) whose magnitudes are comparable to 1/ε [3]. Moreover, little shrinkage is exerted on the elements of \({\hat {\boldsymbol e}_{k}}\) for which \(|\hat {e}_{k,j}|\ll 1/\varepsilon \). The SI-LMS and SI-RLS algorithms perform link selection by adjusting the combining coefficients through (28). At this point, it should be emphasized that:

  • The process in (28) satisfies condition (6), as the penalty and reward amounts of the combining coefficients are the same for the nodes with maximum and minimum error, respectively, and there are no changes for the remaining nodes in the set.

  • When computing (28), there are no matrix–vector multiplications. Therefore, no additional complexity is introduced. As described in (24), only the jth element of \(\boldsymbol c_{k}\) and \(\hat {\boldsymbol e}_{k}\) and the jth column of Ψ_k are used in the calculation.

For the neighbor node with the largest MSE value, after the modification of \(\hat {\boldsymbol e}_{k}\), its e_kl(i) value in \(\hat {\boldsymbol e}_{k}\) will be a positive number, which leads to the term \(\rho \varepsilon \frac {{\text {sign}}({\hat {e}_{k,j}})}{1+\varepsilon |\xi _{\text {min}}|}\) in (28) being positive too. This means that the combining coefficient for this node will be shrunk and the weight for this node in building ω_k(i) will be shrunk too. In other words, when a node encounters high noise or interference levels, the corresponding MSE value might be large. As a result, we need to reduce the contribution of that node.

In contrast, for the neighbor node with the smallest MSE, its e_kl(i) value in \(\hat {\boldsymbol e}_{k}\) will be a negative number, so the term \(\rho \varepsilon \frac {{\text {sign}}({\hat {e}_{k,j}})}{1+\varepsilon |\xi _{\text {min}}|}\) in (28) will be negative too. As a result, the weight for this node associated with the smallest MSE in building ω_k(i) will be increased. For the remaining neighbor nodes, the entry e_kl(i) in \(\hat {\boldsymbol e}_{k}\) is zero, which means the term \(\rho \varepsilon \frac {{\text {sign}}({\hat {e}_{k,j}})}{1+\varepsilon |\xi _{\text {min}}|}\) in (28) is zero and there is no change in the weights used to build ω_k(i). The main steps of the proposed SI-LMS and SI-RLS algorithms are listed as follows:

Step 1: Each node carries out the adaptation through its local information based on the LMS or RLS algorithm.

Step 2: Each node calculates the error pattern through (20).

Step 3: Each node modifies the error vector \(\hat {\boldsymbol e}_{k}\).

Step 4: Each node combines the information obtained from its selected neighbors through (28).
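A minimal sketch of this SI combination step is given below: it forms the error pattern (20), applies the three-step modification of \(\hat {\boldsymbol e}_{k}\), and computes the reweighted combination (28) with the non-negativity constraint enforced. The values of ρ and ε, the topology, the toy data, and the helper name si_combination are illustrative assumptions.

```python
# Minimal sketch of the SI-LMS/SI-RLS combination step: (20), the e_hat modification, and (28).
import numpy as np

def si_combination(k, neighbors, psi, x_ki, d_ki, C, rho=0.02, eps=10.0):
    """C holds the Metropolis weights from (5); psi[l] is the local estimate of node l."""
    nodes = sorted(neighbors[k] | {k})
    e = np.array([d_ki - psi[l].conj() @ x_ki for l in nodes])     # eq. (20)
    e_hat = np.zeros_like(e)                                       # remaining entries -> 0
    j_max, j_min = np.argmax(np.abs(e)), np.argmin(np.abs(e))
    e_hat[j_max] = np.abs(e[j_max])        # largest error kept as  +|e_kl(i)|
    e_hat[j_min] = -np.abs(e[j_min])       # smallest error set to  -|e_kl(i)|
    xi_min = np.abs(e[j_min])
    omega_k = np.zeros_like(psi[k])
    for j, l in enumerate(nodes):
        c_new = C[k, l] - rho * eps * np.sign(e_hat[j]) / (1.0 + eps * xi_min)
        omega_k += max(c_new, 0.0) * psi[l]                        # eq. (28), c_new >= 0 enforced
    return omega_k

# Toy usage with a 5-node topology where every Metropolis weight equals 1/3
rng = np.random.default_rng(3)
neighbors = [{1, 3}, {0, 2}, {1, 4}, {0, 4}, {2, 3}]
M = 3
psi = [rng.standard_normal(M) for _ in range(5)]
x_i, d_i = rng.standard_normal(M), rng.standard_normal()
C = np.zeros((5, 5))
for k in range(5):
    for l in neighbors[k] | {k}:
        C[k, l] = 1.0 / 3.0
omega_2 = si_combination(2, neighbors, psi, x_i, d_i, C)
```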

The SI-LMS and SI-RLS algorithms are detailed in Algorithm 3. For the ES-LMS/ES-RLS and SI-LMS/SI-RLS algorithms, we design different combination steps and employ the same adaptation procedure, which means the proposed link selection strategies can equip diffusion-type wireless networks operating with adaptive algorithms other than LMS and RLS. This includes, for example, the diffusion conjugate gradient strategy [30]. Apart from using weights related to the node degree, other signal-dependent approaches may also be considered, e.g., the parameter vectors could be weighted according to the signal-to-noise ratio (SNR) (or the noise variance) at each node within the neighborhood.

4 Analysis of the proposed algorithms

In this section, a statistical analysis of the proposed algorithms is developed, including a stability analysis and an MSE analysis of the steady-state and tracking performance. In addition, the computational complexity of the proposed algorithms is also detailed. Before we start the analysis, we make some assumptions that are common in the literature [22]. Assumption I: The weight-error vector ε_k(i) and the input signal vector x_k(i) are statistically independent, and the weight-error vector for node k is defined as

$$ {\boldsymbol \varepsilon_{k}(i)} \triangleq \boldsymbol \omega_{k}(i) - \boldsymbol \omega_{0}, $$
((29))

where ω_0 denotes the optimum Wiener solution of the actual parameter vector to be estimated, and ω_k(i) is the estimate produced by a proposed algorithm at time instant i.

Assumption II: The input signal vector x_l(i) is drawn from a stochastic process, which is ergodic in the autocorrelation function [22].

Assumption III: The M×1 vector q(i) represents a stationary sequence of independent zero-mean vectors with positive definite autocorrelation matrix \(\boldsymbol Q\,=\,\mathbb {E}\left [\boldsymbol q(i)\boldsymbol q^{H}(i)\right ]\), which is independent of x_k(i), n_k(i), and ε_l(i).

Assumption IV (Independence): All regressor input signals x_k(i) are spatially and temporally independent. This assumption allows us to consider the input signal x_k(i) independent of \(\boldsymbol \omega _{l}(i), l\in \mathcal {N}_{k}\).

4.1 Stability analysis

In general, to ensure that a partially connected network performance can converge to the global network performance, the estimates should be propagated across the network [31]. The work in [14] shows that it is central to the performance that each node should be able to reach the other nodes through one or multiple hops [31].

To discuss the stability analysis of the proposed ES-LMS and SI-LMS algorithms, we first substitute (7) into (18) and obtain

$$\begin{array}{*{20}l}{} {\boldsymbol {\omega}}_{k}(i+1)&=\! \sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i) \boldsymbol\psi_{l}(i+1)\\ &=\!\sum\limits_{l\in \widehat{\Omega}_{k}(i)} \left[{\boldsymbol {\omega}}_{l}(i)+{\mu}_{l} {\boldsymbol x_{l}(i+1)}e^{*}_{l}(i+1)\right]c_{kl}(i)\\ &=\!\sum\limits_{l\in \widehat{\Omega}_{k}(i)}\! \left[\boldsymbol \omega_{0}+\boldsymbol\varepsilon_{l}(i)+{\mu}_{l} {\boldsymbol x_{l}(i+1)}e^{*}_{l}(i+1)\right]c_{kl}(i)\\ &=\!\sum\limits_{l\in \widehat{\Omega}_{k}(i)} \boldsymbol \omega_{0} c_{kl}+\sum\limits_{l\in \widehat{\Omega}_{k}(i)}\left[\boldsymbol\varepsilon_{l}(i)\right.\\ &\quad+{\mu}_{l} {\boldsymbol x_{l}(i+1)}e^{*}_{l}(i+1)\left.\!\right]c_{kl}(i)\\ & \text{subject}\ \text{to} \ \sum\limits_{l} c_{kl}(i)=1 \\ &=\boldsymbol \omega_{0}\,+\,\!\sum\limits_{l\in \widehat{\Omega}_{k}(i)}\left[\boldsymbol\varepsilon_{l}(i)+{\mu}_{l} {\boldsymbol x_{l}(i+\!1)}e^{*}_{l}(i+\!1)\right]c_{kl}(i). \end{array} $$
((30))

Then, we have

$$\begin{array}{*{20}l} {\boldsymbol \varepsilon_{k}(i+1)}&=\sum\limits_{l\in \widehat{\Omega}_{k}(i)}\left[\boldsymbol\varepsilon_{l}(i)+{\mu}_{l} {\boldsymbol x_{l}(i+1)}e^{*}_{l}(i+1)\right]c_{kl}(i). \end{array} $$
((31))

By employing Assumption IV, we start with (31) for the ES-LMS algorithm and define the global vectors and matrices:

$$ \boldsymbol\varepsilon(i+1) \triangleq \left[ \boldsymbol\varepsilon_{1}(i+1),\cdots,\boldsymbol\varepsilon_{N}(i+1)\right]^{T} $$
((32))
$$ \mathcal{\boldsymbol M} \triangleq \text{diag}\left\{\mu_{1}\boldsymbol I_{M},\ldots,\mu_{N}\boldsymbol I_{M}\right\} $$
((33))
$$ {}\mathcal{\boldsymbol D}(i+1) \triangleq \!\text{diag}\left\{\boldsymbol x_{1}(i+1)\boldsymbol {x_{1}^{H}}(i+1),\ldots,\boldsymbol x_{N}(i+1)\boldsymbol {x_{N}^{H}}(i+1)\right\} $$
((34))

and the NM×1 vector

$${} \mathcal{\boldsymbol g}(i+1) = \left[\boldsymbol {x_{1}^{T}}(i+1)n_{1}(i+1),\cdots,\boldsymbol {x_{N}^{T}}(i+1)n_{N}(i+1)\right]^{T}\!. $$
((35))

We also define an N×N matrix C, where the combining coefficients c_kl correspond to the {l,k} entries of the matrix C, and the NM×NM matrix C_G with a Kronecker structure:

$$ \boldsymbol C_{G}=\boldsymbol C\otimes \boldsymbol I_{M} $$
((36))

where ⊗ denotes the Kronecker product.

By inserting \(e_{l}(i+1)=e_{0-l}(i+1)-{\boldsymbol \varepsilon _{l}^{H}}(i)\boldsymbol x_{l}(i+1)\) into (31), the global version of (31) can then be written as

$$ \boldsymbol\varepsilon(i+1)=\boldsymbol {C_{G}^{T}}\left[\boldsymbol I -\mathcal{\boldsymbol M} \mathcal{\boldsymbol D}(i+1)\right] \boldsymbol\varepsilon(i)+\boldsymbol {C_{G}^{T}}\mathcal{\boldsymbol M}\mathcal{\boldsymbol g}(i+1), $$
((37))

where e_{0−l}(i+1) is the estimation error produced by the Wiener filter for node l, as described by

$$ e_{0-l}(i+1)=d_{l}(i+1) - \boldsymbol {\omega_{0}^{H}}\boldsymbol x_{l}(i+1). $$
((38))

If we define

$$ \begin{aligned} \mathcal{D}& \triangleq \mathbb{E}[ \mathcal{\boldsymbol D}(i+1)]\\ &=\text{diag}\{\boldsymbol R_{1},\ldots,\boldsymbol R_{N}\} \end{aligned} $$
((39))

and take the expectation of (37), we arrive at

$$ \mathbb{E}\{\boldsymbol\varepsilon(i+1)\} = \boldsymbol {C_{G}^{T}}\left[\boldsymbol I-\mathcal{\boldsymbol M}\mathcal{D}\right]\mathbb{E}\{\boldsymbol\varepsilon(i)\}. $$
((40))

Before we proceed, let us define \(\boldsymbol X=\boldsymbol I-\mathcal {\boldsymbol M}\mathcal {D}\). We say that a square matrix X is stable if it satisfies X^i→0 as i→∞. A known result in linear algebra states that a matrix is stable if, and only if, all its eigenvalues lie inside the unit circle. We need the following lemma to proceed [9].

Lemma 1.

Let C_G and X denote arbitrary NM×NM matrices, where C_G has real, non-negative entries, with columns adding up to one. Then, the matrix \(\boldsymbol Y=\boldsymbol {C_{G}^{T}}\boldsymbol X\) is stable for any choice of C_G if, and only if, X is stable.

Proof.

Assume that X is stable. For every square matrix X and every α>0, there exists a submultiplicative matrix norm ||·||_τ that satisfies ||X||_τ ≤ τ(X)+α, where submultiplicativity means that ||A B||≤||A||·||B|| holds and τ(X) is the spectral radius of X [32, 33]. Since X is stable, τ(X)<1, and we can choose α>0 such that τ(X)+α=v<1 and ||X||_τ ≤ v<1. Then we obtain [9]

$$ \begin{aligned} ||\boldsymbol Y^{i}||_{\tau}&=||\left(\boldsymbol {C_{G}^{T}}\boldsymbol X\right)^{i}||_{\tau}\\ &\leq||\left(\boldsymbol {C_{G}^{T}}\right)^{i}||_{\tau}\cdot||\boldsymbol X^{i}||_{\tau}\\ &\leq v^{i}||\left(\boldsymbol {C_{G}^{T}}\right)^{i}||_{\tau}. \end{aligned} $$
((41))

Since \(\boldsymbol {C_{G}^{T}}\) has non-negative entries with columns that add up to one, it is element-wise bounded by unity. This means its Frobenius norm is bounded as well and by the equivalence of norms, so is any norm, in particular \(||\left (\boldsymbol {C_{G}^{T}}\right)^{i}||_{\tau }\). As a result, we have

$$ \lim\limits_{\textit{i}\rightarrow\infty}||\boldsymbol Y^{i}||_{\tau}=0, $$
((42))

so Y^i converges to the zero matrix for large i. Therefore, Y is stable.

In view of Lemma 1 and (40), we need the matrix \(\boldsymbol I-\mathcal {\boldsymbol M}\mathcal {D}\) to be stable. As a result, we require I−μ_k R_k to be stable for all k, which holds if the following condition is satisfied:

$$ 0<\mu_{k}<\frac {2} {\lambda_{max}\left(\boldsymbol R_{k}\right)} $$
((43))

where λ_max(R_k) is the largest eigenvalue of the correlation matrix R_k. The difference between the ES-LMS and SI-LMS algorithms is the strategy to calculate the matrix C. Lemma 1 indicates that for any choice of C, only X needs to be stable. As a result, SI-LMS has the same convergence condition as in (43). Given the convergence conditions, the proposed ES-LMS/ES-RLS and SI-LMS/SI-RLS algorithms will adapt according to the network connectivity by choosing the group of nodes with the best available performance to construct their estimates.
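As a numerical illustration of this condition, the sketch below builds C_G = C ⊗ I_M as in (36), forms X = I − MD with step sizes chosen inside the bound (43), and verifies that the spectral radius of Y = C_G^T X appearing in (40) is below one. The topology, M, and the input variances are illustrative assumptions.

```python
# Minimal numerical check of the stability condition (43) and Lemma 1.
import numpy as np

rng = np.random.default_rng(4)
N, M = 5, 3
sigma_x2 = rng.uniform(0.5, 2.0, size=N)                   # per-node input variances
R = [s * np.eye(M) for s in sigma_x2]                       # R_k = sigma_{x,k}^2 I
mu = np.array([0.9 * 2.0 / np.max(np.linalg.eigvalsh(Rk)) for Rk in R])   # inside (43)

# Metropolis weights for a topology where every node has two neighbors (all weights 1/3)
neighbors = [{1, 3}, {0, 2}, {1, 4}, {0, 4}, {2, 3}]
C = np.zeros((N, N))
for k in range(N):
    for l in neighbors[k] | {k}:
        C[k, l] = 1.0 / 3.0
C_G = np.kron(C, np.eye(M))                                 # eq. (36)

M_mat = np.kron(np.diag(mu), np.eye(M))                     # eq. (33)
D = np.zeros((N * M, N * M))
for k in range(N):
    D[k * M:(k + 1) * M, k * M:(k + 1) * M] = R[k]          # E[D(i+1)] in (39)
Y = C_G.T @ (np.eye(N * M) - M_mat @ D)                     # matrix driving (40)
print(max(abs(np.linalg.eigvals(Y))))                       # spectral radius < 1
```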

4.2 MSE steady-state analysis

In this part of the analysis, we devise formulas to predict the excess MSE (EMSE) of the proposed algorithms. The error signal at node k can be expressed as

$$ \begin{aligned} {e_{k}(i)}&=d_{k}(i)-\boldsymbol {\omega_{k}^{H}}(i)\boldsymbol x_{k}(i)\\ &=d_{k}(i)-\left[\boldsymbol \omega_{0} -\boldsymbol \varepsilon_{k}(i)\right]^{H}\boldsymbol x_{k}(i)\\ &=d_{k}(i)-\boldsymbol {\omega_{0}^{H}}\boldsymbol x_{k}(i) +\boldsymbol {\varepsilon_{k}^{H}}(i)\boldsymbol x_{k}(i)\\ &=e_{0-k}+\boldsymbol {\varepsilon_{k}^{H}}(i)\boldsymbol x_{k}(i). \end{aligned} $$
((44))

With Assumption I, the MSE expression can be derived as

$$\begin{array}{*{20}l}{} {\mathcal{J}_{mse-k}(i)}&=\mathbb{E}[\!|e_{k}(i)|^{2}]\\ &=\mathbb{E}\left[\left(e_{0-k}+\boldsymbol {\varepsilon_{k}^{H}}(i)\boldsymbol x_{k}(i)\right)\left(e_{0-k}^{*}+\boldsymbol {x_{k}^{H}}(i)\boldsymbol \varepsilon_{k}(i)\right)\right]\\ &={\mathcal{J}_{min-k}}+\mathbb{E}\left[\boldsymbol {\varepsilon_{k}^{H}}(i)\boldsymbol x_{k}(i)\boldsymbol {x_{k}^{H}}(i)\boldsymbol \varepsilon_{k}(i)\right]\\ &={\mathcal{J}_{min-k}}+\textrm{\text{tr}}\left\{\mathbb{E}\left[\!\boldsymbol \varepsilon_{k}(i)\boldsymbol {\varepsilon_{k}^{H}}(i)\boldsymbol x_{k}(i)\boldsymbol {x_{k}^{H}}(i)\right]\right\}\\ &={\mathcal{J}_{min-k}}+\textrm{\text{tr}}\left\{\mathbb{E}\left[\!\boldsymbol x_{k}(i)\boldsymbol {x_{k}^{H}}(i)\right]\!\mathbb{E}\left[\boldsymbol \varepsilon_{k}(i)\boldsymbol {\varepsilon_{k}^{H}}(i)\right]\right\}\\ &={\mathcal{J}_{min-k}}+\text{tr}\left\{\boldsymbol R_{k}(i)\boldsymbol K_{k}(i)\right\}, \end{array} $$
((45))

where tr(·) denotes the trace of a matrix and \(\mathcal {J}_{min-k}\) is the minimum mean square error (MMSE) for node k [22]:

$$ \mathcal{J}_{min-k}=\sigma_{d,k}^{2}-\boldsymbol {p_{k}^{H}}(i)\boldsymbol R_{k}^{-1}(i)\boldsymbol p_{k}(i), $$
((46))

\(\boldsymbol R_{k}(i)=\mathbb {E}\left [\boldsymbol x_{k}(i)\boldsymbol {x_{k}^{H}}(i)\right ]\) is the correlation matrix of the inputs for node k, \(\boldsymbol p_{k}(i)= \mathbb {E}\left [\boldsymbol x_{k}(i)d_{k}^{*}(i)\right ]\) is the cross-correlation vector between the inputs and the measurement d_k(i), and \(\boldsymbol K_{k}(i)=\mathbb {E}\left [\boldsymbol \varepsilon _{k}(i)\boldsymbol {\varepsilon _{k}^{H}}(i)\right ]\) is the weight-error correlation matrix. From [22], the EMSE is defined as the difference between the mean square error at time instant i and the minimum mean square error. Then, we can write

$$ \begin{aligned} {\mathcal{J}_{ex-k}(i)}&={\mathcal{J}_{mse-k}(i)}-{\mathcal{J}_{min-k}}\\ &=\text{tr}\{\boldsymbol R_{k}(i)\boldsymbol K_{k}(i)\}. \end{aligned} $$
((47))

For the proposed adaptive link selection algorithms, we will derive the EMSE formulas separately based on (47) and we assume that the input signal is modeled as a stationary process.

4.2.1 ES-LMS

To update the estimate ω_k(i), we employ

$$\begin{array}{*{20}l} {\boldsymbol {\omega}}_{k}(i+1)&= \sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i) \boldsymbol\psi_{l}(i+1)\\ &=\sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\left[{\boldsymbol {\omega}}_{l}(i)+{\mu}_{l} {\boldsymbol x_{l}(i+1)}e^{*}_{l}(i+1)\right]\\ &=\sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)[\!{\boldsymbol {\omega}}_{l}(i)+{\mu}_{l} {\boldsymbol x_{l}(i+1)}(d_{l}(i+1)\\ &\quad-\boldsymbol {x_{l}^{H}}(i+1)\boldsymbol{\omega}_{l}(i))]. \end{array} $$
((48))

Then, subtracting ω_0 from both sides of (48), we arrive at

$$\begin{array}{*{20}l}{} {{\boldsymbol \varepsilon}}_{k}(i+1)&= \sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\left[{\vphantom{\left.\quad-\boldsymbol {x_{l}^{H}}(i+1)\boldsymbol{\omega}_{l}(i)\right)}}\!{\boldsymbol {\omega}}_{l}(i)+{\mu}_{l} {\boldsymbol x_{l}(i+1)}\left(d_{l}(i+1)\right.\right.\\ &\left.\left.\quad-\boldsymbol {x_{l}^{H}}(i+1)\boldsymbol{\omega}_{l}(i)\right)\right] -\sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\boldsymbol \omega_{0}\\ &=\sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\left[{\vphantom{\left.\quad-\boldsymbol {x_{l}^{H}}(i+1)(\boldsymbol {\varepsilon}_{l}(i)+\boldsymbol \omega_{0})\right)}}{\boldsymbol {\varepsilon}}_{l}(i)+{\mu}_{l} {\boldsymbol x_{l}(i+1)}\left(d_{l}(i+1)\right.\right.\\ &\left.\left.\quad-\boldsymbol {x_{l}^{H}}(i+1)(\boldsymbol {\varepsilon}_{l}(i)+\boldsymbol \omega_{0})\right)\right]\\ &=\sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\left[{\vphantom{\left.\quad-\boldsymbol {x_{l}^{H}}(i+1)\boldsymbol {\varepsilon}_{l}(i)-\boldsymbol {x_{l}^{H}}(i+1)\boldsymbol \omega_{0}\right)}}{\boldsymbol {\varepsilon}}_{l}(i)+{\mu}_{l} {\boldsymbol x_{l}(i+1)}\left(d_{l}(i+1)\right.\right.\\ &\left.\left.\quad-\boldsymbol {x_{l}^{H}}(i+1)\boldsymbol {\varepsilon}_{l}(i)-\boldsymbol {x_{l}^{H}}(i+1)\boldsymbol \omega_{0}\right)\right]\\ &=\sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\!\left[{\boldsymbol {\varepsilon}}_{l}(i)-{\mu}_{l} {\boldsymbol x_{l}(i+1)}\boldsymbol {x_{l}^{H}}(i+1)\boldsymbol {\varepsilon}_{l}(i)\right.\\ &\left.\quad+{\mu}_{l} {\boldsymbol x_{l}(i+1)}e_{0-l}^{*}(i+1){\vphantom{{\boldsymbol {\varepsilon}}_{l}(i)-{\mu}_{l} {\boldsymbol x_{l}(i+1)}\boldsymbol {x_{l}^{H}}(i+1)\boldsymbol {\varepsilon}_{l}(i)}}\right]\\ &=\sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\left[(\boldsymbol I-{\mu}_{l} {\boldsymbol x_{l}(i+1)}\boldsymbol {x_{l}^{H}}(i+1)){\boldsymbol {\varepsilon}}_{l}(i)\right.\\ &\left.\quad+{\mu}_{l} {\boldsymbol x_{l}(i+1)}e_{0-l}^{*}(i+1){\vphantom{(\boldsymbol I-{\mu}_{l} {\boldsymbol x_{l}(i+1)}\boldsymbol {x_{l}^{H}}(i+1)){\boldsymbol {\varepsilon}}_{l}(i)}}\right]. \end{array} $$
((49))

Let us introduce the random variables α_kl(i):

$$ \alpha_{kl}(i)=\left\{\begin{array}{ll} 1,\ \ \textit{if}\ l\in\widehat{\Omega}_{k}(i)\\ 0, \ \ \textit{otherwise}. \end{array} \right. $$
((50))

At each time instant, each node will generate data associated with network covariance matrices A_k of size N×N, which reflect the network topology according to the exhaustive search strategy. In the network covariance matrices A_k, an entry equal to 1 means that nodes k and l are linked, and an entry equal to 0 means that nodes k and l are not linked.

For example, suppose a network has 5 nodes. For node 3, there are two neighbor nodes, namely, nodes 2 and 5. Through Eq. (13), the possible configurations of the set Ω_3 are {3,2}, {3,5}, and {3,2,5}. Evaluating all the possible sets for Ω_3, the relevant covariance matrices A_3 of size 5×5 at time instant i are described in Fig. 3.

Fig. 3 Covariance matrices A_3 for different sets of Ω_3

Then, the coefficients α_kl are obtained according to the covariance matrices A_k. In this example, the three sets of α_kl are respectively shown in Table 1.

Table 1 Coefficients α_kl for different sets of Ω_3

The parameters c_kl will then be calculated through Eq. (5) for the different choices of matrices A_k and coefficients α_kl. After α_kl and c_kl are calculated, the error pattern for each possible Ω_k will be calculated through (16), and the set with the smallest error will be selected according to (17).
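The following short snippet reproduces this enumeration for the node-3 example: it lists the tentative sets Ω_3 and the corresponding indicator coefficients α_3l defined in (50). Node labels are 1-based to match the paper; the loop structure itself is only an illustration.

```python
# Minimal sketch of the node-3 example: tentative sets Omega_3 and indicators alpha_3l (eq. 50).
from itertools import combinations

k, N = 3, 5
neighbors_of_3 = [2, 5]
for r in range(1, len(neighbors_of_3) + 1):
    for subset in combinations(neighbors_of_3, r):
        omega_3 = {k, *subset}                                  # {3,2}, {3,5}, {3,2,5}
        alpha = {l: int(l in omega_3) for l in range(1, N + 1)} # eq. (50)
        print(sorted(omega_3), alpha)
```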

With the newly defined α_kl, (49) can be rewritten as

$$ \begin{aligned} {} {\boldsymbol {\varepsilon}}_{k}(i+\!1)&=\sum\limits_{l\in \mathcal{N}_{k}} \alpha_{kl}(i)c_{kl}(i)\left[\left(\boldsymbol I-\!{\mu}_{l} {\boldsymbol x_{l}(i+\!1)}\boldsymbol {x _{l}^{H}}(i+1)\right)\right.\\ &\left.\quad\times{\boldsymbol {\varepsilon}}_{l}(i)+{\mu}_{l} {\boldsymbol x_{l}(i+1)}e_{0-l}^{*}(i+1){\vphantom{\left(\boldsymbol I-\!{\mu}_{l} {\boldsymbol x_{l}(i+\!1)}\boldsymbol {x _{l}^{H}}(i+1)\right)}}\right]. \end{aligned} $$
((51))

Starting from (47), we then focus on K_k(i+1):

$$\begin{array}{*{20}l} \boldsymbol K_{k}(i+1)&=\mathbb{E}\left[\!\boldsymbol\varepsilon_{k}(i+1)\boldsymbol {\varepsilon_{k}^{H}}(i+1)\right]. \end{array} $$
((52))

In (51), the term α_kl(i) is determined through the network topology for each subset, while the term c_kl(i) is calculated through the Metropolis rule. We assume that α_kl(i) and c_kl(i) are statistically independent from the other terms in (51). Upon convergence, the parameters α_kl(i) and c_kl(i) do not vary because, at steady state, the choice of the subset \(\widehat {\Omega }_{k}(i)\) for each node k will be fixed. Then, under these assumptions, substituting (51) into (52), we arrive at:

$$\begin{array}{*{20}l}{} \boldsymbol K_{k}(i+1)&=\sum\limits_{l\in \mathcal{N}_{k}} \mathbb{E}\left[\alpha_{kl}^{2}(i)c_{kl}^{2}(i)\right]\left(\left(\boldsymbol I-\mu_{l}\boldsymbol R_{l}(i+1)\right)\boldsymbol K_{l}(i)\right.\\ &\quad\times\left(\boldsymbol I-\mu_{l}\boldsymbol R_{l}(i+1)\right)\\ &\left.\quad+{\mu_{l}^{2}}e_{0-l}(i+1)e_{0-l}^{*}(i+1)\times\boldsymbol R_{l}(i+1)\right)\\ &\quad+\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\mathbb{E}\left[\alpha_{kl}(i)\alpha_{kq}(i)c_{kl}(i)c_{kq}(i)\right]\\ &\quad\times\left(\left(\boldsymbol I-\mu_{l}\boldsymbol R_{l}(i+1)\right)\boldsymbol K_{l,q}(i)\left(\boldsymbol I-\mu_{q}\boldsymbol R_{l}(i+1)\right)\right.\\ &\left.\quad+\mu_{l}\mu_{q}e_{0-l}(i+1)e_{0-q}^{*}(i+1)\boldsymbol R_{l,q}(i+1)\right)\\ &\quad+\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\mathbb{E}\left[\alpha_{kl}(i)\alpha_{kq}(i)c_{kl}(i)c_{kq}(i)\right]\\ &\quad\times\left(\!\left(\boldsymbol I-\mu_{q}\boldsymbol R_{q}(i+1)\right)\!\boldsymbol K_{l,q}^{H}(i)\left(\boldsymbol I-\mu_{l}\boldsymbol R_{l}(i+1)\right)\right.\\ &\left.\quad+\mu_{l}\mu_{q}e_{0-q}(i+1)e_{0-l}^{*}(i+1)\boldsymbol R_{l,q}^{H}(i+1)\right) \end{array} $$
((53))

where \(\boldsymbol R_{l,q}(i+1)=\mathbb {E}\left [\!\boldsymbol x_{l}(i+1)\boldsymbol {x_{q}^{H}}(i+1)\right ]\) and \(\boldsymbol K_{l,q}(i)=\mathbb {E}\left [\!\boldsymbol \varepsilon _{l}(i)\boldsymbol {\varepsilon _{q}^{H}}(i)\right ]\). To further simplify the analysis, we assume that the samples of the input signal x_k(i) are uncorrelated, i.e., \(\boldsymbol R_{k}=\sigma _{x,k}^{2}\boldsymbol I\) with \(\sigma _{x,k}^{2}\) being the variance. Using the diagonal matrices \(\boldsymbol R_{k}=\boldsymbol \Lambda _{k}=\sigma _{x,k}^{2}\boldsymbol I\) and \(\boldsymbol R_{l,q}=\boldsymbol \Lambda _{l,q}=\sigma _{x,l}\sigma _{x,q}\boldsymbol I\), we can write

$$\begin{array}{*{20}l}{} \boldsymbol K_{k}(i+\!1)&=\!\sum\limits_{l\in \mathcal{N}_{k}} \mathbb{E}\left[\alpha_{kl}^{2}(i)c_{kl}^{2}(i)\right]\left(\left(\boldsymbol I\,-\,\mu_{l}\boldsymbol\Lambda_{l}\right)\boldsymbol K_{l}(i)\left(\boldsymbol I\,-\,\mu_{l}\boldsymbol\Lambda_{l}\right)\right.\\ &\left.\quad+{\mu_{l}^{2}}e_{0-l}(i+1)e_{0-l}^{*}(i+1)\boldsymbol\Lambda_{l}\right)\\ &\quad+\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}} \mathbb{E}\left[\alpha_{kl}(i)\alpha_{kq}(i)c_{kl}(i)c_{kq}(i)\right]\\ &\quad\times \left(\left(\boldsymbol I-\mu_{l}\boldsymbol\Lambda_{l}\right)\boldsymbol K_{l,q}(i)\left(\boldsymbol I-\mu_{q}\boldsymbol\Lambda_{q}\right)\right.\\ &\left.\quad+\mu_{l}\mu_{q}e_{0-l}(i+1)e_{0-q}^{*}(i+1)\boldsymbol\Lambda_{l,q}\right)\\ &\quad+\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\mathbb{E}\left[\alpha_{kl}(i)\alpha_{kq}(i)c_{kl}(i)c_{kq}(i)\right]\\ &\quad\times\left(\left(\boldsymbol I-\mu_{q}\boldsymbol\Lambda_{q}\right)\boldsymbol K_{l,q}^{H}(i)\left(\boldsymbol I-\mu_{l}\boldsymbol\Lambda_{l}\right)\right.\\ &\left.\quad+\mu_{l}\mu_{q}e_{0-q}(i+1)e_{0-l}^{*}(i+1)\boldsymbol\Lambda_{l,q}^{H}\right). \end{array} $$
((54))

Due to the structure of the above equations, the approximations, and the quantities involved, we can decouple (54) into

$$\begin{array}{*{20}l}{} {K_{k}^{n}}(i+\!1)&=\!\!\sum\limits_{l\in \mathcal{N}_{k}} \mathbb{E}\left[\alpha_{kl}^{2}(i)c_{kl}^{2}(i)\right]\!\left(\left(1\,-\,\mu_{l}{\lambda_{l}^{n}}\right){K_{l}^{n}}(i)\left(1\,-\,\mu_{l}{\lambda_{l}^{n}}\right)\right.\\ &\quad \left. +{\mu_{l}^{2}}e_{0-l}(i+1)e_{0-l}^{*}(i+1){\lambda_{l}^{n}}\right)\\ &\quad+\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\mathbb{E}\left[\alpha_{kl}(i)\alpha_{kq}(i)c_{kl}(i)c_{kq}(i)\right]\\ &\quad\times\left(\left(1-\mu_{l}{\lambda_{l}^{n}}\right)K_{l,q}^{n}(i)\left(1-\mu_{q}{\lambda_{q}^{n}}\right)\right.\\ &\quad\left. +\mu_{l}\mu_{q}e_{0-l}(i+1)e_{0-q}^{*}(i+1)\lambda_{l,q}^{n}\right)\\ &\quad+\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\mathbb{E}\left[\alpha_{kl}(i)\alpha_{kq}(i)c_{kl}(i)c_{kq}(i)\right]\\ &\quad\times\left(\left(1-\mu_{q}{\lambda_{q}^{n}}\right)(K_{l,q}^{n}(i))^{H}\left(1-\mu_{l}{\lambda_{l}^{n}}\right)\right.\\ &\quad\left. +\mu_{l}\mu_{q}e_{0-q}(i+1)e_{0-l}^{*}(i+1)\lambda_{l,q}^{n}\right), \end{array} $$
((55))

where \({K_{k}^{n}}(i+1)\) is the nth element of the main diagonal of K_k(i+1). With the assumption that α_kl(i) and c_kl(i) are statistically independent from the other terms in (51), we can rewrite (55) as

$$\begin{array}{*{20}l} {}{K_{k}^{n}}(i+1)&=\sum\limits_{l\in \mathcal{N}_{k}} \mathbb{E}\left[\alpha_{kl}^{2}(i)\right]\mathbb{E}\left[c_{kl}^{2}(i)\right]\left(\left(1\,-\,\mu_{l}{\lambda_{l}^{n}}\right)^{2}{K_{l}^{n}}(i)\right.\\ &\left.\quad+{\mu_{l}^{2}}e_{0-l}(i+1)e_{0-l}^{*}(i+1){\lambda_{l}^{n}}\right)\\ &\quad+2\times\!\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\!\mathbb{E}\left[\alpha_{kl}(i)\alpha_{kq}(i)\right]\mathbb{E}\left[c_{kl}(i)c_{kq}(i)\right]\\ &\quad\times\left(\left(1-\mu_{l}{\lambda_{l}^{n}}\right)\left(1-\mu_{q}{\lambda_{q}^{n}}\right)K_{l,q}^{n}(i) \right.\\ &\left.\quad+\mu_{l}\mu_{q}e_{0-l}(i+1)e_{0-q}^{*}(i+1)\lambda_{l,q}^{n}\right). \end{array} $$
((56))

By taking i→∞, we can obtain (57).

$$ \begin{aligned} {K_{k}^{n}}(\text{ES-LMS})=\frac{\sum\limits_{l\in \mathcal{N}_{k}}\alpha_{kl}^{2}c_{kl}^{2}{\mu_{l}^{2}}\mathcal{J}_{min-l}{\lambda_{l}^{n}}+2\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\alpha_{kl}\alpha_{kq}c_{kl}c_{kq}\mu_{l}\mu_{q}e_{0-l}e_{0-q}^{*}\lambda_{l,q}^{n}}{1-\sum\limits_{l\in \mathcal{N}_{k}}\alpha_{kl}^{2}c_{kl}^{2}\left(1-\mu_{l}{\lambda_{l}^{n}}\right)^{2}-2\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\alpha_{kl}\alpha_{kq}c_{kl}c_{kq}\left(1-\mu_{l}{\lambda_{l}^{n}}\right)\left(1-\mu_{q}{\lambda_{q}^{n}}\right)}. \end{aligned} $$
((57))

We assume that the choice of the covariance matrix A_k for node k is fixed upon convergence of the proposed algorithms; as a result, the covariance matrix A_k is deterministic and does not vary. In the above example, we assume the choice of A_3 is fixed as shown in Fig. 4.

Fig. 4 Covariance matrix A_3 upon convergence

Then, the coefficients α_kl will also be fixed and given by

$$ \left\{\begin{array}{l} \alpha_{31}=0\\ \alpha_{32}=1\\ \alpha_{33}=1\\ \alpha_{34}=0\\ \alpha_{35}=1 \end{array} \right. $$

as well as the parameters c_kl, which are computed using the Metropolis combining rule. As a result, the coefficients α_kl and the coefficients c_kl are deterministic and can be taken out of the expectation. The MSE is then given by

$$ \mathcal{J}_{\text{mse}-k}=\mathcal{J}_{\text{min}-k}+M\sigma_{x,k}^{2}\sum\limits_{n=1}^{M}{K_{k}^{n}}(\text{ES-LMS}). $$
((58))
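To show how this prediction can be evaluated, the sketch below computes K_k^n from (57) and the MSE from (58) under the white-input assumption R_k = σ_{x,k}² I and, as an additional simplification that is not part of the paper's derivation, spatially independent noise so that the cross terms e_{0−l} e_{0−q}^* average to zero. The helper name es_lms_theory_mse and all numerical values are illustrative.

```python
# Minimal sketch evaluating (57)-(58) for one node k with white inputs and, for
# simplicity, vanishing cross terms e_{0-l} e_{0-q}^* (assumed independent noise).
from itertools import combinations

def es_lms_theory_mse(alpha, c, mu, sigma_x, J_min, M_len, k):
    """alpha, c, mu, sigma_x, J_min are dicts indexed by the nodes in N_k."""
    nodes = sorted(alpha)
    # numerator of (57); cross terms dropped under the independent-noise assumption
    num = sum(alpha[l]**2 * c[l]**2 * mu[l]**2 * J_min[l] * sigma_x[l]**2 for l in nodes)
    den = 1.0 - sum(alpha[l]**2 * c[l]**2 * (1.0 - mu[l] * sigma_x[l]**2)**2 for l in nodes)
    for l, q in combinations(nodes, 2):          # each unordered pair once, doubled below
        den -= 2.0 * alpha[l] * alpha[q] * c[l] * c[q] \
               * (1.0 - mu[l] * sigma_x[l]**2) * (1.0 - mu[q] * sigma_x[q]**2)
    K_n = num / den                              # eq. (57); identical for every n here
    return J_min[k] + M_len * sigma_x[k]**2 * sum(K_n for _ in range(M_len))   # eq. (58)

# Toy usage for node 3 with N_3 = {2, 3, 5} and the steady-state choice of Fig. 4
k, M_len = 3, 4
nodes = [2, 3, 5]
alpha = {l: 1 for l in nodes}                    # alpha_32 = alpha_33 = alpha_35 = 1
c = {l: 1.0 / 3.0 for l in nodes}                # Metropolis weights
mu = {l: 0.05 for l in nodes}
sigma_x = {l: 1.0 for l in nodes}
J_min = {l: 0.01 for l in nodes}                 # illustrative MMSE values
print(es_lms_theory_mse(alpha, c, mu, sigma_x, J_min, M_len, k))
```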

4.2.2 SI-LMS

For the SI-LMS algorithm, we do not need to consider all possible combinations. This algorithm simply adjusts the combining coefficients for each node with its neighbors in order to select the neighbor nodes that yield the smallest MSE values. Thus, we redefine the combining coefficients through (28)

$$ c_{kl-\text{new}}=c_{kl}-\rho\varepsilon\frac{{\text{sign}}(|e_{kl}|)}{1+\varepsilon|\xi_{\text{min}}|}\ \ (l\in\mathcal{N}_{k}). $$
((59))

For each node k, at time instant i, after it receives the estimates from all its neighbors, it calculates the error pattern e_kl(i) for every estimate received through Eq. (20) and finds the nodes with the largest and smallest errors. An error vector \( \hat {\boldsymbol e}_{k}\) is then defined through (23), which contains all error patterns e_kl(i) for node k.

Then a procedure which is detailed after Eq. (23) is carried out and modifies the error vector \( \hat {\boldsymbol e}_{k}\). For example, suppose node 5 has three neighbor nodes, which are nodes 3, 6, and 8. The error vector \(\hat {\boldsymbol e}_{5}\) has the form described by \( \hat {\boldsymbol e}_{5}=~\left [e_{53},e_{55},e_{56},e_{58}\right ]=~\left [0.023,0.052,-0.0004,-0.012\right ]\). After the modification, the error vector \( \hat {\boldsymbol e}_{5}\) will be edited as \( \hat {\boldsymbol e}_{5}=~\left [0,0.052,-0.0004,0\right ]\). The quantity h_kl is then defined as

$$ h_{kl}=\rho\varepsilon\frac{{\text{sign}}(|e_{kl}|)}{1+\varepsilon|\xi_{\mathrm min}|}\ \ (l\in\mathcal{N}_{k}), $$
((60))

and the error pattern e_kl in (60) is taken from the modified error vector \( \hat {\boldsymbol e}_{k}\).

From [29], we employ the relation \(\mathbb {E}\, [\!\text {sign}(e_{\textit {kl}})]\approx \text {sign}(e_{0-k})\). According to Eqs. (1) and (38), when the proposed algorithm converges at node k, or as the time instant i goes to infinity, we assume that the error e_{0−k} will be equal to the noise variance at node k. Then, the asymptotic value h_kl takes one of three values, according to the rule of the SI-LMS algorithm:

$${} {h_{kl}}=\! \left\{\begin{array}{ll} {\rho\varepsilon\frac{{\text{sign}}(|e_{0-k}|)}{1+\varepsilon|e_{0-k}|}} & \text{for the node with the largest MSE}\\ {\rho\varepsilon\frac{{\text{sign}}(-|e_{0-k}|)}{1+\varepsilon|e_{0-k}|}}& \text{for the node with the smallest MSE}\\ 0 & \text{for all the remaining nodes}. \end{array} \right. $$
((61))

Under this situation, after the time instant i goes to infinity, the parameter h_kl for each neighbor node of node k can be obtained through (61), and the quantity h_kl will be deterministic and can be taken out of the expectation.

Finally, removing the random variables α_kl(i) and inserting (59) and (60) into (57), the asymptotic values \({K_{k}^{n}}\) for the SI-LMS algorithm are obtained as in (62).

$$ \begin{aligned} &{K_{k}^{n}}(\text{SI-LMS})= \frac{\sum\limits_{l\in \mathcal{N}_{k}}(c_{kl}-h_{kl})^{2}{\mu_{l}^{2}}\mathcal{J}_{min-l}{\lambda_{l}^{n}}+2\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}(c_{kl}-h_{kl})(c_{kq}-h_{kq})\mu_{l}\mu_{q}e_{0-l}e_{0-q}^{*}\lambda_{l,q}^{n}}{1-\sum\limits_{l\in \mathcal{N}_{k}}\left(c_{kl}-h_{kl}\right)^{2}\left(1-\mu_{l}{\lambda_{l}^{n}}\right)^{2}-2\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\left(c_{kl}-h_{kl}\right)\left(c_{kq}-h_{kq}\right)\left(1-\mu_{l}{\lambda_{l}^{n}}\right)\left(1-\mu_{q}{\lambda_{q}^{n}}\right)}. \end{aligned} $$
((62))

At this point, the theoretical results are deterministic, and the MSE for the SI-LMS algorithm is given by

$$ \mathcal{J}_{\text{mse}-k}=\mathcal{J}_{\text{min}-k}+M\sigma_{x,k}^{2}\sum\limits_{n=1}^{M}{K_{k}^{n}}(\text{SI-LMS}). $$
((63))
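For completeness, the snippet below computes the asymptotic quantities h_kl in (61) and the effective combining weights c_kl − h_kl that replace α_kl c_kl when evaluating (62)–(63). Which neighbor has the largest or smallest steady-state MSE, the values of ρ, ε and |e_{0−k}|, and the helper name si_lms_effective_weights are illustrative assumptions.

```python
# Minimal sketch of the asymptotic SI-LMS weights h_kl from (61) and c_kl - h_kl.
def si_lms_effective_weights(c, e0_abs_k, node_max, node_min, rho=0.02, eps=10.0):
    """c is a dict of Metropolis weights over N_k; e0_abs_k = |e_{0-k}| at node k."""
    h = {l: 0.0 for l in c}                                    # remaining nodes: h_kl = 0
    h[node_max] = rho * eps / (1.0 + eps * e0_abs_k)           # neighbor with largest MSE
    h[node_min] = -rho * eps / (1.0 + eps * e0_abs_k)          # neighbor with smallest MSE
    return {l: c[l] - h[l] for l in c}                         # used in place of alpha_kl c_kl

c = {2: 1 / 3, 3: 1 / 3, 5: 1 / 3}
print(si_lms_effective_weights(c, e0_abs_k=0.1, node_max=2, node_min=5))
```

Note that the returned weights still sum to one, consistent with the remark after (28) that the penalty on the largest-MSE node and the reward on the smallest-MSE node cancel each other.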

4.2.3 ES-RLS

For the proposed ES-RLS algorithm, we start from (11). After inserting (11) into (18), we have

$$\begin{array}{*{20}l} {\boldsymbol {\omega}}_{k}(i+1)&= \sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i) \boldsymbol\psi_{l}(i+1)\\ &=\sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\left[{\boldsymbol {\omega}}_{l}(i)+\boldsymbol k_{l}(i+1)e^{*}_{l}(i+1)\right]\\ &=\sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\left[{\boldsymbol {\omega}}_{l}(i)+\boldsymbol k_{l}(i+1)(d_{l}(i+1)-\boldsymbol {x_{l}^{H}}(i+1)\boldsymbol{\omega}_{l}(i))\right]. \end{array} $$
((64))

Then, subtracting ω_0 from both sides of (64), we arrive at

$$\begin{array}{*{20}l} {\boldsymbol {\varepsilon}}_{k}(i+1)&= \sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\left[{\boldsymbol {\omega}}_{l}(i)+\boldsymbol k_{l}(i+1)(d_{l}(i+1)-\boldsymbol {x_{l}^{H}}(i+1)\boldsymbol{\omega}_{l}(i))\right]\\&-\sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\boldsymbol \omega_{0}\\ &=\sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\left[{\boldsymbol {\varepsilon}}_{l}(i)+\boldsymbol k_{l}(i+1)\left(d_{l}(i+1)-\boldsymbol {x_{l}^{H}}(i+1)(\boldsymbol {\varepsilon}_{l}(i)+\boldsymbol \omega_{0})\right)\right]\\ &=\sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\left[\left(\boldsymbol I-\boldsymbol k_{l}(i+1)\boldsymbol {x_{l}^{H}}(i+1)\right){\boldsymbol {\varepsilon}}_{l}(i)+\boldsymbol k_{l}(i+1)e_{0-l}^{*}(i+1)\right]. \end{array} $$
((65))

Then, with the random variables α_kl(i), (65) can be rewritten as

$$\begin{array}{*{20}l} {\boldsymbol {\varepsilon}}_{k}(i+1)&=\sum\limits_{l\in \mathcal{N}_{k}} \alpha_{kl}(i)c_{kl}(i)\left[\left(\boldsymbol I-\boldsymbol k_{l}(i+1)\boldsymbol {x_{l}^{H}}(i+1)\right){\boldsymbol {\varepsilon}}_{l}(i)+\boldsymbol k_{l}(i+1)e_{0-l}^{*}(i+1)\right]. \end{array} $$
((66))

Since \(\boldsymbol k_{l}(i+1)=\boldsymbol \Phi ^{-1}_{l}(i+1)\boldsymbol x_{l}(i+1)\) [22], we can modify (66) as

$$\begin{array}{*{20}l} {\boldsymbol {\varepsilon}}_{k}(i+1)&=\sum\limits_{l\in \mathcal{N}_{k}} \alpha_{kl}(i)c_{kl}(i)\left[\left(\boldsymbol I-\boldsymbol\Phi^{-1}_{l}(i+1)\boldsymbol x_{l}(i+1)\boldsymbol {x_{l}^{H}}(i+1)\right){\boldsymbol {\varepsilon}}_{l}(i)+\boldsymbol\Phi^{-1}_{l}(i+1)\boldsymbol x_{l}(i+1)e_{0-l}^{*}(i+1)\right]. \end{array} $$
((67))

Comparing (67) with (51), the only difference is that \(\boldsymbol \Phi^{-1}_{l}(i+1)\) in (67) replaces \(\mu_{l}\) in (51). From [22], we also have

$$ \mathbb{E}\left[\boldsymbol \Phi^{-1}_{l}(i+1)\right]= \frac{1}{i-M}\boldsymbol R_{l}^{-1}(i+1)\ \ \ \ \text{for}\ i>M+1. $$
((68))
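As a quick numerical illustration of (68) (our own sketch, using real-valued white regressors rather than the paper's complex data), one can average the inverse of a sample correlation matrix over many trials and compare it with \(\boldsymbol R_{l}^{-1}/(i-M)\):

```python
import numpy as np

# Monte Carlo sanity check of (68): for a sample correlation matrix Phi built from
# i+1 i.i.d. real white regressors (so R = I), the average of Phi^{-1} is close to
# R^{-1} / (i - M). This sketch is ours, not part of the paper.
rng = np.random.default_rng(0)
M, i, trials = 4, 50, 2000
acc = np.zeros((M, M))
for _ in range(trials):
    X = rng.standard_normal((i + 1, M))   # regressors x_l(1), ..., x_l(i+1)
    Phi = X.T @ X                         # sample correlation matrix Phi_l(i+1)
    acc += np.linalg.inv(Phi)
print(np.diag(acc / trials))              # each entry is close to 1 / (i - M) ~ 0.0217
```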

Combining (68) with (67), we arrive at

$$\begin{array}{*{20}l} \boldsymbol K_{k}(i+1)&=\sum\limits_{l\in \mathcal{N}_{k}} \mathbb{E}\left[\alpha_{kl}^{2}(i)c_{kl}^{2}(i)\right]\left(\left(\boldsymbol I-\frac{\boldsymbol\Lambda_{l}^{-1}\boldsymbol\Lambda_{l}}{i-M}\right)\boldsymbol K_{l}(i)\left(\boldsymbol I-\frac{\boldsymbol\Lambda_{l}\boldsymbol\Lambda_{l}^{-1}}{i-M}\right)\right. \end{array} $$
((69))
$$\begin{array}{*{20}l} &\left.\quad+\frac{\boldsymbol\Lambda_{l}^{-1}\boldsymbol\Lambda_{l}\boldsymbol\Lambda_{l}^{-1}}{(i-M)^{2}}e_{0-l}(i+1)e_{0-l}^{*}(i+1)\right)+\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\mathbb{E}\left[\alpha_{kl}(i)\alpha_{kq}(i)c_{kl}(i)c_{kq}(i)\right]\\ &\quad\times\left(\left(\boldsymbol I-\frac{\boldsymbol\Lambda_{l}^{-1}\boldsymbol\Lambda_{l}}{i-M}\right)\boldsymbol K_{l,q}(i)\left(\boldsymbol I-\frac{\boldsymbol\Lambda_{q}\boldsymbol\Lambda_{q}^{-1}}{i-M}\right) +\frac{\boldsymbol\Lambda_{l}^{-1}\boldsymbol\Lambda_{l,q}\boldsymbol\Lambda_{q}^{-1}}{(i-M)^{2}}e_{0-l}(i+1)\right.\\ &\left.\quad\times e_{0-q}^{*}(i+1){\vphantom{\boldsymbol I-\frac{\boldsymbol\Lambda_{l}^{-1}\boldsymbol\Lambda_{l}}{i-M}}}\right)+\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\mathbb{E}\left[\alpha_{kl}(i)\alpha_{kq}(i)c_{kl}(i)c_{kq}(i)\right]\\ &\quad\times\left(\left(\boldsymbol I-\frac{\boldsymbol\Lambda_{q}\boldsymbol\Lambda_{q}^{-1}}{i-M}\right)\boldsymbol K_{l,q}^{H}(i)\left(\boldsymbol I-\frac{\boldsymbol\Lambda_{l}^{-1}\boldsymbol\Lambda_{l}}{i-M}\right)+\frac{\boldsymbol\Lambda_{q}^{-1}\boldsymbol\Lambda_{l,q}^{H}\boldsymbol\Lambda_{l}^{-1}}{(i-M)^{2}}e_{0-q}(i+1)e_{0-l}^{*}(i+1)\right). \end{array} $$
((70))

Due to the structure of the above equations, the approximations and the quantities involved, we can decouple (70) into

$$\begin{array}{*{20}l} {K_{k}^{n}}(i+1)&=\sum\limits_{l\in \mathcal{N}_{k}} \mathbb{E}\left[\alpha_{kl}^{2}(i)c_{kl}^{2}(i)\right]\left(\left(1-\frac{1}{i-M}\right)^{2} {K_{l}^{n}}(i)+\frac{e_{0-l}(i+1)e_{0-l}^{*}(i+1)}{{\lambda_{l}^{n}}(i-M)^{2}}\right)\\ &\quad+\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\mathbb{E}\left[\alpha_{kl}(i)\alpha_{kq}(i)c_{kl}(i)c_{kq}(i)\right]\left(\left(1-\frac{1}{i-M}\right)^{2} K_{l,q}^{n}(i)+\frac{\lambda_{l,q}^{n} e_{0-l}(i+1)e_{0-q}^{*}(i+1)}{(i-M)^{2}{\lambda_{l}^{n}}{\lambda_{q}^{n}}}\right)\\ &\quad+\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\mathbb{E}\left[\alpha_{kl}(i)\alpha_{kq}(i)c_{kl}(i)c_{kq}(i)\right]\times\left(\left(1-\frac{1}{i-M}\right)^{2}\left(K_{l,q}^{n}(i)\right)^{H}+\frac{\lambda_{l,q}^{n} e_{0-q}(i+1)e_{0-l}^{*}(i+1)}{(i-M)^{2}{\lambda_{q}^{n}}{\lambda_{l}^{n}}}\right) \end{array} $$
((71))

where \({K_{k}^{n}}(i+1)\) is the nth element of the main diagonal of \(\boldsymbol K_{k}(i+1)\). Under the assumption that \(\alpha_{kl}\) and \(c_{kl}\) do not vary upon convergence, because at steady state the choice of the subset \(\widehat {\Omega }_{k}(i)\) for each node k is fixed, we can rewrite (71) as (72). Then, the MSE is given by

$$ {K_{k}^{n}}(\text{ES-RLS})=\frac{\sum\limits_{l\in \mathcal{N}_{k}}\alpha_{kl}^{2}c_{kl}^{2}\frac{\mathcal{J}_{min-l}}{{\lambda_{l}^{n}}(i-M)^{2}}+2\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\alpha_{kl}\alpha_{kq}c_{kl}c_{kq}\frac{\lambda_{l,q}^{n} e_{0-l}e_{0-q}^{*}}{(i-M)^{2}{\lambda_{l}^{n}}{\lambda_{q}^{n}}}}{1-\sum\limits_{l\in \mathcal{N}_{k}}\alpha_{kl}^{2}c_{kl}^{2}\left(1-\frac{1}{i-M}\right)^{2}-2\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\alpha_{kl}\alpha_{kq}c_{kl}c_{kq}\left(1-\frac{1}{i-M}\right)^{2}}. $$
((72))
$$ \mathcal{J}_{mse-k}=\mathcal{J}_{min-k}+M\sigma_{x,k}^{2}\sum_{n=1}^{M}{K_{k}^{n}}(\text{ES-RLS}). $$
((73))

On the basis of (72), we have that when i tends to infinity, the MSE approaches the MMSE in theory [22].

4.2.4 SI-RLS

For the proposed SI-RLS algorithm, we insert (59) into (72), remove the random variables \(\alpha_{kl}(i)\), and, following the same procedure as for the SI-LMS algorithm, obtain (74), where \(h_{kl}\) and \(h_{kq}\) satisfy the rule in (61). Then, the MSE is given by

$$\begin{array}{*{20}l} &{K_{k}^{n}}(\text{SI-RLS})=\frac{\sum\limits_{l\in \mathcal{N}_{k}}(c_{kl}-h_{kl})^{2}\frac{\mathcal{J}_{\text{min}-l}}{{\lambda_{l}^{n}}(i-M)^{2}}+2\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}\left(c_{kl}-h_{kl}\right)(c_{kq}-h_{kq})\frac{\lambda_{l,q}^{n} e_{0-l}e_{0-q}^{*}}{(i-M)^{2}{\lambda_{l}^{n}}{\lambda_{q}^{n}}}}{1-\sum\limits_{l\in \mathcal{N}_{k}}(c_{kl}-h_{kl})^{2}\left(1-\frac{1}{i-M}\right)^{2}-2\sum\limits_{\substack{{l,q}\in \mathcal{N}_{k} \\ l\neq q}}(c_{kl}-h_{kl})(c_{kq}-h_{kq})\left(1-\frac{1}{i-M}\right)^{2}}. \end{array} $$
((74))
$$ \mathcal{J}_{\text{mse}-k}=\mathcal{J}_{\text{min}-k}+M\sigma_{x,k}^{2}\sum\limits_{n=1}^{M}{K_{k}^{n}}(\text{SI-RLS}). $$
((75))

In conclusion, according to (62) and (74), with the help of the modified combining coefficients, the neighbor node with the lowest MSE contributes the most to the combination in the proposed SI-type algorithms, while the neighbor node with the highest MSE contributes the least. Therefore, the proposed SI-type algorithms perform better than the standard diffusion algorithms with fixed combining coefficients.
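To make this behavior concrete, the toy sketch below (our own illustration, not the actual adjustment rule defined in (59)-(61)) shifts a small amount ρ of combining weight from the neighbor with the largest error to the neighbor with the smallest one, so that the modified coefficients \(c_{kl}-h_{kl}\) favor the better-performing links while still summing to one.

```python
import numpy as np

# Toy illustration only -- NOT the rule (59)-(61) from the paper. It simply moves a
# small amount rho of combining weight from the worst-performing neighbor to the
# best-performing one, mimicking the qualitative effect described above.
# Assumes node k has at least two neighbors.
def reweight_combiners(c_k, error_magnitudes, rho=4e-3):
    """c_k: combining weights of node k; error_magnitudes: |e_l| for each neighbor l."""
    h = np.zeros_like(c_k)
    h[np.argmax(error_magnitudes)] = rho     # penalize the link with the largest error
    h[np.argmin(error_magnitudes)] = -rho    # reward the link with the smallest error
    return c_k - h                           # modified coefficients (c_kl - h_kl), still summing to one
```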

4.3 Tracking analysis

In this subsection, we assess the proposed ES-LMS/RLS and SI-LMS/RLS algorithms in a non-stationary environment, in which the algorithms have to track the minimum point of the error performance surface [34, 35]. In the time-varying scenarios of interest, the optimum estimate is assumed to vary according to the model ω 0(i+1)=β ω 0(i)+q(i), where q(i) denotes a random perturbation [32] and β=1 in order to facilitate the analysis. This is typical in the context of tracking analysis of adaptive algorithms [22, 32, 36, 37].

4.3.1 ES-LMS

For the tracking analysis of the ES-LMS algorithm, we employ Assumption III and start from (48). After subtracting \(\boldsymbol \omega_{0}(i+1)\) from both sides of (48), we obtain

$$\begin{array}{*{20}l}{} {\boldsymbol {\varepsilon}}_{k}(i+1)&= \sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)[\!{\boldsymbol {\omega}}_{l}(i)+{\mu}_{l} {\boldsymbol x_{l}(i+1)}(d_{l}(i+1)\\ &\quad-\boldsymbol {x_{l}^{H}}(i+1)\boldsymbol{\omega}_{l}(i))]-\sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\boldsymbol \omega_{0}(i+1)\\ &= \sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)[\!{\boldsymbol {\omega}}_{l}(i)+{\mu}_{l} {\boldsymbol x_{l}(i+1)}(d_{l}(i+1)\\ &\quad-\boldsymbol {x_{l}^{H}}(i+1)\boldsymbol{\omega}_{l}(i))]\,-\,\sum\limits_{l\in \widehat{\Omega}_{k}(i)} \!c_{kl}(i)\left(\!\boldsymbol \omega_{0}(i)+\boldsymbol q(i)\right)\\ &\quad =\sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\left[\vphantom{\left[\left(\boldsymbol I-{\mu}_{l} {\boldsymbol x_{l}(i+1)}\boldsymbol {x_{l}^{H}}(i+1)\right){\boldsymbol {\varepsilon}}_{l}(i)\right.}{\boldsymbol {\varepsilon}}_{l}(i)+{\mu}_{l} {\boldsymbol x_{l}(i+1)}\left(d_{l}(i+1)\right.\right.\\ &\left.\left.\quad-\boldsymbol {x_{l}^{H}}(i+1)(\boldsymbol {\varepsilon}_{l}(i)+\boldsymbol \omega_{0})\right)\right]-\boldsymbol q(i)\\ &=\sum\limits_{l\in \widehat{\Omega}_{k}(i)} c_{kl}(i)\left[\left(\boldsymbol I-{\mu}_{l} {\boldsymbol x_{l}(i+1)}\boldsymbol {x_{l}^{H}}(i+1)\right){\boldsymbol {\varepsilon}}_{l}(i)\right.\\ &\left.\quad+{\mu}_{l} {\boldsymbol x_{l}(i+1)}e_{0-l}^{*}(i+1)\vphantom{\left[\left(\boldsymbol I-{\mu}_{l} {\boldsymbol x_{l}(i+1)}\boldsymbol {x_{l}^{H}}(i+1)\right){\boldsymbol {\varepsilon}}_{l}(i)\right.}\right]-\boldsymbol q(i). \end{array} $$
((76))

Using Assumption III, we can arrive at

$$ {\mathcal{J}_{ex-k}(i+1)}=\text{tr}\{\boldsymbol R_{k}(i+1)\boldsymbol K_{k}(i+1)\}+\text{tr}\{\boldsymbol R_{k}(i+1)\boldsymbol Q\}. $$
((77))

The first term on the right-hand side of (77) has already been obtained in the steady-state MSE analysis in Section 4.2. The second term can be decomposed as

$$ \begin{aligned} \text{tr}\{\boldsymbol R_{k}(i+1)\boldsymbol Q\}&=\text{tr}\left\{\mathbb{E}\left[\boldsymbol x_{k}(i+1)\boldsymbol {x_{k}^{H}}(i+1)\right]\mathbb{E}\left[\boldsymbol q(i)\boldsymbol q^{H}(i)\right]\right\}\\ &=M\sigma_{x,k}^{2}\text{tr}\{\boldsymbol Q\}. \end{aligned} $$
((78))

The MSE is then obtained as

$${} \mathcal{J}_{mse-k}=\mathcal{J}_{min-k}\,+\,M\sigma_{x,k}^{2}\sum\limits_{n=1}^{M}{K_{k}^{n}}(\text{ES-LMS})\,+\,M\sigma_{x,k}^{2}\text{tr}\{\boldsymbol Q\}. $$
((79))
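For reference, (79) is straightforward to evaluate numerically once the per-mode factors \({K_{k}^{n}}\) are available; a minimal sketch (the helper name and arguments are ours) is given below.

```python
# Minimal sketch (helper name and arguments are ours) that evaluates the tracking
# MSE prediction in (79) from the minimum MSE, the input variance, the per-mode
# factors K_k^n obtained from the steady-state analysis, and tr{Q}.
def tracking_mse(J_min_k, sigma_x2_k, K_k_modes, trace_Q):
    M = len(K_k_modes)                           # filter length M (number of modes)
    excess = M * sigma_x2_k * sum(K_k_modes)     # steady-state excess-MSE term
    lag = M * sigma_x2_k * trace_Q               # tracking (lag) term caused by q(i)
    return J_min_k + excess + lag

# Example: tracking_mse(0.01, 1.0, [1e-3] * 10, 1e-3)
```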

4.3.2 SI-LMS

For the SI-LMS recursions, we follow the same procedure as for the ES-LMS algorithm and obtain

$${} \mathcal{J}_{\text{mse}-k}=\mathcal{J}_{\text{min}-k}+M\sigma_{x,k}^{2}\sum\limits_{n=1}^{M}{K_{k}^{n}}(\text{SI-LMS})+M\sigma_{x,k}^{2}\text{tr}\{\boldsymbol Q\}. $$
((80))

4.3.3 ES-RLS

For the ES-RLS algorithm, we follow the same procedure as for the ES-LMS algorithm and arrive at

$$\begin{array}{*{20}l}{} \mathcal{J}_{\text{mse}-k}(i+1)&=\mathcal{J}_{\text{min}-k}+M\sigma_{x,k}^{2}\sum\limits_{n=1}^{M}{K_{k}^{n}}(i+1)(\text{ES-RLS})\\ &\quad+M\sigma_{x,k}^{2}\text{tr}\{\boldsymbol Q\}. \end{array} $$
((81))

4.3.4 SI-RLS

We start from (75), and after a similar procedure to that of the SI-LMS algorithm, we have

$$\begin{array}{*{20}l}{} \mathcal{J}_{\text{mse}-k}(i+1)&=\mathcal{J}_{\text{min}-k}+M\sigma_{x,k}^{2}\sum\limits_{n=1}^{M}{K_{k}^{n}}(i+1)(\text{SI-RLS})\\ &\quad+M\sigma_{x,k}^{2}\text{tr}\{\boldsymbol Q\}. \end{array} $$
((82))

In conclusion, for time-varying scenarios, the MSE expressions of all the algorithms contain only one additional term, \(M\sigma _{x,k}^{2}\text {tr}\{\boldsymbol Q\}\), and this term takes the same value for every algorithm. As a result, the proposed SI-type algorithms still perform better than the standard diffusion algorithms with fixed combining coefficients, according to the conclusion obtained in the previous subsection.

4.4 Computational complexity

In the analysis of the computational cost of the algorithms studied, we assume complex-valued data and first analyze the adaptation step. For both ES-LMS/RLS and SI-LMS/RLS algorithms, the adaptation cost depends on the type of recursions (LMS or RLS) that each strategy employs. The details are shown in Table 2.

Table 2 Computational complexity for the adaptation step per node per time instant

The computational complexity of the combination step is analyzed in Table 3, and the overall complexity for each algorithm is summarized in Table 4. In these tables, t is the number of nodes chosen from \(|\mathcal {N}_{k}|\) and M is the length of the unknown vector \(\boldsymbol \omega_{0}\). The proposed algorithms require extra computations compared to the existing distributed LMS and RLS algorithms. This extra cost ranges from a small additional number of operations for the SI-LMS/RLS algorithms to a more significant cost that depends on \(|\mathcal {N}_{k}|\) for the ES-LMS/RLS algorithms; an illustrative sketch of this growth is given after Table 4.

Table 3 Computational complexity for the combination step per node per time instant
Table 4 Computational complexity per node per time instant
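The sketch below is an assumption on our part about how an exhaustive search over neighbor subsets can be organized (not the authors' implementation); it illustrates why the ES-type cost grows quickly with \(|\mathcal {N}_{k}|\).

```python
from itertools import combinations

# Illustrative sketch (our assumption, not the authors' implementation): enumerate
# every non-empty subset of node k's neighbor set, which is what makes an
# exhaustive-search cost grow quickly with |N_k|.
def candidate_link_sets(neighbors_k):
    subsets = []
    for t in range(1, len(neighbors_k) + 1):
        subsets.extend(combinations(neighbors_k, t))
    return subsets

# For |N_k| = 5 there are already 2^5 - 1 = 31 candidate subsets per time instant:
# len(candidate_link_sets([1, 2, 3, 4, 5])) == 31
```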

5 Simulations

In this section, we investigate the performance of the proposed link selection strategies for distributed estimation in two scenarios: wireless sensor networks and smart grids. In these applications, we simulate the proposed link selection strategies in both static and time-varying scenarios. We also show the analytical results for the MSE steady-state and tracking performances that we obtained in Section 4.

5.1 Diffusion wireless sensor networks

In this subsection, we compare the proposed ES-LMS/ES-RLS and SI-LMS/SI-RLS algorithms with the diffusion LMS algorithm [2], the diffusion RLS algorithm [38], and the single-link strategy [39] in terms of their MSE performance. A reduced-communication diffusion LMS algorithm, whose performance is comparable to or worse than that of the standard diffusion LMS algorithm, has been reported in [40] and may also be considered if a designer needs to reduce the required bandwidth.

The network topology is illustrated in Fig. 5, and we employ N=20 nodes in the simulations. The average node degree of the wireless sensor network is 5. The length of the unknown parameter vector \(\boldsymbol \omega_{0}\) is M=10, and it is generated as a complex random vector. The input signal is generated as \(\boldsymbol x_{k}(i)=[x_{k}(i)\ x_{k}(i-1)\ \ldots\ x_{k}(i-M+1)]\) with \(x_{k}(i)=u_{k}(i)+\alpha_{k}x_{k}(i-1)\), where \(\alpha_{k}\) is a correlation coefficient and \(u_{k}(i)\) is a white noise process with variance \(\sigma ^{2}_{u,k}= 1-|\alpha _{k}|^{2}\), which ensures that the variance of \(x_{k}(i)\) is \(\sigma ^{2}_{x,k}= 1\). The initial value \(x_{k}(0)\) is drawn from a Gaussian distribution with zero mean and variance \(\sigma ^{2}_{x,k}\). The noise samples are modeled as circular Gaussian noise with zero mean and variance \(\sigma ^{2}_{n,k}\in [0.001,0.01]\). The step size for the diffusion LMS, ES-LMS, and SI-LMS algorithms is μ=0.2. For the diffusion RLS, ES-RLS, and SI-RLS algorithms, the forgetting factor λ is set to 0.97 and δ is equal to 0.81. In the static scenario, the sparsity parameters of the SI-LMS/SI-RLS algorithms are set to ρ=4×10−3 and ε=10. The Metropolis rule is used to calculate the combining coefficients \(c_{kl}\). The MSE and MMSE are defined as in (3) and (46), respectively. The results are averaged over 100 independent runs.
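A minimal sketch of this input-signal model (the function name and interface are ours) is given below; it generates the AR(1) process and the length-M regressor used at each node.

```python
import numpy as np

# Sketch of the input model described above (function name and interface are ours):
# x_k(i) = u_k(i) + alpha_k * x_k(i-1), with the driving-noise variance chosen as
# 1 - |alpha_k|^2 so that the variance of x_k(i) is 1, plus the length-M regressor.
def generate_node_input(num_samples, M, alpha_k, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    sigma_u2 = 1.0 - abs(alpha_k) ** 2
    u = np.sqrt(sigma_u2 / 2.0) * (rng.standard_normal(num_samples)
                                   + 1j * rng.standard_normal(num_samples))
    x = np.zeros(num_samples, dtype=complex)
    x[0] = np.sqrt(0.5) * (rng.standard_normal() + 1j * rng.standard_normal())  # x_k(0): zero mean, unit variance
    for i in range(1, num_samples):
        x[i] = u[i] + alpha_k * x[i - 1]

    def regressor(i):
        """Regressor x_k(i) = [x_k(i), x_k(i-1), ..., x_k(i-M+1)], zero-padded for i < M-1."""
        return np.array([x[i - m] if i - m >= 0 else 0.0 for m in range(M)])

    return x, regressor

# Example: x, reg = generate_node_input(1000, M=10, alpha_k=0.5); reg(20) has length 10.
```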

Fig. 5 Wireless sensor network topology with 20 nodes

In Fig. 6, we can see that the ES-RLS has the best performance in terms of both steady-state MSE and convergence rate, and it obtains a gain of about 8 dB over the standard diffusion RLS algorithm. The SI-RLS is worse than the ES-RLS but is still significantly better than the standard diffusion RLS algorithm, by about 5 dB. Regarding the complexity and processing time, the SI-RLS is as simple as the standard diffusion RLS algorithm, while the ES-RLS is more complex. The proposed ES-LMS and SI-LMS algorithms are superior to the standard diffusion LMS algorithm.

Fig. 6 Network MSE curves in a static scenario

In the time-varying scenario, the sparsity parameters of the SI-LMS and SI-RLS algorithms are set to ρ=6×10−3 and ε=10. The unknown parameter vector ω 0 varies according to the first-order Markov vector process:

$$ {\boldsymbol \omega}_{0}(i+1)=\beta{\boldsymbol \omega}_{0}(i)+{\boldsymbol q(i)}, $$
((83))

where \(\boldsymbol q(i)\) is an independent zero-mean Gaussian vector process with variance \({\sigma ^{2}_{q}}= 0.01\) and β=0.9998.
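A one-line sketch of the first-order Markov model (83) is shown below; we assume a circular complex perturbation for consistency with the complex parameter vector, and the default values of β and \({\sigma ^{2}_{q}}\) are those quoted above.

```python
import numpy as np

# Sketch of the first-order Markov model (83); the circular complex form of q(i)
# is our assumption, made for consistency with the complex parameter vector.
def evolve_parameter(omega_0, beta=0.9998, sigma_q2=0.01, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    q = np.sqrt(sigma_q2 / 2.0) * (rng.standard_normal(omega_0.shape)
                                   + 1j * rng.standard_normal(omega_0.shape))  # perturbation q(i)
    return beta * omega_0 + q  # omega_0(i+1) = beta * omega_0(i) + q(i)
```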

Figure 7 shows that, similarly to the static scenario, the ES-RLS has the best performance and obtains a 5 dB gain over the standard diffusion RLS algorithm. The SI-RLS is slightly worse than the ES-RLS but is still better than the standard diffusion RLS algorithm by about 3 dB. The proposed ES-LMS and SI-LMS algorithms have the same advantage over the standard diffusion LMS algorithm in the time-varying scenario. Notice that in the scenario with large \(|\mathcal {N}_{k}|\), the proposed SI-type algorithms still have a better performance when compared with the standard techniques.

Fig. 7 Network MSE curves in a time-varying scenario

To illustrate the link selection for the ES-type algorithms, we provide Figs. 8 and 9. From these two figures, we can see that upon convergence, the proposed algorithms converge to a fixed selected set of links \(\widehat {\Omega }_{k}\).

Fig. 8 Set of selected links for node 16 with ES-LMS in a static scenario

Fig. 9 Link selection state for node 16 with ES-LMS in a time-varying scenario

5.2 MSE analytical results

The aim of this section is to validate the analytical results obtained in Section 4. First, we verify the MSE steady-state performance by comparing the analytical results in (58), (63), (73), and (75) with the results obtained by simulations under different SNR values, where the SNR is the ratio of the input signal variance to the noise variance. We assess the MSE against the SNR, as shown in Figs. 10 and 11. For the ES-RLS and SI-RLS algorithms, we use (73) and (75) to compute the MSE after convergence. We have assessed the proposed algorithms with SNR equal to 0, 10, 20, and 30 dB, with 20 nodes in the network; for the other parameters, we follow the same definitions used to obtain the network MSE curves in the static scenario, and the details are shown at the top of each subfigure in Figs. 10 and 11. The analytical curves for ES-LMS/RLS and SI-LMS/RLS are all close to those obtained by simulations, which indicates the validity of the analysis.

Fig. 10 a, b MSE performance against SNR for ES-LMS and SI-LMS

Fig. 11 a, b MSE performance against SNR for ES-RLS and SI-RLS

The tracking analysis of the proposed algorithms in a time-varying scenario is discussed as follows. Here, we verify that the results in (79), (80), (81), and (82) of Section 4.3 provide a means of estimating the MSE. We consider the same model as in (83), but with β set to 1. In the next examples, we employ N=20 nodes in the network and the same parameters used to obtain the network MSE curves in the time-varying scenario. A comparison of the curves obtained by simulations and by the analytical formulas is shown in Figs. 12 and 13, and the parameter details are shown at the top of each subfigure. From these curves, we can verify that the gap between the simulation and analytical results is very small under different SNR values.

Fig. 12 a, b MSE performance against SNR for ES-LMS and SI-LMS in a time-varying scenario

Fig. 13 a, b MSE performance against SNR for ES-RLS and SI-RLS in a time-varying scenario

5.3 Smart grids

The proposed algorithms provide a cost-effective tool that could be used for distributed state estimation in smart grid applications. In order to test the proposed algorithms in a possible smart grid scenario, we consider the IEEE 14-bus system [41], where 14 is the number of substations. At every time instant i, each bus k,k=1,2,…,14, takes a scalar measurement d k (i) according to

$$ {d_{k}(i)}= {X_{k} \left({\boldsymbol \omega}_{0}(i)\right)+ n_{k}(i)},~~~ k=1,2, \ldots, 14, $$
((84))

where \(\boldsymbol \omega_{0}(i)\) is the state vector of the entire interconnected system and \(X_{k}(\boldsymbol \omega_{0}(i))\) is a nonlinear measurement function of bus k. The quantity \(n_{k}(i)\) is the zero-mean measurement error at bus k.

Initially, we focus on the linearized DC state estimation problem. The state vector \(\boldsymbol \omega_{0}(i)\) is taken as the voltage phase angle vector \(\boldsymbol \omega_{0}\) for all buses. Therefore, the nonlinear measurement model for state estimation (84) is approximated by

$$ {d_{k}(i)}= {\boldsymbol {\omega_{0}^{H}}\boldsymbol x_{k}(i)+ n_{k}(i)},~~~ k=1,2, \ldots, 14, $$
((85))

where \(\boldsymbol x_{k}(i)\) is the measurement Jacobian vector for bus k. The aim of the distributed estimation algorithm is then to compute an estimate of \(\boldsymbol \omega_{0}\) that minimizes the cost function given by

$$ {J_{k}({\boldsymbol \omega_{k}(i)})} = {\mathbb{E} |{ d_{k}(i)}-{{\boldsymbol\omega_{k}^{H}}(i)}}{\boldsymbol x_{k}(i)}|^{2}. $$
((86))
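As a point of reference, the sketch below shows a standard adapt-then-combine diffusion LMS recursion (in the spirit of [2], not the proposed ES/SI link-selection algorithms) applied to the linearized model (85) and cost (86); the data layout, neighbor lists, and combining matrix C are our own assumptions.

```python
import numpy as np

# Standard ATC diffusion LMS sketch for the linearized model (85) and cost (86).
# This mirrors the baseline of [2], not the proposed ES/SI link-selection schemes.
# Assumptions: neighbors[k] lists the neighbors of node k (including k itself) and
# C[k, l] are combining coefficients (e.g., Metropolis weights) with rows summing to 1.
def diffusion_lms_step(W, X, d, neighbors, C, mu=0.15):
    """One time instant over all nodes.

    W : (N, M) current estimates omega_k(i)
    X : (N, M) regressors x_k(i+1) (measurement Jacobian rows)
    d : (N,)   measurements d_k(i+1)
    """
    psi = np.empty_like(W)
    for k in range(W.shape[0]):                       # adaptation step
        e_k = d[k] - np.vdot(W[k], X[k])              # a priori error d_k - omega_k^H x_k, as in (86)
        psi[k] = W[k] + mu * X[k] * np.conj(e_k)      # stochastic-gradient (LMS) update
    W_new = np.empty_like(W)
    for k in range(W.shape[0]):                       # combination step
        W_new[k] = sum(C[k, l] * psi[l] for l in neighbors[k])
    return W_new
```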

We compare the proposed algorithms with the M-CSE algorithm [4], the single-link strategy [39], the standard diffusion RLS algorithm [38], and the standard diffusion LMS algorithm [2] in terms of MSE performance. The MSE comparison is used to assess the accuracy of the algorithms and compare their rates of convergence. The IEEE 14-bus system used in the simulations is depicted in Fig. 14.

Fig. 14 IEEE 14-bus system for simulation

All buses are corrupted by additive white Gaussian noise with variance \(\sigma ^{2}_{n,k}\in [0.001,0.01]\). The step size for the standard diffusion LMS [2], ES-LMS, and SI-LMS algorithms is 0.15. The parameter vector \(\boldsymbol \omega_{0}\) is set to an all-one vector. For the diffusion RLS, ES-RLS, and SI-RLS algorithms, the forgetting factor λ is set to 0.945 and δ is equal to 0.001. The sparsity parameters of the SI-LMS/RLS algorithms are set to ρ=0.07 and ε=10. The results are averaged over 100 independent runs. We simulate the proposed algorithms for smart grids under a static scenario.

From Fig. 15, it can be seen that the ES-RLS has the best performance and significantly outperforms the standard diffusion LMS [2] and the M-CSE [4] algorithms. The ES-LMS is slightly worse than the ES-RLS but outperforms the remaining techniques. The SI-RLS is worse than the ES-LMS but is still better than the SI-LMS, while the SI-LMS remains better than the diffusion RLS, the diffusion LMS, the M-CSE algorithm, and the single-link strategy.

Fig. 15 MSE performance curves for smart grids

6 Conclusions

In this paper, we have proposed the ES-LMS/RLS and SI-LMS/RLS algorithms for distributed estimation in applications such as wireless sensor networks and smart grids. We have compared the proposed algorithms with existing methods. We have also devised analytical expressions to predict their MSE steady-state performance and tracking behavior. Simulation experiments have been conducted to verify the analytical results and illustrate that the proposed algorithms significantly outperform the existing strategies, in both static and time-varying scenarios, in examples of wireless sensor networks and smart grids.

References

  1. CG Lopes, AH Sayed, Incremental adaptive strategies over distributed networks. IEEE Trans. Signal Process. 48(8), 223–229 (2007).

  2. CG Lopes, AH Sayed, Diffusion least-mean squares over adaptive networks: formulation and performance analysis. IEEE Trans. Signal Process. 56(7), 3122–3136 (2008).

  3. Y Chen, Y Gu, AO Hero, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing. Sparse LMS for system identification (2009), pp. 3125–3128.

  4. L Xie, D-H Choi, S Kar, HV Poor, Fully distributed state estimation for wide-area monitoring systems. IEEE Trans. Smart Grid. 3(3), 1154–1169 (2012).

  5. Y-F Huang, S Werner, J Huang, N Kashyap, V Gupta, State estimation in electric power grids: meeting new challenges presented by the requirements of the future grid. IEEE Signal Process. Mag. 29(5), 33–43 (2012).

  6. D Bertsekas, A new class of incremental gradient methods for least squares problems. SIAM J. Optim. 7(4), 913–926 (1997).

  7. A Nedic, D Bertsekas, Incremental subgradient methods for nondifferentiable optimization. SIAM J. Optim. 12(1), 109–138 (2001).

  8. MG Rabbat, RD Nowak, Quantized incremental algorithms for distributed optimization. IEEE J. Sel. Areas Commun. 23(4), 798–808 (2005).

  9. FS Cattivelli, AH Sayed, Diffusion LMS strategies for distributed estimation. IEEE Trans. Signal Process. 58, 1035–1048 (2010).

  10. PD Lorenzo, S Barbarossa, AH Sayed, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing. Sparse diffusion LMS for distributed adaptive estimation (2012), pp. 3281–3284.

  11. G Mateos, ID Schizas, GB Giannakis, Distributed recursive least-squares for consensus-based in-network adaptive estimation. IEEE Trans. Signal Process. 57(11), 4583–4588 (2009).

  12. R Arablouei, K Doǧançay, S Werner, Y-F Huang, Adaptive distributed estimation based on recursive least-squares and partial diffusion. IEEE Trans. Signal Process. 62, 3510–3522 (2014).

  13. R Arablouei, S Werner, Y-F Huang, K Doǧançay, Distributed least mean-square estimation with partial diffusion. IEEE Trans. Signal Process. 62, 472–484 (2014).

  14. CG Lopes, AH Sayed, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing. Diffusion adaptive networks with changing topologies (Las Vegas, 2008), pp. 3285–3288.

  15. B Fadlallah, J Principe, in Proc. IEEE International Joint Conference on Neural Networks. Diffusion least-mean squares over adaptive networks with dynamic topologies (2013), pp. 1–6.

  16. S-Y Tu, AH Sayed, On the influence of informed agents on learning and adaptation over networks. IEEE Trans. Signal Process. 61, 1339–1356 (2013).

  17. T Wimalajeewa, SK Jayaweera, Distributed node selection for sequential estimation over noisy communication channels. IEEE Trans. Wirel. Commun. 9(7), 2290–2301 (2010).

  18. RC de Lamare, R Sampaio-Neto, Adaptive reduced-rank processing based on joint and iterative interpolation, decimation and filtering. IEEE Trans. Signal Process. 57(7), 2503–2514 (2009).

  19. RC de Lamare, PSR Diniz, Set-membership adaptive algorithms based on time-varying error bounds for CDMA interference suppression. IEEE Trans. Veh. Technol. 58(2), 644–654 (2009).

  20. L Guo, YF Huang, Frequency-domain set-membership filtering and its applications. IEEE Trans. Signal Process. 55(4), 1326–1338 (2007).

  21. A Bertrand, M Moonen, Distributed adaptive node-specific signal estimation in fully connected sensor networks–part II: simultaneous and asynchronous node updating. IEEE Trans. Signal Process. 58(10), 5292–5306 (2010).

  22. S Haykin, Adaptive Filter Theory, 4th edn. (Prentice Hall, Upper Saddle River, NJ, USA, 2002).

  23. L Li, JA Chambers, in Proc. IEEE/SP 15th Workshop on Statistical Signal Processing. Distributed adaptive estimation based on the APA algorithm over diffusion networks with changing topology (2009), pp. 757–760.

  24. X Zhao, AH Sayed, Performance limits for distributed estimation over LMS adaptive networks. IEEE Trans. Signal Process. 60(10), 5107–5124 (2012).

  25. L Xiao, S Boyd, Fast linear iterations for distributed averaging. Syst. Control Lett. 53(1), 65–78 (2004).

  26. R Olfati-Saber, RM Murray, Consensus problems in networks of agents with switching topology and time-delays. IEEE Trans. Autom. Control. 49, 1520–1533 (2004).

  27. A Jadbabaie, J Lin, AS Morse, Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Trans. Autom. Control. 48(6), 988–1001 (2003).

  28. R Meng, RC de Lamare, VH Nascimento, in Proc. Sensor Signal Processing for Defence. Sparsity-aware affine projection adaptive algorithms for system identification (London, UK, 2011).

  29. Y Chen, Y Gu, A Hero, Regularized least-mean-square algorithms. Tech. Rep., AFOSR (2010).

  30. S Xu, RC de Lamare, in Proc. Sensor Signal Processing for Defence. Distributed conjugate gradient strategies for distributed estimation over sensor networks (London, UK, 2012).

  31. F Cattivelli, AH Sayed, Diffusion strategies for distributed Kalman filtering and smoothing. IEEE Trans. Autom. Control. 55(9), 2069–2084 (2010).

  32. AH Sayed, Fundamentals of Adaptive Filtering (John Wiley & Sons, Hoboken, NJ, USA, 2003).

  33. T Kailath, AH Sayed, B Hassibi, Linear Estimation (Prentice-Hall, Englewood Cliffs, NJ, USA, 2000).

  34. RC de Lamare, PSR Diniz, Blind adaptive interference suppression based on set-membership constrained constant-modulus algorithms with dynamic bounds. IEEE Trans. Signal Process. 61(5), 1288–1301 (2013).

  35. Y Cai, RC de Lamare, Low-complexity variable step-size mechanism for code-constrained constant modulus stochastic gradient algorithms applied to CDMA interference suppression. IEEE Trans. Signal Process. 57(1), 313–323 (2009).

  36. B Widrow, SD Stearns, Adaptive Signal Processing (Prentice-Hall, Englewood Cliffs, NJ, USA, 1985).

  37. E Eweda, Comparison of RLS, LMS, and sign algorithms for tracking randomly time-varying channels. IEEE Trans. Signal Process. 42, 2937–2944 (1994).

  38. FS Cattivelli, CG Lopes, AH Sayed, Diffusion recursive least-squares for distributed estimation over adaptive networks. IEEE Trans. Signal Process. 56(5), 1865–1877 (2008).

  39. X Zhao, AH Sayed, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing. Single-link diffusion strategies over adaptive networks (2012), pp. 3749–3752.

  40. R Arablouei, S Werner, K Doǧançay, Y-F Huang, Analysis of a reduced-communication diffusion LMS algorithm. Signal Process. 117, 355–361 (2015).

  41. A Bose, Smart transmission grid applications and their supporting infrastructure. IEEE Trans. Smart Grid. 1(1), 11–19 (2010).


Acknowledgements

This research was supported in part by the US National Science Foundation under Grants CCF-1420575, CNS-1456793, and DMS-1118605.

Part of this work has been presented at the 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, Vancouver, Canada and 2013 IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, Saint Martin.

The authors wish to thank the anonymous reviewers, whose comments and suggestions have greatly improved the presentation of these results.

Author information

Corresponding author

Correspondence to Songcen Xu.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article

Cite this article

Xu, S., de Lamare, R.C. & Poor, H.V. Adaptive link selection algorithms for distributed estimation. EURASIP J. Adv. Signal Process. 2015, 86 (2015). https://doi.org/10.1186/s13634-015-0272-4
