Variable forgetting factor mechanisms for diffusion recursive least squares algorithm in sensor networks
EURASIP Journal on Advances in Signal Processing volume 2017, Article number: 57 (2017)
Abstract
In this work, we present low-complexity variable forgetting factor (VFF) techniques for diffusion recursive least squares (DRLS) algorithms. In particular, we propose low-complexity VFF-DRLS algorithms for distributed parameter and spectrum estimation in sensor networks. The proposed algorithms adjust the forgetting factor automatically according to the a posteriori error signal. We analyze the mean and mean square performance of the proposed algorithms and derive mathematical expressions for the mean square deviation (MSD) and the excess mean square error (EMSE). Simulation results show that the proposed low-complexity VFF-DRLS algorithms outperform the existing DRLS algorithm with a fixed forgetting factor when applied to distributed parameter and spectrum estimation. The simulation results also show a good match with the proposed analytical expressions.
Introduction
Distributed estimation is commonly used for distributed data processing over sensor networks, and it offers greater robustness, flexibility, and system efficiency than centralized processing. Owing to these merits, distributed estimation has received increasing attention and has been widely used in applications ranging from environmental monitoring [1], medical data collection for healthcare [2], animal tracking in agriculture [1], monitoring physical phenomena [3], and localizing moving mobile terminals [4, 5] to national security. In particular, distributed estimation relies on the cooperation among geographically spread sensor nodes to process locally collected data. Depending on the cooperation strategy employed, distributed estimation algorithms can be classified into the incremental type and the diffusion type. We consider the diffusion cooperation strategy in this paper since the incremental strategy requires the definition of a path through the network and may not be suitable for large networks or dynamic configurations [6, 7]. Many distributed estimation algorithms with the diffusion strategy have been put forward recently, such as diffusion least-mean squares (LMS) [8, 9], diffusion sparse LMS [10–12], variable step size diffusion LMS (VSS-DLMS) [13, 14], diffusion recursive least squares (RLS) [6, 7], distributed sparse RLS [15], distributed sparse total least squares (LS) [16], diffusion information theoretic learning (ITL) [17], and the diffusion-based algorithm for distributed censored regression [18]. Among these distributed estimation algorithms, the RLS-based algorithms achieve superior performance to the LMS-based ones by inheriting the fast convergence and low steady-state misadjustment of the RLS technique. Thus, distributed estimation algorithms based on the diffusion strategy and the RLS adaptive technique are investigated in this paper.
However, the existing RLS-based distributed estimation algorithms use a fixed forgetting factor, which has some drawbacks. With a fixed forgetting factor, the algorithm fails to keep up with real-time variations in the environment, such as variations in the sensor network topology. Moreover, it is desirable to adjust the forgetting factors automatically according to the estimation errors rather than to choose appropriate values for them through simulations. There have been several studies on variable forgetting factor (VFF) methods. Specifically, the classic gradient-based VFF (GVFF) mechanism was proposed in [19], and most of the existing VFF mechanisms are extensions of this method [20–24]. Nevertheless, the GVFF mechanism requires a large amount of computation. To reduce the computational complexity, improved low-complexity VFF mechanisms have been reported in [25, 26]. To the best of our knowledge, the existing VFF mechanisms are mostly employed in a centralized context and have not yet been considered in the field of distributed estimation.
In this work, the previously reported VFF mechanisms [25, 26] are applied to the diffusion RLS algorithms for distributed signal processing applications, with the inverse relation between the forgetting factor and the adaptation component simplified to provide lower computational complexity. The resulting algorithms are referred to as the low-complexity time-averaged VFF diffusion RLS (LTVFF-DRLS) algorithm and the low-complexity correlated time-averaged VFF diffusion RLS (LCTVFF-DRLS) algorithm, respectively. Compared with the GVFF mechanisms, the proposed LTVFF and LCTVFF mechanisms can reduce the computational complexity significantly [25, 26]. Then, we analyze the proposed algorithms in terms of the mean and mean square error performance. Finally, we provide simulation results to verify the effectiveness of the proposed algorithms when applied to distributed parameter estimation and distributed spectrum estimation.
Our main contributions are summarized as follows:

1) We propose the low-complexity VFF-DRLS algorithms for distributed estimation in sensor networks. To the best of our knowledge, the VFF mechanisms have not yet been considered in distributed estimation algorithms.

2) We study the mean and mean square performance of the proposed algorithms in a general case and provide a transient analysis for a specialized case. Specifically, for the general case, in terms of the mean performance, we show that the mean value of the weight error vector approaches zero as the iteration number goes to infinity, which implies the asymptotic convergence of the proposed algorithms; in terms of the mean square performance, we derive mathematical expressions for the steady-state MSD and EMSE values. In the specialized case, we carry out the transient analysis by focusing on the learning curve and prove that the proposed algorithms converge and that the convergence rate is related to the varying forgetting factors.

3) We perform simulations to evaluate the performance of the proposed algorithms when applied to distributed parameter estimation and distributed spectrum estimation tasks. The simulation results indicate that the proposed algorithms exhibit remarkable improvements in convergence and steady-state performance when compared with the DRLS algorithm that has a fixed forgetting factor. In addition, the analytical expressions for the steady-state MSD and EMSE are verified by the simulation results. We also provide detailed simulation results on the choice of the parameters in the proposed algorithms to help with parameter selection in practice.
This paper is organized as follows. Section 2 provides the system model for distributed estimation over sensor networks and briefly describes the DRLS algorithm with a fixed forgetting factor. In Section 3, two low-complexity VFF mechanisms are presented, followed by the analyses of the variable forgetting factor in terms of steady-state statistical properties. The proposed LTVFF-DRLS algorithm and the LCTVFF-DRLS algorithm are also presented there. In the last part of that section, the computational complexity of the VFF mechanisms as well as the proposed algorithms is analyzed. In Section 4, detailed analyses of the mean and mean-square performance of the proposed algorithms are carried out, and analytical expressions to compute the MSD and EMSE are derived. In addition, a transient analysis for a specialized case is provided in the last part of Section 4. In Section 5, simulation results are presented for distributed parameter estimation and distributed spectrum estimation. Section 6 draws the conclusions.
Notation: Boldface letters are used for vectors or matrices, while normal font is used for scalar quantities. Matrices are denoted by capital letters and small letters are used for vectors. We use the operator row {·} to denote a row vector, col {·} to denote a column vector, and diag {·} to denote a diagonal matrix. The operator E[·] stands for the expectation of some quantity, and Tr {·} represents the trace of a matrix. We use (·)^{T} and (·)^{−1} to denote the transpose and inverse operator, respectively, and (·)^{∗} for complex conjugate transposition. We also use the symbol I _{ n } to represent an identity matrix of size n and $\mathbf {\mathbb {I}}$ to denote a vector of appropriate size with all elements equal to one.
System model and diffusion-based DRLS algorithm
In this section, we first illustrate the system model for the distributed estimation over sensor networks. Following this, we review the conventional DRLS algorithm with the fixed forgetting factor briefly.
System model
Let us consider a sensor network consisting of N sensor nodes which are spatially distributed over a geographical area. The set of nodes connected to node k, including node k itself, is called the neighborhood of node k and is denoted by $\mathcal {N}_{k}$. The number of nodes linked to node k is the degree of node k, denoted by n _{ k }. The system model for the distributed estimation over sensor networks is presented in Fig. 1.
At each time instant i, each node k has access to complex-valued time realizations {d _{ k,i },u _{ k,i }}, k=1,2,…,N, i=1,2,…, with d _{ k,i } a scalar measurement and u _{ k,i } an M×1 input vector. The relation between the measurement d _{ k,i } and the input vector u _{ k,i } can be characterized as
where w ^{o} is the unknown optimal weight vector of size M×1, and v _{ k,i } is zero-mean additive white Gaussian noise with variance ${\sigma }_{v,k}^{2}$. In particular, we assume that the noise variance is known in advance. We also assume that the noise samples v _{ k,i }, k=1,2,…,N, i=1,2,…, are independent of each other as well as of the input vectors u _{ k,i }. We aim to estimate the unknown optimal weight vector w ^{o} in a distributed manner. That is, each sensor node k obtains a local estimate w _{ k,i } of size M×1 that approaches the optimal weight vector w ^{o} as closely as possible. To this end, each node k not only uses its local measurement d _{ k,i } and input vector u _{ k,i } but also cooperates with its closest neighbors to update its local estimate w _{ k,i }. Specifically, through cooperation, each node k has access to its neighbors’ data {d _{ l,i },u _{ l,i }} and estimates w _{ l,i } at each time instant i, where $l\in \mathcal {N}_{k}$, and each node k then fuses all the available information to update its local estimate ψ _{ k,i }.
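As an illustration of the data model, the following minimal sketch generates one measurement per node. It assumes the linear relation d _{ k,i } = u _{ k,i }^{∗} w ^{o} + v _{ k,i } suggested by the description of (1), and it uses real-valued data for brevity, whereas the paper works with complex-valued data. The function name `make_measurement` and all numerical values are hypothetical.

```python
import random

def make_measurement(u, w_o, noise_std, rng):
    """Return d = u^T w_o + v for one node (real-valued simplification)."""
    v = rng.gauss(0.0, noise_std)  # zero-mean Gaussian noise sample
    return sum(ui * wi for ui, wi in zip(u, w_o)) + v

rng = random.Random(0)              # fixed seed for repeatability
M, N = 3, 4                         # assumed vector length and node count
w_o = [0.5, -1.0, 2.0]              # hypothetical unknown weight vector
noise_std = [0.1, 0.2, 0.1, 0.3]    # assumed known per-node noise levels

# One time instant: every node draws a regressor and observes a noisy output.
U = [[rng.gauss(0.0, 1.0) for _ in range(M)] for _ in range(N)]
d = [make_measurement(U[k], w_o, noise_std[k], rng) for k in range(N)]
```

Each node k then holds the pair (d[k], U[k]), which plays the role of {d _{ k,i },u _{ k,i }} at a single time instant.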
Let us first introduce some vectors and matrices. At each time instant i, by collecting all nodes’ measurements into vector y _{ i }, noise samples into vector v _{ i } (both of length N), and input vectors into the matrix H _{ i } of size N×M, we obtain
Following this, we define the covariance matrix of the noise vector v _{ i } as
Next, we stack y _{ i }, v _{ i } and H _{ i } from time instant 0 to time instant i into matrices respectively, which are given by
Besides, we define $\mathbf {\mathcal {R}}_{v,i}=E[\mathbf {\mathcal {V}}_{i}\mathbf {\mathcal {V}}_{i}^{*}]$.
Brief review of diffusion-based DRLS algorithm
In this part, we give a brief introduction to the diffusion-based DRLS algorithm [6, 7].
For the diffusion-based DRLS algorithm, the local optimization problem to estimate the optimal weight vector w ^{o} at each node k can be formulated as follows:
Note that the notation $\|\mathbf {a}\|_{\boldsymbol {\Sigma }}^{2}=\mathbf {a}^{*}\boldsymbol {\Sigma }\mathbf {a}$ denotes the weighted squared norm of a vector a for any positive definite Hermitian matrix Σ. Besides, the matrix Π _{ i } is given by Π _{ i }=λ ^{i+1} Π, where 0≪λ<1 is the forgetting factor and Π=δ ^{−1} I _{ M } with δ>0. Furthermore, the matrix $\boldsymbol {\mathcal {W}}_{k,i}$ can be expressed as $\boldsymbol {\mathcal {W}}_{k,i}=\boldsymbol {\mathcal {R}}_{v,i}^{-1}\boldsymbol {\Lambda }_{i}\text {diag}\{\mathbf {C}_{k},\mathbf {C}_{k},\ldots,\mathbf {C}_{k}\}$, where Λ _{ i }=diag{I _{ N },λ I _{ N },…,λ ^{i} I _{ N }} and C _{ k } is a diagonal matrix whose main diagonal is given by the kth column of the matrix C. The matrix C is the adaptation matrix of the diffusion-based DRLS algorithm and satisfies $\mathbf {\mathbb {I}}^{T}\mathbf {C}=\mathbf {\mathbb {I}}^{T}$ and $\mathbf {C}\mathbf {\mathbb {I}}=\mathbf {\mathbb {I}}$ [6]. In other words, C is a doubly stochastic matrix, that is, both a left stochastic matrix and a right stochastic matrix.
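The doubly stochastic condition on C can be checked numerically. The sketch below builds C with the Metropolis rule, which is one common way to obtain a doubly stochastic matrix from a symmetric topology; the paper does not state which rule it uses, so this choice, the function names, and the 4-node line topology are assumptions made for illustration.

```python
def metropolis_weights(neighbors):
    """Build C from neighbors[k], the set of nodes linked to k (k included).

    Assumes a symmetric topology: l in neighbors[k] iff k in neighbors[l].
    """
    n = len(neighbors)
    deg = [len(nb) for nb in neighbors]      # degree n_k, counting node k itself
    C = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for l in neighbors[k]:
            if l != k:
                C[l][k] = 1.0 / max(deg[k], deg[l])
        # The diagonal entry absorbs the remaining weight of column k.
        C[k][k] = 1.0 - sum(C[l][k] for l in range(n) if l != k)
    return C

def is_doubly_stochastic(C, tol=1e-12):
    """Check that all entries are nonnegative and all rows and columns sum to one."""
    n = len(C)
    rows_ok = all(abs(sum(C[l][k] for k in range(n)) - 1.0) <= tol for l in range(n))
    cols_ok = all(abs(sum(C[l][k] for l in range(n)) - 1.0) <= tol for k in range(n))
    nonneg = all(C[l][k] >= 0.0 for l in range(n) for k in range(n))
    return rows_ok and cols_ok and nonneg

# Hypothetical 4-node line topology: 0 - 1 - 2 - 3.
neighbors = [{0, 1}, {0, 1, 2}, {1, 2, 3}, {2, 3}]
C = metropolis_weights(neighbors)
```

Entries C[l][k] are zero whenever node l is not a neighbor of node k, so each node only weights data it can actually receive.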
The optimization problem (5) can be rewritten as follows [6]:
where C _{ l,k } represents the (l,k)th element of the matrix C. The closed-form solution to (6) is given by [6, 7]
where P _{ k,i } can be expressed as
However, the closed-form solution in (7) requires matrix inversion, which is computationally expensive. Instead, the diffusion-based DRLS algorithm provides a recursive approach to solve (6), which can be implemented by the following two steps.
Step 1: Let us take the updates at time instant i for example. Note that we denote the iteration number at time instant i as the superscript (·)^{l} with l=0 representing the initial value. At the very start, we initialize the intermediate local estimate ψ _{ k,i } and the inverse matrix P _{ k,i } for each node k by utilizing the updated results from time instant i−1, that is
Then, for each node k, its data is updated incrementally among its neighbors, which is given by
where the left arrow denotes the operation of assignment. Finally, each node k obtains its ultimate intermediate local estimate ψ _{ k,i } which can be expressed as
Step 2: Each node k combines the ultimate intermediate local estimate of its own, i.e., ψ _{ k,i }, obtained in step 1 with that of its neighbors, i.e., ψ _{ l,i }, $l\in \mathcal {N}_{k}$ by performing the following diffusion to obtain the local estimate w _{ k,i }:
where A _{ l,k } denotes the (l,k)th element of the matrix A. The matrix A is the combination matrix of the diffusion-based DRLS algorithm and is chosen such that $\mathbf {\mathbb {I}}^{T}\mathbf {A}=\mathbf {\mathbb {I}}^{T}$ [6].
Note that the steps (9)–(13) constitute the diffusion-based DRLS algorithm [6, 7].
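The two steps above can be sketched in code. Because Eqs. (9)–(13) are not reproduced in this text, the update formulas below follow the standard diffusion RLS recursion of [6, 7] as it is commonly stated (weighted RLS updates over the neighborhood, then a weighted combination) and should be read as an assumption rather than as the paper's exact equations. The sketch is real-valued, and `drls_step`, the fully connected 3-node network, and all numerical values are hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def drls_step(w_prev, P_prev, U, d, C, A, noise_var, lam, neighbors):
    """One time instant of diffusion RLS for all nodes (real-valued sketch)."""
    N, M = len(w_prev), len(w_prev[0])
    psi, P_new = [], []
    for k in range(N):
        # Step 1 initialisation: psi <- w_{k,i-1}, P <- P_{k,i-1} / lambda.
        p = list(w_prev[k])
        P = [[x / lam for x in row] for row in P_prev[k]]
        # Incremental pass over the data of the neighbours of node k.
        for l in neighbors[k]:
            Pu = [dot(row, U[l]) for row in P]          # P u_l (P is symmetric)
            g = C[l][k] / (noise_var[l] + C[l][k] * dot(U[l], Pu))
            err = d[l] - dot(U[l], p)                   # error before the update
            p = [pj + g * Puj * err for pj, Puj in zip(p, Pu)]
            P = [[P[a][b] - g * Pu[a] * Pu[b] for b in range(M)] for a in range(M)]
        psi.append(p)
        P_new.append(P)
    # Step 2: diffusion, w_k = sum over neighbours l of A[l][k] * psi_l.
    w_new = [[sum(A[l][k] * psi[l][j] for l in neighbors[k]) for j in range(M)]
             for k in range(N)]
    return w_new, P_new

rng = random.Random(1)
N, M = 3, 2
w_o = [1.0, -2.0]                                   # hypothetical target vector
neighbors = [[0, 1, 2] for _ in range(N)]           # fully connected network
C = [[1.0 / N] * N for _ in range(N)]
A = [[1.0 / N] * N for _ in range(N)]
noise_var = [0.01] * N
w = [[0.0] * M for _ in range(N)]
P = [[[100.0 if a == b else 0.0 for b in range(M)] for a in range(M)]
     for _ in range(N)]
for i in range(200):
    U = [[rng.gauss(0.0, 1.0) for _ in range(M)] for _ in range(N)]
    d = [dot(U[k], w_o) for k in range(N)]          # noiseless data, simple check
    w, P = drls_step(w, P, U, d, C, A, noise_var, 0.98, neighbors)
```

With noiseless data, every local estimate w[k] should end up close to w_o, which gives a quick sanity check of the recursion. The forgetting factor is passed in as a plain argument so that a variable forgetting factor could be supplied per iteration instead of the fixed value 0.98 used here.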
Low-complexity variable forgetting factor mechanisms
In this section, we introduce the LTVFF mechanism and the LCTVFF mechanism that are employed by our proposed algorithms. In particular, the analyses of the variable forgetting factor in terms of the steady-state properties of the first-order statistics are presented, and the LTVFF-DRLS algorithm that employs the LTVFF mechanism as well as the LCTVFF-DRLS algorithm that applies the LCTVFF mechanism are proposed. In the last part of this section, we analyze the computational complexity of these two VFF mechanisms as well as the proposed algorithms.
LTVFF mechanism
Motivated by the VSS mechanism [13, 14] for the diffusion LMS algorithm, the low-complexity VFF mechanisms are designed such that smaller forgetting factors are employed when the estimation errors are large in order to obtain a faster convergence speed, whereas the forgetting factor increases when the estimation errors become small so as to yield better steady-state performance. Based on the above idea, an effective rule to adapt the forgetting factor can be formulated as
where the quantity ζ _{ k }(i), referred to as the adaptation component, is related to the estimation errors and varies inversely with the forgetting factor. The operator $[\cdot ]_{\lambda _{-}}^{\lambda _{+}}$ denotes the truncation of the forgetting factor to the limits of the range [λ _{−},λ _{+}].
For the LTVFF mechanism, the adaptation component is given by
with parameters 0<α<1 and β>0. Besides, α is chosen close to 1 and β is set to a small value. The quantity e _{ k }(i) denotes the a priori estimation error [19] of each node for the DRLS algorithm, which can be expressed as
That is to say, in the LTVFF mechanism, the adaptation component is updated based on the instantaneous estimation error.
The LTVFF mechanism is given by (14) and (15). The value of the forgetting factor λ _{ k }(i) is controlled by the parameters α and β. The effects of α and β on the performance of our proposed algorithms are investigated in Section 5. As can be seen from (14) and (15), large estimation errors cause an increase in the adaptation component ζ _{ k }(i), which yields a smaller forgetting factor and provides a faster tracking speed. Conversely, small estimation errors lead to a decrease of the adaptation component ζ _{ k }(i), and thus, the forgetting factor λ _{ k }(i) increases to yield a smaller steady-state misadjustment.
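The behavior described above can be sketched as follows. Since Eqs. (14) and (15) are not reproduced in this text, two assumptions are made and labeled in the code: the adaptation component follows the first-order rule ζ _{ k }(i) = αζ _{ k }(i−1) + β|e _{ k }(i)|^{2}, and the forgetting factor is obtained from the decreasing map 1 − ζ followed by truncation. Any other decreasing map, such as 1/(1+ζ), could be passed in instead. The function name and all parameter values are hypothetical.

```python
def clip(x, lo, hi):
    """Truncate x to the interval [lo, hi]."""
    return max(lo, min(hi, x))

def ltvff_update(zeta_prev, err, alpha=0.99, beta=0.002,
                 lam_min=0.9, lam_max=0.99999,
                 zeta_to_lambda=lambda z: 1.0 - z):
    """One LTVFF step for a single node.

    ASSUMPTION: zeta(i) = alpha * zeta(i-1) + beta * |e(i)|^2.
    ASSUMPTION: lambda(i) = truncate(zeta_to_lambda(zeta(i))) with a
    decreasing map; the default 1 - zeta is illustrative only.
    """
    zeta = alpha * zeta_prev + beta * abs(err) ** 2
    lam = clip(zeta_to_lambda(zeta), lam_min, lam_max)
    return zeta, lam

# A large error pushes lambda down to its lower limit, a small one keeps it high.
zeta_big, lam_big = ltvff_update(0.0, 10.0)
zeta_small, lam_small = ltvff_update(0.0, 0.1)
```

The update costs a handful of scalar operations per node and per iteration, independent of the vector length M.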
Next, we study the steady-state statistical properties of the adaptation component ζ _{ k }(i) and the forgetting factor λ _{ k }(i). Based on (15), it is reasonable to assume that ζ _{ k }(i) and ζ _{ k }(i−1) are approximately equal when i→∞. By taking expectations on both sides of (15) and letting i go to infinity, we can obtain E[ζ _{ k }(∞)]
Then, we compute the quantity E[|e _{ k }(∞)|^{2}]. Let us define the weight error vector for node k as
According to (16) and (18), we can rewrite E[|e _{ k }(i)|^{2}] as
where the term $E\left [\left |\mathbf {u}_{k,i}^{T}\widetilde {\mathbf {w}}_{k,i-1}\right |^{2}\right ]$ denotes the excess error. Since it is sufficiently small compared with the noise variance when i→∞, it can be neglected. As a consequence, the following approximation holds
where ε _{min} denotes the minimum mean-square error and can be expressed as
Subsequently, by substituting (20) into (17), we can approximately write
According to (14), we can deduce
By substituting (22) into (23), we can obtain the first-order statistics of the forgetting factor for the LTVFF mechanism:
By applying the LTVFF mechanism to the diffusion-based DRLS algorithm, we propose the LTVFF-DRLS algorithm, which is shown in the left column of Table 1.
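The steady-state argument of this subsection can be summarized in one line. The sketch below assumes that the adaptation rule (15) has the first-order form $\zeta _{k}(i)=\alpha \zeta _{k}(i-1)+\beta |e_{k}(i)|^{2}$, which matches the verbal description but is not reproduced from the source, and that the minimum mean-square error equals the noise variance ${\sigma }_{v,k}^{2}$. Setting $E[\zeta _{k}(i)]\approx E[\zeta _{k}(i-1)]$ as i→∞ gives

$$E[\zeta _{k}(\infty )]=\alpha E[\zeta _{k}(\infty )]+\beta E\left [|e_{k}(\infty )|^{2}\right ]\quad \Rightarrow \quad E[\zeta _{k}(\infty )]=\frac {\beta E\left [|e_{k}(\infty )|^{2}\right ]}{1-\alpha }\approx \frac {\beta \sigma _{v,k}^{2}}{1-\alpha }.$$

Under these assumptions, the steady-state adaptation component scales with the ratio β/(1−α) and with the noise variance, and the steady-state forgetting factor then follows by applying the map in (14) to this value.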
LCTVFF mechanism
For the LCTVFF mechanism, the forgetting factor is calculated through (14), while the adaptation component ζ _{ k }(i) is adjusted according to an alternative rule, in which the time-averaged estimate of the correlation of two consecutive estimation errors is used in the updating equation of the adaptation component ζ _{ k }(i). Therefore, the rule to update the adaptation component can be described as
where 0<α<1 and β>0. In particular, α is set close to 1 and β is chosen to be slightly larger than 0. The quantity ρ _{ k }(i) denotes the time-averaged estimate of the correlation of two consecutive estimation errors, which is defined by
where 0<γ<1 and γ is slightly smaller than 1. Note that the LCTVFF mechanism is given by (14), (25), and (26).
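A minimal sketch of the LCTVFF update is given below. Eqs. (14), (25), and (26) are not reproduced in this text, so the forms used here are assumptions inferred from the surrounding description: ρ _{ k }(i) = γρ _{ k }(i−1) + (1−γ)|e _{ k }(i−1)||e _{ k }(i)|, ζ _{ k }(i) = αζ _{ k }(i−1) + βρ _{ k }(i)^{2}, and a truncated decreasing map from ζ to λ (here 1 − ζ, chosen for illustration). The function name and parameter values are hypothetical.

```python
def clip(x, lo, hi):
    """Truncate x to the interval [lo, hi]."""
    return max(lo, min(hi, x))

def lctvff_update(zeta_prev, rho_prev, err_prev, err,
                  alpha, beta, gamma, lam_min, lam_max):
    """One LCTVFF step for a single node.

    ASSUMPTION: rho(i)  = gamma * rho(i-1) + (1 - gamma) * |e(i-1)| * |e(i)|.
    ASSUMPTION: zeta(i) = alpha * zeta(i-1) + beta * rho(i)^2.
    ASSUMPTION: lambda(i) = truncate(1 - zeta(i)); illustrative map only.
    """
    rho = gamma * rho_prev + (1.0 - gamma) * abs(err_prev) * abs(err)
    zeta = alpha * zeta_prev + beta * rho ** 2
    lam = clip(1.0 - zeta, lam_min, lam_max)
    return zeta, rho, lam

# Deterministic check with a constant error magnitude of 1:
# rho settles at 1 and zeta at beta / (1 - alpha) under the assumed rules.
zeta, rho, lam = 0.0, 0.0, 1.0
for _ in range(2000):
    zeta, rho, lam = lctvff_update(zeta, rho, 1.0, 1.0,
                                   alpha=0.9, beta=0.01, gamma=0.9,
                                   lam_min=0.5, lam_max=0.999)
```

Compared with the LTVFF sketch, the only extra state per node is the scalar ρ and the previous error, so the cost per iteration remains a fixed, small number of scalar operations.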
Next, we consider the steady-state statistical properties of ρ _{ k }(i), ζ _{ k }(i), and λ _{ k }(i) for the LCTVFF mechanism. As we will see in the simulation results, the proposed algorithm reaches the steady state after a sufficient number of iterations, and thus, the values of e _{ k }(i−1) and e _{ k }(i), as well as those of ρ _{ k }(i−1) and ρ _{ k }(i), can be assumed to be approximately equal when i is large enough. Thus, we can obtain E[|e _{ k }(i−1)||e _{ k }(i)|]≈E[|e _{ k }(i)|^{2}] and ρ _{ k }(i−1)≈ρ _{ k }(i) when i→∞. Then, by taking expectations on both sides of (26) and letting i go to infinity, we can obtain the first-order statistical properties of ρ _{ k }(i):
To study the second-order statistical properties of ρ _{ k }(i), we consider the square of (26), which is given by
Recall that e _{ k }(i−1) and e _{ k }(i) can be considered equivalent when i→∞, and thus, we can rewrite (28) as
Since (1−γ)^{2}|e _{ k }(i)|^{4} is sufficiently small when compared with the other terms in (29), it can be neglected. Therefore, we can obtain
According to (16) and (26), the quantities ρ _{ k }(i−1) and |e _{ k }(i)|^{2} can be considered uncorrelated at steady state, that is, E[ρ _{ k }(i−1)|e _{ k }(i)|^{2}]≈E[ρ _{ k }(i−1)]E[|e _{ k }(i)|^{2}]. The detailed derivation is presented in Appendix A: Proof of the uncorrelation of ρ _{ k }(i−1) and |e _{ k }(i)|^{2} in the steady state. Then, by taking expectations on both sides of (30), we can obtain the following result:
Substituting (20) and (27) into (31) results in
To calculate the first-order statistics of the adaptation component ζ _{ k }(i), we take expectations on both sides of (25) and let i go to infinity. As a result, we obtain
Substituting (32) into (33) leads to
Consequently, we have the first-order steady-state statistics of the forgetting factor for the LCTVFF mechanism as follows:
By applying the LCTVFF mechanism to the diffusion-based DRLS algorithm, we propose the LCTVFF-DRLS algorithm, which is presented in the right column of Table 1.
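The chain of steady-state approximations for the LCTVFF mechanism can be sketched as follows. The derivation assumes the update forms $\rho _{k}(i)=\gamma \rho _{k}(i-1)+(1-\gamma )|e_{k}(i-1)||e_{k}(i)|$ and $\zeta _{k}(i)=\alpha \zeta _{k}(i-1)+\beta \rho _{k}^{2}(i)$, which are inferred from the text and not reproduced from the source, together with $E[|e_{k}(\infty )|^{2}]\approx \sigma _{v,k}^{2}$. Squaring the recursion for ρ _{ k }(i), dropping the $(1-\gamma )^{2}|e_{k}(i)|^{4}$ term, and using the uncorrelatedness of ρ _{ k }(i−1) and |e _{ k }(i)|^{2} gives

$$E[\rho _{k}(\infty )]\approx \sigma _{v,k}^{2},\qquad (1-\gamma ^{2})E[\rho _{k}^{2}(\infty )]\approx 2\gamma (1-\gamma )E[\rho _{k}(\infty )]E\left [|e_{k}(\infty )|^{2}\right ]\quad \Rightarrow \quad E[\rho _{k}^{2}(\infty )]\approx \frac {2\gamma }{1+\gamma }\sigma _{v,k}^{4},$$

$$E[\zeta _{k}(\infty )]=\frac {\beta E[\rho _{k}^{2}(\infty )]}{1-\alpha }\approx \frac {2\beta \gamma \sigma _{v,k}^{4}}{(1-\alpha )(1+\gamma )}.$$

Under these assumptions, the steady-state adaptation component of the LCTVFF mechanism depends on the fourth power of the noise standard deviation, in contrast to the second power obtained for the LTVFF mechanism.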
Computational complexity
In this part, we study the computational complexity of the proposed LTVFF and LCTVFF mechanisms in comparison with the GVFF mechanism. We evaluate the number of arithmetic operations in terms of complex additions and multiplications for each node at each iteration. The results are shown in Tables 2 and 3. From Table 3, the additional computational complexity of the proposed LTVFF and LCTVFF mechanisms amounts to a small fixed number of operations for each node at each iteration. For the GVFF mechanism, however, the additional computational complexity for each node at each iteration increases with the size of the sensor network. The results in Table 3 show that the proposed LTVFF and LCTVFF mechanisms greatly reduce the computational cost compared with the GVFF mechanism.
Performance analysis
In this section, we carry out the analyses in terms of mean and mean square error performance for the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms. In particular, we derive mathematical expressions to describe the steady-state behavior based on MSD and EMSE. In addition, we perform a transient analysis in a specialized case for the proposed algorithms in the last part of this section. To proceed with the analysis, we first introduce several assumptions, which have been widely adopted in the analysis of RLS-type algorithms and have been verified by simulations [7, 27].
Assumption 1
To facilitate analytical studies, we assume that all the input vectors u _{ k,i },∀k,i are independent of each other and the correlation matrix of the input vector u _{ k,i } is invariant over time, which is defined as
Assumption 2
For the proposed LTVFF and LCTVFF mechanisms, we assume that there exists a positive number N _{ i } such that, for i>N _{ i }, the forgetting factor λ _{ k }(i) varies slowly around its mean value, that is
For the RLS-type algorithms with a fixed forgetting factor, the ergodicity assumption is adopted for P _{ k,i } [6, 7, 27], that is, the time average of a sequence of random variables can be replaced by its expected value so as to make the performance analysis of these algorithms tractable. Similarly, for the RLS-type algorithms with variable forgetting factors, we still adopt the ergodicity assumption:
Assumption 3
We assume that there exists a number N _{ i }>0 such that, for i>N _{ i }, we can replace $\mathbf {P}_{k,i}^{-1}$ by its expected value $E\left [\mathbf {P}_{k,i}^{-1}\right ]$, which can be represented as
where $\lim \limits _{i\to \infty }E\left [\mathbf {P}_{k,i}^{-1}\right ]$ can be calculated through
The derivation is presented in Appendix B. Since $\lim \limits _{i\to \infty }E\left [\mathbf {P}_{k,i}^{-1}\right ]$ is independent of i, we can denote it by $\mathbf {P}_{k}^{-1}$. Moreover, based on the ergodicity assumption, it is also common in the performance analysis of RLS-type algorithms to replace the random matrix P _{ k,i } by P _{ k } when i is large enough.
Mean performance
In light of (1) and (13), the following relation holds [7] after the incremental update of ψ _{ l,i } is complete:
By substituting (1) and (18) into (40), we obtain the following equation:
Next, let us define the intermediate weight error vector $\widetilde {\boldsymbol {\psi }}_{k,i}$ for node k as
Substituting (42) into (41) results in the following result:
Then, we construct $\widetilde {\mathbf {w}}_{k,i}$ from $\widetilde {\boldsymbol {\psi }}_{l,i}$ based on (13) and obtain
Note that P _{ k,i } can be replaced by P _{ k } when i is large enough (cf. Assumption 3), and thus, it is reasonable to assume that P _{ k,i } converges as i→∞. Therefore, we can approximately have
Besides, in view of Assumption 3 and Eq. (39), we can obtain
By combining (45) and (46), we have the following approximation:
Then, substituting (47) into (44) yields the following result when i is sufficiently large:
Following this, two global matrices $\widetilde {\mathbf {W}}_{i}$ and $\boldsymbol {\mathcal {P}}$ are built in the following form in order to collect the weight error vectors $\widetilde {\mathbf {w}}_{k,i},k=1,\cdots,N$ and matrices P _{ k },k=1,⋯,N, respectively:
In addition, we introduce a global diagonal matrix D(i) to collect the forgetting factors of all nodes at time instant i, which is given by
Using the vectors in (2), the term $\sum \limits _{m=1}^{N}\frac {C_{m,l}}{\sigma _{v,m}^{2}}\mathbf {u}_{m,i}v_{m,i}$ in (44) can be rewritten as $\mathbf {H}_{i}^{*}\mathbf {C}_{l}\mathbf {R}_{v}^{-1}\mathbf {v}_{i}$. By collecting the vectors $\mathbf {H}_{i}^{*}\mathbf {C}_{l}\mathbf {R}_{v}^{-1}\mathbf {v}_{i}$, l=1,2,…,N, into a block diagonal matrix G _{ i }, we obtain
To separate the noise vectors, we can rewrite (51) as
where ⊗ denotes the Kronecker product of two matrices [28]. Subsequently, we express (48) in a more compact way, which leads to the following updating equation for the global matrix $\widetilde {\mathbf {W}}_{i}$:
In order to simplify the notation Λ _{ i } A, we denote it as F(i), and thus, we can rewrite (53) as
In order to facilitate the analysis, we assume that $\widetilde {\mathbf {W}}_{i-1}$ and F(i) can be considered uncorrelated, that is, $E[\widetilde {\mathbf {W}}_{i-1}\mathbf {F}(i)]\approx E[\widetilde {\mathbf {W}}_{i-1}]E[\mathbf {F}(i)]$. As we will see in the simulation results, this assumption works well for the theoretical analysis, whose predictions closely match the numerical results. By taking expectations on both sides of (54), we obtain the following result:
Recalling (52), since the noise samples v _{ i } have zero mean, E[G _{ i }] equals zero; therefore, we can obtain
Following this, we assume that there exists a number N _{ i }>0 and iterate (56) starting from the time instant i to N _{ i }, as a result, we obtain
Recalling that F(i)=Λ _{ i } A, with Λ _{ i } a diagonal matrix, we have the following relation for each element in F(i):
where the subscript m,n represents the (m,n)th element in the matrix. Given that the elements of A are all between 0 and 1 and each element in the diagonal matrix Λ _{ i } does not exceed the upper bound λ _{+}, which is smaller than unity, we have
Each element in the product $\prod \limits _{j=N_{i}+1}^{i}E[\mathbf {F}(j)]$ can be viewed as a polynomial of F _{1,1}(i),F _{1,2}(i),⋯, with an order of i−N _{ i }+1. When i→∞, each element of this product approaches zero since F _{1,1}(i),F _{1,2}(i),⋯ are all smaller than unity. Assuming that all the elements of $E[\widetilde {\mathbf {W}}_{N_{i}}]$ are bounded in absolute value by some finite constant, all the elements of $E[\widetilde {\mathbf {W}}_{i}]$ converge to zero when i→∞. As a result, we can conclude that the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms converge asymptotically when i→∞.
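The decay of the product can also be checked numerically. The sketch below forms F = ΛA for a hypothetical 3-node network, with Λ a diagonal matrix of per-node forgetting factors not exceeding λ _{+} and A a matrix whose columns sum to one. In that case every column of ΛA sums to at most λ _{+}, and because the maximum absolute column sum is a submultiplicative matrix norm, the product of n such factors has maximum column sum at most λ _{+}^{n}. The matrix A, the alternating forgetting factors, and the helper names are assumptions made for illustration.

```python
def matmul(X, Y):
    """Plain matrix product of two lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def max_col_sum(X):
    """Maximum absolute column sum (induced 1-norm) of a matrix."""
    return max(sum(abs(row[j]) for row in X) for j in range(len(X[0])))

# Hypothetical combination matrix with columns summing to one.
A = [[0.5, 0.25, 0.0],
     [0.5, 0.50, 0.5],
     [0.0, 0.25, 0.5]]
lam_plus = 0.95           # assumed upper limit of the forgetting factor
n_steps = 200

prod = [[1.0 if a == b else 0.0 for b in range(3)] for a in range(3)]
for j in range(n_steps):
    # Per-node forgetting factors alternate between 0.90 and 0.95.
    lam = [0.95 if (j + k) % 2 else 0.90 for k in range(3)]
    F = [[lam[m] * A[m][n] for n in range(3)] for m in range(3)]
    prod = matmul(prod, F)    # right-multiplication, as in the recursion

bound = lam_plus ** n_steps   # norm bound on the accumulated product
```

Every entry of `prod` is bounded by `max_col_sum(prod)`, so all entries shrink geometrically as the number of factors grows.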
Mean-square error and deviation performances
In this part, we perform analyses for the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms based on mean square performance and derive expressions for the steady-state MSD and EMSE, which are defined as
We start with (54) and then operate recursively from time instant N _{ i }, which yields
Then, the kth column of $\widetilde {\mathbf {W}}_{i}$ is given by
where e _{ k } is a column vector of length N with unity for the kth element and zero for the others. Next, we write the squared Euclidean norm of the weight error vector $\widetilde {\mathbf {w}}_{k,i}$, that is, $\|\widetilde {\mathbf {w}}_{k,i}\|^{2}$, or equivalently, $\text {Tr}\{\widetilde {\mathbf {w}}_{k,i}\widetilde {\mathbf {w}}_{k,i}^{*}\}$.
Since the elements of F(i) all lie between zero and one, $\prod \limits _{j=N_{i}+1}^{i}\mathbf {F}(j)$ vanishes when i→∞, which leads to the expectation of the first term becoming zero. Moreover, since the cross terms contain the zero-mean vectors v _{ i }, their expectations also become zero. As a result, we have the following expression:
which can be rewritten as
For simplicity, we have the following notation:
where J ^{t,l}(i) is a matrix of size N×N. By combining (52), (64), and (65), let us first compute $(\mathbf {I}_{N}{\otimes }\mathbf {v}_{t})\mathbf {J}^{t,l}(i)(\mathbf {I}_{N}{\otimes }\mathbf {v}_{l}^{*})$. According to the properties of the Kronecker product, we have the following equality:
Therefore, $(\mathbf {I}_{N}{\otimes }\mathbf {v}_{t})\mathbf {J}^{t,l}(i)\left (\mathbf {I}_{N}{\otimes }\mathbf {v}_{l}^{*}\right)$ can be expressed as
Note that, in light of (65), the matrix J ^{t}(i) and the covariance matrix of noise R _{ v } can be considered uncorrelated. Then, by taking expectations on both sides of (67), we have the following results:
where we drop the index and denote J ^{t,t}(i) as J ^{t}(i). By substituting (68) into (64), we can obtain
Note that P _{ k }, k=1,2,…,N is Hermitian; therefore, we have the following expression:
where $\mathbf {G}_{t}\mathbf {J}^{t}(i)\mathbf {G}_{t}^{*}$ can be represented as a block matrix K ^{t}(i), which can be decomposed into N×N blocks of size M×M each. The (m,l)th block is given by
By taking expectations on both sides of (71), we obtain the following equality:
Substituting (65) and (72) into (70) yields the following result:
In view of Assumption 2, there exists a number N _{ i }>0 such that, for i>N _{ i }, F(i) satisfies
Therefore, we replace E[F(i)] with E[F(∞)] when i>N _{ i } and then reformulate (73) as
Subsequently, we replace i−t with t in (75) and then let i go to infinity. As a result, we can obtain the expression of the steady-state MSD for node k:
Next, we calculate the steady-state EMSE for node k. According to (60), the EMSE for node k can be expressed as follows
Note that u _{ k,i } is independent of $\widetilde {\mathbf {w}}_{k,i-1}$. By substituting (76) into (77), we can obtain the expression of the steady-state EMSE for node k:
Expressions (76) and (78) describe the steady-state behavior of the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms. By comparing the expressions (76) and (78) with the analytical results in [7], it is clear that the fixed matrix λ ^{2} A in the expressions for the conventional DRLS algorithms has been replaced by the matrix F(i), which is weighted by the matrix Λ _{ i }. Since Λ _{ i } varies from one iteration to the next, F(i) varies as well, which improves the tracking performance of the resulting algorithms. Furthermore, since all the elements of F(i) lie between zero and unity, the steady-state MSD and EMSE given by (76) and (78) are both small when i is large enough. Thus, we can verify that the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms both converge in the mean-square sense.
Transient analysis under spatial invariance assumption
In this subsection, we consider a specialized case in which the noise variances and input vector covariance matrices are the same for all the sensor nodes, and we provide a transient analysis for this specific case. In particular, we assume spatial invariance:
In addition, to facilitate analysis, we assume that all elements of the adaptation matrix C are equal to $\frac {1}{N}$.
We study the transient behavior through the learning curve, which is obtained by plotting the squared a priori estimation error, i.e., $E\left [\left |\mathbf {u}_{k,i}^{*}(\mathbf {w}_{k,i}-\mathbf {w}^{o})\right |^{2}\right ]$ [29, 30], as a function of the iteration number i. We first rewrite this squared a priori estimation error in a more compact form:
where we use the representation $\|\mathbf {t}\|_{\mathbf {A}}^{2}=\mathbf {t}^{*}\mathbf {A}\mathbf {t}$ in the last equality.
Then, we use the spatial invariance assumption to simplify (39) and (48). In particular, by taking advantage of the assumption that the input vector covariance matrix is the same over all sensor nodes, we can derive the following expression from (39), when i is large enough:
By substituting (82) into (48), we can arrive at
where we use the column vector s _{ i } to denote the quantity $\mathbf {R}_{u}^{-1}\mathbf {H}_{i}^{*}\mathbf {v}_{i}$ in the third equality, and we use the property of the combination matrix, i.e., $\sum _{l=1}^{N}A_{l,k}=1, \forall k\in \{1, 2, \cdots, N\}$, to arrive at the fourth equality. Let us define
Note that Λ _{ i }=diag{λ _{ i }}. Then, we can write the recursive equation of type (83) for all sensor nodes in a more compact form as follows:
where $\mathbf {f}(i)=\mathbf {\mathbb {I}}-\mathbf {A}^{T}\boldsymbol {\lambda }_{i}$ in the second equality. Then, we have the following global squared a priori estimation error for all sensor nodes by using the last equality in (85):
where Σ=Λ _{ i } A R _{ u } A ^{T} Λ _{ i }, and we use the property of the Kronecker product, i.e., (66), in the fourth and fifth equalities, and the fact that both quantities of f(i)^{T} R _{ u } f(i) and $\mathbf {s}_{i}^{*}\mathbf {s}_{i}$ are scalar and they are independent to arrive at the last equality. Particularly, $E[\mathbf {s}_{i}^{*}\mathbf {s}_{i}]$ can be rewritten as
where we use the spatial invariance assumption, i.e., $\mathbf {v}_{i}\mathbf {v}_{i}^{*}=\text {diag}\{\sigma _{v}^{2}, \sigma _{v}^{2}, \cdots, \sigma _{v}^{2}\}=\sigma _{v}^{2}\mathbf {I}_{N}$ and $\mathbf {H}_{i}^{*}\mathbf {H}_{i}=\sum _{m=1}^{N} \mathbf {u}_{m,i}\mathbf {u}_{m,i}^{*}=\sum _{m=1}^{N}\mathbf {R}_{u_{m}}=N\mathbf {R}_{u}$, to arrive at the third and fourth equalities, respectively, and the symmetry of the input vector covariance matrix in the last equality. By plugging (87) back into (86), we have
For convenience, we use the notation $\|\mathbf {t}\|^{2}_{\text {vec}\{\mathbf {A}\}}$ to denote the weighted norm $\|\mathbf {t}\|^{2}_{\mathbf {A}}$, where the symbol vec{A} represents the vectorization of a matrix. In particular, by using the equality vec{A B C}=(C ^{T}⊗A)vec{B}, we can vectorize the matrix Σ=Λ _{ i } A R _{ u } A ^{T} Λ _{ i } as follows
where F(i)=Λ _{ i } A, $\boldsymbol {\mathcal {F}}_{i}=\mathbf {F}(i)\otimes \mathbf {F}(i)$ and γ=vec{R _{ u }}. Ultimately, we have
This recursive equation is stable and convergent if $E[\boldsymbol {\mathcal {F}}_{i}]$ is stable [31].
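The vectorization step used above can be checked numerically. The following is a minimal sketch, assuming real-valued matrices, column-stacking for vec{·}, and an N×N symmetric matrix `R` standing in for R _{ u }; the sizes and random values are illustrative only and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4                                     # illustrative size

# Combination matrix with columns summing to one (assumed nonnegative).
A = rng.random((N, N))
A /= A.sum(axis=0, keepdims=True)

# Diagonal matrix of forgetting factors (illustrative values).
Lam = np.diag(rng.uniform(0.9, 0.99, size=N))

# Symmetric positive semidefinite stand-in for R_u.
X = rng.standard_normal((N, N))
R = X @ X.T

def vec(M):
    """Stack the columns of M into one vector (column-major order)."""
    return M.flatten(order="F")

F = Lam @ A                               # F(i) = Lambda_i A
Sigma = Lam @ A @ R @ A.T @ Lam           # Sigma = F R F^T

lhs = vec(Sigma)
rhs = np.kron(F, F) @ vec(R)              # (F kron F) vec{R}
print(np.max(np.abs(lhs - rhs)))
```

The identity vec{A B C}=(C ^{T}⊗A)vec{B} is applied here with the outer factors F(i) and F(i)^{T}, which is why the Kronecker product F(i)⊗F(i) appears.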
In particular, the quantity $\boldsymbol {\mathcal {F}}_{i}$ has a spectral radius smaller than unity and is thus stable. This can be shown as follows. If we replace each element in Λ _{ i } by its upper bound λ _{+}, then $\boldsymbol {\mathcal {F}}_{i}$ is replaced by $\lambda _{+}^{2}\mathbf {A}\otimes \mathbf {A}$. Note that A satisfies $\mathbf {\mathbb {I}}^{T}\mathbf {A}=\mathbf {\mathbb {I}}^{T}$, so each column of A⊗A sums to unity. Hence, the quantity $\lambda _{+}^{2}\mathbf {A}\otimes \mathbf {A}$ has spectral radius $\lambda _{+}^{2}$, which is smaller than one. Given that each element in Λ _{ i } does not exceed λ _{+}, the spectral radius of $\boldsymbol {\mathcal {F}}_{i}$ is no larger than $\lambda _{+}^{2}$ and is therefore smaller than unity. Hence, for this special case, the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms are convergent in terms of the learning curve, and the convergence rate is related to the varying forgetting factors.
Also note that, since the convergence performance of the adaptive algorithms does not depend on the outside environment but relies on the network topology and the design of the algorithms, the analytical results in this special case also apply to the general case.
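The spectral-radius argument can be illustrated with a short numerical check. This is a sketch under stated assumptions: A is taken to be nonnegative with columns summing to one, and the network size, the bound `lam_plus`, and the range of the forgetting factors are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6                       # illustrative number of nodes
lam_plus = 0.98             # assumed upper bound on the forgetting factors

# Nonnegative combination matrix whose columns each sum to one.
A = rng.random((N, N))
A /= A.sum(axis=0, keepdims=True)

def spectral_radius(M):
    """Largest eigenvalue magnitude of a square matrix."""
    return np.max(np.abs(np.linalg.eigvals(M)))

# Bounding case: every forgetting factor sits at its upper bound.
AkA = np.kron(A, A)
rho_bound = spectral_radius(lam_plus**2 * AkA)

# Generic case: forgetting factors drawn at or below the upper bound.
lam = rng.uniform(0.90, lam_plus, size=N)
F = np.diag(lam) @ A
rho_F = spectral_radius(np.kron(F, F))

print("columns of A kron A sum to one:", np.allclose(AkA.sum(axis=0), 1.0))
print("rho(lam_plus^2 A kron A) =", rho_bound, " lam_plus^2 =", lam_plus**2)
print("rho(F kron F) =", rho_F)
```

For nonnegative matrices the spectral radius is monotone in the entries, which is the property that lets the bounding case control the generic one.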
Simulation results
In this section, we present the simulation results for the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms in two applications, namely, distributed parameter estimation and distributed spectrum estimation over sensor networks.
Distributed parameter estimation
In this part, we evaluate the performance of the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms when applied to distributed parameter estimation in comparison with the DRLS algorithm with the fixed forgetting factor and the GVFF-DRLS algorithm. In addition, we verify the effectiveness of the proposed analytical expressions in (76) and (78) based on simulations.
We assume that there are 10 nodes in the sensor network and the length of the unknown weight vector is M=5. The input vectors u _{ k,i }, k=1,2,…,N are assumed to be Gaussian with zero means and variances $\left \{\sigma _{u,k}^{2}\right \}$ chosen randomly between 1 and 2 for each node. The Gaussian noise samples v _{ k,i }, k=1,2,…,N have variances $\left \{\sigma _{v,k}^{2}\right \}$ that are chosen randomly between 0.1 and 0.2 for each node. We generate the measurements {d _{ k,i }} according to (1). Simulation results are averaged over 100 experiments. The adaptation matrix C is governed by the Metropolis rule, while the choice of the diffusion matrix A follows the relative-degree rule [8]. The network topology used for the simulations is shown in Fig. 2.
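The setup described above can be sketched as follows. This is an illustrative sketch, not the authors' code: the topology is a hypothetical ring with a few extra links (the topology of Fig. 2 is not reproduced here), the data are real-valued, and the measurement model (1) is assumed to be the usual linear regression d _{ k,i }=u _{ k,i }^{*}w^{o}+v _{ k,i }. The Metropolis and relative-degree rules are written in their common textbook forms, with each node counted as its own neighbor.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 10, 5                                  # nodes, length of the weight vector

# Hypothetical topology: a ring plus three extra links, self-loops included.
adj = np.eye(N, dtype=bool)
for k in range(N):
    adj[k, (k + 1) % N] = adj[(k + 1) % N, k] = True
for k, l in [(0, 5), (2, 7), (3, 8)]:
    adj[k, l] = adj[l, k] = True
deg = adj.sum(axis=1)                         # neighborhood sizes n_k

# Metropolis rule: c_{l,k} = 1 / max(n_k, n_l) for neighbors l != k,
# and c_{k,k} takes whatever is left so that each column sums to one.
C = np.zeros((N, N))
for k in range(N):
    for l in range(N):
        if l != k and adj[l, k]:
            C[l, k] = 1.0 / max(deg[k], deg[l])
    C[k, k] = 1.0 - C[:, k].sum()

# Relative-degree rule: a_{l,k} = n_l / sum of n_m over the neighborhood of k.
A = np.where(adj, deg[:, None], 0).astype(float)
A /= A.sum(axis=0, keepdims=True)

# Per-node input and noise variances drawn from the stated ranges.
w_o = rng.standard_normal(M)                  # hypothetical unknown vector
sigma_u2 = rng.uniform(1.0, 2.0, size=N)
sigma_v2 = rng.uniform(0.1, 0.2, size=N)

def measure(n_iter):
    """Return regressors U (n_iter, N, M) and measurements D (n_iter, N)."""
    U = rng.standard_normal((n_iter, N, M)) * np.sqrt(sigma_u2)[None, :, None]
    V = rng.standard_normal((n_iter, N)) * np.sqrt(sigma_v2)[None, :]
    return U, U @ w_o + V

U, D = measure(50)
print(C.sum(axis=0), A.sum(axis=0))
```

With these rules C is symmetric and doubly stochastic, while A is only guaranteed to have columns summing to one.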
Effects of α, β, and γ
In this subsection, we study the effects of the parameters α, β, and γ on the performance of the proposed LTVFF and LCTVFF mechanisms. For the LTVFF mechanism, we investigate the steady-state MSD values versus α for β=0.0015,0.002,0.0025,0.005. The simulation results are shown in Fig. 3. For the LCTVFF mechanism, we first depict the steady-state MSD values versus α for β=0.0025,0.005,0.0075,0.01 in Fig. 4. Then, the effects of γ are illustrated in Fig. 5 by investigating the steady-state MSD values against γ for different pairs of α and β.
As can be seen from Figs. 3 and 4 for both the LTVFF and LCTVFF mechanisms, the optimal choice of α and β is not unique. Specifically, different pairs of α and β can yield the same steady-state MSD value. For example, for the LTVFF mechanism, the pairs α=0.91,β=0.0015, α=0.89,β=0.002, and α=0.87,β=0.0025 provide almost the same steady-state MSD performance. For the LCTVFF mechanism, when γ=0.95, the pairs α=0.93,β=0.0025, α=0.90,β=0.005, α=0.85,β=0.0075, and α=0.80,β=0.01 yield almost the same steady-state MSD value. In addition, it can be observed that the steady-state performance degrades as α and β decrease. Furthermore, the result in Fig. 5 reveals that the steady-state MSD performance of the LCTVFF mechanism does not change much as γ varies for different pairs of α and β.
However, when choosing appropriate values for α, β, and γ, it is not enough to consider only the effects on the steady-state behavior. This is because the convergence speed is closely connected to the steady-state MSD values. That is to say, when the algorithm converges faster, the steady-state error floor rises; if the convergence speed is controlled to be slower, the steady-state performance improves. Figures 6 and 7 show the tradeoff between convergence speed and steady-state performance by depicting learning curves for different values of α and β for the LTVFF-DRLS and LCTVFF-DRLS algorithms, respectively. Therefore, we need to keep a good balance between the steady-state behavior and the convergence speed in order to ensure good performance. In practical applications, the optimized values of α, β, and γ should be obtained through experiments and then stored for future use.
MSD and EMSE performance
Figures 8 and 9 show the MSD curves against the number of iterations for the LTVFF-DRLS and LCTVFF-DRLS algorithms with different initial values for the forgetting factor in comparison with the conventional DRLS algorithm and the GVFF-DRLS algorithm, respectively. The parameters of the considered algorithms are listed in Table 4. From the results, the LTVFF-DRLS algorithm converges to almost the same error floor whether the variable forgetting factor is initialized to a small or a large value. This is also true for the LCTVFF-DRLS algorithm, which has a lower error floor and faster convergence speed than the LTVFF-DRLS algorithm. However, as shown in Fig. 8, the convergence speed and steady-state error floor of the conventional DRLS algorithm both change noticeably when the fixed forgetting factor increases. Specifically, when the fixed forgetting factor is small, the conventional DRLS algorithm converges faster but has a higher error floor than the LTVFF-DRLS algorithm; as the fixed forgetting factor increases, it converges to a lower error floor (still not as low as that of the LTVFF-DRLS algorithm) but converges more slowly. Besides, from Fig. 9, the MSD performance of the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms is less sensitive to the initial value of the forgetting factor than that of the GVFF-DRLS algorithm. Therefore, by employing the LTVFF and LCTVFF mechanisms, the proposed algorithms can track the optimal performance regardless of the initial value of the forgetting factor and greatly reduce the difficulty of choosing an appropriate value for the forgetting factor.
In Figs. 10, 11, 12, and 13, we evaluate the performance of the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms based on MSD and EMSE behaviors in comparison with that of the conventional DRLS algorithm with the fixed forgetting factor and the GVFF-DRLS algorithm. Specifically, the MSD and EMSE curves against the number of iterations for the analyzed algorithms are depicted in Figs. 10 and 11, respectively, while the steady-state MSD and EMSE values for each node are shown in Figs. 12 and 13, respectively. As can be seen from these results, both the LTVFF-DRLS and LCTVFF-DRLS algorithms converge after a number of iterations and achieve lower steady-state MSD and EMSE values than the DRLS algorithm with the fixed forgetting factor and the GVFF-DRLS algorithm. Besides, we also depict the analytical results, which are calculated through expressions (76) and (78), in Figs. 10, 11, 12, and 13. From these results, it is clear that the analytical expressions corroborate the simulated results very well. The parameters of the considered algorithms are shown in Table 5; they are tuned through experiments by referring to the investigation in Section 5.1.1.
In Fig. 14, we test the performance of the considered algorithms in a nonstationary environment. Specifically, in order to simulate the nonstationary environment, we consider the scenario where the topology of the sensor network varies over time: the total number of sensor nodes is set to 40 at the start, and then we switch off half of the nodes after 100 iterations and another 10 nodes after 800 iterations. The MSD curves against the number of iterations for the proposed algorithms in comparison with the conventional DRLS algorithm with the fixed forgetting factor and the GVFF-DRLS algorithm in the nonstationary environment are depicted in Fig. 14. As can be observed, switching off some sensor nodes degrades the performance of all the algorithms. However, the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms still outperform the other two existing algorithms in MSD performance. Besides, they exhibit better tracking properties by showing smaller and smoother variations in the MSD curves at the time of switching sensor nodes.
Next, we discuss the numerical stability of the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms. By tuning the parameters α, β, γ, λ _{+}, λ _{−} to different values, the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms can have different convergence speeds and steady-state performance, but their MSD and EMSE curves always decrease to the steady state. Indeed, after a number of experiments, we have not encountered a case where they diverge. Hence, the proposed LTVFF and LCTVFF mechanisms do not worsen the numerical stability of the DRLS algorithm. Besides, the simulation results in Fig. 14 show that, after switching some nodes in the network, the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms still achieve superior performance to the conventional DRLS algorithm, and they exhibit smoother MSD curves at the time of switching nodes, especially the LCTVFF-DRLS algorithm. This further suggests that the proposed algorithms improve rather than impair the numerical stability of the DRLS algorithm by better tracking the variations.
Distributed spectrum estimation
In this part, we extend the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms to the application of distributed spectrum estimation, for which we focus on estimating the parameter w _{0} that is relevant to the unknown spectrum of a transmitted signal s. First of all, we characterize the system model of distributed spectrum estimation.
We denote the power spectral density (PSD) of the unknown spectrum of the transmitted signal s by Φ _{ s }(f), which can be well approximated by the following basis expansion model [32] with N _{ b } sufficiently large:
where $\phantom {\dot {i}\!}\mathbf {b}_{0}(f)=\text {col}\{b_{1}(f),b_{2}(f),\ldots,b_{N_{b}}(f)\}$ is the vector of basis functions [33, 34], $\phantom {\dot {i}\!}\mathbf {w}_{0}=\text {col}\{{w}_{01},{w}_{02},\ldots,{w}_{0N_{b}}\}$ is the expansion parameter to be estimated and represents the power that transmits the signal s over each basis, and N _{ b } is the number of basis functions.
We assume H _{ k }(f,i) to be the channel transfer function between the source emitting the signal s and the receiver node k at time instant i. Based on (91), the PSD of the signal received by node k can be represented as
where $\mathbf {b}_{k,i}(f)=\left [|H_{k}(f,i)|^{2}b_{m}(f)\right ]_{m=1}^{N_{b}}\in \mathbb {R}^{N_{b}}$ and $\sigma _{r,k}^{2}$ denotes the receiver noise power at node k.
At each time instant i, by observing the received PSD described in (92) over N _{ c } frequency samples f _{ j }=f _{ min }:(f _{ max }−f _{ min })/N _{ c }:f _{ max }, for j=1,2,…,N _{ c }, each node k takes measurements according to the following model:
where $v_{k,i}^{j}$ denotes the sampling noise at frequency f _{ j } with zero mean and variance $\sigma _{n,j}^{2}$. The receiver noise power $\sigma _{r,k}^{2}$ can be estimated with high accuracy preliminarily and then subtracted from (93) [35, 36]. Therefore, we can obtain
By collecting the measurements over N _{ c } frequencies into a column vector d _{ k,i }, we obtain the following system model of distributed spectrum estimation:
where $\mathbf {d}_{k,i}=\left [d_{k,i}^{f_{j}}\right ]_{j=1}^{N_{c}}\in \mathbb {R}^{N_{c}}$, $\mathbf {B}_{k,i}=\left [\mathbf {b}^{T}_{k,i}(f_{j})\right ]_{j=1}^{N_{c}}\in \mathbb {R}^{N_{c}{\times }N_{b}}$, with N _{ c }>N _{ b }, and $\mathbf {v}_{k,i}=\left [v_{k,i}^{j}\right ]_{j=1}^{N_{c}}\in \mathbb {R}^{N_{c}}$.
Next, we carry out simulations to show the performance of the proposed algorithms when applied to distributed spectrum estimation. We consider a sensor network composed of N=20 nodes in order to estimate the unknown expansion parameter w _{0}. We use N _{ b }=50 nonoverlapping rectangular basis functions with amplitude equal to one to approximate the PSD of the unknown spectrum. The nodes can scan N _{ c }=100 frequencies over the frequency axis, which is normalized between 0 and 1. In particular, we assume that only 8 entries of w _{0} are nonzero, which implies that the unknown spectrum is transmitted over 8 basis functions. Thus, the sparsity ratio equals 8/50. We set the power transmitted over each basis function to be 0.7 and the variance of the sampling noise to be 0.004.
In Fig. 15, we compare the performance of different algorithms for distributed spectrum estimation in terms of MSD. As can be seen, the proposed LTVFF-DRLS and LCTVFF-DRLS algorithms still outperform the conventional DRLS algorithm in steady-state performance. By tuning parameters, the GVFF-DRLS algorithm can achieve similar performance to the proposed algorithms in convergence speed and steady-state MSD values, but at a much higher computational cost. We have listed the simulation time of running each algorithm for 600 iterations and 1 Monte Carlo experiment in Table 6. As can be observed, the simulation time of running the GVFF-DRLS algorithm is almost three times that of the other algorithms. In Fig. 16, we take node 1 as an example to investigate the performance of different algorithms in estimating the true PSD. From the results, although different algorithms obtain similar estimates of the true PSD, the proposed LCTVFF-DRLS algorithm leads to visibly smaller side lobes in the PSD curve than the other three.
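The measurement model for this setup can be sketched for a single node and a single time instant. This is an illustrative sketch under several assumptions that are not stated in the paper: the channel gain |H _{ k }(f,i)|^{2} is taken to be flat and equal to one (the hypothetical `gain`), the frequency samples are equally spaced, the positions of the 8 active basis functions are drawn at random, and a batch least-squares fit is used in place of the adaptive algorithms purely as a sanity check.

```python
import numpy as np

rng = np.random.default_rng(4)
N_b, N_c = 50, 100                        # basis functions, frequency samples

# Unit-amplitude, nonoverlapping rectangles on the normalized axis [0, 1):
# sample j falls in basis (j * N_b) // N_c, so each basis covers 2 samples.
idx = (np.arange(N_c) * N_b) // N_c
B = np.zeros((N_c, N_b))
B[np.arange(N_c), idx] = 1.0

# Sparse expansion vector: 8 active bases, each carrying power 0.7.
w0 = np.zeros(N_b)
support = rng.choice(N_b, size=8, replace=False)
w0[support] = 0.7

gain = 1.0                                # assumed flat channel gain |H|^2
noise_var = 0.004                         # variance of the sampling noise

# One snapshot of the linear model d = B w0 + v after noise-floor removal.
d = (gain * B) @ w0 + np.sqrt(noise_var) * rng.standard_normal(N_c)

# Batch least-squares estimate, used here only to sanity-check the model.
w_ls, *_ = np.linalg.lstsq(gain * B, d, rcond=None)
print("squared deviation of the LS estimate:", np.sum((w_ls - w0) ** 2))
```

Because N _{ c }>N _{ b } and the rectangles do not overlap, the matrix B has full column rank, so the least-squares problem has a unique solution.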
Conclusions
In this paper, we have proposed two low-complexity VFF-DRLS algorithms for distributed estimation, namely, the LTVFF-DRLS and LCTVFF-DRLS algorithms. For the LTVFF-DRLS algorithm, the forgetting factor is adjusted by the time-averaged cost function, while for the LCTVFF-DRLS algorithm, the forgetting factor is adjusted by the time average of the correlation of two successive estimation errors. We have also investigated the computational complexity of the low-complexity VFF mechanisms as well as the proposed VFF-DRLS algorithms. In addition, we have carried out the convergence and steady-state analysis for the proposed algorithms and derived analytical expressions for the steady-state MSD and EMSE. The simulation results have shown the superiority of the proposed algorithms over the conventional DRLS and GVFF-DRLS algorithms in applications of distributed parameter estimation and distributed spectrum estimation and have verified the effectiveness of our proposed analytical expressions for the steady-state MSD and EMSE.
Appendices
A: Proof that ρ _{ k }(i−1) and e _{ k }(i)^{2} are uncorrelated in the steady state
By multiplying both sides of (26) by e _{ k }(i)^{2} and taking expectations, we have the following equation:
Recall that the values of e _{ k }(i−1) and e _{ k }(i) and the values of ρ _{ k }(i−1) and ρ _{ k }(i) can be considered approximately equivalent when i→∞; therefore, we have the following results:
By recalling (27), we can obtain
That is, we can conclude that ρ _{ k }(i−1) and e _{ k }(i)^{2} are uncorrelated in the steady state.
B: Proof of (39)
According to (8), we can obtain the following equation:
where the matrices $\boldsymbol {\mathcal {H}}_{i}$ and $\boldsymbol {\mathcal {W}}_{k,i}$ can be expressed as follows
Therefore, (99) can be reformulated as
Substituting (2) into (101) yields the following recursion:
By employing the iterative Eq. (102), we can write
Recalling Assumption 1, we know that the correlation matrix of the input vector is invariant over time. As a result, the correlation matrix $\mathbf {R}_{u_{l,i}}$ can be represented as $\mathbf {R}_{u_{l}}$. Therefore, by taking expectations on both sides of (103), we obtain the following result
In view of Assumption 2, (104) can be approximately rewritten as
where ξ and χ can be expressed as follows, respectively:
and
Since N _{ i } is a finite positive number, ξ and χ are two deterministic values. In addition, note that λ _{ k }(i) does not exceed its upper bound λ _{+}, which is smaller than but close to unity. Therefore, we have 0<E[λ _{ k }(i)]<λ _{+}<1, and $E[\lambda _{k} (i)]^{i-N_{i}+1}<\lambda _{+}^{i-N_{i}+1}$. When i is large enough, $\lambda _{+}^{i-N_{i}+1}$ approaches zero, and so does $E[\lambda _{k} (i)]^{i-N_{i}+1}$. As a result, the last term in (105) vanishes. Then, we obtain the following result:
where the value of λ _{ k }(∞) is given in (24) for the LTVFF mechanism and in (35) for the LCTVFF mechanism. Hence, we obtain (39). Note that, by setting appropriate truncation bounds for λ _{ k }(i), the steady-state forgetting factor value will not be influenced by the truncation. Hence, the result (39) holds despite the truncation applied in the VFF mechanisms. Indeed, the truncation mechanism only plays a role during convergence. Once the algorithms reach the steady state, the values of the forgetting factor are no longer affected by the truncation mechanism.
References
1. P Corke, T Wark, R Jurdak, W Hu, P Valencia, D Moore, Environmental wireless sensor networks. Proc. IEEE 98(11), 1903–1917 (2010).
2. JG Ko, C Lu, MB Srivastava, JA Stankovic, A Terzis, M Welsh, Wireless sensor networks for healthcare. Proc. IEEE 98(11), 1947–1960 (2010).
3. R Abdolee, B Champagne, AH Sayed, in Proc. IEEE Statistical Signal Processing Workshop. Diffusion LMS for Source and Process Estimation in Sensor Networks (IEEE, Ann Arbor, 2012).
4. R Abdolee, B Champagne, AH Sayed, in Proc. IEEE ICASSP. Diffusion LMS Localization and Tracking Algorithm for Wireless Cellular Networks (IEEE, Vancouver, 2013).
5. R Abdolee, B Champagne, AH Sayed, Diffusion adaptation over multi-agent networks with wireless link impairments. IEEE Trans. Mob. Comput. 15(6), 1362–1376 (2016).
6. FS Cattivelli, CG Lopes, AH Sayed, in Proc. IEEE Workshop Signal Process. Advances Wireless Commun. (SPAWC). A Diffusion RLS Scheme for Distributed Estimation over Adaptive Networks (IEEE, Helsinki, 2007), pp. 1–5.
7. FS Cattivelli, CG Lopes, AH Sayed, Diffusion recursive least-squares for distributed estimation over adaptive networks. IEEE Trans. Signal Process. 56(5), 1865–1877 (2008).
8. FS Cattivelli, AH Sayed, Diffusion LMS strategies for distributed estimation. IEEE Trans. Signal Process. 58(3), 1035–1048 (2010).
9. CG Lopes, AH Sayed, Diffusion least-mean squares over distributed networks: formulation and performance analysis. IEEE Trans. Signal Process. 56(7), 3122–3136 (2008).
10. Y Liu, C Li, Z Zhang, Diffusion sparse least-mean squares over networks. IEEE Trans. Signal Process. 60(8), 4480–4485 (2012).
11. S Xu, RC de Lamare, HV Poor, in Proc. IEEE ICASSP. Adaptive link selection strategies for distributed estimation in diffusion wireless networks (IEEE, Vancouver, 2013).
12. S Xu, RC de Lamare, HV Poor, Distributed compressed estimation based on compressive sensing. IEEE Signal Process. Lett. 22(9), 1311–1315 (2015).
13. MOB Saeed, A Zerguine, SA Zummo, A variable step-size strategy for distributed estimation over adaptive networks. EURASIP J. Adv. Signal Process. 2013(1), 1–14 (2013).
14. H Lee, S Kim, J Lee, W Song, A variable step-size diffusion LMS algorithm for distributed estimation. IEEE Trans. Signal Process. 63(7), 1808–1820 (2015).
15. Z Liu, Y Liu, C Li, Distributed sparse recursive least-squares over networks. IEEE Trans. Signal Process. 62(6), 1386–1395 (2014).
16. S Huang, C Li, Distributed sparse total least-squares over networks. IEEE Trans. Signal Process. 63(11), 2986–2998 (2015).
17. C Li, P Shen, Y Liu, Z Zhang, Diffusion information theoretic learning for distributed estimation over network. IEEE Trans. Signal Process. 61(16), 4011–4024 (2013).
18. Z Liu, C Li, Y Liu, Distributed censored regression over networks. IEEE Trans. Signal Process. 63(20), 5437–5449 (2015).
19. S Haykin, Adaptive Filter Theory, 4th edn (Prentice-Hall, Englewood Cliffs, 2000).
20. S Leung, CF So, Gradient-based variable forgetting factor RLS algorithm in time-varying environments. IEEE Trans. Signal Process. 53(8), 3141–3150 (2005).
21. CF So, SH Leung, Variable forgetting factor RLS algorithm based on dynamic equation of gradient of mean square error. Electron. Lett. 37(3), 202–203 (2011).
22. S Song, J Lim, S Baek, K Sung, Gauss Newton variable forgetting factor recursive least squares for time varying parameter tracking. Electron. Lett. 36(11), 988–990 (2000).
23. S Song, J Lim, SJ Baek, K Sung, Variable forgetting factor linear least squares algorithm for frequency selective fading channel estimation. IEEE Trans. Veh. Technol. 51(3), 613–616 (2002).
24. F Albu, in Proc. of ICARCV 2012. Improved Variable Forgetting Factor Recursive Least Square Algorithm (IEEE, Guangzhou, 2012).
25. Y Cai, RC de Lamare, M Zhao, J Zhong, Low-complexity variable forgetting factor mechanisms for blind adaptive constrained constant modulus algorithms. IEEE Trans. Signal Process. 60(8), 3988–4002 (2012).
26. L Qiu, Y Cai, M Zhao, Low-complexity variable forgetting factor mechanisms for adaptive linearly constrained minimum variance beamforming algorithms. IET Signal Process. 9(2), 154–165 (2015).
27. R Arablouei, K Dogancay, S Werner, Y Huang, Adaptive distributed estimation based on recursive least-squares and partial diffusion. IET Signal Process. 62(14), 1198–1208 (2014).
28. DS Tracy, RP Singh, A new matrix product and its applications in partitioned matrix differentiation. Statistica Neerlandica 51(3), 639–652 (2003).
29. H Shin, AH Sayed, in Proc. IEEE ICASSP. Transient Behavior of Affine Projection Algorithms (IEEE, Hong Kong, 2003).
30. JH Husoy, MSE Abadi, in IEEE MELECON 2004. A Common Framework for Transient Analysis of Adaptive Filters (IEEE, Dubrovnik, 2004).
31. AH Sayed, Adaptive filters (Wiley, 2011).
32. JA Bazerque, GB Giannakis, Distributed spectrum sensing for cognitive radio networks by exploiting sparsity. IEEE Trans. Signal Process. 58(3), 1847–1862 (2010).
33. S Chen, DL Donoho, MA Saunders, Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20, 33–61 (1998).
34. Y Zakharov, T Tozer, J Adlard, Polynomial splinesapproximation of Clarke’s model. IEEE Trans. Signal Process. 52(5), 1198–128 (2004).
35. PD Lorenzo, S Barbarossa, A Sayed, Distributed spectrum estimation for small cell networks based on sparse diffusion adaptation. IEEE Signal Process. Lett. 20(123), 1261–1265 (2013).
36. ID Schizas, G Mateos, GB Giannakis, Distributed LMS for consensus-based in-network adaptive processing. IEEE Trans. Signal Process. 57(6), 2365–2382 (2009).
Funding
This work was supported in part by the National Natural Science Foundation of China under Grant 61471319, the Scientific Research Project of Zhejiang Provincial Education Department under Grant Y201122655, and the Fundamental Research Funds for the Central Universities.
Author information
Contributions
YC and RCdL proposed the original idea. LZ carried out the experiment. In addition, LZ and YC wrote the paper. CL and RCdL supervised and reviewed the manuscript. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Yunlong Cai.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Keywords
 Sensor networks
 Distributed parameter estimation
 Distributed spectrum estimation
 Diffusion recursive leastsquares
 Variable forgetting factor