A variable-parameter normalized mixed-norm (VPNMN) adaptive algorithm

Since both the least mean-square (LMS) and least mean-fourth (LMF) algorithms suffer individually from the problem of eigenvalue spread, so does the mixed-norm LMS-LMF algorithm. To overcome this problem for the mixed-norm LMS-LMF algorithm, we adopt the same normalization technique (normalizing by the power of the input) that was successfully used with the LMS and LMF algorithms separately. Consequently, a new variable-parameter normalized mixed-norm (VPNMN) adaptive algorithm is proposed in this study. The algorithm is derived by exploiting a time-varying mixing parameter in the traditional mixed-norm LMS-LMF weight update equation. The time-varying mixing parameter is adjusted according to a well-known technique used in the adaptation of the step-size parameter of the LMS algorithm. To study the theoretical aspects of the proposed VPNMN adaptive algorithm, our study also addresses its convergence analysis and assesses its performance using the concept of energy conservation. Extensive simulation results corroborate our theoretical findings and show that a substantial improvement, in both convergence time and steady-state error, can be obtained with the proposed algorithm. Finally, the VPNMN algorithm proved its usefulness in a noise cancellation application, where it showed its superiority over the normalized least-mean square algorithm.


Introduction
Due to its simplicity, the least mean-square (LMS) [1,2] algorithm is the most widely used algorithm for adaptive filters in many applications. The least mean-fourth (LMF) [3] algorithm was also proposed later as a special case of the more general family of steepest descent algorithms [4] with 2k error norms, k being a positive integer.
For both of these algorithms, however, the convergence behavior depends on the condition number, i.e., on the ratio of the maximum to the minimum eigenvalue of the input signal autocorrelation matrix, $R = E[x_n x_n^T]$, where $x_n$ is the input signal. This is clearly seen from their respective time constants [1,3], given by (1) for the LMS algorithm and (2) for the LMF algorithm. For the $i$th mode, $i = 1, \ldots, N$, the LMS time constant is inversely proportional to $\mu\lambda_i$ and the LMF time constant is inversely proportional to $\mu\sigma_\eta^2\lambda_i$, where $\sigma_\eta^2$ is the noise power, $\lambda_i$ is the $i$th eigenvalue of the autocorrelation matrix of the input signal, $\mu$ is the step size used in the adaptation scheme, and $N$ is the number of coefficients in the adaptive filter. As seen from (1) and (2), the ratio $\tau_{\max}/\tau_{\min}$ is the same for both algorithms and is given by the eigenvalue spread (i.e., the condition number):
$$\frac{\tau_{\max}}{\tau_{\min}} = \frac{\lambda_{\max}}{\lambda_{\min}}. \qquad (3)$$
To remove the dependency of the convergence of the LMS algorithm on the condition number, the normalized least-mean square (NLMS) algorithm [5] was introduced. As reported in [5], the NLMS algorithm converges much faster than the LMS algorithm, at the expense of a larger steady-state error. Similar results were obtained for the normalized LMF (NLMF) algorithm [6][7][8].
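To make the notion of eigenvalue spread concrete, the following minimal sketch computes the condition number of a 2x2 autocorrelation matrix of the form [[r0, r1], [r1, r0]], whose eigenvalues are r0 + r1 and r0 - r1. The function name and the numerical values are illustrative assumptions and do not come from the article.

```python
def eigenvalue_spread_2x2(r0, r1):
    """Eigenvalue spread of the 2x2 autocorrelation matrix [[r0, r1], [r1, r0]].

    Hypothetical helper for illustration: the eigenvalues of this symmetric
    Toeplitz matrix are r0 + r1 and r0 - r1.
    """
    lam_max = r0 + abs(r1)
    lam_min = r0 - abs(r1)
    if lam_min <= 0:
        raise ValueError("matrix is not positive definite")
    return lam_max / lam_min


# White input (no correlation between adjacent samples): spread of 1.
white = eigenvalue_spread_2x2(1.0, 0.0)
# Strongly correlated input: the slowest mode is much slower than the fastest.
correlated = eigenvalue_spread_2x2(1.0, 0.9)
print(white, correlated)
```

By (3), the second case means that the slowest mode of an unnormalized LMS or LMF filter would take roughly 19 times longer to converge than the fastest one.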
A mixed-norm algorithm [9][10][11], combining the LMS and the LMF algorithms, also suffers from the eigenvalue spread dependency, since each of its constituent algorithms does. To circumvent this problem for the mixed-norm LMS-LMF algorithm, we adopt the same normalization technique that was successfully used with the LMS and LMF algorithms separately.
It is well known that fast convergence and low steady-state error are, in general, two conflicting requirements in adaptive filtering. When compared to the LMS algorithm, the NLMS algorithm converges faster, but only at the expense of a higher steady-state error [12,13]. A promising solution to this conflict is a time-varying normalized mixed-norm LMS-LMF algorithm. In this mixed-norm algorithm, the NLMS algorithm is used during the transient state to speed up convergence. When steady state is reached, the algorithm automatically switches from the NLMS to the NLMF [7], thanks to a built-in "gear shifting" property, to secure a lower steady-state error.
In this work, the performance of a variable-parameter normalized mixed-norm (VPNMN) LMS-LMF algorithm is evaluated. It will be shown that the VPNMN algorithm achieves better performance, in both convergence and steady-state error, than either the NLMS or the NLMF algorithm.
The rest of the article is organized as follows. Section 2 deals with a more explicit development of the proposed algorithm, and Section 3 treats its convergence analysis. The steady-state analysis of the proposed algorithm is detailed in Section 4, while its tracking analysis is given in Section 5. Performance evaluation of the resulting algorithm is carried out in Section 6. Finally, the conclusion section summarizes this work.

Algorithm development
The mixed-norm LMS-LMF algorithm is based on the minimization of the following cost function [9,10]:
$$J_n = \alpha E[e_n^2] + (1 - \alpha) E[e_n^4], \qquad (4)$$
where $\alpha$ is a positive mixing parameter in the interval [0, 1] and the error $e_n$ is defined as
$$e_n = d_n - w_n^T x_n + \eta_n, \qquad (5)$$
where $d_n$ is the desired value, $w_n$ is the coefficient vector of the adaptive filter, $x_n$ is the input signal, and $\eta_n$ is the additive noise.
A major drawback of this algorithm, however, is that the mixing parameter is hard to fix a priori for an unknown system. In [14], a self-adapting LMS-LMF algorithm with a time-varying weighting factor was proposed. The time variation was achieved by allowing for a variable mixing factor that is updated at every iteration using the modified variable step-size (MVSS) algorithm proposed in [15]. The variable-weight mixed-norm LMS-LMF algorithm was defined to minimize the following performance measure [14]:
$$J_n = \alpha_n E[e_n^2] + (1 - \alpha_n) E[e_n^4], \qquad (6)$$
where $\alpha_n$, chosen in [0, 1] such that the unimodal character of the above cost is preserved, is a time-varying parameter updated according to [15]
$$\alpha_{n+1} = \delta \alpha_n + \gamma p_n^2, \qquad (7)$$
and
$$p_n = \beta p_{n-1} + (1 - \beta) e_n e_{n-1}. \qquad (8)$$
The parameters $\delta$ and $\beta$, both confined to the interval [0, 1], are exponential weighting parameters that govern the averaging time constant, i.e., the quality of estimation of the algorithm, and $\gamma > 0$. Note that the algorithm defined by (4) is restored when $\delta = 1$ and $\gamma = 0$, which forces $\alpha_n$ to have a fixed value.
Based on this motivation, the variable-weight mixed-norm LMS-LMF algorithm for recursively adjusting the coefficients of the system is expressed in the following form:
$$w_{n+1} = w_n + \mu \left[\alpha_n + 2(1 - \alpha_n) e_n^2\right] e_n x_n, \qquad (9)$$
where $\mu$ is the step size. As mentioned earlier, because of its reliance on the LMS and the LMF, the algorithm defined by (9) is affected by the eigenvalue spread of the autocorrelation matrix of the input signal. To overcome this dependency, a VPNMN adaptive algorithm is introduced, and its weight update recursion is given by
$$w_{n+1} = w_n + \mu \left[\alpha_n + 2(1 - \alpha_n) e_n^2\right] e_n \frac{x_n}{\|x_n\|^2}, \qquad (10)$$
where $\|x_n\|^2$ is the squared Euclidean norm of the input signal $x_n$. To guard against a zero input, the ε-VPNMN algorithm, defined as
$$w_{n+1} = w_n + \mu \left[\alpha_n + 2(1 - \alpha_n) e_n^2\right] e_n \frac{x_n}{\varepsilon + \|x_n\|^2}, \qquad (11)$$
where $\varepsilon$ is a small positive constant, must be used for regularization purposes.
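The recursions of this section can be sketched as a small system-identification loop. This is a minimal sketch, not the authors' implementation: it assumes the ε-regularized update takes the form w_{n+1} = w_n + μ[α_n + 2(1 - α_n)e_n²] e_n x_n / (ε + ‖x_n‖²), that the mixing parameter follows α_{n+1} = δα_n + γp_n² with p_n = βp_{n-1} + (1 - β)e_n e_{n-1}, and that α_n is clipped to [0, 1] (the clipping, the function name `vpnmn_identify`, the white Gaussian input, and the values of `mu`, `eps`, and `noise_std` are all illustrative assumptions).

```python
import random


def vpnmn_identify(w_opt, n_iter=2000, mu=0.2, delta=0.97, beta=0.98,
                   gamma=1e-2, alpha0=0.8, eps=1e-8, noise_std=0.01, seed=0):
    """Identify the FIR system w_opt with an epsilon-VPNMN style update (sketch)."""
    rng = random.Random(seed)
    n_taps = len(w_opt)
    w = [0.0] * n_taps            # adaptive filter coefficients
    x = [0.0] * n_taps            # regressor (tapped delay line)
    alpha, p, e_prev = alpha0, 0.0, 0.0
    for _ in range(n_iter):
        # Shift a new white Gaussian sample into the delay line.
        x = [rng.gauss(0.0, 1.0)] + x[:-1]
        d = sum(wo * xi for wo, xi in zip(w_opt, x))
        eta = rng.gauss(0.0, noise_std)
        e = d + eta - sum(wi * xi for wi, xi in zip(w, x))
        # Mixed-norm error nonlinearity: LMS term plus LMF term.
        gain = mu * (alpha + 2.0 * (1.0 - alpha) * e * e) * e
        norm = eps + sum(xi * xi for xi in x)
        w = [wi + gain * xi / norm for wi, xi in zip(w, x)]
        # MVSS-style update of the mixing parameter, clipped to [0, 1]
        # (the clipping is an assumption made for this sketch).
        p = beta * p + (1.0 - beta) * e * e_prev
        alpha = min(1.0, max(0.0, delta * alpha + gamma * p * p))
        e_prev = e
    return w, alpha


w_hat, alpha_final = vpnmn_identify([1.0, -0.5, 0.25])
print(w_hat, alpha_final)
```

With these settings the mixing parameter starts at 0.8, so the update behaves mostly like the NLMS early on, and it decays as the smoothed error correlation p_n shrinks, which shifts the weight toward the NLMF term.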

Convergence analysis of the VPNMN algorithm
In this section, the convergence analysis of the proposed VPNMN algorithm is carried out. Both the mean and the mean-square behaviors of the weight error vector are presented in the ensuing analysis.

Mean behavior
In the ensuing analysis, the following assumptions are used in deriving the convergence in the mean of the normalized mixed-norm LMS-LMF algorithm. They are quite similar to what is usually assumed in the literature [2][3][4][16] and can be justified in several practical instances.
A.1 The noise sequence $\{\eta_n\}$ is statistically independent of the input signal sequence $\{x_n\}$, and both sequences have zero mean.
A.2 The weight-error vector $v_n$, to be defined later, is independent of the input $x_n$.
A.3 The mixing parameter is independent of both the input signal and the error.
Examining the mean behavior of (10) under the above assumptions, sufficient conditions for convergence of the proposed algorithm in-the-mean can be derived and are stated as follows.
Proposition 1. For the algorithm defined by (10) to converge in the mean, a sufficient condition is that $\mu$ be chosen in the range given by (12), where $\sigma_\eta^2$ is the noise power, $\bar{\alpha}_n = E[\alpha_n]$ is the mean of the mixing parameter, and $C$ is the Cramer-Rao bound associated with the problem of estimating the random quantity $x_n^T w_{opt}$ by using $x_n^T w_n$.
Proof: The mean convergence of the proposed algorithm is studied by taking the expectation of the weight-error vector, $v_n = w_n - w_{opt}$. In this regard, the error $e_n$ can be expressed in terms of $v_n$ as in (13), and hence (10) becomes the weight-error recursion (14), which retains the normalization factor $x_n/\|x_n\|^2$.
Consequently, taking the expectation of both sides of (14) under A.1-A.3, the mean weight-error vector of the proposed algorithm evolves according to (15).
Consider the expectation in (15) that involves $e_n x_n$. The approximation made for it is especially accurate when the filter is long enough. Consequently, the independence assumption can be invoked, and the expectation $E[e_n x_n]$ is evaluated using the technique of [17].
Consider next the expectation in (15) that involves $e_n^3 x_n$. Again, the approximation is especially accurate when the filter is long enough, and the independence assumption can be invoked. The expectation $E[e_n^3 x_n]$ is evaluated using the technique of [17,18], which does not employ any linearization of $e_n^3$.
Ultimately, (15) can be set up in the form (20). If $C \le \zeta_n$ is the Cramer-Rao bound associated with the problem of estimating the random quantity $x_n^T w_{opt}$ by using $x_n^T w_n$, then, taking into account that the eigenvalues of $R$ are all real and positive, with $\lambda_{\max}$ the largest eigenvalue of $R$ and, in general, $\lambda_{\max} < \mathrm{tr}(R)$ [19], it follows that a sufficient condition for convergence of the proposed algorithm is that the step-size parameter $\mu$ satisfies (12). ▪
Two extreme scenarios can be considered for the value of the mixing parameter $\alpha_n$:
(1) Scenario 1: When $\alpha_n = 0$, the VPNMN algorithm reduces to the NLMF algorithm [6], and (12) reduces to the corresponding step-size range of that algorithm.
(2) Scenario 2: When $\alpha_n = 1$, both the NLMS algorithm and its step-size range, that is, $0 < \mu < 2$, are recovered.
Remarks:
(1) It can be seen from (10) that the VPNMN algorithm can be viewed as a variable step-size LMS-LMF algorithm with a time-varying step size.
(2) The error is usually large during the initial adaptation and gradually decreases toward a minimum. Therefore, the signal power, $\|x_n\|^2$, acts as a threshold that avoids taking large step sizes in the recursive update equation when the error converges to a minimum.
(3) The bound on the step size $\mu$ of the proposed algorithm that guarantees convergence of the mean weight vector, given by (12), shows that the mean-weight-vector stability depends on the Cramer-Rao bound. Therefore, the convergence of the mean weight vector of the proposed algorithm depends on its mean-square stability. A similar fact was observed in [18] for the LMF algorithm.
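Remark (1) and the two scenarios can be illustrated by computing the effective step size that multiplies $e_n x_n$ in the normalized update. The sketch below assumes the update has the form w_{n+1} = w_n + μ[α_n + 2(1 - α_n)e_n²] e_n x_n / ‖x_n‖²; the function name `effective_step` and the numerical values are hypothetical.

```python
def effective_step(mu, alpha, e, x, eps=0.0):
    """Time-varying step size multiplying e_n * x_n (illustrative helper)."""
    norm = eps + sum(xi * xi for xi in x)
    return mu * (alpha + 2.0 * (1.0 - alpha) * e * e) / norm


x = [1.0, 2.0, 2.0]                          # squared norm is 9
nlms_like = effective_step(0.9, 1.0, 5.0, x)  # alpha = 1: independent of the error
nlmf_like = effective_step(0.9, 0.0, 3.0, x)  # alpha = 0: scales with e**2
print(nlms_like, nlmf_like)
```

With $\alpha_n = 1$ the step is simply μ/‖x_n‖², as in the NLMS algorithm, whereas with $\alpha_n = 0$ it grows with the squared error, which is why large errors call for a tighter bound on μ.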

Mean square behavior
In this section, the performance of the VPNMN algorithm in the mean-square sense is analyzed. We use a unified approach to the transient analysis of adaptive filters with error nonlinearities. This approach does not restrict the regression data to be Gaussian and avoids the need for explicit recursions for the covariance matrix of the weight-error vector. It assumes that the adaptive filter is long enough to justify the following assumptions, which are realistic for longer adaptive filters:
A.4 The residual or a priori error $e_{an}$, to be defined later, can be assumed to be Gaussian.
A.5 The squared norm of the input regressor, $\|x_n\|^2$, can be assumed to be uncorrelated with $f^2(e_n)$, where $f(e_n)$ is defined in (23).
The framework is based on the energy conservation relation, which was first noted in [20]. In general, the adaptation scheme defined in (14) can be written in the form (22), where $f(e_n)$ denotes a general scalar function of the output estimation error $e_n$; in our case it is given by
$$f(e_n) = \left[\alpha_n + 2(1 - \alpha_n) e_n^2\right] e_n. \qquad (23)$$
We are interested in studying the time evolution and the steady-state values of $E[|e_{an}|^2]$ and $E[\|v_n\|^2]$, which represent the mean-square-error and the mean-square-deviation performances of the filter, respectively, while their time evolution describes the learning, or transient, behavior of the filter.
Then, for some symmetric positive definite weighting matrix $A$ to be specified later, the weighted a priori and a posteriori estimation errors are, respectively, defined as [21]
$$e_{an}^A = x_n^T A v_n, \quad \text{and} \quad e_{pn}^A = x_n^T A v_{n+1}.$$
For the special case $A = I$, the weighted a priori and a posteriori estimation errors defined above reduce to the standard a priori and a posteriori estimation errors, respectively, that is,
$$e_{an} = e_{an}^I = x_n^T v_n, \quad \text{and} \quad e_{pn} = e_{pn}^I = x_n^T v_{n+1}.$$
It can be shown that the estimation error, $e_n$, and the a priori error, $e_{an}$, are related via $e_n = e_{an} + \eta_n$. Also, using (10) and (24), the weighted a posteriori error can be expressed in terms of the weighted a priori error as in (26), where the notation $\|x_n\|_A^2$ denotes the weighted squared Euclidean norm $\|x_n\|_A^2 = x_n^T A x_n$. The performance measure in the analysis is the excess mean-square error (EMSE), denoted by $\zeta_n$ and defined as
$$\zeta_n = E[|e_{an}|^2]. \qquad (27)$$
Since $e_{an} = x_n^T v_n$, the EMSE can also be written, under A.2, as
$$\zeta_n = E[\|v_n\|_R^2]. \qquad (28)$$
Next, the fundamental weighted-energy conservation relation given in [21] is presented to develop the framework for the transient analysis of the proposed algorithm. By substituting (26) in (22), relation (29) is obtained. Ultimately, the fundamental weighted-energy conservation relation can be shown to be
$$\|v_{n+1}\|_A^2 + \frac{|e_{an}^A|^2}{\|x_n\|_A^2} = \|v_n\|_A^2 + \frac{|e_{pn}^A|^2}{\|x_n\|_A^2}. \qquad (30)$$
This relation shows how the weighted energies of the error quantities evolve in time. It has been shown that different choices of $A$ allow different performance measures of an adaptive filter to be evaluated.
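The energy conservation relation can be checked numerically in the unweighted case A = I. The sketch below assumes real-valued data and a weight-error update of the form v_{n+1} = v_n - s x_n, where the scalar s stands in for the data-dependent factor of the recursion (for example μ f(e_n)/‖x_n‖², up to sign); the function name `energy_terms` and the test values are hypothetical.

```python
import random


def energy_terms(v, x, step):
    """Return both sides of the energy relation for v_next = v - step * x (A = I)."""
    norm_x = sum(xi * xi for xi in x)
    v_next = [vi - step * xi for vi, xi in zip(v, x)]
    e_a = sum(xi * vi for xi, vi in zip(x, v))        # a priori error x^T v_n
    e_p = sum(xi * vi for xi, vi in zip(x, v_next))   # a posteriori error x^T v_{n+1}
    lhs = sum(vi * vi for vi in v_next) + e_a ** 2 / norm_x
    rhs = sum(vi * vi for vi in v) + e_p ** 2 / norm_x
    return lhs, rhs


rng = random.Random(1)
v = [rng.gauss(0.0, 1.0) for _ in range(4)]
x = [rng.gauss(0.0, 1.0) for _ in range(4)]
lhs, rhs = energy_terms(v, x, step=0.3)
print(abs(lhs - rhs) < 1e-9)
```

The equality holds for any value of the scalar step, which is why the relation can be applied to a general error nonlinearity without approximation.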

Time evolution of the weighted variance
In this section, the time evolution of the weighted variance $E[\|v_n\|_A^2]$ is derived for the proposed algorithm using the fundamental weighted-energy conservation relation (30). Substituting the expression for the a posteriori error from (26) in (30) and taking the expectation of both sides yields relation (31), which involves the two expectations $E[e_{an}^A f(e_n)]$ and $E[\|x_n\|_A^2 f^2(e_n)]$. Expressions for these two quantities are given next.
First, we use the following assumption, which was adopted in [21]:
A.6 For any constant matrix $A$ and for all $n$, $e_{an}$ and $e_{an}^A$ are jointly Gaussian.
This assumption is reasonable for longer filters by central limit arguments [21]. Moreover, a similar assumption was used in [22]. Hence, the expectation $E[e_{an}^A f(e_n)]$ can be simplified using Price's theorem [23,24] and assumptions A.4 and A.6, which gives (32).
Since $e_{an}^A = x_n^T A v_n$ and $e_{an} = x_n^T v_n$, the expectation $E[e_{an}^A e_{an}]$ can in turn be simplified, and ultimately (32) can be written as (35).
Second, to evaluate the expectation $E[\|x_n\|_A^2 f^2(e_n)]$, we resort to the following assumption [21]:
A.7 The adaptive filter is long enough such that $\|x_n\|_A^2$ and $f^2(e_n)$ are uncorrelated.
This assumption becomes more realistic as the filter gets longer [21], and an unweighted version of it was used in [22,25]. The assumption enables us to split the expectation as
$$E[\|x_n\|_A^2 f^2(e_n)] = E[\|x_n\|_A^2]\, E[f^2(e_n)], \qquad (36)$$
where $E[f^2(e_n)]$ can be shown to be given by (37). Ultimately, (31) can be rewritten as (39).
Now, using the Cayley-Hamilton theorem, we can write
$$R^N = -p_0 I - p_1 R - \cdots - p_{N-1} R^{N-1}, \qquad (40)$$
where
$$p(x) = \det(xI - R) = p_0 + p_1 x + \cdots + p_{N-1} x^{N-1} + x^N \qquad (41)$$
is the characteristic polynomial of $R$. Consequently, relation (42) is obtained. Ultimately, using (39) and (42), the transient behavior of the proposed algorithm can be shown to be governed by the recursion (43), where the associated quantities are defined in (44) and (45). The learning curves for the MSD and the EMSE can be obtained from the first and second elements of the vector $W_n$, respectively.

Mean-square stability
Finally, in this section, the mean-square stability of the proposed algorithm is investigated. We provide a nontrivial upper bound on $\mu$ for which $E[\|v_n\|^2]$ remains uniformly bounded for all $n$.
Starting from (31) with $A = I$ and using the Gaussian behavior of $e_{an}$, it can be shown that the proposed algorithm is mean-square stable provided that a condition relating the two expectations $E[e_{an} f(e_n)]$ and $E[\|x_n\|^2 f^2(e_n)]$ holds. Substituting the values of these two expectations into that condition leads to the step-size bound (48).

Steady-state analysis of the VPNMN algorithm
The purpose of the steady-state analysis of an adaptive filter is to study the behavior of the steady-state EMSE. To this end, (31) is analyzed in the limiting case $n \to \infty$, assuming that the weight-error vector reaches a steady-state mean-square value, i.e.,
$$\lim_{n \to \infty} E[\|v_{n+1}\|^2] = \lim_{n \to \infty} E[\|v_n\|^2].$$
Consequently, for a unity weight matrix ($A = I$), (31) reduces to a relation between the two steady-state expectations. Using the definition of the EMSE given by (28), its steady-state value, denoted by $\zeta_\infty$, can then be found. The terms $\lim_{n \to \infty} Z_n$ and $\lim_{n \to \infty} \mathcal{F}_n$ can be obtained from (35) and (37), respectively.
Since the EMSE is very close to zero at steady state, the higher powers of $\zeta_\infty$ can be ignored, which ultimately yields a closed-form expression for the steady-state EMSE of the proposed algorithm.

Tracking analysis of the VPNMN algorithm
Cyclic and random system nonstationarities are a common impairment in communication systems, especially in applications that involve channel estimation, channel equalization, and inter-symbol-interference cancellation. Random nonstationarity is present due to variations in channel characteristics, which occur in most cases and particularly in a mobile communication environment [26]. Cyclic system nonstationarities arise in communication systems due to mismatches between the transmitter and receiver carrier generators.
The ability of adaptive filtering algorithms to track such system variations is not yet fully understood. In this regard, Rupp [27] presented a first-order analysis of the performance of the LMS algorithm in the presence of a carrier frequency offset. In [21,25,28,29], a general framework for the tracking analysis of adaptive algorithms was developed. It can handle both cyclic and random system nonstationarities simultaneously. This framework, based on an energy conservation principle [20], holds for all adaptive algorithms whose recursions are of the form (53). In the ensuing analysis, the tracking analysis of the proposed algorithm is carried out in the presence of both random and cyclic nonstationarities. It should be noted that, unlike the convergence analysis, which is a linear process, the tracking analysis is a nonlinear one due to the presence of the term $e^{j\Omega n}$ in (54). This justifies our use of complex signals, instead of real ones, in the tracking analysis.
A general system model is presented here which includes both types of nonstationarities, that is, random and cyclic ones. To start, consider the noisy measurement $d_n$ that arises in a model of the form (54), where $\eta_n$ is the measurement noise and $w_n^o$ is the unknown system to be tracked. The multiplicative term $e^{j\Omega n}$ accounts for a possible frequency offset between the transmitter and the receiver carriers in a digital communication scenario. Furthermore, it is assumed that the unknown system vector $w_n^o$ changes randomly according to (55), where $w^o$ is a fixed vector and $q_n$ is assumed to be a zero-mean stationary random vector process with a positive definite autocorrelation matrix $Q_n = E[q_n q_n^H]$. Moreover, it is also assumed that the sequence $\{q_n\}$ is mutually independent of the sequences $\{x_n\}$ and $\{\eta_n\}$. Thus, the generalized system model given by (54) and (55) includes the effects of both cyclic and random system nonstationarities.
In the tracking analysis of adaptive algorithms, an important measure of performance is the steady-state tracking EMSE, given by (56), where $\tilde{v}_n$ is the weight-error vector for the tracking scenario, defined in (57). Using (53), (55) and (57), the recursion (58) is obtained, where $c_n$ is defined in (59). Now, let us define the a priori estimation error, $e_{an} = x_n^H \tilde{v}_n$, and the a posteriori estimation error, $e_{pn} = x_n^H (\tilde{v}_{n+1} - c_n e^{j\Omega n})$. It is then easy to show that the estimation error and the a priori error are related via $e_n = e_{an} + \eta_n$. Also, from (26) with $A = I$, the a posteriori error is expressed in terms of the a priori error as in (60), where $\bar{\mu}_n = 1/\|x_n\|^2$. Substituting (60) into (58) results in the following update relation:
$$\tilde{v}_{n+1} = \tilde{v}_n - \bar{\mu}_n x_n^* (e_{an} - e_{pn}) + c_n e^{j\Omega n}. \qquad (61)$$
By evaluating the energies of both sides of the above equation (taking into account that $\bar{\mu}_n \|x_n\|^2 = 1$), the following relation is obtained:
$$\|\tilde{v}_{n+1} - c_n e^{j\Omega n}\|^2 + \bar{\mu}_n |e_{an}|^2 = \|\tilde{v}_n\|^2 + \bar{\mu}_n |e_{pn}|^2. \qquad (62)$$
It can be seen that if Ω = 0 (i.e., no frequency offset between the transmitter and the receiver), the above equation reduces to the basic fundamental energy conservation relation.
The energy relation (62) will be used to evaluate the excess mean-square error at steady state. Before starting the analysis, the following assumption is stated:
A.8 In steady state, the weight-error vector $\tilde{v}_n$ takes the generic form $z_n e^{j\Omega n}$, with the stationary random process $z_n$ independent of the frequency offset $\Omega$.
Using (60), assumption A.8, and the fact that at steady state $E[\|\tilde{v}_{n+1}\|^2] = E[\|\tilde{v}_n\|^2]$, taking the expectation of both sides of (62) yields relation (63), which can be solved for the steady-state EMSE. To find the value of $z = E[z_n]$, (58) is multiplied by the term $e^{-j\Omega n}$ and the expectation of both sides is taken, which yields the steady-state value of $z$ in terms of a quantity $g_o$. Ultimately, the steady-state tracking EMSE of the proposed algorithm, $\zeta_{tracking}$, is obtained from (63) and is given by (67). It can be seen from this result that the steady-state tracking EMSE of the NLMS algorithm [28] and that of the NLMF algorithm [29] can be recovered by substituting $\alpha_n = 1$ and $\alpha_n = 0$, respectively, in (67).
For a white Gaussian input signal, the autocorrelation matrix of the input signal is $R = \sigma_x^2 I$, and (67) simplifies accordingly.

Simulation results
The performance of the proposed algorithm, the VPNMN LMS-LMF, is assessed in different scenarios. Experiments are carried out where an unknown system is to be identified under noisy conditions. The unknown system is a non-minimum phase channel. The input signal to both the unknown system and the adaptive filter is obtained by passing a zero-mean white Gaussian sequence through a channel that is used to vary the eigenvalue spread of the autocorrelation matrix of the input signal. The example considered for the sequence $\{x_n\}$ has an eigenvalue spread of 68.9. The additive noise, $\eta_n$, is zero-mean. The signal-to-noise ratio (SNR) is set to 20 dB, and the performance measure considered is the normalized weight error norm $10 \log_{10}\left(\|w_n - w_{opt}\|^2 / \|w_{opt}\|^2\right)$. Results are obtained by averaging over 500 independent runs. The proposed algorithm is implemented with the parameters $\delta = 0.97$, $\beta = 0.98$, $\gamma = 10^{-2}$, $\alpha_0 = 0.8$ and $p_0 = 0$. In what follows, different aspects of the performance are considered.
Figure 1 compares the fastest convergence characteristics of the proposed algorithm and the NLMS algorithm. It can be seen from this figure that the proposed algorithm converges as fast as the NLMS algorithm but results in a lower weight mismatch. An improvement of 25 dB is obtained through the use of the proposed algorithm. Also, as shown in Figure 2, the proposed algorithm outperforms the NLMS algorithm at the lowest steady-state error reached by the latter, thanks to its built-in gear-shifting mechanism, which gives it an extra degree of freedom in this region. The fast convergence obtained by the proposed algorithm can be justified by the fact that, when far from the optimum solution, this algorithm exhibits faster convergence than the NLMS algorithm by automatically increasing the step size (gear-shifting property).
Figure 3 summarizes the performance of the proposed VPNMN algorithm in the three different noise environments with an SNR of 20 dB when the input signal is white. As can be seen from this figure, the best performance is obtained when the noise statistics are uniform, while the worst performance is obtained when the noise statistics are Laplacian.
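The performance measure used in these experiments, the normalized weight error norm in dB, can be computed as in the following minimal sketch. The function name and the example vectors are illustrative assumptions; only the formula 10 log10(‖w_n - w_opt‖² / ‖w_opt‖²) comes from the text.

```python
import math


def normalized_weight_error_db(w, w_opt):
    """Normalized weight error norm, 10*log10(||w - w_opt||^2 / ||w_opt||^2)."""
    num = sum((a - b) ** 2 for a, b in zip(w, w_opt))
    den = sum(b * b for b in w_opt)
    if num == 0.0:
        return float("-inf")      # perfect identification
    return 10.0 * math.log10(num / den)


print(normalized_weight_error_db([0.0, 0.0], [1.0, 1.0]))
print(normalized_weight_error_db([0.9, 1.0], [1.0, 1.0]))
```

An all-zero initial weight vector gives 0 dB by construction, so learning curves plotted with this measure start at 0 dB and decrease as the filter converges.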

Convergence behavior
Similarly, Figure 4 depicts the results for the proposed VPNMN algorithm when the input signal is highly correlated. As can be seen from this figure, almost equal performance is obtained by the VPNMN algorithm for the different noise statistics.
In order to verify the stability bound on the step size given in (48), we investigate it in a Gaussian environment with an SNR of 20 dB. Here, we choose a misadjustment of five, which results in the Cramer-Rao bound $C \le 0.05$. Thus, choosing $\mathrm{tr}(R) = 5$, the upper bound given in (48) is found to be 0.95. It is observed from the various simulations performed that the NCLMF algorithm is stable while $\mu$ is less than 1.0, which validates the derived stability bound.
Finally, from the viewpoint of computational load, the proposed algorithm requires an additional seven multiplications and three additions when compared to the fixed mixed-norm algorithm defined by (4), and only an additional eleven multiplications and six additions when compared to the NLMS algorithm. The small computational overhead of the proposed algorithm is therefore well worth the reduction in steady-state error it brings about.
Figure 5 depicts the time evolution of the MSE obtained from both the theoretical analysis, the second entry of (44), and the simulations. Excellent agreement between theory and simulation results is obtained; hence, the proposed VPNMN algorithm performs consistently.

Results for tracking
For tracking, the simulations are carried out for a system identification problem, where the unknown system, having an FIR model, is given by $[1.0119 - j0.7589,\ -0.3796 + j0.5059]^T$, while the system characteristics are time-varying according to the system model (54) and (55). The input signal $x_n$ to both the unknown system and the adaptive filter is a zero-mean white Gaussian sequence. The signal-to-noise ratio is set to 30 dB. Two values are considered for $\mathrm{tr}\{Q_n\}$: a very small value of $\mathrm{tr}\{Q_n\} = 10^{-7}$, and a very large one of $\mathrm{tr}\{Q_n\} = 10^{-2}$.
Figure 6 compares the theory with the simulation results for three different values of $\Omega$, i.e., $\Omega$ = 0.001, 0.002, and 0.003. As can be seen from this figure, close agreement between theory and simulation results is obtained. Furthermore, it is observed from this figure that the performance degrades as the frequency offset $\Omega$ increases and that, unlike the stationary case, the steady-state EMSE is not a monotonically increasing function of the step size $\mu$; that is, the steady-state EMSE can be smaller at larger values of the step size $\mu$.
Figure 6 is obtained for the case $\mathrm{tr}\{Q_n\} = 10^{-7}$, which represents a small value. When this value is increased to $10^{-2}$, the results depicted in Figure 7 for three larger values of $\Omega$, i.e., 0.01, 0.02, and 0.03, show that the previously stated observations still hold.
Finally, the consistency in the performance of the steady-state EMSE of the proposed algorithm is observed in both cases (two different values of tr{Q n }) and different values of Ω.

Noise cancelation using VPNMN algorithm
In this example, we study the performance of the VPNMN algorithm in a noise cancelation application. A pure sinusoidal noise generated by the process $u_n = 0.8 \sin(\omega n + 0.5\pi)$, with $\omega = 0.1\pi$, is to be removed from a square wave generated by $s_n = 2 \times ((\mathrm{mod}(n, 1000) < 1000/2) - 0.5)$, where $\mathrm{mod}(n, 1000)$ computes the remainder of $n$ divided by 1,000. Summing $u_n$ and $s_n$ gives the reference signal to the adaptive filter. The input to the adaptive filter is a sinusoidal signal generated by $x_n = \sqrt{2} \sin(\omega n)$ with $\omega = 0.1\pi$. The resulting output error signal $e_n$ will, in time, converge to the desired signal, which will be noiseless. Figure 8 depicts the reference response and the results processed by the VPNMN algorithm and the NLMS algorithm. It is clear that both algorithms are able to remove the noise component, but the VPNMN algorithm exhibits better noise cancelation capabilities than the NLMS algorithm.
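The signals of this experiment can be generated as in the following minimal sketch, which only transcribes the formulas given above for $s_n$, $u_n$, and $x_n$. The function name `make_signals`, the returned tuple layout, and the number of samples are assumptions made for illustration; the adaptive canceller itself is not included.

```python
import math


def make_signals(n_samples, omega=0.1 * math.pi):
    """Generate the square wave, the sinusoidal noise, their sum, and the filter input."""
    # Square wave of period 1000 samples taking the values +1 and -1.
    s = [2.0 * ((1.0 if (n % 1000) < 500 else 0.0) - 0.5) for n in range(n_samples)]
    # Sinusoidal noise to be cancelled.
    u = [0.8 * math.sin(omega * n + 0.5 * math.pi) for n in range(n_samples)]
    # Reference signal of the adaptive filter: square wave plus noise.
    d = [sn + un for sn, un in zip(s, u)]
    # Input to the adaptive filter: a sinusoid at the same frequency as the noise.
    x = [math.sqrt(2.0) * math.sin(omega * n) for n in range(n_samples)]
    return s, u, d, x


s, u, d, x = make_signals(2000)
print(s[0], s[500], u[0], d[0], x[0])
```

Because the filter input is a sinusoid at the noise frequency with a different phase and amplitude, a short adaptive filter can in principle synthesize $u_n$ from it, which leaves the square wave in the error signal.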

Conclusion
In this study, a normalized VPNMN algorithm is proposed in which the LMS and the LMF algorithms are combined using the concept of variable step-size LMS adaptation. It is found that the proposed algorithm has the fast convergence property of the NLMS algorithm while resulting in a lower steady-state error, thereby eliminating the conflict between these two requirements, i.e., fast convergence and low steady-state error. Moreover, the consistency of the performance of the proposed algorithm has been confirmed by the simulation results reported here.
The analytical results for the tracking steady-state EMSE are derived for the proposed algorithm in the presence of both random and cyclic nonstationarities. The results show that, unlike in the stationary case, the steady-state EMSE is not a monotonically increasing function of the step size $\mu$, while the ability of the algorithm to track variations in the environment degrades as the frequency offset $\Omega$ increases.
Finally, the VPNMN algorithm proved its usefulness in a noise cancelation scenario where it showed its superiority over the NLMS algorithm.