EURASIP Journal on Applied Signal Processing 2002:1, 21–29 © 2002 Hindawi Publishing Corporation

Nonlinear Effects of the LMS Adaptive Predictor for Chirped Input Signals

This paper investigates the nonlinear effects of the Least Mean Square (LMS) adaptive predictor. Traditional analysis of the adaptive filter ignores the statistical dependence among successive tap-input vectors and bounds the performance of the adaptive filter by that of the finite-length Wiener filter. It is shown that the nonlinear effects make it possible for an adaptive transversal prediction filter to significantly outperform the finite-length Wiener predictor. An approach is derived to approximate the total steady-state Mean Square Error (MSE) for LMS adaptive predictors with stationary or chirped input signals. This approach shows that, while the nonlinear effect is small for the one-step LMS adaptive predictor, it increases in magnitude as the prediction distance is increased. We also show that the nonlinear effect of the LMS adaptive predictor is more significant than that of the Recursive Least Squares (RLS) adaptive predictor.


INTRODUCTION
The Least Mean Square (LMS) adaptive filter is widely used in many applications partly due to the simplicity of its implementation [1]. The simplicity belies the fact that the adaptive LMS filter is a complex nonlinear estimator [2,3,4,5,6,7,8,9]. Traditional analysis of adaptive filter performance is restricted to a statistical analysis of the LMS algorithm under a set of independence assumptions that ignore the statistical dependence among successive tap-input vectors [1]. The Mean Square Error (MSE) of the LMS adaptive filter using these assumptions is bounded by that of the corresponding finite-length Wiener filter, and the MSE of the adaptive filter increases monotonically as a function of the adaptation step-size. While simulations show that this simplified analysis predicts the performance reasonably well in many applications for small step-size, it was shown that in some applications there is a large discrepancy between the simulation results and what the independence analysis predicts [2,3,4,5,6,7,8,9]. The reason for the discrepancy is that these well-known assumptions mask the nonlinear effects that arise in LMS adaptive filters. It has been shown that it is possible for the LMS adaptive filter to outperform the finite-length Wiener filter in MSE for the cases of adaptive channel equalization for sinusoidal and first-order autoregressive process (AR1) interference suppression [9], and adaptive noise cancellation for narrowband AR1 signals when the primary and reference signals have slightly different frequencies [7]. An error transfer function approach is also derived in [9] to give an approximate expression for the total steady-state MSE of the LMS adaptive channel equalizer.
In this paper, the nonlinear effects in a third application of adaptive filters, adaptive prediction, are studied. The class of input signals considered for adaptive prediction consists of stationary and chirped narrowband signals with varying chirp rates and bandwidths. This class of signals has been used to represent a signal whose spectrum is frequency offset and shifted with time in a nonstationary mobile communications environment [10,11]. These signals differ from those considered in [2,3,4,5,6,7,8,9] because they have a time-varying Power Spectral Density (PSD). Since they do not have a fixed PSD, the error transfer function approach [9] is not directly applicable. However, since the chirped signal has a constant spectral shifting rate, this special class of nonstationary inputs can be analyzed as stationary inputs by means of an unchirping transform defined below. It is proven in this paper that the MSE of the standard LMS adaptive predictor with a chirped input signal is equal to the MSE of a transformed LMS adaptive predictor with the corresponding stationary input signal. An error transfer function approach is derived for the transformed LMS algorithm with stationary input signals so as to approximate the MSE of chirped signal prediction. To bound the performance of the LMS adaptive predictor, the MSE of the optimal estimator (the infinite-length one-step causal Wiener predictor) is calculated.
To compare the magnitude of nonlinear effects of the LMS and RLS adaptive predictors, the error feedback transfer function is also derived for the RLS algorithm. By comparing the contributions of past errors to the current estimates in the two algorithms, it is shown that the LMS algorithm uses information from past prediction errors more effectively than the RLS algorithm.

BACKGROUND
The adaptive predictor application considered is the adaptive recovery of narrowband signals embedded in Additive White Gaussian Noise (AWGN). The narrowband input signal is modeled as an AR1 process. It is shown in [11] that the AR1 process provides a reasonable approximation to a BPSK communication signal. The AR1 process satisfies the recursive equation

$$ s_n = a\, s_{n-1} + \nu_n, \tag{1} $$

where $a$ is the pole location, $\nu_n$ is a white noise process with variance $\sigma_\nu^2 = (1 - |a|^2) P_s$, and $P_s$ is the power of the AR1 process. The corresponding chirped AR1 signal $s^c_n$, where superscript $c$ denotes the chirped signal, has the following form [11]:

$$ s^c_n = a\, \Omega\, \Psi^{\,n-1/2}\, s^c_{n-1} + \nu^c_n, \tag{2} $$

where $\Omega = e^{j\omega_0}$, $\omega_0$ defines the initial center frequency of the spectrum, $\Psi = e^{j\psi}$, $\psi$ is the chirp rate which linearly shifts the center frequency with time, and $\nu^c_n$ is a white noise process with the same statistics as $\nu_n$. Equivalently, $s^c_n = \Omega^{n} \Psi^{n^2/2} s_n$. This chirped AR1 signal can be used to represent a signal whose spectrum is frequency offset and shifted with time in a nonstationary mobile communications environment. The chirped AR1 process and the chirped sinusoid have been used to study the tracking behavior of adaptive filters because they provide an input signal with a single constant nonstationary component [11,12,13,14,15]. Chirped signals are also used in conjunction with OFDM communications and radar systems to optimize power transmission over a wide bandwidth when the propagation medium is time-varying [10]. At the receiver the signal is given by

$$ x^c_n = s^c_n + n_n, \tag{3} $$

where $n_n$ is the AWGN process with power $P_n$. Figure 1 represents the linear $\Delta$-step adaptive predictor structure to be analyzed, where $W^c(n)$ are the adaptive filter weights. The weight update equation of the LMS algorithm is

$$ W^c(n+1) = W^c(n) + \mu\, e^c_n\, X^{c*}(n), \tag{4} $$

where $\mu$ is the step-size parameter of the adaptive algorithm, $X^c(n)$ is the adaptive filter input tap-vector at time $n$, and $*$ denotes the complex conjugate. For the $\Delta$-step predictor of length $M$,

$$ X^c(n) = \left[ x^c_{n-\Delta},\; x^c_{n-\Delta-1},\; \ldots,\; x^c_{n-\Delta-M+1} \right]^T. \tag{5} $$
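The signal model above can be sketched numerically. The following is a minimal illustration, not the authors' simulation code: it assumes circular complex Gaussian driving and measurement noise, builds the chirped signal by rotating a baseband AR1 sequence by $\Omega^n \Psi^{n^2/2}$, and uses the hypothetical function name `chirped_ar1_in_noise`.

```python
import numpy as np

def chirped_ar1_in_noise(n_samples, a, p_s, p_n, omega0, psi, seed=0):
    """Return (x_c, s_c, s): noisy chirped input, clean chirped AR1, baseband AR1."""
    rng = np.random.default_rng(seed)

    def cgauss(size, power):
        # Circular complex Gaussian samples with E|v|^2 = power (an assumption).
        return np.sqrt(power / 2) * (
            rng.standard_normal(size) + 1j * rng.standard_normal(size)
        )

    # Driving noise variance (1 - |a|^2) * P_s gives the AR1 process power P_s.
    nu = cgauss(n_samples, (1 - abs(a) ** 2) * p_s)
    s = np.empty(n_samples, dtype=complex)
    s[0] = cgauss(1, p_s)[0]  # start from the stationary distribution
    for n in range(1, n_samples):
        s[n] = a * s[n - 1] + nu[n]

    n_idx = np.arange(n_samples)
    # Omega^n * Psi^(n^2 / 2): frequency offset plus linear frequency drift.
    chirp = np.exp(1j * (omega0 * n_idx + psi * n_idx ** 2 / 2))
    s_c = chirp * s
    x_c = s_c + cgauss(n_samples, p_n)  # received signal: chirped AR1 plus AWGN
    return x_c, s_c, s
```

Because the rotation has unit magnitude, the chirped and baseband signals have the same envelope, which is the property the later unchirping argument relies on.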
The error update equation is given by

$$ e^c_{n+1} = x^c_{n+1} - W^{cT}(n+1)\, X^c(n+1). \tag{6} $$

The finite-length Wiener predictor weight and the corresponding MSE are given as

$$ W^c_0(n) = \left[ R^c(n) \right]^{-1} P^c(n), \qquad J^c_w(n) = P_s + P_n - P^{cH}(n) \left[ R^c(n) \right]^{-1} P^c(n), \tag{7} $$

where $R^c(n) = E[X^{c*}(n) X^{cT}(n)]$ is the autocorrelation matrix of the input signal vector, $P^c(n) = E[X^{c*}(n)\, x^c_n]$ is the cross-correlation of the input signal vector with the desired response, and $J^c_w(n)$ is the MSE of the finite-length Wiener predictor. Setting $\omega_0 = 0$ and $\psi = 0$ gives the corresponding stationary baseband quantities: $R$ is the autocorrelation matrix of the stationary baseband input signal $x_n$, $P$ is the cross-correlation vector, the Wiener predictor is $W_0 = R^{-1} P$, and the Wiener MSE is $J_w$. It has been shown [11] that $J^c_w(n)$ of the Wiener predictor for the chirped input signal $x^c_n$ is equal to $J_w$ of the Wiener predictor for the corresponding stationary baseband input signal $x_n$.
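For the stationary baseband case, the Wiener weights and MSE can be computed directly from the AR1-plus-noise autocorrelation $r_x(k) = P_s a^{|k|} + P_n \delta(k)$. The sketch below assumes a real pole $a$ (so that $R$ is symmetric and $P^H = P^T$); the function name `wiener_predictor` is illustrative.

```python
import numpy as np

def wiener_predictor(a, p_s, p_n, m, delta):
    """Finite-length delta-step Wiener predictor for an AR1 signal in white noise.

    Assumes a real pole a. Returns (w0, j_w) with w0 = R^{-1} P and
    j_w = P_s + P_n - P^T R^{-1} P.
    """
    k = np.arange(m)
    lags = np.abs(k[:, None] - k[None, :])
    # R[i, j] = r_x(i - j): AR1 correlation plus white noise on the diagonal.
    R = p_s * a ** lags + p_n * np.eye(m)
    # P[i] = r_x(delta + i): the noise does not contribute at nonzero lags.
    P = p_s * a ** (delta + k)
    w0 = np.linalg.solve(R, P)
    j_w = p_s + p_n - P @ w0
    return w0, float(j_w)
```

For a single tap and one-step prediction this reduces to $J_w = P_s + P_n - (P_s a)^2 / (P_s + P_n)$, which gives a quick sanity check of the matrix form.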

THE LMS PREDICTOR FOR CHIRPED INPUT SIGNALS
The error transfer function approach derived in [9] provides a method to approximate the total steady-state MSE of the LMS adaptive filter without explicitly invoking the independence assumptions for wide-sense stationary input signals, that is, signals with a fixed PSD. For a chirped input signal $x^c_n$, the PSD shifts continuously with time, and this approach is not directly applicable. However, the adaptive recovery of a narrowband chirped signal using a $\Delta$-step transversal predictor has one important characteristic: the frequency offset between adjacent input signal taps is the chirp rate $\psi$, and the frequency offset between the desired response $x^c_n$ and the input signal vector $X^c(n)$ is $\Delta \times \psi$. By multiplying the chirped input signal by a negative frequency offset sequence, we can transform the chirped signal $s^c_n$ to its stationary form $s_n$ while leaving the statistics of the noise component $n_n$ unchanged, since the AWGN has a constant spectral envelope across all frequencies. In the following, it is shown that this transform does not change the MSE of the LMS adaptive predictor for a chirped input signal. This allows the error transfer function approach to be applied to the rotated LMS algorithm with the transformed input signals in order to approximate the MSE of the standard LMS adaptive predictor with chirped input signals.

Equivalence of MSEs
For a chirped input signal $x^c_n = s^c_n + n_n$, $n = 0, 1, 2, \ldots$, where $s^c_n$ has initial center frequency $\omega_0$ and chirp rate $\psi$, we define a transformed process

$$ x^u_n = \Omega^{-n}\, \Psi^{-n^2/2}\, x^c_n, \qquad n = 0, 1, 2, \ldots, \tag{8} $$

where superscript $u$ denotes the unchirped process. This operation transforms the chirped input signal into a stationary baseband signal, and it changes the formulation of the standard LMS algorithm in (4) and (6). Multiplying (6) by $\Omega^{-(n+1)} \Psi^{-(n+1)^2/2}$, and defining $e^u_n = \Omega^{-n} \Psi^{-n^2/2} e^c_n$, $n = 0, 1, 2, \ldots$ (the transformed version of the predictor error signal for the chirped input process using the LMS adaptive predictor), transforms (6) to

$$ e^u_{n+1} = x^u_{n+1} - W^{uT}(n+1)\, X^u(n+1). \tag{9} $$

In (9), the vector $W^u(n+1)$ with elements

$$ w^u_i(n+1) = \Omega^{-(\Delta+i)}\, \Psi^{-(n+1)(\Delta+i) + (\Delta+i)^2/2}\; w^c_i(n+1), \qquad i = 0, 1, \ldots, M-1, \tag{10} $$

is the corresponding predictor weight in the transformed domain, and

$$ X^u(n+1) = \left[ x^u_{n+1-\Delta},\; x^u_{n-\Delta},\; \ldots,\; x^u_{n+2-\Delta-M} \right]^T \tag{11} $$

is the transformed version of the chirped input signal vector $X^c(n+1)$. Applying (8) to the vector elements in $X^c(n+1)$ results in stationary baseband signals.

Using (10) and (11), (4) can be shown to become

$$ W^u(n+1) = V^*_\Delta \left[ W^u(n) + \mu\, e^u_n\, X^{u*}(n) \right], \tag{12} $$

where

$$ V_\Delta = \mathrm{diag}\left( \Psi^{\Delta},\; \Psi^{\Delta+1},\; \ldots,\; \Psi^{\Delta+M-1} \right) \tag{13} $$

is the chirp rotation matrix. Since $e^u_n$ is the transformed version of $e^c_n$ and the transform has unit magnitude, they have the same power, that is,

$$ E\left[ |e^u_n|^2 \right] = E\left[ |e^c_n|^2 \right]. \tag{14} $$

Consequently, the MSE of the LMS adaptive predictor with a chirped input signal $x^c_n$ is equal to the MSE of a different LMS adaptive predictor with a corresponding stationary baseband input signal $x^u_n$. Note that the two adaptive predictors have the same length $M$ and step-size $\mu$. Equations (9) and (12) define the error and weight vectors of the rotated LMS adaptive predictor. The only difference between these equations and the standard LMS adaptive predictor for stationary input signals as in (4) and (6) is that the weight vector is rotated in frequency by the chirp matrix $V_\Delta$ after each normal LMS update, as shown in Figures 2 and 3.
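The equivalence of the two error powers can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it runs the standard LMS predictor on a chirped input and a rotated LMS predictor (weights multiplied elementwise by $\Psi^{-(\Delta+i)}$ after each update) on the unchirped input, with both started from zero weights. The function names and the test signal parameters are hypothetical.

```python
import numpy as np

def lms_standard(x_c, m, delta, mu):
    """Standard LMS delta-step predictor; returns the error sequence e^c_n."""
    w = np.zeros(m, dtype=complex)
    e = np.zeros(len(x_c), dtype=complex)
    taps = delta + np.arange(m)            # tap delays Delta, ..., Delta+M-1
    for n in range(delta + m - 1, len(x_c)):
        x_vec = x_c[n - taps]
        e[n] = x_c[n] - w @ x_vec          # W^T X, no conjugation
        w = w + mu * e[n] * np.conj(x_vec)
    return e

def lms_rotated(x_u, m, delta, mu, psi):
    """Rotated LMS predictor on the unchirped input; returns e^u_n."""
    w = np.zeros(m, dtype=complex)
    e = np.zeros(len(x_u), dtype=complex)
    taps = delta + np.arange(m)
    rot = np.exp(-1j * psi * taps)         # diagonal of the conjugated rotation matrix
    for n in range(delta + m - 1, len(x_u)):
        x_vec = x_u[n - taps]
        e[n] = x_u[n] - w @ x_vec
        w = rot * (w + mu * e[n] * np.conj(x_vec))  # LMS update, then rotation
    return e

# Illustrative chirped AR1 input in complex white noise (assumed parameters).
rng = np.random.default_rng(0)
n_samples, a, p_n = 2000, 0.99, 0.1
omega0, psi = 0.2 * np.pi, 5e-4 * np.pi
nu = np.sqrt((1 - a ** 2) / 2) * (
    rng.standard_normal(n_samples) + 1j * rng.standard_normal(n_samples)
)
s = np.zeros(n_samples, dtype=complex)
for n in range(1, n_samples):
    s[n] = a * s[n - 1] + nu[n]
noise = np.sqrt(p_n / 2) * (
    rng.standard_normal(n_samples) + 1j * rng.standard_normal(n_samples)
)
n_idx = np.arange(n_samples)
chirp = np.exp(1j * (omega0 * n_idx + psi * n_idx ** 2 / 2))
x_c = chirp * s + noise
x_u = np.conj(chirp) * x_c                 # unchirping transform

e_c = lms_standard(x_c, m=4, delta=3, mu=0.01)
e_u = lms_rotated(x_u, m=4, delta=3, mu=0.01, psi=psi)
max_diff = float(np.max(np.abs(np.abs(e_c) - np.abs(e_u))))
```

Up to floating-point rounding, the two error magnitudes agree sample by sample, so the equality holds for every realization and not only in the mean.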

Error transfer function approach for the rotated LMS adaptive predictor
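First, we decompose the rotated LMS adaptive predictor weight into the sum of a time-invariant finite-length Wiener predictor weight and a time-varying misadjustment component,

$$ W^u(n) = W_0 + W^u_{mis}(n), \qquad \bar{W}^u_{mis} = E\left[ W^u_{mis}(n) \right], $$

where $\bar{W}^u_{mis}$ is the mean weight misadjustment corresponding to the weight fluctuation caused by weight rotation. From (12), the weight misadjustment is given by

$$ W^u_{mis}(n+1) = V^*_\Delta \left[ I - \mu X^{u*}(n) X^{uT}(n) \right] W^u_{mis}(n) + \mu V^*_\Delta\, e^o_n\, X^{u*}(n) + \left( V^*_\Delta - I \right) W_0, $$

where $I$ is the identity matrix and $e^o_n = x^u_n - W_0^T X^u(n)$ is the error of the finite-length Wiener predictor. The mean weight misadjustment is

$$ \bar{W}^u_{mis} = \left[ I - V^*_\Delta \left( I - \mu R \right) \right]^{-1} \left( V^*_\Delta - I \right) W_0 $$

when $n \to \infty$, that is, when the adaptive filter reaches steady state, provided that $W^u_{mis}(n)$ is independent of $X^{u*}(n) X^{uT}(n)$; it is not necessary for $W^u_{mis}(n)$ to be independent of $X^u(n)$. This steady-state mean weight misadjustment term corresponds to the lag weight misadjustment of the LMS adaptive predictor with a chirped input process as shown in [12].

The recursive weight update equation (12) can be written as

$$ W^u(n) = V^{*n}_\Delta W^u(0) + \mu \sum_{i=0}^{n-1} V^{*(n-i)}_\Delta\, e^u_i\, X^{u*}(i). \tag{20} $$

The adaptive filter output is

$$ W^{uT}(n) X^u(n) = \left[ V^{*n}_\Delta W^u(0) \right]^T X^u(n) + \mu \sum_{i=0}^{n-1} e^u_i\, X^{uH}(i)\, V^{*(n-i)}_\Delta\, X^u(n). \tag{21} $$

At steady state, $V^{*n}_\Delta W^u(0)$ can be replaced with $W_0 + \bar{W}^u_{mis}$, thus the error process $e^u_n$ satisfies the recursive difference equation

$$ e^u_n + \mu \sum_{i=0}^{n-1} e^u_i\, X^{uH}(i)\, V^{*(n-i)}_\Delta\, X^u(n) = x^u_n - \left[ W_0 + \bar{W}^u_{mis} \right]^T X^u(n). \tag{22} $$

Using the approximations [9]

$$ X^{uH}(n-k)\, V^{*k}_\Delta\, X^u(n) \approx E\left[ X^{uH}(n-k)\, V^{*k}_\Delta\, X^u(n) \right] = r^u_x(k) \sum_{m=0}^{M-1} \Psi^{-k(\Delta+m)}, $$

where $r^u_x(k)$ is the autocorrelation of the stationary input signal $x^u_n$, we have

$$ e^u_n + \mu \sum_{k=1}^{\infty} r^c_x(k)\, e^u_{n-k} \approx x^u_n - \left[ W_0 + \bar{W}^u_{mis} \right]^T X^u(n), \tag{26} $$

where $r^c_x(k) = r^u_x(k) \sum_{m=0}^{M-1} \Psi^{-k(\Delta+m)}$. We can interpret the steady-state ($n \to \infty$) rotated LMS adaptive predictor error $e^u_n$ as the output of a time-invariant linear system with transfer function $H(z)$ driven by the wide-sense stationary error process $x^u_n - [W_0 + \bar{W}^u_{mis}]^T X^u(n)$, where $H(z)$ is given by

$$ H(z) = \frac{1}{1 + \mu R(z)}, \tag{27} $$

$$ R(z) = \sum_{k=1}^{\infty} r^c_x(k)\, z^{-k}. \tag{28} $$

The steady-state MSE of the rotated LMS adaptive predictor is thus

$$ J_{LMS} \approx \frac{1}{2\pi} \int_{-\pi}^{\pi} \left| H(e^{j\omega}) \right|^2 \left| 1 - W_0(e^{j\omega}) - W_{mis}(e^{j\omega}) \right|^2 S^u_{xx}(e^{j\omega})\, d\omega, \tag{29} $$

where

$$ W_0(z) = \sum_{i=0}^{M-1} w_{0,i}\, z^{-(\Delta+i)}, \qquad W_{mis}(z) = \sum_{i=0}^{M-1} \bar{w}^u_{mis,i}\, z^{-(\Delta+i)} $$

are the transfer functions of the finite-length Wiener predictor and the mean weight misadjustment of the rotated LMS adaptive predictor, respectively, and $S^u_{xx}(z)$ is the PSD of the stationary input process $x^u_n$ transformed from the chirped input signal $x^c_n$. The error transfer function approach can also be applied to the Normalized LMS (NLMS) algorithm of [9], with

$$ H(z) = \frac{1}{1 + \mu R(z) / (P_s + P_n)}. \tag{32} $$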
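As an illustration of how the error transfer function approach is evaluated, the sketch below computes the steady-state MSE approximation on a frequency grid for the stationary case ($\psi = 0$, so the mean weight misadjustment vanishes). It assumes a real AR1 pole, the feedback function $R(z) = M \sum_{k \ge 1} r_x(k) z^{-k}$, and a step-size small enough that $|(1 - \mu M P_s) a| < 1$; the function name is hypothetical.

```python
import numpy as np

def lms_mse_transfer_function(a, p_s, p_n, m, delta, mu, n_grid=1 << 16):
    """Transfer-function MSE approximation for a stationary AR1 input in noise.

    Returns (j_lms, j_w): the approximate LMS steady-state MSE and the
    finite-length Wiener MSE. Assumes psi = 0 and a real pole a.
    """
    k = np.arange(m)
    R = p_s * a ** np.abs(k[:, None] - k[None, :]) + p_n * np.eye(m)
    P = p_s * a ** (delta + k)
    w0 = np.linalg.solve(R, P)
    j_w = p_s + p_n - P @ w0

    omega = 2 * np.pi * np.arange(n_grid) / n_grid
    z_inv = np.exp(-1j * omega)
    # PSD of the AR1 process plus white noise.
    s_xx = p_s * (1 - a ** 2) / np.abs(1 - a * z_inv) ** 2 + p_n
    # Wiener prediction-error filter 1 - sum_i w0[i] z^-(delta+i).
    pef = 1 - np.exp(-1j * np.outer(omega, delta + k)) @ w0
    # Error feedback: R(z) = M * P_s * a z^-1 / (1 - a z^-1).
    r_z = m * p_s * a * z_inv / (1 - a * z_inv)
    h = 1 / (1 + mu * r_z)
    # Mean over a uniform grid approximates (1 / 2 pi) times the integral.
    j_lms = np.mean(np.abs(h) ** 2 * np.abs(pef) ** 2 * s_xx)
    return float(j_lms), float(j_w)
```

With $\mu = 0$ the feedback filter is $H(z) = 1$ and the integral reduces to the Wiener MSE, which provides a consistency check on the frequency-domain evaluation.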

BOUND OF THE ∆-STEP ADAPTIVE PREDICTOR
Using the recursive equations (4) and (6), it follows from [8] that the LMS adaptive predictor is a nonlinear estimator of $x^c_n$ based on the entire past of the input,

$$ \hat{x}^c_n = C_{LMS}\left( x^c_{n-1},\, x^c_{n-2},\, \ldots \right). $$

The optimal MSE estimator $C_{opt}$ using the same data as the adaptive predictor is given by

$$ C_{opt} = E\left[ x^c_n \mid x^c_{n-1},\, x^c_{n-2},\, \ldots \right]. $$

For a wide-sense stationary input process, the performance of $\Delta$-step prediction is bounded by that of the optimal MSE estimator, which is the one-step infinite-length Wiener predictor. The optimal estimator is independent of the prediction distance $\Delta$.

Since the finite-length Wiener predictor is not recursive, it can be written as

$$ \hat{x}^c_n = W^{cT}_0(n)\, X^c(n) = \sum_{i=0}^{M-1} w^c_{0,i}(n)\, x^c_{n-\Delta-i}, $$

which uses only the $M$ samples in the tap vector. To illustrate that the nonlinear effect is small for the one-step LMS adaptive predictor, but increases in magnitude as the prediction distance $\Delta$ is increased, Figures 4 and 5 delineate the data utilized by the adaptive predictor, the finite-length Wiener predictor, and the optimal estimator for one-step and $\Delta$-step prediction ($\Delta > 1$). Figure 4 shows that for one-step prediction of $x^c_n$, the data available to the adaptive predictor and the optimal estimator but not available to the finite-length Wiener predictor is defined by the sequence $\{x^c_i\}_{i=n-M-1}^{-\infty}$.

Figure 4: Information utilized by one-step adaptive predictor, finite-length Wiener predictor, and optimal estimator. The data segment marked by arrows is the information available to adaptive predictor and optimal estimator, but not available to finite-length Wiener predictor in the prediction of $x^c_n$.

With an increase in the prediction distance $\Delta$, more information is available to the adaptive predictor than to the finite-length Wiener predictor, and consequently the adaptive predictor may outperform the finite-length Wiener predictor. Conversely, the adaptive predictor performance is bounded by that of the one-step infinite-length Wiener predictor, since they utilize the same amount of information and there is misadjustment noise associated with the adaptive predictor. Note that Figures 4 and 5 are also applicable to RLS adaptive predictors, that is, for one-step and multiple-step predictions. The LMS and RLS adaptive predictors use information from the same input data, so any performance difference between these two algorithms must be explained by the difference in how they adjust the filter weights according to the feedback errors.
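For an AR1 signal in white noise, the MSE of the one-step infinite-length predictor can be obtained in closed form. The sketch below swaps in a technique not named in the paper: it uses the steady-state scalar Kalman (Riccati) recursion, which for Gaussian signals coincides with the infinite-length causal Wiener predictor. It assumes a real pole $a$; the function name is illustrative.

```python
import math

def optimal_one_step_mse(a, p_s, p_n):
    """MSE of the infinite-length one-step predictor of x_n = s_n + n_n.

    s_n is AR1 with real pole a and power p_s; n_n is white with power p_n.
    The steady-state prediction variance p of s_n solves
        p = a^2 * p * p_n / (p + p_n) + q,   q = (1 - a^2) * p_s,
    and the MSE of predicting the noisy sample x_n is p + p_n.
    """
    q = (1 - a * a) * p_s
    # Rearranged as p^2 + b p - q p_n = 0; take the non-negative root.
    b = p_n * (1 - a * a) - q
    p = 0.5 * (-b + math.sqrt(b * b + 4 * q * p_n))
    return p + p_n
```

Two limiting cases are easy to verify by hand: for $a = 0$ the signal is unpredictable and the MSE is $P_s + P_n$, and for $P_n = 0$ the MSE equals the driving noise variance $(1 - a^2) P_s$.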

THE COMPARATIVE PERFORMANCE OF THE LMS AND RLS ALGORITHMS
For simplicity, we compare the two adaptive algorithms only for stationary input signals $x_n$. The weight update equation of the exponentially weighted RLS adaptive algorithm is given by [1]

$$ W(n+1) = W(n) + \Phi^{-1}(n)\, X^*(n)\, e_n, $$

where $\Phi(n) = \sum_{i=0}^{n} \lambda^{n-i} X^*(i) X^T(i)$ is the input signal autocorrelation matrix estimate at time $n$, and $\lambda$ is the forgetting factor of the RLS algorithm. Decomposing the weight vector, writing the predictor error, and applying two steady-state approximations leads to an error feedback equation for the RLS adaptive predictor in which the past prediction errors $e_{n-j}$ enter through coefficients $cr_j$. To show the difference between the two algorithms in utilizing past prediction errors, the error feedback equation of the LMS algorithm (26) is rewritten for a stationary input signal as

$$ e_n + \sum_{j=1}^{\infty} cl_j\, e_{n-j} \approx x_n - W_0^T X(n), $$

where $cl_j = \mu M r_x(j)$. Figure 6 is a plot of $cl_j$ and $cr_j$, $j = 1, 2, \ldots, 50$, for a narrowband AR1 input signal embedded in AWGN, with AR1 pole location $a = 0.99$, SNR = 10 dB, and adaptive filter length $M = 25$. The LMS step-size is $\mu = 0.01$, and the RLS forgetting factor is $\lambda = 0.9$. The error feedback coefficients of the RLS adaptive predictor exhibit a null for small $j$, which means that the contributions from the most recent prediction errors to the current estimate at time index $n$ are nulled out. For the LMS adaptive predictor, the most recent prediction errors contribute more to the current estimate than the time-delayed prediction errors. Note that the choices of $\mu$ and $\lambda$ affect only the magnitudes, not the shapes, of the curves.
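The LMS feedback coefficients are simple to tabulate for an AR1 input, since $r_x(j) = P_s a^j$ for $j \ge 1$ (the white noise contributes only at lag zero). The sketch below assumes $P_s = 1$, which is not stated for Figure 6, and uses an illustrative function name.

```python
def lms_feedback_coefficients(mu, m, a, p_s, j_max):
    """Return [cl_1, ..., cl_jmax] with cl_j = mu * M * r_x(j) = mu * M * P_s * a**j."""
    return [mu * m * p_s * a ** j for j in range(1, j_max + 1)]

# Parameters quoted for Figure 6; p_s = 1.0 is an assumption.
cl = lms_feedback_coefficients(mu=0.01, m=25, a=0.99, p_s=1.0, j_max=50)
```

The coefficients decay geometrically from their largest value at $j = 1$, consistent with the observation that the most recent errors contribute most in the LMS predictor, and changing $\mu$ rescales the whole sequence without altering its shape.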

SIMULATIONS
For a chirped AR1 input, the autocorrelation of the input signal vectors $r^c_x(k)$ is given by

$$ r^c_x(k) = M P_s\, a^k\, \Psi^{-k(\Delta + (M-1)/2)}, \qquad k \ge 1, \tag{43} $$

where $a$ is the pole location of the transformed stationary baseband AR1 input signal. Equation (28) becomes

$$ R(z) = \frac{M P_s\, a_c\, z^{-1}}{1 - a_c z^{-1}}. \tag{44} $$

The feedback transfer function in (27) is thus

$$ H(z) = \frac{1 - a_c z^{-1}}{1 - g_c z^{-1}}, \tag{45} $$

where $a_c = a \Psi^{-(\Delta + (M-1)/2)}$ and $g_c = (1 - \mu M P_s)\, a_c$. The steady-state MSE approximation for the LMS adaptive predictor can be computed from (29) using (45). Similarly, the steady-state MSE approximation of the NLMS adaptive predictor can be calculated from (29), (32), and (43). By setting $\psi = 0$, the MSE approximation of the standard LMS adaptive predictor for a stationary input signal is obtained. In the following simulations for multiple-step prediction, NLMS adaptive predictors are used instead of standard LMS adaptive predictors because the NLMS algorithm is stable for relatively larger values of the adaptive filter step-size ($0 < \mu < 2$) [9], where the nonlinear effects of the adaptive algorithm are most significant. The MSEs of the finite-length Wiener predictor and the optimal estimator are calculated theoretically for AR1 input processes.

Figure 7 is a plot of MSEs for one-step LMS adaptive predictors as a function of filter step-size $\mu$ with a chirped AR1 input signal, where the signal initial frequency is $\omega_0 = 0.2\pi$, the chirp rate is $\psi = 5\pi \times 10^{-5}$, the AR1 process pole location is $a = 0.999$, the signal power is $P_s = 1$, SNR = 0 dB, and the filter length is $M = 2$. Simulation results and theoretical calculations using both the transfer function approach and the independence assumptions are plotted. These results are compared to the MSEs obtained for the finite-length Wiener predictor and the optimal estimator. It can be seen that in a small range of adaptive filter step-size parameters $\mu$, the MSE from the error transfer function approach and the simulation results are smaller than the MSE of the finite-length Wiener predictor.
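The closed form of the feedback transfer function follows from summing a geometric series. The sketch below checks numerically that $1 / (1 + \mu R(z))$, with $R(z) = \sum_{k \ge 1} M P_s a_c^k z^{-k}$, agrees with $(1 - a_c z^{-1}) / (1 - g_c z^{-1})$ at a point on the unit circle. The function names and the truncation length are illustrative assumptions.

```python
import cmath

def chirp_pole(a, psi, delta, m):
    """a_c = a * Psi^-(delta + (M - 1) / 2) with Psi = exp(j * psi)."""
    return a * cmath.exp(-1j * psi * (delta + (m - 1) / 2))

def h_closed_form(z, mu, m, p_s, a_c):
    """H(z) = (1 - a_c / z) / (1 - g_c / z) with g_c = (1 - mu * M * P_s) * a_c."""
    g_c = (1 - mu * m * p_s) * a_c
    return (1 - a_c / z) / (1 - g_c / z)

def h_from_series(z, mu, m, p_s, a_c, n_terms=2000):
    """H(z) = 1 / (1 + mu * R(z)), with R(z) summed term by term (truncated)."""
    r_z = sum(m * p_s * (a_c / z) ** k for k in range(1, n_terms + 1))
    return 1 / (1 + mu * r_z)

a_c = chirp_pole(a=0.9, psi=5e-4 * cmath.pi, delta=40, m=25)
z = cmath.exp(0.3j)
h1 = h_closed_form(z, mu=0.01, m=25, p_s=1.0, a_c=a_c)
h2 = h_from_series(z, mu=0.01, m=25, p_s=1.0, a_c=a_c)
```

The zero of $H(z)$ sits at the rotated signal pole $a_c$ and its pole at $g_c$, so increasing $\mu$ moves the pole toward the origin and widens the notch that the error feedback places around the signal band.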
Extensive simulations and analytical results show that for the one-step LMS adaptive predictor, the nonlinear effect is small and observable only for very small filter lengths and very narrowband input signals. One possible explanation of this phenomenon is that only under these conditions does the information in $\{x^c_i\}_{i=n-M-1}^{-\infty}$, which is available to the adaptive predictor but not to the finite-length Wiener predictor, contribute effectively to the prediction of the current signal $x^c_n$.

Figure 8 plots the MSEs of 40-step NLMS and RLS adaptive predictors for a stationary and a chirped input signal with chirp rate $\psi = 5\pi \times 10^{-4}$, signal pole location $a = 0.99$, input signal power $P_s = 1$, SNR = 20 dB, and $M = 25$. For NLMS predictors, the MSEs obtained by the error transfer function approach and the simulation results are compared for both stationary and chirped inputs. The simulation results of the RLS adaptive predictor for stationary inputs reveal that the nonlinear effects are negligible for RLS algorithms. Compared with Figure 7, the range of the adaptive filter step-size $\mu$ over which the NLMS adaptive predictors outperform the finite-length Wiener predictor is much larger, and the magnitude of the nonlinear effect is significant at the optimal step-size (in this case, the optimal step-size for the adaptive predictor to achieve minimum MSE is about $\mu = 0.8$). One possible explanation for this is that for multiple-step prediction, the additional data which is available to adaptive predictors but not available to the finite-length Wiener predictor consists of two parts, $\{x^c_i\}_{i=n-1}^{n-\Delta+1}$ and $\{x^c_i\}_{i=n-\Delta-M}^{-\infty}$, whereas for one-step prediction only the second part is available. The main contribution to the nonlinear effects comes from the first part. As the prediction distance $\Delta$ increases, the correlation between the desired response $x^c_n$ and the second part decreases, and the second part contributes less to the current estimate.
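A multiple-step NLMS simulation of the kind described above can be sketched as follows for the stationary case. This is an illustration under stated assumptions, not a reproduction of the reported experiments: it uses the common NLMS form normalized by the instantaneous tap energy $X^H(n) X(n)$ (the normalization used in [9] may differ), complex Gaussian signals, a real pole, and a short run length; all function names are hypothetical.

```python
import numpy as np

def nlms_predictor_mse(x, m, delta, mu, skip):
    """Time-averaged squared error of a delta-step NLMS predictor after `skip` samples."""
    w = np.zeros(m, dtype=complex)
    taps = delta + np.arange(m)
    sq_err = []
    for n in range(delta + m - 1, len(x)):
        x_vec = x[n - taps]
        e = x[n] - w @ x_vec
        # Assumed normalization: instantaneous tap energy plus a tiny regularizer.
        energy = np.vdot(x_vec, x_vec).real + 1e-12
        w = w + mu * e * np.conj(x_vec) / energy
        if n >= skip:
            sq_err.append(abs(e) ** 2)
    return float(np.mean(sq_err))

def wiener_mse(a, p_s, p_n, m, delta):
    """Finite-length Wiener MSE for an AR1 signal (real pole a) in white noise."""
    k = np.arange(m)
    R = p_s * a ** np.abs(k[:, None] - k[None, :]) + p_n * np.eye(m)
    P = p_s * a ** (delta + k)
    return float(p_s + p_n - P @ np.linalg.solve(R, P))

# Stationary case with the Figure 8 parameters: a = 0.99, SNR = 20 dB, M = 25.
rng = np.random.default_rng(0)
n_samples, a, p_s, p_n = 6000, 0.99, 1.0, 0.01
nu = np.sqrt((1 - a ** 2) * p_s / 2) * (
    rng.standard_normal(n_samples) + 1j * rng.standard_normal(n_samples)
)
s = np.zeros(n_samples, dtype=complex)
for n in range(1, n_samples):
    s[n] = a * s[n - 1] + nu[n]
x = s + np.sqrt(p_n / 2) * (
    rng.standard_normal(n_samples) + 1j * rng.standard_normal(n_samples)
)

mse_nlms = nlms_predictor_mse(x, m=25, delta=40, mu=0.8, skip=1000)
j_w = wiener_mse(a, p_s, p_n, m=25, delta=40)
```

Comparing `mse_nlms` with `j_w` over a sweep of `mu` gives a rough counterpart to the simulated curves; a single short run is noisy, so averaging over several seeds would be needed before drawing conclusions.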
Figure 9 compares the MSEs of the finite-length Wiener predictor and the optimal estimator with the MSEs obtained in simulations at the optimal step-size $\mu_{opt}$, as a function of the prediction distance $\Delta$. It shows that with the above parameters, the LMS adaptive filter outperforms the Wiener filter for $\Delta \ge 5$, and the nonlinear effect becomes more significant with increasing $\Delta$. Figure 10 is a plot of the various MSEs versus the input signal pole location $a$ for a 40-step predictor. It shows that the nonlinear effect is observable for pole locations from about 0.75 up to nearly 1. This range is also much larger than in the one-step prediction case.

CONCLUSIONS
In conclusion, this paper shows that for very narrowband input signals, either stationary or nonstationary, traditional analysis using the independence assumptions is not valid and the nonlinear effect of the adaptive filter must be considered. For narrowband input signals embedded in AWGN, the LMS adaptive predictor can outperform the finite-length Wiener predictor in steady-state MSE. These cases arise when the adaptive filter uses more information than the finite-length Wiener filter. It is shown that the nonlinear effect of one-step LMS adaptive predictors is small and observable only for a narrow range of input signal and adaptive filter parameters, whereas it is significant for multiple-step LMS adaptive predictors over a wide range of parameters. A transform is defined to convert the chirped input signal to a baseband stationary input signal, and an error transfer function approach is derived for chirped input signals to approximate the total steady-state MSE of the LMS adaptive predictors. The one-step infinite-length Wiener predictor is used as the optimal estimator to bound the performance of adaptive $\Delta$-step predictors. The nonlinear effects are much larger for the LMS adaptive predictor than for the exponentially weighted RLS predictor for the case examined.