An FIR Notch Filter for Adaptive Filtering of a Sinusoid in Correlated Noise

A novel adaptive FIR filter for the estimation of a single-tone sinusoid corrupted by additive noise is described. The filter is based on an offline optimization procedure which, for a given notch frequency, computes the filter coefficients such that the frequency response is unity at that frequency and a weighted noise gain is minimized. A set of such coefficients is obtained for notch frequencies chosen at regular intervals in a given range. The filter coefficients corresponding to any frequency in the range are computed using an interpolation scheme. An adaptation algorithm is developed so that the filter tracks the sinusoid of unknown frequency. The algorithm first estimates the frequency of the sinusoid and then updates the filter coefficients using this estimate. An application of the algorithm to beamforming is included for angle-of-arrival estimation. Simulation results are presented for a sinusoid in correlated noise, and compared with those for the adaptive IIR notch filter.


INTRODUCTION
Estimation of sinusoidal signals and their frequencies from noisy measurements is important in many fields, such as angle-of-arrival estimation, frequency-shift keying (FSK) demodulation, Doppler estimation of radar waveforms, biomedical engineering, sensor array processing, and cancellation of periodic interferences [1]. The observed signal has the following general form:

x(k) = A cos(kθ + φ) + q(k), (1)

where q(k) is additive noise. The problem in many applications is to recover the signal and/or its frequency θ from the noisy observations x(k). Various adaptive filtering algorithms have been introduced for solving such problems. The least-mean-square (LMS) algorithm [2], based on the FIR transversal filter, has been widely used due to its simplicity and robustness. On the other hand, the performance of this algorithm deteriorates when the input signal is correlated [3]. Transform-domain techniques have been introduced to decorrelate the input signal and achieve faster convergence [3, 4]. Also, in certain applications, the filter length required for satisfactory performance is large. The adaptive IIR filter, also known as the adaptive notch filter [5, 6, 7], has been introduced as an alternative to the LMS FIR filter. The IIR filter has the outstanding advantage of requiring considerably fewer coefficients than its FIR counterpart. However, the performance of the IIR notch filter in correlated noise has not been well studied in the literature. In [8], an adaptive IIR notch filter for suppressing narrow-band interference is described, where the filter's bandwidth is adaptively controlled to maximize the SNR.
In this paper, a notch filter based on the FIR structure is presented which has offline-optimized magnitude responses [9] at all frequencies in the range [0, π]. The magnitude response of the proposed filter is designed to minimize a criterion which reflects the noise-suppression performance and takes into account the power spectrum of the noise. It is assumed that the noise power is concentrated in a certain frequency range which can be estimated. The filter coefficients are then adapted to track the input signal by using filter coefficients stored at preselected frequencies in the range [0, π]. In this way, an adaptive FIR notch filter with frequency responses optimized to reject correlated noise is obtained. This approach resembles that of [10], which employs online constraints for waveform estimation in the frequency domain. It is shown that the proposed filter provides performance gains over the adaptive IIR notch filter in terms of signal and frequency estimation at frequencies outside the noise band.
The proposed adaptive filter is suitable for adaptive line enhancer applications where the noise is correlated and its power spectrum can be estimated (e.g., using periodogram techniques). It can also be successfully employed in adaptive beamforming applications [2], where interference in certain directions can be effectively suppressed.
The paper is organized as follows. In Section 2, the optimization of the FIR notch filter is described. In Section 3, the adaptation algorithm for tracking the input signal is obtained and stability analysis is performed in Section 4. Section 5 describes an example application of the algorithm to beamforming. Section 6 presents the simulation results for the proposed method and for the adaptive IIR filter.

OFFLINE-OPTIMIZED FIR NOTCH FILTER
Consider a predictive FIR filter of length N,

x̂_o(k + 1) = Σ_{n=0}^{N−1} h_{o,n} x_o(k − n), (2)

where x̂_o is the output of the filter and h_{o,n}, n = 0, …, N − 1, are the filter coefficients which will be optimized. In order that the filter can predict a sinusoidal signal x_o(k) = A cos(kθ_0 + φ) at frequency θ_0, the following equations must be satisfied:

Σ_{n=0}^{N−1} h_{o,n} cos(nθ_0) = cos θ_0,    Σ_{n=0}^{N−1} h_{o,n} sin(nθ_0) = −sin θ_0. (3)

Equation (3) can be written in matrix form as

A h_o = c, (4)

where

A = [ 1  cos θ_0  cos 2θ_0  ⋯  cos(N − 1)θ_0 ;  0  sin θ_0  sin 2θ_0  ⋯  sin(N − 1)θ_0 ], (5)

h_o = [h_{o,0} ⋯ h_{o,N−1}]^T, and c = [cos θ_0  −sin θ_0]^T. This filter can be optimized by minimizing a cost function which depends on the noise-suppression performance, subject to the constraints in (4). The frequency response of the filter in (2) is

H(e^{jθ}) = Σ_{n=0}^{N−1} h_{o,n} e^{−j(n+1)θ}. (6)

In order to suppress the correlated-noise frequency components and design the frequency response of the proposed filter, the following cost function is defined:

J_1 = Σ_{m=1}^{M} w_m |H(e^{jθ_m})|², (7)

where θ_m, m = 1, …, M, are M samples representing the frequency range of interest and w_m, m = 1, …, M, are the weights that can be used to shape the frequency response. Minimization of J_1 subject to the constraints in (4) can be achieved by using the method of Lagrange multipliers. Incorporating the constraint (4) in the cost function, we obtain

J_2 = J_1 + λ^T (A h_o − c), (8)

where λ = [λ_1  λ_2]^T is the vector of Lagrange multipliers. The minimization of J_2 with respect to h_o results in the following equations:

2 Σ_{i=0}^{N−1} r_{n,i} h_{o,i} + λ_1 a_{1,n} + λ_2 a_{2,n} = 0,  n = 0, …, N − 1,  with r_{n,i} = Σ_{m=1}^{M} w_m cos((n − i)θ_m). (9)

In (9), a_{1,j} and a_{2,j} are the elements of A. Equation (9) can be put into matrix form as

2 R h_o + A^T λ = 0, (10)

where R is the N × N matrix with elements r_{n,i}, so that

h_o = −(1/2) R^{−1} A^T λ. (11)

In order to solve for the multipliers, we substitute (11) in (4). Then the vector λ can be solved as

λ = −2 (A R^{−1} A^T)^{−1} c. (12)

Finally, the filter coefficient vector h_o can be obtained from (11) as

h_o = R^{−1} A^T (A R^{−1} A^T)^{−1} c. (13)
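The offline design above reduces to a small linear-algebra problem: build the weighted quadratic form of the noise gain, then solve the equality-constrained minimization in closed form h_o = R^{−1} A^T (A R^{−1} A^T)^{−1} c. A minimal numerical sketch; the grid size M, the uniform default weights, and N = 16 are illustrative choices, not taken from the paper:

```python
import numpy as np

def design_notch(theta0, N=16, M=512, weights=None):
    """Offline-optimized predictive FIR coefficients for notch frequency theta0.

    Minimizes the weighted noise gain J1 = sum_m w_m |H(exp(j*theta_m))|^2
    subject to the unity-gain prediction constraints A h = c, using the
    closed form h = R^{-1} A^T (A R^{-1} A^T)^{-1} c."""
    thetas = np.linspace(0.0, np.pi, M)          # frequency grid for J1
    w = np.ones(M) if weights is None else np.asarray(weights, float)
    n = np.arange(N)
    # R[i, j] = sum_m w_m cos((i - j) theta_m), the quadratic form of J1.
    diff = n[:, None] - n[None, :]
    R = np.cos(diff[..., None] * thetas).dot(w)
    # Prediction constraints at theta0 (unity gain, correct phase).
    A = np.vstack([np.cos(n * theta0), np.sin(n * theta0)])
    c = np.array([np.cos(theta0), -np.sin(theta0)])
    Ri_At = np.linalg.solve(R, A.T)
    return Ri_At.dot(np.linalg.solve(A.dot(Ri_At), c))

h = design_notch(np.pi / 4)
n = np.arange(len(h))
print(h.dot(np.cos(n * np.pi / 4)))   # ~ cos(pi/4): constraint satisfied
print(h.dot(np.sin(n * np.pi / 4)))   # ~ -sin(pi/4)
```

Raising w_m over the band where the noise power is concentrated drives the response down over that band, which is the shaping role the weights play in the design.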

THE ADAPTIVE FILTER
Consider an adaptive predictive FIR filter of the form

x̂(k + 1) = h^T(k) x_N(k), (14)

where x̂(k + 1) is the output of the filter, h(k) = [h_0(k) ⋯ h_{N−1}(k)]^T is the vector of the filter coefficients at time step k, and

x_N(k) = [x(k)  x(k − 1)  ⋯  x(k − N + 1)]^T (15)

is the observed data vector, where x(k) = x_o(k) + q(k). It is assumed that q(k) is a zero-mean Gaussian random process. The adaptive FIR filter will be designed such that it predicts a sinusoid x_o of any frequency θ_s by utilizing the optimum filter coefficients computed at a selected frequency θ_0, where θ_s is assumed to lie in a certain neighborhood of θ_0. The prediction error is defined as

e_p(k + 1) = x(k + 1) − x̂(k + 1). (16)

With the filter coefficient vector fixed at h_o, an error transfer function can be defined as

H_{e,o}(z) = 1 − Σ_{n=0}^{N−1} h_{o,n} z^{−(n+1)}. (17)

Note that (17) is a notch filter tuned at the frequency θ_0. Therefore, the first pair of zeros (z_{0,1}, z_{0,2}) of the polynomial in (17) corresponds to the frequency θ_0, that is, z_{0,1,2} = e^{±jθ_0}. Hence H_{e,o}(z) can be written as

H_{e,o}(z) = (1 − 2 cos θ_0 z^{−1} + z^{−2}) Π_{l=3}^{N} (1 − z_{0,l} z^{−1}), (18)

where {z_{0,l}, l = 3, …, N} are the remaining zeros of the polynomial. The proposed adaptive filter is based on the following parameterized error transfer function:

H_e(z, α) = (1 − α z^{−1} + z^{−2}) H_n(z), (19)

where H_n(z) denotes the product term in (18), which has constant coefficients of the powers of z. Now, H_e(z, α) can be written in the form of H_{e,o}(z) in (17) as

H_e(z, α) = 1 − Σ_{n=0}^{N−1} h_n(α) z^{−(n+1)}. (20)

The α-dependent filter coefficients can be obtained by expanding H_e(z, α) in (19) as a power series. It is obvious that the coefficients will be linear in α, as follows:

h_n(α) = a_n + b_n α,  n = 0, …, N − 1. (21)

The filter will be optimal with respect to the cost function in (7) when α = α_0 = 2 cos θ_0. However, it can be argued that the noise-suppression performance will not change significantly when α is in the vicinity of α_0. Note that (20) is a notch filter tuned at the frequency θ = cos^{−1}(α/2). The filtered prediction error is obtained by passing e_p through an FIR filter with coefficient vector h_e, where h_e may be chosen equal to h_o.
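The split of the error polynomial into the tunable notch factor and the fixed factor H_n(z) in (18)-(19), and the resulting linear dependence of the coefficients on α, can be computed numerically instead of symbolically: deflate the θ_0 zero pair by polynomial division and collect the terms constant and linear in α. A sketch; the synthetic H_n in the demo exists only for the round-trip check and is not from the paper:

```python
import numpy as np

def alpha_parameterization(h_o, theta0):
    """Factor H_{e,o}(z) = 1 - sum_n h_{o,n} z^{-(n+1)} as
    (1 - 2 cos(theta0) z^{-1} + z^{-2}) H_n(z), then return (a, b)
    such that the alpha-tuned coefficients are h_n(alpha) = a_n + b_n*alpha."""
    p = np.concatenate(([1.0], -np.asarray(h_o, float)))  # ascending powers of z^{-1}
    q = np.array([1.0, -2.0 * np.cos(theta0), 1.0])       # notch factor (palindromic)
    # The design constraints place zeros at exp(+/- j theta0), so the division
    # is exact up to rounding.  np.polydiv wants descending order, hence flips;
    # q is palindromic, so flipping it is a no-op.
    hn_rev, _rem = np.polydiv(p[::-1], q[::-1])
    hn = hn_rev[::-1]
    # (1 - alpha z^{-1} + z^{-2}) H_n(z) = conv([1,0,1],hn) + alpha*conv([0,-1,0],hn)
    const_part = np.convolve([1.0, 0.0, 1.0], hn)
    lin_part = np.convolve([0.0, -1.0, 0.0], hn)
    return -const_part[1:], -lin_part[1:]

# Round-trip check with a synthetic monic H_n: build an h_o that has the
# theta0 zero pair in place, recover (a, b), verify h_o = a + alpha0 * b.
theta0 = np.pi / 4
rng = np.random.default_rng(1)
hn_demo = np.concatenate(([1.0], 0.1 * rng.standard_normal(13)))
q0 = np.array([1.0, -2.0 * np.cos(theta0), 1.0])
h_o = -np.convolve(q0, hn_demo)[1:]
a, b = alpha_parameterization(h_o, theta0)
print(np.allclose(a + 2.0 * np.cos(theta0) * b, h_o))   # True
```

This reproduces the stored (a, b) vectors without symbolic computation; the paper's own offline step uses symbolic expansion instead.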
This error may be assumed to be equal to the difference between the original signal x_o(k + 1) and the prediction, where θ_s is the original signal frequency. Now, if the assumption in (24) is made, where α̂_s = 2 cos θ̂_s and θ̂_s is the current frequency estimate, then (23) can be written as (25). The correction in the coefficient vector can be approximated by (26), where b = [b_0 ⋯ b_{N−1}]^T and Δθ̂_s is the correction in the estimated frequency of the signal. Substituting (26) in (25) and solving for Δθ̂_s, the updating scheme for θ̂_s is obtained as (27), where μ is a suitably chosen stepsize. The coefficient vector is then updated using (28), where α(k + 1) = 2 cos θ̂_s(k + 1). Note that the term b^T x_N(k) in (27) may become arbitrarily small, since it is a linear combination of noisy sinusoids. In such a case, the correction in the frequency cannot be solved from (25) and (26). Higher-order terms would be required in the series expansion in (26) to solve for Δθ̂_s; however, e_p would then become nonlinear in Δθ̂_s. This is avoided by setting the correction Δθ̂_s to zero in the singular case, that is, μ is set to zero whenever |b^T x_N(k)| ≤ ε. Here, ε is a threshold for successive updates. When ε is too small, it may lead to instability; on the other hand, a large value of ε may result in decreased tracking performance. Note that ε must be chosen less than the maximum amplitude of b^T x_N(k). A reasonably accurate initial value of the frequency estimate can be obtained by a periodogram with a relatively short FFT.
With the thresholding of the second term, the update equation for θ̂_s(k) takes the form of (27) with the stepsize set to zero whenever the threshold condition is violated. In the implementation of the proposed adaptive filter, the frequency range [0, π] is divided into L = 18 intervals. The optimum filter coefficient vector is calculated at the center frequency θ_0(l), l = 1, …, L, of each interval. Figure 1 shows the frequency response of the optimized filter tuned at the frequency π/4. The vectors (a, b) are then calculated offline (only once) by (18) using symbolic computation and stored in processor memory. Given the frequency estimate θ̂_s(k) at time k, the filter coefficient vector is calculated using h(k) = a(l) + b(l)α(k), where l is the index of the interval to which θ̂_s(k) belongs. Note that the larger L is, the smaller the deviation of the cost-function value at any frequency θ̂_s(k) in interval l from the optimal value for θ_0(l). Increasing L does not complicate the design of the filter. However, with a larger L, the variance of the frequency estimate decreases at the cost of the increased time taken to search for the interval l to which θ̂_s(k) belongs. Table 1 shows the variation of the variance of the frequency estimate, where the frequency to be estimated is located at the center of each interval. It can be observed that for large intervals (small L), there is a large variability in the variance across the frequency range. Further, it should be noted that increasing the filter length N improves the performance of the proposed algorithm: the larger N is, the sharper the notch in the frequency response at the signal frequency. While increasing N, L should also be increased to minimize the variability in the estimation variance.
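Putting the pieces together, the per-sample adaptation is: predict with h(α̂) = a + α̂b, form the prediction error, apply the thresholded frequency correction, and refresh the coefficients. The sketch below works directly on α̂ = 2 cos θ̂_s, an assumed but equivalent reading of the update (27)-(28); the demo uses the trivial N = 2 predictor x(k + 1) = α x(k) − x(k − 1), for which a = [0, −1] and b = [1, 0] are known in closed form:

```python
import numpy as np

def track_frequency(x, a, b, theta_init, mu=0.01, eps=0.05):
    """Track the frequency of a sinusoid with the parameterized predictor
    h(alpha) = a + alpha * b.  The alpha-domain correction e_p / (b^T x_N)
    is thresholded to avoid the singular case where b^T x_N(k) is small."""
    N = len(a)
    alpha = 2.0 * np.cos(theta_init)
    estimates = []
    for k in range(N, len(x) - 1):
        xN = x[k:k - N:-1]                    # [x(k), x(k-1), ..., x(k-N+1)]
        h = a + alpha * b
        e_p = x[k + 1] - h.dot(xN)            # prediction error
        d1 = b.dot(xN)
        if abs(d1) > eps:                     # threshold: skip singular updates
            alpha += mu * e_p / d1
            alpha = np.clip(alpha, -2.0, 2.0)
        estimates.append(np.arccos(alpha / 2.0))
    return np.array(estimates)

# Noise-free demo with the minimal N = 2 predictor.
theta_s = np.pi / 4
x = np.cos(np.arange(600) * theta_s)
a2, b2 = np.array([0.0, -1.0]), np.array([1.0, 0.0])
est = track_frequency(x, a2, b2, theta_init=np.pi / 6, mu=0.05)
print(est[-1])   # ~ pi/4
```

For the full filter, a and b would be the stored vectors a(l), b(l) of the interval containing the current estimate, switched as θ̂_s(k) crosses interval boundaries.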
The computational complexity of the proposed filter is comparable with that of the standard LMS algorithm (which requires approximately 2N multiplications and 2N additions per sample) and is much lower than that of the transform-domain LMS. The proposed algorithm requires approximately 3N multiplications and 3N additions per sample, where N is the filter length. The offline optimization is performed once and requires a single application of an FFT of length approximately 2N. The IIR-ANF has a lower complexity of approximately 10 multiplications and 10 additions per sample.
A summary of the offline part of the algorithm is given below. For each interval l = 1, …, L: (a) compute the optimum coefficient vector h_o at the center frequency θ_0(l); (b) factor H_{e,o}(z) as in (18) to obtain H_n(z) (symbolic computation is used); (c) find h_n(α), then a_n(l) and b_n(l).

STABILITY ANALYSIS
The updating equation for the estimated frequency in (27) is a nonlinear stochastic discrete-time equation. An exact stability analysis would require Lyapunov's direct method, which is analytically intractable for this system. Therefore, an approximate stability analysis is performed under the assumption that θ̂_s is close to the original signal frequency θ_s. The perturbation in θ̂_s is defined as

δθ̂_s(k) = θ̂_s(k) − θ_s. (33)

The corresponding perturbation in the parameter α is

δα(k) = α(k) − α_s ≅ −2 sin θ_s · δθ̂_s(k). (34)

In order to simplify the analysis, it is assumed that there is no filtering of the prediction error. The prediction error in (16) can then be expressed as

e_p(k + 1) = x_o(k + 1) + q(k + 1) − x̂(k + 1). (35)
Now, (28) can be used to write the filter coefficient vector in terms of the perturbation δα(k).
The predicted signal in (35) can be written as in (37), and the input vector in (37) can be written as in (39). Substituting (39) and (37) in (35), the prediction error becomes (40). Substituting (33), (34), and (40) in (27), we obtain (41), which can be rearranged as (42). In (42), q_d(k + 1) = q(k + 1) − q̂(k + 1) and d_1(k) = b^T x_N(k). Taking the expectation of (42), the time-dependent part of the second term becomes (43), where p_t = P{|d_1(k)| > ε}. In Appendix A, it is shown that this term is negligible. Taking expectations, (42) becomes

E{δθ̂_s(k + 1)} = (1 − μ̄) E{δθ̂_s(k)}, (44)

where μ̄ = μ · P{|d_1(k)| > ε}. The first-order discrete-time equation in (44) is stable if 0 < μ̄ < 2, in which case lim_{k→∞} E{δθ̂_s(k)} = 0, implying that the frequency estimates are asymptotically unbiased. Using (42), it is also possible to show that the frequency estimate converges to its true value in the mean-square sense; the variance of the frequency estimate, obtained for white noise, is given in (45), where G_n = ‖h(θ_s)‖², λ = 1 − 2μ̄ + μ² p_t, P = π/θ_s, and A is the amplitude of b^T x_N(k). The derivation of (45) is outlined in Appendix B.
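The mean behavior in (44) is a first-order recursion, so the stability condition 0 < μ̄ < 2 and the almost-exponential decay can be checked directly. A trivial numerical illustration:

```python
import numpy as np

def mean_error(mu_bar, k, delta0=0.1):
    """E{delta(k)} under the first-order mean recursion of (44):
    each step multiplies the mean error by (1 - mu_bar)."""
    return delta0 * (1.0 - mu_bar) ** k

# |1 - mu_bar| < 1, i.e. 0 < mu_bar < 2, gives geometric decay to zero;
# outside that range the mean error diverges.
print(abs(mean_error(0.05, 200)))   # ~ 3.5e-6, decaying
print(abs(mean_error(2.10, 200)))   # very large, divergent
```

Since μ̄ = μ · P{|d_1(k)| > ε}, a larger threshold ε lowers the effective stepsize and slows this decay, which is the instability/tracking trade-off noted in Section 3.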

APPLICATION TO BEAMFORMING
The proposed method is well suited to beamforming applications with angle-of-arrival estimation. Consider an array of N sensors with real gains and an incident signal x_o(k). The output of the nth sensor is then given by (46), where θ is the angle of arrival (AOA) of the signal. The beamformer output can be written as in (47), and the directional response of the beamformer is given by (48). Following the procedure given in Section 2, the gain of the beamformer at a selected angle of arrival can be made unity, while the weights can be chosen to shape the response. Figure 2 shows the response where interference arriving from directions between 0.8 and 1.4 radians is suppressed by approximately 40 dB. Note that since the gains of the beamformer are real, the response is symmetrical about 0°. The output of the beamformer can be written in general as in (49)-(50), where θ_0 and θ_i are the angles of arrival of the main signal and the interference, respectively. In (49), it is assumed that the AOA of the main signal is not known and the gains h_n(k) are adapted to estimate this angle. For this adaptation, the error signal is given by (51) and is complex in general. Therefore, the update equation for the AOA estimate should be written as in (52).
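For concreteness, the Section 2 design can be transplanted to the spatial domain. The paper does not spell out the array geometry, so the sketch below assumes a uniform linear array with half-wavelength spacing (steering phase π sin θ per element); the look direction, interference sector, and weight values are likewise illustrative:

```python
import numpy as np

def design_beamformer(theta0, N=16, sector=(0.8, 1.4), w_int=100.0):
    """Section 2-style design applied to a ULA beamformer with real gains.
    Assumed geometry (not stated in the paper): half-wavelength spacing,
    steering vector v_n(theta) = exp(-j n pi sin(theta)).  Gain is forced
    to unity at theta0 while weighted power over the sector is minimized."""
    n = np.arange(N)
    thetas = np.linspace(-np.pi / 2, np.pi / 2, 721)
    w = np.where((thetas >= sector[0]) & (thetas <= sector[1]), w_int, 1.0)
    psi = np.pi * np.sin(thetas)                 # spatial frequency per angle
    V = np.exp(-1j * np.outer(psi, n))           # steering matrix
    R = (V.conj().T * w).dot(V).real             # real quadratic form (real gains)
    psi0 = np.pi * np.sin(theta0)
    A = np.vstack([np.cos(n * psi0), np.sin(n * psi0)])
    c = np.array([1.0, 0.0])                     # unity, zero-phase gain at theta0
    Ri_At = np.linalg.solve(R, A.T)
    return Ri_At.dot(np.linalg.solve(A.dot(Ri_At), c))

h = design_beamformer(theta0=0.2)
n = np.arange(len(h))
resp = lambda th: h.dot(np.exp(-1j * n * np.pi * np.sin(th)))
print(abs(resp(0.2)))    # ~ 1 at the look direction
print(abs(resp(1.1)))    # strongly attenuated inside the sector
```

Because the gains are real, |resp(−θ)| = |resp(θ)|, reproducing the symmetry about 0° noted for Figure 2.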

SIMULATION RESULTS
For frequency estimation of a noisy sinusoid, the parameters used in the simulations are N = 16, a = 1, L = 18, and μ = 0.01. Table 2 shows the estimates of selected frequencies in the range [0, π] in AWGN (σ²_q = 0.25) and in correlated Gaussian noise (CGN) (σ²_q = 0.30), averaged over 30 000 samples. The estimates are generally unbiased, with a maximum absolute error of 0.7%. Figure 3 shows the theoretical and computed variances of the estimated frequency in AWGN. There is good agreement except at the edges of the frequency range; this is due to the terms in the denominator of (45) which become zero at the edges. The RMSE performance of the FIR notch filter (FIR-ANF) is compared with that of the IIR notch filter (IIR-ANF, with constrained poles and zeros [6]). The noise is correlated Gaussian with variance σ²_q = 0.30 and is obtained by filtering white noise with a filter having the magnitude response shown in Figure 4. For a fair comparison, the stepsize of the IIR-ANF is adjusted to give the same convergence rate as the FIR-ANF. Alternatively, the RMSE of the two methods could have been equated to observe the improvement in convergence rate. Figure 5 shows the computed RMSE values over the complete frequency range. The RMSE of the FIR-ANF is less than that of the IIR-ANF over almost the entire frequency range, except in 1.5 < θ_s < 2.3 rad (around the center of the noise band). Figure 6 shows the variances of the estimated signal frequency for the two methods. Again, the variance of the FIR-ANF is lower than that of the IIR-ANF over the frequency range, except in 1.2 < θ_s < 2.4 rad. As the signal frequency approaches the noise band, the variance of the FIR-ANF rapidly increases above that of the IIR-ANF. This is due to the inability of the FIR filter to suppress noise components very near the signal frequency.
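The correlated Gaussian noise used in the comparisons is produced by filtering white Gaussian noise. The exact magnitude response of Figure 4 is not reproducible from the text, so the sketch below substitutes a generic band-pass FIR (windowed sinc, band edges roughly matching the reported noise band) and rescales to the stated variance σ²_q = 0.30:

```python
import numpy as np

def correlated_gaussian_noise(n_samples, band=(1.5, 2.3), sigma2=0.30,
                              ntaps=64, seed=0):
    """Correlated Gaussian noise by band-pass filtering white noise.
    The windowed-sinc band-pass is an assumed stand-in for the paper's
    Figure 4 response; band edges are in rad/sample."""
    rng = np.random.default_rng(seed)
    n = np.arange(ntaps) - (ntaps - 1) / 2.0
    lo, hi = band
    # Ideal band-pass impulse response, Hamming-windowed.
    ideal = (hi * np.sinc(hi * n / np.pi) - lo * np.sinc(lo * n / np.pi)) / np.pi
    taps = ideal * np.hamming(ntaps)
    white = rng.standard_normal(n_samples + ntaps)
    q = np.convolve(white, taps, mode="full")[ntaps:ntaps + n_samples]
    q *= np.sqrt(sigma2 / q.var())       # rescale to the target variance
    return q

q = correlated_gaussian_noise(30000)
print(round(q.var(), 2))   # 0.3
```

Adding such noise to A cos(kθ_s + φ) reproduces the CGN test condition; moving θ_s into the band (1.5, 2.3) recreates the hard case where the FIR-ANF loses its advantage.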
Figure 7 shows the convergence of the estimated frequency from an initial value of θ̂_s(0) = π/6 to the actual value θ_s = π/4 with the FIR-ANF. The convergence is almost exponential, behaving as (1 − μ̄)^k, which is the solution of (44). It is also important to note that the initial frequency estimate does not have to be in the very close vicinity of the true one. Figure 8 shows the convergence of the estimated frequency when the initial value is at the center of the correlated-noise band (1.92 rad), which poses the biggest challenge for the algorithm; it still converges to the true frequency, which is 2.80 rad. However, since the initial frequency is far from the steady-state value in this case, the linear model in (42) is no longer valid and the response of the error is not exponential. It should be noted that, whatever parameters are chosen, the IIR-ANF does not converge under the same conditions. Figure 9 shows the responses of the frequency estimates with the FIR- and IIR-ANFs for the case where the frequency variances are equated. A beamforming application is also simulated, where the actual angle of arrival of the signal is 4° and the initial estimate is 0°. Figure 10 shows the convergence of the estimated AOA to the true one. The frequencies of the signal and the interference are 0.087 radian and 0.87 radian, respectively. The sensor inputs are assumed to be corrupted by zero-mean Gaussian noise with uniform directional density. The signal-to-interference ratio at the beamformer input is SIR = 4.15; after convergence, it is calculated as SIR = 84.9, an improvement by a factor of 20.45.

CONCLUSIONS
A new adaptive FIR notch filter has been introduced. The filter has the novel feature that its frequency responses are optimized offline, which makes it considerably more flexible in shaping the frequency response and thereby in rejecting noise in selected frequency ranges. Unlike the IIR filter, the adaptive FIR filter is always stable for a suitable choice of stepsize. The algorithm can be applied effectively to beamforming problems with AOA estimation, where the IIR counterpart is inapplicable. Simulation results indicate that, except in the frequency range around the peak noise power, the FIR-ANF is superior to the IIR notch filter in estimating the sinusoid and its frequency in CGN.

A. EXPECTED VALUE OF THE SECOND TERM IN (41)
In [11], an approximation for the expected value of a function of two random variables is given as (A.1), where g_0 = g(μ_X, μ_Y) and the derivatives are evaluated at (μ_X, μ_Y). For g(x, y) = x/y, (A.1) gives

E{X/Y} ≅ μ_X/μ_Y − σ_XY/μ_Y² + μ_X σ_Y²/μ_Y³,

which can be applied to the expected value in (43), letting X = q_d(k + 1) and defining Y as a random variable taking the values of d_1(k) which satisfy the threshold. The fact that μ_X = E{q_d(k + 1)} = 0 gives

E{X/Y} ≅ −σ_XY/μ_Y²,

where μ_Y = E{d_1(k) | |d_1(k)| > ε}. Similarly to (39), d_1(k) can be written in terms of a sinusoidal component s(k). Let s(k) = A sin((k + 1)θ_s), where A is determined by the vector b. Therefore, μ_Y ≅ s(k) whenever k ∈ I_ε = {k : |d_1(k)| > ε}, and the time average of the expected value can be written accordingly, where τ = ε/A (of the order of 0.1). The amplitude A can be estimated if the filter coefficients are approximated by those of an LMS adaptive line enhancer [12]:

h_n = (2a*/N) cos((n + 1)θ_s),  n = 0, …, N − 1,

where a* is approximately equal to one for large N. The derivatives with respect to α are obtained as

b_n = ∂h_n/∂α = (a*(n + 1))/(N sin θ_s) · sin((n + 1)θ_s),  n = 0, …, N − 1.

The covariance σ_XY can be obtained as

σ_XY = E{q_d(k + 1) · d_1(k)} = b^T (r_q − Q h(α̂_s)), (A.10)

where Q is the autocorrelation matrix of the noise sequence and r_q = E{q(k + 1) · q_N(k)}. It is difficult to derive a general result from (A.10) for any given correlated noise sequence.
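The second-order approximation (A.1) specialized to g(x, y) = x/y with μ_X = 0, namely E{X/Y} ≅ −σ_XY/μ_Y², is easy to sanity-check by Monte Carlo when Y stays well away from zero. A quick check with illustrative moments (not the paper's):

```python
import numpy as np

rng = np.random.default_rng(2)
# With mu_X = 0, the Taylor approximation reduces to E{X/Y} ~ -sigma_XY / mu_Y^2.
mu_Y, s_X, s_Y, rho = 5.0, 0.4, 0.3, 0.6      # illustrative moments
cov = [[s_X**2, rho * s_X * s_Y], [rho * s_X * s_Y, s_Y**2]]
X, Yd = rng.multivariate_normal([0.0, 0.0], cov, size=2_000_000).T
Y = mu_Y + Yd                                  # Y bounded well away from zero
mc = np.mean(X / Y)                            # Monte Carlo estimate of E{X/Y}
approx = -rho * s_X * s_Y / mu_Y**2            # -sigma_XY / mu_Y^2
print(mc, approx)    # both ~ -2.9e-3
```

The agreement degrades as σ_Y/μ_Y grows, which is why the conditioning on |d_1(k)| > ε (keeping μ_Y away from zero) matters for the argument in this appendix.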
To gain insight into the order of this term, we consider white noise. In this case, r_q = 0 and Q = diag[σ²_q, …, σ²_q], resulting in

σ_XY = −σ²_q b^T h(α̂_s).

Note that the above expression is also approximately equal to the bias in the frequency estimate. Hence, for 0 < θ_s < π and sufficiently large N, the bias is negligible.