EURASIP Journal on Applied Signal Processing 2005:17, 2874–2887 c ○ 2005 Hindawi Publishing Corporation Laguerre-Volterra Filters Optimization Based on Laguerre Spectra

In this paper, the optimization of Laguerre-Volterra filters (LVFs) is carried out adaptively. Each kernel is expanded on an independent Laguerre basis. An analytical solution to Laguerre poles optimization is provided using the knowledge of the expansion coefficients, also called Fourier coefficients, associated with an arbitrary Laguerre basis. These coefficients are estimated by means of the Normalized Least Mean Squares (NLMS) algorithm. The proposed method allows optimization of both the Fourier coefficients and the Laguerre poles.


INTRODUCTION
Truncated Volterra filters constitute a class of nonrecursive polynomial models. The main drawback of these models is their over-parameterization. During the last decade, in order to reduce the parametric complexity, that is, the number of parameters, three main approaches have been considered: (i) approximations by means of parallel-cascade structures composed of linear filters and memoryless nonlinearities [1]; (ii) algebraic decompositions of matrices or tensors formed with kernels coefficients [2,3,4]; (iii) expansions on discrete orthonormal bases of functions (OBFs) [5,6,7,8].
The class of OBFs generally used for modeling purposes is that of rational orthonormal bases, such as Laguerre basis [9]. The Laguerre functions have the property of being completely characterized by a single parameter, the Laguerre pole. When expanding a Volterra kernel on a Laguerre basis, the parsimony of the expansion is strongly linked to the choice of the Laguerre pole. Expansion of Volterra kernels on Laguerre basis was first suggested by Wiener in the 1960s [10]. To the best of our knowledge, Campello et al. [11] were the first to derive an analytical solution to the Laguerre pole optimization for Volterra models. They have generalized the work in [12] and have also shown that using independent bases to expand each kernel gives better results than the use of a single basis. However the obtained analytical solution requires the knowledge of the Volterra kernels. Consequently, a step of Volterra kernels estimation or reconstruction is necessary before computing the Laguerre pole. Note again that this method is applicable if and only if the kernels are strictly causal, that is, the Volterra kernels must satisfy the unit delay condition.
In order to circumvent these limitations, a new approach is proposed in this paper. It is based on the knowledge of the estimated expansion coefficients, also called Fourier coefficients, rather than the Volterra kernels coefficients. The requirement of a unit delay is relaxed. This approach can be viewed as a generalization of both [11,13].
The organization of the paper is as follows. In the next section, the principle of Volterra model expansion on OBFs is recalled and the analytical expression of the optimal Laguerre pole is given for each kernel. In Section 3, this pole is expressed in terms of the Fourier coefficients relative to the corresponding kernel. Then batch and adaptive identification methods are described in Section 4 and illustrated by means of simulation results in Section 5, before concluding the paper in Section 6.

BACKGROUND
A discrete-time Pth order Volterra filter is generally described as h p n 1 , . . . , n p p j=1 u n − n j , (1) where u, y, and h p denote respectively the input, the output, and the pth-order Volterra kernel. Boyd and Chua [14] showed that any causal, time invariant, nonlinear system, with fading memory, can be represented to an arbitrary degree of accuracy by a finite expansion in Volterra series.
Considering the expansion of the kernel h p on an OBF, where the coefficients h p n 1 , . . . , n p b k1,p n 1 · · · b kp,p n p (3) are called the Fourier coefficients associated with the pthorder kernel. When the used OBFs are Laguerre functions, the set of Fourier coefficients {g k1,...,kp } constitutes the Laguerre spectrum of the pth-order kernel. Then the input-output equation (1) can be rewritten as where s kj ,p (n) = ∞ i=0 b kj ,p (i)u(n − i). The OBFs used in this work are discrete-time Laguerre functions defined by their z-transforms as follows: One can note that only the parameter ξ p , called the Laguerre pole, characterizes the set of Laguerre functions {b k,p (i)}. The derivations presented in this paper are mainly based on the two following properties of Laguerre functions [13,15,16]: In the sequel, the basis associated with each kernel is separately optimized. We define the following cost function: where h p 2 = ∞ n1=0 · · · ∞ np=0 h 2 p (n 1 , . . . , n p ). It was shown in [11] that this cost function is an upper bound of the modeling squared error due to the truncation with a finite order of the Laguerre expansion.
Note that the numerator of the cost function is a convex, differentiable, and nonnegative function defined on the open convex set P = {ξ p ∈ : |ξ p | < 1} while its denominator is a concave function, differentiable and positive on P . Consequently, J p is a pseudoconvex function inside P . Then any solution to ∂J p /∂ξ p = 0 is a global minimum of J p [17]. It is straightforward to show that this minimum is reached for where It should be noted that ρ o,p is a characteristic of the kernel h p since it depends only on Q j,p , j = 1, 2, and therefore on the Volterra kernel coefficients h p (n 1 , . . . , n p ).

OPTIMAL POLES EXPRESSIONS BASED ON THE LAGUERRE SPECTRA
The optimal pole (12) is related to ρ o,p that depends on the Volterra kernel h p , which means that it is necessary to carry out an estimation of the kernel before determining the optimal pole. In this section, an expression of ρ o,p in terms of the Laguerre spectrum of the pth-order kernel expanded on any Laguerre basis is investigated. Such an expression will enable us to determine the optimal pole directly from the estimated Laguerre spectrum without computing the Volterra kernel coefficients. Similarly to definitions (9) and (10) associated with the Volterra kernel coefficients, we define the following quantities that depend on the Laguerre spectrum of the pth-order kernel: with j = 1, 2, and . . .
Example 1. For the quadratic case, that is, p = 2, we get 2k l + 1 g 2 k1,k2 , l = 1, 2, Now, the objective is to express Q j,p , j = 1, 2, as a function of R j,p . First of all, one can notice that, as described in Appendix A, the property (7) Thanks to relation (18), the following lemma is derived. Lemma 1. R 1,p and R 2,p are linked by means of their derivatives with respect to ξ p as follows: The proof of this lemma is based on a simple but long calculation summarized in Appendix B.
By remembering that h p 2 is a characteristic of the nonlinear system to be identified, and consequently independent of the Laguerre basis, the orthonormality of the Laguerre basis allows to get the following expression: h p 2 = ∞ k1=0 · · · ∞ kp=0 g 2 k1,...,kp . Then, from the definitions of R 1,p and of J p , a simple calculation yields thus From (22), one can note that R 1,p is strictly positive. Consequently R 2,p is strictly decreasing inside P . We derive a second lemma proved in Appendix C.
Lemma 2. The terms Q j,p associated with the Volterra kernel h p and R j,p , j = 1, 2, associated with its Laguerre spectrum are linked by Obviously, Q j,p can be expressed as a function of R j,p , j = 1, 2, by solving the linear system of (24) and (25): Thus ρ o,p , given by (13), can be expressed as a function of R 1,p and R 2,p that depend only on the Laguerre spectrum as follows: This result is summarized by the following theorem.
Theorem 1. The pole of the optimal Laguerre basis, associated with the expansion of the Volterra kernel h p , is obtained from the Laguerre spectrum, associated with the expansion of the same kernel on an arbitrary Laguerre basis characterized by the pole ξ p , as follows: where ρ o,p is given by the formula (27).
This result is particularly useful. When the Volterra kernel expansion on an arbitrary Laguerre basis characterized by the pole ξ p is infinite, ρ o,p can be recovered in using the associated Laguerre spectrum as stated by formula (27) with (14), (15), and (16) for the calculation of R 1,p and R 2,p .

PARAMETERS ESTIMATION METHODS FOR LAGUERRE-VOLTERRA MODELS
In practical cases, the expansion of the Volterra kernels on a Laguerre basis is truncated to a finite-order K. Consequently, for a given Laguerre pole ξ p , by using truncated expressions of (15) and (16), we get ρ o,p , which is an approximation of ρ o,p , the actual characteristic of the system. This approximation depends on the Laguerre pole and on the Laguerre spectrum, via R 1,p and R 2,p . For a fixed pole ξ p , the optimization of the Laguerre spectrum improves the approximation ρ o,p . It is then possible to determine the optimal pole corresponding to the current approximation ρ o,p . This process is iterated until convergence. For a fixed Laguerre pole, the optimization of Fourier coefficients, therefore of R 1,p and R 2,p , is done either by using the MMSE estimator (batch method) or by using a stochastic-gradient-based algorithm (adaptive method). The Laguerre pole optimization is done by using (28) with the approximated value ρ o,p . We consider the Laguerre-Volterra filter described by (4): ..,kp and S p (n) the pth-order cross-products of the s k,p (n) signals. In the sequel d(n) will denote the actual output of the system to be modeled.

Batch method
With a batch method, for fixed Laguerre poles, the Fourier coefficients can be estimated using the well-known MMSE (minimum mean square error)-based solution such as where , and E denotes the mathematical expectation. Note that to avoid eventual ill conditioning of the Γ Γ Γ matrix, an orthogonal formulation of the MMSE estimator can be used instead of the standard formula (30) (see, e.g., [10,18,19,20]). In order to derive an iterative procedure for Laguerre poles estimation, note that the combination of (20) with (23) yields As stated before, since J p is a pseudoconvex function inside P , then R 2,p is zero when J p is minimal and the reciprocal is true; that is, when R 2,p is nearly zero, then J p is close to its minimal value and consequently the pole ξ p is close to its optimal value. In other words, when R 2,p is nearly zero, the approximation ρ o,p is the best possible one. Thus the iterative procedure can be stopped. This property enables us to derive the following batch estimation method. Kernel reconstruction Add. - (1) Select arbitrary poles in the domain P and build the corresponding Laguerre bases. (2) For n = 0, . . . , N − 1, filter the input u(n) by the Laguerre filters defined by (5), calculate the cross products of the filtered inputs s k,p (n) for generating the vectors S p (n) and then S(n). (3) Estimate the Laguerre spectrum for each kernel by means of (30) in replacing Γ Γ Γ and C by estimated values obtained as time-averaged values of the products S(n)S T (n) and d(n)S(n). (4) For each basis, that is, p = 1, . . . , P, evaluate T j,l and R j,p , l = 1, . . . , p, j = 1, 2, by using truncated versions of (15), (16), and formula (14).
(b) determine a new pole ξ p using (28), (c) build a Laguerre basis associated with the pole ξ p determined in the previous step, and return to 2 until convergence.
By comparing with Campello's method [11], the main advantage of the algorithm proposed herein is its low computational complexity (See Table 1). Without considering the Laguerre spectrum estimation and the input filtering by the OBFs, because they are common to both the proposed method and the method in [11], at each iteration the estimation of the Laguerre pole associated with the pth-order kernel requires O(pK p ) operations. The computational cost reduction is due to the fact that, unlike Campello's method, no Volterra kernel reconstruction is required.

Adaptive method
This second method can be seen as a block-NLMS (normalized least mean square) method. The principle of this method is to adapt the Fourier coefficients until a given convergence criterion is satisfied, then the Laguerre poles are estimated, and the calculation is iterated with new data. Note that the Fourier coefficients are adapted by means of the NLMS algorithm (33) while the Laguerre poles are estimated by using (28):

(c) Computation
(1) Calculate the filtered inputs s k,p (n) associated with the Laguerre filters B k,p (z) defined by (5), organize the cross products of the filtered inputs into the vectors S p (n) to generate the vector S(n). (2) Estimate the Laguerre spectra (3) For p = 1, . . . , P, calculate R 1,p,n and R 2,p,n by using truncated expressions of (15), (16), and (14), (b) evaluate ρ o,p,n by using (27), (c) determine new Laguerre poles according to (28), (d) build the associated Laguerre basis, increment n, and return to step 1; (ii) else increment n and return to step 1.
Note that the experimental convergence of this algorithm is similar to that of a standard NLMS algorithm in a nonstationary environment due to the modification of the estimated Laguerre poles.

SIMULATION RESULTS
In this section, we present the results of two experiments that illustrate the good properties of the proposed algorithms. In the first example, the optimized Volterra filter is run with the same structure as that of the system to be identified. The second example involves experimenting with the Volterra filter under conditions of model mismatch. The measurement noise is assumed to be a white Gaussian process with zero mean. All the simulation results are obtained as ensemble averages over 50 independent runs. Note that, for the batch method, the iteration number 1 corresponds to the initialized values of the Laguerre poles in Figures 2, 3, 4, 5, and 9.

Example 1
The problem under consideration is that of identifying a second-order Volterra system described by (i) first-order kernel (ii) second-order kernel where This system is simulated as a Volterra system with memory M = 20. Taking the symmetry of the quadratic kernel into account, this filter has 230 parameters to estimate. The input signal is a centered, white, Gaussian process with a unit variance. The output signal-to-noise ratio (SNR) is equal to 30 dB. N = 5000 input/output data are simulated. The identification performances are evaluated in terms of the normalized mean square error (NMSE): To validate the theoretical analysis presented in the previous sections and to evaluate the quality of the estimated Laguerre poles, the cost functions J 1 and J 2 defined in (8) are plotted in Figure 1. As stated before, these functions have a single minimum respectively located at ξ 1,opt = 0.525 and ξ 2,opt = 0.733.   One can note that the described system is not strictly causal, so the method of [11] cannot be applied. In incorporating a unit delay in the transfer functions, the resulting system fills the requirements of [11], and by applying this method we find, respectively, ξ 1,cam = 0.525 and ξ 2,cam = 0.734. These values are similar to the theoretical optimal values previously determined and are to be compared with those provided by the proposed estimation methods.

Batch estimation
The system identification is considered in a batch mode. The behavior of the proposed algorithm is particularly studied to point out the influence of three factors: the truncation order of the Laguerre expansions of the Volterra kernels, the initial choice of the Laguerre poles, and the stop criterion.
To evaluate the truncation order effects, several runs are done with different truncation orders. The Laguerre poles are initialized with the value 0.001. The obtained results are given in Table 2. Figure 2 illustrates the behavior of the estimated Laguerre poles during the iterations of the proposed algorithm. From the figure we can conclude that the algorithm converges in relatively few iterations. For a high truncation order (K = 18, e.g.), one iteration is sufficient to reach the optimal value while more iterations are needed for smaller truncation orders. Table 2 shows that the variations of the estimated poles values are very small when the truncation order varies between 3 and 10. However, for a larger truncation order, one can note an improvement of the value of the pole, more remarkable in the case of the quadratic kernel (Figure 2), and that of the convergence speed of the algorithm, with a higher computation cost. On the other hand, in the context of system identification, obviously the truncation order has a great influence on the precision of the identified model as indicated by the NMSE (Table 2). With a fixed truncation order, one can see that increasing the number of iterations improves not only the estimation of the Laguerre pole but also the overall precision of the identified model ( Figure 3). Figure 4 shows the effects of the Laguerre poles initialization on the convergence speed of the proposed batch algorithm, when the truncation order is fixed at K = 7. Although the initial values of the Laguerre poles are different, the algorithm converges toward the same Laguerre poles estimated values in few iterations. However, the convergence is faster when the initial value of the Laguerre pole is closer to the optimal value.
We have shown that when the estimated pole ξ p reaches its optimal value, R 2,p is equal to zero. We have also shown that R 2,p is strictly decreasing in P . Figure 5a plots the variations of R 2,1 and R 2,2 . One can note that when these terms become almost null, the poles reach their optimal values (Figure 5b). From these simulation results we can conclude that (i) generally few iterations are needed to find the optimal poles; (ii) the convergence is faster when high truncation orders are considered and when the initial values of the Laguerre poles are close to their optimal values; (iii) when the Laguerre poles reach their optimal value, R 2,1 and R 2,2 are almost null, so these quantities can be used as a stop criterion for the proposed algorithm; (iv) the estimated poles are close to the theoretical values and to those obtained by applying the method in [11].

Adaptive estimation
The same Volterra system is simulated with the same experimental conditions. Laguerre poles are initialized to 0. The Volterra kernels expansions on Laguerre bases are truncated to K = 7; thus the resulting Laguerre-Volterra filter has only 35 Fourier coefficients to estimate while the standard Volterra model has 230 parameters. The step size of the NLMS algorithm is chosen as µ = 0.3. The convergence thresholds are chosen equal to p = 10 −3 , p = 1, 2, and the window length is N O = 50.
The variations of the estimated Laguerre poles and of two Fourier coefficients are plotted in Figure 6 for both the linear and the quadratic kernels. One can note that the poles converge towards values close to those obtained with the block estimation method. One important question is that of the choice of the convergence threshold p and of the window length N O . When the value of p is chosen too small, a huge number of iterations can be necessary for the convergence of a new estimated Laguerre pole. Consequently, particular care should be taken for choosing these parameters.
To illustrate the convergence of the overall identification procedure, the NMSE is also plotted (Figure 7). The evolution of the NMSE associated with the proposed adaptive algorithm is compared with that obtained when the Laguerre poles are arbitrarily chosen equal to 0.2 and when the Laguerre poles are chosen equal to their optimal values obtained from the plots of the cost functions J 1 and J 2 ( Figure 1).
We also compare the proposed algorithm with the gradient method [5,13] for Laguerre pole estimation. Recall that the gradient method consists in minimizing the output squared error both with respect to the Laguerre pole and the Fourier coefficients. Two different initializations are considered for both methods: (ξ 1 = ξ 2 = 0) and (ξ 1 = 0.3; ξ 2 = 0.5). The step sizes of the gradient method are chosen equal to 1/300 and 1/1000 for the adaptation of the Laguerre pole respectively associated with the linear and the second-order kernel.
As depicted in Figure 8, the proposed adaptive method performs much better than the gradient method. The gradient method is very sensitive to the choice of the step size, and its convergence to the optimal value is not guaranteed. The convergence can lead to local minima, under the influence of the initialization.
From these simulation results we can note the following.
(i) In steady state, the performances of the proposed algorithm are similar to those obtained with the Laguerre-Volterra filter the poles of which are optimal. Obviously its convergence is slower. (ii) Both the adaptive and batch methods provide estimated poles close to the poles obtained with Campello's method and also close to those deduced from the plots of Figure 1. As previously stated, the advantage of the proposed methods is that the a priori knowledge of the Volterra kernels is not needed, and both Fourier coefficients and Laguerre poles are simultaneously estimated. (iii) The adaptive method performs better than the gradient method.

Example 2
Power amplifiers (PAs) are important elements in radio communication systems and they are inherently nonlinear. Some of PAs are modeled by a static nonlinearity such as a polynomial. However, for wideband applications (e.g., in wideband CDMA) and/or with high-power amplifiers (HPA) (e.g., base stations PAs), memory effects show up in the PA [21]. Having a memory means that the output of the PA is not only a function of the current input but also of past inputs. A relatively simple baseband behavioral model that accommodates memory as well as nonlinear behavior is the Wiener model, that is, a linear filter followed by a memoryless nonlinearity given by the Saleh model [22,23]: where A(r) and r are respectively the amplitude of the signal at the output and at the input of the memoryless HPA. The linear filter is a fourth-order Butterworth filter described in [24]: The input signal is real and Gaussian. We consider its transmission over an AWGN (additive white Gaussian noise) channel with SNR= 30 dB. The overall channel constituted by an HPA with memory followed by an AWGN channel is nonlinear. In this example, we consider the identification of the HPA with memory modeled as a linear-cubic Volterra system by using N = 15000 data. We first evaluate the batch estimation method. The truncation order is chosen equal to K = 9. This choice corresponds to a trade-off between the model complexity and the model precision. From Figure 9 we can conclude that the Laguerre poles respectively associated with the linear and the cubic kernels converge to the same value. This behavior was predictable since the dominant dynamics associated with the linear and the cubic kernels are the same for a Wiener model. Moreover, we evaluate the proposed adaptive algorithm for two kinds of input: a white Gaussian signal and a uniformly distributed 8-PAM signal, which is a signal often used in digital communications. One can see that the estimated poles converge to the same values as those obtained with the block estimation method ( Figure 10). The performances, in terms of NMSE, of the Laguerre-Volterra filter with optimized poles are plotted in Figure 11. Obviously the Gaussian input allows to get better performance since it is more exciting than the 8-PAM input. However, in terms of Laguerre poles estimation, the proposed algorithm delivers quite similar estimated poles with a Gaussian signal and a PAM signal, the algorithm convergence speed being faster with the Gaussian signal.

CONCLUSION
In this paper, the optimization of Volterra kernels expansions on Laguerre bases has been addressed. An analytical solution to this optimization problem has been obtained using Volterra kernels expansions on independent Laguerre bases. This solution has been expressed in terms of the estimated Laguerre spectrum associated with each Volterra kernel. Then batch and adaptive methods have been proposed to optimize both Laguerre spectra and Laguerre  poles. This approach generalizes previous works for linear [13] and nonlinear systems [11]. The performance of the proposed identification methods has been illustrated by means of simulation results that show the usefulness of the proposed methods. Theoretical stability and convergence analysis of the adaptive method will be the subject of further study. Among the OBFs of rational type, the Laguerre basis is the simplest one to optimize. However, they generally present worse performances for system approximation compared with generalized orthonormal bases (GOB) [25]. The authors have recently proposed some GOB selection procedures for Volterra kernels expansions [7,26]. These methods based on the minimization of a least squares criterion are exhaustive in nature, that is, they search the optimal poles in a finite set resulting from the discretization of the interval ] − 1, 1[. In future works the authors intend to investigate analytical solution to the problem of GOB optimization for Volterra kernels expansions.