A Robust Orthogonal Adaptive Approach to SISO Deconvolution

This paper formulates in a common framework some results from the fields of robust filtering, function approximation with orthogonal bases, and adaptive filtering, and applies them to the design of a general deconvolution processor for SISO systems. The processor is designed to be robust to small parametric uncertainties in the system model, with a partially adaptive orthogonal structure. A simple gradient-type adaptive algorithm is applied to update the coefficients that linearly combine the fixed robust basis functions used to represent the deconvolver. The advantages of the design are inherited from the mentioned fields: low sensitivity to parameter uncertainty in the system model, good numerical and structural behaviour, and the capability of tracking changes in the system dynamics. The linear equalization of a simple ADSL channel model is presented as an example, including comparisons between the optimal nominal, adaptive FIR, and proposed designs.


INTRODUCTION
Deconvolution filters have a wide range of applications in communications, control, and signal processing. Among other roles, they are used to reduce the distortion and additive noise that contaminate a signal propagating through some channel. When noise levels are negligible and the transmission part of the system is minimum phase and perfectly known, these filters are obtained as the inverse of the original system. However, if the system is nonminimum phase and noise is also present, a realizable deconvolution filter, that is, a filter that is stable and causal and uses a finite smoothing lag, cannot achieve perfect signal reconstruction. In such a case, the design procedure focuses on minimizing some performance index, and thus different optimal deconvolution filters are possible according to the objective function used.
Another source of difficulty is uncertainty in the model of the transmission path or in the noise spectrum.
This uncertainty is related to modeling errors, noise in the data used for identification, the random nature of the noise description, and other physical causes such as time variations, changing environments, component aging, and drift. The deconvolution filter has to be capable of tracking these changes or of exhibiting robust behavior that assures good performance over the extent of these variations.
Different ways of dealing with these problems are available in the literature. If there is very little knowledge about the system, then blind or blind adaptive techniques have to be used [1,2]. These methods rely on random models and make use of the statistical theory for signal separation. When the system can be described by uncertain parametric models, robust approaches are available. In [3] a mean square error (MSE) is averaged with respect to model errors and noise. Probabilistic descriptions of the model uncertainties are used and the problem is formulated and solved by means of a polynomial approach. These results are further extended to nonlinear equalization applications in [4] and presented as a general polynomial equations framework for nominal and robust multivariable linear filtering in [5]. The problem of nonlinear equalization is also addressed in [6], where a design method is presented for decision feedback equalizers (DFE) to be applied in transmission systems with small parameter perturbations. A simple probabilistic structure for channel and noise models is used, and a closed-form result in the frequency domain is then derived using calculus of variations and spectral factorization. This same methodology is used in [7] to solve the problem of linear deconvolution.
All these approaches yield time-invariant, that is, fixed, recursive structures for the optimal filters. However, in applications where the environment may suffer larger changes, the filters will also require some degree of adaptation. Time-varying or adaptive deconvolution filters are the common solution to this problem, at the price of increased complexity, computational load, and cost. This type of solution usually involves the use of transversal or finite impulse response (FIR) adaptive filters as an approximation to naturally recursive systems.
The contribution of this paper is the formulation of a comprehensible framework that concentrates some of the results given in the fields of robust filter design, function approximation by orthogonal bases, and adaptive filtering. The aim is the design of a general deconvolution processor, robust to parametric uncertainties in the system model and with a partially adaptive recursive orthonormal structure.
Robustness focuses on assuring a reasonable performance over the range of "practical restricted complexity parameterized system models," a set of rational functions identified from a finite noisy data record, and gaining properties similar to the designs of [3] or [7]. The recursive orthogonal structure has a twofold function. First, it approximates recursive systems naturally, requiring fewer parameters than FIR approximations. Second, it gives the design the classical advantages of orthonormal bases, that is, modularity, good numerical conditioning, and simplified performance analysis [8], along with other practical properties [9]. Adaptation is intended to extend the range of applicability of the design. Simple strategies can be used that exploit the orthogonal structure and update only the coefficients that combine the basis functions. Because of the recursive nature of the structure, the performance can be close to that of full adaptivity with a lower computational load than that required by long FIR adaptive filters [10,11,12,13].
The design procedure is based on the optimization of a performance index that contemplates both the system model uncertainties and the usual quadratic error. The formulation is similar to that introduced in [6] for the nonlinear DFE and close to the development presented in [7]. The minimization follows the classical approach in the frequency domain and uses variational concepts. The results are presented in a theorem that establishes the optimum set of parameters of the robust orthogonal deconvolution processor.
The orthonormal structure is provided by time-invariant basis functions that have a simple construction [14] and allow the inclusion of different modes (poles). Adaptation is provided by a simple "gradient" updating algorithm. This algorithm updates the coefficients that linearly combine the basis functions. Some preliminary results in relation with this type of formulation were presented in [15] for a simplified deconvolution setup and in [16] for the application of echo cancellation. In this case, a fixed orthogonal basis (Laguerre) with a transversal filter type of adaptive structure was used for updating the coefficients.
The paper is organized as follows. Section 2 introduces some notation, general considerations, and the basis functions. The main results are developed in Section 3. Section 4 considers the coefficients updating algorithm. Section 5 presents an example where the proposed design strategy is used to derive an equalizer for a simple ADSL communication channel model. Comparisons of performance are made in terms of the MSE that different designs can theoretically achieve. Finally, in Section 6, some conclusions are drawn.

Notation and general description
Most common SISO deconvolution or inverse filtering problems are described by the simple scheme illustrated in Figure 1, where the signals involved are modeled as in (1). $H(q^{-1}, \alpha)$ and $D(q^{-1}, \beta)$ are linear time-invariant filters that form the system. They are functions of $q^{-1}$, the unit delay operator, that is, $q^{-1} f(k) = f(k-1)$. These filters are not known exactly in the sense that they also depend on unknown real parameter vectors $\alpha$ and $\beta$. We will use the simplified notation H and D when this dependence does not need to be made explicit or, for example, $H(\alpha)$ and $D(\beta)$ when the time information is not central to an argument. The same considerations apply when working in the transform domain with $Z\{f(k-1)\} = z^{-1} F(z)$. For example, the following representations of $H(z^{-1}, \alpha)$ are equivalent when used in the right context: H, $H(z^{-1})$, and $H(\alpha)$.
The input shaping filters $W(q^{-1})$ and $T(q^{-1})$ are perfectly known invertible linear filters that model the stochastic sequences a(k) and c(k). The signals d(k), p(k), and n(k) are mutually independent, zero-mean white stochastic sequences with variances $\sigma_d^2$, $\sigma_p^2$, and $\sigma_n^2$, respectively. The symbol $*$ is used to denote complex conjugation on $|z| = 1$ and transposition, so that if $G(z^{-1}, \alpha)$ is a matrix of rational functions, then $G^* = G^*(z^{-1}, \alpha) = G^T(z, \alpha)$. The analytic part of H outside (resp., inside) the unit circle is denoted by $\{H\}_+$ (resp., $\{H\}_-$). The degree of a polynomial is indicated as $O(\cdot, \cdot)$, where the arguments stand for negative and positive powers of q (or z) in that order. If only one argument is used,
[Figure 1: general deconvolution scheme, showing the system, the noise input n(k), and the deconvolution processor.]
The signals d(k), p(k), and n(k) play different roles depending on the particular application. In classical deconvolution, p(k) = 0 and d(k) is colored by W to generate the input signal a(k). The corrupting noise is represented by $D(q^{-1})n(k)$. In this case, F is designed as a linear processor that produces an efficient estimate of a possibly delayed version of the signal a(k). Estimation is performed by linear filtering or smoothing operations on x(k), the noise-corrupted output signal of H.
The signal enhancement problem can also be considered by letting p(k) ≠ 0. The signal of interest p(k) (or $T(q^{-1})p(k)$), corrupted by the interference a(k), is to be recovered by subtracting from s(k) a filtered version of x(k), which is itself a filtered and noise-corrupted version of a(k). The filters H and D are not completely known. The error e(k) is actually the estimated value of p(k). The goal is to design the linear processor F that will efficiently, in some well-defined sense, estimate the interference signal a(k) (or $W(q^{-1})d(k)$).
Yet another application that is contemplated by the scheme of Figure 1 is the problem of linear equalization, which is described in detail in the example of Section 5.
All these deconvolution problems, cast in the common framework of Figure 1 and described mathematically by (1), share the same formulation and solution, as will be shown later in this section.

System uncertainty description
The system uncertainties are modeled as additive perturbations $\Delta H$ and $\Delta D$ of the nominal system, where $\alpha = \alpha_0 + \delta_\alpha$ and $\beta = \beta_0 + \delta_\beta$ are the parameter vectors, with $\alpha_0$ and $\beta_0$ representing the nominal or mean values of the parameters. The vectors $\delta_\alpha$ and $\delta_\beta$ are independent zero-mean random perturbations with a priori known covariances. The uncertainty in the parameters represented by $\delta_\alpha$ and $\delta_\beta$ results in an uncertain system, which can be thought of as having a different realization for each particular value of the parameters $\alpha$ and $\beta$, as shown by (3).
There are several approaches for the description of the additive perturbations ∆H and ∆D. These methods range from adjusting simple models to the set of systems from time or frequency experimental data, to the development of usually detailed high-order models that tightly describe the uncertainty boundaries in a certain range of frequencies of interest. See for example [17,18,19,20]. The derivation in [20] could be of particular interest if a common orthogonal basis framework for the representation of the system, uncertainty and deconvolver, is pursued.
Without loss of generality, and keeping in mind the existence of more refined approaches, a simple linear approximation is adopted following a formulation close to that of Lin et al. in [6] or Chen and Lin in [7].
Expanding $H(\alpha)$ and $D(\beta)$ around the values $H(\alpha_0)$ and $D(\beta_0)$ in Taylor series and retaining the linear terms yields the models (4), where $\partial H(\alpha)/\partial\alpha$ and $\partial D(\beta)/\partial\beta$ are the Jacobian matrices of H and D, respectively. With the models (4), the statistical characterization of the system uncertainties, given by (5)-(8), is straightforward.
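As a hedged sketch, the linearized uncertainty model described in this subsection can be written out as follows. The shorthand $J_H$, $J_D$, $\Sigma_\alpha$, and $\Sigma_\beta$ is introduced here only for illustration, and the exact form and numbering of the equations in the source may differ.

```latex
\begin{align*}
H(\alpha) &= H(\alpha_0) + \Delta H, &
D(\beta) &= D(\beta_0) + \Delta D, \\
\Delta H &\approx J_H\,\delta_\alpha, &
\Delta D &\approx J_D\,\delta_\beta, \\
J_H &= \left.\frac{\partial H(\alpha)}{\partial \alpha}\right|_{\alpha=\alpha_0}, &
J_D &= \left.\frac{\partial D(\beta)}{\partial \beta}\right|_{\beta=\beta_0}, \\
E[\Delta H] &= 0, &
\Gamma_{\Delta H} &= E[\Delta H\,\Delta H^{*}] \approx J_H\,\Sigma_\alpha\,J_H^{*}, \\
E[\Delta D] &= 0, &
\Gamma_{\Delta D} &= E[\Delta D\,\Delta D^{*}] \approx J_D\,\Sigma_\beta\,J_D^{*},
\end{align*}
% Assumed shorthand: \Sigma_\alpha = E[\delta_\alpha \delta_\alpha^T],
% \Sigma_\beta = E[\delta_\beta \delta_\beta^T] (the a priori known covariances).
```

The zero means follow from the zero-mean perturbations, and the quadratic forms follow from substituting the first-order terms into the second moments.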

A family of orthogonal basis functions
A generalized type of orthonormal construction will be used for the deconvolution processor F. Some of the advantages of this type of realization for adaptive infinite impulse response (IIR) filters are discussed in [12]. If F is a linear time-invariant stable filter (or smoother), it can be expanded and represented as in (9), with $L_n(q^{-1}, \Lambda_n)$ a complete set of orthonormal basis functions in the Hilbert space $H_2$ of square (Lebesgue) integrable functions on the unit circle $\{z : |z| = 1\}$ and analytic for $|z| > 1$. These basis functions are characterized by the subset $\Lambda_n$ of parameters taken from the general (finite or infinite) set $\Lambda$ of (10), with $\lambda_i \in \mathbb{C}$. The practical idea is to approximate F in (9) with a finite number of terms and a finite set of parameters from $\Lambda$. The basis functions (11), reported in [14], will be used in the expansion (9). In (11), $\nu_n = \sqrt{1 - |\lambda_n|^2}$ is the normalization constant, d is 0 or 1, and $\Lambda_F$ is a finite set of parameters that depend on the function F. The functions in (11) have the property of allowing the inclusion of a variety of modes (different basis parameters, usually coincident with the poles of F). Furthermore, they provide a unifying formulation for almost all known orthonormal constructions used in system identification, such as FIR, Laguerre, and Kautz models. Moreover, methods using balanced realizations of user-chosen dynamics, such as that presented in [21], can also be generated by (11). From a practical point of view, the inclusion of different modes means that F may be exactly represented by (9) with a finite number of terms if the basis parameters are adequately chosen. Another relationship stemming from (11) and useful for implementation purposes is the recursive form (12), where $\eta_{n+1} = \nu_{n+1}/\nu_n$, $C(\lambda_n) = 1 - q^{-1}\lambda_n$, and $\bar{C}(\lambda_n) = q^{-1} - \lambda_n^*$. Equations (11) and (12) are valid for real or complex parameters.
Usually, in linear dynamical systems and for physical considerations, complex poles appear in conjugate pairs and the impulse response of the system is real. In this case, the new basis functions associated with the complex pole pairs are built in a different way. The construction uses linear combinations of those generated by (11), preserving orthogonality and assuring a real-valued impulse response [14]. For each pair of complex poles, and if d = 0, the associated basis has the form (13), together with (14). The other pair of coefficients, grouped in the vector $x_2 = [a \; b]^T$, can then be found as a function of $x_1$ by evaluating (15), where $\rho = (\lambda_n + \lambda_n^*)/(1 + |\lambda_n|^2)$. With these expressions, and if the components of $\Lambda_F$ are real or complex conjugate pairs, the basis functions will have real impulse responses.
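The cascade construction of the basis can be checked numerically. The following is a minimal sketch, assuming real poles only, d = 1 (so the first function is $\nu_0/(1 - \lambda_0 q^{-1})$), and a recursion of the form $L_{n+1} = \eta_{n+1}\,(q^{-1} - \lambda_n)/(1 - \lambda_{n+1} q^{-1})\,L_n$; the function names and the example pole set are hypothetical and are not taken from the paper.

```python
import math


def first_order_section(u, b0, b1, a1):
    """Filter u with y(k) = b0*u(k) + b1*u(k-1) + a1*y(k-1), zero initial state."""
    y, u_prev, y_prev = [], 0.0, 0.0
    for uk in u:
        yk = b0 * uk + b1 * u_prev + a1 * y_prev
        y.append(yk)
        u_prev, y_prev = uk, yk
    return y


def basis_impulse_responses(poles, n_samples):
    """Impulse responses of the basis functions for real poles inside the unit circle."""
    nu = [math.sqrt(1.0 - lam * lam) for lam in poles]
    impulse = [1.0] + [0.0] * (n_samples - 1)
    # First function (assumed d = 1): L_0 = nu_0 / (1 - lam_0 q^-1)
    responses = [first_order_section(impulse, nu[0], 0.0, poles[0])]
    for n in range(len(poles) - 1):
        eta = nu[n + 1] / nu[n]
        # L_{n+1} = eta * (q^-1 - lam_n) / (1 - lam_{n+1} q^-1) * L_n
        responses.append(
            first_order_section(responses[-1], -eta * poles[n], eta, poles[n + 1])
        )
    return responses


def gram_matrix(responses):
    """Inner products of the (truncated) impulse responses."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in responses]
            for r1 in responses]


# Hypothetical pole set: two poles at zero (FIR taps) plus three real modes.
poles = [0.0, 0.0, 0.5, -0.3, 0.8]
gram = gram_matrix(basis_impulse_responses(poles, 400))
max_dev = max(abs(gram[i][j] - (1.0 if i == j else 0.0))
              for i in range(len(poles)) for j in range(len(poles)))
print(max_dev)  # deviation of the Gram matrix from the identity
```

Poles at zero reproduce plain delays, which is why an FIR filter is a special case of this construction.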

Problem formulation
From Figure 1, using (1), and with the system model given by (3), the error sequence e(k) is given by (16). Assuming the signals d(k), p(k), and n(k) are also statistically independent of the model uncertainties and using (5)-(8), the MSE over the model uncertainties becomes $\xi$ in (17), where the operator $E_\Delta[\cdot]$ is the expectation applied only over the uncertainties in the models, $\Delta H$ and $\Delta D$. As a measure of performance, the mean value of $\xi$ over time, $E_k[\cdot]$, is considered, and this is simply the MSE, J in (18), where $\psi$ is the minimum-phase right spectral factor of the spectral factorization (19) [22]. Taking into account the general objective of designing a deconvolution processor robust to parameter uncertainty with an orthogonal structure, the problem formulation may now be summarized in the following statement. Given the system (1), find the causal and stable deconvolution processor $F_o$, with the structure given by (9) and using the orthonormal functions (11), that minimizes the performance index J of (18).

PROBLEM SOLUTION
Theorem 1. For the system (1), the optimal causal and stable deconvolution processor with the orthogonal structure of (9) that minimizes the performance index J given by (18) is given by (20). The maximum number of terms of (20) is $M = 2(N + S + P) + l + 1$, and $\Lambda_o$ is the optimal basis parameter set (21), with $\lambda_z = \{0, \ldots, 0, p_{W1}, \ldots, p_{WP}\}$, composed of $l + 1$ zeros and P additional parameters $p_{Wi}$ that are the poles of W, and with $\lambda_\psi$ defined in (22). The optimal coefficients of (20) are given by (23). Proof. See the appendix.

Comments on these results
This theorem establishes the parameters $\Lambda_o$ and the coefficients $\Theta_o$ that completely define the deconvolver $F_o$ with the orthogonal structure given by (20), together with the maximum number of basis functions required. In this sense, the theorem solves one of the problems usually associated with the approximation of functions with orthogonal bases, namely how the parameters have to be chosen to optimally approximate a desired function [23]. In this case the desired function is the optimal deconvolution processor, and the representation achieved using the basis is exact rather than an approximation. This is so because of the multiple modes (parameters or poles) admitted by the basis functions. Also, these sets of parameters and coefficients represent the best choice, in the MSE sense, for a deconvolver capable of dealing with the whole family of systems described by (3) and (4). Again, in this sense, we say the orthogonal deconvolver is robust to parameter uncertainty in the system. The poles of the orthogonal deconvolver are defined by $\Lambda_o$ in (21). This set is composed of $l + 1$ poles at zero, plus the poles of W, plus the zeros of $\psi$. It can be directly verified that when no noise is present (n(k) = 0 or D = 0), the input is white (W = 1), the delay is l = 0, H is minimum phase, and the parameters are unperturbed, then $F_o = H^{-1}$ and $\Lambda_o$ just groups the zeros of H. For this case, the coefficients $\Theta_o$ will be such that the zeros of the numerator of the rational function resulting from (20) are the poles of H.
In the appendix, during the proof of the theorem, the expression (24) appears as an intermediate result for the optimal deconvolution processor. This expression coincides with that obtained in [7] and may be compared with the classical Wiener filtering results, for example, in [24]. It is particularly useful for analyzing and interpreting some of the characteristics of the optimal deconvolver that finally appear in the orthogonal structure. First, $F_o$ may be considered as a cascade of two filters. The filter $\psi^{-1}$ has an inherent recursive structure that is independent of the delay (see in (19) that $\psi$ is fixed and unique for a given system and shaping filter W). From (24), the filter $\{Qz^{-l}\}_+$ has the poles of W and $l + 1$ poles at zero. When the design delay l changes, only this part of the deconvolution processor varies accordingly. When W = 1, that is, when the input is white noise, the deconvolution processor is a cascade of an FIR filter and an IIR filter. In this case, only the zeros part of $\lambda_z$ will be present. So, if W = 1 and l = 0, the IIR part of the deconvolution processor is the optimal filter up to a scale factor. If l > 0, the deconvolver is a smoother and the FIR part of the processor performs the smoothing while the IIR portion remains unchanged. Any improvement in the performance of the deconvolution processor is generated by the FIR part, and the number of taps of this filter depends directly on the delay l.
An additional comment applies to the structure of the deconvolver. The form of (20) is not the most practical from the point of view of implementation. Using the relation (12), the whole set of basis functions can be generated as a cascade of first-order or second-order filters, depending on whether the poles are real or complex conjugate. This structure is illustrated in Figure 2 for the case when the basis parameters are real. It results in a very modular construction where additional basis functions can be easily incorporated if needed without affecting the existing structure.

Design algorithm
Before considering the incorporation of some adaptive capability to the deconvolver, the steps or algorithm for the optimal robust orthonormal design are summarized.
(1) Given the system and signal descriptions, choose the parameters that will be considered uncertain so as to give a good representation of the measured effects.
(2) Evaluate $\Gamma_{\Delta H}$ and $\Gamma_{\Delta D}$ with (6) and (8), respectively.
(3) Evaluate the spectral factorization (19).
(4) Evaluate (24).
(5) Evaluate the basis parameters (poles) of (21), that is, $l + 1$ zeros, plus the poles of W, plus the zeros of $\psi$, and build the basis.
(6) Evaluate the basis combining coefficients $\Theta$ with (23).
(7) Build the robust orthonormal deconvolution processor with (20) or with the equivalent representation based on the recursive expression (12), as shown, for example, in Figure 2.
The recursive form is preferred from the point of view of implementation and also convenient for the development of the adaptation strategy for the Θ.
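Step (3) of the algorithm above asks for a minimum-phase spectral factor. As a toy illustration only, the sketch below factors a first-order moving-average spectrum $r_1 z + r_0 + r_1 z^{-1}$ in closed form. It is not the rational factorization (19) of the paper, and the function name and numerical values are assumptions chosen for the example.

```python
import math


def ma1_spectral_factor(r0, r1):
    """Return (c0, c1), c0 > 0 and |c1| <= c0, such that
    (c0 + c1 z^-1)(c0 + c1 z) = r1 z + r0 + r1 z^-1.

    The constraint |c1| <= c0 places the zero of c0 + c1 z^-1 inside (or on)
    the unit circle, which is the minimum-phase choice.
    """
    if r0 <= 0.0 or 2.0 * abs(r1) > r0:
        raise ValueError("not a valid (nonnegative) MA(1) spectrum")
    # Matching coefficients gives c0^2 + c1^2 = r0 and c0*c1 = r1, so c0^2 is
    # the larger root of t^2 - r0*t + r1^2 = 0.
    disc = math.sqrt(r0 * r0 - 4.0 * r1 * r1)
    c0 = math.sqrt((r0 + disc) / 2.0)
    c1 = r1 / c0
    return c0, c1


# Spectrum generated by 2 - z^-1 (zero at z = 0.5): r0 = 2^2 + 1^2, r1 = 2 * (-1).
c0, c1 = ma1_spectral_factor(5.0, -2.0)
print(c0, c1)
```

Choosing the smaller root of the quadratic instead would give the maximum-phase factor $1 - 2z^{-1}$ (up to sign), which has the same spectrum but an unstable inverse.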

COEFFICIENTS UPDATE
The robust orthogonal design can handle systems whose perturbation parameters $\delta_\alpha$ and $\delta_\beta$ are small enough for the Taylor series expansion in (4) to remain valid. When the system departs from this region, the MSE performance deteriorates. In order to keep a desired performance for larger perturbations, and also to track slowly time-varying systems, some degree of adaptivity is incorporated by updating only the coefficients of the linear combination of the basis functions. The main assumption is that the nominal or mean model for the system is still valid and representative of the real system, and only the uncertainty region is enlarged. The basis structure and the parameters $\Lambda_o$ remain fixed, and the new set of coefficients $\Theta$ that now approximates the optimal deconvolver will be close to the initial optimal robust design. Figure 1 includes an updating algorithm in the general scheme of the deconvolver, and Figure 3 illustrates the case when W = 1, l > 0, and $\Lambda_o$ is real, so that the deconvolver has the FIR-IIR cascade structure mentioned in the previous section, with the coefficients $\Theta$ being updated by an adaptation algorithm.

Updating algorithm
The coefficients calculated from (23) are now treated as time varying and denoted accordingly as $\Theta(k)$ in (25). The updating algorithm is derived by minimizing an error functional $\sigma(\Theta, k)$, given by (26), that is a function of the coefficients, where X(k), defined in (27), is a generalized regressor composed of the input signal to the deconvolution processor, x(k), filtered by the basis functions.
[Figure 3: deconvolver with the FIR-IIR cascade structure and the adaptive algorithm.]
Depending on the number of zeros in $\lambda_z$, the generalized regressor may include some delayed samples of x(k), as, for example, in the case illustrated in Figure 3. Expanding (26) gives (28), with $U(k) = E\{s(k)X(k)\}$ and $R_I(k) = E\{X(k)X^T(k)\}$. A gradient-based family of adaptive algorithms can be generated by using a coefficient-updating equation of the form (29), where G(k), given by (30), is the gradient vector of the error functional (28) in the coefficient space and $\mu$ is the convergence factor, a small positive real number. Different approaches for the evaluation of an estimate of the true theoretical gradient G(k) result in different algorithms. One of the most popular approaches uses the instantaneous values of U(k) and $R_I(k)$ as estimates of their means, that is, (31). Using (31) in the gradient (30) gives (32), where $e_I(k) = s(k) - \hat{a}(k-l)$ is the instantaneous error of the adaptive structure. Using this estimate of the gradient in (29), the equation for updating the coefficients is (33), and the algorithm may be classified as a transform-domain least mean square (LMS) algorithm [25,26]. With a slight increase in complexity, a recursive least squares or a lattice-like algorithm [27] may also be derived, but this will not be pursued here. The tracking capability and noise performance of this and other types of algorithms, related to these basis functions, have been analyzed in [13] for the application of system modeling. Also, issues related to convergence speed and other properties for orthogonal realizations of IIR filters were discussed in [12].
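The coefficient update described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the update is taken to be $\Theta(k+1) = \Theta(k) + 2\mu\, e_I(k) X(k)$, the poles are real and fixed, d = 1, and the channel $H = 1 - 0.5q^{-1}$, the step size, and all names are hypothetical choices for a noise-free test with W = 1 and l = 0.

```python
import math
import random


def lms_fixed_pole_deconvolver(x, s, poles, mu):
    """Adapt only the combining coefficients theta; the basis poles stay fixed.

    x: input to the deconvolver, s: reference signal, poles: real basis poles.
    """
    n = len(poles)
    nu = [math.sqrt(1.0 - p * p) for p in poles]
    theta = [0.0] * n
    reg = [0.0] * n  # generalized regressor X(k-1): previous basis outputs
    for xk, sk in zip(x, s):
        # First section (assumed d = 1): nu_0 / (1 - lam_0 q^-1)
        new_reg = [nu[0] * xk + poles[0] * reg[0]]
        for i in range(1, n):
            eta = nu[i] / nu[i - 1]
            # Section i: eta * (q^-1 - lam_{i-1}) / (1 - lam_i q^-1),
            # driven by the output of section i-1.
            new_reg.append(
                eta * (reg[i - 1] - poles[i - 1] * new_reg[i - 1])
                + poles[i] * reg[i]
            )
        reg = new_reg
        err = sk - sum(t * r for t, r in zip(theta, reg))  # instantaneous error
        theta = [t + 2.0 * mu * err * r for t, r in zip(theta, reg)]
    return theta


random.seed(0)
a = [random.gauss(0.0, 1.0) for _ in range(5000)]          # white input, W = 1
x = [a[k] - 0.5 * (a[k - 1] if k > 0 else 0.0) for k in range(len(a))]
# Basis poles: one pole at zero (l + 1 = 1) plus the channel zero at 0.5.
theta = lms_fixed_pole_deconvolver(x, a, poles=[0.0, 0.5], mu=0.02)
print(theta)
```

In this noise-free case the exact inverse $1/(1 - 0.5q^{-1})$ lies in the span of the two basis functions, so the coefficients should settle near $[1,\ 0.5/\sqrt{0.75}]$ using only two adapted parameters.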

EXAMPLE: LINEAR ROBUST ADAPTIVE EQUALIZATION FOR AN ADSL TYPE OF COMMUNICATION CHANNEL
The general problem of equalization and particularly adaptive equalization is well described in [28], and a review with comparisons between recursive and nonrecursive techniques is given in [29]. Linear equalization is a particular case of the general deconvolution problem where T = 0. Additionally, the reference signal s(k) (a delayed version of a(k)) is generated as the output of a decision device in the receiver, assuming the decisions are correct. Figure 4 illustrates the adaptive linear equalization setup. The parts of the diagram in dashed lines represent the practical implementation for the generation of the reference signal in the receiver. The following simplifying assumptions are made to design the equalizer for this example: the design delay is l = 1 and the data sequence is a white noise signal, W = 1. The modeling assumptions are discussed first, then the robust orthogonal design is shown, and finally adaptation is considered. Performance comparisons are presented in these steps.
[Figure 4: adaptive linear equalization setup, showing the system, the noise input n(k), the deconvolution processor, and the adaptive algorithm.]

Modeling
Figure 5 shows the frequency response (FR, normalized to 0 dB at zero frequency) of a subscriber telephone loop with a length of 2.9 km (24 AWG gauge) and a bridge tap of 100 meters of 26 AWG gauge, used in this example for asymmetric digital subscriber line (ADSL) transmissions. It was generated from the chain matrix characterization for this type of channel [30], with a bandwidth that extends to 1.104 MHz.
The FR exhibits a notch around a frequency of 500 kHz. The frequency location of this notch is related to the minimum of the input impedance presented by an open-circuited section of cable at frequencies for which its length is an odd number of quarter wavelengths. The attenuation or depth of the notch is proportional to the length and to the square root of the notch frequency. Also included in the same figure is the FR of a discrete third-order model designed to approximate the analog response. This model, given in the transform domain by (34), is characterized by the nominal parameter vector (35). The response of this model is within 4 dB of the real FR curve, and it will be used for the purpose of illustrating the potential performance of the proposed linear deconvolver. Nevertheless, it should not be considered as a reference model for general ADSL systems or digital subscriber loops [31].
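The quarter-wavelength condition can be turned into a quick arithmetic check. The sketch below assumes a propagation velocity of 2 × 10^8 m/s, a typical order of magnitude for twisted pair that is not stated in the text; the function name is hypothetical.

```python
def bridge_tap_notch_frequencies(tap_length_m, velocity_m_per_s, count):
    """Frequencies at which an open-circuited tap is an odd number of
    quarter wavelengths long: f_m = (2m + 1) * v / (4 * L)."""
    return [(2 * m + 1) * velocity_m_per_s / (4.0 * tap_length_m)
            for m in range(count)]


# 100 m bridge tap, assumed velocity of 2e8 m/s.
notches = bridge_tap_notch_frequencies(100.0, 2.0e8, 2)
print(notches)  # [500000.0, 1500000.0]
```

Under this assumed velocity, the first notch falls at 500 kHz, in line with the notch described above, and the next one at 1.5 MHz lies beyond the 1.104 MHz bandwidth considered.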
The effect of variations of the individual numerator coefficients of H on the FR is illustrated in Figure 6. Perturbations of $b_0$ have important effects on the depth of the notch and the gain of the high-frequency portion of the response. Changes in $b_1$ seem to affect the whole response in a rather mild way, preserving the basic shape and modifying the location of the notch. The coefficients $b_2$ and $b_3$ affect both the location and depth of the notch but do not have much influence on the low-frequency portion of the response.
Although H is not a physical model and its parameters are not necessarily related to the loop parameters, the family of responses or channels generated by the changes in these parameters can be associated with the uncertainties that arise when attempting to describe the loop. Usually the length, the exact location of bridge taps, and the precise conformation of the loop are not known. Additionally, most parameters are indirectly determined by impedance measurements. All these facts add up and make the determination of the exact response of the channel a difficult task. Uncertainties arise naturally about the overall gain of the loop and the location and depth of the notch, even though the shape (or mean value) of the response will not suffer considerable changes. Thus, it seems reasonable to consider an uncertain description for the channel as follows. The model (34) represents the nominal channel and $b_1$ the perturbed parameter. In this way, variations in $b_1$ model potential uncertainties, without distorting the basic shape of the FR over the whole range of frequencies of interest.
One of the most severe types of interference in ADSL is the near-end crosstalk (NEXT) produced by the voltages and currents induced in the line by nearby pairs of wires [30,32]. The "average and asymptotic" NEXT power is proportional to $f^{1.5}$ and depends on some parameters of the particular line. A first-order ARMA model $D = (d_0 + d_1 z^{-1})/(1 + c_1 z^{-1})$ is used to shape the white noise sequence n(k) with a power spectrum similar to the NEXT interference. This filter is characterized by its own parameter vector. To control the signal-to-noise ratio (SNR) at the input of the equalizer, the variance or power of the signal measured at the output of the channel H, $\sigma_y^2$, is normalized to 1, and the gain of filter D is set in accordance with the SNR definition, where $\sigma_v^2$ is the variance of the colored noise at the output of D.
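The gain setting described above can be sketched numerically. The example assumes the usual definition SNR = 10 log10($\sigma_y^2/\sigma_v^2$), since the source's own definition is not reproduced here; the coefficient values are hypothetical placeholders rather than the paper's NEXT model, and the function names are introduced only for this illustration.

```python
import math


def arma11_output_variance(d0, d1, c1, input_variance=1.0):
    """Variance of white noise filtered by (d0 + d1 z^-1) / (1 + c1 z^-1).

    Impulse response: h(0) = d0, h(k) = (d1 - c1*d0) * (-c1)^(k-1) for k >= 1,
    so the sum of squares is d0^2 + (d1 - c1*d0)^2 / (1 - c1^2).
    """
    if abs(c1) >= 1.0:
        raise ValueError("the filter must be stable (|c1| < 1)")
    return input_variance * (d0 * d0 + (d1 - c1 * d0) ** 2 / (1.0 - c1 * c1))


def noise_gain_for_snr(snr_db, d0, d1, c1, signal_variance=1.0, input_variance=1.0):
    """Gain applied to D so that 10*log10(signal_variance / noise_variance) = snr_db."""
    target_noise_variance = signal_variance / 10.0 ** (snr_db / 10.0)
    return math.sqrt(
        target_noise_variance / arma11_output_variance(d0, d1, c1, input_variance)
    )


# Hypothetical coefficients; signal variance normalized to 1 as in the text.
gain = noise_gain_for_snr(35.0, d0=1.0, d1=-0.9, c1=0.5)
noise_variance = gain ** 2 * arma11_output_variance(1.0, -0.9, 0.5)
achieved_snr_db = 10.0 * math.log10(1.0 / noise_variance)
print(gain, achieved_snr_db)
```

Scaling D by a constant gain leaves the shape of the noise spectrum unchanged, so the SNR can be tuned independently of the spectral model.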
For adaptive equalization, transversal FIR filters are the standard choice for many reasons [22,27,28,33], so comparisons with classical fixed recursive and adaptive FIR designs are made. First, the number of coefficients required for an FIR equalizer will be evaluated. Figure 7 shows the minimum MSE (MMSE) attainable as a function of the number of taps used for the equalizer. The family of curves is parameterized by the SNR. The MSE is limited by the SNR, so for low SNR, the performance of the equalizer is necessarily poor and only a few coefficients in the FIR are enough to attain the optimal performance. As the SNR rises, the number of taps needed to reach the MMSE is larger. If an SNR of 80 dB is considered the "no-noise design," then a minimum of 50 taps will be required by the FIR to approximate the optimal response.

Robust orthogonal design
Under the same design conditions, a similar analysis can be performed for the robust equalizer using the variance of the uncertain parameter $b_1$ as a "tuning knob." Figure 8 shows the MMSE attainable with the robust equalizer as a function of the SNR. The curves are parameterized by the variance $\sigma_{b_1}^2$. For low SNR, even the unperturbed IIR design (the lower curve, for $\sigma_{b_1}^2 = 0.00001$, is almost coincident with the unperturbed design) has poor performance, with an MSE that is nearly in a one-to-one relation with the SNR. The curves show that the design variance has to be below 0.001 to obtain an MSE that is under −100 dB, that is, to obtain a performance similar to the FIR for the "no-noise design." The effect of the variance of the parameter $b_1$ in the design may be better appreciated in Figure 9, which illustrates the MSE when the parameter $b_1$ departs from its nominal value for an SNR of 35 dB. The solid-line curves correspond to the fixed nominal (unperturbed) IIR and 50-tap FIR designs. These two curves overlap, confirming that the FIR filter can approximate the optimal recursive equalizer very well. The dashed-line curves correspond to robust designs for different values of $\sigma_{b_1}^2$ (the lower error curve corresponds to $\sigma_{b_1}^2 = 0.001$). For higher variances, the designs are more conservative, the MSE grows, and the curves tend to be "flatter." The performance is worst around the nominal value of the parameter but improves and even exceeds that of the nominal designs for larger deviations of $b_1$. This is very reasonable, since robustness against channel uncertainty is obtained at the expense of performance at the nominal value. These curves can be directly compared and coincide with those obtained using the approach of [7].
From the previous analysis we select $\sigma_{b_1}^2 = 0.001$, and the steps of the design algorithm for an SNR of 35 dB are as follows.

Robust and adaptive design
The previous design procedure did not incorporate coefficient adaptation. When adaptive equalization is considered, the FIR coefficients and the robust orthogonal design coefficients $\Theta$ are updated by an adaptation algorithm. The performance of the equalizers is evaluated in terms of the MMSE attainable when $b_1$ changes around the nominal design value slowly with time (when compared to the convergence speed of the algorithms). Figures 10 and 11 depict these results for two different SNRs. In Figure 10 the SNR is 35 dB and the MMSE for the nominal design (solid line) is around −20 dB. The adaptive FIR is plotted with a dashed line and exhibits a great improvement in performance when compared to fixed designs. This improvement is obtained at the cost of adapting all 50 coefficients. The dashed-dotted line corresponds to the robust adaptive orthogonal design, which also improves on the performance of the robust designs of Figure 9. It has almost the same performance as the FIR for positive perturbations of $b_1$, but its MSE is over 4 dB higher for negative variations of more than 50% in the channel parameter. The main structure of the equalizer is fixed and only 6 coefficients are updated to obtain this performance. Figure 10 also includes the response of a deconvolver designed as detailed in the previous sections but with an overparameterized orthogonal basis representation (dotted line). Additional parameters are added to the optimal $\Lambda_o$ to improve the performance for large deviations in $b_1$. The coefficients $\theta_i$ associated with these new parameters are almost zero when the perturbations are small and start to have significant values for larger departures. The optimal selection of these additional parameters is related in this particular case to the zeros of H, and more generally to the zeros of $\psi$, which change as the system is perturbed. In this example, the added parameters are {0.2225 ± 0.9045i}.
This means that the total number of coefficients to be adapted is 8 and the performance is almost the same as for the 50-tap adaptive FIR in the whole range of variation of b 1 . Figure 11 shows the MSE when the SNR is 50 dB. Again the performance of the robust adaptive orthogonal designs approaches the totally adaptive FIR, with only 6 to 8 adaptive coefficients.
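
The partially adaptive idea discussed above (fixed poles, a few adapted linear coefficients) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the ADSL example of the paper: the toy channel 1 + 0.5 q^{-1}, the real pole set [-0.5, 0.0], the step size, and the function names `basis_outputs` and `lms_on_basis`. The basis is a cascade of normalized one-pole and all-pass sections with real poles, which may differ in form from the basis (11), and the update is a plain LMS recursion on the combining coefficients only.

```python
import numpy as np

def basis_outputs(u, poles):
    """Regressors phi[k, n]: input u filtered by the k-th fixed-pole basis function.

    Stage k is a normalized one-pole section preceded by the all-pass
    sections of stages 0..k-1 (real poles only in this sketch).
    """
    u = np.asarray(u, dtype=float)
    phi = np.zeros((len(poles), len(u)))
    for k, p in enumerate(poles):
        gain = np.sqrt(1.0 - p * p)
        v = np.zeros_like(u)  # output of the normalized one-pole section
        a = np.zeros_like(u)  # output of the all-pass section feeding stage k+1
        for n in range(len(u)):
            v_prev = v[n - 1] if n > 0 else 0.0
            a_prev = a[n - 1] if n > 0 else 0.0
            u_prev = u[n - 1] if n > 0 else 0.0
            v[n] = p * v_prev + gain * u[n]            # gain / (1 - p q^-1)
            a[n] = p * a_prev + u_prev - p * u[n]      # (q^-1 - p) / (1 - p q^-1)
        phi[k] = v
        u = a
    return phi

def lms_on_basis(y, d, poles, mu):
    """LMS update of the combining coefficients only; the poles stay fixed."""
    phi = basis_outputs(y, poles)
    theta = np.zeros(len(poles))
    err = np.zeros(len(y))
    for n in range(len(y)):
        err[n] = d[n] - theta @ phi[:, n]
        theta += mu * err[n] * phi[:, n]
    return theta, err

# Toy setup (hypothetical): binary symbols through 1 + 0.5 q^-1 plus weak noise.
rng = np.random.default_rng(0)
d = rng.choice([-1.0, 1.0], size=5000)
y = d + 0.5 * np.concatenate(([0.0], d[:-1])) + 0.01 * rng.standard_normal(d.size)

theta, err = lms_on_basis(y, d, poles=[-0.5, 0.0], mu=0.02)
mse_start = np.mean(err[:100] ** 2)
mse_end = np.mean(err[-500:] ** 2)
print(theta, mse_start, mse_end)
```

In this toy case the first pole coincides with the pole of the channel inverse, so the first coefficient should settle near 1/sqrt(0.75) while the coefficient of the extra basis function stays close to zero, which loosely mirrors the behaviour reported for the over-parameterized design under small perturbations.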

Remarks
The example assumes that the uncertainty is described by a single perturbed parameter, purely for clarity and ease of tuning. The procedure applies in the same way when more than one parameter is perturbed, including the coefficients of the denominators of H and D.
Over-parameterization of the basis functions may give a significant improvement in performance at little additional cost. A systematic procedure for the optimal selection of the additional parameters is currently a subject of research, along with the potential problems of this type of over-parameterized adaptive and recursive structures [12,25].
Possible extensions of the orthogonal adaptive structure to more specific applications in communications include the design of decision feedback equalizers. The feedforward and feedback filters of the DFE can be given an orthogonal structure with the basis (11) and a coefficient-updating strategy, similar to that of Section 4, used to make both filters partially adaptive. The feasibility of this approach was initially investigated in [34] and could be used for comparisons with the robust fixed designs of Lin et al. in [6] or Sternad et al. in [4], since both of these approaches deal with the problem of robust DFE design. Also in this area, the partially adaptive recursive structure for the feedback filter of the DFE may be a good alternative to long FIR adaptive filters [11,31,35]. Other applications and performance analysis of this approach are currently the subject of further research.

CONCLUSIONS
A design strategy for a general SISO robust orthogonal adaptive deconvolution processor has been presented. The approach reformulates and combines results from the fields of robust filtering, function approximation with orthogonal basis, and adaptive filtering. The design exhibits several of the advantages related to these fields: (i) it is robust to parameter uncertainties in the system model; (ii) it is recursive and requires a smaller number of parameters than FIR counterparts for similar performance, hence the total computational burden is also smaller than for adaptive FIR designs; (iii) it has an orthogonal structure with good numerical properties and is very modular from an implementation point of view; (iv) it is adaptive and recursive but with a fixed pole structure, so it does not have the potential stability problems of adaptive IIR filters [25,27]; and (v) the complexity of the algorithms used for updating the coefficients is comparable to that of the algorithms used for FIR adaptive filters.
The main result was presented in the form of a theorem that puts together the design of the robust recursive deconvolver with an orthogonal basis representation and establishes how the basis parameters have to be selected.
An example was presented with the design of an equalizer for a simple ADSL channel model with NEXT interference. The simulation results show that the proposed design can extend the range of operation of fixed linear designs. It performs as well as FIR designs, which require many more adaptive coefficients to yield acceptable results. Moreover, it was shown that the performance can be further improved by an over-parameterization of the orthogonal basis with a small increase in the number of adaptive parameters.
We summarize our contribution as having presented a design procedure for a filtering structure with a good tradeoff between computational burden and performance, under a wide variety of conditions and uncertainties with applications throughout the signal processing, communications, and control fields.

PROOF OF THEOREM 1
The proof proceeds in two steps. First, a general expression for the optimal deconvolver is obtained, and second, the exact representation of this processor by means of the orthogonal basis is developed. For the first step, the minimization will be performed using the calculus of variations methodology [36] following a procedure close to that of [6,7].
Initially, a perturbation to the unknown optimal processor $F_o$ is introduced in the form

$$F(q^{-1}) = F_o(q^{-1}) + \kappa\,\zeta(q^{-1}), \tag{A.1}$$

where $\zeta(q^{-1})$ is an arbitrary realizable function, rational in $q^{-1}$ and analytic on and outside the unit circle, and $\kappa$ is a small bounded real constant. Replacing (A.1) in (18) and using Parseval's theorem to express the performance index in the transform domain gives the expression for $J$ in (A.2). The simultaneous necessary and sufficient conditions to be satisfied for a minimum in $J$ are [36]

$$\left.\frac{\partial J}{\partial \kappa}\right|_{\kappa=0} = 0, \tag{A.3}$$

$$\left.\frac{\partial^2 J}{\partial \kappa^2}\right|_{\kappa=0} > 0. \tag{A.4}$$

The condition imposed by (A.4) means that the integral

$$\frac{1}{2\pi j}\oint_{|z|=1} \zeta^{*}\psi^{*}\psi\,\zeta\,\frac{dz}{z} \tag{A.5}$$

must be positive. This is always satisfied because the integrands involved have the symmetric form $f^{*}(z^{-1})f(z^{-1})$ and are always greater than zero when integrated over the unit circle.

The condition of (A.3) implies that the integral in (A.6) must be equal to zero. Defining $Q = \sigma_d^2 W^{*} W H^{*} (\psi^{*})^{-1}$, applying the $\{\cdot\}_+$ and $\{\cdot\}_-$ operators, and writing condition (A.6) as two integrals gives the equivalent condition (A.7). Applying Cauchy's theorem, (A.7) is satisfied if the part of the integrand singled out in (A.8) is zero, since it is the only one that may have poles inside the unit circle. Finally, from (A.8), the expression for the optimal robust deconvolution processor is

$$F_o = \psi^{-1}\{Q z^{-l}\}_+. \tag{A.9}$$

Three comments on this result follow. First, a realizable $F_o$ can only eliminate the parts of the integrand that involve $\{\cdot\}_+$ terms (the filter is analytic outside the unit circle and, as such, causal). Second, the $\{\cdot\}_-$ term is a rational function starting with a free $z$ that cancels the pole at the origin of the integrand in (A.7). Third, for symmetry reasons, if one of the integrals in (A.7) is zero, so is the other. This derivation is the one usually followed in the classical Wiener filtering approach (see, e.g., [37]), and the reader may compare this result with those in [6,7,24].
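
The causal bracket operator {·}+ used in this derivation can be made concrete with a small numerical sketch. It relies on assumptions that are not taken from the paper: the toy function Q(z) = 1/((1 − a z^{-1})(1 − b z)) with a = 0.6 and b = 0.4, the grid size, and the helper name `causal_part`. The function is sampled on the unit circle, its two-sided impulse response is recovered with an inverse FFT, and only the terms in z^0, z^{-1}, z^{-2}, ... are kept, in line with the remark that the {·}− term starts with a free z.

```python
import numpy as np

def causal_part(Q_on_circle):
    """Approximate {Q}_+ from samples Q(e^{jw_k}), w_k = 2*pi*k/N.

    The inverse FFT returns the (periodized) two-sided impulse response:
    indices 0..N/2-1 hold the causal terms, indices N/2..N-1 hold the
    anticausal ones. Zeroing the second half keeps the causal part only.
    """
    N = len(Q_on_circle)
    q_plus = np.fft.ifft(Q_on_circle)
    q_plus[N // 2:] = 0.0
    return np.fft.fft(q_plus), q_plus

# Hypothetical example: one pole inside (at a) and one outside (at 1/b) the unit circle.
a, b, N = 0.6, 0.4, 1024
z = np.exp(2j * np.pi * np.arange(N) / N)
Q = 1.0 / ((1.0 - a / z) * (1.0 - b * z))

Q_plus, q_plus = causal_part(Q)

# Partial fractions give {Q}_+ = (1 / (1 - a*b)) / (1 - a z^-1) in closed form.
Q_plus_exact = (1.0 / (1.0 - a * b)) / (1.0 - a / z)
print(np.max(np.abs(Q_plus - Q_plus_exact)))
```

The closed form used for comparison follows from Q = (1/(1 − ab)) [1/(1 − a z^{-1}) + b z/(1 − b z)], where the second term contains only positive powers of z and is therefore discarded by {·}+.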
We now analyze the operations involved in (A.9) to establish bounds on the maximum degree of the polynomials that make up the optimal deconvolver and to justify the parameter assignment of the orthogonal basis.
Recalling that H and D are up to O(N) and O(S), respectively, and assuming that at least one of the denominator parameters of these functions is perturbed, then, from (6), $\Gamma_{\Delta H}$ is rational with a numerator polynomial of O(N, N) and a denominator of O(2N, 2N). For $\Gamma_{\Delta D}$, the degrees are O(S, S) for the numerator and O(2S, 2S) for the denominator. Both polynomials of the rational spectral factorization $\psi^{*}$ of (19) have degrees upper bounded by O(2N + 2S + P, 2N + 2S + P). Performing the product to calculate $Qz^{-l}$ results in a rational function of O(P + l, 3N + 2S + P) for both numerator and denominator polynomials. All the poles of Q are outside the unit circle, except those contributed by W, of O(P). Thus, $\{Qz^{-l}\}_+$ is O(P) if l = 0, or O(P + l) for the numerator polynomial if l > 0. The optimal deconvolver is then formed by a cascade of two filters. One is $\psi^{-1}$, of maximum O(2N + 2S + P). The other is $\{Qz^{-l}\}_+$, with a maximum number of terms for the numerator polynomial of P + l.
To represent the optimal deconvolver exactly by means of the basis functions (11), the parameter set Λ has to be assigned to match the pole structure of $F_o$. This means setting the P poles of W, plus l + 1 poles at the origin to account for the delay, plus 2N + 2S + P additional poles that are the zeros of ψ. This justifies the parameter assignment of (21) and the number of terms involved in (20).
Finally, (23) is the standard inner product used to determine the coefficients that linearly combine the basis functions once the parameters have been optimally assigned. This concludes the proof.
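
The role of the inner product in fixing the combining coefficients can be checked numerically with a short sketch. It rests on assumptions that are not taken from the paper: the pole set [0.5, −0.3, 0.0], the target filter, a Takenaka-Malmquist-type construction that may differ in form from the basis (11), and a discretized version of the usual inner product on the unit circle, which is what (23) is assumed to be. The names `tm_basis` and `inner` are hypothetical.

```python
import numpy as np

def tm_basis(poles, z):
    """Fixed-pole orthonormal basis functions evaluated at points z on the unit circle."""
    zi = 1.0 / z
    basis = []
    allpass = np.ones_like(z)
    for p in poles:
        basis.append(np.sqrt(1.0 - abs(p) ** 2) / (1.0 - p * zi) * allpass)
        allpass = allpass * (zi - np.conj(p)) / (1.0 - p * zi)
    return np.array(basis)

def inner(F, G):
    """(1/2pi) * integral of F * conj(G) over the unit circle, on a uniform grid."""
    return np.mean(F * np.conj(G))

N = 4096
z = np.exp(2j * np.pi * np.arange(N) / N)
poles = [0.5, -0.3, 0.0]  # hypothetical; the last pole is deliberately redundant
B = tm_basis(poles, z)

# Gram matrix of the basis: close to the identity if the functions are orthonormal.
gram = B @ B.conj().T / N

# Target filter whose poles (0.5 and -0.3) are matched by the first two basis functions.
F = 1.0 / ((1.0 - 0.5 / z) * (1.0 + 0.3 / z))

theta = np.array([inner(F, Bk) for Bk in B])  # coefficients from inner products
F_hat = theta @ B                             # reconstruction from the basis
max_err = np.max(np.abs(F - F_hat))
print(np.round(theta, 6), max_err)
```

Because the basis poles match those of the target, the reconstruction error should be at the level of rounding noise, and the coefficient of the redundant third basis function should be essentially zero.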