Efficient Channel Shortening Equalizer Design

Abstract: Time-domain equalization is crucial in reducing channel state dimension in maximum likelihood sequence estimation, and inter-carrier and inter-symbol interference in multicarrier systems. A time-domain equalizer (TEQ) placed in cascade with the channel produces an effective impulse response that is shorter than the channel impulse response. This paper analyzes two TEQ design methods amenable to cost-effective real-time implementation: minimum mean squared error (MMSE) and maximum shortening SNR (MSSNR) methods. We reduce the complexity of computing the matrices in the MSSNR and MMSE designs by a factor of 140 and a factor of 16 (respectively) relative to existing approaches, without degrading performance. We prove that an infinite length MSSNR TEQ with unit norm TEQ constraint is symmetric. A symmetric TEQ halves FIR implementation complexity, enables parallel training of the frequency-domain equalizer and TEQ, reduces TEQ training complexity by a factor of 4 and doubles the length of the TEQ that can be designed using fixed-point arithmetic, with only a small loss in bit rate. Simulations are presented for designs with a symmetric TEQ or target impulse response.


I. Introduction
Channel shortening, a generalization of equalization, has recently become necessary in receivers employing multicarrier modulation (MCM) [1]. MCM techniques like orthogonal frequency division multiplexing (OFDM) and discrete multi-tone (DMT) have been deployed in applications such as the wireless LAN standards IEEE 802.11a and HIPERLAN/2, Digital Audio Broadcast (DAB) and Digital Video Broadcast (DVB) in Europe, and asymmetric and very-high-speed digital subscriber loops (ADSL, VDSL). MCM is attractive due to the ease with which it can combat channel dispersion, provided the channel delay spread is not greater than the length of the cyclic prefix (CP). However, if the cyclic prefix is not long enough, the orthogonality of the sub-carriers is lost, causing inter-carrier interference (ICI) and inter-symbol interference (ISI).
A well-known technique to combat the ICI/ISI caused by the inadequate CP length is the use of a time-domain equalizer (TEQ) in the receiver front end. The TEQ is a finite impulse response filter that shortens the channel so that the delay spread of the combined channel-equalizer impulse response is not longer than the CP length. The TEQ design problem has been extensively studied in the literature [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. In [3], Falconer and Magee proposed a minimum mean-square-error (MMSE) method for channel shortening, which was designed to reduce the complexity in maximum likelihood sequence estimation. More recently, Melsa, Younce, and Rohrs [5] proposed the maximum shortening SNR (MSSNR) method, which attempts to minimize the energy outside the window of interest while holding the energy inside fixed. This approach was generalized to the min-ISI method in [9], which allows the residual ISI to be shaped in the frequency domain. A blind, adaptive algorithm that searches for the TEQ maximizing the SSNR cost function was proposed in [10].
Channel shortening also has applications in maximum likelihood sequence estimation (MLSE) [13] and multiuser detection [14]. For MLSE, for an alphabet of size A and an effective channel length of $L_c + 1$, the complexity of MLSE grows as $A^{L_c}$. One method of reducing this enormous complexity is to employ a prefilter to shorten the channel to a manageable length [2], [3]. Similarly, in a multiuser system with a flat fading channel for each user, the optimum detector is the MLSE, yet complexity grows exponentially with the number of users. "Channel shortening" can be implemented to suppress a specified number of the scalar channels, effectively reducing the number of users to be detected by the MLSE [14]. In this context, "channel shortening" means reducing the number of scalar channels rather than reducing the number of channel taps. In this paper we focus on channel shortening for ADSL systems, but the same designs can be applied to channel shortening for the MLSE and for multiuser detectors.
This paper examines the MSSNR and MMSE methods of channel shortening. The structure of each solution is exploited to dramatically reduce the complexity of computing the TEQ. Previous work on reducing the complexity of the MSSNR design was presented in [8]. This work exploited the fact that the matrices involved are almost Toeplitz, so the (i + 1, j + 1) element can be computed efficiently from the (i, j) element. Our proposed method makes use of this, but focuses rather on determining the matrices and eigenvector for a given delay based on the matrices and eigenvector computed for the previous delay.
In addition, we examine exploiting symmetry in the TEQ and in the target impulse response (TIR). In [15], it was shown that the MSSNR TEQ and the MMSE TIR were approximately symmetric. In [16] and [17], simulations were presented for algorithms that forced the MSSNR TEQ to be perfectly symmetric or skew-symmetric. This paper proves that the infinite-length MSSNR TEQ with a unit norm constraint on the TEQ is perfectly symmetric. We show how to exploit this symmetry in computing the MMSE TIR, adaptively computing the MSSNR TEQ, and in computing the frequency-domain equalizer (FEQ) in parallel with the TEQ.
The remainder of this paper is organized as follows. Section II presents the system model and notation. Section III reviews the MSSNR and MMSE designs. Section IV discusses methods of reducing the computation of each design without a performance loss. Section V examines symmetry in the impulse response, and Section VI shows how to exploit this symmetry to further reduce the complexity, though with a possible small performance loss. Section VII provides simulation results, and Section VIII concludes the paper.

TABLE I
Notation (surviving entries). $H_{wall}$: $H$ with rows ∆ through ∆ + ν removed. $[A]_{i,j}$: element i, j of matrix A. $A^*$, $A^T$, $A^H$: conjugate, transpose, and Hermitian.

II. System Model and Notation
The multicarrier system model is shown in Fig. 1, and the notation is summarized in Table I. Each block of bits is divided up into N bins, and each bin is viewed as a QAM signal that will be modulated by a different carrier. An efficient means of implementing the multicarrier modulation in discrete time is to use an inverse fast Fourier transform (IFFT). The IFFT converts each bin (which acts as one of the frequency components) into a time-domain signal. After transmission, the receiver can use an FFT to recover the data within a bit error rate tolerance, provided that equalization has been performed properly.
In order for the subcarriers to be independent, the convolution of the signal and the channel must be a circular convolution. It is actually a linear convolution, so it is made to appear circular by adding a cyclic prefix to the start of each data block. The cyclic prefix is obtained by prepending the last ν samples of each block to the beginning of the block. If the CP is at least as long as the channel, then the output of each subchannel is equal to the input times a scalar complex gain factor. The signals in the bins can then be equalized by a bank of complex gains, referred to as a frequency domain equalizer (FEQ) [18].
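The role of the cyclic prefix can be checked numerically. The following is a minimal sketch, assuming a hypothetical 8-sample block and a hypothetical 3-tap channel (neither is taken from the paper): with a CP at least as long as the channel order, the received block after CP removal equals the circular convolution of the block with the channel, and with a CP that is too short it does not.

```python
def linear_convolve(x, h):
    """Full linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def circular_convolve(x, h):
    """N-point circular convolution, N = len(x), assuming len(h) <= N."""
    n = len(x)
    return [sum(h[j] * x[(i - j) % n] for j in range(len(h))) for i in range(n)]

def add_cyclic_prefix(block, nu):
    """Prepend the last nu samples of the block to its beginning."""
    return block[len(block) - nu:] + block

# Hypothetical data block (N = 8) and channel of order 2 (3 taps).
block = [1.0, -2.0, 3.0, 0.5, -1.0, 2.0, 4.0, -3.0]
h = [1.0, 0.5, -0.25]
expected = circular_convolve(block, h)

def received_block(nu):
    """Transmit CP + block through the channel, then discard the CP."""
    rx = linear_convolve(add_cyclic_prefix(block, nu), h)
    return rx[nu:nu + len(block)]

def matches(a, b):
    return all(abs(p - q) < 1e-12 for p, q in zip(a, b))

cp_long_enough_works = matches(received_block(2), expected)  # nu = channel order
cp_too_short_works = matches(received_block(1), expected)    # nu < channel order
```

In this sketch only the first received sample of the block differs when the CP is one sample too short, which is the residual interference that the TEQ is meant to remove.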
The above discussion assumes that CP length + 1 is greater than or equal to the channel length. However, transmitting the cyclic prefix wastes time slots that could be used to transmit data. Thus, the CP is usually set to a reasonably small value, and a TEQ is employed to shorten the channel to this length. In ADSL and VDSL, the CP length is 1/16 of the block (symbol) length. As discussed in Section I, TEQ design methods have been well explored [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12].
One of the TEQ's main burdens, in terms of computational complexity, is due to the parameter ∆, which is the desired delay of the effective channel. The performance of most TEQ designs does not vary smoothly with delay [19], hence a global search over delay is required in order to compute an optimal design. Since the effective channel has $L_c + 1$ taps, there are $L_c + 1 - \nu$ locations in which one can place a length ν + 1 window of non-zero taps, hence $0 \le \Delta \le L_c - \nu$. For typical downstream ADSL parameters, this means there are about 500 delay values to examine, and an optimal solution must be computed for each one. One of the goals of this paper is to show how to reuse computations from one value of ∆ to the next, greatly reducing this computational burden.

III. Review of the MSSNR and MMSE designs
This section reviews the MSSNR and MMSE designs for channel shortening.

A. The MSSNR solution
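Consider the maximum shortening SNR (MSSNR) TEQ design [5]. This technique attempts to maximize the ratio of the energy in a window of the effective channel over the energy in the remainder of the effective channel. Let $H$ denote the $(L_c + 1) \times \tilde{L}_w$ channel convolution matrix with elements $[H]_{i,j} = h(i - j)$, where $\tilde{L}_w = L_w + 1$ is the number of TEQ taps. Following [5], we define

$$H_{win} = \text{rows } \Delta \text{ through } \Delta + \nu \text{ of } H \tag{1}$$

and

$$H_{wall} = H \text{ with rows } \Delta \text{ through } \Delta + \nu \text{ removed.} \tag{2}$$

Thus, $c_{win} = H_{win} w$ yields a length ν + 1 window of the effective channel, and $c_{wall} = H_{wall} w$ yields the remainder of the effective channel. The MSSNR design problem can be stated as "minimize $\|c_{wall}\|$ subject to the constraint $\|c_{win}\| = 1$," as in [5]. This reduces to

$$\min_{w} \; w^T A w \quad \text{subject to} \quad w^T B w = 1, \tag{3}$$

where

$$A = H_{wall}^T H_{wall}, \qquad B = H_{win}^T H_{win} \tag{4}$$

are real, symmetric $\tilde{L}_w \times \tilde{L}_w$ matrices. A is invertible, but B may not be [20]. An alternative formulation that addresses this is to "maximize $\|c_{win}\|$ subject to the constraint $\|c_{wall}\| = 1$," [20] which works well even when B is not invertible. The alternative formulation reduces to

$$\max_{w} \; w^T B w \quad \text{subject to} \quad w^T A w = 1, \tag{5}$$

where A and B are defined in (4). Solving (3) leads to a TEQ that satisfies the generalized eigenvector problem

$$A w = \lambda B w, \tag{6}$$

and the alternative formulation in (5) leads to a related generalized eigenvector problem,

$$B w = \tilde{\lambda} A w. \tag{7}$$

The solution for w will be the generalized eigenvector corresponding to the smallest (largest) generalized eigenvalue λ ($\tilde{\lambda}$). Section IV shows how to obtain most of B (∆ + 1) from B (∆), how to obtain A (∆) from B (∆), and how to initialize the eigensolver for w (∆ + 1) based on the solution for w (∆).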
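The quantity being optimized can be evaluated directly for any candidate TEQ. The sketch below, with hypothetical channel and TEQ values, computes the shortening SNR as the energy of the effective channel inside the window of ν + 1 taps starting at delay ∆, divided by the energy outside it. It only evaluates the ratio for a given w; it does not solve the generalized eigenvector problem.

```python
def convolve(h, w):
    """Effective channel c = h * w (linear convolution)."""
    c = [0.0] * (len(h) + len(w) - 1)
    for i, hi in enumerate(h):
        for j, wj in enumerate(w):
            c[i + j] += hi * wj
    return c

def shortening_snr(h, w, delta, nu):
    """Energy of c inside taps delta..delta+nu over the energy outside them."""
    c = convolve(h, w)
    win = sum(v * v for v in c[delta:delta + nu + 1])
    wall = sum(v * v for v in c[:delta]) + sum(v * v for v in c[delta + nu + 1:])
    return win / wall

# Hypothetical 2-tap channel and 3-tap TEQ; the effective channel is
# [1, 0, 0, 0.125], so almost all of the energy sits in the first tap.
h = [1.0, 0.5]
w = [1.0, -0.5, 0.25]
ssnr = shortening_snr(h, w, delta=0, nu=1)   # 1 / 0.125**2
```

Since $\|c_{win}\|^2 = w^T B w$ and $\|c_{wall}\|^2 = w^T A w$, this ratio is the objective of (5) evaluated at one particular w.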

B. The MMSE solution
The system model for the minimum mean-squared error (MMSE) solution [3] is shown in Fig. 2. It creates a virtual target impulse response (TIR) b of length ν + 1 such that the MSE, which is measured between the output of the effective channel and the output of the TIR, is minimized. In the absence of noise, if the input signal is white, then the optimal MMSE and MSSNR solutions are identical [6]. A unified treatment of the MSSNR and noisy MMSE solutions was given in [15].
The MMSE design uses a target impulse response (TIR) b that must satisfy [2]

$$R_{rx}\, b = R_r\, w, \tag{8}$$

where $R_{rx}$ is the channel input-output cross-correlation matrix (between the $\tilde{L}_w$ received samples in the TEQ and the ν + 1 transmitted samples in the window at delay ∆) and $R_r$ is the channel output autocorrelation matrix. Typically, b is computed first, and then (8) is used to determine w. The goal is that $h \star w$ approximates a delayed version of b. The target impulse response is the eigenvector corresponding to the minimum eigenvalue of [3], [4], [7]

$$R(\Delta) = R_x - R_{xr} R_r^{-1} R_{rx}, \tag{11}$$

where $R_x$ is the input autocorrelation matrix and $R_{xr} = R_{rx}^T$. Section IV addresses how to determine most of R (∆ + 1) from R (∆), and how to use the solution for b (∆) to initialize the eigensolver for b (∆ + 1).

IV. Efficient computation
There is a tremendous amount of redundancy involved in the brute force calculation of the MSSNR design. This has been addressed in [8]. This section discusses methods of reusing even more of the computations to dramatically decrease the required complexity. Specifically, for a given delay ∆,
• A (∆) can be computed from B (∆) almost for free.
• A shifted version of the optimal MSSNR TEQ w (∆) can be used to initialize the generalized eigenvector solution for w (∆ + 1) to decrease the number of iterations needed for the eigenvector computation.
• A shifted version of the optimal MMSE TIR b (∆) can be used to initialize the generalized eigenvector solution for b (∆ + 1) to decrease the number of iterations needed for the eigenvector computation.
We now discuss each of these points in turn.

A. Computing A (∆) from B (∆)
Let $C = H^T H$, and recall that $A = H_{wall}^T H_{wall}$ and $B = H_{win}^T H_{win}$. Note that the rows of $H_{win}$ and $H_{wall}$ together make up the rows of $H$, so

$$H^T H = H_{win}^T H_{win} + H_{wall}^T H_{wall}.$$

Thus,

$$C = A + B.$$

To emphasize the dependence on the delay ∆, we write

$$A(\Delta) = C - B(\Delta).$$

Since C is symmetric and Toeplitz, it is fully determined by its first row or column. C can be computed using less than $\tilde{L}_h^2$ multiply-adds and its first column can be stored using $\tilde{L}_w$ memory words. Since C is independent of ∆, we only need to compute it once. Then each time ∆ is incremented and the new B (∆) is computed, A (∆) can be computed from C and B (∆) using only additions (one subtraction per matrix element) and no multiplications. In contrast, the "brute force" method requires $\tilde{L}_w^2 (L_h - \nu)$ multiply-adds per delay, and the method of [8] requires about $\tilde{L}_w (L_w + L_h - \nu)$ multiply-adds per delay.
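The identity between A, B, and C is easy to confirm numerically. The sketch below uses a hypothetical 6-tap channel and a 4-tap TEQ, forms C once, and checks for every valid delay that subtracting B (∆) from C reproduces the directly computed A (∆). The helper names are illustrative only.

```python
def convolution_matrix(h, n_taps):
    """(len(h) + n_taps - 1) x n_taps matrix with entries H[i][j] = h(i - j)."""
    n_rows = len(h) + n_taps - 1
    return [[h[i - j] if 0 <= i - j < len(h) else 0.0 for j in range(n_taps)]
            for i in range(n_rows)]

def gram(rows, n):
    """Return M^T M for a matrix M given as a list of rows of length n."""
    return [[sum(r[a] * r[b] for r in rows) for b in range(n)] for a in range(n)]

h = [0.1, 0.9, 0.5, -0.3, 0.2, 0.05]    # hypothetical channel
n_taps, nu = 4, 2                        # TEQ taps and CP length (hypothetical)

H = convolution_matrix(h, n_taps)
C = gram(H, n_taps)                      # computed once, independent of delta

# C is symmetric Toeplitz, so its first column determines every entry.
first_col = [C[i][0] for i in range(n_taps)]
is_toeplitz = all(abs(C[i][j] - first_col[abs(i - j)]) < 1e-12
                  for i in range(n_taps) for j in range(n_taps))

max_err = 0.0
for delta in range(len(H) - nu):         # delays 0 .. L_c - nu
    H_win = H[delta:delta + nu + 1]
    H_wall = H[:delta] + H[delta + nu + 1:]
    B = gram(H_win, n_taps)
    A_direct = gram(H_wall, n_taps)      # "brute force" A
    for i in range(n_taps):
        for j in range(n_taps):
            a_fast = C[i][j] - B[i][j]   # additions only
            max_err = max(max_err, abs(a_fast - A_direct[i][j]))
```

In a fixed-point implementation one would presumably exploit the symmetry of A and compute only its upper triangle, but that refinement is omitted here.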
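
B. Computing B (∆ + 1) from B (∆)
Recall that $B(\Delta) = H_{win}^T(\Delta) H_{win}(\Delta)$, where $[H_{win}(\Delta)]_{i,j} = h(\Delta + i - j)$ for $0 \le i \le \nu$ and $0 \le j \le L_w$. The key observation is that column j + 1 of $H_{win}(\Delta + 1)$ equals column j of $H_{win}(\Delta)$:

$$[H_{win}(\Delta+1)]_{(0:\nu,\,1:L_w)} = [H_{win}(\Delta)]_{(0:\nu,\,0:L_w-1)}.$$

This means that

$$[B(\Delta+1)]_{(1:L_w,\,1:L_w)} = [B(\Delta)]_{(0:L_w-1,\,0:L_w-1)}, \tag{19}$$

so most of B (∆ + 1) can be obtained without requiring any computations. Now partition B (∆ + 1) as

$$B(\Delta+1) = \begin{bmatrix} \alpha & g^T \\ g & \hat{B} \end{bmatrix},$$

where $\hat{B}$ is obtained from (19). Since B (∆ + 1) is almost Toeplitz, α and all of the elements of g save the last can be efficiently determined from the first column of $\hat{B}$ [8]. Computing each of these $L_w$ elements requires two multiply-adds. Finally, the last element of g is computed directly as

$$[g]_{L_w - 1} = \sum_{i=0}^{\nu} h(\Delta + 1 + i - L_w)\, h(\Delta + 1 + i),$$

requiring ν + 1 multiply-adds.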
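The reuse of B across delays can be illustrated with a small numerical check. This is a sketch under assumed values (a hypothetical 6-tap channel, a 4-tap TEQ, ν = 2): the lower-right block of B (∆ + 1) coincides with the upper-left block of B (∆), and the bottom element of the first column of B (∆ + 1) is a sum of ν + 1 products of channel taps.

```python
def convolution_matrix(h, n_taps):
    """(len(h) + n_taps - 1) x n_taps matrix with entries H[i][j] = h(i - j)."""
    n_rows = len(h) + n_taps - 1
    return [[h[i - j] if 0 <= i - j < len(h) else 0.0 for j in range(n_taps)]
            for i in range(n_rows)]

def b_matrix(h, n_taps, nu, delta):
    """B(delta) = H_win^T H_win, with H_win = rows delta..delta+nu of H."""
    rows = convolution_matrix(h, n_taps)[delta:delta + nu + 1]
    return [[sum(r[a] * r[b] for r in rows) for b in range(n_taps)]
            for a in range(n_taps)]

h = [0.1, 0.9, 0.5, -0.3, 0.2, 0.05]    # hypothetical channel
n_taps, nu, delta = 4, 2, 2
L_w = n_taps - 1                         # TEQ order

B_old = b_matrix(h, n_taps, nu, delta)
B_new = b_matrix(h, n_taps, nu, delta + 1)

# Lower-right L_w x L_w block of B(delta+1) equals upper-left block of B(delta).
block_reused = all(abs(B_new[i + 1][j + 1] - B_old[i][j]) < 1e-12
                   for i in range(L_w) for j in range(L_w))

def tap(k):
    """Channel tap h(k), taken as zero outside the support of h."""
    return h[k] if 0 <= k < len(h) else 0.0

# Last element of g: nu + 1 multiply-adds.
g_last = sum(tap(delta + 1 + i - L_w) * tap(delta + 1 + i) for i in range(nu + 1))
```

The remaining new entries (α and the other elements of g) would be obtained with the near-Toeplitz recursion of [8], which is not reproduced in this sketch.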
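
C. Computing R (∆ + 1) from R (∆)
Recall that for the MMSE design, we must compute

$$R(\Delta) = R_x - R_{xr}(\Delta)\, R_r^{-1}\, R_{rx}(\Delta).$$

Note that $R_x$ does not depend on ∆, and that it is Toeplitz. Thus,

$$[R_x]_{(0:\nu-1,\,0:\nu-1)} = [R_x]_{(1:\nu,\,1:\nu)}. \tag{26}$$

Let $P(\Delta) = R_{xr}(\Delta) R_r^{-1} R_{rx}(\Delta)$. Since incrementing ∆ shifts the window of transmitted samples by one, all but one of the rows of $R_{xr}(\Delta + 1)$ are shared with $R_{xr}(\Delta)$, and we see that

$$[P(\Delta+1)]_{(1:\nu,\,1:\nu)} = [P(\Delta)]_{(0:\nu-1,\,0:\nu-1)}. \tag{28}$$

Combining (26) and (28),

$$[R(\Delta+1)]_{(1:\nu,\,1:\nu)} = [R(\Delta)]_{(0:\nu-1,\,0:\nu-1)}. \tag{29}$$

The matrix $R_r$ is symmetric and Toeplitz. However, the inverse of a Toeplitz matrix is, in general, not Toeplitz [21]. This means that R (∆) has no further structure that can be easily exploited, so the first row and column of R (∆ + 1) cannot be obtained from the rest of R (∆ + 1) using the tricks in [8]. Even so, (29) allows us to obtain most of the elements of each R (∆) for free, so only ν + 1 elements must be computed rather than (ν + 1)(ν + 2)/2 elements. In ADSL, ν = 32; in VDSL, ν can range up to 512; and in DVB, ν can range up to 2048. Thus, the proposed method reduces the complexity of calculating R (∆) by factors of 17, 257, and 1025 (respectively) for these standards.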

D. Intelligent eigensolver initialization
Let w (∆) be the MSSNR solution for a given delay. If we were to increase the allowable filter length by 1, then it follows that the zero-prepended vector $[0,\; w^T(\Delta)]^T$ should be a near-optimum solution for the delay ∆ + 1, since it produces the same value of the shortening SNR as for the previous delay. Experience suggests that the TEQ coefficients are small near the edges, so the last tap can be removed without drastically affecting the performance. Therefore,

$$w(\Delta+1) = \left[0,\; [w^T(\Delta)]_{(0:L_w-1)}\right]^T$$

is a fairly good solution for the delay ∆ + 1, so this should be the initialization for the generalized eigenvector solver for the next delay. Similarly, for the MMSE TIR,

$$b(\Delta+1) = \left[0,\; [b^T(\Delta)]_{(0:\nu-1)}\right]^T \tag{32}$$

should be the initialization for the eigenvector solver for the next delay.
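The reasoning behind the shifted initialization can be checked with a few lines of code. In this sketch (hypothetical channel and TEQ values), prepending a zero to w delays the effective channel by one sample, so the shortening SNR at delay ∆ + 1 is unchanged; dropping the last tap then gives a starting point of the original length. The function name `shift_init` is illustrative and not taken from the paper.

```python
def convolve(h, w):
    """Effective channel c = h * w (linear convolution)."""
    c = [0.0] * (len(h) + len(w) - 1)
    for i, hi in enumerate(h):
        for j, wj in enumerate(w):
            c[i + j] += hi * wj
    return c

def shortening_snr(h, w, delta, nu):
    """Energy of c inside taps delta..delta+nu over the energy outside them."""
    c = convolve(h, w)
    win = sum(v * v for v in c[delta:delta + nu + 1])
    wall = sum(v * v for v in c[:delta]) + sum(v * v for v in c[delta + nu + 1:])
    return win / wall

def shift_init(w):
    """Warm start for the next delay: prepend a zero and drop the last tap."""
    return [0.0] + list(w[:-1])

h = [0.1, 0.9, 0.5, -0.3, 0.2, 0.05]   # hypothetical channel
w = [0.2, 1.0, -0.4, 0.1]              # hypothetical TEQ for delay delta
nu, delta = 2, 1

ssnr_old = shortening_snr(h, w, delta, nu)
ssnr_padded = shortening_snr(h, [0.0] + w, delta + 1, nu)  # one tap longer
w_init = shift_init(w)                                      # same length as w
ssnr_init = shortening_snr(h, w_init, delta + 1, nu)
```

How close `ssnr_init` comes to `ssnr_old` depends on how small the discarded last tap is, which is why the text describes the shifted vector as a good initialization rather than as the solution.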
[Table II fragment: example MAC counts 267,911,168 and 9,369,696; column headings "step", "MACs", and "adds" for the proposed method.]

E. Complexity comparison
Table II shows the (approximate) number of computations for each step of the MSSNR method, using the "brute force" approach, the method in [8], and the proposed approach. Note that $N_\Delta$ refers to the number of values of the delay that are possible (usually equal to the length of the effective channel minus the CP length). For a typical downstream ADSL system, the parameters are $\tilde{L}_w = L_w + 1 = 32$, $\tilde{L}_h = L_h + 1 = 512$, $L_c = L_w + L_h = 542$, ν = 32, and $N_\Delta = \tilde{L}_c - \nu = 511$. The "example" lines in Table II show the required complexity for computing all of the A's and B's for these parameters using each approach. Observe that [8] beats the brute force method by a factor of 29, the proposed method beats [8] by a factor of 140, and the proposed method beats the brute force method by a factor of 4008.
Table III shows the (approximate) computational requirements of the "brute force" approach and the proposed approach for computing the matrices R (∆), $\Delta \in \{\Delta_{min}, \ldots, \Delta_{max}\}$. The "example" line shows the required complexity for computing the R (∆) matrices using each method for the same parameter values as the example in Table II. The proposed method yields a decrease in complexity by a factor of the channel shortener length over two, which in this case is a factor of 16.
It is also interesting to compare the complexity of the MSSNR design to that of the MMSE design. There are several steps that add to the complexity: the computation of the matrices A, B, and R (∆), as addressed in Tables II and III; and the computation of the eigenvector or generalized eigenvector corresponding to the minimum eigenvalue of R (∆) or minimum generalized eigenvalue of (A, B). If "brute force" designs are used, then the computation of the MSSNR matrices costs $L_h / \tilde{L}_w$ times more than the computation of the MMSE matrices, or 16 times more in the example; and if the proposed methods are used, then the computation of the MSSNR matrices costs roughly $(2\tilde{L}_w + \nu)/2\tilde{L}_w^2$ times as much as the computation of the MMSE matrices, or 16 times less in the example. However, both solutions also require the computation of an eigenvector for each delay, and the cost of this step depends heavily on both the type of eigensolver used and the values of the matrices involved, so an explicit comparison cannot be made.

V. Symmetry in the Impulse Response
This section discusses symmetry in the TEQ impulse response. It is shown that the MSSNR TEQ with a unit-norm constraint on the TEQ will become symmetric as the TEQ length goes to infinity, and that in the finite length case, the asymptotic result is approached quite rapidly.

A. Finite length symmetry trends
Consider the MSSNR problem of (3), in which the all-zero solution was avoided by using the constraint $\|c_{win}\| = 1$. However, some MSSNR designs use the alternative constraint $\|w\| = 1$. For example, in [22], an iterative algorithm is proposed which performs a gradient descent of $\|c_{wall}\|^2$. Although it is not mentioned in [22], this algorithm needs a constraint to prevent the trivial solution w = 0. A natural constraint is to maintain $\|w\| = 1$, which can be implemented by renormalizing w after each iteration. Similarly, a blind, adaptive algorithm was proposed in [10], which is a stochastic gradient descent on $\|c_{wall}\|^2$, although it leads to a window size of ν instead of ν + 1. (A still has the same size in this case, but the elements may be slightly different.) For these two algorithms, the solution must satisfy

$$\min_{w} \; w^T A w \quad \text{subject to} \quad w^T w = 1. \tag{33}$$

This leads to a TEQ that must satisfy a traditional eigenvector problem,

$$A w = \lambda w. \tag{34}$$

In this case, the solution is the eigenvector corresponding to the smallest eigenvalue. Henceforth, we will refer to the solution of (34) as the MSSNR Unit Norm TEQ (MSSNR-UNT) solution.
A centrosymmetric matrix has the property that when rotated 180° (i.e. each element is flipped over the center of the matrix), it is unchanged. If a matrix is symmetric and Toeplitz (constant along each diagonal), then it is also centrosymmetric [21]. By inspecting the structure of A, it is easy to see that it is symmetric, and nearly Toeplitz. (In fact, the near-Toeplitz structure is the idea behind the fast algorithms in [8], in which $A_{i+1,j+1}$ is computed from $A_{i,j}$ with a small tweak.) Hence, A is approximately a symmetric centrosymmetric matrix. The eigenvectors of such matrices are either symmetric or skew-symmetric, and in special cases the eigenvector corresponding to the smallest eigenvalue is symmetric [23], [24], [25]. Thus, we expect the MSSNR-UNT TEQ to be approximately symmetric or skew-symmetric, since it is the eigenvector of the symmetric (nearly) centrosymmetric matrix A, corresponding to the smallest eigenvalue. Oddly, it appears that the MSSNR-UNT TEQ is always symmetric as opposed to skew-symmetric, and the point of symmetry is not necessarily in the center of the impulse response.
To quantify the symmetry of the finite-length MSSNR-UNT TEQ design for various parameter values, we computed the TEQ for Carrier Serving Area (CSA) test loops [26] 1 through 8, using TEQ lengths $3 \le \tilde{L}_w \le 40$. For each TEQ, we decomposed w into $w_{sym}$ and $w_{skew}$, then computed $\|w_{skew}\|^2 / \|w_{sym}\|^2$. A plot of this ratio (averaged over the eight channels) for the MSSNR-UNT TEQ is shown in Fig. 3. The symmetric part of each TEQ was obtained by considering all possible points of symmetry, and choosing the one for which the norm of the symmetric part divided by the norm of the perturbation was maximized. For example, if the TEQ were w = [1, 2, 4, 2.2], then $w_{sym}$ = [0, 2.1, 4, 2.1] and $w_{skew}$ = [1, −0.1, 0, 0.1]. The value of ∆ was the delay which maximized the shortening SNR. The point of Fig. 3 is not to prove that the infinite-length MSSNR-UNT TEQ is symmetric (that will be addressed in Section V-B), but rather to give an idea of how quickly the finite-length design becomes symmetric.
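The decomposition used for Fig. 3 can be sketched as follows. This is one plausible reading of the procedure described above, not the authors' code: every integer and half-integer tap position is tried as the point of symmetry, taps whose mirror image falls outside the filter are assigned entirely to the skew part, and the candidate with the largest symmetric-to-skew energy ratio is kept.

```python
import math

def best_symmetric_split(w):
    """Split w into (w_sym, w_skew) about the point of symmetry that
    maximizes the energy of w_sym relative to the energy of w_skew."""
    n = len(w)
    best = None
    for c2 in range(2 * n - 1):               # c2 / 2 is the candidate centre
        sym = []
        for i in range(n):
            m = c2 - i                         # index of the mirror-image tap
            sym.append((w[i] + w[m]) / 2 if 0 <= m < n else 0.0)
        skew = [wi - si for wi, si in zip(w, sym)]
        e_sym = sum(s * s for s in sym)
        e_skew = sum(s * s for s in skew)
        score = math.inf if e_skew == 0 else e_sym / e_skew
        if best is None or score > best[0]:
            best = (score, sym, skew)
    return best[1], best[2]

# The example from the text.
w_sym, w_skew = best_symmetric_split([1.0, 2.0, 4.0, 2.2])
skew_to_sym_energy = sum(s * s for s in w_skew) / sum(s * s for s in w_sym)
```

For the example vector the selected point of symmetry is the third tap, which reproduces the values of $w_{sym}$ and $w_{skew}$ quoted in the text up to floating-point rounding.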
Observe that the MSSNR-UNT TEQ (Fig. 3) becomes increasingly symmetric for large CP and TEQ lengths. For parameter values that lead to highly symmetric TEQs, the TEQ can be initialized by only computing half of the TEQ coefficients. For MSSNR, MSSNR-UNT, and MMSE solutions, this effectively reduces the problem from finding an eigenvector (or generalized eigenvector) of an N × N matrix to finding an eigenvector (or generalized eigenvector) of an N/2 × N/2 matrix, as shown in [23], where we use N to mean $\tilde{L}_w$ for the MSSNR TEQ computation and to mean ν for the MMSE TIR computation. This leads to a significant reduction in complexity, at the expense of throwing away the skew-symmetric portion of the filter. Reduced complexity algorithms are discussed in Section VI.

B. Infinite length MSSNR designs: asymptotic results
This section examines the limiting behavior of A and B, and the resulting limiting behavior of the eigenvectors of A (i.e. the MSSNR-UNT solution). We will show that

$$\lim_{L_w \to \infty} \frac{\|A - H^T H\|_F}{\|A\|_F} = 0, \tag{35}$$

where $\|\cdot\|_F$ denotes the Frobenius norm [27]. Since $H^T H$ is symmetric and Toeplitz (and thus centrosymmetric), its eigenvectors are symmetric or skew-symmetric. Thus, as $L_w \to \infty$, we can expect the eigenvectors of A to become symmetric or skew-symmetric. Although this is a heuristic argument, the more rigorous sin(θ) theorem [28] is difficult to apply.
First, consider a TEQ that is finite, but very long. Specifically, we make two assumptions, A1 and A2, under which all of the block dimensions given below are nonnegative: A1 requires $\Delta \ge L_h$, and A2 requires $L_w \ge \Delta + \nu$. Such a large ∆ in A1 is reasonable when the TEQ length is large. Now we can partition H into a 3 × 5 array of blocks (36). The row blocks have heights ∆, (ν + 1), and $(L_h + L_w - \nu - \Delta)$; and the column blocks have widths $(\Delta - L_h)$, (ν + 1), $(L_h - \nu - 1)$, (ν + 1), and $(L_w - \nu - \Delta)$. Under assumptions A1 and A2, the limiting behavior of $B = H_{win}^T H_{win}$ is

$$B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & H_3 H_3^T & 0 \\ 0 & 0 & 0 \end{bmatrix}, \tag{37}$$

where $H_3$ is a size $(\nu + \tilde{L}_h) \times (\nu + 1)$ channel convolution matrix formed from Jh, the time-reversed channel. Since B is a zero-padded version of $H_3 H_3^T$, it has the same Frobenius norm. Also, the values of $L_w$ and ∆ affect the size of the zero matrices in (37) but not $H_3$ (assuming that our assumptions hold), so $L_w$ and ∆ do not affect the Frobenius norm of B. Therefore, $\|B\|_F = \|H_3 H_3^T\|_F$ is a constant whenever our two initial assumptions A1 and A2 are met.
The limiting behavior for A is determined by writing $A = H_{wall}^T H_{wall}$ in terms of the blocks of the partition (36). (Only the top-left and bottom-right blocks are of interest for the proof.) A lower bound on the Frobenius norm of A can then be found by dropping all of the terms in the Frobenius norm except for those due to the diagonal elements of these two blocks; this lower bound goes to infinity as $L_w \to \infty$. Since $A - C = -B$ and $\|B\|_F$ is constant, the ratio $\|A - C\|_F / \|A\|_F = \|B\|_F / \|A\|_F$ goes to zero as $L_w \to \infty$. Thus, in the limit, A approaches C, which is a symmetric centrosymmetric matrix. Heuristically, this suggests that in the limit, the eigenvectors of A (including the MSSNR-UNT solution) will be symmetric or skew-symmetric. However, for special cases (such as tridiagonal matrices), the eigenvector corresponding to the smallest eigenvalue is always symmetric as opposed to skew-symmetric [23]. Every single MSSNR TEQ that we have observed for ADSL channels has been nearly symmetric rather than skew-symmetric, suggesting (not proving) that the infinite length TEQ will be exactly symmetric. Thus, constraining the finite-length solution to be symmetric is expected to entail no significant performance loss, which is supported by simulation results. Essentially, if v is an eigenvector in the eigenspace of the smallest eigenvalue, then Jv is as well, so $\frac{1}{2}(v + Jv)$ (which is symmetric) is as well, even if the smallest eigenvalue has multiplicity larger than 1.
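The closing remark about Jv can be spelled out in a few lines. The derivation below is a sketch that assumes only that C is symmetric and centrosymmetric (so that JCJ = C, with J the exchange matrix satisfying $J^2 = I$) and that v is an eigenvector of C with eigenvalue λ.

```latex
\begin{aligned}
JCJ = C,\; J^2 = I \;&\Rightarrow\; CJ = JC, \\
Cv = \lambda v \;&\Rightarrow\; C(Jv) = J(Cv) = \lambda\,(Jv), \\
v_s = \tfrac{1}{2}(v + Jv) \;&\Rightarrow\; C v_s = \lambda v_s
  \quad\text{and}\quad J v_s = v_s .
\end{aligned}
```

Note that $v_s$ is a valid eigenvector only when it is nonzero, that is, when v is not itself skew-symmetric; this is the case that the heuristic argument in the text does not exclude.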
Note that in the limit, B does not become centrosymmetric (refer to (37)), although it is approximately centrosymmetric about a point off of its center. Thus, we cannot make as strong a limiting argument for the MSSNR solution as for the MSSNR-UNT solution. Symmetry in the finite-length MSSNR solution is discussed in [15].

VI. Exploiting Symmetry in TEQ Design
In [15], it was shown that the MMSE target impulse response becomes symmetric as the TEQ length goes to infinity, and in Section V-B it was shown that the infinite-length MSSNR-UNT TEQ is an eigenvector of a symmetric centrosymmetric matrix, and is expected to be symmetric. In [16] and [17], simulations were presented for forcing the MSSNR TEQ to be perfectly symmetric or skew-symmetric. This section presents algorithms for forcing the MMSE TIR to be exactly symmetric in the case of a finite length TEQ, and for forcing the MSSNR-UNT TEQ to be symmetric when it is computed in a blind, adaptive manner via the MERRY algorithm [10]. It is also shown that when the TEQ is symmetric, the TEQ and FEQ designs can be done independently (and thus in parallel).
Consider forcing the MSSNR-UNT TEQ to be symmetric as a means of reducing the computational complexity. The MSSNR-UNT TEQ arises, for example, in the MERRY algorithm [10], which is a blind, adaptive algorithm for computing the TEQ; or in the algorithm in [22] (if the constraint used is a unit norm TEQ), which is a trained, iterative algorithm for computing the TEQ. We focus here on extending the MERRY algorithm to the symmetric case. Briefly, the idea behind the MERRY algorithm is that the transmitted signal inherently has redundancy due to the CP, so that redundancy should be evident at the receiver if the channel is short enough. The measure of redundancy is the MERRY cost,

$$J_{MERRY} = E\left[\,\left|y(Mk + \nu + \Delta) - y(Mk + \nu + N + \Delta)\right|^2\right], \tag{42}$$

where M = N + ν is the symbol length, k is the symbol index, and ∆ is a user-defined synchronization delay. This cost function measures the similarity between a data sample and its copy in the CP (N samples earlier). The MERRY algorithm is a gradient descent of (42).
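A sample estimate of the MERRY cost can be sketched in a few lines. The example below is illustrative only and rests on stated assumptions: arrays are indexed from 0, so the last CP sample of symbol k is taken to sit at index Mk + ν − 1; the data, block size, and channels are hypothetical; and the TEQ is omitted, so y is simply the channel output. With a channel of ν taps the compared samples are identical and the cost is zero, while one extra tap makes the cost positive.

```python
import random

def convolve(x, h):
    """Linear convolution of the transmitted stream with the channel."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def transmit(symbols, nu):
    """Concatenate the blocks, each preceded by its cyclic prefix."""
    stream = []
    for s in symbols:
        stream.extend(s[len(s) - nu:] + s)
    return stream

def merry_cost(y, n_fft, nu, delta, n_symbols):
    """Average squared difference between a CP sample and its copy N later."""
    m = n_fft + nu
    total = 0.0
    for k in range(n_symbols):
        a = m * k + (nu - 1) + delta     # 0-based last CP sample, shifted by delta
        total += (y[a] - y[a + n_fft]) ** 2
    return total / n_symbols

random.seed(0)
n_fft, nu, n_symbols = 8, 3, 20          # hypothetical sizes
symbols = [[random.choice([-1.0, 1.0]) for _ in range(n_fft)]
           for _ in range(n_symbols)]
x = transmit(symbols, nu)

short_channel = [1.0, 0.5, -0.2]         # nu taps: memory stays inside the CP
long_channel = [1.0, 0.5, -0.2, 0.3]     # nu + 1 taps: memory exceeds the CP

cost_short = merry_cost(convolve(x, short_channel), n_fft, nu, 0, n_symbols)
cost_long = merry_cost(convolve(x, long_channel), n_fft, nu, 0, n_symbols)
```

The fact that the cost vanishes for a channel of ν taps rather than ν + 1 is consistent with the earlier remark that MERRY leads to a window size of ν.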
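In practical applications, the TEQ length is even, due to a desired efficient use of memory. Thus, a symmetric TEQ has the form $w^T = [v^T, (Jv)^T]$, where J is the matrix with ones on the cross-diagonal and zeros elsewhere. (An even TEQ length is not necessary; a similar partition can be made in the odd-length case, as will be done for the MMSE target impulse response later in this section.) The TEQ output is

$$y(n) = \sum_{i=0}^{L_w} w(i)\, r(n - i),$$

which can be rewritten for a symmetric TEQ as

$$y(n) = \sum_{i=0}^{\tilde{L}_w/2 - 1} v(i) \left[ r(n - i) + r(n - L_w + i) \right].$$

The Sym-MERRY update is a stochastic gradient descent of (42) with respect to the half-TEQ coefficients v, with a renormalization to avoid the trivial solution v = 0. The algorithm is, for symbol k = 0, 1, 2, ...,

$$\tilde{u}(k) = u(Mk + \nu + \Delta) - u(Mk + \nu + N + \Delta),$$
$$e(k) = v^T(k)\, \tilde{u}(k),$$
$$\hat{v}(k+1) = v(k) - \mu\, e(k)\, \tilde{u}(k),$$
$$v(k+1) = \frac{\hat{v}(k+1)}{\sqrt{2}\, \|\hat{v}(k+1)\|},$$

where μ is the step size and u(n) is the folded received vector with elements $[u(n)]_i = r(n - i) + r(n - L_w + i)$, $i = 0, \ldots, \tilde{L}_w/2 - 1$, so that $y(n) = v^T u(n)$ and the renormalization maintains $\|w\| = 1$. Compared to the regular MERRY algorithm in [10], the number of multiplications has been cut in half for Sym-MERRY, though some additional additions are needed to compute $\tilde{u}$. Simulations of Sym-MERRY are presented in Section VII.
Now consider exploiting symmetry in the MMSE target impulse response in order to reduce computational complexity. Recall that in the MMSE design, first the TIR b is computed as the eigenvector of R (∆) [as defined in (11)], and then the TEQ w is computed from (8). The MSE (which we wish to minimize) is given by

$$\mathrm{MSE} = b^T R(\Delta)\, b.$$

Typically, the CP length ν is a power of 2, so the TIR length (ν + 1) is odd. This is the case, e.g., in ADSL [29], IEEE 802.11a [30] and HIPERLAN/2 [31] wireless LANs, and digital video broadcast (DVB) [32]. To force a symmetric TIR, partition the TIR as

$$b^T = [v^T, \gamma, (Jv)^T],$$

where γ is a scalar and v is a real $\frac{\nu}{2} \times 1$ vector. For simplicity, let $\tilde{v}^T = [\sqrt{2}\, v^T, \gamma]$, and partition R (∆) conformally with b into blocks $R_{ij}$, $i, j \in \{1, 2, 3\}$. Now rewrite the MSE as

$$\mathrm{MSE} = \tilde{v}^T \hat{R}\, \tilde{v},$$

where

$$\hat{R} = \begin{bmatrix} \frac{1}{2}\left(R_{11} + R_{13} J + J R_{31} + J R_{33} J\right) & \frac{1}{\sqrt{2}}\left(R_{12} + J R_{32}\right) \\ \frac{1}{\sqrt{2}}\left(R_{21} + R_{23} J\right) & R_{22} \end{bmatrix}.$$

In order to prevent the all-zero solution, the non-symmetric TIR design uses the constraint $\|b\| = 1$. This is equivalent to the constraint $\|\tilde{v}\| = 1$. Under this constraint, the TIR that minimizes the MSE must satisfy

$$\hat{R}\, \tilde{v} = \lambda\, \tilde{v}, \tag{51}$$

where λ is the smallest eigenvalue of $\hat{R}$.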
Since both R (∆) and $\hat{R}$ are symmetric, solving (51) requires 1/4 as many computations as solving the initial eigenvector problem. However, the forced symmetry could, in principle, degrade the performance of the associated TEQ. Simulations of the Sym-MMSE algorithm are presented in Section VII.
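The reduction of the symmetric-TIR problem to a half-size eigenproblem can be checked numerically. The sketch below is not the authors' construction: it builds a hypothetical folding matrix T such that the symmetric TIR is $b = T\tilde{v}$, forms the reduced matrix as $T^T R\, T$, and confirms that the quadratic form and the norm are preserved. The matrix R used here is an arbitrary symmetric example, not a matrix derived from a channel.

```python
import math

def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def quad(m, v):
    """Quadratic form v^T m v."""
    n = len(v)
    return sum(v[i] * m[i][j] * v[j] for i in range(n) for j in range(n))

def fold_matrix(nu):
    """T maps v_tilde = [sqrt(2) v; gamma] to b = [v; gamma; Jv] (nu even)."""
    half = nu // 2
    s = 1.0 / math.sqrt(2.0)
    t = [[0.0] * (half + 1) for _ in range(nu + 1)]
    for i in range(half):
        t[i][i] = s            # top block reproduces v
        t[nu - i][i] = s       # bottom block reproduces Jv (v reversed)
    t[half][half] = 1.0        # centre tap gamma
    return t

nu = 4
# Arbitrary symmetric (nu+1) x (nu+1) matrix standing in for R(delta).
R = [[(2.0 if i == j else 0.0) + 0.1 * (i + 1) * (j + 1) for j in range(nu + 1)]
     for i in range(nu + 1)]

T = fold_matrix(nu)
R_hat = matmul(transpose(T), matmul(R, T))        # (nu/2 + 1) x (nu/2 + 1)

v, gamma = [0.3, -0.5], 0.8                        # hypothetical half-TIR
v_tilde = [math.sqrt(2.0) * x for x in v] + [gamma]
b = [sum(T[i][j] * v_tilde[j] for j in range(len(v_tilde))) for i in range(nu + 1)]

mse_full = quad(R, b)
mse_folded = quad(R_hat, v_tilde)
norm_b_sq = sum(x * x for x in b)
norm_v_tilde_sq = sum(x * x for x in v_tilde)
```

Because the columns of T are orthonormal, the unit-norm constraint on b carries over to $\tilde{v}$ unchanged, which is why the reduced problem remains an ordinary eigenvector problem.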
Another advantage of a symmetric TEQ is that it has a linear phase with known slope, allowing the FEQ to be designed in parallel with the TEQ. A symmetric TEQ can be classified as either a Type I or Type II FIR Linear Phase System ([33], pp. 298-299). Thus, for a TEQ with $L_w + 1$ taps, the transfer function has the form

$$W(e^{j\omega}) = M(\omega)\, e^{j(\beta - \omega L_w / 2)}, \tag{52}$$

where M(ω) = M(−ω) is the magnitude response. The DC response is

$$W(e^{j0}) = \sum_k w(k) = M(0)\, e^{j\beta}. \tag{53}$$

Since the TEQ is real, $e^{j\beta}$ must be real, so

$$\beta = \begin{cases} 0, & \sum_k w(k) > 0, \\ \pi, & \sum_k w(k) < 0. \end{cases} \tag{54}$$

If $\sum_k w(k) = 0$, the DC response does not reveal the value of β. In this case, one must determine the phase response at another frequency, which is more complicated to compute. The response at ω = π is fairly easy to compute, and will also reveal the value of β.
From (52)-(54), given the TEQ length, the phase response of a symmetric TEQ is known up to the factor $e^{j\beta}$, even before the TEQ is designed. The phases of the FEQs are then determined entirely by the channel phase response. Thus, if a channel estimate is available, the two possible FEQ phase responses could be determined in parallel with the TEQ design. Similarly, if the TIR is symmetric and the TEQ is long enough that the TIR and effective channel are almost identical, then the phase response of the effective channel is known, except for β. If differential encoding is used, then the value of β can arbitrarily be set to either 0 or π, since a rotation of exactly 180 degrees does not affect the output of a differential detector. Furthermore, if 2-PAM or 4-QAM signaling is used on a subcarrier, the magnitude of the FEQ does not matter, and the entire FEQ for that tone can be designed without knowledge of the TEQ. For an ADSL system, 4-QAM signaling is used on all of the subcarriers during training. Thus, the FEQ can be designed for the training phase by only setting its phase response. The magnitude response can be set after the TEQ is designed. The benefit here is that if the FEQ is designed all at once (both magnitude and phase), then a division of complex numbers is required for each tone. However, if the phase response is already known, determining the FEQ magnitude only requires a division of real numbers for each tone. This can allow for a more efficient implementation.
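The known-slope phase property can be illustrated numerically. In the sketch below (a hypothetical symmetric 6-tap TEQ), β is read off from the sign of the DC response, and removing the predicted phase $\beta - \omega L_w/2$ from the frequency response leaves a purely real quantity at every test frequency. That real quantity is a signed amplitude, which may change sign at spectral nulls; the sketch does not address that case.

```python
import cmath
import math

def freq_response(w, omega):
    """W(e^{j omega}) = sum_k w(k) exp(-j omega k)."""
    return sum(wk * cmath.exp(-1j * omega * k) for k, wk in enumerate(w))

def beta_from_dc(w):
    """Return 0 or pi from the sign of the DC response, None if it is zero."""
    dc = sum(w)
    if dc > 0:
        return 0.0
    if dc < 0:
        return math.pi
    return None

# Hypothetical symmetric TEQ with L_w + 1 = 6 taps and negative DC gain.
w = [-0.1, 0.4, -1.0, -1.0, 0.4, -0.1]
L_w = len(w) - 1
beta = beta_from_dc(w)

# After removing the predicted phase, the response should be purely real.
residual_imag = []
for n in range(1, 8):
    omega = math.pi * n / 8
    derotated = freq_response(w, omega) * cmath.exp(1j * (omega * L_w / 2 - beta))
    residual_imag.append(abs(derotated.imag))
max_residual = max(residual_imag)
```

Since the phase is fixed by the TEQ length and the sign of the DC response, an FEQ phase computed from a channel estimate needs at most a sign flip once the TEQ is known.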

VII. Simulations
This section presents simulations of the Sym-MERRY and Sym-MMSE algorithms. The parameters used for the Sym-MERRY algorithm were an FFT size of N = 512, a CP length of ν = 32, a TEQ length of $\tilde{L}_w = 16$ (8 taps get updated, then mirrored), and an SNR of $\sigma_x^2 \|h\|^2 / \sigma_n^2 = 40$ dB, with white noise. The channel was CSA loop 4 (available at [34]). The DSL performance metric is the achievable bit rate for a fixed probability of error, which is computed from $SNR_i$, the signal to interference and noise ratio in frequency bin i. (We assume a 6 dB margin and 4.2 dB coding gain; for more details, refer to [9].) Fig. 4 shows performance vs. time as the TEQ adapts. The dashed line represents the solution obtained by a non-adaptive solution to the MERRY cost (42), without imposing symmetry, and the dotted line represents the performance of the MSSNR solution [5]. Observe that Sym-MERRY rapidly obtains a near-optimal performance. The jittering around the asymptotic portion of the curve is due to the choice of a large stepsize.
The simulations for the Sym-MMSE algorithm are shown in Fig. 5 and in Table IV. In Fig. 5, TEQs were designed for CSA loops 1-8, then the bit rates were averaged. The TEQ lengths that were considered were $3 \le \tilde{L}_w \le 128$. For TEQs with fewer than 20 taps, the bit rate performance of the symmetric MMSE method is not as good as that of the unconstrained MMSE method. However, asymptotically, the results of the two methods agree; and for some parameters, the symmetric method achieves a higher bit rate. Table IV shows the individual bit rates achieved on the 8 channels using 20 tap TEQs, which is roughly the boundary between good and bad performance of the Sym-MMSE design in Fig. 5. On average, for a 20-tap TEQ, the Sym-MMSE method achieves 89.5% of the bit rate of the MMSE method, with a significantly lower computational cost, but the performance (at this filter length) varies significantly depending on the channel. Thus, it is suggested that the symmetric MMSE design only be used for TEQs with at least 20 taps, and preferably more.

VIII. Conclusions
The computational complexity of two popular channel shortening algorithms, the MSSNR and MMSE methods, has been addressed. A method was proposed which reduces the complexity of computing the A and B matrices in the MSSNR design by a factor of 140 (for typical ADSL parameters) relative to the methods of Wu, Arslan, and Evans [8], for a total reduction of a factor of 4000 relative to the brute force approach, without degrading performance. A similar technique was proposed to reduce the complexity of computing the R (∆) matrix used in the MMSE design by a factor of 16 (for typical ADSL parameters). It was also shown that the infinite length MSSNR TEQ with a unit norm TEQ constraint has a symmetric impulse response. Algorithms for reducing complexity by exploiting symmetry in the TEQ and target impulse response were derived, and simulations were used to show that the symmetric algorithms incur only a minor performance penalty. The Matlab code to reproduce the figures in this paper is available online [35].

Fig. 3. Energy in the skew-symmetric part of the TEQ over the energy in the symmetric part of the TEQ, for ν = 32. The data was delay-optimized and averaged over CSA test loops 1-8.

TABLE II
Computational complexity of various MSSNR implementations. MACs are real multiply-and-accumulates and adds are real additions (or subtractions).

TABLE III
In the partition of H in Section V-B, the sections $[H_{L2}, H_{L1}]$ and $H_{L3}$ are both lower triangular and contain the "head" of the channel, $[H_{U1}, H_{U2}]$ and $H_{U3}$ are both upper triangular and contain the "tail" of the channel, $H_1$ and $H_2$ are tall channel convolution matrices, and $H_M$ is Toeplitz. Then $H_{win}$ is simply the middle row (of blocks) of H, and $H_{wall}$ is the concatenation of the top and bottom rows. Under the two assumptions above, $H_{U3}$, $H_M$, and $H_{L3}$ will be constant for all values of ∆ and $L_w$. As such, the limiting behavior of $B = H_{win}^T H_{win}$ is given by (37).

TABLE IV
Achievable bit rate (Mbps) for MMSE and Sym-MMSE, using 20-tap TEQs and 33-tap TIRs. The last column is the performance of the Sym-MMSE method in terms of the percentage of the bit rate of the MMSE method. The channel has AWGN but no crosstalk.