EURASIP Journal on Applied Signal Processing 2002:5, 525–531. © 2002 Hindawi Publishing Corporation

Maximum-Likelihood Sequence Detection of Multiple Antenna Systems over Dispersive Channels via Sphere Decoding

Multiple antenna systems are capable of providing high data rate transmissions over wireless channels. When the channels are dispersive, the signal at each receive antenna is a combination of both the current and past symbols sent from all transmit antennas, corrupted by noise. The optimal receiver is a maximum-likelihood sequence detector, which is often considered practically infeasible due to its high computational complexity (exponential in the number of antennas and the channel memory). Therefore, in practice, one often settles for a less complex suboptimal receiver structure, typically with an equalizer meant to suppress both the intersymbol and interuser interference, followed by the decoder. We propose sphere decoding for sequence detection in multiple antenna communication systems over dispersive channels. Sphere decoding provides the maximum-likelihood estimate with computational complexity comparable to that of the standard space-time decision-feedback equalizing (DFE) algorithms. The performance and complexity of the sphere decoding are compared with those of the DFE algorithm by means of simulations.


INTRODUCTION
Multiple antenna wireless communication systems are capable of providing data transmission at potentially very high rates [1]. To secure high reliability of the data transmission, special attention has to be paid to the design of the receiver. When transmitting over noisy dispersive channels, the received signal at each receive antenna is a combination of the transmitted signals perturbed by noise, intersymbol interference (ISI), and interuser interference (IUI). In this case, the optimal receiver structure is the multichannel maximum-likelihood sequence estimation (MLSE). However, the computational complexity of the traditional maximum-likelihood sequence detector often prohibits its practical implementation. (For instance, the complexity of the Viterbi decoder is exponential in the length of the channel [2].) One way to alleviate the computational burden is to settle for (suboptimal) reduced-complexity MLSE algorithms obtained by reducing the number of states (see, e.g., [3,4]). In practice, however, most often a multichannel (space-time) equalizer is used to suppress ISI and IUI first; then, a hard decision is made to recover the symbol that has been sent [2,5,6]. The equalizer may be linear (zero-forcing or minimum mean-square error) or a nonlinear decision-feedback equalizer (DFE). DFEs essentially perform successive interference cancellation: a soft symbol estimate is used to cancel the trailing interference, upon which the hard decision is made to recover the symbol. (For an analysis of the performance of the DFE algorithm in a dispersive MIMO environment, see [6].) For high enough SNR, DFEs obtain better performance than linear equalizers while still having much lower complexity than the optimal MLSE algorithm. However, the performance of the DFE remains considerably inferior to that of the optimal MLSE algorithm.
In this paper, we propose an algorithm that yields the optimal MLSE performance on dispersive multiple-input multiple-output (MIMO) channels with finite impulse response (FIR). (We should point out that the wireless communication systems may or may not employ feedback from the receiver to the transmitter. In this paper, we focus on optimal detector structures for systems where feedback is unavailable and the receiver learns the channel based on the training information.) We consider the so-called sphere decoding, an algorithm for solving integer least-squares problems, which, in the communication context, provides the ML estimate of the transmitted data sequence. The algorithm is due to Fincke and Pohst [7] and was first proposed in the context of the closest point searches in lattices (for a review of these, see [8] and the references therein). The algorithm was rediscovered in [9] in the context of detection in GPS systems. The use of the sphere decoding for lattice codes was first proposed in [10], and further investigated in [11,12]. In [13], it has been analytically shown that the average complexity of the sphere decoding used for ML detection in flat fading multiple-antenna systems is polynomial (often sub-cubic) for a wide range of signal-to-noise ratios (SNRs).
The paper is organized as follows: in Section 2, we describe the FIR MIMO channel model. In Section 3, we pose the detection problem, briefly overview heuristics for solving it, and describe the sphere decoding algorithm. Simulation results are presented in Section 4, where it is shown that the sphere decoding provides significant improvement (several dBs) over the MIMO DFE. The computational complexity of the sphere decoding turns out to be comparable to that of the MIMO DFE, thereby suggesting that it can be implemented in practice. The paper concludes with Section 5.

FIR MIMO MODEL DESCRIPTION
We consider a multiple-antenna system with M transmit and N receive antennas. The MIMO channel is modeled as block-fading and frequency-selective: the channel impulse response is constant for some discrete interval T, after which it changes to another (independent) impulse response that remains constant for another interval T, and so on. The additive noise is spatially and temporally independent and identically distributed (i.i.d.) circularly-symmetric complex Gaussian. The MIMO channel model is shown in Figure 1.
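The channel is represented by its complex baseband equivalent model. Let the column vector

$$h^{(i,j)} = \begin{bmatrix} h_1^{(i,j)} & h_2^{(i,j)} & \cdots & h_C^{(i,j)} \end{bmatrix}^T \tag{1}$$

denote the single-input single-output (SISO) channel impulse response from the jth transmit to the ith receive antenna. For convenience, we shall make the following assumptions on the SISO channels $h^{(i,j)}$: (1) all channels have impulse responses of the same length $C$, and (2) the channel coefficients $h_l^{(i,j)}$ …

The received signal at the ith antenna can then be expressed as

$$\mathcal{X}_k^{(i)} = \sum_{j=1}^{M} \sum_{l=1}^{C} h_l^{(i,j)} \mathcal{S}_{k-l+1}^{(j)} + \mathcal{V}_k^{(i)} \tag{2}$$

for $k = 1, 2, \ldots, T + C - 1$, with $\mathcal{S}_k^{(j)} = 0$ for $k < 1$ and $k > T$. Equation (2) can be written in a matrix form as

$$\mathcal{X}_k = \sum_{l=1}^{C} H_l \mathcal{S}_{k-l+1} + \mathcal{V}_k, \tag{3}$$

where

$$\mathcal{S}_k = \begin{bmatrix} \mathcal{S}_k^{(1)} & \mathcal{S}_k^{(2)} & \cdots & \mathcal{S}_k^{(M)} \end{bmatrix}^T \in \mathbb{C}^{M \times 1} \tag{4}$$

is the transmit vector, whose entries typically come from a QAM constellation, $\mathcal{V}_k \in \mathbb{C}^{N \times 1}$ is the additive noise vector defined as

$$\mathcal{V}_k = \begin{bmatrix} \mathcal{V}_k^{(1)} & \mathcal{V}_k^{(2)} & \cdots & \mathcal{V}_k^{(N)} \end{bmatrix}^T, \tag{5}$$

and $H_l \in \mathbb{C}^{N \times M}$ is the lth coefficient matrix in the MIMO channel impulse response,

$$H_l = \begin{bmatrix} h_l^{(1,1)} & h_l^{(1,2)} & \cdots & h_l^{(1,M)} \\ h_l^{(2,1)} & h_l^{(2,2)} & \cdots & h_l^{(2,M)} \\ \vdots & \vdots & \ddots & \vdots \\ h_l^{(N,1)} & h_l^{(N,2)} & \cdots & h_l^{(N,M)} \end{bmatrix}. \tag{6}$$

In other words, the z-transform of the MIMO channel impulse response is given by

$$H(z) = \sum_{l=1}^{C} H_l z^{-(l-1)}. \tag{7}$$

Define the following vectors:

$$\mathcal{X} = \begin{bmatrix} \mathcal{X}_1 \\ \vdots \\ \mathcal{X}_{T+C-1} \end{bmatrix}, \qquad \mathcal{S} = \begin{bmatrix} \mathcal{S}_1 \\ \vdots \\ \mathcal{S}_T \end{bmatrix}, \qquad \mathcal{V} = \begin{bmatrix} \mathcal{V}_1 \\ \vdots \\ \mathcal{V}_{T+C-1} \end{bmatrix}. \tag{8}$$

Then

$$\mathcal{X} = \mathcal{H} \mathcal{S} + \mathcal{V}, \tag{9}$$

where $\mathcal{H} \in \mathbb{C}^{N(T+C-1) \times MT}$ is constructed as

$$\mathcal{H} = \begin{bmatrix} H_1 & & & \\ H_2 & H_1 & & \\ \vdots & H_2 & \ddots & \\ H_C & \vdots & \ddots & H_1 \\ & H_C & & H_2 \\ & & \ddots & \vdots \\ & & & H_C \end{bmatrix}. \tag{10}$$

Model (9) is illustrated in Figure 2.

Figure 2: Matrix equivalent channel model.

We assume that symbol bursts are uncorrelated (which is an appropriate assumption when modeling, for instance, packet transmission in TDMA systems). It will be convenient to define the signal-to-noise ratio $\rho$ for the system in (9),

$$\rho = \ldots \tag{12}$$

Assuming that the entries in $\mathcal{S}$ are coming from an $L \times L$ QAM constellation (where $L$ is assumed to be even), and that the minimum distance between constellation points is $d_{\min} = 1$, we find that

$$\ldots \tag{13}$$

Notice that all quantities in (9) are complex. We will find it useful to rewrite (9) in terms of real quantities. To this end, define

$$x = \begin{bmatrix} \operatorname{Re}(\mathcal{X}) \\ \operatorname{Im}(\mathcal{X}) \end{bmatrix}, \quad s = \begin{bmatrix} \operatorname{Re}(\mathcal{S}) \\ \operatorname{Im}(\mathcal{S}) \end{bmatrix}, \quad v = \begin{bmatrix} \operatorname{Re}(\mathcal{V}) \\ \operatorname{Im}(\mathcal{V}) \end{bmatrix}, \quad H = \begin{bmatrix} \operatorname{Re}(\mathcal{H}) & -\operatorname{Im}(\mathcal{H}) \\ \operatorname{Im}(\mathcal{H}) & \operatorname{Re}(\mathcal{H}) \end{bmatrix}.$$

Thus, with the previously defined $x \in \mathbb{R}^{2N(T+C-1) \times 1}$, $v \in \mathbb{R}^{2N(T+C-1) \times 1}$, and $H \in \mathbb{R}^{2N(T+C-1) \times 2MT}$, we can rewrite (9) as

$$x = Hs + v, \tag{14}$$

where the signal vectors $s$ are typically obtained upon modulation of the input bits onto an L-PAM constellation

$$\mathcal{D}_L^{2MT} = \left\{ -\frac{L-1}{2}, -\frac{L-3}{2}, \ldots, \frac{L-3}{2}, \frac{L-1}{2} \right\}^{2MT}. \tag{15}$$

(This particular structure of the vector $s$ stems from the assumption that the entries of $\mathcal{S}$ in (9) are points in an $L \times L$ QAM constellation.) Notice that we assumed that $L$ is even. (In practice, $L$ is commonly a power of 2, giving rise to 2-PAM, 4-PAM, 8-PAM, etc., constellations.) Finally, notice that $\mathcal{D}_L^{2MT}$ is a finite lattice carved from an infinite one, $\mathcal{Z}^{2MT}$.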
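The construction of the stacked channel matrix and its real-valued equivalent can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `block_toeplitz` and `to_real` are hypothetical, the taps are assumed to be indexed so that the first coefficient matrix multiplies the current symbol vector, and the real-valued form assumes the usual arrangement in which real parts are stacked above imaginary parts.

```python
def block_toeplitz(taps, T):
    """Stack C coefficient matrices (each N x M, nested lists of complex
    numbers) into the N(T+C-1) x MT block-Toeplitz channel matrix."""
    C = len(taps)
    N, M = len(taps[0]), len(taps[0][0])
    H = [[0j] * (M * T) for _ in range(N * (T + C - 1))]
    for t in range(T):                    # block column: symbol time t
        for l, Hl in enumerate(taps):     # tap l lands in block row t + l
            for i in range(N):
                for j in range(M):
                    H[(t + l) * N + i][t * M + j] = Hl[i][j]
    return H


def to_real(Hc):
    """Real-valued equivalent [[Re, -Im], [Im, Re]] of a complex matrix."""
    top = [[z.real for z in row] + [-z.imag for z in row] for row in Hc]
    bottom = [[z.imag for z in row] + [z.real for z in row] for row in Hc]
    return top + bottom


# Toy case: M = N = 1, channel length C = 2, block length T = 2.
taps = [[[1 + 1j]], [[0.5j]]]
Hc = block_toeplitz(taps, T=2)   # 3 x 2 complex matrix
H = to_real(Hc)                  # 6 x 4 real matrix
```

For the toy case the complex matrix has N(T+C-1) = 3 rows and MT = 2 columns, and the real-valued matrix doubles both dimensions, in agreement with the dimensions quoted in the text.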

PROBLEM STATEMENT
With the notation introduced in Section 2, due to the Gaussian assumption on the additive noise, we can express the MLSE problem as the optimization problem

$$\min_{s \in \mathcal{D}_L^{2MT}} \|x - Hs\|^2, \tag{16}$$

where the minimization is over all points in the constellation $\mathcal{D}_L^{2MT}$. We can interpret problem (16) as follows. Given the "skewed" lattice $Hs$, find the "closest" lattice point to a given $2N(T+C-1)$-dimensional vector $x$.
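To make the size of the search space concrete, the following sketch solves the closest-point problem by exhaustive enumeration. It is a baseline for illustration, not a method proposed in the paper: the function name `ml_brute_force` is hypothetical, and the constellation is assumed to be the L-PAM grid with unit spacing centered at zero. The loop visits L**m candidates, which is what makes exhaustive search impractical beyond toy sizes.

```python
from itertools import product


def ml_brute_force(H, x, L):
    """Exhaustive minimization of ||x - H s||^2 over the L-PAM grid.

    H is a list of rows (real numbers), x a list of real numbers.
    Returns the best candidate and its metric.
    """
    m = len(H[0])
    half = (L - 1) / 2
    grid = [k - half for k in range(L)]      # e.g. L = 2 -> [-0.5, 0.5]
    best_s, best_metric = None, float("inf")
    for s in product(grid, repeat=m):        # L**m candidates in total
        metric = sum(
            (xi - sum(h * sj for h, sj in zip(row, s))) ** 2
            for row, xi in zip(H, x)
        )
        if metric < best_metric:
            best_s, best_metric = list(s), metric
    return best_s, best_metric


H = [[1.0, 0.9], [0.0, 1.0]]
x = [0.15, -0.4]
s_ml, metric = ml_brute_force(H, x, L=2)
```

For the first simulation setting of Section 4 (m = 16, L = 2) this loop would already visit 65536 candidates per block.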
The closest lattice point search problem in (16) is known to be, in general, of exponential complexity [8]. There are several reduced complexity heuristic methods that can be used to obtain approximate solutions to (16). The most obvious are the following two.
• Inverting and rounding to the closest integer,

$$\hat{s}_B = [H^\dagger x]_{\mathcal{Z}}, \tag{17}$$

where $H^\dagger$ denotes the pseudo-inverse, and where for $a \in \mathbb{R}$ the notation $[a]_{\mathcal{Z}}$ means the closest integer to $a$. So $[H^\dagger x]_{\mathcal{Z}}$ is simply the vector obtained by applying this operation to each entry of $H^\dagger x$. The above $\hat{s}_B$ is called the Babai point (estimate). In the communications context, the preceding procedure is nothing but simple zero-forcing equalization, followed by a hard decision.
• In nulling and canceling [14], one uses the Babai estimate for one of the entries of s, say s(1); then assumes that s(1) is known and subtracts out its effect to obtain a reduced integer least-squares problem with 2MT − 1 unknowns. Then the procedure is repeated to solve similarly for s(2), and so on. (Nulling and canceling is fundamentally equivalent to the generalized decision-feedback equalization discussed in [15].) As a side note, one can further improve the performance of nulling and canceling by introducing optimal ordering: the algorithm starts from the "strongest" and proceeds to the "weakest" entry in s (see, e.g., [14,16]).
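The two heuristics above can be sketched as follows. The sketch makes several assumptions that are not in the text: the channel has already been reduced to a square upper-triangular system R s ≈ y (for example via a QR factorization, with y obtained by rotating the received vector), detection proceeds in the fixed order from the last entry to the first without optimal ordering, and decisions are sliced to an L-PAM grid with unit spacing. All function names are hypothetical.

```python
def slice_pam(a, L):
    """Nearest point of the L-PAM grid {-(L-1)/2, ..., (L-1)/2}."""
    half = (L - 1) / 2
    k = round(a + half)             # index of the nearest grid point
    k = min(max(k, 0), L - 1)       # clip to the finite constellation
    return k - half


def zero_forcing(R, y, L):
    """Invert the triangular system exactly, then slice every entry."""
    m = len(y)
    z = [0.0] * m
    for i in range(m - 1, -1, -1):
        tail = sum(R[i][j] * z[j] for j in range(i + 1, m))
        z[i] = (y[i] - tail) / R[i][i]
    return [slice_pam(a, L) for a in z]


def null_and_cancel(R, y, L):
    """Slice one entry at a time and feed the decision back."""
    m = len(y)
    s = [0.0] * m
    for i in range(m - 1, -1, -1):
        tail = sum(R[i][j] * s[j] for j in range(i + 1, m))  # decided symbols
        s[i] = slice_pam((y[i] - tail) / R[i][i], L)
    return s


R = [[1.0, 0.9], [0.0, 1.0]]
y = [0.15, -0.4]
s_zf = zero_forcing(R, y, L=2)
s_nc = null_and_cancel(R, y, L=2)
```

The only difference between the two routines is whether the back-substitution uses the unquantized values or the sliced decisions, which is exactly the decision-feedback step.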
The aforementioned heuristics have acceptable polynomial-time computational complexity for practical implementation purposes. However, their performance is inferior in comparison with the exact solution to the MLSE problem.
We proceed by describing an algorithm, the so-called sphere decoding, for efficient closest point search in the lattice.

Sphere decoding
The sphere decoding performs the closest-point search in a somewhat more sophisticated manner than a full search over the integer lattice, which requires exponential complexity. In particular, it searches only over lattice points lying in a certain hypersphere of radius r centered around the received vector x. The closest lattice point inside the sphere is then clearly the solution.
From a practical point of view, there are two issues that have to be resolved. One is the proper choice of the sphere radius r: if r is too large, there will be too many lattice points in the sphere and we may still require an exponential search; if r is too small, there will be no points in the sphere. The other issue concerns determining which lattice points lie within the sphere: if the algorithm were to check all the points in the lattice, we would again be stuck with an exponential search.
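We use a statistical criterion to choose the radius $r$. In particular, the radius of the sphere is chosen so that with high probability we find at least one lattice point in the sphere. To this end, note that

$$\frac{1}{\sigma^2}\|v\|^2 = \frac{1}{\sigma^2}\|x - Hs\|^2 \tag{18}$$

is a chi-square random variable with $2N(T+C-1)$ degrees of freedom, one per entry of $v$. (Recall that each entry of $v$ is an independent $\mathcal{N}(0, \sigma^2)$ random variable.) We choose the squared radius to be a linear function of the noise variance,

$$r^2 = \alpha \sigma^2, \tag{19}$$

where the coefficient $\alpha$ is chosen in such a way that with a high probability $p_{fp}$ we find a lattice point inside the sphere,

$$P\left(\|v\|^2 \le r^2\right) = P\left(\frac{\|v\|^2}{\sigma^2} \le \alpha\right) = p_{fp}. \tag{20}$$

We find $\alpha$ in (20) by a simple table lookup.

Once we have chosen the radius $r$, we need to determine which lattice points belong to the sphere of radius $r$. An efficient way to check whether a lattice point belongs to the sphere is given by the algorithm of Fincke and Pohst [7]. Note that $s$ lies in a sphere of radius $r$ if

$$r^2 \ge \|x - Hs\|^2 = (s - \hat{s})^T H^T H (s - \hat{s}) + \|x\|^2 - \|H\hat{s}\|^2, \tag{21}$$

where $\hat{s} = H^\dagger x$. To make the notation simpler, denote the size of the vector $s$ as

$$m = 2MT. \tag{22}$$

(Note that $m$ is the number of unknowns and it will be of interest in studying the complexity.) Introducing the QR decomposition $H = QR$ (where $Q$ is unitary and $R$ is upper triangular), and defining $r'^2 = r^2 - \|x\|^2 + \|H\hat{s}\|^2$, we can write (21) as

$$r'^2 \ge (s - \hat{s})^T R^T R (s - \hat{s}) = \sum_{i=1}^{m} r_{i,i}^2 \left( s_i - \hat{s}_i + \sum_{j=i+1}^{m} \frac{r_{i,j}}{r_{i,i}} (s_j - \hat{s}_j) \right)^2, \tag{23}$$

where $r_{i,j}$ denotes the $(i, j)$ entry of the matrix $R$. A necessary condition for $s$ to lie inside the sphere is therefore that

$$r'^2 \ge r_{m,m}^2 (s_m - \hat{s}_m)^2. \tag{24}$$

This condition is easy to check and it leads to

$$-\frac{r'}{r_{m,m}} + \hat{s}_m \le s_m \le \frac{r'}{r_{m,m}} + \hat{s}_m. \tag{25}$$

However, condition (25) is by no means sufficient. For every $s_m$ satisfying (25), upon defining $r'^2_{m-1} = r'^2 - r_{m,m}^2 (s_m - \hat{s}_m)^2$, one can state a stronger necessary condition

$$r'^2_{m-1} \ge r_{m-1,m-1}^2 \left( s_{m-1} - \hat{s}_{m-1} + \frac{r_{m-1,m}}{r_{m-1,m-1}} (s_m - \hat{s}_m) \right)^2, \tag{26}$$

which is equivalent to

$$-\frac{r'_{m-1}}{r_{m-1,m-1}} + \hat{s}_{m-1|m} \le s_{m-1} \le \frac{r'_{m-1}}{r_{m-1,m-1}} + \hat{s}_{m-1|m}, \qquad \hat{s}_{m-1|m} = \hat{s}_{m-1} - \frac{r_{m-1,m}}{r_{m-1,m-1}} (s_m - \hat{s}_m). \tag{27}$$

In a similar fashion, one proceeds for $s_{m-2}$, and so on, stating nested necessary conditions for all elements of $s$. This leads us to the sphere decoding algorithm, which essentially finds all points that satisfy the previously stated conditions.

Input: $R$, $x$, $\hat{s}$, $r$.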
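The radius selection and the nested search can be sketched as below. This is a minimal illustration under stated assumptions, not the listing from the paper: the function names are hypothetical, the table lookup is replaced by a bisection on the closed-form chi-square distribution function that holds for an even number of degrees of freedom, the search takes the reduced squared radius (the original squared radius minus the least-squares residual) as its input, and each candidate value on the L-PAM grid is tested directly instead of computing interval bounds.

```python
import math


def chi2_cdf_even(t, n):
    """P(chi-square with n degrees of freedom <= t), for even n only."""
    assert n > 0 and n % 2 == 0
    half_t, term, total = t / 2.0, 1.0, 0.0
    for j in range(n // 2):
        total += term
        term *= half_t / (j + 1)
    return 1.0 - math.exp(-half_t) * total


def squared_radius(sigma2, n, p_fp):
    """Squared radius such that the noise norm falls inside w.p. p_fp."""
    assert 0.0 < p_fp < 1.0
    lo, hi = 0.0, float(n)
    while chi2_cdf_even(hi, n) < p_fp:   # grow the bracket
        hi *= 2.0
    for _ in range(60):                  # bisection on the quantile
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, n) < p_fp:
            lo = mid
        else:
            hi = mid
    return sigma2 * hi


def sphere_points(R, s_hat, r2_reduced, L):
    """All L-PAM points s with ||R (s - s_hat)||^2 <= r2_reduced."""
    m = len(s_hat)
    half = (L - 1) / 2
    grid = [k - half for k in range(L)]
    found, s = [], [0.0] * m

    def search(i, used):
        # used: part of the metric spent on the already fixed entries i+1..m-1
        offset = sum(R[i][j] * (s[j] - s_hat[j]) for j in range(i + 1, m))
        for p in grid:
            term = (R[i][i] * (p - s_hat[i]) + offset) ** 2
            if used + term > r2_reduced:
                continue                 # nested necessary condition fails
            s[i] = p
            if i == 0:
                found.append((used + term, list(s)))
            else:
                search(i - 1, used + term)

    search(m - 1, 0.0)
    return found


def ml_point(R, s_hat, r2_reduced, L):
    """Closest point inside the sphere, or None if the sphere is empty."""
    found = sphere_points(R, s_hat, r2_reduced, L)
    return min(found)[1] if found else None
```

As a usage example, `ml_point([[1.0, 0.9], [0.0, 1.0]], [0.51, -0.4], 1.0, 2)` visits only the branches that survive the nested conditions and returns `[0.5, -0.5]`; if it returns `None`, the radius has to be increased and the search repeated.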
In general, the closest point search has both worst-case and average complexity that is exponential in the number of unknowns [17]. The same is true for the sphere decoding. However, in our application, the vector x in (16) is not an arbitrary point in space but rather a lattice point perturbed by the noise, as expressed by (14). Clearly, the higher the SNR in (12), the less perturbed the lattice point is. Therefore, one may suspect that the expected complexity of the sphere decoding algorithm will depend on the SNR. Indeed, this is the case: the higher the SNR, the lower the complexity.
In [13], we have computed in closed-form the expected complexity (averaged over the noise and the lattice) of the sphere decoding for the nondispersive (flat-fading) channels. It is shown that the expected complexity is polynomial-time over a wide range of SNRs, and is, in fact, often sub-cubic for SNRs that support the data rates being transmitted.
For dispersive channels, explicitly computing the expected complexity appears to be much more complicated, and we are currently not able to analytically perform all the required steps. Nonetheless, simulations suggest the same qualitative behavior of polynomial-time complexity, as we observe from the examples in Section 4.
Furthermore, the complexity of the sphere decoding can be improved by exploiting the Toeplitz structure of the channel matrix. In particular, note that the channel matrix preprocessing is required only in order to transform H into an upper triangular form. Due to the Toeplitz structure of H, it is in fact sufficient to perform QR factorization of only one coefficient matrix in the MIMO channel impulse response ($H_C$ in (10)). Upon QR factorization of $H_C$, the bottom square submatrix of H becomes upper triangular and thus can be processed by the sphere decoding algorithm to find a lattice point s; then one proceeds by adding the contribution of the top 2(C − 1) rows of H to find the metric $\|x - Hs\|^2$ and by testing whether the lattice point s belongs to the sphere.
Further improvement in the complexity of the sphere decoding can be obtained by employing the Schnorr-Euchner variation of the Fincke-Pohst algorithm (see [8,18]). Essentially, by examining points in the hypersphere in a different order (in particular, by starting from the Babai point), significant computational savings can be obtained [18].

SIMULATION RESULTS
We first consider a communication system with M = 2 transmit and N = 2 receive antennas. The channel memory is assumed to be C = 4, and the coherence interval time T = 4. Data is modulated onto 4-QAM constellation (corresponding to 2-PAM, or L = 2, in the real-valued set of (14)). The resulting transmission rate is therefore 4 bits/channel use. The performance comparison of an uncoded transmission in terms of bit error rate (BER) between the sphere decoding and nulling and canceling (or, equivalently, generalized DFE) is shown in Figure 3.
As an indicator of the expected computational complexity of the sphere decoding, we adopt the complexity exponent, $c_e$, defined as

$$c_e = \frac{\log(\text{expected complexity})}{\log m},$$

where $m$ is defined in (22). The expected complexity can therefore be expressed as $m^{c_e}$. The complexity exponent as a function of SNR for the previous example with m = 16 is shown in Figure 4. Note that for SNRs above 7 dB we obtain sub-cubic complexity.
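A short sketch of how such an exponent could be estimated from simulation runs is given below. The function name `complexity_exponent` is hypothetical, and the use of a per-run operation count as the complexity measure is an assumption made for illustration; the text does not state which unit of work is counted.

```python
import math


def complexity_exponent(op_counts, m):
    """Exponent c such that the average operation count equals m**c."""
    expected = sum(op_counts) / len(op_counts)   # empirical expectation
    return math.log(expected) / math.log(m)


# Hypothetical operation counts from three decoding runs with m = 16.
c_e = complexity_exponent([900, 1000, 1100], m=16)
is_sub_cubic = c_e < 3
```

An exponent below 3 means the average work grows more slowly than the cubic cost of the QR preprocessing.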
As another example, we consider the same 2 × 2 system (M = 2, N = 2), with C = 4, but now increase the block length to T = 8, and the constellation to 16-QAM, corresponding to L = 4 and a transmission rate of 8 bits/channel use. The performance comparison between the sphere decoding and generalized DFE is shown in Figure 5. The complexity exponent as the function of SNR for this example (where m = 32) is shown in Figure 6.
As a final example, consider the 4×4 communication system (M = 4, N = 4), with C = 4 and block length T = 8 (and thus m = 64). The constellation used is 4-QAM (hence L = 2, and the corresponding transmission rate is 8 bits/channel use). The performance comparison between sphere decoding and generalized DFE for this system is shown in Figure 7. The corresponding complexity exponent of the sphere decoding is shown in Figure 8 and is sub-cubic for SNRs above 12 dB.

DISCUSSION AND CONCLUSION
We have proposed sphere decoding for maximum-likelihood sequence detection of multiple antenna systems over frequency-selective channels. To employ the sphere decoding, the detection problem was posed as an integer least-squares problem. As illustrated by simulations, the sphere decoding provides an improvement of several dB over MIMO decision-feedback equalization. We have shown empirically that the expected computational complexity of the sphere decoding is polynomial (often sub-cubic) for a wide range of SNRs. Both the sphere decoding and MIMO DFE require some preprocessing of the channel matrix (usually in the form of a QR factorization) which, in general, has cubic complexity. Therefore, maximum-likelihood detection on MIMO channels with memory can be implemented with complexity similar to that of heuristic methods, but with significant performance gains.