MMSE Beamforming for SC-FDMA Transmission over MIMO ISI Channels



Introduction
Single-carrier frequency-division multiple access (SC-FDMA) transmission, also referred to as discrete Fourier transform (DFT) spread orthogonal frequency-division multiple access (OFDMA), has been selected for the uplink of the E-UTRA Long-Term Evolution (LTE) mobile communications system [1]. In comparison to standard OFDMA, SC-FDMA enjoys a reduced peak-to-average power ratio (PAPR), enabling a low-complexity implementation of the mobile terminal [2]. SC-FDMA is employed along with multiple-input multiple-output (MIMO) techniques in LTE in order to further improve coverage and capacity. Another advantage of SC-FDMA is that relatively simple frequency-domain minimum mean-squared error linear equalization (MMSE-LE) techniques [3,4] can be applied for signal recovery at the base station if a frequency-selective MIMO channel is present and introduces intersymbol interference (ISI). Incorporating additional MMSE noise (error) prediction, tailored for single-carrier transmission techniques with cyclic convolution (compare, for example, [5]), results in an MMSE decision-feedback equalization (MMSE-DFE) structure with enhanced performance compared to MMSE-LE.
In order to fully exploit the potential benefits of MIMO transmission, closed-loop transmit beamforming should be employed, compare, for example, [6,7], where a pragmatic eigenbeamforming algorithm using unitary precoding matrices in conjunction with uniform power allocation across all subcarriers has been introduced for SC-FDMA MIMO transmission with MMSE-LE. However, in this work, we show that eigenbeamforming with uniform power allocation is suboptimum. We prove that beamforming filters minimizing the mean-squared error (MSE) after MMSE-LE lead to eigenbeamforming with a nonuniform power allocation across the subcarriers. The optimum power allocation policy is derived and shown to be similar in spirit to classical results for the optimum continuous-time transmit filters for a conventional single-carrier transmission, compare [8]; that is, it is given by an inverse waterfilling scheme. For MMSE-DFE, it is shown that eigenbeamforming together with a nonuniform power allocation across the subcarriers is also optimal in general. Here, the optimum power allocation policy is proved to be given by classical capacity-achieving waterfilling, again similar to conventional single-carrier transmission, compare [9].
Simulation results demonstrate the high performance of the proposed beamforming schemes and show that beamforming introduces a certain increase in the peak-to-average power ratio (PAPR). For PAPR reduction, symbol amplitude clipping has been proposed in [7], which is known to introduce in-band signal distortion. Therefore, in this work, a modified version of the selected mapping (SLM) method [10,11] is used, which can be incorporated without loss of optimality into the beamformer design to keep the increase of the PAPR at a minimum.
This paper is organized as follows. In Section 2, the underlying system model for a single-user MIMO SC-FDMA transmission is described. MMSE-LE and MMSE-DFE for MIMO SC-FDMA transmission are introduced in Sections 3 and 4, respectively. MMSE beamforming for MMSE-LE and MMSE-DFE is derived in Sections 5 and 6, respectively, and a method for PAPR reduction is proposed in Section 7. Numerical results for beamforming and the proposed PAPR reduction method are presented in Section 8, and some conclusions and suggestions for future work are provided in Section 9.

Notation 1. $E\{\cdot\}$, $(\cdot)^T$, and $(\cdot)^H$ denote expectation, transposition, and Hermitian transposition, respectively. Bold lowercase letters and bold uppercase letters stand for column vectors and matrices, respectively. An exception is made for frequency-domain vectors, for which bold uppercase letters are also used. $[\mathbf{A}]_{m,n}$ denotes the element in the $m$th row and $n$th column of matrix $\mathbf{A}$; $\mathbf{I}_X$ is the $X \times X$ identity matrix, $\mathbf{0}_{X \times Y}$ stands for an $X \times Y$ all-zero matrix, and $\mathrm{diag}\{x_1, x_2, \ldots, x_n\}$ is a diagonal matrix with elements $x_1, x_2, \ldots, x_n$ on the main diagonal. $\mathrm{tr}(\cdot)$ and $\det(\cdot)$ refer to the trace and determinant of a matrix, respectively. $\mathbf{W}_X$ denotes the unitary $X$-point DFT matrix and $\otimes$ denotes cyclic convolution.

System Model
We consider single-user SC-FDMA transmission over a frequency-selective MIMO channel. Here, we assume $N_t = 2$ transmit antennas, which is the most realistic setting for the LTE uplink, and $N_r \geq 2$ receive antennas. The derived solution can be generalized in a straightforward way to any number of transmit antennas $N_t > 2$. Figure 1 shows the considered SC-FDMA transmitter. After channel encoding of binary symbols and interleaving, Gray mapping to a quadrature amplitude modulation (QAM) signal constellation is applied.

Figure 1: Transmitter with SC-FDMA signal processing and beamforming.
Subsequently, the frequency-domain symbols are mapped onto $N$ subcarriers, resulting in frequency-domain vectors $\mathbf{B}_i$ of size $N$. The symbols are assigned to $M$ consecutive subcarriers beginning from the $\nu_0$th subcarrier, which can be represented by an assignment matrix. An $N$-point inverse DFT then yields the time-domain vectors $\mathbf{b}_i$. Beamforming is described by a $2 \times 2$ beamformer frequency response matrix $\mathbf{P}[\mu]$, as shown in Figure 1. A cyclic prefix of length $L_c$ is added to the vectors $\mathbf{b}_i$, and the resulting vectors $\mathbf{b}_{i,c}$ form an SC-FDMA transmit symbol. (Here, index "c" stands for the additional cyclic prefix.)

The signal at the $l$th receive antenna, $l \in \{1, 2, \ldots, N_r\}$, is the sum of the transmit signals of both antennas, each filtered with the corresponding discrete-time subchannel impulse response $h_{l,i}[\lambda]$ of length $L$, plus the noise $n_l[\kappa]$. Here, $h_{l,i}[\lambda]$ characterizes transmission from the $i$th transmit antenna to the $l$th receive antenna, including transmit and receiver input filtering. (Symbols from the preceding SC-FDMA symbol can be ignored in the model because they do not contribute after removal of the cyclic prefix.) During the transmission of each slot consisting of several vectors (SC-FDMA symbols) $\mathbf{b}_{i,c}$, the MIMO channel is assumed to be constant, but it may change randomly from slot to slot. $n_l[\kappa]$ denotes spatially and temporally white Gaussian noise of variance $\sigma_n^2$.

In the receiver, the cyclic prefix is first removed, eliminating interference between adjacent SC-FDMA symbols if $L_c \geq L - 1$, and an $N$-point DFT then yields the received frequency-domain vector $\mathbf{R}_l$ at antenna $l$.
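The transmit-side processing described in this section can be illustrated with a minimal sketch. It covers a single antenna without beamforming, and the function names `dft` and `scfdma_symbol`, the toy sizes, and the QPSK input are assumptions made for illustration only; they are not taken from the paper.

```python
import cmath


def dft(x, inverse=False):
    """Unitary DFT (or inverse DFT) of a list of complex numbers."""
    n = len(x)
    sign = 1 if inverse else -1
    scale = n ** -0.5
    return [
        scale * sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
        for m in range(n)
    ]


def scfdma_symbol(a, n_sub, nu0, l_cp):
    """Build one SC-FDMA transmit symbol from M data symbols (hypothetical helper)."""
    m = len(a)
    if nu0 + m > n_sub or l_cp > n_sub:
        raise ValueError("subcarrier block or cyclic prefix does not fit")
    spread = dft(a)                      # M-point DFT spreading
    mapped = [0j] * n_sub                # frequency-domain vector of size N
    mapped[nu0:nu0 + m] = spread         # M consecutive subcarriers from nu0
    b = dft(mapped, inverse=True)        # N-point inverse DFT
    return b[n_sub - l_cp:] + b          # prepend cyclic prefix of length L_c


# Toy example: M = 4 QPSK symbols on N = 8 subcarriers, nu0 = 2, L_c = 2.
s = 2 ** -0.5
qpsk = [s * (1 + 1j), s * (1 - 1j), s * (-1 + 1j), s * (-1 - 1j)]
tx = scfdma_symbol(qpsk, n_sub=8, nu0=2, l_cp=2)
```

Because unitary transforms are used, the energy of the symbol without its cyclic prefix equals the energy of the data block, which makes the sketch easy to sanity-check.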

MMSE-LE for SC-FDMA
MMSE-LE for a MIMO SC-FDMA transmission has been outlined, for example, in [4,6]. The optimum filtering matrix for joint processing of the vectors $\mathbf{R}_l$ is given in [4, (8)]. The resulting error vector of MMSE equalization is expressed in (9) as a sum over the discrete frequencies $\mu$. Taking into account the statistical independence of the terms for different discrete frequencies $\mu$ in the sum on the right-hand side of (9) and the mutual independence of $\mathbf{A}[\mu]$ and $\mathbf{N}[\mu]$, the error correlation matrix (10) is obtained.
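As a rough illustration of per-subcarrier MMSE linear equalization, the sketch below uses the standard textbook form of the MMSE filter, $\mathbf{F} = (\mathbf{H}^H\mathbf{H} + (\sigma_n^2/\sigma_a^2)\mathbf{I})^{-1}\mathbf{H}^H$, whose error covariance is $\sigma_n^2(\mathbf{H}^H\mathbf{H} + (\sigma_n^2/\sigma_a^2)\mathbf{I})^{-1}$. This form, the symbol $\sigma_a^2$ for the data variance, and all function names are assumptions; the notation of [4, (8)] may differ. Here `h` stands for the effective channel of one subcarrier (channel times beamformer) with two transmit streams.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def herm(a):
    """Hermitian transpose of a matrix given as nested lists."""
    return [[complex(a[i][j]).conjugate() for i in range(len(a))] for j in range(len(a[0]))]


def inv2(a):
    """Inverse of a 2 x 2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def mmse_le(h, sigma_n2, sigma_a2=1.0):
    """Return the MMSE filter and error covariance for one subcarrier (N_r x 2 channel)."""
    gram = matmul(herm(h), h)
    zeta = sigma_n2 / sigma_a2
    reg = [[gram[i][j] + (zeta if i == j else 0.0) for j in range(2)] for i in range(2)]
    reg_inv = inv2(reg)
    f = matmul(reg_inv, herm(h))
    err_cov = [[sigma_n2 * reg_inv[i][j] for j in range(2)] for i in range(2)]
    return f, err_cov


def average_mse(channels, sigma_n2, sigma_a2=1.0):
    """Per-stream time-domain MSE: subcarrier average of the error covariance diagonal."""
    mse = [0.0, 0.0]
    for h in channels:
        _, cov = mmse_le(h, sigma_n2, sigma_a2)
        for i in range(2):
            mse[i] += cov[i][i].real / len(channels)
    return mse


# Two toy subcarriers with diagonal 2 x 2 channels.
channels = [[[2, 0], [0, 1]], [[1, 0], [0, 1]]]
mse = average_mse(channels, sigma_n2=1.0)
```

The averaging over subcarriers reflects that, with a unitary inverse DFT after equalization, the time-domain error variance of each stream is the mean of its frequency-domain error variances.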

MMSE-DFE for SC-FDMA
To enhance the performance of MMSE-LE, a MIMO noise (error) prediction-error filter may be inserted after the MMSE linear equalizer, as shown in Figure 2, and applied to $\mathbf{y}[k] \triangleq [y_1[k]\; y_2[k]]^T$. The introduced postcursor intersymbol interference is removed by decision feedback after the quantizer Q, which produces decisions $\hat{\mathbf{a}}[k]$ for $\mathbf{a}[k]$. This results in an MMSE-DFE structure, where the feedback filter coefficient matrices are identical to those of the prediction filter $\mathbf{T}[k]$, compare, for example, [5].
The signal after prediction-error filtering is described in terms of $\mathbf{T}_e[k]$, the coefficients of the prediction-error filter, and the error signal of the MMSE-LE output filtered with the prediction-error filter. The optimum predictor coefficients are obtained from the multichannel Yule-Walker equations [4,5], which involve the cyclic autocorrelation matrix sequence of the error signal of MMSE-LE (with corresponding periodical extension).

4.1. Case ($q_p = M - 1$).
We now consider the limit case of the maximum possible prediction order, $q_p = M - 1$. Here, a closer inspection of (12) yields condition (14) for the optimum prediction-error filter.
(For evaluation of the cyclic convolution arising in (14), the matrix sequences are periodically extended beyond the set $k \in \{0, 1, \ldots, M - 1\}$.) Solving (14) in the frequency domain and taking into account the constraint $\mathbf{T}_e[0] = \mathbf{I}_2$, the frequency response $\mathbf{S}[\mu]$ of the optimum prediction-error filter can be expressed as in (15). After some further straightforward calculations, the covariance matrix of the prediction error and its power density spectrum are obtained in (16) and (17). The frequency response in (15) may be viewed as that of a multichannel extension of an interpolation-error filter, compare [12]. This is because for $q_p = M - 1$ all other available error vectors, that is, future and past vectors, contribute to the estimation of the current error vector, and the filter no longer acts as a predictor but as an interpolator. Also, (16) and (17) may be interpreted as multichannel cyclic generalizations of corresponding results in [12]. It is important to note that an interpolation error is not white, in contrast to the prediction error produced by an optimum causal prediction filter, compare also [13]. In fact, it can be shown that the cascade of MMSE-LE and an interpolation-error filter has a frequency response proportional to $\mathbf{H}^H[\mu]$, that is, a matched filter results, requiring a DFE feedback filter with equally strong causal and noncausal coefficients.

4.2. Case ($q_p = (M - 1)/2$).
In a system with cyclic convolution, a predictor with $q_p = (M - 1)/2$ ($M$ odd) may be viewed as the counterpart of a classical, causal prediction filter of infinite order with linear convolution. Therefore, it can be expected that for sufficiently large $M$, results for infinite prediction order and linear convolution hold well for the considered case. In [9], it has been shown that for a multichannel MMSE-DFE, the optimum filters minimizing $\mathrm{tr}(\boldsymbol{\Phi}_{w_p w_p})$ (arithmetic MSE) also minimize $\det(\boldsymbol{\Phi}_{w_p w_p})$ (geometric MSE), that is, both criteria are equivalent, and an expression for the minimum determinant has been given [9, (37)]. Adapting this expression to our notation and discretizing the integral, (18) is obtained. Elaborating further on (18) yields (19).

Optimum Beamforming and Power Allocation for MMSE-LE
If knowledge of the MIMO transmission channel is available at the transmitter, this can be exploited to make the transmit signal more robust to distortions during transmission. Therefore, in this section, a beamformer is presented which is optimal in the MMSE sense when MMSE linear equalization is applied at the receiver side.
Hence, the optimum beamformer is given by the solution of an optimization problem in which the cost function (20) is minimized subject to a power constraint, where $P$ denotes the prescribed average transmit power per subcarrier and i.i.d. data sequences have been assumed for the power constraint. Using an eigenvalue decomposition and an identity for square matrices $\mathbf{A}$ and $\mathbf{B}$, the cost function can be rewritten, compare also [14], where an OFDM MMSE beamforming problem has been considered. For the singular value decomposition (SVD) of $\mathbf{P}[\mu]$, a representation with a $2 \times 2$ unitary matrix $\mathbf{Q}[\mu]$ exists. Then, the cost function can be written as in (28), compare (21), where $\mathrm{tr}(\mathbf{A}\mathbf{B}) = \mathrm{tr}(\mathbf{B}\mathbf{A})$ has been used, and the step from (29) to (30) follows from majorization theory, compare [14,15]. Hence, we have proved that there is always an eigenbeamformer which exhibits the same cost function as a given arbitrary beamformer at equal or even lower transmit power. Therefore, eigenbeamforming is optimum and is considered further in the following.

MSE Minimizing Power Allocation for Eigenbeamforming.
For eigenbeamforming, it is straightforward to simplify the error correlation matrix in (10), which leads to the power allocation problem (32). Convex optimization problems of the form (32) have been considered, for example, in [16,17]. Via the Karush-Kuhn-Tucker (KKT) optimality conditions [16], the optimum power allocation can be obtained. For zero-forcing (ZF) equalization, a condition has to be fulfilled for the existence of a stable ZF equalizer. The frequency response of the corresponding beamforming filter includes a unitary matrix $\mathbf{M}[\mu]$. It can be observed that factors $1/\sqrt{d_i[\mu]}$ are employed for both beamforming filtering and ZF-LE, that is, beamforming acts as a kind of preequalization, and the channel equalizer is split into equal (up to a scaling and unitary matrices) transmitter and receiver parts.
It is interesting to note that our results for beamforming for SC-FDMA with LE are similar in spirit to the classical results of Berger and Tufts [8] and Yang and Roy [18], who developed the optimum continuous-time transmit filters assuming LE at the receiver for transmission with conventional linear modulation over single-input single-output (SISO) and MIMO channels, respectively.
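To make the inverse waterfilling behaviour concrete, the following sketch minimizes a simplified sum MSE, $\sum_k 1/(1 + p_k g_k/\sigma_n^2)$, under a total power budget. This cost, unit data variance, the gains $g_k$ (standing for $d_i^2[\mu]$ over all streams and subcarriers), and the function name are assumptions for illustration; the closed form $p_k = [c\sqrt{a_k} - a_k]^+$ with $a_k = \sigma_n^2/g_k$ follows from the KKT conditions of this simplified problem and is not claimed to match the paper's expression term by term.

```python
import math


def le_power_allocation(gains, sigma_n2, total_power):
    """Minimize sum_k 1/(1 + p_k*g_k/sigma_n2) s.t. sum_k p_k = total_power, p_k >= 0."""
    a = [sigma_n2 / g for g in gains]                    # inverse SNR per unit power
    active = sorted(range(len(a)), key=lambda k: a[k])   # strongest modes first
    c = 0.0
    while active:
        c = (total_power + sum(a[k] for k in active)) / sum(math.sqrt(a[k]) for k in active)
        worst = active[-1]
        if c * math.sqrt(a[worst]) - a[worst] >= 0.0:
            break                                        # all active modes get p_k >= 0
        active.pop()                                     # switch off the weakest mode
    p = [0.0] * len(a)
    for k in active:
        p[k] = c * math.sqrt(a[k]) - a[k]
    return p


def sum_mse(gains, powers, sigma_n2):
    """Simplified sum MSE for a given power allocation."""
    return sum(1.0 / (1.0 + p * g / sigma_n2) for g, p in zip(gains, powers))


gains = [4.0, 1.0]
p_opt = le_power_allocation(gains, sigma_n2=1.0, total_power=2.0)
mse_opt = sum_mse(gains, p_opt, 1.0)
mse_uniform = sum_mse(gains, [1.0, 1.0], 1.0)
```

In this toy case the weaker mode receives more power than the stronger one, while a mode that is far too weak is switched off entirely, which is the qualitative behaviour expected of an inverse waterfilling policy.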

Optimum Beamforming and Power Allocation for MMSE-DFE
Here, $p_i[\mu] \triangleq c_i^2[\mu]$ is again the power allocation coefficient for transmit antenna $i$ and subcarrier $\mu$. It is easy to see that an optimum power allocation policy puts all available transmit power in that stream $i$ and subcarrier $\mu$ with maximum $d_i^2[\mu]$. This, however, results in a widely spread impulse response of the overall channel and the corresponding DFE feedback filter, that is, the MMSE-DFE is likely to be affected by severe error propagation. It should be noted that such a feedback filter fed by hard decisions can only be employed in the last iterations of an iterative DFE, when reliable past and future decisions are available [19]. However, beamforming should be adjusted to the situation in the first iteration, where a causal feedback filter has to be applied. Because of this and other practical constraints, the scheme with $q_p = M - 1$ is mainly of theoretical interest and is not considered for our numerical results for noniterative DFE schemes. (Iterative DFE schemes are beyond the scope of this paper.)
For $q_p = (M - 1)/2$, the optimization problem (38) is considered, where the cost function $J$ is given by (39) and $P$ again denotes the prescribed average transmit power per subcarrier.
Problems of this type have been treated in [17, pages 136-137]. Via the Karush-Kuhn-Tucker (KKT) optimality conditions [16], the well-known classical waterfilling solution is obtained. The determination of the water level $\omega$ and the subsets $S_i \subseteq \{0, 1, \ldots, M - 1\}$ is well investigated, compare, for example, [17]. It should be noted that the cost function $J$ in (39) characterizes the MMSE-DFE performance exactly only for $M \to \infty$ and $q_p = (M - 1)/2$. However, it is still a very good performance approximation for practically relevant $M$ and $q_p$ and is therefore suitable for beamformer optimization also in these cases. Unlike for linear equalization, the capacity-achieving waterfilling power allocation solution is obtained for MMSE-DFE, which is in agreement with results for systems with linear convolution, compare, for example, [9]. It should be noted that for linear equalization power has to be allocated mainly to subcarriers where the channel frequency response is weak, whereas for DFE mainly the strong subcarriers are used. Only for $\sigma_n^2 \to 0$ does a flat transmit spectrum result. Regarding the computational complexity of beamforming filter calculation, similar remarks as for LE hold, compare the last paragraph of Section 5.
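The classical waterfilling rule referred to here can be sketched as follows. The sketch assumes the textbook form $p_k = [\omega - \sigma_n^2/g_k]^+$ with gains $g_k$ standing for $d_i^2[\mu]$ collected over all streams and subcarriers; the function name and the toy numbers are hypothetical, and the water level is found by successively removing the weakest mode.

```python
def waterfilling(gains, sigma_n2, total_power):
    """Classical waterfilling: p_k = max(omega - sigma_n2/g_k, 0) with sum_k p_k = total_power."""
    a = [sigma_n2 / g for g in gains]                    # noise-to-gain ratios
    active = sorted(range(len(a)), key=lambda k: a[k])   # strongest modes first
    omega = 0.0
    while active:
        omega = (total_power + sum(a[k] for k in active)) / len(active)
        if omega - a[active[-1]] >= 0.0:
            break                                        # weakest active mode is above ground
        active.pop()                                     # otherwise drop it and recompute
    p = [0.0] * len(a)
    for k in active:
        p[k] = omega - a[k]
    return p, omega


gains = [4.0, 1.0]
p_dfe, omega = waterfilling(gains, sigma_n2=1.0, total_power=2.0)
p_low_noise, _ = waterfilling(gains, sigma_n2=1e-9, total_power=2.0)
```

In contrast to the linear-equalization case, the stronger mode receives more power here, and the allocation becomes nearly flat when the noise variance is very small.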

PAPR Reduction
In the previous analysis, we chose for simplicity $\mathbf{K}[\mu]$ in (25) to be the identity matrix. As a unitary $\mathbf{K}[\mu]$ has no influence on the cost functions (20) and (39) and the power constraint in (38), in this section $\mathbf{K}[\mu]$ is exploited for PAPR reduction. For this purpose, the SLM method, proposed in [10] and extended to MIMO systems in [11], is invoked and adjusted to our problem. First, we define $N_{\mathrm{set}}$ subsets of subcarriers $U_\iota$, $\iota = 1, \ldots, N_{\mathrm{set}}$, where $U_\iota$ contains a specified subcarrier arrangement for both transmit antennas. Subsequently, a phase rotation $\theta_\iota \in \Theta$, where $\Theta$ contains $N_\theta$ allowed rotation angles, is included in the beamforming filter for each subset $U_\iota$, exploiting unused degrees of freedom in the filter design, compare Section 5.1. This yields a modified MMSE-optimum eigenbeamformer for $\mu \in U_\iota$. For transmission, the combination of rotation angles with the lowest PAPR is chosen, where the PAPR is calculated for every combination of rotation angles $\theta_\iota$ in the time domain. This procedure is repeated for each SC-FDMA symbol. Note that with an increasing number of angles in $\Theta$ and an increasing number of subsets, the number of possible combinations increases according to $N_\theta^{N_{\mathrm{set}}}$ and, hence, the computational complexity of finding the best angles $\theta_\iota$ increases. In order to take the rotation operation into account appropriately in the equalizer design at the receiver side, the $N_{\mathrm{set}}$ chosen angles $\theta_\iota$ have to be transmitted to the receiver as side information, as is typically done in SLM-type PAPR reduction schemes, compare [10,11].
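The exhaustive search over rotation angles can be sketched as below. This is a simplified single-antenna illustration in which each subcarrier subset is multiplied by a scalar phase factor; in the scheme described above the rotation is part of the $2 \times 2$ beamforming filter for both antennas. The function names, the toy spectrum, and the choice of subsets and angles are assumptions.

```python
import cmath
from itertools import product


def idft(x):
    """Unitary inverse DFT of a list of complex numbers."""
    n = len(x)
    return [
        n ** -0.5 * sum(x[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n))
        for m in range(n)
    ]


def papr(x):
    """Peak-to-average power ratio of a time-domain sequence."""
    powers = [abs(v) ** 2 for v in x]
    return max(powers) / (sum(powers) / len(powers))


def slm_search(freq_vec, subsets, angles):
    """Try all N_theta**N_set angle combinations and keep the one with the lowest PAPR."""
    best_value, best_combo = None, None
    for combo in product(angles, repeat=len(subsets)):
        rotated = list(freq_vec)
        for theta, subset in zip(combo, subsets):
            factor = cmath.exp(1j * theta)
            for mu in subset:
                rotated[mu] = freq_vec[mu] * factor
        value = papr(idft(rotated))
        if best_value is None or value < best_value:
            best_value, best_combo = value, combo
    return best_value, best_combo


# Toy example: flat spectrum on 8 subcarriers, N_set = 2 subsets, N_theta = 4 angles.
spectrum = [1 + 0j] * 8
subsets = [list(range(0, 4)), list(range(4, 8))]
angles = [0.0, cmath.pi / 2, cmath.pi, 3 * cmath.pi / 2]
best_papr, best_angles = slm_search(spectrum, subsets, angles)
```

Since the angle 0 is among the candidates, the selected combination can never have a higher PAPR than the unrotated signal; the chosen angles are what would have to be signalled to the receiver as side information.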

Assumptions for Simulations.
For the presented simulation results, parameters based on the LTE FDD standard [1] are adopted. Here, Turbo coding with code rate $R_c$ followed by channel interleaving is applied over a block of two slots, each containing 7 SC-FDMA symbols. For parallel transmission, each of the two slots is assigned to one antenna. The DFT sizes are chosen as $M = 300$ and $N = 512$, where $\nu_0 = 60$. The MIMO subchannels are assumed to be mutually independent, and the receivers and transmitters with beamforming have ideal channel knowledge.

Figures 3 and 4 show the block error rate (BLER) after channel decoding versus $E_b/N_0$ ($E_b$: average received bit energy, $N_0$: single-sided power spectral density of the continuous-time noise) for code rates $R_c \in \{1/3, 1/2, 2/3\}$ and for transmission over a MIMO channel with ITU Pedestrian B and A subchannels [1], respectively. For each channel type, the performance of conventional linear equalization without beamforming (LE), linear equalization with eigenbeamforming and uniform power allocation (LE-BF), and linear equalization with eigenbeamforming and optimal power allocation (LE-PA) is shown. For $R_c = 1/3$ and $R_c = 1/2$, a significant gain can be achieved by applying LE-BF only, but additionally using the proposed power allocation yields a further performance improvement for both channels. However, for $R_c = 2/3$ we observe a degradation of LE-BF relative to the BLER of LE for both channel profiles. By applying the optimal power allocation, the loss introduced by eigenbeamforming can be compensated, yielding better results than LE.

To investigate this behaviour in more detail, the MSEs of the substreams are analyzed. Figure 5 shows the MSE of LE, LE-PA, and LE-BF for transmit antenna 1 (Figure 5(a)) and transmit antenna 2 (Figure 5(b)) for one channel realization, compare (10).

Note that $N_\theta = 1$ corresponds to MMSE beamforming without any rotation operation.
Figure 6 shows the complementary cumulative distribution function of the PAPR for SC-FDMA transmission without beamforming, with MMSE beamforming for LE ($E_b/N_0 = 7$ dB) and different $N_\theta$, and for OFDMA transmission. As is well known, pure SC-FDMA transmission has a lower PAPR than OFDMA transmission, which can also be observed here. However, applying MMSE beamforming with $N_\theta = 1$ increases the PAPR, nearly bridging the gap between OFDMA and SC-FDMA. With increasing $N_\theta$, the PAPR for MMSE beamforming can be decreased, where we note that the additional PAPR reduction diminishes for $N_\theta > 8$. From Figure 6 we can see that $N_\theta = 4$ is already sufficient to reduce the PAPR significantly, which also means that the increase in computational complexity due to the proposed PAPR reduction scheme can be kept low. Recall that, in contrast to symbol amplitude clipping as considered in [7], the proposed PAPR reduction technique does not have any effect on the BLER performance.

Results for MMSE-DFE.
For DFE, a code rate of $R_c = 1/3$ is used, and $q_b = 60$ symbols are fed back, where we assume ideal feedback. Note that the performance of DFE with ideal feedback can be achieved with Tomlinson-Harashima precoding (up to a small transmit power increase) [20] or alternatively by an interleaving scheme that allows the use of decoded bits to generate the feedback symbols [5,21]. Figures 7 and 8 show the BLER after channel decoding versus $E_b/N_0$ for different channel profiles. The performance of DFE without beamforming, DFE with eigenbeamforming, and DFE with eigenbeamforming and MMSE power allocation is compared to that of conventional LE, LE with eigenbeamforming, and LE with MMSE beamforming, respectively.
For the simulation results shown in Figure 7, the Pedestrian B channel profile has been used for the MIMO subchannels. It can be clearly seen that each of the DFE schemes exhibits a lower BLER than the corresponding LE scheme and that the performance of DFE can be boosted significantly with the proposed MMSE power allocation. Compared to pure LE, a gain of more than 2 dB can be observed for DFE with MMSE power allocation.

A further analysis of the MSEs of the transmit streams after equalization (results not depicted) has shown that the application of eigenbeamforming leads from a balanced error level for both substreams in the case of no beamforming to an unbalanced MSE pattern, where there is one strong substream and one weaker substream with unreliable symbols. Additional MMSE power distribution even tends to enlarge the difference between the substreams, which is preferable for a scheme with strong channel coding, as the reliable symbols can be exploited for error correction of less reliable symbols. On the other hand, if weak or no channel coding is applied, the weaker substream dominates the performance of the DFE. Hence, in this case MMSE beamforming leads to a higher BLER.
Finally, Figure 9 shows the complementary cumulative distribution function of the PAPR for SC-FDMA transmission, SC-FDMA with MMSE power distribution for DFE ($E_b/N_0 = 7$ dB) and different $N_\theta$, and orthogonal frequency-division multiple access (OFDMA) transmission. $N_{\mathrm{set}} = 2$ has been selected. Similar to LE, $N_\theta = 4$ is already sufficient to reduce the PAPR significantly.

Conclusion and Future Work
In this paper, we have investigated the application of beamforming to spatial multiplexing MIMO systems with SC-FDMA transmission. The transmitter was optimized for MMSE-LE and MMSE-DFE, respectively, at the receiver side.
With the MMSE as optimality criterion, the derivations lead to an eigenbeamformer with nonuniform power allocation.
Here, minimization of the arithmetic MSE for LE results in a power distribution scheme where more power is assigned to poor frequencies, which is in contrast to the classical capacity-achieving waterfilling scheme resulting from minimization of the geometric MSE in the case of DFE. This proves that eigenbeamforming with uniform power allocation, which was proposed in other work on beamforming for SC-FDMA, is suboptimum. Simulation results confirmed these derivations. Because perfect feedback has to be assumed for a satisfactory performance of the beamforming scheme with DFE, the combination of Tomlinson-Harashima precoding and beamforming at the transmitter side should be investigated in more detail.
To mitigate the increase of the PAPR, which is caused by beamforming in general, the beamformer design was modified exploiting unused degrees of freedom without compromising optimality. Without affecting the BLER performance, rotations were introduced, where for transmission the combination of rotations with the lowest PAPR was chosen. It was shown that a small set of different angles of rotation is sufficient to obtain a significant PAPR reduction, hence, the additional computational complexity can be kept low. However, the angles of rotation used in the transmitter need to be communicated to the receiver for derotation after reception. Although a slight increase of the PAPR still remains, this seems to be acceptable considering the performance gain that can be achieved with beamforming and optimal power allocation.
In this work, we have assumed the availability of the exact beamforming matrices at the transmitter. In practice this is not possible, as it would cause a high overhead to feed back these matrices. Therefore, quantized beamforming matrices are fed back from the receiver to the transmitter in a real system, which reduces the transmission overhead but also leads to a channel mismatch. The impact of quantization on the performance remains to be determined in future work. Another open issue is to determine an optimization criterion that is more suitable for weak channel coding. It is known that for high code rates the weaker subchannel dominates the BLER performance. Therefore, a Min-Max optimization, where the maximum MSE of the substreams is minimized, seems to be a promising approach in this case.