Frequency domain soft-constraint multimodulus blind equalization for uplink SC-FDMA

Single-carrier frequency division multiple access (SC-FDMA) has been adopted as the uplink multiple-access scheme in the 3rd Generation Partnership Project (3GPP) Long-Term Evolution (LTE) standard. It offers performance and complexity comparable to orthogonal frequency division multiple access (OFDMA) with a lower peak-to-average power ratio (PAPR), which gives power-efficient transmission and longer battery life for mobile terminals. However, due to its single-carrier nature, SC-FDMA performance degrades in channels with long impulse responses, and equalization becomes prohibitively complex when implemented in the time domain (TD). Furthermore, of the seven SC-FDMA symbols in the LTE uplink slot, one full symbol is used for channel estimation, leading to about 14 % throughput degradation. In this work, a novel frequency domain soft-constraint satisfaction multimodulus blind algorithm (FDSCS-MMA) is developed and proposed. The frequency domain approach reduces computational complexity, while blind implementation improves spectral efficiency and throughput. The algorithm convergence is further improved by normalizing each frequency bin in the weight update. Simulation results show superior performance of the developed algorithm over other blind algorithms.


Introduction
The demand for high data transmission rates has been on the rise in recent years, with organizations and individuals requiring ultra-high-speed data transmission schemes. Broadband wireless transmission is employed to deliver these high data rates to subscribers in a very hostile radio environment that subjects the transmitted signal to multipath propagation. The multipath can be severe, requiring sophisticated corrective measures at the receiver. Orthogonal frequency division multiple access (OFDMA) is a popular technique that uses low-symbol-rate modulation specially designed to cope with severe channel conditions in multipath environments [1]. However, it has a high peak-to-average power ratio (PAPR), which imposes a high power penalty on mobile users [2].
Single-carrier frequency division multiple access (SC-FDMA) is a variant of OFDMA with an additional discrete Fourier transform (DFT) processing block hence is function in [12] seeks to minimize the average error for a block of received symbols which does not necessarily force/restore each of the transmitted symbols to its correct point on the signal constellation while [13] is essentially a time domain implementation and hence has high complexity [12]. However, since SC-FDMA technique is set up in frequency domain, it is easier to implement its equalization in frequency domain as this avoids a lot of complications [10].
This paper presents a novel frequency domain implementation of the soft-constraint satisfaction multimodulus algorithm (FDSCS-MMA) for equalization of SC-FDMA. The proposed frequency domain implementation is based on SCS-MMA [14], which was derived by applying the principle of soft-constraint satisfaction to relax the constraints in Lin's cost function [15]. This implementation avoids the use of reference symbols in order to improve the spectral efficiency and throughput. This is highly desirable because, in the LTE uplink, a frame has 20 slots and each slot contains seven SC-FDMA symbols. Of these seven, one full training SC-FDMA symbol (preamble) is followed by six data symbols (which carry no training), and the channel is estimated (with a channel-estimate-based approach, e.g., least squares) using this single preamble [16]. Hence, one out of seven SC-FDMA symbols in the LTE uplink is already designated for channel estimation, leading to approximately 14 % (1/7 ≈ 14.3 %) throughput degradation [3]. Blind algorithms therefore provide an attractive solution for SC-FDMA equalization. Also, the frequency domain (FD) implementation greatly reduces the computational complexity [17] associated with time domain implementation in channels with long impulse responses and has many other advantages [18]. The frequency domain approach thus reduces computational complexity, while blind implementation improves spectral efficiency and throughput [19][20][21][22]. Furthermore, FDSCS-MMA achieves lower mean square error (MSE) than both the normalized FD-modified constant modulus algorithm (NFDMMA) [23] and the popular constant modulus algorithm (CMA). Finally, FDSCS-MMA convergence is greatly improved by normalizing each frequency bin in the weight update. We have used the square root of the spectral power of the equalizer input for our normalization rather than the spectral power considered in [14], as we found that this gives better performance.
Specific contributions as presented in this paper include: (1) frequency domain implementation of SCS-MMA, (2) convergence improvement of FDSCS-MMA to realize normalized FDSCS-MMA, (3) adaptation and implementation of NFDSCS-MMA for the equalization of SC-FDMA, (4) reduced overhead and improved bandwidth efficiency compared to channel estimation algorithms, (5) superior phase recovery and intersymbol interference (ISI) optimization capability compared to other popular blind algorithms such as CMA and MMA. This paper is organized as follows: Section 2 details the mathematical description of SC-FDMA system. Section 3 provides the time domain (TD) implementation of the blind algorithms. Section 4 describes the FD implementation of the proposed algorithms. Section 5 shows simulation results of the performance for the proposed equalizers. Section 6 concludes the paper.

Description of SC-FDMA
SC-FDMA is a multi-access single-carrier modulation technique with a frequency domain equalization at the receiver and allows parallel transmission of multiple users' data. It is a variant of OFDMA with an additional DFT and IDFT processing block at the transmitter and receiver, respectively. What follows in this section is a detailed treatment of this well-known scheme. As stated in [8], it is advantageous to set up our system in terms of matrices as this simplifies implementation, provides a clear understanding of the system, and eases many performance analyses. Hence, our system is set up in this manner with the block diagram shown in Fig. 1.
In order to form an SC-FDMA block, sequences of data bits $\{a_n\}$ are first modulated into symbols using any of the modulation methods (BPSK, QPSK, or M-QAM). For the qth user, where Q represents the total number of users in the system, a data block x consisting of N symbols is generated from the resulting modulation scheme as

$$\mathbf{x} = [x_0, x_1, \ldots, x_{N-1}]^T. \tag{1}$$

The N-point DFT of x is taken as $\mathbf{X} = \mathbf{F}_N \mathbf{x}$ to yield frequency coefficients, which are then assigned orthogonal subcarriers for transmission over the channel. From the DFT operation, X represents the DFT outputs for the qth user, given as

$$\mathbf{X} = [X_0, X_1, \ldots, X_{N-1}]^T, \tag{2}$$

while $\mathbf{F}_N$ is an N × N DFT matrix defined as

$$[\mathbf{F}_N]_{k,n} = \frac{1}{\sqrt{N}}\, e^{-j 2\pi k n / N}, \qquad k, n = 0, 1, \ldots, N-1. \tag{3}$$

The factor $1/\sqrt{N}$ is a normalization factor to ensure the same signal output power. There are two ways of assigning subcarriers in SC-FDMA. When adjacent subcarriers are allocated to the DFT outputs from the same user, such that the user data is confined to only a fraction of the available bandwidth, the scheme is referred to as localized SC-FDMA (LFDMA); when the DFT outputs are spread over the entire bandwidth with zero amplitude allocated to unused subcarriers, it is referred to as distributed SC-FDMA (DFDMA). A special case of DFDMA is interleaved SC-FDMA (IFDMA), where the occupied subcarriers are equally spaced. The allocation schemes can be implemented using a resource allocation matrix D given in [8].
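The DFT precoding and the two subcarrier allocation schemes described above can be illustrated with a minimal Python sketch. It is not taken from the paper's MATLAB simulations; the function names (`unitary_dft`, `subcarrier_indices`, `map_to_grid`), the zero-based user index, and the toy sizes N = 4 and M = 8 are assumptions made only for illustration.

```python
import cmath

def unitary_dft(x):
    """N-point DFT with 1/sqrt(N) scaling, so that signal power is preserved."""
    n_pts = len(x)
    scale = 1 / n_pts ** 0.5
    return [scale * sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts)
                        for n in range(n_pts))
            for k in range(n_pts)]

def subcarrier_indices(N, M, user, mode):
    """Indices of the N subcarriers (out of M) occupied by one user."""
    Q = M // N  # bandwidth expansion factor
    if mode == "localized":      # LFDMA: one contiguous band per user
        return [user * N + k for k in range(N)]
    if mode == "interleaved":    # IFDMA: equally spaced subcarriers
        return [k * Q + user for k in range(N)]
    raise ValueError("mode must be 'localized' or 'interleaved'")

def map_to_grid(X, indices, M):
    """Place the DFT outputs on their subcarriers; unused subcarriers stay zero."""
    grid = [0j] * M
    for idx, value in zip(indices, X):
        grid[idx] = value
    return grid

# Four 4-QAM symbols of user 1 (zero-based) in a system with M = 8 subcarriers.
x = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
X = unitary_dft(x)
localized = subcarrier_indices(N=4, M=8, user=1, mode="localized")
interleaved = subcarrier_indices(N=4, M=8, user=1, mode="interleaved")
lfdma_grid = map_to_grid(X, localized, M=8)
ifdma_grid = map_to_grid(X, interleaved, M=8)
print("LFDMA subcarriers:", localized)
print("IFDMA subcarriers:", interleaved)
```

In this toy setting, the localized user occupies one contiguous half of the band, whereas the interleaved user occupies every second subcarrier across the whole band.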
while F H M is an M × M IDFT matrix and H is an Hermitian operator. The total number of users in the SC-FDMA system equals bandwidth expansion factor Q = M/N where M is the total number of subcarriers. In order to complete an SC-FDMA block, the time domain signal is converted from parallel to serial arrangement and is cyclically extended by addition of cyclic prefix. A cyclic prefix (CP), which is typically removed at the receiving section before any major processing, is obtained by prefixing a symbol with its tail end to achieve mainly two purposes. If the CP length is the same or longer than the length of multipath channel delay spread, it helps prevent interblock interference (IBI) and also enable convolution between the channel impulse response and transmitted signal to be modeled as circular as opposed to normal linear convolution. This makes frequency domain equalization easy at the receiver. It is this second purpose that we have taken advantage of in adapting the FD blind algorithms to equalizing SC-FDMA symbols. The transmitted SC-FDMA block is where P is the length of the appended CP. In matrix format, both the transmitted and received signals can be written, respectively, as and We define T and G which are used in adding and removing CP, respectively, as In (8), I P×M is a matrix used in copying the last P row of The received signal undergoes the reverse of what it has undergone during the transmitting phase as shown in Fig. 1, hence the input to the equalizer is where H is an N × N diagonal matrix containing the channel frequency response for the qth user and V is the effective 1 × N noise vector. They are given as and Equation (10) results from the fact that addition and removal of CP turns channel matrix into a circulant matrix, and the resulting circulant matrix is diagonalized by DFT processing [24].
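The second purpose of the CP, namely turning the linear convolution with the channel into a circular one so that the DFT diagonalizes the channel, can be checked numerically. The following Python sketch is only an illustration under assumed values: the three-tap channel `h`, the block `s`, and the CP length `P = 2` are hypothetical, and noise is omitted.

```python
import cmath

def dft(x):
    """Unnormalized N-point DFT."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def add_cp(block, P):
    """Prefix the block with its last P samples."""
    return block[-P:] + block

def remove_cp(received, P):
    """Discard the first P received samples."""
    return received[P:]

def linear_channel(signal, h):
    """Linear convolution with the channel, truncated to the input length."""
    return [sum(h[l] * signal[n - l] for l in range(len(h)) if n - l >= 0)
            for n in range(len(signal))]

s = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j, 1 + 1j, 1 + 1j, -1 - 1j, 1 - 1j]
h = [1.0, 0.5, 0.25j]   # hypothetical channel with memory 2
P = 2                   # CP length equal to the channel memory

H = dft(h + [0] * (len(s) - len(h)))
S = dft(s)

# With a CP: each received bin equals the channel bin times the transmitted bin.
r_cp = remove_cp(linear_channel(add_cp(s, P), h), P)
dev_with_cp = max(abs(Rk - Hk * Sk) for Rk, Hk, Sk in zip(dft(r_cp), H, S))

# Without a CP: the same per-bin relation no longer holds.
r_plain = linear_channel(s, h)
dev_without_cp = max(abs(Rk - Hk * Sk) for Rk, Hk, Sk in zip(dft(r_plain), H, S))

print("deviation with CP:", dev_with_cp)
print("deviation without CP:", dev_without_cp)
```

The per-bin relation is what allows a one-tap-per-bin equalizer to be applied after CP removal.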

CMA
CMA is a blind algorithm that is also termed "Property Restoral" algorithm in that it restores the constant envelope of signal, that is lost due to multipath transmission and ISI, at the receiver utilizing only the signal statistics without employing any training or pilot symbols and as such improving the spectral efficiency [25]. CMA [26,27] basically reduces the error between the magnitude of equalizer output and a circle of constant radius. However, CMA is not able to correct any phase rotation introduced by channel characteristics since its cost function is independent of any phase information. The cost function for CMA is given as where z(n) is the output of the equalizer, E[ ·] denotes statistical expectation operator, and R is a constant defined as Denoting equalizer input vector as y(n) = y(n), T for an equalizer of length N, the equalizer output is expressed as In order to obtain the optimum coefficients of the equalizer, we use stochastic gradient to optimize the defined cost function with respect to the equalizer tap coefficients. Hence, we take stochastic gradient of (12) with respect to the tap weights vector to obtain where e(n) is the error factor and is given as and the tap weights vector are recursively updated as
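As a hedged illustration of the CMA recursion described above, the following Python sketch uses the textbook Godard form with dispersion constant R = E[|a|^4]/E[|a|^2] and error term e = z(R − |z|^2). The filtering convention (z = w^T y with update w ← w + μ e y*) is an assumption of this sketch and may differ from the convention of the original equations; the function names are hypothetical.

```python
import cmath

def dispersion_constant(constellation):
    """Godard constant R = E[|a|^4] / E[|a|^2] for equiprobable symbols."""
    num = sum(abs(a) ** 4 for a in constellation)
    den = sum(abs(a) ** 2 for a in constellation)
    return num / den

def cma_error(z, R):
    """CMA error term; it vanishes whenever |z|^2 equals R."""
    return z * (R - abs(z) ** 2)

def cma_update(w, y, mu, R):
    """One stochastic-gradient step, assuming z = w^T y and w <- w + mu*e*conj(y)."""
    z = sum(wi * yi for wi, yi in zip(w, y))
    e = cma_error(z, R)
    return [wi + mu * e * yi.conjugate() for wi, yi in zip(w, y)], z

qam4 = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
R = dispersion_constant(qam4)

# The error is blind to phase: a rotated constellation point still gives zero error.
rotated = (1 + 1j) * cmath.exp(0.3j)
e_rotated = cma_error(rotated, R)

# A single-tap equalizer with too small a gain is pushed towards unit gain.
w_new, z = cma_update([0.8 + 0j], [1 + 1j], mu=0.1, R=R)
print("R =", R, "| error at rotated point =", abs(e_rotated), "| new tap =", w_new[0])
```

The zero error at the rotated point is the phase insensitivity noted in the text: CMA restores the modulus but leaves any channel-induced rotation uncorrected.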

MMA
MMA addresses the phase ambiguity of CMA by limiting the ambiguity to within $\pm\pi/2$ [23]. The modified form of CMA was proposed in [28] to realize a cost function that is able to perform both blind equalization and carrier phase recovery simultaneously. The cost function for MMA is given as where and both subscripts R and I denote real and imaginary parts, respectively. However, including both real and imaginary parts of the equalizer output in the cost function and equalizing them separately sometimes results in diagonal solutions [29]. The error sample for MMA can be derived from (18) and is given as
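The difference between the CMA and MMA error terms under a phase rotation can be illustrated with a short Python sketch. It assumes the textbook MMA form, in which the real and imaginary parts are penalized separately with constants E[a_R^4]/E[a_R^2] and E[a_I^4]/E[a_I^2]; the function names are hypothetical and the sign convention of the error may differ from that of the original equations.

```python
import cmath

def mma_constants(constellation):
    """Separate dispersion constants for the real and imaginary parts."""
    r_real = (sum(a.real ** 4 for a in constellation)
              / sum(a.real ** 2 for a in constellation))
    r_imag = (sum(a.imag ** 4 for a in constellation)
              / sum(a.imag ** 2 for a in constellation))
    return r_real, r_imag

def mma_error(z, r_real, r_imag):
    """MMA error: real and imaginary parts are penalized independently."""
    return complex(z.real * (r_real - z.real ** 2),
                   z.imag * (r_imag - z.imag ** 2))

def cma_error(z, R):
    """CMA error for comparison: depends on the modulus only."""
    return z * (R - abs(z) ** 2)

qam4 = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
r_real, r_imag = mma_constants(qam4)

aligned = 1 + 1j
rotated = aligned * cmath.exp(1j * cmath.pi / 4)   # 45 degree phase error

e_mma_aligned = mma_error(aligned, r_real, r_imag)
e_mma_rotated = mma_error(rotated, r_real, r_imag)
e_cma_rotated = cma_error(rotated, 2.0)   # 2.0 is the CMA constant for this 4-QAM
print("MMA, aligned:", abs(e_mma_aligned))
print("MMA, rotated:", abs(e_mma_rotated))
print("CMA, rotated:", abs(e_cma_rotated))
```

For this 4-QAM alphabet both constants equal 1, in line with the values quoted later for the simulations. The rotated point produces no CMA error but a clearly non-zero MMA error, which is what drives the phase recovery.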

SCS-MMA
A new blind algorithm, proposed by Lin [15], was derived by using the dispersion of the real and imaginary parts of the equalizer output of the MMA algorithm as constraints and minimizing the squared Euclidean norm of the change in the tap weight vector to ensure that the error samples approach zero. The proposed technique was based on the principle of minimum disturbance. From Lin's algorithm, a new algorithm termed soft-constraint satisfaction multimodulus algorithm (SCS-MMA) was derived by relaxing the constraints defined by Lin using the principle of soft-constraint satisfaction (SCS) [14]. The cost function for SCS-MMA is given as where and R 2, The error term for SCS-MMA is derived from (21) and is given as SCS-MMA achieves equalization by forcing the real and imaginary parts of the equalizer output onto a four-point contour with distance $R_2$ from the origin.
TD blind algorithms operate on a symbol-by-symbol basis, processing one sample at a time. However, in order to take advantage of DFT processing, we need to formulate a block-by-block processing algorithm that operates on a block of symbols at a time. This greatly reduces computational cost and is the most appropriate mode of processing for SC-FDMA FD equalization.
In the next section, we take advantage of the CP embedded in the SC-FDMA block formation in adapting FD blind algorithms to its equalization. It should be noted that the frequency domain processing proposed in this work does not require overlap-save or overlap-add signal processing techniques: these techniques are needed to segment long streams of data for block processing and can be avoided with the inclusion of a CP [30]. Additionally, since multiplication in the frequency domain for discrete data is equivalent to circular convolution in the time domain, overlap-save and overlap-add techniques help in implementing linear convolution in the frequency domain for cases where the transmitted symbol stream is much longer than the channel impulse response. In the SC-FDMA case, however, the received data arrive in blocks, and these blocks, protected from IBI by the appended CP, are fed into the equalizer for FD equalization.

Frequency domain blind algorithms
It is essential to point out the fundamental difference between the frequency domain equalization considered in this work and the frequency domain equalization (FDE) that is common in the literature. The FDE considered in works such as [23,31] and [32] is a linear convolution implemented through the overlap-save method. In this work, the cyclic-prefixed single-carrier system (CP-SCS) results in periodic transmitted symbols, which make the channel perform circular convolution rather than linear convolution. The periodicity is then removed at the receiver before carrying out frequency domain equalization. This transmission format eliminates the need for the overlap-save method. Therefore, we simply feed the received symbol represented by (9) into the equalizer.
Frequency domain implementation differs from time domain implementation in that the former performs a block update of the tap weight vector while the latter performs a sample-by-sample update. This block update of the tap weight vector greatly reduces computational complexity and improves the convergence rate. The structure of the FD equalizer is shown in Fig. 2 for a single user. There are two major operations involved in the time domain equalization detailed in Section 3 above. They are the linear correlation in the update equation of (17) and the linear convolution embodied by the filtering operation in (14). In this section, we take advantage of DFT processing in implementing these two operations, which leads to circular correlation and circular convolution, respectively. As mentioned earlier, the special nature of SC-FDMA, which includes DFT processing and insertion of a CP at the transmitter, ensures that the received data is in blocks rather than long streams, which implies that we do not require overlap-save or overlap-add sectioning methods. The DFT of equalizer input and tap weight vector for kth received block will, respectively, yield and where and Hence, the kth block of the equalizer output can be implemented with IDFT as where and is the element-wise multiplication while matrix D, a 2N × 2N matrix, is defined as and is used to implement conversion between DFT of a vector and that of its complex conjugate. Equation (28) follows from the complex conjugation property of DFT. Using equalizer output, the error factor can be computed and its DFT taken as where and An important observation is noted in (33), where the error factor is computed in the time domain. As stated earlier, only the convolution and correlation operations, which correspond to computation of the equalizer output and the weight update, respectively, are carried out in the frequency domain.
This is because the error functions of blind equalizers are non-linear, and their frequency domain implementation is not equivalent to their time domain implementation. For non-blind equalizers like LMS, whose error term is linear in the equalizer output, it is straightforward to extend the implementation to the frequency domain. The weight update recursion of (17) is then implemented with DFT as Both (28) and (34) completely describe the equalizer operation in frequency domain.
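The division of labour described above (filtering and weight update in the frequency domain, error computation in the time domain) can be sketched in Python as follows. This is a simplified illustration, not the paper's implementation: it uses N-point transforms instead of the 2N × 2N formulation with the matrix D, assumes the convention z = w ⊛ y (circular convolution) with update w ← w + μ·corr(e, y), and uses an MMA-type error as a stand-in for the blind error function. All names and numerical values are hypothetical.

```python
import cmath

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def equalize_block(W, y):
    """Filtering per bin: z = IDFT(W * DFT(y)), i.e. circular convolution."""
    Y = dft(y)
    return idft([Wk * Yk for Wk, Yk in zip(W, Y)]), Y

def blind_error(z, R=1.0):
    """Non-linear (MMA-type) error, evaluated sample by sample in the time domain."""
    return [complex(v.real * (R - v.real ** 2), v.imag * (R - v.imag ** 2))
            for v in z]

def update_weights(W, Y, e, mu):
    """Weight update per bin: W <- W + mu * DFT(e) * conj(Y) (circular correlation)."""
    E = dft(e)
    return [Wk + mu * Ek * Yk.conjugate() for Wk, Ek, Yk in zip(W, E, Y)]

y = [0.9 + 1.1j, -1.2 + 0.8j, 1.0 - 0.9j, -0.8 - 1.1j]   # one received block
w = [1, 0, 0, 0]                                         # single-spike initialization
mu = 1e-2

W = dft(w)
z, Y = equalize_block(W, y)
e = blind_error(z)
W_new = update_weights(W, Y, e, mu)

# Time-domain reference: w(l) += mu * sum_n e(n) * conj(y((n - l) mod N)).
N = len(y)
w_ref = [w[l] + mu * sum(e[n] * y[(n - l) % N].conjugate() for n in range(N))
         for l in range(N)]
max_dev = max(abs(a - b) for a, b in zip(idft(W_new), w_ref))
print("largest deviation from the time-domain update:", max_dev)
```

The per-bin product with the conjugated input spectrum reproduces the time-domain circular correlation, so one block update costs a few transforms rather than N separate correlations.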
The error functions of equalizers are derived from their cost functions, and these cost functions are different for different equalizers. Table 1 gives a synopsis of blind algorithm cost functions and their respective error functions. Following the preceding discussion, both CMA and MMA can easily be fitted into the developed framework. We find that the convergence of SCS-MMA can be greatly improved, following the treatment in [23], by considering the square root of the spectral power as a normalization factor, and we subsequently refer to the improved algorithm as normalized frequency domain SCS-MMA (NFDSCS-MMA). Therefore, each frequency bin in the weight update equation of (34) is normalized by the square root of the spectral power of its respective input data. Both the power recursion and the resulting normalized weight update equation are given by the following: and where λ is a forgetting factor and is an element-wise division operator. A careful re-ordering of the normalized weight update equation gives further insight into its effectiveness in improving the equalizer convergence. The normalization is tantamount to using a variable step size in each frequency bin, which amounts to power control on each bin; such a technique is especially useful in applications where the input level is uncertain or varies widely across the band, as noted in [18]. The procedure outlined in this section is repeated to realize normalized FDMMA (NFDMMA) and normalized FDCMA (NFDCMA) from the equations given in Table 1, and the details of the algorithm are given in Fig. 3.
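The effect of the per-bin normalization can be illustrated with a small Python sketch. The exponentially weighted power recursion P ← λP + (1 − λ)|Y|^2 used here is the commonly used form and is an assumption of this sketch, as are the function names and the small constant `eps` that guards against division by zero; the exact recursion of the original equations is not reproduced.

```python
def update_power(P, Y, lam):
    """Assumed recursion per bin: P <- lam * P + (1 - lam) * |Y|^2."""
    return [lam * Pk + (1 - lam) * abs(Yk) ** 2 for Pk, Yk in zip(P, Y)]

def normalized_update(W, Y, E, P, mu, eps=1e-12):
    """Per-bin update divided by the square root of the spectral power."""
    return [Wk + mu * Ek * Yk.conjugate() / (Pk + eps) ** 0.5
            for Wk, Ek, Yk, Pk in zip(W, E, Y, P)]

def plain_update(W, Y, E, mu):
    """Unnormalized per-bin update, for comparison."""
    return [Wk + mu * Ek * Yk.conjugate() for Wk, Ek, Yk in zip(W, E, Y)]

# Two bins whose input levels differ by a factor of 100, with the same error.
Y = [10 + 0j, 0.1 + 0j]
E = [1 + 0j, 1 + 0j]
W = [1 + 0j, 1 + 0j]
mu, lam = 4e-3, 0.55

P = update_power([100.0, 0.01], Y, lam)   # power estimate already at steady state
W_norm = normalized_update(W, Y, E, P, mu)
W_plain = plain_update(W, Y, E, mu)

norm_steps = [abs(a - b) for a, b in zip(W_norm, W)]
plain_steps = [abs(a - b) for a, b in zip(W_plain, W)]
print("normalized steps:  ", norm_steps)
print("unnormalized steps:", plain_steps)
```

With normalization, both bins move by the same amount despite the 100-fold difference in input level, which is the variable-step-size interpretation given in the text.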

Results and discussion
The algorithms proposed above were investigated by means of computer simulations in the MATLAB environment. Specifically, we have evaluated the performance of both the frequency domain soft-constraint multimodulus algorithm (FDSCS-MMA) and the improved FDSCS-MMA and compared their performance with well-known blind algorithms. A total of four users transmit their data simultaneously. The MSE convergence curve in decibels was obtained as an ensemble average and is plotted as a function of the number of iterations, where each iteration represents an SC-FDMA symbol consisting of all the users' signals for that transmission time. The filter taps are of the order of N with center-spike initialization. The modulation scheme employed for SC-FDMA transmission is 4-QAM. The localized carrier transmission mode is used in the LTE uplink since it offers much better performance with the arrangement of the pulse-shaping filter. Simulation results are averaged over 100 Monte Carlo iterations and are obtained for LFDMA, since DFDMA is no longer supported in the 3GPP LTE standards, though a scenario is shown for comparison of both allocation schemes [8,33]. The values of $R_{2,R}$, $R_{2,I}$, and λ are 1, 1, and 0.55, respectively. The step sizes for the equalizers are $4 \times 10^{-3}$, $3 \times 10^{-3}$, $3 \times 10^{-4}$, and $1 \times 10^{-4}$ for NFDSCS-MMA, FDSCS-MMA, NFDMMA, and NFDCMA, respectively. The channel considered is frequency selective with six paths, and each path fades independently according to the Rayleigh distribution. A high speed of 360 km/h is used to account for time variation in the channel [34,35]. The additive white Gaussian noise has been chosen such that the signal-to-noise ratio (SNR) at the input of the equalizer is 20 dB. An SNR of 10 dB is also considered for comparison of low- and high-SNR performance. The simulation parameters described above are used unless stated otherwise. Figure 4 shows the performance of the localized (LFDMA) and interleaved (IFDMA) allocation schemes.
It is shown that IFDMA slightly outperforms LFDMA in convergence speed, but LFDMA has been selected as the uplink transmission scheme due to its low PAPR compared with OFDMA in general and its high sum-rate capacity compared with IFDMA in particular [33]. Figure 5 shows the performance of NFDSCS-MMA and FDSCS-MMA. The two algorithms achieve the same residual MSE but have different convergence times. FDSCS-MMA takes longer to converge, about 3000 symbols. This slow adaptation is a setback in broadband wireless communication systems, which typically require high-speed transmission. The convergence rate is greatly improved by the normalization factor, leading to NFDSCS-MMA, which converges at about 500 symbols. This corresponds to a saving of almost 83 % in symbols ((3000 − 500)/3000 ≈ 0.83) over the algorithm without normalization for the same residual MSE. It can be seen from Fig. 5 that the effect of appropriate normalization is to provide faster convergence, given that both algorithms achieve the same residual MSE. Based on the preceding discussion, only the normalized versions of the blind algorithms proposed in this work are considered in the remaining discussion, because of their faster convergence compared to the unnormalized versions. Figures 6 and 7 show the MSE convergence comparison between the proposed algorithms for SNRs of 10 and 20 dB, respectively. NFDSCS-MMA achieves the fastest MSE convergence rate and lower residual error at both low and high SNR. NFDMMA is slightly better than NFDCMA at low SNR, while both achieve similar MSE performance at the high SNR of 20 dB.
Figure 8 shows the MSE convergence comparison between the proposed algorithms for long channel impulse responses using model C in [36], corresponding to a typical large outdoor environment with large delay spread. NFDSCS-MMA has the best performance, reflecting the robustness of the proposed algorithm, while NFDMMA outperforms NFDCMA. performance than both NFDMMA and NFDCMA. The algorithms achieve the same residual ISI, but NFDSCS-MMA converges fastest in both the low- and high-SNR scenarios and, as a result, gives better performance. Figures 11 and 12 show the phase recovery capability of the proposed algorithms for 16-QAM and 64-QAM, respectively. All the algorithms are able to recover the 16-QAM symbol constellation, but only NFDMMA and NFDSCS-MMA are able to recover the 64-QAM symbol constellation, and the NFDSCS-MMA constellation is better than that of NFDMMA. It is also seen that NFDCMA is not able to correct the phase rotation introduced by the channel characteristics, whereas both NFDMMA and NFDSCS-MMA do this perfectly. This is because both equalizers achieve equalization by forcing both the real and imaginary parts of the equalizer output onto four-point contours. Figure 13 shows the BER performance of FDSCS-MMA and its normalized version compared to the optimum equalizers, namely the linear minimum mean square error (MMSE) and zero-forcing equalizers. It should be noted that both the linear MMSE and zero-forcing equalizers are non-blind, channel-estimation-based equalizers, meaning that pilot symbols are periodically transmitted to accurately estimate the channel at the receiver. The expression for the output of the zero-forcing equalizer for the kth received block is given as   while that of MMSE is where both $H_k$ and $N_k$ represent the channel response and noise component for the kth received block. We have assumed perfect knowledge of the channel in our simulation of the optimum equalizers.
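For reference, the two non-blind benchmarks can be sketched in Python using their textbook one-tap-per-bin forms: zero forcing divides each received bin by the channel response, while linear MMSE weights it by H*/(|H|^2 + σ_n^2/σ_x^2). These forms, the function names, and the toy channel values are assumptions of the sketch and are not the expressions of the original equations; perfect channel knowledge is assumed, as in the simulations.

```python
def zf_equalize(Y, H):
    """Zero forcing: invert the channel in every frequency bin."""
    return [Yk / Hk for Yk, Hk in zip(Y, H)]

def mmse_equalize(Y, H, noise_var, signal_var=1.0):
    """Linear MMSE: regularized inversion that limits noise enhancement."""
    return [Hk.conjugate() * Yk / (abs(Hk) ** 2 + noise_var / signal_var)
            for Yk, Hk in zip(Y, H)]

X = [1 + 1j, -1 + 1j]             # transmitted frequency-domain samples
H = [0.5 + 0.5j, 0.01 + 0j]       # second bin is in a deep fade
Y = [Hk * Xk for Hk, Xk in zip(H, X)]   # noise-free received bins

x_zf = zf_equalize(Y, H)
x_mmse_clean = mmse_equalize(Y, H, noise_var=0.0)
x_mmse_noisy = mmse_equalize(Y, H, noise_var=0.01)

zf_gain_faded = abs(1 / H[1])                                  # gain applied by ZF
mmse_gain_faded = abs(H[1].conjugate() / (abs(H[1]) ** 2 + 0.01))
print("ZF gain in faded bin:  ", zf_gain_faded)
print("MMSE gain in faded bin:", mmse_gain_faded)
```

In the faded bin, zero forcing applies a gain of about 100 and would amplify any noise accordingly, whereas the MMSE weight stays close to 1; with zero noise variance the two equalizers coincide.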
In order to assess the BER performance of NFDSCS-MMA, knowledge of the first two received symbols has also been assumed, since SCS-MMA only minimizes the dispersion between the real and imaginary parts of the received signal and four-point contours of distance $R_2$. This assumption is required to correct the received signal phase [23], as "blind" in blind equalizers is with respect to the phase; hence, they are said to be blind to the "phase". Figure 13 shows that NFDSCS-MMA and FDSCS-MMA achieve similar BER performance, which is slightly worse than that of linear MMSE. In situations where blind equalizers are used to open the eye of the signal constellation, a probability of symbol error of $10^{-2}$ is considered acceptable [29]. From Fig. 13, it is seen that to achieve this acceptable performance, 8 dB is required for NFDSCS-MMA compared with 7 dB for linear MMSE, which is a small tradeoff compared to the 14 % improvement in throughput.

Conclusions
In this paper, we have implemented a novel frequency domain soft-constraint multimodulus algorithm for single-carrier FDMA (SC-FDMA). It is shown that the proposed algorithm outperforms the popular blind algorithm CMA and its modified version, MMA, in both residual MSE and convergence rate. The phase recovery capability of the proposed algorithm is also demonstrated, with acceptable BER performance. This suggests that SC-FDMA can be effectively equalized in broadband systems using the proposed algorithm, with the resultant lower MSE, faster convergence, and improved spectral efficiency.