doi:10.1155/2010/974652 Research Article Block Transmissions over Doubly Selective Channels: Iterative Channel Estimation and Turbo Equalization

Modern wireless communication systems require high transmission rates, giving rise to frequency selectivity due to multipath propagation. In addition, high-mobility terminals and scatterers induce Doppler shifts that introduce time selectivity. Therefore, advanced techniques are needed to accurately model the time- and frequency-selective (i.e., doubly selective) channels and to counteract the related performance degradation. In this paper, we develop new receivers for orthogonal frequency-division multiplexing (OFDM) systems and single-carrier (SC) systems in doubly selective channels by embedding the channel estimation task within low-complexity block turbo equalizers. Linear minimum mean-squared error (MMSE) pilot-assisted channel estimators are presented, and the soft data estimates from the turbo equalizers are used to improve the quality of the channel estimates.


Introduction
Broadband wireless communication systems require high transmission rates, giving rise to frequency selectivity caused by multipath propagation, and consequently to intersymbol interference (ISI). In addition, recent wireless communication standards, such as WiMAX and Long-Term Evolution (LTE), also need to support high mobile speeds, leading to high-mobility terminals and scatterers that introduce Doppler shifts and time selectivity, that is, intercarrier interference (ICI). Due to the concomitant presence of ISI and ICI, specialized techniques are necessary to counteract the related performance degradation. However, with a properly designed transceiver, time- and frequency-selective (i.e., doubly selective) channels can even provide multiplicative delay-Doppler diversity gains [1,2].
LTE is a major 3GPP step in next generation wireless networks [3]. The LTE physical layer relies on a multiple-access scheme based on orthogonal frequency-division multiplexing (OFDM) in the downlink, and on single-carrier frequency-division multiple access (SC-FDMA) in the uplink [3]. In both cases, the transmission scheme is blockwise, and a cyclic prefix (CP) is included in each data block in order to eliminate the ISI between consecutive data blocks. OFDM and single-carrier (SC) block transmissions share some similarities: since an SC system can be viewed as a discrete Fourier transform (DFT) precoded OFDM system [4], performance and complexity are comparable, but part of the complexity (i.e., an inverse DFT) is moved from the transmitter to the receiver [4]. There are also some important differences: with respect to OFDM, SC has a lower peak-to-average power ratio, which allows power-efficient terminals and makes it suitable for the uplink [5]. Nevertheless, both SC and OFDM systems suffer from doubly selective channels, and call for appropriate ICI mitigation methods.
A possible way to counteract a doubly selective channel is by means of iterative equalizers. The iterative approach, inspired by the turbo equalization principle [6,7], exchanges soft information between the channel equalizer and the decoder, in an iterative fashion, and greatly improves the system performance. In the last fifteen years, many turbo equalizers have been proposed for time-invariant frequency-selective channels (see [6-11], and the references therein). More recently, the turbo approach has been proposed also for doubly selective channel equalization, which is more challenging due to the time variation of the channel. For OFDM systems with doubly selective channels, low-complexity minimum mean-squared error (MMSE) turbo equalizers have been proposed in [12,13]. The turbo equalizers [12,13], which are based on frequency-domain processing, estimate the data either serialwise, that is, each subcarrier is sequentially equalized [12], or blockwise, that is, all subcarriers are jointly equalized [13]. For SC transmissions over time-invariant frequency-selective channels, time-domain turbo equalization (see [6,7]) is traditionally more popular than frequency-domain turbo equalization [10]. However, recently, frequency-domain equalization has gained renewed interest, due to its reduced complexity for channels with significant delay spread [5]. For SC systems over doubly selective channels, a low-complexity iterative equalizer has been proposed in [14], which can be regarded as the time-domain counterpart of the iterative frequency-domain equalizer [15]. However, time-domain iterative equalizers are not suitable for channels with significant delay spread, since their complexity is quadratic in the channel length [14].
Besides, maximum likelihood and maximum a posteriori sequence estimators for SC and OFDM systems have been proposed in [16], which models the doubly selective channel using a basis expansion model (BEM).
In this paper, as a first contribution, we apply the block philosophy to design a low-complexity MMSE turbo equalizer for SC systems in doubly selective channels. To the best of our knowledge, all the turbo equalizers proposed so far for SC systems over doubly selective channels employ a serialwise data processing, that is, use a sliding window either in the time domain [14] or in the frequency domain [15,17]. However, since the presence of the CP makes the transmission scheme blockwise, block equalization becomes a valid alternative. We design our block turbo equalizer for SC systems in the frequency domain, in the same spirit as the turbo equalizers designed for OFDM in [13]. An interesting feature of the proposed block equalizer is its reduced computational complexity, which scales only linearly with the block length. As a result, for doubly selective channels with significant multipath delay spread, our frequency-domain approach is less complex than time-domain equalizers like [14]. To keep the complexity low, some ad-hoc approximations are required, so that the proposed block turbo equalization algorithm for SC turns out to be different from that for OFDM [13]. In this paper, a performance comparison between the proposed algorithm and [13] is also given.
The ICI caused by Doppler spreading also makes the channel estimation problem more difficult. Pilot designs and pilot-assisted channel estimation algorithms have been developed for SC over time-varying flat-fading channels [18], for SC over doubly selective channels [19,20], and for OFDM over doubly selective channels [21,22]. All these papers, which employ a BEM for the channel, share the design principle that pilots and data are placed in such a way that they remain orthogonal after transmission over the fading channel. Indeed, this criterion eliminates the data-to-pilot interference and hence it simplifies the channel estimation task. (The same criterion also eliminates the pilot-to-data interference, whose cancellation is therefore not necessary.) However, in doubly selective channels, orthogonal designs have two drawbacks. First, only approximately orthogonal designs are really possible, since a doubly selective channel cannot be perfectly diagonalized [23]. Second, a rate loss is introduced by the presence of zero symbols that are necessary to keep the data and the pilots almost orthogonal. On the other hand, nonorthogonal designs are also possible, such as the superimposed training approach developed in [24]. Actually, turbo-inspired iterative channel estimators can handle the data-to-pilot interference by means of reliability-based soft cancellation. In other words, the soft data estimates can be used to improve the quality of channel estimation, as shown by the adaptive iterative channel estimators [25,26].
As a second contribution of this paper, we present iterative (turbo-like) pilot-assisted channel estimators for both OFDM and SC block transmissions. Differently from the turbo-based channel estimators already proposed for SC transmissions over doubly selective channels [25,26], the proposed turbo-like channel estimators are nonadaptive and hence more suitable for block transmissions. For both OFDM and SC cases, the proposed iterative channel estimators firstly estimate the time-domain channel exploiting the BEM, and then transform the time-domain channel into the frequency domain for equalization purposes. This strategy is similar to that used for OFDM doubly selective channel estimation in [21,22]. However, differently from the channel estimators of [21,22], the proposed channel estimators exploit the reliability of the estimated data and can thus also work in the presence of nonorthogonal pilot designs.
To keep the channel estimation processing low in complexity, we assume that the pilot symbols are located in the same domain where the data symbols are placed, that is, we assume frequency-domain pilots for OFDM systems, and time-domain pilots for SC systems. Although this choice is mainly dictated by computational complexity benefits, it is consistent with almost-orthogonal pilot allocation strategies for doubly selective channels, which indeed suggest time-domain pilots for SC systems [19,20], and frequency-domain pilots for OFDM systems [20].
The rest of this paper is organized as follows. In Section 2, we introduce the system model. Section 3 presents the proposed block turbo MMSE equalizer for SC systems. Section 4 deals with the design of iterative MMSE pilot-assisted channel estimators, for both OFDM and SC systems. In Section 5, we evaluate and compare the performance of the proposed equalizer and of both channel estimators, by means of simulated results. Section 6 concludes the paper.
Notation. We use upper (lower) boldface letters to denote matrices (column vectors). $(\cdot)^T$, $(\cdot)^H$, and $(\cdot)^\dagger$ represent transpose, complex conjugate transpose (Hermitian), and pseudoinverse, respectively. $[\mathbf{A}]_{m,n}$ indicates the $(m+1, n+1)$th entry of the matrix $\mathbf{A}$. We use the symbols $\bullet$ and $\otimes$ to denote the Hadamard (element-wise) product and the Kronecker product between matrices, respectively. $\mathrm{diag}(\mathbf{a})$ is a diagonal matrix with the vector $\mathbf{a}$ on the diagonal. $E(\cdot)$ stands for the statistical expectation. The covariance matrix between $\mathbf{x}$ and $\mathbf{y}$ is defined as $\mathrm{Cov}(\mathbf{x}, \mathbf{y}) = E(\mathbf{x}\mathbf{y}^H) - E(\mathbf{x})E(\mathbf{y}^H)$. Finally, $\mathbf{0}_{M \times N}$ and $\mathbf{I}_N$ denote the $M \times N$ all-zero matrix and the $N \times N$ identity matrix, respectively.

System Model
We consider a single-user communication system with blockwise transmission, and a channel that is both frequency and time selective. The structure of both transmitter and receiver is shown in Figure 1. At the transmitter, the information bits are encoded with error correction coding, and the coded bits are interleaved and mapped into N d complex symbols, represented by the N d × 1 vector s d , and the data symbols are assumed to be uncorrelated. We define s p as the N p × 1 vector that stands for the pilot symbols, which are multiplexed with s d to form a block of N = N d + N p transmitted symbols s. For simplicity, we consider unit-energy quaternary phase-shift keying (QPSK) with the symbol alphabet shown in Table 1. However, the equalizers and channel estimators proposed herein can be easily extended to other constellations, like in [8].
As far as the time dispersion of the channel is concerned, we adopt the standard assumption that the maximum channel order is equal to the CP length, both denoted by L, where L < N. This way, there is no interference between successive blocks, and the equalizer can be designed separately for each block. As a consequence, we can omit the block index from our notation.
At the receiver, after removing the CP, the $N \times 1$ received vector $\mathbf{y}_t$ can be expressed as

$$\mathbf{y}_t = \mathbf{H}_t \mathbf{P} \mathbf{s} + \mathbf{n}_t, \tag{1}$$

where $\mathbf{H}_t$ is the $N \times N$ time-domain channel matrix, $\mathbf{P}$ denotes the $N \times N$ precoder matrix, $\mathbf{s}$ represents the $N \times 1$ symbol vector consisting of the multiplexed pilot and data symbols, and $\mathbf{n}_t$ stands for the $N \times 1$ noise vector, which is assumed to be uncorrelated with the data symbols. The precoder is set to $\mathbf{P} = \mathbf{I}_N$ for SC systems, and $\mathbf{P} = \mathbf{F}^H$ for OFDM systems, where $\mathbf{F}$ denotes the $N \times N$ unitary DFT matrix. For simplicity, we assume that $\mathbf{n}_t$ is a circularly symmetric complex Gaussian noise vector, with zero mean and covariance matrix $\mathbf{R}_{n_t} = E(\mathbf{n}_t \mathbf{n}_t^H) = \sigma_n^2 \mathbf{I}_N$. At the receiver, a length-$N$ time-domain window can be applied after CP removal and before the DFT operation. In this case, the output vector after the DFT operation can be expressed as

$$\mathbf{y}_f = \mathbf{H}_f \mathbf{F} \mathbf{P} \mathbf{s} + \mathbf{n}_f, \tag{2}$$

where $\mathbf{y}_f = \mathbf{F}\mathbf{W}\mathbf{y}_t$, $\mathbf{n}_f = \mathbf{F}\mathbf{W}\mathbf{n}_t$, $\mathbf{H}_f = \mathbf{F}\mathbf{W}\mathbf{H}_t\mathbf{F}^H$, and $\mathbf{W} = \mathrm{diag}(\mathbf{w})$, with $\mathbf{w}$ the $N \times 1$ vector denoting the time-domain receiver window. Note that classical systems do not include windowing, that is, $\mathbf{W} = \mathbf{I}_N$.

When the channel is time varying, $\mathbf{H}_t$ is no longer circulant, and the $N \times N$ frequency-domain channel matrix $\mathbf{H}_f$ becomes a nondiagonal matrix, giving rise to ICI, where the ICI coupling is summarized by the nonzero off-diagonal elements of $\mathbf{H}_f$. However, with a proper window design, $\mathbf{H}_f$ is cyclically banded, with the most significant elements around the main diagonal, and on the upper-right and lower-left corners [12]. In this paper, we employ the minimum band approximation error windowing developed in [27], where the window $\mathbf{w}$ is obtained as a sum of complex exponentials. This choice permits the use of low-complexity equalization algorithms specially tailored to banded and cyclically banded matrices, as explained in [12,28]. Observe that the receiver windowing in [27] only requires some statistical knowledge about the channel time variation, and this knowledge does not need to be very accurate.
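The link between time variation and ICI can be checked numerically. The following is a minimal sketch, not the authors' simulation setup: it assumes no receiver window ($\mathbf{W} = \mathbf{I}_N$), a tiny block length, an arbitrary three-tap channel, and a hypothetical time variation in which all taps share a single Doppler shift. All function names are illustrative.

```python
import cmath

def dft_matrix(n):
    """Unitary n x n DFT matrix F."""
    scale = n ** -0.5
    return [[scale * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n)]
            for k in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def hermitian(a):
    """Complex conjugate transpose."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def channel_matrix(taps):
    """Build H_t from taps[n][l], placing tap l of row n in column (n - l) mod N."""
    n = len(taps)
    h = [[0j] * n for _ in range(n)]
    for row in range(n):
        for lag, tap in enumerate(taps[row]):
            h[row][(row - lag) % n] += tap
    return h

def offdiag_fraction(taps):
    """Fraction of the energy of H_f = F H_t F^H located off the main diagonal."""
    n = len(taps)
    f = dft_matrix(n)
    hf = matmul(matmul(f, channel_matrix(taps)), hermitian(f))
    total = sum(abs(x) ** 2 for row in hf for x in row)
    diag = sum(abs(hf[k][k]) ** 2 for k in range(n))
    return (total - diag) / total

N = 8                       # toy block length (assumption)
base = [1.0, 0.5, 0.25]     # arbitrary channel taps (assumption)
doppler = 0.15              # Doppler shift normalized to the subcarrier spacing

static_taps = [base for _ in range(N)]
varying_taps = [[b * cmath.exp(2j * cmath.pi * doppler * n / N) for b in base]
                for n in range(N)]

ici_static = offdiag_fraction(static_taps)
ici_varying = offdiag_fraction(varying_taps)
print(ici_static, ici_varying)
```

In the time-invariant case $\mathbf{H}_t$ is circulant and the off-diagonal fraction vanishes up to rounding, whereas the time-varying case leaves a nonzero fraction of the energy off the diagonal, which is the ICI discussed above.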
To simplify the equalization procedure, the matrix $\mathbf{H}_f$ is further approximated by its cyclically banded version

$$\mathbf{H} = \mathbf{H}_f \bullet \mathbf{\Theta},$$

where $\mathbf{\Theta}$ is the $N \times N$ cyclically banded circulant matrix, which has ones on the main diagonal, on the $B_c$ super- and $B_c$ subdiagonals, and on the upper-right and lower-left $B_c$-size corners, while the remaining entries are zeros.
The matrix bandwidth parameter B c allows for a trade-off between equalization complexity and performance, and it can be chosen according to some rules of thumb [12]. When windowing is included, B c is usually much smaller than the number of subcarriers N. It can be observed that the transmitted data block s represents a time-domain signal in SC systems, while it represents a frequency-domain signal in OFDM systems. This clearly explains why SC systems are more prone to multipath effects, which mix the data due to the associated ISI, while OFDM systems suffer from Doppler effects, which mix the data due to the associated ICI. Our equalizer will be designed in the frequency domain, with the goal of mitigating the interference caused by the offdiagonal elements of H.
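The cyclically banded mask described above can be sketched in a few lines. This is an illustrative construction only; the function names `cyclic_band_mask` and `apply_mask` are hypothetical and not taken from the paper.

```python
def cyclic_band_mask(n, bc):
    """N x N 0/1 mask: ones on the main diagonal, on bc super- and subdiagonals,
    and in the upper-right and lower-left corners of size bc."""
    mask = [[0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            d = abs(r - c)
            # Circular distance between row and column index.
            if min(d, n - d) <= bc:
                mask[r][c] = 1
    return mask

def apply_mask(matrix, mask):
    """Hadamard (element-wise) product of a matrix with the mask."""
    return [[x * m for x, m in zip(row, mrow)] for row, mrow in zip(matrix, mask)]

theta = cyclic_band_mask(8, 2)
for row in theta:
    print(row)
```

For $N > 2B_c$, every row of the mask keeps exactly $2B_c + 1$ entries, which is what makes the cost of the banded solvers grow only linearly with $N$.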

Low-Complexity Block Turbo Equalization
In order to derive frequency-domain block turbo equalizers for doubly selective channels, let us define $s_i$ as the $i$th QPSK symbol of $\mathbf{s}$, and $(s_{i,1}, s_{i,2})$ as the related bits.

Figure 1: System model.

For the pilot symbols, the mean is set to the pilot symbol value, while the variance is zero, for all the iterations. After each iteration of the equalizer, we update the means and the variances using the soft estimated symbols. Specifically, we need to calculate the extrinsic log-likelihood ratio (LLR) $L_e(s_{i,j})$ of each bit [7,13]. To perform this calculation, we should derive the probability density function (PDF) $p(\hat{s}_i \mid s_i = s)$ of the estimated symbol $\hat{s}_i$, which can be approximated as Gaussian [7,8]. As shown in Figure 1, the extrinsic LLR $L_e(s_{i,j})$ is passed to the decoder to generate a new extrinsic LLR $L_e^d(s_{i,j})$, which is added to the a priori LLR to form the new a posteriori LLR $L_{\mathrm{new}}(s_{i,j})$. The new a posteriori LLR allows us to update the means and the variances of the estimated symbols as in [7,13]. The a posteriori LLR $L_{\mathrm{new}}(s_{i,j})$ of the current iteration then becomes the a priori LLR $L(s_{i,j})$ used in the next iteration.
In the first iteration, no prior information is available, and therefore the a priori LLR is zero. The whole procedure described above can then be repeated, depending on the chosen number of iterations.
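The update of the symbol means and variances from the bit LLRs can be illustrated as follows. This is a sketch under stated assumptions rather than the paper's exact expressions: the bit-to-symbol mapping and the LLR sign convention below are hypothetical (the alphabet of Table 1 is not reproduced here), and the function name `soft_qpsk_symbol` is illustrative.

```python
import math

def soft_qpsk_symbol(llr_bit1, llr_bit2):
    """Return (mean, variance) of a unit-energy QPSK symbol from two bit LLRs.

    Assumed conventions (not taken from the paper):
    LLR = ln(P(bit = 0) / P(bit = 1)); bit 0 maps to +1/sqrt(2) and bit 1 to
    -1/sqrt(2); the first bit drives the real part, the second the imaginary part.
    """
    # Expected value of a +/-1 variable with the given LLR is tanh(LLR / 2).
    re = math.tanh(llr_bit1 / 2.0)
    im = math.tanh(llr_bit2 / 2.0)
    mean = complex(re, im) / math.sqrt(2.0)
    # Unit-energy symbols: variance = E|s|^2 - |E s|^2 = 1 - |mean|^2.
    variance = max(0.0, 1.0 - abs(mean) ** 2)
    return mean, variance

# First iteration: no prior information, so both LLRs are zero.
first_mean, first_var = soft_qpsk_symbol(0.0, 0.0)
# Very reliable bits: the soft symbol approaches a constellation point.
sure_mean, sure_var = soft_qpsk_symbol(40.0, -40.0)
print(first_mean, first_var, sure_mean, sure_var)
```

With zero LLRs the mean is zero and the variance is one, which matches the absence of prior information in the first iteration; as the LLR magnitudes grow, the variance shrinks towards zero, as for a pilot symbol.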
In the next subsection, we present a block turbo equalizer for SC systems. The proposed equalizer is derived using a similar approach as in [13], which develops three block turbo equalizers for OFDM systems.

Block Turbo Equalization for SC Systems.
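In SC systems, the precoder is absent and therefore it is set to $\mathbf{P} = \mathbf{I}_N$. In this case, (2) can be rewritten as

$$\mathbf{y}_f = \mathbf{H}_f \mathbf{s}_f + \mathbf{n}_f,$$

where $\mathbf{s}_f = \mathbf{F}\mathbf{s}$. Similarly to our previous notation, we denote the mean vector and the covariance matrix of $\mathbf{s}_f$ by $\mathbf{m}_f$ and $\mathbf{V}_f$, respectively. Since $\mathbf{s}_f$ is the DFT of $\mathbf{s}$, $\mathbf{V}_f$ is in general a full matrix, even when the symbols in $\mathbf{s}$ are uncorrelated. However, as it will be explained later, dealing with a diagonal $\mathbf{V}_f$ is crucial for complexity reasons. Therefore, to save complexity, we replace $\mathbf{V}_f$ with its approximated version $\mathbf{V}_f \bullet \mathbf{I}_N$ obtained by setting its off-diagonal elements to zero. Since the diagonal elements of $\mathbf{V}_f$ are all equal to the average of the symbol variances, this approximated version is a scaled identity matrix. A similar approximation is sometimes used also in time-domain equalizers [7,8].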
By using $m_{f,i}$ and $v_{f,i}$ as prior information, a frequency-domain linear MMSE equalizer can be obtained using the approach used in [13] (Equalizer III) for OFDM systems. In SC systems, this approach leads to the estimate in (6), where $\mathbf{h}_i$ is the $i$th column of $\mathbf{H}$. It should be observed that, when $\mathbf{V}_f$ is approximated as diagonal, the matrix $\mathbf{A}$ is cyclically banded, and hence the computations in (6) can be performed using special algorithms designed for solving cyclically banded linear systems. In this work, we have used a cyclic band $\mathbf{L}\mathbf{D}\mathbf{L}^H$ factorization obtained by a convenient modification of [28], in the same spirit as the fast Cholesky factorization of [29]. An alternative LU factorization algorithm could be derived using the divide-and-conquer method of [30]. Using the algorithms specifically tailored to cyclically banded matrices, the computational complexity per data block reduces to $O(B_c^2 N)$, which is linear in the block size $N$. On the contrary, the complexity of a time-domain equalizer would be $O(L^2 N)$. Since the Doppler support $B_c$ is usually much lower than the maximum channel order $L$, our frequency-domain MMSE equalizer is computationally cheaper than the corresponding time-domain MMSE equalizer.
On the other hand, without any approximation on $\mathbf{V}_f$, $\mathbf{A}$ would not be cyclically banded, and therefore the complexity order of MMSE equalizers would be $O(N^3)$. Since the block size $N$ is by far greater than the Doppler support $B_c$, the diagonal approximation is necessary for low-complexity MMSE equalizers. However, when moderate complexity is affordable, other approximations are possible. For instance, if $\mathbf{V}_f$ is approximated as cyclically banded with bandwidth $B_v$, the matrix $\mathbf{A}$ would be cyclically banded too, but the computational complexity would increase to $O((B_c + B_v)^2 N)$. Alternatively, a low-complexity weighted least-squares (WLS) equalizer that avoids the approximation of $\mathbf{V}_f$ could be employed, by neglecting the noise covariance matrix inside $\mathbf{A}$. However, since doubly selective channels lead to highly ill-conditioned matrices, WLS equalizers perform very poorly. Indeed, MMSE equalizers can be interpreted as regularized WLS equalizers.
From (6), the estimated time-domain data-symbol vector is subsequently obtained by $\hat{\mathbf{s}} = \mathbf{F}^H \hat{\mathbf{s}}_f$, which leads to (7), where $\mathbf{i}_k$ is the $N \times 1$ indicator vector, defined as the $k$th column of $\mathbf{I}_N$. In order to derive the extrinsic LLR, the mean $\mu_i$ and the variance $\sigma_i^2$ of the Gaussian PDF $p(\hat{s}_i \mid s_i = s)$ are calculated from (7) as in (8), where we approximate $\mathbf{F}(\mathbf{V} - v_i \mathbf{i}_i \mathbf{i}_i^H)\mathbf{F}^H$, $\mathbf{H}^H\mathbf{H}$, and $\mathbf{\Sigma}$ by diagonal matrices, by setting their off-diagonal elements to zero. As it will be explained later, similarly to the diagonal approximation of $\mathbf{V}_f$, these approximations are necessary to maintain a low complexity. We now separately discuss the three approximations. First, the approximation of $\mathbf{F}(\mathbf{V} - v_i \mathbf{i}_i \mathbf{i}_i^H)\mathbf{F}^H$ is similar to the approximation of $\mathbf{V}_f$, which has been discussed previously. However, now the obtained diagonal matrix is not a scaled identity. Second, since $\mathbf{H}$ is cyclically banded, the off-diagonal elements of $\mathbf{H}^H\mathbf{H}$ decay to zero very rapidly. Hence, we expect that the approximation on $\mathbf{H}^H\mathbf{H}$ will not introduce a significant error. Third, the matrix $\mathbf{\Sigma} = \mathbf{H}^H\mathbf{A}^{-1}\mathbf{H}$ represents the effect of a linear MMSE equalizer $\mathbf{H}^H\mathbf{A}^{-1}$ applied to the channel matrix $\mathbf{H}$. Since the MMSE equalizer highly mitigates the ICI, $\mathbf{\Sigma}$ is already very close to a diagonal matrix. This last approximation also leads to $\mathbf{T}\mathbf{\Sigma} \approx \mathbf{I}_N$, which justifies the equalizer unbiasedness $\mu_i \approx s_i$.

EURASIP Journal on Advances in Signal Processing
It is easy to prove that the calculation of the extrinsic LLR in (9) has complexity $O(B_c N)$. Therefore, the equalization complexity of (6) dominates over the extrinsic LLR calculation complexity of (9). Taking into account FFT operations, the overall computational complexity per iteration for each block of $N$ symbols is $O((B_c^2 + \log N)N)$, which is independent of the channel length $L$. On the other hand, the complexity of the time-domain equalizer of [14] is $O(L^2 N)$. Therefore, for multipath channels with a long impulse response, we obtain a significant complexity saving. A more detailed discussion (i.e., flops count) about the computational complexity of banded turbo equalizers can be found in [13].
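To give a feeling for the two complexity orders, the short sketch below evaluates them for a few channel lengths. It is only an order-of-magnitude illustration: the hidden constants of the $O(\cdot)$ expressions are ignored, the logarithm is assumed to be base 2, and the parameter values and function names are illustrative choices.

```python
import math

def freq_domain_order(n, bc):
    """Order of the banded frequency-domain equalizer: (Bc^2 + log2 N) * N."""
    return (bc ** 2 + math.log2(n)) * n

def time_domain_order(n, channel_order):
    """Order of a time-domain equalizer: L^2 * N."""
    return channel_order ** 2 * n

N, Bc = 256, 3  # illustrative block length and matrix bandwidth
fd = freq_domain_order(N, Bc)
for L in (7, 16, 32):
    td = time_domain_order(N, L)
    print(f"L={L:2d}: frequency-domain ~{fd:.0f}, time-domain ~{td:.0f}, ratio {td / fd:.1f}")
```

The frequency-domain figure does not change with $L$, so the gap widens quadratically as the delay spread grows.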
We highlight that the three diagonal approximations introduced in (8) are fundamental in reducing the computational complexity. For instance, if the full matrix $\mathbf{\Sigma}$ is used, the computation of $\sigma_i^2$ in (8) involves full matrices, and therefore the computational complexity would be at least $O(N^2)$. In this case, the complexity of the extrinsic LLR calculation (9) would dominate. Clearly, the nonapproximated equalizer would be useful only when the block size $N$ is small, which is not feasible in long multipath channels due to the constraint $N > L$. Therefore, if low computational complexity is important, there is no way to avoid diagonal approximations. We also point out that, among the different possible ways to approximate the two matrices $\mathbf{H}^H\mathbf{H}$ and $\mathbf{\Sigma}$ as diagonal, the only reasonable approach is setting their off-diagonal elements to zero. Indeed, as explained after (8), both matrices are already close to diagonal. For the symbol-variance matrix, on the other hand, alternative diagonal approximations are possible, such as an average approximation and a maximum approximation. Intuitively, the average approximation assigns to all the symbols the average reliability of all the symbols, whereas the maximum approximation assigns to all the symbols the reliability of the worst symbol estimate. In the simulation section, we compare the performance of both approximations.

Iterative Channel Estimation
The turbo equalizers presented in the previous section require the channel-state information (CSI) at the receiver. To acquire the CSI, we propose a modification of a pilot-assisted channel estimator presented in [21] for OFDM. Specifically, we modify the iterative linear MMSE channel estimator of [21] in such a way that it can operate in a turbo fashion. Therefore, besides the pilot symbols, we also use the soft data estimates originating from the turbo equalizer and the decoder. Indeed, after the first iteration, the soft data symbol estimates can be used as auxiliary pilot symbols, in order to improve the quality of the subsequent channel estimates [31]. For both OFDM and SC systems, our channel estimators produce an estimate $\hat{\mathbf{H}}_t$ of the time-domain channel matrix $\mathbf{H}_t$, and then translate $\hat{\mathbf{H}}_t$ into the frequency-domain cyclically banded matrix estimate $\hat{\mathbf{H}}$. The channel estimators are assumed to have perfect knowledge of the channel statistics, that is, the Doppler spectrum and the power-delay profile. We highlight that the channel estimators considered in this paper are nonadaptive, that is, the CSI is newly estimated in each transmitted block, using both pilots and data. This way, severe time variation can be handled.
In pilot-assisted transmissions, there exist various approaches to design the pilot pattern. We can distinguish between two broad categories: multiplexed training and superimposed training [24]. In the multiplexed training case, each element of the transmitted vector contains either a pilot symbol or a data symbol, while in the superimposed case both pilot and data symbols are located in the same positions, typically distributed over the whole transmitted vector. In this paper, we assume multiplexed training, which is also known as periodic training when the pilots are placed in the time domain, and as orthogonal training when the pilots are located in the frequency domain. In particular, we focus on the pilot placement schemes developed in [19,20], which have been proved to be optimal in the MMSE sense under certain channel conditions. In these schemes, pilot symbols are interleaved with the data symbols to form the transmitted signal vector. For OFDM systems, we employ the frequency-domain Kronecker delta (FDKD) pilot structure [20], while its dual scheme [19], identified as time-domain Kronecker delta (TDKD), is adopted for SC systems. In both cases, the pilot symbols are grouped into equidistant clusters, each having the same length. Within each cluster, a unique nonzero pilot symbol is located in the middle of the cluster, while null pilot symbols are placed on both sides. Therefore, the FDKD scheme coincides with equispaced pilot tones with guard frequency bands, while the TDKD scheme uses periodic training with guard time intervals.
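The Kronecker-delta cluster layout described above can be sketched as follows. The placement details are assumptions made for illustration (the first cluster starts at index 0 and the block length is a multiple of the number of clusters); the function name `kronecker_delta_pilots` is hypothetical. The same index pattern applies to subcarriers (FDKD) or to time samples (TDKD).

```python
def kronecker_delta_pilots(n, num_clusters, cluster_len, amplitude=1.0):
    """Return (cluster start indices, {position: pilot value}) for equidistant clusters.

    Each cluster has one nonzero pilot in the middle and zeros (guards) around it.
    Assumptions: cluster_len is odd, n is a multiple of num_clusters, and the
    first cluster starts at index 0.
    """
    if cluster_len % 2 == 0:
        raise ValueError("cluster_len must be odd")
    if n % num_clusters != 0:
        raise ValueError("n must be a multiple of num_clusters in this sketch")
    spacing = n // num_clusters
    if cluster_len > spacing:
        raise ValueError("clusters would overlap")
    starts = [k * spacing for k in range(num_clusters)]
    pilots = {}
    for start in starts:
        for offset in range(cluster_len):
            is_center = offset == cluster_len // 2
            pilots[start + offset] = amplitude if is_center else 0.0
    return starts, pilots

starts, pilots = kronecker_delta_pilots(n=32, num_clusters=4, cluster_len=3)
data_positions = [i for i in range(32) if i not in pilots]
print(starts, sorted(pilots.items()), len(data_positions))
```

All positions not listed in `pilots` carry data symbols, so the block holds $N - ML_p$ data symbols.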
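Suppose that there are $M$ pilot clusters, each containing $L_p$ (odd) pilots, denoted by the vector $\mathbf{s}^p_m$, for $m = 0, \ldots, M-1$, where the $m$th cluster occupies the positions from $n_m$ to $n_m + L_p - 1$ of $\mathbf{s}$. The pilot symbols and the data symbols are collected in the vectors $\mathbf{s}_p$ and $\mathbf{s}_d$, respectively, with size $ML_p$ and $N - ML_p$, respectively.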
In addition, we define $h^t_{n,l}$ as the $l$th channel tap at the $n$th time instant, where $h^t_{n,l} = 0$ for $l < 0$ or $l > L$, since the maximal channel order is assumed to be $L$. Thus the elements of $\mathbf{H}_t$ can be expressed as

$$[\mathbf{H}_t]_{p,q} = h^t_{p,\,(p-q) \bmod N},$$

which means that our channel estimation problem has $N(L+1)$ unknowns. However, these unknowns are correlated in the time domain. The BEM can be used to reduce the number of unknowns from $N(L+1)$ to $(Q+1)(L+1)$, where $Q+1$ is the number of basis functions [21]. By stacking all the channel taps within the block in a single $N(L+1) \times 1$ vector $\mathbf{h}_t = [h^t_{0,0}, \ldots, h^t_{0,L}, \ldots, h^t_{N-1,0}, \ldots, h^t_{N-1,L}]^T$, the BEM can be expressed as

$$\mathbf{h}_t = (\mathbf{B} \otimes \mathbf{I}_{L+1})\, \mathbf{h}, \tag{13}$$

where $\mathbf{B} = [\mathbf{b}_0, \ldots, \mathbf{b}_Q]$ is an $N \times (Q+1)$ matrix that has $Q+1$ orthonormal basis functions $\mathbf{b}_q$ as columns, and $\mathbf{h}$ is a $(Q+1)(L+1) \times 1$ vector that collects all the BEM coefficients of all the channel taps.
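The reduction in the number of unknowns, and the reconstruction of the time-domain taps as $(\mathbf{B} \otimes \mathbf{I}_{L+1})\mathbf{h}$, can be illustrated with a small sketch. The basis used here (critically sampled complex exponentials with an even $Q$) is only one hypothetical choice of orthonormal basis, not necessarily the one used in the simulations, and the function names are illustrative.

```python
import cmath

def ce_basis(n, q_order):
    """N x (Q+1) matrix whose columns are orthonormal complex exponentials.

    Hypothetical basis choice: integer frequencies -Q/2, ..., Q/2 (Q even).
    """
    half = q_order // 2
    scale = n ** -0.5
    return [[scale * cmath.exp(2j * cmath.pi * (q - half) * t / n)
             for q in range(q_order + 1)] for t in range(n)]

def reconstruct_taps(basis, coeffs, num_taps):
    """Compute h_t = (B kron I_{L+1}) h without forming the Kronecker product.

    Entry n*(L+1)+l of h_t equals sum_q B[n][q] * h[q*(L+1)+l].
    """
    num_basis = len(basis[0])
    h_t = []
    for row in basis:
        for tap in range(num_taps):
            h_t.append(sum(row[q] * coeffs[q * num_taps + tap]
                           for q in range(num_basis)))
    return h_t

N, L, Q = 16, 2, 2
B = ce_basis(N, Q)
# Only the zero-frequency basis function (q = Q/2) is excited: a static channel.
coeffs = [0.0] * ((Q + 1) * (L + 1))
coeffs[(Q // 2) * (L + 1):(Q // 2 + 1) * (L + 1)] = [4.0, 2.0, 1.0]
h_t = reconstruct_taps(B, coeffs, L + 1)

unknowns_full = N * (L + 1)
unknowns_bem = (Q + 1) * (L + 1)
print(unknowns_full, unknowns_bem, h_t[:3])
```

In this toy case, 48 time-domain unknowns are described by 9 BEM coefficients, and exciting only the zero-frequency basis function yields taps that stay constant over the block.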
In order to derive our MMSE channel estimator, the following assumptions are made.

Assumption 1. The wireless channel can be regarded as a wide-sense stationary uncorrelated scattering (WSSUS) process, which has the following statistics:

$$E\{h^t_{n,l}\, (h^t_{n-m,\,l-k})^{*}\} = \sigma_l^2\, \gamma_t(m)\, \delta_k,$$

where $\sigma_l^2$ denotes the variance of the $l$th channel tap, $\gamma_t$ is the normalized time correlation, and $\delta_n$ stands for the Kronecker delta function.
Assumption 2. The data symbols in $\mathbf{s}_d$ are assumed to be uncorrelated with zero mean and variance $\sigma_s^2$, while the noise at the receiver is assumed to be uncorrelated with the transmitted symbols.

Assumption 3. The BEM coefficients $\mathbf{h}$ are assumed to be uncorrelated with the transmitted signal $\mathbf{s}$ and with the noise.

Assumption 4. The covariance matrix of the BEM coefficients is assumed known to the receiver, and it is calculated as in [21].

Assumption 5. The average power of the pilot symbols is the same as that of the data symbols.

Iterative Channel Estimation for OFDM Systems.
For OFDM systems, the pilot and data symbols are interleaved in the frequency domain. Since the frequency-domain channel matrix is cyclically banded only approximately, the received samples used for channel estimation are always contaminated by ICI, independently of the length of the null guard bands inserted. To be precise, the frequency-domain channel matrix is (with high probability) a full matrix, and hence the power of the pilot symbols is spread over all the received samples. While a time-domain receiver window can reduce the ICI to get a better equalization performance, it is still unclear whether the same window can improve the channel estimation quality or not. Thus, to estimate the time-domain channel matrix $\mathbf{H}_t$, we use the frequency-domain received signal without applying the time-domain receiver window. For OFDM systems, the precoder is set to $\mathbf{P} = \mathbf{F}^H$, and (2) can be rewritten as

$$\mathbf{y}_f = \mathbf{F}\mathbf{H}_t\mathbf{F}^H \mathbf{s} + \mathbf{n}_f. \tag{19}$$

By substituting (13) in (19), we can rewrite (19) as

$$\mathbf{y}_f = \sum_{q=0}^{Q} \mathbf{D}_q \mathbf{\Delta}_q \mathbf{s} + \mathbf{n}_f, \tag{20}$$

where $\mathbf{D}_q = \mathbf{F}\,\mathrm{diag}(\mathbf{b}_q)\,\mathbf{F}^H$, $\mathbf{\Delta}_q = \mathrm{diag}\{\mathbf{F}_L [h_{q,0}, \ldots, h_{q,L}]^T\}$, with $h_{q,l}$ the BEM coefficient of the $l$th tap associated with the $q$th basis function, and $\mathbf{F}_L$ represents the first $L+1$ columns of the matrix $\sqrt{N}\mathbf{F}$. It is noteworthy that, for channel estimation purposes, it is not necessary to process all the received samples. Indeed, the computational complexity of the channel estimator can be highly reduced by extracting a subvector of $\mathbf{y}_f$ before channel estimation [21]. Obviously, this subvector should contain the relevant information given by the pilot symbols. Therefore, with reference to the $m$th pilot cluster $\mathbf{s}^p_m$, whose first pilot is located at position $n_m$, we consider the following observation subvector:

$$\mathbf{y}_{f,m} = \big[[\mathbf{y}_f]_{n_m-\Delta}, \ldots, [\mathbf{y}_f]_{n_m+L_p-1+\Delta}\big]^T, \tag{21}$$

where $\Delta$ is a smoothing parameter used to control the amount of interference taken into account for channel estimation. Please observe that $\Delta$ can be positive as well as negative, or zero: When $\Delta$ is positive, the channel estimator is actually enlarging the observation window, which in this case monitors also the $2\Delta$ data symbol locations closest to the pilot symbols.
The received signal in (21) can also be expressed as in (22), where $\mathbf{D}_{q,m}$ is an $(L_p + 2\Delta) \times N$ matrix consisting of the $L_p + 2\Delta$ rows of $\mathbf{D}_q$ with indices from $n_m - \Delta$ to $n_m + L_p - 1 + \Delta$. It can be observed that the pilot symbols, as well as the soft data estimates $\mathbf{m}$, are used to estimate the CSI, which could help to achieve a better performance than [21], which uses the pilot symbols only. The second term in (22) reflects the uncertainty of the soft data estimates and can be regarded as interference, whose covariance can be taken into account in the channel estimator.
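The row selection controlled by the smoothing parameter can be sketched as below. The wrap-around of indices modulo $N$ at the block edges is an assumption of this sketch, and the function name `observation_indices` is hypothetical.

```python
def observation_indices(n_m, cluster_len, delta, n):
    """Indices n_m - delta, ..., n_m + cluster_len - 1 + delta of the received block.

    delta > 0 enlarges the window into the neighbouring data locations,
    delta < 0 shrinks it inside the pilot cluster.
    Indices are wrapped modulo n (an assumption of this sketch).
    """
    length = cluster_len + 2 * delta
    if length <= 0:
        raise ValueError("empty observation window")
    return [(n_m - delta + k) % n for k in range(length)]

wide = observation_indices(n_m=8, cluster_len=3, delta=2, n=32)
narrow = observation_indices(n_m=8, cluster_len=3, delta=-1, n=32)
edge = observation_indices(n_m=0, cluster_len=1, delta=2, n=32)
print(wide, narrow, edge)
```

Each cluster contributes $L_p + 2\Delta$ observation samples, so the size of the linear system solved by the estimator grows with $\Delta$.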

After some tedious manipulations, we can rewrite (22) as a linear function of $\mathbf{h}$. As a result of (24), the linear MMSE estimate $\hat{\mathbf{h}}$ of the BEM channel coefficients can be expressed as in (25). After the estimation of the BEM coefficients in $\hat{\mathbf{h}}$, the time-domain channel vector can be reconstructed by (13) as $\hat{\mathbf{h}}_t = (\mathbf{B} \otimes \mathbf{I}_{L+1})\hat{\mathbf{h}}$, whose elements form the estimated time-domain channel matrix $\hat{\mathbf{H}}_t$. In [21], it has been shown that the BEM-based linear MMSE channel estimator can achieve a better performance by using a larger number of observation samples, that is, when all elements of $\mathbf{y}_f$ are included in the observation vector $\mathbf{y}^o_f$. Obviously, the same behavior is expected in our case: Indeed, our channel estimator additionally includes the reliability of the turbo-equalized data symbols, and hence additional benefit should be obtained by including more data locations into the observation window. However, the main complexity of our channel estimator comes from the matrix inverse in (25), which requires the observation vector length to be small. Thus, the smoothing parameter $\Delta$ allows for a trade-off between channel estimation complexity and performance.

Iterative Channel Estimation for SC Systems.
Unlike OFDM systems, where the pilot symbols are inserted in the frequency domain, and the frequency-domain channel matrix is cyclically banded only approximately, in SC systems the pilots are positioned in the time domain, and the time-domain channel matrix is banded, due to the FIR channel assumption. Therefore, using sufficiently long guard intervals, the ISI between pilots and data is completely eliminated [19], thereby simplifying the channel estimation procedure. Due to this structure, channel estimation for SC systems operates on time-domain observation subvectors, one for each pilot cluster, where the smoothing parameter $\Delta$ is defined as in (21), but now operates in the time domain. Using the expressions (12) and (13), we can rewrite the part of the time-domain received signal (1) that corresponds to the $m$th cluster as in [21], where $\mathbf{H}_{t,m}$ consists of the corresponding rows of $\mathbf{H}_t$, with indices from $n_m - \Delta$ to $n_m + L_p - 1$. The covariance matrix of the interference term is denoted by $\mathbf{R}_{d_t,m}$. Stacking the $M$ observation clusters together, we get the reduced-size time-domain received signal, from which the BEM coefficients are estimated as in the OFDM case, where $\mathbf{R}_{d^o_t}$ and $\mathbf{R}_{n^o_t}$ are defined similarly to (25), with $\mathbf{R}_{d^o_t} = \mathrm{diag}(\mathbf{R}_{d_t,0}, \ldots, \mathbf{R}_{d_t,M-1})$. It is easy to understand that, also in this case, a better performance is achieved by including a larger number of observation samples [21], that is, by increasing the smoothing parameter $\Delta$, at the price of increased complexity.

Simulation Results
In this section, the proposed algorithms are examined and compared by simulations. We consider a block transmission system with block length $N = 256$. A rate-1/2 convolutional code with generator polynomials (5, 7) (in octal notation) and codeword length of 16384 is used, together with random interleaving. The maximum channel delay spread and the CP length are both equal to $L = 7$. The channel is assumed to be Rayleigh distributed with uniform power-delay profile $E\{|h_{n,l}|^2\} = 1/(L+1)$, $l = 0, \ldots, L$, and with Jakes' Doppler spectrum [32,33]. We consider a high-mobility case where the normalized Doppler frequency is $f_d T = 0.15/N$, with $f_d$ the absolute Doppler frequency shift and $T$ the symbol period. This can be interpreted as $f_d/\xi = 0.15$, with $\xi$ the subcarrier spacing in OFDM systems. The time-domain receiver window of [27], as well as the cyclically banded equalizers, are designed for a matrix bandwidth parameter $B_c = 3$, unless otherwise stated. We use the generalized complex-exponential (GCE) BEM to model the time-varying channel at the receiver [21]. Note that the (critically sampled) complex-exponential (CE) BEM would produce a cyclically banded channel matrix estimate, where the number of BEM parameters $Q + 1$ coincides with the number of estimated diagonals. On the contrary, the GCE-BEM produces a full channel matrix estimate: hence, this choice permits increasing the equalizer bandwidth $B_c$, so that the number of equalizer diagonals $2B_c + 1$ can exceed the number of BEM parameters $Q + 1$. The channel decoder employs a linear approximation to the log-MAP decoding algorithm. Figure 2 shows the bit error rate (BER) performance of (the iterative block) Equalizer III [13], equipped with the proposed channel estimator for OFDM systems, as a function of the signal-to-noise ratio (SNR), which is defined as $\mathrm{SNR} = 1/\sigma_n^2$.
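To give a feel for the Doppler setting and the CE/GCE distinction described above, the following sketch builds the Jakes autocorrelation of a single unit-power tap for $f_d T = 0.15/N$ and measures how much of the tap's energy falls outside a given complex-exponential basis. Several ingredients are assumptions rather than details from the paper: the basis is written in the commonly used form exp(j2πn(q − Q/2)/(KN)) with K = 1 for the CE-BEM and an oversampling factor K = 2 for the GCE-BEM, the least-squares modeling error is used as the figure of merit, and all function names are hypothetical.

```python
import numpy as np

def bessel_j0(x, n_points=2000):
    """J0(x) from its integral form (1/pi) * int_0^pi cos(x sin t) dt (midpoint rule)."""
    t = (np.arange(n_points) + 0.5) * np.pi / n_points
    return np.cos(np.outer(x, np.sin(t))).mean(axis=1)

def jakes_correlation(N, fd_T):
    """Toeplitz matrix with entries J0(2 pi fd_T (n - m)) for one unit-power tap."""
    r = bessel_j0(2 * np.pi * fd_T * np.arange(N))
    lag = np.abs(np.subtract.outer(np.arange(N), np.arange(N)))
    return r[lag]

def complex_exp_basis(N, Q, K):
    """Columns exp(j 2 pi n (q - Q/2) / (K N)); K = 1 gives the CE-BEM, K > 1 a GCE-BEM."""
    n = np.arange(N)[:, None]
    q = np.arange(Q + 1)[None, :]
    return np.exp(2j * np.pi * n * (q - Q / 2) / (K * N))

def modeling_error(R, B):
    """Mean-square error per sample of the least-squares fit of a tap onto span(B)."""
    U, _ = np.linalg.qr(B)                       # orthonormal basis of the BEM subspace
    captured = np.trace(U.conj().T @ R @ U).real
    return (np.trace(R).real - captured) / R.shape[0]

N = 256
R = jakes_correlation(N, 0.15 / N)
err_ce = modeling_error(R, complex_exp_basis(N, Q=4, K=1))
err_gce = modeling_error(R, complex_exp_basis(N, Q=4, K=2))
err_const = modeling_error(R, complex_exp_basis(N, Q=0, K=2))   # time-invariant model
print(f"CE-BEM: {err_ce:.3e}  GCE-BEM: {err_gce:.3e}  constant: {err_const:.3e}")
```

Comparing the printed values shows how the choice of K and Q affects the fit for this Doppler spread; the numbers describe only this toy metric, not the BER or NMSE results reported in the figures.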
We insert $ML_p = 10$ frequency-domain pilot symbols, grouped into $M = 10$ clusters, which means $L_p = 1$, that is, there are no guard bands around the nonzero pilot in each cluster. Therefore, the efficiency is $\eta = (N - ML_p)/(N + L) = 0.94$. We use the GCE-BEM with $Q = 4$. The observation length parameter is set to $\Delta = 2$, which leads to a total observation length of $M(L_p + 2\Delta) = 50$ for each block. It is clear that most of the performance gain is obtained when passing from one iteration, which represents the noniterative equalizer, to two iterations. In addition, it is noteworthy that the performance gain obtained by iterative equalization with respect to the noniterative equalizer is higher in the case of estimated CSI: for instance, at $\mathrm{BER} = 10^{-3}$, the performance gain is more than 3 dB. The performance gain with respect to noniterative approaches is confirmed by Figure 3, which displays the normalized mean square error (NMSE) of the iterative MMSE channel estimator, defined as $\mathrm{NMSE} = E\{\|h_t - (B \otimes I_{L+1})\,\hat{h}\|^2 / N\}$, where $\hat{h}$ denotes the estimated BEM coefficient vector. Notably, the first iteration of our channel estimator coincides with the linear MMSE channel estimator of [21].
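In a simulation, the expectation in the NMSE definition is typically replaced by an average over blocks. The helper below is a minimal sketch of that Monte Carlo average; the function name, the array layout (one stacked time-domain channel vector of length N(L + 1) per row), and the synthetic data in the demo are assumptions, not part of the paper.

```python
import numpy as np

def nmse(h_true, h_est, N):
    """Monte Carlo estimate of E{ ||h_t - h_t_est||^2 / N }.

    h_true, h_est: arrays of shape (n_trials, N * (L + 1)), one stacked
    time-domain channel vector per simulated block.
    """
    err = np.abs(np.asarray(h_true) - np.asarray(h_est)) ** 2
    return err.sum(axis=1).mean() / N

# Synthetic demo: "estimates" are the true vectors plus small complex noise.
rng = np.random.default_rng(1)
N, L, n_trials = 8, 1, 100
shape = (n_trials, N * (L + 1))
h_true = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
h_est = h_true + 0.1 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
value = nmse(h_true, h_est, N)
```

With the unit-power delay profile of the simulation setup, the expected energy of the stacked channel vector is N, so dividing by N amounts to normalizing by the average channel energy.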
We now compare the BER performance of the iterative block equalizer for OFDM [13], assuming different choices for the basis functions used by the channel estimation algorithm proposed in this paper. Here we assume $B_c = 2$, while the other parameters are the same as in Figure 2.   iterations. These results, obtained for soft-decision data-aided pilot-based channel estimation, are consistent with the results obtained in [21] for nondata-aided and covariance-data-aided pilot-based channel estimation.
Figure 5 compares the BER performance of the proposed iterative frequency-domain block turbo equalizer for SC systems with perfect CSI. We employ $ML_p = 52$ time-domain pilot symbols, grouped into $M = 4$ clusters, which means there are $(L_p - 1)/2 = 6$ guard time symbols on each side of the nonzero pilot in each cluster. In this case, the efficiency is $\eta = (N - ML_p)/(N + L) = 0.78$. We now use a BEM order $Q = 2$. The observation length parameter is set to $\Delta = -6$, which leads to a total observation length of $M(L_p + L + 2\Delta) = 32$ for each block. It can be observed that time-domain receiver windowing improves the BER performance over the system without windowing [34] under doubly selective channels. With time-domain receiver windowing, the BER performances of the average approximation of $v_i = (1/N)\left(\sum_{k=1}^{N} v_k - v_i\right)$ and the maximum approximation of $v_i = v = \max\{v_k\}_{k=1}^{N}$ in (8) are almost the same after three iterations. Figure 6 illustrates the BER performance of the proposed iterative frequency-domain block turbo equalizer for SC systems with average approximation, for both perfect and estimated CSI. The second iteration of our block turbo equalizer achieves about 1.5 dB gain with respect to the first iteration, which corresponds to the output of a noniterative equalizer. However, a third iteration does not help. Figure 7 plots the NMSE of the MMSE channel estimator.
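The pilot overhead and observation-length figures quoted for the OFDM and SC setups follow directly from the stated expressions. The short sketch below simply evaluates them with the parameters given in the text; the function names are hypothetical, and the single `observation_length` helper with an optional `L` argument is a convenience of this sketch that covers both the frequency-domain expression $M(L_p + 2\Delta)$ and the time-domain expression $M(L_p + L + 2\Delta)$.

```python
def efficiency(N, L, M, L_p):
    """Fraction of transmitted samples that carry data: (N - M*L_p) / (N + L)."""
    return (N - M * L_p) / (N + L)

def observation_length(M, L_p, delta, L=0):
    """Total observation samples per block: M * (L_p + L + 2*delta).

    Pass L=0 to obtain the OFDM (frequency-domain) expression M * (L_p + 2*delta).
    """
    return M * (L_p + L + 2 * delta)

N, L = 256, 7
ofdm = (efficiency(N, L, M=10, L_p=1), observation_length(M=10, L_p=1, delta=2))
sc = (efficiency(N, L, M=4, L_p=13), observation_length(M=4, L_p=13, delta=-6, L=L))
print(f"OFDM: eta = {ofdm[0]:.2f}, observation length = {ofdm[1]}")
print(f"SC:   eta = {sc[0]:.2f}, observation length = {sc[1]}")
# prints:
# OFDM: eta = 0.94, observation length = 50
# SC:   eta = 0.78, observation length = 32
```

The SC cluster length $L_p = 13$ used here follows from $ML_p = 52$ with $M = 4$, which is consistent with six guard symbols on each side of the nonzero pilot.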
Note that, differently from the considered SC scenario, in OFDM systems we have not used any guard bands around the nonzero pilot tone. The reason is that in OFDM systems the ICI power decays rapidly. On the contrary, since the considered channel has a uniform power-delay profile, it generates significant ISI in SC systems. Thus, the iterative process cannot suppress much of this interference, and large guard intervals are needed to accurately estimate the CSI.
It is interesting to compare the SC and OFDM systems in doubly selective channels. Previous work has shown some performance comparisons for frequency-selective channels [5,35]. Figures 8 and 9 illustrate the BER performance comparisons for SC and OFDM systems with different Doppler spreads and different channel lengths, respectively. The BER curves are for the turbo equalizers with perfect CSI after three iterations. The bandwidth parameter $B_c = 3$ is the same for both SC and OFDM systems. It can be seen that, as the Doppler spread and the channel length increase, a lower BER can be achieved at high SNR. The simulation results confirm that both SC and OFDM systems benefit from channel coding, by obtaining delay and Doppler diversity, respectively, which is not fully exploited in the uncoded case. However, the achievable diversity gain is difficult to analyze, since the cyclic band approximation error impairs the performance at high SNR, and the amount of diversity also depends on the specific error correction code used [5,35].

Conclusions
We have proposed a low-complexity frequency-domain block MMSE turbo equalizer for SC systems in doubly selective channels. We have exploited the cyclically banded structure of the frequency-domain channel matrix, as well as receiver windowing that enforces the cyclically banded structure, to limit the computational complexity, which is linear in the block length. For both OFDM and SC systems, we have developed two iterative MMSE pilot-assisted channel estimators, where the soft data estimates from the turbo equalizers are used to improve the quality of the channel estimates. Combined with error correction coding, both OFDM and SC systems can effectively exploit the delay-Doppler diversity provided by doubly selective channels.