

CFO mitigation for uplink SFBC SC-FDMA


To mitigate inter-carrier interference due to large carrier frequency offset (CFO) in an uplink single-carrier frequency division multiple access (SC-FDMA) system, a three-tap adaptive frequency-domain decision feedback equalizer (AFD-DFE) is designed in this paper. Our design exploits the banded and sparse structure of the equivalent channel matrix. The block recursive least squares (RLS) algorithm is used to adapt the AFD-DFE. Consequently, by exploiting the matrix structure in the frequency-domain, the complexity of the block RLS is reduced substantially when compared to its time-domain counterpart. In addition, the design is extended to space-frequency block coded (SFBC) SC-FDMA systems. We show that our proposed AFD-DFE exhibits significant performance improvement when compared to a one-tap AFD-DFE while still enjoying a low computational complexity.


SC-FDMA is a multiple-access technique that has been adopted in wireless broadband communication systems such as the long term evolution (LTE) standard [1]. Its complexity and performance are comparable to those of orthogonal frequency division multiple access (OFDMA) [2], with the additional benefit of a low peak-to-average power ratio (PAPR), which helps reduce power consumption and increase battery life in mobile terminals. The sensitivity analysis of SC-FDMA is reported in [3], where it is shown that for large carrier frequency offset (CFO), the performance of SC-FDMA can become worse than that of OFDMA.

To improve reliability at the user terminal, transmit diversity is employed in LTE-advanced (LTE-A) [4]. Alamouti’s space time block codes (STBC) [5] cannot be applied to SC-FDMA systems, since in LTE the frames contain an odd number of SC-FDMA symbols, while in STBC this number should be even. Moreover, STBC assumes that the channel remains constant for two consecutive SC-FDMA blocks, which is not valid for fast-varying channels and consequently results in performance degradation. An attractive solution to this problem is to use space-frequency block codes (SFBC) [6]. SFBC does not require an even number of symbols in each frame, but when applied to an SC-FDMA system, it affects the low PAPR property. In [7] and [8], new schemes to deal with the aforementioned issues are proposed. However, their performance degrades at high signal-to-noise ratio (SNR). In [9], an embedded SFBC technique is proposed which preserves the low PAPR property of SC-FDMA as well as Alamouti’s structure in case of inter-carrier interference (ICI).

Several works have studied the frequency-domain DFE [10–15]. All of these DFE structures are non-adaptive and require channel state information (CSI) at the receiver. More recently, unlike the schemes in [10–15], adaptive equalization schemes (based on RLS or particle swarm optimization (PSO)) have been proposed for SC-FDMA systems [16, 17], and impressive performance gains are achieved.

In [10–17], a one-tap per subcarrier frequency-domain equalizer is used, which becomes highly suboptimal in the presence of ICI. In [16], distributed mapping with a small number of users is assumed, but if the number of users increases or localized mapping is used, the interference due to ICI becomes more pronounced and the performance deteriorates. To overcome this sub-optimality, in this work a three-tap per subcarrier AFD-DFE is designed by exploiting the banded and sparse structure of the channel matrix. In this AFD-DFE, both the feedforward and feedback filters operate in the frequency domain and, as a performance-complexity tradeoff, the block RLS algorithm is used for adaptation, as it is known to enjoy a fast convergence/tracking property. Generally, the complexity of the block RLS is high due to the matrix inversion operation involved [18], but when used in the frequency domain, the inversion is simplified by the special structure of the matrices, resulting in a significant reduction in complexity. To further improve the performance, the three-tap AFD-DFE is efficiently integrated with the SFBC technique. Thanks to the structure of the matrices, the combined structure also exhibits low computational complexity. In addition, a constraint is placed on the feedback filter to mitigate the effect of intersymbol interference.

Unlike [19] and [20], our work is of an adaptive nature, and CFO and channel estimation are carried out together adaptively. This has the advantage of reducing the complexity and the overhead due to pilots. Furthermore, the works in [19] and [20] are for OFDM, in which pilots are inserted in each symbol, whereas in SC-FDMA a symbol full of pilots is sent after every few data symbols. Therefore, CFO estimation is difficult in SC-FDMA. Moreover, the work in [19] and [20] also relies on previously acquired per-user CFO and channel frequency response estimates. This is not the case in our work.

In summary, unlike [16], the contribution of this work is threefold: first, the development of a constraint-based RLS algorithm; second, the formulation of a three-tap AFD-DFE for SISO and SFBC systems; and third, the complexity reduction of the three-tap RLS algorithm in the SISO and SFBC cases.

The rest of the paper is organized as follows. Following this introduction, Section 2 is devoted to the system’s description. In Section 3, the formulation of a three-tap AFD-DFE is carried out for a single-input single-output SC-FDMA system with CFO. In this section, a reduced-complexity design of the AFD-DFE is also developed. While Section 4 extends the design to SFBC SC-FDMA system, simulation results are presented in Section 5. Finally, Section 6 draws the conclusions.

System description

In this section, the SC-FDMA transceiver is described. We assume K users and a total of N sub-carriers with M sub-carriers for each user, i.e., \(N=KM\). For the mth user, M data symbols are grouped to form a block \(\boldsymbol {x}^{(m)}\). An M-point DFT is applied to transform \(\boldsymbol {x}^{(m)}\) to the frequency-domain symbol, \(\boldsymbol {\mathcal {X}}^{(m)}=\,[\!\mathcal {X}(0)^{(m)},\mathcal {X}(1)^{(m)},\ldots,\mathcal {X}(M-1)^{(m)}]^{T}\), where T denotes the transpose operation. Next, \(\boldsymbol {\mathcal {X}}^{(m)}\) is mapped to N sub-carriers, i.e., \(\boldsymbol {S}^{(m)}=\boldsymbol {R}^{(m)}\boldsymbol {\mathcal {X}}^{(m)}\ (m=1,2,\ldots,K)\), where \(\boldsymbol {R}^{(m)}\) is the N×M resource allocation matrix for the mth user. For the localized mapping scheme, \(\boldsymbol {R}^{(m)}=\left [\boldsymbol {0}_{M\times (m-1)M}\ \ \boldsymbol {I}_{M}\ \ \boldsymbol {0}_{M\times (K-m)M}\right ]^{T}\), where \(\boldsymbol {I}_{M}\) is an M×M identity matrix with columns \(\boldsymbol {I}_{1},\boldsymbol {I}_{2},\ldots,\boldsymbol {I}_{M}\) and \(\boldsymbol {0}_{M\times M}\) is an M×M all-zero matrix. Note that the matrices \(\boldsymbol {R}^{(m)}\) of different users are mutually orthogonal. Then, the block \(\boldsymbol {S}^{(m)}\) is transformed to the time domain, \(\boldsymbol {s}^{(m)}\), by applying an N-point inverse DFT (IDFT), \(\boldsymbol {s}^{(m)} =\boldsymbol {F}_{N}^{H}\boldsymbol {R}^{(m)}\boldsymbol {\mathcal {X}}^{(m)}\), where \(\boldsymbol {F}_{N}\) is an N×N DFT matrix and H denotes the Hermitian operation.
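The transmitter chain described above (M-point DFT, localized mapping, N-point IDFT) can be sketched numerically as follows. This is a minimal illustration, not the authors' simulation code: the function names `localized_mapping` and `scfdma_modulate`, the small sizes M=4 and K=3, and the unitary DFT scaling are all assumptions made for the example.

```python
import numpy as np

def localized_mapping(m, M, K):
    """N x M resource allocation matrix R^(m) for user m (1-based), localized mapping."""
    N = K * M
    R = np.zeros((N, M))
    R[(m - 1) * M : m * M, :] = np.eye(M)   # identity block on the user's own sub-carriers
    return R

def scfdma_modulate(x, m, K):
    """Turn one block of M symbols of user m into an N-sample time-domain block."""
    M = x.size
    N = K * M
    X = np.fft.fft(x) / np.sqrt(M)          # unitary M-point DFT (scaling is an assumption)
    S = localized_mapping(m, M, K) @ X      # map to N sub-carriers
    return np.fft.ifft(S) * np.sqrt(N)      # unitary N-point IDFT, i.e. F_N^H S

M, K = 4, 3
rng = np.random.default_rng(0)
x = rng.choice([-1.0, 1.0], size=M) + 1j * rng.choice([-1.0, 1.0], size=M)
s = scfdma_modulate(x, m=2, K=K)

R1, R2 = localized_mapping(1, M, K), localized_mapping(2, M, K)
print(np.allclose(R1.T @ R2, 0))            # different users use disjoint sub-carriers
print(np.allclose(R2.T @ R2, np.eye(M)))    # demapping undoes mapping
```

With the unitary scaling assumed here, the energy of the block is preserved from `x` to `s`, which makes the example easy to sanity-check.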

We denote the impulse response of the channel for the mth user by \(\boldsymbol {h}^{(m)}=\left [h_{0}^{(m)},h_{1}^{(m)},\ldots,h_{L(m)}^{(m)}\right ]\). Cyclic prefix insertion at the transmitter and its removal at the receiver are equivalent to a circular convolution between the transmitted signal and the channel vector. Applying an N-point DFT to the received signal gives \(\boldsymbol {\mathcal {\acute {Y}}}= \sum _{m=1}^{K}\boldsymbol {\hat {\Lambda }}^{(m)}\boldsymbol {R}^{(m)}\boldsymbol {\mathcal {X}}^{(m)}+\boldsymbol {\mathcal {N}}\), where \(\boldsymbol {\hat \Lambda }^{(m)}\) is an N×N diagonal matrix containing the DFT of \(\boldsymbol {h}^{(m)}\) as diagonal elements and \(\boldsymbol {\mathcal {N}}\) is the noise vector with covariance \(\sigma _{\mathcal {N}}^{2}\boldsymbol {I}_{N}\). After demapping, the mth user’s received signal is

$$\begin{array}{*{20}l} \boldsymbol{\mathcal{Y}}^{(m)}= \underbrace{\boldsymbol{R}^{(m)T}\boldsymbol{\hat\Lambda}^{(m)}\boldsymbol{R}^{(m)}}_{\boldsymbol{\Lambda}^{(m)}}\boldsymbol{\mathcal{X}}^{(m)}+\boldsymbol{\mathcal{N}}^{(m)} \end{array} $$

where Λ (m) is an M×M diagonal matrix. To simplify the notation, we will ignore the superscript m.
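The two facts used in this derivation can be checked with a short sketch: cyclic prefix insertion and removal turn linear convolution into circular convolution, and demapping the N×N diagonal channel with the localized mapping matrix leaves an M×M diagonal matrix. The channel length, block sizes, and random test data below are assumptions chosen only for illustration.

```python
import numpy as np

M, K, m = 4, 3, 2
N = K * M
rng = np.random.default_rng(1)
h = (rng.standard_normal(3) + 1j * rng.standard_normal(3)) / np.sqrt(2)  # 3-tap channel
L = h.size - 1                                                           # channel memory

# Cyclic prefix of length L, linear convolution, then prefix removal.
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)
tx = np.concatenate([s[-L:], s])
rx = np.convolve(tx, h)[L : L + N]

# Same result through the frequency domain, i.e. circular convolution.
circ = np.fft.ifft(np.fft.fft(h, N) * np.fft.fft(s))
print(np.allclose(rx, circ))

# Demapped channel of user m: R^T Lambda_hat R is diagonal.
Lam_hat = np.diag(np.fft.fft(h, N))
R = np.zeros((N, M))
R[(m - 1) * M : m * M, :] = np.eye(M)
Lam = R.T @ Lam_hat @ R
print(np.allclose(Lam, np.diag(np.diag(Lam))))
```

The diagonal of `Lam` is simply the slice of the N-point channel DFT that falls on the user's sub-carriers.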

Let \(\boldsymbol {\mathcal {Z}}=\text {diag}(\boldsymbol {\mathcal {Y}})\) and denote the frequency-domain feedforward and feedback filter coefficients of AFD-DFE as \(\boldsymbol {\mathcal {F}}\) and \(\boldsymbol {\mathcal {B}}\), respectively. The output of the equalizer in the frequency domain at instant k is given by \( \boldsymbol {\check {\mathcal {X}}}_{k}=\left [\boldsymbol {\mathcal {Z}}_{k} \quad \boldsymbol {\mathcal {D}}_{k}\right ]\boldsymbol {\mathcal {W}}_{k-1},\) where \(\boldsymbol {\mathcal {W}}_{k} = \left [ \begin {array}{cc} \boldsymbol {\mathcal {F}}_{k} \\ \boldsymbol {\mathcal {B}}_{k} \\ \end {array} \right ]\).

Explicit knowledge of the filter coefficients is not needed for the development of the adaptive solution. The decision matrix is \(\boldsymbol {\mathcal {D}}_{k} =\text {diag}(\mathcal {D}(0),\ldots, \mathcal {D}(M-1))\), with diagonal elements equal to \(\boldsymbol {F}_{M}\boldsymbol {x}_{k}\) in training mode and \(\boldsymbol {F}_{M}\boldsymbol {\hat {x}}_{k}\) in decision-directed mode, where \(\boldsymbol {\hat {x}}_{k}\) is the time-domain decision on \(\boldsymbol {\check {\mathcal {X}}}_{k}\).


In the above description, perfect frequency synchronization has been assumed between the transmitter and the receiver. However, CFO arises in practical SC-FDMA systems due to transmitter/receiver frequency oscillators’ misalignment and causes interference (energy leakage) from neighboring sub-carriers.

Algorithm development

Let the mth user’s CFO normalized by the sub-carrier spacing be denoted by \(\Omega _{m}\), where \(-0.5\leq \Omega _{m}\leq 0.5\). After applying the N-point DFT, the received signal under the effect of CFO is given by \(\boldsymbol {\mathcal {\acute {Y}}}= \sum _{m=1}^{K}\boldsymbol {\mathcal {C}}^{(m)}\boldsymbol {\hat {\Lambda }}^{(m)}\boldsymbol {R}^{(m)}\boldsymbol {\mathcal {X}}^{(m)}+\boldsymbol {\mathcal {N}} \), where \(\boldsymbol {\mathcal {C}}^{(m)}\) is a circulant matrix with entries \(\boldsymbol {\mathcal {C}}^{(m)}_{p,q}=\frac {1}{N} \sum _{n=0}^{N-1}e^{j2\pi (\Omega _{m}+p-q)n/N}, \ p,q=1,\ldots,N\). It is important to note that the channel matrix \(\boldsymbol {\mathcal {C}}^{(m)}\boldsymbol {\hat {\Lambda }}^{(m)}\) has the structure shown in Fig. 1, in which most of the energy is concentrated in the three main diagonals. We assume that all entries outside the three main diagonals are zero, and based on this structure we formulate our three-tap equalizer in the frequency domain. There is indeed some energy in the upper-right and lower-left corners in Fig. 1. However, we ignore that energy as an approximation in order to obtain a low-complexity solution, which cannot be achieved if this energy is included in the mathematical model. The energy in this region is not significant relative to that of the main diagonal, and even when it is ignored, the performance is as good as in the case without CFO (as will be shown later in the simulation section).

Fig. 1

Structure of the (normalized) matrix \(\boldsymbol {\mathcal {C}}^{(m)}\boldsymbol {\hat {\Lambda }}^{(m)}\)
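The energy concentration that motivates the three-tap design can be examined directly from the definition of the circulant CFO matrix. The sketch below builds the matrix entry by entry for an assumed size N=64 and an assumed offset of 0.2 (neither value comes from the paper), and measures how much of one row's energy falls on the main diagonal and its two neighbours; `cfo_matrix` is a hypothetical helper name.

```python
import numpy as np

def cfo_matrix(omega, N):
    """Circulant CFO matrix with entries (1/N) * sum_n exp(j*2*pi*(omega + p - q)*n/N)."""
    p = np.arange(N)
    d = omega + p[:, None] - p[None, :]               # omega + p - q for every entry
    n = np.arange(N)
    return np.exp(2j * np.pi * d[:, :, None] * n / N).sum(axis=2) / N

N, omega = 64, 0.2                                    # illustrative values only
C = cfo_matrix(omega, N)

r = N // 2                                            # an interior row, away from the corners
row_energy = np.abs(C[r]) ** 2
three_tap_fraction = row_energy[r - 1 : r + 2].sum() / row_energy.sum()
print(three_tap_fraction)
```

For this offset, more than 90% of the row energy lies on the three main diagonals, and the fraction shrinks as the offset approaches ±0.5. The matrix is also unitary, so each row has unit total energy.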

After demapping and ignoring the superscript m, \(\boldsymbol {\mathcal {Y}}= [\!\mathcal {Y}(0),\mathcal {Y}(1), \ldots, \mathcal {Y}(M-1)]^{T}\). Assuming that the equalizer tap matrix has a structure similar to that of the channel matrix, i.e., only the three main diagonals are non-zero, the output of the three-tap AFD-DFE is given by

$$ {\begin{aligned} &\left[ \begin{array}{c} \check{\mathcal{X}}(0)\\ \check{\mathcal{X}}(1)\\ \check{\mathcal{X}}(2)\\ \vdots\\ \check{\mathcal{X}}(M-1)\\ \end{array}\right] = \left[ \begin{array}{cccccc} \mathcal{F}_{1}(0)&\mathcal{F}_{2}(0)& \\ \mathcal{F}_{1}(1)&\mathcal{F}_{2}(1)&\mathcal{F}_{3}(1) \\ &\mathcal{F}_{1}(2)&\ddots&\ddots \\ &&\ddots&\ddots&\mathcal{F}_{3}(M-2) \\ &&&\mathcal{F}_{1}(M-1)&\mathcal{F}_{2}(M-1) \\ \end{array}\right]\end{aligned}} $$
$$ {\begin{aligned} &~\left[ \begin{array}{c} \mathcal{Y}(0)\\ \mathcal{Y}(1)\\ \mathcal{Y}(2)\\ \vdots\\[1em] \mathcal{Y}(M-1) \\ \end{array}\right] + \left[ \begin{array}{cccccc} \mathcal{B}(0) \\ &\mathcal{B}(1) \\ &&\mathcal{B}(2) \\ &&&\ddots \\ &&&&&\mathcal{B}(M-1) \\ \end{array}\right] \left[ \begin{array}{c} \mathcal{D}(0)\\ \mathcal{D}(1)\\ \mathcal{D}(2)\\ \vdots\\ \mathcal{D}(M-1)\\ \end{array}\right] \end{aligned}} $$

where \(\mathcal {F}_{i}(j)\) and \(\mathcal {B}(j)\) represent the tap coefficients of the feedforward and feedback filters, respectively. Denoting \(\boldsymbol {\mathcal {U}}_{i}=\left [{\mathcal {Y}}(i-1)\ {\mathcal {Y}}(i) \ {\mathcal {Y}}(i+1)\right ]\) for \(i=1,2,\ldots,M-2\), \({\boldsymbol {\mathcal {U}}}_{0}=\left [{\mathcal {Y}}(0)\ {\mathcal {Y}}(1)\right ]\) and \(\boldsymbol {\mathcal {U}}_{M-1}=\left [{\mathcal {Y}}(M-2)\ {\mathcal {Y}}(M-1)\right ]\), (2) can be set up alternatively as

$${\begin{aligned} \left[ \begin{array}{c} \check{\mathcal{X}}(0)\\ \check{\mathcal{X}}(1)\\ \check{\mathcal{X}}(2)\\ \vdots\\ \check{\mathcal{X}}(M-1)\\ \end{array}\right] &= \left[ \small \begin{array}{llllll} \boldsymbol{\mathcal{U}}_{0}& \\ & \boldsymbol{\mathcal{U}}_{1}& \\ && \boldsymbol{\mathcal{U}}_{2}&\\ &&&\ddots& \\ &&&& \boldsymbol{\mathcal{U}}_{M-1}& \\ \end{array}\right] \left[ \small \begin{array}{c} \mathcal{F}_{1}(0)\\ \mathcal{F}_{2}(0)\\ \mathcal{F}_{1}(1)\\ \vdots\\ \mathcal{F}_{2}(M-1)\\ \end{array}\right]\\ &+ \left[ \small \begin{array}{cccccc} \mathcal{D}(0) \\ &\mathcal{D}(1) \\ &&\mathcal{D}(2) \\ &&&\ddots \\ &&&&&\mathcal{D}(M-1) \\ \end{array}\right] \left[ \small \begin{array}{c} \mathcal{B}(0) \\ \mathcal{B}(1)\\ \mathcal{B}(2)\\ \vdots\\ \mathcal{B}(M-1) \\ \end{array}\right] \end{aligned}} $$

which can be written in compact notation, at instant k, as

$$\begin{array}{*{20}l} \boldsymbol{\check{\mathcal{X}}}_{k}=\boldsymbol{\mathcal{Z}}_{k} \boldsymbol{\mathcal{F}}_{k-1}+\boldsymbol{\mathcal{D}}_{k} \boldsymbol{\mathcal{B}}_{k-1} \end{array} $$

where \(\boldsymbol {\mathcal {F}}_{k-1}\) and \(\boldsymbol {\mathcal {B}}_{k-1}\) have sizes (3M−2)×1 and M×1, respectively. The block diagram of the AFD-DFE is shown in Fig. 2.
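The equivalence between the banded form in (2) and the stacked form, in which the received samples are arranged into a block-diagonal matrix and the taps into a vector of length 3M−2, can be verified with a small sketch. The helper names `tap_slices`, `build_Z` and `banded_from_taps`, the size M=6, and the random test data are assumptions introduced for this example.

```python
import numpy as np

def tap_slices(M):
    """Positions in the stacked tap vector that belong to each sub-carrier (2, 3, ..., 3, 2)."""
    sizes = [2 if i in (0, M - 1) else 3 for i in range(M)]
    starts = np.cumsum([0] + sizes[:-1])
    return [slice(int(s), int(s) + n) for s, n in zip(starts, sizes)]

def build_Z(Y):
    """M x (3M-2) matrix whose i-th row carries the row vector U_i."""
    M = Y.size
    Z = np.zeros((M, 3 * M - 2), dtype=complex)
    for i, sl in enumerate(tap_slices(M)):
        Z[i, sl] = Y[max(i - 1, 0) : min(i + 1, M - 1) + 1]
    return Z

def banded_from_taps(F):
    """Tridiagonal M x M feedforward matrix filled row by row from the stacked taps."""
    M = (F.size + 2) // 3
    T = np.zeros((M, M), dtype=complex)
    for i, sl in enumerate(tap_slices(M)):
        T[i, max(i - 1, 0) : min(i + 1, M - 1) + 1] = F[sl]
    return T

M = 6
rng = np.random.default_rng(2)
cplx = lambda n: rng.standard_normal(n) + 1j * rng.standard_normal(n)
Y, D, F, B = cplx(M), cplx(M), cplx(3 * M - 2), cplx(M)

X_banded = banded_from_taps(F) @ Y + B * D        # banded form: taps multiply Y and D
X_stacked = build_Z(Y) @ F + np.diag(D) @ B       # stacked form: data matrices multiply taps
print(np.allclose(X_banded, X_stacked))
```

The stacked form is the one that suits adaptation, because the unknown taps appear as a single vector multiplied by known data.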

Fig. 2

Block diagram of the proposed AFD-DFE

To cancel out the pre- and post-cursors but not the desired component, we resort to a constrained approach similar to that in [14]. To explain this, let the time-domain feedback filter coefficients be \(b_{0},b_{1},\ldots,b_{L}\). To prevent the contribution of the present decided symbol, we set \(b_{0}=0\). In the frequency domain, this constraint translates to \(\sum _{i=0}^{M-1}\mathcal {B}(i)=0\). Ultimately, the cost function of the proposed algorithm is formulated as follows:

$$\begin{array}{*{20}l} J(i)=E\left[|\mathcal{D}(i)-\check{\mathcal{X}}(i)|^{2}\right]+2Re\left[\alpha^{*}\sum_{i=0}^{M-1}\mathcal{B}(i)\right] \end{array} $$

where α is the Lagrange multiplier and * denotes the conjugate operation. Following the approach of [18], the RLS update using (4) results in

$$\begin{array}{*{20}l} \boldsymbol {\mathcal{W}}_{k+1} = \boldsymbol {\mathcal{W}}_{k}+ \boldsymbol{\mathcal{P}}_{k+1}\left(\boldsymbol{\mathcal{A}}_{k+1}^{H}\boldsymbol{\mathcal{E}}_{k+1}-\alpha_{k+1}\boldsymbol{G}^{T}\right) \end{array} $$

where \(\boldsymbol {\mathcal {P}}_{k+1}\) is a 2M×2M matrix given by

$$ {\begin{aligned} \boldsymbol{\mathcal{P}}_{k+1}&=\lambda^{-1}\left[\boldsymbol{\mathcal{P}}_{k}-\lambda^{-1}\boldsymbol{\mathcal{P}}_{k}\boldsymbol{\mathcal{A}}^{H}_{k+1}\left(\boldsymbol{I}_{2M}+\lambda^{-1}\boldsymbol{\mathcal{A}}_{k+1}\boldsymbol{\mathcal{P}}_{k}\boldsymbol{\mathcal{A}}_{k+1}^{H}\right)^{-1}\right.\\&\quad\left.\boldsymbol{\mathcal{A}}_{k+1}\boldsymbol{\mathcal{P}}_{k}{\vphantom{\frac{1}{2}}}\right], \end{aligned}} $$
$$\begin{array}{*{20}l} \boldsymbol {\mathcal{A}}_{k} = \left[\!\! \begin{array}{ccc} &\boldsymbol{\mathcal{Z}}_{k} &\boldsymbol{0} \\ &\boldsymbol{0} &\boldsymbol{\mathcal{D}}_{k} \\ \end{array}\right] \end{array} $$

and \(\boldsymbol {\mathcal {E}}_{k} = \left [ \begin {array}{c} \boldsymbol {D}_{k}-\boldsymbol {\check {\mathcal {X}}}_{k} \\ \boldsymbol {D}_{k}-\boldsymbol {\check {\mathcal {X}}}_{k} \\ \end {array}\right ]\) where D k contains the diagonal elements of \(\boldsymbol {\mathcal {D}}_{k}\) and λ is a forgetting factor chosen close to 1. Finally, G is given by G=[0 M 1 M ], where 0 M and 1 M are all zero and all ones row vectors of size M, respectively. Initially, \(\boldsymbol {\mathcal {W}}_{0} =\boldsymbol 0\), \(\boldsymbol {\mathcal {P}}_{0}=\text {diag}\left (\epsilon _{\mathcal {F}}^{-1}\boldsymbol {I}_{M} \ \epsilon _{\mathcal {B}}^{-1}\boldsymbol {I}_{M}\right)\), and α k is updated according to

$$\begin{array}{*{20}l} \alpha_{k+1}=\alpha_{k} + \mu \sum_{i=0}^{M-1}\mathcal{B}_{k}(i) \end{array} $$

where μ is a step size.
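The frequency-domain form of the feedback constraint used in the cost function above follows from a basic DFT identity: the sum of the M frequency-domain coefficients equals M times the zeroth time-domain coefficient. The short check below uses an assumed block size M=8 and random coefficients purely for illustration.

```python
import numpy as np

M = 8
rng = np.random.default_rng(3)

# Time-domain feedback taps with the current-symbol tap forced to zero.
b = rng.standard_normal(M) + 1j * rng.standard_normal(M)
b[0] = 0.0
B = np.fft.fft(b)
constraint = B.sum()                 # should vanish because b[0] = 0
print(abs(constraint))

# Conversely, for arbitrary frequency-domain taps the zeroth time-domain tap is their mean.
B_free = rng.standard_normal(M) + 1j * rng.standard_normal(M)
b0 = np.fft.ifft(B_free)[0]
print(np.isclose(b0, B_free.mean()))
```

This is why the running sum of the feedback taps is a natural quantity to feed into the multiplier update: it measures how far the current taps are from satisfying the constraint.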

Reduced-complexity three-tap AFD-DFE

In the following, we show that the computational complexity is significantly reduced and that no matrix inversion is required. We can write \(\boldsymbol {\mathcal {P}}_{k+1} = \text {diag}\left (\left [\boldsymbol {P}_{k+1}^{1} \quad \boldsymbol {P}_{k+1}^{2}\right ]\right)\), where \(\boldsymbol {P}_{k+1}^{1}\) and \(\boldsymbol {P}_{k+1}^{2}\) are (3M−2)×(3M−2) and M×M matrices, respectively. Starting with k=0 and using \(\boldsymbol {P}_{0}^{1}=\epsilon ^{-1}\boldsymbol {I}_{3M-2}\), \(\boldsymbol {P}_{1}^{1}\) is given by

$$\begin{array}{*{20}l} \boldsymbol{P}_{1}^{1} &=\lambda^{-1}\left[\vphantom{\sum}\epsilon^{-1}\boldsymbol{I}_{3M-2}-\lambda^{-1}\epsilon^{-1}\boldsymbol{I}_{3M-2}\boldsymbol{\mathcal{Z}}_{1}^{H}\right.\\ &\left.\quad\times\left(\boldsymbol{I}_{M}+\lambda^{-1}\epsilon^{-1}\boldsymbol{\mathcal{Z}}_{1}\boldsymbol{\mathcal{Z}}_{1}^{H}\right)^{-1}\boldsymbol{\mathcal{Z}}_{1}\epsilon^{-1}\boldsymbol{I}_{3M-2}\right] \end{array} $$

It can easily be seen that \(\boldsymbol {\mathcal {Z}}_{1}\boldsymbol {\mathcal {Z}}_{1}^{H} = \text {diag}\left [|\boldsymbol {\mathcal {U}}_{0,1}|^{2}|\boldsymbol {\mathcal {U}}_{1,1}|^{2} \ \ldots \ |\boldsymbol {\mathcal {U}}_{M-1,1}|^{2}\right ]\) and (9) does not require matrix inversion. Now,

$$\begin{array}{*{20}l} {}\boldsymbol{\mathcal{Z}}_{1}^{H}\left(\boldsymbol{I}_{M}+\lambda^{-1}\epsilon^{-1}\boldsymbol{\mathcal{Z}}_{1} \boldsymbol{\mathcal{Z}}_{1}^{H}\right)^{-1}\boldsymbol{\mathcal{Z}}_{1}= \text{diag}(\textbf{\o}_{0}, \textbf{\o}_{1}, \ldots, \textbf{\o}_{M-1}) \end{array} $$

where the entries \(\textbf {\o }_{i}\) \((i=1,\ldots,M-2)\) are 3×3 matrices and \(\textbf {\o }_{i}\) \((i=0,M-1)\) are 2×2 matrices given by \(\textbf {\o }_{i}=\boldsymbol {\mathcal {U}}_{i,1}^{H}\left (1+\lambda ^{-1}\epsilon ^{-1}|\boldsymbol {\mathcal {U}}_{i,1}|^{2}\right)^{-1}\boldsymbol {\mathcal {U}}_{i,1},\ i=0,\ldots,M-1\). Now, \(\boldsymbol {P}_{1}^{1}\) has the following structure, \( \boldsymbol {P}_{1}^{1}=\text {diag}\left (\boldsymbol {P}_{1,0}^{1} \, \ \boldsymbol {P}_{1,1}^{1} \, \ \ldots \,\ \boldsymbol {P}_{1,M-1}^{1}\right)\), where \(\boldsymbol {P}_{1,i}^{1}=\lambda ^{-1}\left [\epsilon ^{-1}\boldsymbol {I}_{d}-\lambda ^{-1}\epsilon ^{-2}\textbf {\o }_{i}\right ]\), with d=2 for i=0,M−1 and d=3 for i=1,…,M−2. For k=1, we have

$$ {{} {\begin{aligned} \boldsymbol{P}_{2}^{1}=\lambda^{-1}\left[\boldsymbol{P}_{1}^{1}-\lambda^{-1}\boldsymbol{P}_{1}^{1}\boldsymbol{\mathcal{Z}}_{2}^{H} \left(\boldsymbol{I}_{M}\,+\,\lambda^{-1}\boldsymbol{\mathcal{Z}}_{2}\boldsymbol{P}_{1}^{1}\boldsymbol{\mathcal{Z}}_{2}^{H}\right)^{-1}\boldsymbol{\mathcal{Z}}_{2}\boldsymbol{P}_{1}^{1}\right] \end{aligned}}} $$

where \(\boldsymbol {\mathcal {Z}}_{2}\boldsymbol {P}_{1}^{1}\boldsymbol {\mathcal {Z}}_{2}^{H}=\text {diag}\left [\boldsymbol {\mathcal {U}}_{0,2}\boldsymbol {P}_{1,0}^{1}\boldsymbol {\mathcal {U}}_{0,2}^{H}\quad \boldsymbol {\mathcal {U}}_{1,2}\boldsymbol {P}_{1,1}^{1}\boldsymbol {\mathcal {U}}_{1,2}^{H} \ldots \boldsymbol {\mathcal {U}}_{M-1,2}\boldsymbol {P}_{1,M-1}^{1}\boldsymbol {\mathcal {U}}_{M-1,2}^{H}\right ]\) and each \(\boldsymbol {\mathcal {U}}_{i,2}\boldsymbol {P}_{1,i}^{1}\boldsymbol {\mathcal {U}}_{i,2}^{H}\) is a scalar quantity; therefore, the matrix inversion reduces to M scalar inversions. For k>1, \(\boldsymbol {P}_{k}^{1}\) keeps the same block-diagonal structure as in the two steps above and, therefore, matrix inversion is avoided. Moreover, it can be shown that \(\boldsymbol {P}_{k}^{2}\) has a diagonal structure for all k. After finding \(\boldsymbol {\mathcal {P}}_{k}\), the weights of the AFD-DFE are updated using (5).
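The argument above can be checked numerically by comparing the recursion for the feedforward part with a full matrix inverse against a block-wise version that uses only one scalar division per sub-carrier. This is a sketch under assumed values (M=6, forgetting factor 0.99, initialization constant 0.1, random received blocks), and the helper names are hypothetical.

```python
import numpy as np

def tap_slices(M):
    """Index ranges of the 2, 3, ..., 3, 2 tap groups inside the stacked vector."""
    sizes = [2 if i in (0, M - 1) else 3 for i in range(M)]
    starts = np.cumsum([0] + sizes[:-1])
    return [slice(int(s), int(s) + n) for s, n in zip(starts, sizes)]

def build_Z(Y):
    """M x (3M-2) data matrix whose i-th row carries U_i."""
    M = Y.size
    Z = np.zeros((M, 3 * M - 2), dtype=complex)
    for i, sl in enumerate(tap_slices(M)):
        Z[i, sl] = Y[max(i - 1, 0) : min(i + 1, M - 1) + 1]
    return Z

def full_update(P, Z, lam):
    """Recursion for P with an explicit M x M matrix inverse."""
    M = Z.shape[0]
    inner = np.linalg.inv(np.eye(M) + Z @ P @ Z.conj().T / lam)
    return (P - P @ Z.conj().T @ inner @ Z @ P / lam) / lam

def blockwise_update(P, Z, lam):
    """Same recursion, one small block and one scalar division per sub-carrier."""
    P_new = np.zeros_like(P)
    for i, sl in enumerate(tap_slices(Z.shape[0])):
        U = Z[i, sl][None, :]                          # 1 x d row vector U_i
        Pi = P[sl, sl]                                 # d x d block of P
        g = 1.0 + (U @ Pi @ U.conj().T)[0, 0] / lam    # scalar to invert
        P_new[sl, sl] = (Pi - Pi @ U.conj().T @ U @ Pi / (lam * g)) / lam
    return P_new

M, lam, eps = 6, 0.99, 0.1                             # illustrative values only
rng = np.random.default_rng(4)
P_full = np.eye(3 * M - 2, dtype=complex) / eps
P_blk = P_full.copy()
for _ in range(3):                                     # a few successive blocks
    Y = rng.standard_normal(M) + 1j * rng.standard_normal(M)
    Z = build_Z(Y)
    P_full = full_update(P_full, Z, lam)
    P_blk = blockwise_update(P_blk, Z, lam)
print(np.allclose(P_full, P_blk))
```

Because every row of the data matrix touches only its own tap group, the block-diagonal structure is preserved from one iteration to the next, which is what keeps the recursion free of matrix inversions.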

Integration with SFBC

An attractive technique for spatial diversity is the conventional space-frequency block code (C-SFBC) strategy [6]. In the presence of CFO and high Doppler, severe ICI from adjacent carriers occurs, which destroys the Alamouti structure and results in performance degradation. The authors in [9] propose an embedded SFBC (E-SFBC) which preserves the Alamouti structure even when there is ICI and, unlike C-SFBC, does not affect the low PAPR property of SC-FDMA. For the design of our AFD-DFE, we implement the E-SFBC at the block level without using pilots. In the E-SFBC, we define \(\boldsymbol {\mathcal {X}}^{(m)}_{1}=\left [\mathcal {X}(0)^{(m)},\mathcal {X}(2)^{(m)},\ldots,\mathcal {X}(M-2)^{(m)}\right ]^{T}\) and \(\boldsymbol {\mathcal {X}}^{(m)}_{2}=\left [\mathcal {X}(1)^{(m)},\mathcal {X}(3)^{(m)},\ldots, \mathcal {X}(M-1)^{(m)}\right ]^{T}\), i.e., \(\boldsymbol {\mathcal {X}}^{(m)}\) is divided into two blocks. The sequences used to transmit these sub-blocks are \(\boldsymbol {\mathcal {\acute {X}}}_{1}=\left [\begin {array}{cc} \boldsymbol {\mathcal {X}}^{(m)}_{1} \\ -\boldsymbol {\mathcal {X}}^{*(m)}_{2} \end {array} \right ]\) and \(\boldsymbol {\mathcal {\acute {X}}}_{2}=\left [\begin {array}{cc} \boldsymbol {\mathcal {X}}^{(m)}_{2} \\ \boldsymbol {\mathcal {X}}^{*(m)}_{1} \end {array} \right ]\) for antennas 1 and 2, respectively. After mapping and applying the N-point IDFT, the transmitted signals from the two antennas are \(\boldsymbol {s}^{(m)}_{1}\) and \(\boldsymbol {s}^{(m)}_{2}\), corresponding to \(\boldsymbol {\mathcal {\acute {X}}}^{(m)}_{1}\) and \(\boldsymbol {\mathcal {\acute {X}}}^{(m)}_{2}\), respectively. The transmitted signals are circularly convolved with their respective channels, and the received signal, after applying the N-point DFT, becomes \(\boldsymbol {\mathcal {\acute {Y}}}= \sum _{m=1}^{K}\left \{\boldsymbol {\hat {\Lambda }}_{1}^{(m)}\boldsymbol {R}^{(m)}\boldsymbol {\mathcal {\acute {X}}}_{1} +\boldsymbol {\hat {\Lambda }}_{2}^{(m)}\boldsymbol {R}^{(m)}\boldsymbol {\mathcal {\acute {X}}}_{2}\right \}+\boldsymbol {\mathcal {N}} \), where \(\boldsymbol {\hat {\Lambda }}_{i}^{(m)}\) is an N×N diagonal matrix, i.e., \(\boldsymbol {\hat {\Lambda }}_{i}^{(m)}=\text {diag}\left (\text {DFT}\left (\boldsymbol {h}_{i}^{(m)}\right)\right)\) for i=1,2, and \(\boldsymbol {\mathcal {N}}\) is the noise component with covariance \(\sigma ^{2}_{\mathcal {N}}\boldsymbol {I}_{N}\). The received signal for the mth user, after demapping, is expressed as

$$\begin{array}{*{20}l} \boldsymbol{\mathcal{Y}}^{(m)}&= \boldsymbol{R}^{(m)T}\boldsymbol{\hat\Lambda}_{1}^{(m)}\boldsymbol{R}^{(m)}\left[\boldsymbol{\mathcal{X}}^{(m)}_{1}\ -\boldsymbol{\mathcal{X}}^{*(m)}_{2}\right]^{T}\\ &\quad+\boldsymbol{R}^{(m)T}\boldsymbol{\hat\Lambda}_{2}^{(m)}\boldsymbol{R}^{(m)}\left[\boldsymbol{\mathcal{X}}^{(m)}_{2}\ \boldsymbol{\mathcal{X}}^{*(m)}_{1}\right]^{T} +\boldsymbol{\mathcal{N}}^{(m)} \end{array} $$

Let \(\boldsymbol {\Lambda }_{i}^{(m)}=\boldsymbol {R}^{(m)T}\boldsymbol {\hat \Lambda }_{i}^{(m)}\boldsymbol {R}^{(m)}\) for i=1,2; then \(\boldsymbol {\Lambda }_{i}^{(m)}\) is an M×M diagonal matrix. To simplify the notation, we drop the superscript m and define \(\boldsymbol {\Lambda }_{1}=\text {diag}[\boldsymbol {\Lambda }_{11}\ \boldsymbol {\Lambda }_{22}]\) and \(\boldsymbol {\Lambda }_{2}=\text {diag}[\boldsymbol {\Lambda }_{12}\ \boldsymbol {\Lambda }_{21}]\). Now (12) can be written as

$$\begin{array}{@{}rcl@{}} \underbrace{\left[ \begin{array}{c} \boldsymbol{\mathcal{Y}_{1}} \\ \boldsymbol{\mathcal{Y}}^{*}_{2} \\ \end{array}\right]}_{\boldsymbol{\mathcal{Y}}} = \underbrace{\left[ \begin{array}{cc} \boldsymbol{\Lambda}_{11} &\boldsymbol{\Lambda}_{12} \\ \boldsymbol{\Lambda}^{*}_{21} &-\boldsymbol{\Lambda}^{*}_{22} \\ \end{array}\right]}_{\boldsymbol{\Lambda}} \underbrace{\left[ \begin{array}{c} \boldsymbol{\mathcal{X}_{1}}\ \\ \boldsymbol{\mathcal{X}_{2}} \\ \end{array}\right]}_{\boldsymbol{\mathcal{X}}_{12}} + \underbrace{\left[ \begin{array}{c} \boldsymbol{\mathcal{N}_{1}}\ \\ \boldsymbol{\mathcal{N}}^{*}_{2} \\ \end{array}\right]}_{\boldsymbol{\mathcal{N}}_{12}} \end{array} $$
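The step from the per-antenna model to the stacked form in (13) can be reproduced in a noise-free sketch. The block size M=8 and the random diagonal channels are assumptions for illustration. The last part additionally assumes that the two halves of each antenna's channel coincide, the condition under which the stacked channel matrix has orthogonal columns in the Alamouti sense.

```python
import numpy as np

M = 8
half = M // 2
rng = np.random.default_rng(5)
cplx = lambda n: rng.standard_normal(n) + 1j * rng.standard_normal(n)

X = cplx(M)
X1, X2 = X[0::2], X[1::2]                       # even- and odd-indexed symbols
Xa1 = np.concatenate([X1, -X2.conj()])          # sequence sent from antenna 1
Xa2 = np.concatenate([X2, X1.conj()])           # sequence sent from antenna 2

lam1, lam2 = cplx(M), cplx(M)                   # diagonals of the two demapped channels
Y = lam1 * Xa1 + lam2 * Xa2                     # noise-free received block
Y1, Y2 = Y[:half], Y[half:]

L11, L22 = lam1[:half], lam1[half:]
L12, L21 = lam2[:half], lam2[half:]
Lam = np.block([[np.diag(L11), np.diag(L12)],
                [np.diag(L21.conj()), -np.diag(L22.conj())]])
lhs = np.concatenate([Y1, Y2.conj()])
rhs = Lam @ np.concatenate([X1, X2])
print(np.allclose(lhs, rhs))

# Assumed special case: identical channel halves (L22 = L11 and L21 = L12).
Lam_a = np.block([[np.diag(L11), np.diag(L12)],
                  [np.diag(L12.conj()), -np.diag(L11.conj())]])
G = Lam_a.conj().T @ Lam_a
print(np.allclose(G, np.diag(np.diag(G))))
```

In the special case the Gram matrix is diagonal, so the two symbol streams decouple after matched filtering.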

To preserve Alamouti’s structure, we must have \(\boldsymbol {\Lambda }_{11}=\boldsymbol {\Lambda }_{22}\) and \(\boldsymbol {\Lambda }_{12}=\boldsymbol {\Lambda }_{21}\). To achieve this, we introduce a reordering of the sub-carriers before mapping at the transmitter as \(\boldsymbol {O}\boldsymbol {\mathcal {\acute {X}}}_{1}\) and \(\boldsymbol {O}\boldsymbol {\mathcal {\acute {X}}}_{2}\), where \(\boldsymbol {O}=[\boldsymbol {I}_{1},\boldsymbol {I}_{M/2+1},\boldsymbol {I}_{2},\boldsymbol {I}_{M/2+2},\ldots,\boldsymbol {I}_{M/2},\boldsymbol {I}_{M}]\), and assume that the channel does not change over two consecutive sub-carriers. At the receiver side, the reordering is undone after demapping by using the matrix \(\boldsymbol {O}^{T}\). Under CFO, the channel matrices \(\boldsymbol {\Lambda }_{ij}\) in (13) lose their diagonal structure. We can approximate these matrices as banded (tridiagonal) matrices. Assuming the feedforward tap matrices have a structure similar to that of the channel matrices, the equalized signal can be written as

$$\begin{array}{@{}rcl@{}} \left[ \begin{array}{c} \boldsymbol{\check{\mathcal{X}}}_{1} \\ \boldsymbol{\check{\mathcal{X}}}_{2} \\ \end{array}\right] = \left[ \begin{array}{cc} \boldsymbol{\Phi}_{1} &\boldsymbol{\Phi}_{2} \\ \boldsymbol{\Phi}^{*}_{2} &-\boldsymbol{\Phi}^{*}_{1} \\ \end{array}\right] \left[ \begin{array}{c} \boldsymbol{\mathcal{Y}}_{1} \\ \boldsymbol{\mathcal{Y}}^{*}_{2} \\ \end{array}\right] + \left[ \begin{array}{cc} \boldsymbol{\vartheta}_{1} &\boldsymbol{0} \\ \boldsymbol{0} &\boldsymbol{\vartheta}^{*}_{2} \\ \end{array}\right] \left[ \begin{array}{c} \boldsymbol{D}_{1} \\ \boldsymbol{D}_{2} \\ \end{array}\right] \end{array} $$

where Φ i is a tri-diagonal matrix and 𝜗 i is a diagonal matrix. D 1 and D 2 are \(\boldsymbol {\mathcal {X}}_{1}\) and \(\boldsymbol {\mathcal {X}}_{2}\), respectively, for the training mode or frequency-domain decisions on \(\boldsymbol {\check {\mathcal {X}}}_{1}\) and \(\boldsymbol {\check {\mathcal {X}}}_{2}\), respectively, for the decision-directed mode. Next, denoting \(\boldsymbol {\mathcal {U}}_{i}=\left [\boldsymbol {\mathcal {Y}}(i-2)\ \boldsymbol {\mathcal {Y}}(i) \ \boldsymbol {\mathcal {Y}}(i+2)\right ]\) for i=2,…M−3, \(\boldsymbol {\mathcal {U}}_{i}=\left [\boldsymbol {\mathcal {Y}}(i)\ \boldsymbol {\mathcal {Y}}(i+2)\right ]\) for i=0,1 and \(\boldsymbol {\mathcal {U}}_{i}=\left [\boldsymbol {\mathcal {Y}}(i-2)\ \boldsymbol {\mathcal {Y}}(i)\right ]\) for i=M−1,M−2, we can write (14) as

$$\begin{array}{*{20}l} \underbrace{\left[ \begin{array}{c} \boldsymbol{\check{\mathcal{X}}}_{1} \\ \boldsymbol{\check{\mathcal{X}}}_{2}^{*} \\ \end{array}\right]}_{\boldsymbol{\check{\mathcal{X}}}_{12}} =&\underbrace{\left[ \begin{array}{cc} \boldsymbol{Z}_{0} &\boldsymbol{Z}^{*}_{1} \\ -\boldsymbol{Z}_{1} &\boldsymbol{Z}^{*}_{0}\\ \end{array}\right]}_{\boldsymbol{\mathcal{Z}}} \underbrace{\left[ \begin{array}{c} \boldsymbol{\Upsilon}_{1}\\ \boldsymbol{\Upsilon}_{2} \\ \end{array}\right]}_{\boldsymbol{\mathcal{F}}} \\&\quad+\underbrace{\left[ \begin{array}{cc} \text{diag}(\boldsymbol{D}_{1}) &\boldsymbol{0} \\ \boldsymbol{0} &\text{diag}(\boldsymbol{D}^{*}_{2})\\ \end{array}\right]}_{\boldsymbol{\mathcal{D}}} \underbrace{\left[ \begin{array}{c} \boldsymbol{\Psi}_{1}\\ \boldsymbol{\Psi}_{2} \\ \end{array}\right]}_{\mathcal{B}} \end{array} $$

where \(\boldsymbol {Z}_{j}=\text {diag}\left [\boldsymbol {\mathcal {U}}_{j}\ \boldsymbol {\mathcal {U}}_{j+2}\ \ldots \ \boldsymbol {\mathcal {U}}_{j+M-2} \right ]\) for j=0,1. Υ 1 and Υ 2 (Ψ 1 and Ψ 2) are the vectors containing the diagonal elements of Φ 1 and Φ 2 (𝜗 1 and 𝜗 2). Moreover, the feedforward and feedback filter coefficients in the frequency-domain are \(\boldsymbol {\mathcal {F}}\) and \(\boldsymbol {\mathcal {B}}\) containing the elements {Υ 1,Υ 2} and {Ψ 1,Ψ 2}, respectively. At the k th instant, the output of the equalizer is

$$\begin{array}{*{20}l} \boldsymbol{\check{\mathcal{X}}}_{12,k}=\boldsymbol{\mathcal{Z}}_{k} \boldsymbol{\mathcal{F}}_{k-1} +\boldsymbol{\mathcal{D}}_{k}\boldsymbol{\mathcal{B}}_{k-1} \end{array} $$

The RLS AFD-DFE recursion is given as in (5) with G=[0 1×(3M−2) 1 M ] and error vector \( \boldsymbol {\mathcal {E}}_{k} = \left [ \begin {array}{cc} \boldsymbol {D}_{k}-\boldsymbol {\check {\mathcal {X}}}_{12,k} \\ \boldsymbol {D}_{k}-\boldsymbol {\check {\mathcal {X}}}_{12,k} \\ \end {array}\right ] \), where D k denotes the decisions at the k th instant, i.e., \(\boldsymbol {D}_{k} = \left [ \begin {array}{cc} \boldsymbol {D}_{1,k} \\ \boldsymbol {D}^{*}_{2,k} \\ \end {array} \right ]\) and \(\boldsymbol {\mathcal {P}}_{k}\) and \(\boldsymbol {\mathcal {A}}_{k}\) as in (6) and (7), respectively. The block diagram of SFBC AFD-DFE is shown in Fig. 3.

Fig. 3

Block diagram of the proposed SFBC AFD-DFE

Reduced-complexity three-tap CRLS AFD-DFE

Now, exploiting the special structure of SFBC, it can be seen that no general matrix inversion is involved, and hence the complexity is significantly reduced. The matrix \(\boldsymbol {\mathcal {P}}_{k+1}\) has the structure \(\boldsymbol {\mathcal {P}_{k+1}} = \text {diag} \left (\left [\boldsymbol {P}_{k+1}^{1} \quad \boldsymbol {P}_{k+1}^{2} \right ] \right)\). Starting with k=0 and using \(\boldsymbol {P}_{0}^{1}=\epsilon ^{-1}\boldsymbol {I}_{3M-4}\), \(\boldsymbol {P}_{1}^{1}\) is given by

$$ {{} {\begin{aligned} \boldsymbol{P}_{1}^{1}&=\lambda^{-1}\left[\boldsymbol{P}_{0}^{1}-\lambda^{-1}\boldsymbol{P}_{0}^{1}\boldsymbol{\mathcal{Z}}_{1}^{H} \left(\boldsymbol{I}_{M}\,+\,\lambda^{-1}\boldsymbol{\mathcal{Z}}_{1}\boldsymbol{P}_{0}^{1}\boldsymbol{\mathcal{Z}}_{1}^{H}\right)^{-1}\boldsymbol{\mathcal{Z}}_{1}\boldsymbol{P}_{0}^{1}\right]\\ &=\lambda^{-1}\left[\epsilon^{-1}\boldsymbol{I}_{3M-4}-\lambda^{-1}\epsilon^{-1}\boldsymbol{I}_{3M-4}\boldsymbol{\mathcal{Z}}_{1}^{H}\right.\\ &\quad\left. \times (\boldsymbol{I}_{M}+\lambda^{-1}\epsilon^{-1}\boldsymbol{\mathcal{Z}}_{1}\boldsymbol{\mathcal{Z}}_{1}^{H})^{-1}\boldsymbol{\mathcal{Z}}_{1}\epsilon^{-1}\boldsymbol{I}_{3M-4}\right] \end{aligned}}} $$


$$ {{} {\begin{aligned} \boldsymbol{\mathcal{Z}}_{1}\boldsymbol{\mathcal{Z}}_{1}^{H} = \left[ \begin{array}{cc} \boldsymbol{Z}_{0,1} \boldsymbol{Z}_{0,1}^{H}+\boldsymbol{Z}_{1,1}^{*} \boldsymbol{Z}_{1,1}^{T} &-\boldsymbol{Z}_{0,1} \boldsymbol{Z}_{1,1}^{H}+\boldsymbol{Z}_{1,1}^{*} \boldsymbol{Z}_{0,1}^{T} \\ -\boldsymbol{Z}_{1,1} \boldsymbol{Z}_{0,1}^{H}+\boldsymbol{Z}_{0,1}^{*} \boldsymbol{Z}_{1,1}^{T} &-\boldsymbol{Z}_{1,1} \boldsymbol{Z}_{1,1}^{H}+\boldsymbol{Z}_{0,1}^{*} \boldsymbol{Z}_{0,1}^{T} \\ \end{array} \right]. \end{aligned}}} $$

It can easily be seen that \(\boldsymbol {Z}_{0} \boldsymbol {Z}_{0}^{H}+\boldsymbol {Z}_{1}^{*} \boldsymbol {Z}_{1}^{T}=\text {diag}\left [|\boldsymbol {\mathcal {U}}_{0,1}|^{2}+|\boldsymbol {\mathcal {U}}_{1,1}|^{2}\ |\boldsymbol {\mathcal {U}}_{2,1}|^{2}+|\boldsymbol {\mathcal {U}}_{3,1}|^{2} \ldots |\boldsymbol {\mathcal {U}}_{M-2,1}|^{2}+| \boldsymbol {\mathcal {U}}_{M-1,1}|^{2} \right ]\) is a diagonal matrix, and likewise for the other entries in (17). Therefore, \(\boldsymbol {\mathcal {Z}}_{1}\boldsymbol {\mathcal {Z}}_{1}^{H}\) is an M×M matrix made up of four \({M\over 2}\times {M\over 2}\) diagonal matrices. This structure allows us to easily find the inverse in (16) using block matrix inversion [21], where all the sub-matrices are diagonal. Now \(\boldsymbol {\mathcal {Z}}_{1}^{H}\left (\boldsymbol {I}_{M}+\lambda ^{-1}\epsilon ^{-1}\boldsymbol {\mathcal {Z}}_{1} \boldsymbol {\mathcal {Z}}_{1}^{H}\right)^{-1}\boldsymbol {\mathcal {Z}}_{1}=\left [ \begin {array}{cc} \textbf {\o }_{0}&\textbf {\o }_{1} \\ \textbf {\o }_{2}&\textbf {\o }_{3} \\ \end {array} \right ]= \textbf {\o }\) and \(\textbf {\o }_{i}=\text {diag}\left (\textbf {\o }_{i,0}\,\ \textbf {\o }_{i,1}\,\ \ldots \textbf {\o }_{i,{M\over 2}}\right)\), where the entries \(\textbf {\o }_{i,j}\left (j=1,\ldots,{M\over 2}-1\right)\) are 3×3 matrices and \(\textbf {\o }_{i,j}\left (j=0,{M\over 2}\right)\) are 2×2 matrices.

Now \(\boldsymbol {P}_{1}^{1}\) has a structure similar to that of ø, i.e., \(\boldsymbol {P}_{1}^{1}=\left [\begin {array}{cc} \boldsymbol {P}_{1,0}^{1} &\boldsymbol {P}_{1,1}^{1}\\ \boldsymbol {P}_{1,2}^{1} &\boldsymbol {P}_{1,3}^{1}\\ \end {array} \right ]\). Proceeding for k=1, we have

$$ {{} {\begin{aligned} \boldsymbol{P}_{2}^{1}=\lambda^{-1}\left[\boldsymbol{P}_{1}^{1}-\lambda^{-1}\boldsymbol{P}_{1}^{1}\boldsymbol{\mathcal{Z}}_{2}^{H} \left(\boldsymbol{I}_{M}\,+\,\lambda^{-1}\boldsymbol{\mathcal{Z}}_{2}\boldsymbol{P}_{1}^{1}\boldsymbol{\mathcal{Z}}_{2}^{H}\right)^{-1}\boldsymbol{\mathcal{Z}}_{2}\boldsymbol{P}_{1}^{1}\right] \end{aligned}}} $$

Let \(\boldsymbol {\mathcal {Z}}_{2}\boldsymbol {P}_{1}^{1}\boldsymbol {\mathcal {Z}}_{2}^{H}=\left [ \begin {array}{cc} \boldsymbol {\varphi }_{0}& \boldsymbol {\varphi }_{1} \\ \boldsymbol {\varphi }_{2}& \boldsymbol {\varphi }_{3} \\ \end {array} \right ]\) where \(\boldsymbol {\varphi }_{0}=\left (\boldsymbol {Z}_{0}\boldsymbol {P}_{1,0}^{1}+ \boldsymbol {Z}_{1}^{*}\boldsymbol {P}_{1,2}^{1}\right)\boldsymbol {Z}_{0}^{H}+ \left (\boldsymbol {Z}_{0}\boldsymbol {P}_{1,1}^{1}+\boldsymbol {Z}_{1}^{*}\boldsymbol {P}_{1,3}^{1}\right)\boldsymbol {Z}_{1}^{T}\), which is a diagonal matrix, and similarly for the other entries. Therefore, the inverse in (18) can be found easily, in the same way as for (17). For k>1, \(\boldsymbol {P}_{k}^{1}\) has the same structure as for k=1, and \(\boldsymbol {P}_{k}^{2}\) is a diagonal matrix for all k.
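The recursion in (18) has the form produced by the matrix inversion lemma, so it is equivalent to updating the inverse of \(\boldsymbol{P}\) as \(\lambda \boldsymbol{P}_{k}^{-1}+\boldsymbol{\mathcal{Z}}^{H}\boldsymbol{\mathcal{Z}}\). The following Python sketch checks this identity numerically in the scalar case only. The function name `p_update`, the values of λ and ε, and the regressor samples are assumptions made for illustration.

```python
def p_update(p, z, lam):
    """Scalar form of P_new = lam^-1 [P - lam^-1 P z^H (1 + lam^-1 z P z^H)^-1 z P]."""
    zh = z.conjugate()
    inner = 1.0 / (1.0 + (z * p * zh) / lam)   # the term that is a matrix inverse in (18)
    return (p - (p * zh * inner * z * p) / lam) / lam


lam, eps = 0.99, 0.01          # assumed forgetting factor and initialisation constant
p = 1.0 / eps                  # P_0 = eps^-1 (identity in the matrix case)
for z in (0.7 + 0.2j, -1.1 + 0.5j, 0.3 - 0.9j):
    p_new = p_update(p, z, lam)
    # Inversion-lemma identity: 1 / P_new = lam / P + |z|^2
    assert abs(1.0 / p_new - (lam / p + abs(z) ** 2)) < 1e-9
    p = p_new
```

In the block case, the scalar reciprocal becomes the M×M inverse whose diagonal sub-block structure is exploited above.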

In summary, the RLS algorithm is updated according to (5) whereas the steps describing the avoidance of the matrix inversion are detailed in the respective complexity reduction section of the SISO and SFBC scenarios.

Simulation results

Similar to an LTE system, the carrier frequency and bandwidth are set to 2 GHz and 5 MHz, respectively. Other simulation parameters are M=16 and N=512; therefore, the maximum number of users that the system can support is K=N/M=32. The modulation scheme used is 16-QAM, and the channel is frequency selective with 12 paths, each of which fades independently according to the Rayleigh distribution. Our work is similar to [16] when it comes to stopping criteria; therefore, convergence analysis is not presented here.

Figures 4 and 5 show that the three-tap AFD-DFE outperforms the one-tap AFD-DFE in a large-CFO scenario, from which we conclude that the three-tap AFD-DFE is robust to ICI. In the simulations, channel coding is implemented using a nonsystematic rate-1/2 convolutional code (CC) with octal generators (133,171) and a constraint length of 7. Only hard decisions are used in the feedback section, whereas log-likelihood ratios (LLRs) and a MAP decoder are used after the equalizer. Practical (correct decision feedback and known channel) and impractical (error decision feedback with known channel) MMSE-DFE are also shown. The AFD-DFE is slightly better than the practical DFE since no correlation estimation [14] is required here. For the SFBC SC-FDMA system, independent 12-path Rayleigh fading channels are used for each transmit/receive antenna pair. Figure 5 also shows that using a super-block with L=4 [9] improves the performance significantly.
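For readers unfamiliar with the channel code mentioned above, the following Python sketch encodes a bit sequence with the rate-1/2, constraint-length-7 convolutional code with octal generators (133,171). The bit-ordering convention (most significant generator bit taps the current input bit), the absence of termination bits, and the function name `conv_encode` are assumptions of this sketch; the simulation code used for the figures is not described at this level of detail.

```python
K = 7                      # constraint length
GENERATORS = (0o133, 0o171)  # octal generator polynomials


def conv_encode(bits):
    """Rate-1/2 nonsystematic convolutional encoder.

    The shift register holds the K most recent input bits, newest in the
    most significant position. Each input bit yields two output bits.
    """
    state = 0
    out = []
    for b in bits:
        # Shift the register and insert the new bit at the top.
        state = ((state >> 1) | (b << (K - 1))) & ((1 << K) - 1)
        for g in GENERATORS:
            # Output bit is the parity of the tapped register positions.
            out.append(bin(state & g).count("1") % 2)
    return out


# Impulse response: a single 1 followed by K-1 zeros reads out both generators.
impulse = conv_encode([1, 0, 0, 0, 0, 0, 0])
```

Under the stated convention, the even-indexed outputs of `impulse` reproduce the binary digits of 133 (octal) and the odd-indexed outputs those of 171 (octal).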

Fig. 4

SISO SC-FDMA system, user’s velocity 300 km/h

Fig. 5

SFBC SC-FDMA system, user’s velocity 300 km/h

To further investigate the robustness of the proposed technique, a five-tap filter is used and compared with the three-tap scenario. Using a five-tap filter also avoids matrix inversion if the approach described in this paper is followed. However, both filters attain similar performance, as shown in Fig. 6, so the additional taps bring no appreciable gain. In addition, Fig. 1 confirms that most of the energy lies in the three main diagonals; therefore, using more than three diagonals will not improve the performance significantly. Furthermore, a five-tap filter increases the computational complexity compared to the three-tap filter. Hence, the three-tap filter offers a better compromise between performance and complexity than either the one-tap or the five-tap filter.

Fig. 6

Comparison of three-tap AFD-DFE with five-tap AFD-DFE

To complete the discussion on computational complexity, the algorithm is compared with and without matrix inversion. For the matrix inversion, the approach given in [22] is followed. Let the matrix to be inverted be denoted by \(\mathbf{Q}\), of size M×M. First, the LU decomposition of \(\mathbf{Q}\) is performed as follows

$$ {\begin{aligned} \text{Compute} \qquad \mathbf{Q}&=\mathbf{E}\mathbf{Q};&\\ \mathrm{for \ loop} \qquad \qquad i&=1:M-1\\ \qquad \qquad \text{rows}&=i+1:M;\\ \qquad \qquad \mathbf{Q}(\text{rows},i)&=\mathbf{Q}(\text{rows},i)/\mathbf{Q}(i,i);\\ \qquad \qquad \mathbf{Q}(\mathrm{rows,rows})&=\mathbf{Q}(\mathrm{rows,rows})-\mathbf{Q}(\text{rows},i)\mathbf{Q}(i,\text{rows});\\ \text{end}\qquad \qquad \quad \qquad& \end{aligned}} $$

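where \(\mathbf{E}\) is a permutation matrix required for numerical stability [22]. Upon completion, \(\mathbf{Q}\) holds the upper triangular matrix \(\mathbf{U}\) on and above its diagonal and the lower triangular matrix \(\mathbf{L}\) below it, where the diagonal of \(\mathbf{L}\) is taken to be all ones [22]. Here, the LU decomposition requires \({(M-1)^{2}M\over 2}+{(M^{2}-M)(2M^{2}+6M+1)\over 6}+M^{3}\) multiplications. Finally, using \(\mathbf{Q}\mathbf{Q}^{-1}=\mathbf{I}_{M}\), where \(\mathbf{Q}^{-1}=[\mathbf{q}_{1},\mathbf{q}_{2},\ldots,\mathbf{q}_{M}]\) and \(\mathbf{I}_{M}=[\mathbf{I}_{1},\mathbf{I}_{2},\ldots,\mathbf{I}_{M}]\) are column partitions, the inversion is performed as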

$$ \begin{aligned} \mathrm{for \ loop} \qquad \qquad \qquad i&=1:M&\\ \qquad \qquad \mathrm{Solve \ for} \ \mathbf{c}; \qquad \mathbf{L}\mathbf{c}&=\mathbf{E}\mathbf{I}_{i}\\ \qquad \qquad \mathrm{Solve \ for} \ \mathbf{q}_{i}; \quad \mathbf{U}\mathbf{q}_{i}&=\mathbf{c}\\ \text{end} \,\qquad \qquad \qquad \qquad& \end{aligned} $$

This requires \(M^{2}(M+1)\) multiplications. Table 1 shows the total number of multiplications required with matrix inversion (WMI) and without matrix inversion (WOMI) for the RLS algorithm. As an example, for M=2, 70% of the multiplications can be saved in the SFBC case by avoiding the matrix inversion.
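The two loops above (in-place LU factorization followed by a column-by-column triangular solve) can be sketched in Python as follows. This is a minimal illustration using lists of lists; the function names `lu_in_place` and `invert` are hypothetical, and the sketch assumes that the permutation is chosen by partial pivoting during elimination rather than applied beforehand, which is one common way to obtain \(\mathbf{E}\).

```python
def lu_in_place(q):
    """Overwrite the square matrix q with its LU factors.

    Returns perm such that row k of L*U equals row perm[k] of the input.
    U sits on and above the diagonal, the multipliers of L below it.
    """
    m = len(q)
    perm = list(range(m))
    for i in range(m - 1):
        # Partial pivoting: largest-magnitude entry of column i goes to the diagonal.
        p = max(range(i, m), key=lambda r: abs(q[r][i]))
        if q[p][i] == 0:
            raise ZeroDivisionError("matrix is singular")
        q[i], q[p] = q[p], q[i]
        perm[i], perm[p] = perm[p], perm[i]
        for r in range(i + 1, m):
            q[r][i] /= q[i][i]                    # Q(rows,i) = Q(rows,i)/Q(i,i)
            for c in range(i + 1, m):
                q[r][c] -= q[r][i] * q[i][c]      # rank-one update of Q(rows,rows)
    return perm


def invert(a):
    """Invert a by solving L c = E I_i and U q_i = c for every column i."""
    m = len(a)
    q = [row[:] for row in a]                     # keep the caller's matrix intact
    perm = lu_in_place(q)
    inv = [[0.0] * m for _ in range(m)]
    for i in range(m):
        rhs = [1.0 if perm[r] == i else 0.0 for r in range(m)]   # E I_i
        c = []
        for r in range(m):                        # forward substitution, unit diagonal
            c.append(rhs[r] - sum(q[r][k] * c[k] for k in range(r)))
        x = [0.0] * m
        for r in reversed(range(m)):              # back substitution
            s = sum(q[r][k] * x[k] for k in range(r + 1, m))
            x[r] = (c[r] - s) / q[r][r]
        for r in range(m):
            inv[r][i] = x[r]
    return inv


example = [[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]]   # arbitrary test matrix
example_inv = invert(example)
```

The dense procedure costs on the order of M³ multiplications, which is the cost that the structured approach of this paper avoids.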

Table 1 Number of multiplications for AFD-DFE


In this paper, a three-tap RLS-based adaptive DFE is designed entirely in the frequency domain for SC-FDMA systems and extended to SFBC SC-FDMA systems. The equalizer operates without channel estimation at the receiver. The proposed algorithm delivers superior performance at low complexity due to the special structure of the matrices involved in computing the weights of the feedforward and feedback filters in the frequency domain. Simulation results demonstrate the significant performance gain and robustness of a three-tap frequency-domain adaptive equalizer, compared to a one-tap equalizer, when dealing with ICI due to CFO. In [16], a one-tap strategy was used without concentrating on the mitigation of CFO. As shown by the simulation results and by the channel matrix structure in Fig. 1, a one-tap AFD-DFE does not perform well under large CFO and is therefore sub-optimum. The proposed three-tap AFD-DFE is thus the preferred solution in this setting.


1 Note that for the linear equalizer, the \(\mathcal {B}(j)\)’s in Eq. (2) are all zeros.


  1. C Cox, An Introduction to LTE: LTE, LTE-Advanced, SAE and 4G Mobile Communications (John Wiley & Sons, 2012).
  2. Q Wang, C Yuan, J Zhang, Y Li, Frequency domain soft-decision feedback equalization for SC-FDMA with insufficient cyclic prefix. Int. J. Comput. Sci. (IJCSI). 9(6), 103–108 (2012).
  3. K Raghunath, A Chockalingam, SC-FDMA versus OFDMA: sensitivity to large carrier frequency and timing offsets on the uplink. IEEE Global Telecommun Conference (GLOBECOM), 1–6 (2009).
  4. H Myung, D Goodman, Single Carrier FDMA (John Wiley & Sons, 2008).
  5. V Tarokh, H Jafarkhani, A Calderbank, Space-time block codes from orthogonal designs. IEEE Trans. Inf. Theory. 45, 1456–1467 (1999).
  6. K Lee, D Williams, A space-frequency transmitter diversity technique for OFDM systems. IEEE Global Telecommun Conference (GLOBECOM). 3, 1473–1477 (2000).
  7. Bell Alcatel Shanghai, Alcatel-Lucent, in R1-090058, 3GPP TSG RANWG 1 Meeting #55 bis. STBC-II scheme with non-paired symbols for LTE-advanced uplink transmit diversity, (2008).
  8. WY Lim, Z Lei, Space-time block code design for single-carrier frequency division multiple access. IEEE 20th Int. Symp. Pers. Indoor Mobile Radio Commun, 516–520 (2009).
  9. B Narasimhan, N Al-Dhahir, H Minn, SFBC design tradeoffs for mobile SC-FDMA with application to LTE-advanced. IEEE Int. Conf. Acoust. Speech Signal Process, 3458–3461 (2010).
  10. D Falconer, S Ariyavisitakul, A Benyamin-Seeyar, B Eidson, Frequency domain equalization for single-carrier broadband wireless systems. IEEE Commun Mag. 40, 58–66 (2002).
  11. N Benvenuto, S Tomasin, On the comparison between OFDM and single carrier modulation with a DFE using a frequency-domain feedforward filter. IEEE Trans. Commun. 50, 947–955 (2002).
  12. H Witschnig, M Kemptner, R Weigel, A Springer, Decision feedback equalization for a single carrier system with frequency domain equalization—an overall system approach. 1st Int. Symp. Wirel. Commun. Syst, 26–30 (2004).
  13. G Huang, A Nix, S Armour, Decision feedback equalization in SC-FDMA. 19th IEEE Int. Symp. Pers. Indoor Mobile Radio Commun, 1–5 (2008).
  14. N Benvenuto, S Tomasin, Iterative design and detection of a DFE in the frequency domain. IEEE Trans. Commun. 53, 1867–1875 (2005).
  15. C Zhang, Z Wang, Z Yang, J Wang, J Song, Frequency domain decision feedback equalization for uplink SC-FDMA. IEEE Trans. Broadcast. 56, 253–257 (2010).
  16. N Iqbal, N Al-Dhahir, A Zerguine, A Zidouri, Adaptive frequency-domain RLS DFE for uplink MIMO SC-FDMA. IEEE Trans. Veh. Technol. 64(7), 2819–2833 (2014).
  17. N Iqbal, A Zerguine, N Al-Dhahir, Adaptive equalisation using particle swarm optimisation for uplink SC-FDMA. Electron Lett. 50, 469–471 (2014).
  18. S Haykin, Adaptive Filter Theory, 4th ed (Prentice Hall, Upper-Saddle River, NJ, 2002).
  19. Z Cao, U Tureli, Y-D Yao, P Honan, in IEEE Global Telecommunications Conference, GLOBECOM ’04, 2. Frequency Synchronization for Generalized OFDMA Uplink, (2004), pp. 1071–1075.
  20. M-o Pun, M Morelli, C-c Kuo, Iterative detection and frequency synchronization for OFDMA uplink transmissions. IEEE Trans. Wirel Commun. 6, 629–639 (2007).
  21. T Kailath, Linear Systems (Prentice Hall, Englewood Cliffs, NJ, 1980).
  22. GH Golub, CF Van Loan, Matrix Computations (Johns Hopkins University Press, 2013).



The authors acknowledge the support provided by the Deanship of Scientific Research at KFUPM under Research Grant RG1415.

Author information

Correspondence to Azzedine Zerguine.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


Cite this article

Iqbal, N., Zerguine, A. CFO mitigation for uplink SFBC SC-FDMA. EURASIP J. Adv. Signal Process. 2016, 40 (2016).



  • Carrier Frequency Offset
  • Matrix Inversion
  • Orthogonal Frequency Division Multiple Access
  • Recursive Least Squares
  • Space Time Block Code