FBMC receiver for multiuser asynchronous transmission on fragmented spectrum
EURASIP Journal on Advances in Signal Processing volume 2014, Article number: 41 (2014)
Abstract
Relaxed synchronization and access to fragmented spectrum are considered for future generations of wireless networks. Frequency division multiple access for filter bank multicarrier (FBMC) modulation provides promising performance without strict synchronization requirements contrary to conventional orthogonal frequency division multiplexing (OFDM). The architecture of a FBMC receiver suitable for this scenario is considered. Carrier frequency offset (CFO) compensation is combined with intercarrier interference (ICI) cancellation and performs well under very large frequency offsets. Channel estimation and interpolation had to be adapted and proved effective even for heavily fragmented spectrum usage. Channel equalization can sustain large delay spread. Because all the receiver baseband signal processing functionalities are proposed in the frequency domain, the overall architecture is suitable for multiuser asynchronous transmission on fragmented spectrum.
1 Introduction
The current generation of cellular physical layers, such as those used by Long Term Evolution (LTE) and LTE Advanced, has been optimized to deliver high-bandwidth pipes to wireless users but requires strict synchronism and orthogonality between users within a single cell. The advent of the smartphone and the expected explosion of Machine-Type Communication (MTC) are posing new and unexpected challenges. Fast dormancy, necessary for these types of equipment to save battery power, has resulted in significant control signalling growth. Furthermore, as the availability of large amounts of contiguous spectrum is becoming more and more difficult to guarantee, the aggregation of non-contiguous frequency bands has to be considered. Therefore, relaxed synchronization and access to fragmented spectrum are key requirements for future generations of wireless networks [1]. This requirement of spectrum agility has encouraged the study of alternative multicarrier waveforms such as filter bank multicarrier (FBMC) to provide better adjacent channel leakage performance without compromising spectral efficiency [2]. Frequency division multiple access (FDMA) has already been considered for FBMC and provides promising performance without strict synchronization between users [3]. However, efficient frequency domain FBMC receiver algorithms have to be considered to fully benefit from the scheme. A first attempt at a practical implementation in the frequency domain for FBMC has been proposed in [4]. In that work, synchronization and channel estimation are also based on the use of a training sequence, but performance is limited by intercarrier interference (ICI) in the presence of large carrier frequency offset (CFO). This paper presents an architecture of a FBMC receiver suitable for asynchronous multiuser FDMA with very relaxed synchronization constraints: timing synchronization, CFO compensation and channel equalization are addressed.
Correction of CFO has been largely documented for orthogonal frequency division multiplexing (OFDM) [5] and FBMC [6]. A coarse CFO compensation in the time domain is necessary prior to phase tracking in the frequency domain in order to limit ICI. As a different level of frequency offset is associated with each user, these solutions require as many receivers as users and are not suitable for practical implementation. Alternative signal processing methods for CFO compensation in the frequency domain should therefore be developed. Iterative ICI cancellation has been proposed through iterative inversion techniques [7] or turbo processing [8] but is extremely complex to implement. The authors of [9] proposed an interesting method in the context of an orthogonal frequency division multiple access (OFDMA) uplink network. This method compensates CFO after the fast Fourier transform (FFT) using a circular convolution. A good tradeoff between performance and complexity is demonstrated. A low-complexity CFO compensation method for FBMC-FDMA based on this method is investigated in this paper.
Channel estimation is a necessary process before demodulation as the radio channel is frequency selective. For multicarrier modulation systems, the estimation is usually performed by sending a training data sequence, called pilot tones, on a set of carriers known to the receiver. The channel is then estimated at the pilot frequencies using the classical least squares (LS) or minimum mean-square error (MMSE) estimators [10]. When pilot tones are not available on every carrier, a process of interpolation is required to recover the complete channel response [11]. The interpolation may be done in the time domain [12]. A time-domain channel impulse response is obtained using an inverse Fourier transform of the channel estimated at the pilot frequencies. A filter may then be applied to reduce noise effects and border effects. An interpolated frequency domain channel is then obtained by Fourier transform. Another solution may be to interpolate the channel in the frequency domain. Many algorithms, such as linear interpolation, low-pass filtering and cubic spline interpolation, have been proposed [13, 14]. For most interpolation schemes, the channel is poorly interpolated on the carriers located at the edges of the frequency band. This effect may be neglected when the number of carriers per contiguous frequency band is large but may lead to significant performance degradation of the overall system when the multicarrier modulation is applied to a fragmented spectrum. A robust interpolation scheme for the complete spectrum, including the edges, is therefore critical. A solution has already been proposed in [12] using interpolation in the time domain. Performance is increased at the price of higher complexity. Furthermore, instability issues are not addressed and may lead to noise enhancement. A stable and robust scheme based on interpolation filters in the frequency domain is proposed in this paper.
Conventional equalization techniques for FBMC are suitable for multiuser reception. Good spectral containment introduced by the matched filter of the receiver helps to avoid distortion from non-synchronous users. Channel equalization may therefore be independently processed per user. However, if the conventional polyphase network FBMC implementation is employed, then equalization has to be carried out in the time domain. To cope with this issue, frequency-spreading FBMC is considered [15]: at the cost of a significantly larger FFT (the size of the FFT is multiplied by the overlapping ratio), equalization may be efficiently done using a one-tap complex coefficient per subcarrier. A high-performance equalization scheme is described in this paper. Performance of the receiver is evaluated and discussed.
The paper is organized as follows: In Section 2, the overall context for a multiuser asynchronous environment is presented and the benefits FBMC waveforms have in this context are developed. Then in Section 3, the complete architecture of a FBMC receiver adapted to these scenarios is described and performance results are given. In Section 4, the main features and results are summarized and some perspectives are provided.
2 Asynchronous multiuser context
2.1 FBMC
Single-carrier (SC) FDMA has been chosen for LTE to provide radio resource access for mobile users to the base station. This uplink multiuser access technique provides flexible resource allocation and is spectrally efficient. However, the frequency offset between users should be strictly contained and the received data signals should be aligned. If these synchronization requirements are not fulfilled, the orthogonality condition is not respected: intercarrier and intersymbol interference dramatically deteriorate performance. In order to guarantee mobility, the base station constantly monitors the time of arrival of received transmission signals. Signaling is sent to the mobile user via the downlink channel in order to synchronize this time of arrival at the base station. This constant exchange of control information introduces an overhead on the network that could be significant for asynchronous data communication services such as web access or machine-to-machine communications.
A multicarrier system can be described by a synthesis-analysis filter bank, i.e. a transmultiplexer structure. The synthesis filter bank is composed of all the parallel transmit filters and the analysis filter bank consists of all the matched receive filters. The most widely used multicarrier technique is OFDM, based on the use of the inverse and forward FFT for the synthesis and the analysis filter banks, respectively. The prototype filter of OFDM is a rectangular window whose size is equal to the duration of the FFT. At the receiver, perfect signal recovery is possible under ideal channel conditions thanks to the orthogonality of the prototype filters. Nevertheless, under more realistic multipath channels, a data rate loss is induced by the mandatory introduction of a cyclic prefix (CP), longer than the impulse response of the channel. FBMC waveforms utilize a more advanced prototype filter design to better localize the subcarriers. The prototype filter used in this paper is based on the frequency sampling technique. This technique gives the advantage of using a closed-form representation that includes only a few adjustable design parameters. The most significant parameter is the duration of the impulse response of the prototype filter, also called the overlapping factor, K. The impulse response of the prototype filter is given by [16]
where ${G}_{P}\left(0..3\right)=\left[1,\ 0.97195983,\ \frac{1}{\sqrt{2}},\ \sqrt{1-{G}_{P}{\left(1\right)}^{2}}\right]$ for an overlapping factor of K=4 and N is the number of carriers. In the following sections, the term carrier will refer to one of these N carriers and the term subcarrier will refer to one of the KN frequency domain FFT outputs at the receiver (see Section 3.1). The larger the overlapping factor K, the more localized the signal will be in frequency. In filter bank-based systems, transmit pulses are localized in time and in frequency. The orthogonality between carriers is maintained by introducing half a symbol period delay between the in-phase and the quadrature components of every complex symbol. The well-adjusted frequency localization of the prototype filter guarantees that only adjacent carriers interfere with each other. This justifies the use of FBMC waveforms in a non-synchronous context and particularly for the fragmented scenario. Nevertheless, adjacent carriers significantly overlap with this kind of filtering. In order to keep adjacent carriers orthogonal, real and pure imaginary values alternate on successive carrier frequencies and on successive transmitted symbols for a given carrier at the transmitter side. In order to maximize spectral efficiency of the offset QAM (OQAM) modulation, the symbol period T is halved.
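The frequency sampling design can be illustrated numerically. The sketch below is a minimal example, assuming the commonly used closed form $h(n)={G}_{P}(0)+2\sum_{k=1}^{K-1}{(-1)}^{k}{G}_{P}(k)\cos \left(\frac{2\pi k}{KN}(n+1)\right)$ for n=0,…,KN−2; this expression, the function name `prototype_filter` and the value N=16 are assumptions made for illustration and are not taken from the text above.

```python
import numpy as np

def prototype_filter(N, K=4):
    """Impulse response of a frequency-sampling prototype filter (assumed closed form)."""
    if K != 4:
        raise ValueError("the coefficients G_P are only given for K = 4")
    g1 = 0.97195983
    # G_P(0..3) as listed in the text; the last entry is sqrt(1 - G_P(1)^2)
    G = np.array([1.0, g1, 1.0 / np.sqrt(2.0), np.sqrt(1.0 - g1 ** 2)])
    n = np.arange(K * N - 1)                  # n = 0 .. KN-2
    h = np.full(n.shape, G[0])
    for k in range(1, K):
        h += 2.0 * (-1) ** k * G[k] * np.cos(2.0 * np.pi * k * (n + 1) / (K * N))
    return G, h

G, h = prototype_filter(N=16)                 # N = 16 is an arbitrary toy size
```

Under these assumptions the response is symmetric and peaks at its centre tap, which is the time localization exploited by the receiver.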
FDMA access schemes have already been considered for FBMC modulations [3], but access on fragmented spectrum has not been envisaged yet. Fragmented spectrum may be viewed as the consequence of a relaxed extension of channel aggregation for mobile communications. A user may access more than one frequency band, contiguous or not in frequency, at any given time. The relaxed synchronization between multiple users, i.e. users are not strictly synchronized in time, could lead to the generation of a heavily fragmented spectrum both in time and frequency. An example of the resulting spectrum is depicted in Figure 1. This phenomenon should be further exacerbated as messages get shorter, as is often the case in machine-to-machine communications.
FBMC-FDMA access schemes appear therefore to be a very promising flexible multicarrier waveform. Efficient implementations of the FBMC-FDMA receivers should thus be considered in such multiuser asynchronous environments. A solution for this context is investigated in this paper.
2.2 Proposed burst structure
In order to keep a flexible frequency and time block allocation, a preamble-based burst approach is considered. Synchronization and channel estimation are performed using the training sequence. Its structure has been defined and is illustrated in Figure 2. It is composed of a preamble of duration P FBMC symbols (P is set to 4 in Figure 2). The preamble has been designed to accurately detect the start of the burst and to give an estimate of the channel frequency response while preserving the localization properties of the FBMC signal. It is mainly composed of pilot carriers spaced every D active carriers for the whole duration of the preamble (D is set to 4 in Figure 2). The pilot carriers are designed so that the signal transmitted on each pilot carrier is constant for the duration of the preamble. Synchronization carriers are added on the first multicarrier symbol but are more sparsely distributed than pilot carriers. These are designed to accurately estimate the start of the burst. By implementing all the baseband signal processing functions of the receiver in the frequency domain, the proposed scheme may be extended without loss of generality to the aforementioned multiuser asynchronous environment.
2.3 Notations
Bold letters denote vectors and matrices. Uppercase and lowercase letters denote frequency domain and time domain variables, respectively. The following notations are used:

(.)^{t} Transpose

(.)^{H} Hermitian transform

E[. ] Expectation operator

tr[.] Trace operator

‖.‖ l-norm
F stands for the N×N DFT (discrete Fourier transform) matrix defined as:
where ${w}_{N}={e}^{-j\frac{2\pi}{N}}$. Matlab notation is used to index matrices. Therefore, A=B(:,1:U) means that A is built with the first U columns and all the rows of B.
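These notations can be made concrete with a short numerical sketch. It assumes a unitary DFT matrix with entries $w_N^{ik}/\sqrt{N}$ (the scaling is an assumption, chosen so that F F^{H}=I as used later in Section 3.2.2) and shows the zero-based NumPy equivalent of the Matlab slice B(:,1:U); the helper name `dft_matrix` is hypothetical.

```python
import numpy as np

def dft_matrix(N):
    """Unitary N x N DFT matrix with w_N = exp(-2j*pi/N) (sign and scaling assumed)."""
    idx = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / N) / np.sqrt(N)

F = dft_matrix(8)

# Matlab A = B(:,1:U): all rows, first U columns (NumPy slices are zero-based)
B = np.arange(12).reshape(3, 4)
U = 2
A = B[:, :U]
```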
3 Receiver architecture
3.1 Overview
A flexible architecture for multiuser asynchronous reception on fragmented spectrum is able to exploit the advantages of FBMC if the signal is efficiently demodulated in the frequency domain without a priori knowledge of the FFT timing alignment (i.e. the location of the FFT block; this property is called asynchronous FFT). A receiver architecture based on this assumption is depicted in Figure 3. An asynchronous FFT of size KN is processed for every block of N/2 samples, generating KN points, i.e. if r_{ m } is the m-th received vector, a KN-point FFT is computed for samples k=(n+m×N/2) with n=0,1,…,KN−1. These successive KN points are stored in a memory unit.
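The sliding structure of the asynchronous FFT can be sketched as follows. This is a minimal illustration in which the function name `asynchronous_fft`, the toy sizes and the random input are assumptions; the rows of the returned array play the role of the memory unit.

```python
import numpy as np

def asynchronous_fft(r, N, K):
    """One KN-point FFT per hop of N/2 samples; row m covers samples m*N/2 .. m*N/2 + KN - 1."""
    hop, size = N // 2, K * N
    if len(r) < size:
        raise ValueError("need at least K*N samples")
    n_blocks = (len(r) - size) // hop + 1
    blocks = np.stack([r[m * hop : m * hop + size] for m in range(n_blocks)])
    return np.fft.fft(blocks, axis=1)

# Toy example: N = 8 carriers, K = 4, enough samples for four FFT blocks
N, K = 8, 4
rng = np.random.default_rng(0)
r = rng.standard_normal(K * N + 3 * (N // 2))
R_mem = asynchronous_fft(r, N, K)
```

Consecutive rows overlap by KN−N/2 samples, so no timing alignment is needed before the FFT.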
The detection of a start of burst is then achieved in the frequency domain (i.e. at the output of the FFT) using a priori information from the preamble. CFO is first estimated using the pilot subcarrier information of the preamble by computing the phase of the product between two consecutive FBMC symbols at the location of the pilot subcarriers. The propagation channel is assumed static for the duration of the burst. As described in [17], when large CFO correction is required, a first step in the estimation process consists of scanning the subcarriers around the pilot subcarrier locations to determine the subcarrier with the highest energy. A tracking algorithm of the CFO may complete the synchronization process when the duration of the burst is large and the accuracy of the preamble-based detection algorithm does not meet the required level [18]. CFO compensation is then performed in the frequency domain using a feedforward approach.
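The phase-based estimation step can be illustrated with a toy example. The sketch assumes noise-free constant pilots modelled as pure tones on even carrier indices, without prototype filtering, so that two FFT windows taken N/2 samples apart differ by a phase of πδ; the function name `estimate_cfo` and all parameter values are hypothetical.

```python
import numpy as np

def estimate_cfo(R_prev, R_next, pilot_idx):
    """CFO relative to the carrier spacing from two FFT outputs taken N/2 samples apart."""
    corr = np.sum(np.conj(R_prev[pilot_idx]) * R_next[pilot_idx])
    return np.angle(corr) / np.pi     # phase advance over N/2 samples is pi*delta

# Toy check: constant pilots on even carriers, CFO of 7% of the carrier spacing
N, K, delta = 64, 4, 0.07
carriers = np.arange(4, 60, 8)        # hypothetical pilot carrier indices (all even)
n = np.arange(K * N + N // 2)
x = sum(np.exp(2j * np.pi * (c + delta) * n / N) for c in carriers)
R0 = np.fft.fft(x[: K * N])
R1 = np.fft.fft(x[N // 2 : N // 2 + K * N])
delta_hat = estimate_cfo(R0, R1, K * carriers)
```

In this simplified setting the estimate is unambiguous for |δ|<1; pilots on odd carriers would add a known phase of π that has to be removed first.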
The channel coefficients may be estimated on the pilot subcarriers of the preamble. Authors of [6, 19] have already considered a similar approach by introducing a phase term to correct the CFO. In Section 3.2, this technique is completed by an efficient algorithm that compensates intercarrier interference. The channel is then estimated on the pilot subcarriers before being interpolated on every active subcarrier. The use of a KN-point FFT makes the interpolation particularly specific to this receiver and a description of the proposed algorithm is detailed in Section 3.3.
Once the channel is estimated on all the active subcarriers, a one-tap per subcarrier equalizer is applied before filtering by the FBMC prototype filter (Section 3.4). Demapping and log-likelihood ratio (LLR) computation complete the inner receiver architecture. A soft-input forward error correction (FEC) decoder finally recovers the original message.
The asynchronous frequency domain processing of the receiver combined with the high stopband attenuation of the FBMC prototype filter provides a receiver architecture that allows for multiuser asynchronous reception. The FFT and the Memory Unit are common modules, while the remainder of the receiver should be duplicated as many times as the number of parallel asynchronous users the system may tolerate.
3.2 Carrier frequency offset compensation
3.2.1 Problem formulation
When the received symbol is affected by CFO, the received signal ${r}_{m}^{\mathit{\text{CFO}}}$ can be written as
where ${r}_{m}^{\mathit{\text{CFO}}}\in {\mathbb{C}}^{\mathit{\text{KN}}\times 1}$ is the received time domain vector, ${r}_{m}^{i}\in {\mathbb{C}}^{\mathit{\text{KN}}\times 1}$ is the time domain vector associated with user i, δ^{i} is the CFO for user i relative to the carrier spacing (assuming N carriers), Φ^{i} is a random phase for user i, z_{ m } is the noise vector and ${d}^{i}\in {\mathbb{C}}^{\mathit{\text{KN}}\times \mathit{\text{KN}}}$ is a diagonal matrix defined for user i by
The considered scenario is an uplink asynchronous access with multiple different CFOs. Since CFOs associated with different users are different, correction must be performed separately for each user. The correction of the CFO for one given user cannot then be realized in the time domain with reasonable complexity, as described by the authors of [9]. For this reason, we propose a frequency domain processing. Thanks to the very good frequency localization of the FBMC carriers, the asynchronous time reception of users does not cause any frequency interference between users (provided that a frequency guard band of only one carrier is used). Thus, after FFT, the correction of the CFO in the frequency domain is described without loss of generality for one user. In the following, subscript index i is dropped. After a KN-point fast Fourier transform, (3) becomes
where ${Z}_{m}\in {\mathbb{C}}^{\mathit{\text{KN}}\times 1}$ is the additive white Gaussian noise vector and $\mathbf{C}\in {\mathbb{C}}^{\mathit{\text{KN}}\times \mathit{\text{KN}}}$ is a Toeplitz matrix defined by
The coefficients of C measure the level of intercarrier interference (ICI) and may be written as
From Equation 7, it can be noted that when $K\delta =q,\ q\in \mathbb{Z}$, $\mathbf{C}$ is a subdiagonal matrix. q is called the integer part of the CFO and the fractional part is labeled ε. The parameter δ may then be decomposed into its integer and its fractional part:
$$\delta =\frac{q}{K}+\epsilon$$(8)
where $q\in \mathbb{Z}$ and $\epsilon \in \mathbb{R}$ with ε∈[−1/(2K);1/(2K)]. Regardless of the integer part, q, ICI is only present when ε≠0. Assuming q is known, the required range of detection for δ may be very small. For instance, in the case of FBMC with a prototype filter of duration K=4, the maximum level of ICI is introduced when ε=12.5%.
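The decomposition can be sketched in a few lines. The example assumes δ=q/K+ε with q obtained by rounding Kδ to the nearest integer, which is consistent with the stated range of ε; the function name `split_cfo` is hypothetical.

```python
import math

def split_cfo(delta, K):
    """Split a CFO (relative to the carrier spacing) into q subcarriers and a residual eps."""
    q = math.floor(K * delta + 0.5)   # nearest integer number of subcarriers
    eps = delta - q / K               # residual, |eps| <= 1/(2K)
    return q, eps

q, eps = split_cfo(0.30, K=4)         # a CFO of 30% of the carrier spacing
```

With K=4, a CFO of 30% is thus handled as a shift of one subcarrier plus a fractional part of about 5%.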
3.2.2 Proposed correction scheme
As described in the previous section, CFO may be decomposed into integer and fractional parts. The integer part is easily corrected by a shift of q subcarriers at the output of the KN-point FFT. The phase term should then be compensated by a phase correction factor. A simple and efficient way to reduce the ICI introduced by the fractional part of the CFO, ε, may be achieved by complex filtering of the received sequence. In order to derive the complex coefficients of the filter W that mitigates ICI, two criteria are considered:

Zero-forcing criterion (ZF): using Equation 5, and omitting the phase term, which has been corrected, we define W such that:
$$\mathbf{WC}=\mathbf{I}$$(9) 
By substituting C with the expression in (6), (9) may be rewritten as
$$\mathbf{W}\mathbf{F}\mathbf{d}{\mathbf{F}}^{H}=\mathbf{I}$$(10)
As F F^{H}=F^{H}F=I and d d^{H}=I, the filter W may be derived by
$$\mathbf{W}=\mathbf{F}{\mathbf{d}}^{H}{\mathbf{F}}^{H}$$(11)
Minimum mean square error criterion (MMSE): the filter may also be optimized to minimize the mean square error (MSE), taking the noise level at the receiver into account:
$${\mathbf{W}}_{\text{est}}=\arg \min_{\mathbf{W}}{\left\Vert \mathbf{W}{\widehat{\mathbf{R}}}_{m}-{\mathbf{R}}_{m}\right\Vert }^{2}$$(12)
By taking the derivative of the expectation of the trace, the minimization problem becomes:
$$\begin{array}{c}\frac{\partial}{\partial \mathbf{W}}E\left[\text{tr}\left(\mathbf{W}\mathbf{C}{\mathbf{R}}_{m}{\mathbf{R}}_{m}^{H}{\mathbf{C}}^{H}{\mathbf{W}}^{H}+\mathbf{W}{\mathbf{Z}}_{m}{\mathbf{Z}}_{m}^{H}{\mathbf{W}}^{H}\right.\right.\\ \left.\left.-\mathbf{W}\mathbf{C}{\mathbf{R}}_{m}{\mathbf{R}}_{m}^{H}-{\mathbf{R}}_{m}{\mathbf{R}}_{m}^{H}{\mathbf{C}}^{H}{\mathbf{W}}^{H}\right)\right]=0\end{array}$$(13)
In the presence of the additive white Gaussian noise (AWGN), $E\left[{\mathbf{Z}}_{\mathbf{m}}{{\mathbf{Z}}_{\mathbf{m}}}^{H}\right]={\sigma}_{n}^{2}\mathbf{I}$ and if Ω_{ R } is defined by E[R_{ m }R_{ m }^{H}]=Ω_{ R }, (13) becomes
$$\mathbf{C}{\mathbf{\Omega}}_{\mathbf{R}}{\mathbf{C}}^{H}{\mathbf{W}}^{H}+{\sigma}_{n}^{2}{\mathbf{W}}^{H}-\mathbf{C}{\mathbf{\Omega}}_{\mathbf{R}}=0$$(14)
When ${\mathbf{\Omega}}_{\mathbf{R}}={\sigma}_{R}^{2}\mathbf{I}$, the solution is straightforward, and can be expressed by:
$$\mathbf{W}=\frac{{\sigma}_{R}^{2}}{{\sigma}_{R}^{2}+{\sigma}_{n}^{2}}\mathbf{F}{\mathbf{d}}^{H}{\mathbf{F}}^{H}$$(15)
For both cases, the matrix W is Toeplitz and is therefore characterized by only KN complex coefficients. In general, Ω_{ R } is not a diagonal matrix but rather a band diagonal matrix and depends on the considered FBMC prototype filter. A closed-form expression of W may be obtained if and only if Ω_{ R } is invertible. Otherwise, W may be derived by a nonlinear optimization process or by using a pseudoinverse based, for instance, on singular value decomposition.
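Both closed-form filters can be checked numerically. The sketch below assumes that d is the diagonal matrix with entries $e^{j2\pi \delta n/N}$, n=0,…,KN−1, and that C=F d F^{H} as implied by (10); these forms, the function names and the toy sizes are assumptions, and the MMSE case is restricted to ${\mathbf{\Omega}}_{\mathbf{R}}={\sigma}_{R}^{2}\mathbf{I}$ as in (15).

```python
import numpy as np

def unitary_dft(M):
    idx = np.arange(M)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / M) / np.sqrt(M)

def cfo_filters(delta, N, K, sigma_r2=1.0, sigma_n2=0.0):
    """ICI matrix C and the ZF (11) and MMSE (15) correction filters."""
    M = K * N
    F = unitary_dft(M)
    d = np.diag(np.exp(2j * np.pi * delta * np.arange(M) / N))  # assumed form of d
    C = F @ d @ F.conj().T
    W_zf = F @ d.conj().T @ F.conj().T
    W_mmse = sigma_r2 / (sigma_r2 + sigma_n2) * W_zf
    return C, W_zf, W_mmse

# Toy example: N = 16 carriers, K = 4, CFO of 10% of the carrier spacing
C, W_zf, W_mmse = cfo_filters(delta=0.10, N=16, K=4, sigma_r2=1.0, sigma_n2=0.1)
```

Under these assumptions W C reduces to the identity for the ZF filter, and both filters are Toeplitz, so a single column is enough to describe them.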
The correction of the fractional part of the CFO with either ZF or MMSE filters requires a KN complex-tap filter. Nevertheless, the complexity may be significantly reduced as the ICI introduced by the CFO rapidly decreases with the index of the interfering subcarriers (since ε<1/(2K)). In Figure 4, the power of the W coefficients for a 256-point FFT and a CFO of 10% is plotted. In that example, almost all the power is located around the diagonal. We propose to approximate matrix W by a sparse band Toeplitz matrix with 2Q+1 terms on each column, centred on the diagonal. This is done by forcing all the other coefficients to 0. A tradeoff between complexity and accuracy of the frequency offset correction should be found for parameter Q. The 2Q+1 coefficients may be extracted from W through equation (11) or (15).
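The band approximation can be sketched as follows for a 256-point FFT and a CFO of 10%. The example assumes the ZF filter of (11) with d having entries $e^{j2\pi \delta n/N}$, and it keeps the wrap-around corner terms so that every column has exactly 2Q+1 coefficients; both choices, as well as the function names, are assumptions made for illustration.

```python
import numpy as np

def zf_cfo_filter(delta, N, K):
    """W = F d^H F^H with a unitary KN-point DFT (assumed form of d)."""
    M = K * N
    idx = np.arange(M)
    F = np.exp(-2j * np.pi * np.outer(idx, idx) / M) / np.sqrt(M)
    d_conj = np.exp(-2j * np.pi * delta * idx / N)
    return (F * d_conj) @ F.conj().T          # F diag(d^H) F^H

def band_truncate(W, Q):
    """Keep 2Q+1 coefficients per column around the diagonal, zero the rest."""
    M = W.shape[0]
    i, j = np.indices(W.shape)
    dist = np.abs(i - j)
    dist = np.minimum(dist, M - dist)         # circular distance to the diagonal
    return np.where(dist <= Q, W, 0)

def captured_power(W, Q):
    """Fraction of the power of one column retained by the band approximation."""
    Wb = band_truncate(W, Q)
    return np.sum(np.abs(Wb[:, 0]) ** 2) / np.sum(np.abs(W[:, 0]) ** 2)

W = zf_cfo_filter(delta=0.10, N=64, K=4)      # KN = 256
fractions = [captured_power(W, Q) for Q in (0, 1, 3)]
```

The retained power grows with Q, which is the complexity versus accuracy tradeoff discussed above.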
3.2.3 Performance and practical implementation of the CFO correction algorithm
The performance of the proposed algorithm is analysed and a practical implementation is derived in this section using the following FBMC parameters: N=1,024 and K=4. Figure 5 gives the architecture of the proposed implementation. The first step, performed by the Shift module, consists in correcting the integer part q. The result is then filtered by the (2Q+1)-tap filter W. A phase correction performed on each FBMC symbol completes the CFO correction process.
The relative mean square error (RMSE) for δ values from 1% to 12% and for different values of Q is given in Figure 6. RMSE has been defined by
RMSE is a measure of the signal to interference level as seen on the constellation. In the example, a QPSK modulation has been considered and RMSE decreases as Q becomes larger.
When Q=0, the algorithm only performs phase correction. The benefit of ICI mitigation is clearly demonstrated. Parameter Q could be chosen as a function of the system signal-to-noise ratio (SNR) so as to limit the system by thermal noise but not by interference.
In practice, the estimation of the CFO is never perfect. As illustrated in Figure 6, a residual CFO of 1% generated by the correction mismatch exhibits an RMSE power level below −30 dB and does not degrade performance significantly. This result is interesting to note for practical implementations. Indeed, the determination of matrix W requires complex computation (matrix inversion or nonlinear optimization). Therefore, a possible practical implementation is to precompute a set of filters and use a combination of them. Figure 7 depicts the implemented filtering architecture based on a cascade of P=4 precomputed filters. Using the MUX command, it is possible to apply or bypass the filter correction. When a negative CFO correction should be applied, the INV command is activated. As filters for negative CFO are derived from filters for positive CFO through a simple permutation and complex conjugation of their coefficients, the operation is performed with little complexity.
For instance, W_{ i } can be optimized for an incremental drift of 1% by
In case of Q=3 and P=4, 28 complex coefficients must be stored. A correction range of [−15%;15%] is then possible. As an example, a CFO of 11.7% is corrected by setting MUX0 =1, MUX1 =1, MUX2 =0 and MUX3 =1. The effective CFO corrected is 1%+2%+8%=11%.
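The selection of the MUX and INV commands can be sketched as a binary decomposition. The example assumes that the P=4 cascaded filters correct 1%, 2%, 4% and 8%, respectively, and that the target CFO is truncated to the 1% resolution; both assumptions are inferred from the example above (1%+2%+8%=11% for a CFO of 11.7%, with a range of ±15%), and the function name `mux_settings` is hypothetical.

```python
import math

def mux_settings(cfo_percent, P=4, step=1.0):
    """Return (MUX bits, INV flag, effective correction in %) for a target CFO in %."""
    inv = cfo_percent < 0                     # INV selects the negative-CFO filters
    level = int(math.floor(abs(cfo_percent) / step))
    level = min(level, 2 ** P - 1)            # saturate at the end of the correction range
    mux = [(level >> i) & 1 for i in range(P)]   # MUX0 is the least significant bit
    corrected = sum(bit * step * 2 ** i for i, bit in enumerate(mux))
    return mux, inv, (-corrected if inv else corrected)

mux, inv, corrected = mux_settings(11.7)
n_coefficients = 4 * (2 * 3 + 1)              # P filters of 2Q+1 taps with Q = 3
```

The same routine with a negative input only changes the INV flag, in line with the permutation and conjugation property of the filters.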
The RMSE performance of the proposed scheme is illustrated in Figure 8. We considered P=4 filters with Q=3 and using the optimization process of (17) and perfect phase estimation. A step shape of the curve is observed and comes from the finite resolution of the compensation (i.e. a step of δ=1%). When δ>12.5% a shift of p=1 subcarrier is performed. These results demonstrate the tradeoff between performance and complexity of the algorithm combined with the proposed cascade filters architecture.
A technique to mitigate ICI generated by CFO has been introduced in this section. All the operations have been carried out in the frequency domain and the scheme has given satisfactory performance results while being adapted for implementation.
3.3 Channel estimation and interpolation
Another important function that should be performed in the frequency domain is to recover the channel coefficients on each active subcarrier. This operation is achieved after the channel coefficients have been estimated on the pilot subcarrier location.
3.3.1 Problem formulation
One of the main advantages of multicarrier modulation techniques over single-carrier modulation is a greatly simplified equalization process. In the case of OFDM, as long as the duration of the channel impulse response is shorter than the guard interval and the channel is constant over the duration of the OFDM symbol, a frequency-selective wideband channel converts to a number of subcarrier channels with flat fading. For FBMC, Hirosaki [20] showed that this property may be preserved if the equalizer at each subcarrier channel is fractionally spaced.
Therefore, under these assumptions, the received vector R may be written after Fourier transform as
where diag(H) is a N_{ H }×N_{ H } diagonal matrix of the channel frequency response, X the N_{ H }×1 vector of symbols to transmit and Z the N_{ H }×1 vector of additive white Gaussian noise. N_{ H } is the number of active subcarriers.
We labelled U_{ P }, the indices of the pilots (located at the central location, as a KNpoint FFT is considered at the front end of the receiver) and U_{ H }, the indices of the active subcarriers. N_{ p } is the number of pilot subcarriers. The LS channel estimation of P, the pilot vector, at indices U_{ P } is given by [10]
Accurate estimation, $\widehat{\mathbf{H}}$, of the channel coefficients H may be derived from the observation of $\widehat{\mathbf{P}}$ using an interpolation filter W of size N_{ H }×N_{ p }.
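The LS estimation step at the pilot positions can be sketched as an element-wise division of the received pilots by the transmitted ones. This is the standard LS form for a diagonal channel model and is used here as an assumption; the function name `ls_pilot_estimate`, the toy channel and the pilot values are hypothetical.

```python
import numpy as np

def ls_pilot_estimate(R, X, U_P):
    """LS channel estimate at the pilot indices U_P for R = diag(H) X + Z."""
    return R[U_P] / X[U_P]

# Toy noise-free example: 16 active subcarriers, one pilot every 4 subcarriers
rng = np.random.default_rng(0)
N_H = 16
H = rng.standard_normal(N_H) + 1j * rng.standard_normal(N_H)
X = np.ones(N_H, dtype=complex)      # hypothetical pilot symbols
U_P = np.arange(0, N_H, 4)
R = H * X
P_hat = ls_pilot_estimate(R, X, U_P)
```

Without noise the estimate equals the channel at the pilot indices; the interpolation filter W then has to recover the remaining coefficients.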
To construct the filter W, the following minimization problem should be solved:
Let Ω be defined as
Then, since $\widehat{\mathbf{P}}$ is a least square estimate of P, the following equation may be written:
where Z_{ p } is a N_{ p }×1 noise vector. Equation 22 may then be rewritten as
Then, if h is the N×1 vector of the channel impulse response, H and P may be expressed as
Finally (22) may be rewritten as
where Φ_{ h } is the time domain channel autocorrelation matrix of size N×N and ${\sigma}_{{Z}_{P}}^{2}$ the noise power. By taking the partial derivative of Ω with respect to W and making it equal to zero, (27) becomes
where I is the identity matrix of size N_{ p }×N_{ p }. In many cases, the matrix Δ is ill-conditioned and impossible to invert. However, a pseudoinverse using a singular value decomposition (SVD) may be computed to derive W, but this method may lead to an unstable result. The power distribution of matrix W coefficients is given in Figure 9 for a channel distribution with a square delay profile in the time domain (N_{ H }=256 and N_{ p }=64).
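An unconstrained interpolation filter of this kind can be sketched with the standard linear MMSE solution and an SVD-based pseudoinverse. The sketch assumes $\mathbf{W}={\mathbf{F}}_{H}{\mathbf{\Phi}}_{h}{\mathbf{F}}_{P}^{H}{\left({\mathbf{F}}_{P}{\mathbf{\Phi}}_{h}{\mathbf{F}}_{P}^{H}+{\sigma}^{2}\mathbf{I}\right)}^{-1}$, where F_H and F_P are the rows of the DFT at the active and pilot indices; this expression, the 16-tap rectangular profile, the noise level and all names are assumptions, and only N_{ H }=256 and N_{ p }=64 follow the example above.

```python
import numpy as np

def lmmse_interpolator(M, U_H, U_P, phi_h, sigma2):
    """Interpolation filter of size len(U_H) x len(U_P) for a diagonal Phi_h."""
    taps = np.arange(len(phi_h))
    F_H = np.exp(-2j * np.pi * np.outer(U_H, taps) / M)
    F_P = np.exp(-2j * np.pi * np.outer(U_P, taps) / M)
    cross = (F_H * phi_h) @ F_P.conj().T                  # E[H P^H]
    auto = (F_P * phi_h) @ F_P.conj().T + sigma2 * np.eye(len(U_P))
    return cross @ np.linalg.pinv(auto)                   # SVD-based pseudoinverse

M, L = 256, 16
U_H = np.arange(M)                   # 256 active subcarriers
U_P = np.arange(0, M, 4)             # 64 uniformly spaced pilots
phi_h = np.full(L, 1.0 / L)          # rectangular delay profile (assumed length)
W = lmmse_interpolator(M, U_H, U_P, phi_h, sigma2=1e-6)

# Noise-free check on one random channel realization
rng = np.random.default_rng(1)
h = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) * np.sqrt(0.5 / L)
H = np.exp(-2j * np.pi * np.outer(U_H, np.arange(L)) / M) @ h
H_hat = W @ H[U_P]
```

This full filter uses N_{ p } complex coefficients per row, which is the implementation cost that motivates constraining the structure of W.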
In this example, the power distribution of matrix W coefficients is mainly located around the diagonal. It should be noted that the coefficients located at the center of the matrix have very similar values. On the contrary, at the edges of the matrix, the coefficients are significantly different. This property has been exploited to construct a new optimization criterion. The criterion imposes complexity constraints on matrix W coefficients as a function of their location within the matrix.
From a practical point of view, implementing a complex filter with a large set of complex coefficients may be extremely costly. The overall complexity should often be kept under control in order to fit implementation area constraints. The following constraints have thus been added to the minimization problem:

W is a matrix with Q real coefficients per row instead of N_{ p } complex coefficients per row

The pilot carrier distribution follows a regular pattern, i.e. the sampling of the channel is uniform^{a}

At least Q pilots are active per set of contiguous subcarriers
The structure of matrix W is divided into three subfilter blocks, a left subfilter block, a middle subfilter block and a right subfilter block, so that matrix W may be rewritten as
where W_{ l } is the N_{ l }×Q matrix representing the left subfilter block, W_{ m } is the N_{ m }×Q matrix representing the middle subfilter block and W_{ r } is the N_{ r }×Q matrix representing the right subfilter block.
The minimization problem of (27) may then be decomposed into three constrained minimization problems, one for each of the three subfilter blocks W_{ l }, W_{ m } and W_{ r }. Further constraints on the coefficients may be added to the minimization algorithm to guarantee stability. This optimization process may be computationally demanding but allows for a control of the complexity of the implemented interpolation process.
Filters W_{ l }, W_{ m } and W_{ r } are obtained through the following minimization problem by using a priori knowledge of Δ and ${\sigma}_{Z}^{2}$:
By applying (27) into (30), the minimization may be realized considering an a priori knowledge of the autocorrelation matrix of the time domain channel impulse response. By applying this constraint, the level of implementation complexity may be traded off against the target performance of the channel estimation module.
The optimization process is not necessarily implemented on the real time system. A stable and efficient interpolation scheme adapted to the channel conditions is provided while complexity is kept under control.
3.3.2 Application to the proposed preamble structure
The estimation of the channel coefficients is performed before applying the FBMC prototype filtering. The LS estimates of the channel coefficients are computed by applying a CFO phase correction on the received signal according to the location of the pilot symbol on the prototype filter. Performance may be further improved by considering all the positions of the prototype filter and by applying a maximum ratio combining algorithm after interpolation.
Considering LS estimates at the center of the FBMC prototype filter, the pilot pattern has the structure depicted in Figure 10. A pilot subcarrier is located every KD active subcarriers. The first pilot is located on subcarrier K−1. No pilot subcarrier is active on the right edge of the spectrum, therefore K−1+D channel coefficients should be derived. The number of active subcarriers is equal to N_{ H }=2(K−1)+N_{ p }K D, where N_{ p } is the number of active pilots.
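The pilot layout described above can be checked numerically. The following minimal sketch assumes that subcarriers are indexed from 0 at the left edge of the fragment; the function name `pilot_layout` is illustrative and not taken from the paper.

```python
def pilot_layout(K: int, D: int, n_pilots: int):
    """Return (N_H, pilot_indices) for one frequency fragment.

    Assumption: subcarriers are indexed from 0 at the left edge of the fragment.
    """
    n_active = 2 * (K - 1) + n_pilots * K * D   # N_H = 2(K-1) + N_p*K*D
    spacing = K * D                              # one pilot every K*D active subcarriers
    pilots = [(K - 1) + i * spacing for i in range(n_pilots)]
    return n_active, pilots


n_active, pilots = pilot_layout(K=4, D=4, n_pilots=16)
print(n_active)      # 262
print(pilots[:3])    # [3, 19, 35]
```

With K=4 and D=4, a fragment carrying 16 pilots has 262 active subcarriers, which is the fragment size used for Figure 16.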
Under these assumptions, and considering Q is even, N_{ r }, N_{ m } and N_{ l } may be expressed as follows:
The channel taps are assumed to be uncorrelated and, as a consequence, Φ_{ h } is a diagonal matrix. The most difficult task of the problem formulation consists in defining the a priori time domain channel autocorrelation delay profile. Different profiles may be considered and Figure 11 gives examples of possible distributions. For instance, an a priori autocorrelation profile Φ_{ h } adapted to a single frequency network is characterized by a 0 dB echo channel and is given in Figure 11d. The most conservative shape, the rectangular distribution (see Figure 11a), has been considered in the following section of the paper.
The optimization process is computationally intensive, which makes a real time implementation difficult. In order to maximize the interpolation performance while maintaining a reasonable level of complexity, the interpolation architecture depicted in Figure 12 has been considered. A set of x filters is optimized according to a target SNR, time domain channel duration, time domain profile, etc. The choice of the filter to apply is controlled by an entity that estimates the received channel conditions and decides which filter is the most suitable for the channel conditions measured at the receiver. Estimated pilot subcarriers are fed through the three filter blocks and generate three interpolated channel estimates. The choice of the channel estimate is controlled by a multiplexer according to the subcarrier index.
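The control logic of this architecture can be sketched as follows. The selection rule is hypothetical, since the paper does not specify one: it prefers the shortest design delay spread that still covers the estimated one, then the closest design SNR. The names `FilterDesign`, `select_filter` and `block_for_subcarrier` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterDesign:
    delay_spread: int   # design channel duration, in time domain samples
    snr_db: float       # design SNR, in dB


def select_filter(bank, est_delay_spread, est_snr_db):
    """Pick a precomputed design for the measured channel conditions.

    Hypothetical rule: shortest design delay spread covering the estimate
    (or the longest available one), then the closest design SNR.
    """
    covering = [f for f in bank if f.delay_spread >= est_delay_spread]
    if covering:
        target = min(f.delay_spread for f in covering)
    else:
        target = max(f.delay_spread for f in bank)   # fall back to the longest design
    candidates = [f for f in bank if f.delay_spread == target]
    return min(candidates, key=lambda f: abs(f.snr_db - est_snr_db))


def block_for_subcarrier(k, n_left, n_middle):
    """Multiplexer decision: which filter block supplies subcarrier k."""
    if k < n_left:
        return "left"
    if k < n_left + n_middle:
        return "middle"
    return "right"


bank = [FilterDesign(d, s) for d in (16, 32, 64) for s in (15.0, 25.0)]
chosen = select_filter(bank, est_delay_spread=20, est_snr_db=17.0)
print(chosen)   # FilterDesign(delay_spread=32, snr_db=15.0)
```

Only the selected set of coefficients needs to be loaded at run time, so the costly optimization stays outside the real time path.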
3.3.3 Performance results
The following parameters have been taken for a numerical evaluation:

K=4
D=4
N=1,024
The RMSE of the channel frequency response versus the channel delay spread expressed in number of time domain samples^{b} for a SNR of 15 dB is given in Figure 13. The RMSE is defined by
Filters have been first optimized for Q=10, a channel duration of 64 samples and a SNR of 15 dB. The performance is estimated using a Monte Carlo simulation based on the draw of 1,000 randomly generated channels with rectangular time profiles.
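The Monte Carlo setup can be illustrated with a short sketch. It assumes unit average channel energy, i.i.d. complex Gaussian taps of equal power for the rectangular profile, and an RMSE expressed in dB relative to that unit power; the exact normalization used for the figures is not stated here. The sketch only computes the unfiltered noise level that serves as the reference for the interpolation filters, not the filters themselves.

```python
import numpy as np


def rectangular_channels(n_channels, n_taps, rng):
    """Draw channels whose taps are i.i.d. complex Gaussian with equal power
    (rectangular delay profile), normalized to unit average energy."""
    shape = (n_channels, n_taps)
    taps = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    return taps / np.sqrt(2.0 * n_taps)


def rmse_db(h_true, h_est):
    """Root mean square error between two frequency responses, in dB."""
    mse = np.mean(np.abs(h_true - h_est) ** 2)
    return 10.0 * np.log10(mse)   # equals 20*log10(RMSE)


rng = np.random.default_rng(0)
n_fft, n_taps, snr_db = 1024, 64, 15.0

h = rectangular_channels(1000, n_taps, rng)
H = np.fft.fft(h, n=n_fft, axis=1)            # true frequency responses

noise_var = 10.0 ** (-snr_db / 10.0)          # unit signal power assumed
noise = np.sqrt(noise_var / 2.0) * (
    rng.standard_normal(H.shape) + 1j * rng.standard_normal(H.shape)
)
H_noisy = H + noise                            # unfiltered per-subcarrier estimate

baseline = rmse_db(H, H_noisy)
print(f"{baseline:.1f} dB")
```

The baseline sits at the noise level, close to −15 dB for a SNR of 15 dB; an interpolation filter is useful when its RMSE falls below this value.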
The RMSE is below the noise level for the three filter blocks. However, only a marginal gain is achieved for the right filter block. This is explained by the pattern considered in the pilot scheme; on the right edge, K D+K−1 channel coefficients have to be interpolated while no pilot subcarriers are located at the edge.
From a practical point of view, it is interesting to determine the number of coefficients required depending on the channel delay spread. Figure 14 gives the variation of RMSE versus channel delay spread for filter blocks designed for a channel delay spread length of 64 time domain samples, a SNR of 15 dB and various interpolation lengths Q. When the interpolation length Q is increased, performance improves. Right filter blocks designed with Q=4 and Q=6 exhibit poor performance due to the channel delay spread. On the other hand, for left filter blocks, the variation in performance is much smaller: less than 1 dB between the case Q=4 and the case Q=10. Increasing the interpolation length Q also increases the number of coefficients to store. This number grows as O(Q^{2}), making large values of Q costly to implement.
The influence of the SNR on the channel estimation is given in Figure 15. Three filter blocks have been designed for a channel delay spread length of 16 time domain samples and SNR values of 15, 25 and 60 dB. The distance between the thermal noise and the RMSE is given as a function of the SNR. For a low SNR of 0 dB, a gain of 5 dB is provided by a filter block optimized for a SNR of 15 dB. As the SNR increases, the filter blocks optimized for a SNR of 25 dB become preferable up to around 36 dB, beyond which the filter blocks optimized for 60 dB become more appropriate. These results show the importance of using filter blocks whose key characteristics are designed around the actual working SNR in order to maximize channel estimation performance.
To conclude this section, a set of filter blocks has been optimized for channel delay spread lengths of 16, 32 and 64 time domain samples, at a SNR of 15 dB and with Q=8. The RMSE across the active subcarrier band (including the left and right edges) as a function of the channel delay spread for a fragment of 262 active subcarriers has been plotted in Figure 16. When the channel delay spread is short, the use of an adequate filter block improves the RMSE of the interpolated channel. This figure illustrates the gain offered by the filtering approach if the channel delay spread is known and/or estimated. A comparison between an interpolation structure based on the state of the art and the proposed filter block structure adapted to the edges of the active spectrum is also given.
The proposed method gives a performance improvement of up to 7 dB on the RMSE of the estimated channel. Interpolation errors at the edges of the fragment dominate the RMSE. For large frequency fragments, the gain becomes negligible. The performance improvement is mainly due to the small number of pilot subcarriers located in the frequency fragment, which makes the proposed approach particularly well suited to multiuser asynchronous fragmented spectrum usage.
3.4 Inner receiver architecture
3.4.1 Proposed architecture
The proposed equalizer is placed directly after the FFT and before the FBMC prototype filtering. This strategy is suited to the asynchronous processing of multiple users in the frequency domain using a shared FFT processor. Assuming that the channel delay spread, L, is small compared to the size of the KN-point FFT, the symbols at the output of the FFT may be written as follows:
where k is the subcarrier index, p the index of the FBMC symbol, H(k) the complex channel coefficient for subcarrier k, Z(k,p) the AWGN sample and Z_{ I }(k,p) the intersymbol interference (ISI) contribution. The channel is again assumed static for the duration of the burst. When the channel matrix is diagonal, the ISI term can be omitted. This is a good approximation when the delay spread of the channel is small compared to the size of the FFT. To illustrate the validity of this assumption, Figure 17 depicts the power of the channel matrix coefficients in decibels for channel delay spreads of K N/64 and K N/16 time domain samples (KN is set to 512). The power is averaged over 1,000 channel realizations, each tap following a Rayleigh process.
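From the definitions above, the signal model at the FFT output can be sketched as follows. This is an assumed form consistent with those definitions, with R(k,p) denoting the observation and X(k,p) the transmitted frequency domain sample:

```latex
R(k,p) = H(k)\,X(k,p) + Z_{I}(k,p) + Z(k,p)
```

When the ISI term Z_{ I }(k,p) is negligible, each subcarrier is affected by a single complex coefficient H(k), which is what a one-tap equalizer relies on.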
When the channel is short, the power is concentrated on the diagonal terms, which validates the proposed assumption. When the channel is longer, power spreads away from the diagonal of the matrix and, as a result, the one-tap equalizer starts to show its limitations.
The goal of the equalizer is to recover X(k,p) from the observation R(k,p) and the estimated channel coefficient H(k). The ZF or MMSE criteria are classically used and may be expressed as follows:
where Q(k) is the optimized factor according to the ZF or MMSE criterion:
where ${\sigma}_{X}^{2}\left(k\right)$ is the expectation of the power of X on subcarrier k and ${\sigma}_{Z}^{2}$ is the expectation of the power of Z on subcarrier k. In the following, the noise power is assumed to be constant over all the subcarriers.
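As an illustration, the one-tap equalizer can be sketched with the textbook ZF and MMSE factors, 1/H(k) and H*(k)/(|H(k)|² + σ_Z²/σ_X²(k)). These standard forms are assumed here to correspond to the two criteria. The function names and array layout (subcarriers along the first axis, FBMC symbols along the second) are illustrative.

```python
import numpy as np


def one_tap_coefficients(h_est, noise_var=0.0, signal_var=1.0, criterion="zf"):
    """Per-subcarrier equalizer factor Q(k) from the estimated channel H(k)."""
    h_est = np.asarray(h_est, dtype=complex)
    if criterion == "zf":
        return 1.0 / h_est
    if criterion == "mmse":
        return np.conj(h_est) / (np.abs(h_est) ** 2 + noise_var / signal_var)
    raise ValueError(f"unknown criterion: {criterion!r}")


def equalize(r, h_est, **kwargs):
    """Apply X_hat(k, p) = Q(k) * R(k, p); r has shape (subcarriers, symbols)."""
    q = one_tap_coefficients(h_est, **kwargs)
    return q[:, np.newaxis] * np.asarray(r)


h = np.array([1.0 + 1.0j, 0.5j, -2.0])
x = np.array([[1.0, -1.0], [1.0, 1.0], [-1.0, 1.0]])
r = h[:, np.newaxis] * x                     # noiseless channel, no ISI term
x_zf = equalize(r, h, criterion="zf")
x_mmse = equalize(r, h, criterion="mmse", noise_var=0.1, signal_var=1.0)
```

In this noiseless example the ZF output reproduces `x` exactly, while the MMSE output is slightly attenuated on weak subcarriers, which limits noise enhancement when noise is present.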
Once the signal is equalized, the samples are filtered by the FBMC prototype filter before downsampling by a factor K. After payload extraction and inverse OQAM transform, log-likelihood ratio (LLR) values are obtained. The LLR is defined as
where ${\mathrm{\Omega}}_{n}^{0}$ (resp. ${\mathrm{\Omega}}_{n}^{1}$) is the set of symbols comprising a bit 0 (resp. 1) for bit n, $\hat{Y}\left(n\right)$ is the nth real symbol after filtering and inverse OQAM transform and b(m,n) is the mth bit mapped to the nth symbol. $\hat{Y}\left(n\right)$ is written as
where Γ(n) is a factor depending on the criterion of the equalizer and Z_{ Y }(n) is the noise associated with symbol n of variance ${\sigma}_{{Z}_{Y\left(n\right)}}^{2}$:
where G(p) is the p th prototype filter coefficient expressed in the frequency domain, ${\sigma}_{{Z}_{Y}}^{2}$ the noise variance and ${\mathbb{S}}_{n}\left(p\right)$ is a set of subcarrier indices. We group into this set the indices of subcarriers used for the computation of symbol n. Γ(n) may be expressed by
We assume that the conditional pdf of $\hat{Y}\left(n\right)$ is Gaussian and is expressed by
By applying Bayes' rule, assuming equiprobable symbols and using the max-log approximation $\left(\log \sum _{i}{\beta}_{i}\approx \underset{i}{\max }\log {\beta}_{i}\right)$, equation (37) may be rewritten as
Further complexity reduction of (42) may be considered with negligible performance loss, for instance using the techniques described in [21].
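A max-log demapper for real symbols can be sketched as follows. Several choices are assumptions made for illustration: the sign convention (log-likelihood of bit 0 minus that of bit 1), the use of a real Gaussian noise of variance `noise_var`, and the Gray labelling of the 4-PAM example. The function name `max_log_llr` is also illustrative.

```python
import numpy as np


def max_log_llr(y, gain, noise_var, symbols, bit_labels):
    """Max-log LLR of each bit for one real observation y = gain*s + noise.

    Assumed convention: LLR = log p(y | bit=0) - log p(y | bit=1), with
    Gaussian noise of variance noise_var and equiprobable symbols.
    """
    symbols = np.asarray(symbols, dtype=float)
    bit_labels = np.asarray(bit_labels)          # shape (n_symbols, n_bits)
    # Gaussian log-likelihood of each candidate symbol, up to a constant
    metric = -((y - gain * symbols) ** 2) / (2.0 * noise_var)
    llr = np.empty(bit_labels.shape[1])
    for m in range(bit_labels.shape[1]):
        best0 = metric[bit_labels[:, m] == 0].max()   # max over symbols with bit m = 0
        best1 = metric[bit_labels[:, m] == 1].max()   # max over symbols with bit m = 1
        llr[m] = best0 - best1
    return llr


# 2-PAM (one real dimension of QPSK): bit 0 -> -1, bit 1 -> +1
llr_2pam = max_log_llr(0.5, 1.0, 1.0, symbols=[-1.0, 1.0], bit_labels=[[0], [1]])

# Gray-mapped 4-PAM (one real dimension of 16-QAM), illustrative labelling
pam4 = [-3.0, -1.0, 1.0, 3.0]
labels = [[0, 0], [0, 1], [1, 1], [1, 0]]
llr_4pam = max_log_llr(0.5, 1.0, 1.0, pam4, labels)
```

For 2-PAM the max-log expression is exact and reduces to −2·gain·y/noise_var, so only higher-order constellations incur an approximation.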
3.4.2 Simulation results
The proposed equalization scheme has been evaluated by simulation using parameters derived from the 10 MHz LTE mode; they can be found in Table 1.
First, the effects of the timing offset (i.e. the location of the FFT) on the proposed equalizer have been simulated. The MSE of a QPSK constellation with ZF equalization has been estimated using the proposed equalization technique for various timing offsets of the KN-point FFT location. The following channel has thus been considered:
where parameter k is the timing offset introduced by the FFT processor. An interpolation filter optimized for a SNR of 15 dB and a delay spread of 32×1/F_{ e } was considered for channel estimation. The simulation results are depicted in Figure 18.
The MSE on the constellation depends on the timing offset. The worst-case MSE is found when k is equal to N/4; in this case, the receiver is affected the most by the interference from the previous or next multicarrier symbols. The best case comes, as expected, when the FFT is perfectly aligned, i.e. when the timing offset is close to p×N/2 ∀p. This confirms that the proposed frequency domain equalizer scheme combined with channel estimation does not require FFT synchronization and is therefore adapted to asynchronous multiuser reception.
The combined performance of the proposed receiver algorithms is then evaluated. A working SNR of 5 dB has been considered and the bit error rate (BER) after Viterbi decoding has been measured for various channel delay spread lengths L, expressed in time domain samples. The channel impulse response has been defined as
where L is the number of taps in the channel impulse response and α_{ i } are complex coefficients following a Rayleigh distribution. With these assumptions, the delay spread of the channel is equal to L/F_{ e }. The resulting BER at the output of the receiver has been evaluated and averaged over 10,000 channel realizations. Figure 19 gives the simulated performance of the receiver for various channel interpolation filters as a function of the channel delay spread.
The performance obtained with the interpolation filter optimized for a channel delay spread of N/(4F_{ e }) is of particular interest as it demonstrates the limitations of the proposed scheme. When the delay spread of the channel is greater than 0.15×N/F_{ e }, performance is limited by the proposed equalization scheme rather than by the channel estimation processing, as the one-tap equalizer becomes inefficient. On the other hand, for channel delay spreads below 0.15×N/F_{ e }, one-tap equalization is sufficient and channel estimation performs beyond requirement when an appropriate interpolation filter is considered. These results compare with the performance of an OFDM receiver with a guard interval length of at least 1/8×N. In Figure 19, the guard interval length for an equivalent OFDM system with a FFT size of N is given. Performance of the receiver may be further improved by considering multitap equalization.
The choice of the interpolation filter also impacts performance. When a filter optimized for a channel exhibiting a large delay spread is applied to a channel with a short delay spread, a significant amount of noise is not filtered. Performance is therefore better if the interpolation filter is adapted to the actual channel delay spread.
The proposed frequency domain receiver architecture appears very attractive for asynchronous multiuser processing. Fair performance has been obtained with the proposed equalizer scheme, and the receiver is robust in large delay spread environments. As a comparison, the normal LTE guard interval is set to 4.69 μs or approximately 1/14×N.
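The durations quoted in this section can be converted between samples and microseconds with a few lines of arithmetic. The sketch assumes the usual 10 MHz LTE numerology, i.e. N=1,024 and a sampling frequency F_{ e } of 15.36 MHz; the value of F_{ e } is an assumption here, since Table 1 is not reproduced in this section.

```python
N = 1024              # FFT size used in the numerical evaluation
F_E = 15.36e6         # assumed sampling frequency F_e (Hz) for 10 MHz LTE


def samples_to_us(n_samples: float) -> float:
    """Convert a duration in time domain samples to microseconds."""
    return n_samples / F_E * 1e6


limit_us = samples_to_us(0.15 * N)     # one-tap equalizer limit, 0.15*N/F_e
gi_eighth_us = samples_to_us(N / 8)    # OFDM guard interval of N/8
lte_cp_samples = 4.69e-6 * F_E         # normal LTE guard interval, in samples

print(f"{limit_us:.1f} us, {gi_eighth_us:.1f} us, N/{N / lte_cp_samples:.1f}")
# 10.0 us, 8.3 us, N/14.2
```

Under these assumptions, a guard interval of N/8 corresponds to about 8.3 μs, and the 4.69 μs LTE guard interval to about 72 samples, i.e. roughly N/14.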
4 Conclusion
This paper presented a novel architecture and algorithms for FBMC reception. All the baseband signal processing functions are implemented in the frequency domain and no strict synchronization of the FFT, the first element of the receiver, is required. This asynchronous frequency domain processing, combined with the high stopband attenuation of the FBMC prototype filter, provides a receiver architecture that allows for multiuser asynchronous reception.
Particular attention has been paid to CFO compensation in order to relax synchronism requirements beyond one carrier spacing. An algorithm to mitigate ICI has been proposed and simulated, together with a practical reduced complexity implementation. Depending on the receiver target SNR, complexity may be traded off to keep the RMSE introduced by CFO below thermal noise. Channel interpolation has also been carefully considered. As multiuser asynchronous FDMA generates heavily fragmented spectrum blocks, channel estimation should be optimized for the edges of the receiver active carrier bands. A performance improvement of up to 7 dB on the RMSE of the estimated channel was obtained in some simulated scenarios. Finally, a new equalizer scheme has been presented. Its complexity is contained while good performance is achieved for channels exhibiting large delay spread. As a comparison, using the 10 MHz LTE parameters, the receiver performs well for channels with delay spread of up to 8.3 μs, whereas standard LTE has been designed for channels with delay spread of up to 4.7 μs. Moreover, we have demonstrated that the proposed equalizer does not require FFT synchronization and is therefore adapted to asynchronous multiuser reception.
Further work should consider implementation performance and complexity estimation with a comparison to similar flexible systems using traditional OFDM techniques. A study of the possible cost overhead of an FBMC implementation and of finite precision effects in this context would further complete the work. At the system level, radio resource simulations are necessary to understand the benefits of the multiuser asynchronous approach. Its potential gains in system capacity and energy consumption compared to existing solutions could then be evaluated.
Endnotes
^{a} The proposed method also applies to nonuniform sampling, but the problem is more complex to formulate.
^{b} Channel delay spread is the time delay between the arrival of the first received signal component and the last received signal component associated with a single transmitted pulse.
References
1. Wunder G, Kasparick M, ten Brink S, Schaich F, Wild T, Gaspar I, Ohlmer E, Krone S, Michailow N, Navarro A, Fettweis G, Ktenas D, Berg V, Dryjanski M, Pietrzyk S, Eged B: 5GNOW: Challenging the LTE design paradigms of orthogonality and synchronicity. In IEEE 77th Vehicular Technology Conference (VTC Spring), 2013. Piscataway, NJ: IEEE; 2013.
2. Noguet D, Gautier M, Berg V: Advances in opportunistic radio technologies for TVWS. EURASIP J. Wirel. Commun. Netw 2011, 2011(1):170. 10.1186/168714992011170
3. Medjahdi Y, Terre M, Le Ruyet D, Roviras D, Dziri A: Performance analysis in the downlink of asynchronous OFDM/FBMC based multicellular networks. IEEE Trans. Wireless Commun 2011, 10(8):2630-2639.
4. Stitz TH, Ihalainen T, Renfors M: Practical issues in frequency domain synchronization for filter bank based multicarrier transmission. In 3rd International Symposium on Communications, Control and Signal Processing, 2008. ISCCSP 2008. Piscataway, NJ: IEEE Signal Processing Society; 2008:411-416.
5. Speth M, Fechtel SA, Fock G, Meyr H: Optimum receiver design for wireless broadband systems using OFDM. I. IEEE Trans. Comm 1999, 47(11):1668-1677. 10.1109/26.803501
6. Stitz TH, Viholainen A, Ihalainen T, Renfors M: CFO estimation and correction in a WiMAX-like FBMC system. In IEEE 10th Workshop on Signal Processing Advances in Wireless Communications, 2009. SPAWC ’09. Piscataway, NJ: IEEE Signal Processing Society; 2009:633-637.
7. Molisch AF, Toeltsch M, Vermani S: Iterative methods for cancellation of intercarrier interference in OFDM systems. IEEE Trans. Veh. Tech 2007, 56(4):2158-2167.
8. Salari S, Ardebilipour M, Ahmadian M, Cances JP, Meghdadi V: Turbo receiver design with carrier-frequency offset estimation for LDPC-coded MIMO OFDM systems. In The 9th International Conference on Advanced Communication Technology. Piscataway, NJ: IEEE; 2007:1911-1915.
9. Choi J, Lee YH, Lee C, Jung HW: Carrier frequency offset compensation for uplink of OFDM-FDMA systems. In IEEE International Conference on Communications, 2000. ICC 2000. Piscataway, NJ: IEEE; 2000:4254291.
10. van de Beek JJ, Edfors O, Sandell M, Wilson SK, Ola Borjesson P: On channel estimation in OFDM systems. In IEEE 45th Vehicular Technology Conference, 1995. Piscataway, NJ: IEEE; 1995:8158192.
11. Tsai PY, Chiueh TD: Frequency-domain interpolation-based channel estimation in pilot-aided OFDM systems. In IEEE 59th Vehicular Technology Conference, 2004. VTC 2004-Spring. Piscataway: IEEE; 2004:4204241.
12. Doukopoulos XG, Legouable R: Robust channel estimation via FFT interpolation for multicarrier systems. In IEEE 65th Vehicular Technology Conference, 2007. VTC2007-Spring. Piscataway, NJ; 2007:1861-1865.
13. Coleri S, Ergen M, Puri A, Bahai A: Channel estimation techniques based on pilot arrangement in OFDM systems. IEEE Trans. Broadcast 2002, 48(3):223-229. 10.1109/TBC.2002.804034
14. Hsieh MH, Wei CH: Channel estimation for OFDM systems based on comb-type pilot arrangement in frequency selective fading channels. IEEE Trans. Consum. Electron 1998, 44(1):217-225. 10.1109/30.663750
15. Bellanger M: FS-FBMC: An alternative scheme for filter bank based multicarrier transmission. In 5th International Symposium on Communications Control and Signal Processing (ISCCSP), 2012. Piscataway, NJ: IEEE Signal Processing Society; 2012:14.
16. Bellanger M, LeRuyet D, Roviras D, Terré M, Nossek J, Baltar L, Bai Q, Waldhauser D, Renfors M, Ihalainen T, Viholainen A, Stitz TH, Louveaux J, Ikhlef A, Ringset V, Rustad H, Najar M, Bader C, Payaro M, Katselis D, Kofidis E, Merakos L, Merentitis A, Passas N, Rontogiannis A, Theodoridis S, Triantafyllopoulou D, Tsolkas D, Xenakis D, Tanda M, Fusco T, Huchard M, Vandermot J, Kuzminskiy A, Schaich F, Leclair P, Zhao A: FBMC physical layer: a primer. 2010. http://www.ictphydyas.org. Accessed December 2013
17. Cassiau N, Kténas D, Doré JB: Time and frequency synchronization for downlink CoMP with FBMC. In The Tenth International Symposium on Wireless Communication Systems 2013. ISWCS 2013. Piscataway, NJ: IEEE; 2013:46-50.
18. Amini P, Farhang-Boroujeny B: Packet format design and decision directed tracking methods for filter bank multicarrier systems. EURASIP J. Adv. Signal Process 2010, 2010: 71711.
19. Fusco T, Petrella A, Tanda M: Data-aided symbol timing and CFO synchronization for filter bank multicarrier systems. IEEE Trans. Wireless Comm 2009, 8(5):2705-2715.
20. Hirosaki B: An orthogonally multiplexed QAM system using the discrete Fourier transform. IEEE Trans. Comm 1981, 29(7):982-989. 10.1109/TCOM.1981.1095093
21. Tosato F, Bisaglia P: Simplified soft-output demapper for binary interleaved COFDM with application to HIPERLAN/2. IEEE International Conference on Communications, 2002. ICC 2002 2002, 6646682.
Acknowledgements
The research leading to these results has received funding from the European Commission’s seventh framework program FP7 ICT Call8 under grant agreement 318555 also referred to as 5GNOW.
Additional information
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Doré, JB., Berg, V., Cassiau, N. et al. FBMC receiver for multiuser asynchronous transmission on fragmented spectrum. EURASIP J. Adv. Signal Process. 2014, 41 (2014). https://doi.org/10.1186/16876180201441
Keywords
 FBMC
 FDM
 FDMA
 Equalization
 Synchronization
 Channel estimation
 Asynchronous processing
 Multiuser
 Fragmented spectrum
 Carrier frequency offset