Cyclic block filtered multitone modulation

A filter bank modulation transceiver is presented. The idea is to obtain good sub-channel frequency confinement as it is done by the family of exponentially modulated filter banks that is typically referred to as filtered multitone (FMT) modulation. However, differently from conventional FMT, the linear convolutions are replaced with circular convolutions. Since transmission occurs in blocks, the scheme is referred to as cyclic block FMT (CB-FMT). This paper focuses on the principles, design, and implementation of CB-FMT. In particular, it is shown that an efficient realization of both the transmitter and the receiver is possible in the frequency domain (FD), and it is based on the concatenation of an inner discrete Fourier transform (DFT) and a bank of outer DFTs. Such an implementation suggests a simple sub-channel FD equalizer. The overall required implementation complexity is lower than in FMT. Furthermore, the orthogonal filter bank design is simplified. The sub-channel frequency confinement in CB-FMT yields compact power spectrum and lower peak-to-average power ratio than in OFDM. Furthermore, the FD equalization allows the exploitation of the transmission medium time and frequency diversity; thus, it potentially yields lower symbol error rate and higher achievable rate in time-variant frequency-selective fading.


Introduction
Filter bank modulation (FBM) systems, also referred to as multicarrier modulation systems, have been successfully applied to a wide variety of digital communication applications over the past several years. Research into improved solutions remains very active because of the increasing demand for broadband telecommunication services over both wireline and wireless channels.
Wideband channels are characterized by frequency selectivity, which translates into time-dispersive impulse responses that cause significant inter-symbol interference (ISI) in digital communication systems. FBM transceivers employ a transmission technique where a set of narrowband signals (low data rate sequences) are transmitted simultaneously over a broadband channel [1]. In particular, each low rate data sequence is transmitted through a sub-channel that is shaped according to a sub-channel pulse. If the sub-channels are sufficiently narrowband, they will experience an overall flat frequency response so that the medium frequency selectivity does not introduce significant inter-carrier interference (ICI) and ISI.
Therefore, the channel equalization task can be simplified. More generally, FBM architectures aim to increase the system spectral efficiency, to enable the agile use of spectrum, and to allow the flexible adaptation of available resources.
*Correspondence: tonello@uniud.it. University of Udine, Wireless and Power Line Communications Lab, Via delle Scienze 208, Udine 33100, Italy
Two baseline FBM solutions are orthogonal frequency division multiplexing (OFDM) [2] and filtered multitone (FMT) modulation [3]. FMT consists of an exponentially modulated filter bank that privileges the sub-channel frequency confinement rather than the time confinement that is privileged, for example, by OFDM. In FMT, with frequency confined pulses, the sub-channels are quasi-orthogonal to each other, which prevents the system from suffering from ICI, while the ISI introduced by the frequency-selective medium can be mitigated with sub-channel equalization [3,4] or with the use of outer OFDM modules, one per sub-channel, as described in the concatenated OFDM-FMT scheme in [5]. Clearly, to obtain high sub-channel frequency confinement, long prototype pulses are required. In such a case, the implementation complexity may increase significantly. Therefore, the efficient implementation of FMT, as well as the design of good pulses, is an important aspect [6][7][8][9][10].
In this paper, a novel FBM scheme is described. The ambitious goal is to merge the strengths of both OFDM and FMT. We refer to it as cyclic block filtered multitone modulation (CB-FMT). Similarly to conventional FMT, CB-FMT aims at generating well frequency localized sub-channels. However, differently from it, CB-FMT transmits data symbols in blocks, and the filter bank does not use linear convolutions but cyclic convolutions. Similarly to OFDM, the block transmission can reduce latency, but the sub-channel frequency confinement is much higher, closer to that of FMT. This translates into higher spectral selectivity, more confined power spectral density, and lower peak-to-average power ratio (PAPR) than in OFDM given the same target spectral efficiency. Furthermore, the implementation complexity is lower than that in FMT with the same number of sub-channels and even with longer pulses. In fact, an efficient realization can be devised if both the synthesis and the analysis filter banks in CB-FMT are implemented in the frequency domain (FD) via a concatenation of an inner (with respect to (w.r.t.) the channel) discrete Fourier transform (DFT) and a bank of outer DFTs. Such an FD architecture enables the use of an FD sub-channel equalizer designed according to the zero forcing (ZF) or the minimum mean square error (MMSE) criterion [11]. In particular, the ZF solution will restore perfect orthogonality if a cyclic prefix (similarly to OFDM) is appended to each block of signal coefficients that are transmitted over a dispersive (frequency selective) channel. In the presence of channel time variations (because of mobility), the ICI is small due to the sub-channel frequency confinement, so that the sub-channel equalizer is sufficient to cope with the ISI experienced by the data symbols transmitted in a block. This equalization scheme is capable of coherently collecting the sub-channel energies so that the frequency and time diversity offered by the fading channel can be exploited. Consequently, this can provide better performance, i.e., lower symbol error rate and higher achievable rate, than OFDM.
http://asp.eurasipjournals.com/content/2014/1/109
The CB-FMT idea and principles were originally presented in [12]. Some aspects related to the FD implementation were disclosed more recently in [13], while in [14] a preliminary analysis of the robustness of the scheme in fading channels was reported. In this paper, we provide a detailed description of CB-FMT with emphasis on the design, implementation, equalization, and performance aspects.
Another FBM scheme, referred to as generalized frequency division multiplexing (GFDM), has been independently presented in [15]. According to [15], GFDM is an FBM scheme that uses a non-orthogonal design where the sub-channel spacing is smaller than the sub-channel Nyquist band. It can be viewed as an FMT scheme operating beyond the critical sample rate. An extended CP is used to take into account the pulse tails and allow the implementation of a so-called tail biting convolution. The tail biting convolution, in turn, can be implemented with a circular convolution. Since the design is not orthogonal, ICI is present even in an ideal channel, which requires some form of equalization to mitigate it. This may yield performance that is worse than OFDM. However, it is shown in [15] that GFDM offers other benefits, such as spectrum agility and lower PAPR. CB-FMT is a more general architecture than GFDM that shares the idea of using circular convolutions instead of linear convolutions. As FMT essentially represents a general exponentially modulated filter bank with linear convolutions, CB-FMT represents a general exponentially modulated filter bank with circular convolutions. An orthogonal CB-FMT system can be designed (as done in this paper) without requiring the use of a CP, unless more robustness is desirable in time dispersive channels. Orthogonal CB-FMT can offer lower BER and higher spectral efficiency compared to OFDM.
The specific contributions of this paper can be summarized as follows:
• The CB-FMT key elements are described in Section 2. These include the derivation of an efficient frequency domain implementation starting from the time domain signal representation (Section 2.1).
• The relations between CB-FMT and conventional FMT/OFDM are briefly described in Section 2.2 to better understand the differences.
• The complexity analysis is carried out in Section 2.3.
• The conditions under which the CB-FMT scheme is orthogonal are studied in Section 3. Herein, a simple orthogonal pulse design is also proposed.
• The analytic derivation of the CB-FMT power spectral density (PSD) and the PAPR are discussed in Section 4.
• Equalization in time-variant frequency-selective fading is discussed in Section 5. A sub-channel FD MMSE equalizer is herein proposed.
• Several numerical examples, which include pulse shapes, complexity, PSD and PAPR, as well as symbol error rate (SER) and achievable data rate comparisons with OFDM, are collected in Section 6.
Finally, the conclusions follow.

Cyclic block FMT modulation
Cyclic block filtered multitone modulation is a multicarrier modulation scheme. As such, a high data rate information sequence is split into a series of K low data rate sequences. We denote the k-th data sequence with $a^{(k)}(\ell N)$, $\ell \in \mathbb{Z}$, which corresponds to a stream of complex data symbols belonging to a certain constellation, e.g., M-QAM or M-PSK, transmitted with symbol period NT, where T is the sampling period in the system. A normalized sampling period is assumed, i.e., T = 1. The data sequences are transmitted in parallel sub-channels obtained by partitioning the wideband transmission medium into K sub-bands.
The principle of CB-FMT is depicted in Figure 1. In this scheme, the low data rate data sequences are interpolated by a factor N and, then, filtered with a prototype pulse that is identical for all sub-channels. Differently from conventional FMT, the convolutions in the filter bank are circular. The filter outputs are multiplied by a complex exponential to obtain a spectrum translation. Finally, the K modulated signals are summed together yielding the transmitted discrete time signal.
The circular convolution involves periodic signals, and it can be efficiently realized in the frequency domain via the discrete Fourier transform (DFT). To use the circular convolution, a blockwise transmission is needed. Thus, we gather the low data rate sequences in blocks of L symbols $a^{(k)}(\ell N)$, $\ell \in \{0, \ldots, L-1\}$, for each sub-channel. Then, we consider the prototype pulse g(n) to be a causal finite impulse response (FIR) filter with a number of coefficients equal to M = LN. If the pulse is shorter than M, we can extend it to M coefficients with zero-padding, without loss of generality. The CB-FMT transmitted signal can be written as
$$x(n) = \sum_{k=0}^{K-1} \left[a^{(k)} \otimes g\right](n)\, W_K^{-nk} = \sum_{k=0}^{K-1} \sum_{\ell=0}^{L-1} a^{(k)}(\ell N)\, g((n-\ell N)_M)\, W_K^{-nk}, \quad n \in \{0, \ldots, M-1\}, \tag{1}$$
where ⊗ denotes the circular convolution operator and $g((n)_M)$ denotes the cyclic (periodic) repetition of the prototype pulse g(n) with a period equal to M, i.e., $g((n)_M) = g(\mathrm{mod}(n, M))$, where mod(·, ·) is the integer modulo operator. $W_K^{-nk} = e^{j2\pi nk/K}$ is the complex exponential function and j is the imaginary unit.
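The time domain synthesis described above can be sketched directly as a double sum of circularly shifted pulses, each sub-channel being modulated by its complex exponential. This is a minimal illustration in standard-library Python; the function name `cbfmt_synthesis_td` and the list-of-lists layout of the data symbols are assumptions made for the example, and the direct evaluation is meant for checking small cases rather than as an efficient implementation.

```python
import cmath

def cbfmt_synthesis_td(a, g, N):
    """Direct time domain CB-FMT synthesis (hypothetical helper).

    a : list of K lists, each holding the L data symbols of one sub-channel
    g : prototype pulse with M = L*N coefficients (zero-pad it beforehand)
    N : interpolation factor
    """
    K, L = len(a), len(a[0])
    M = L * N
    assert len(g) == M, "the pulse must have exactly M = L*N coefficients"
    x = []
    for n in range(M):
        acc = 0j
        for k in range(K):
            # circular convolution of the upsampled symbols with the pulse
            conv = sum(a[k][l] * g[(n - l * N) % M] for l in range(L))
            # W_K^{-nk} = exp(+j*2*pi*n*k/K) shifts sub-channel k in frequency
            acc += conv * cmath.exp(2j * cmath.pi * n * k / K)
        x.append(acc)
    return x
```

With K = 2, L = 2, N = 2 and a pulse equal to a unit sample, a single symbol on sub-channel 0 simply reappears at its own time instant in the transmitted block.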
The signal x(n) is digital-to-analog converted and then transmitted over the transmission medium. At the receiver, after analog-to-digital conversion, the discrete time received signal, denoted with $y(\ell)$, is defined as
$$y(\ell) = \sum_{s} h_{CH}(s, \ell)\, x(\ell - s) + \eta(\ell), \tag{2}$$
where $h_{CH}(s, \ell)$ is the time-variant channel impulse response and $\eta(\ell)$ is the background noise. In the following, we assume the channel response to be ideal. The more general case will be discussed in Section 5.
Similarly to the synthesis stage, we can apply the circular convolution to the analysis filter bank. The k-th sub-channel output is obtained as follows:
$$z^{(k)}(nN) = \sum_{\ell=0}^{M-1} y(\ell)\, W_K^{\ell k}\, h((nN-\ell)_M), \quad k \in \{0, \ldots, K-1\}, \; n \in \{0, \ldots, L-1\}, \tag{3}$$
where $h((n)_M)$ denotes the periodic repetition of the prototype analysis pulse h(n) with period M. Each sub-channel conveys a block of L data symbols over a time period equal to LNT. Therefore, the transmission rate equals R = K/(NT) symbols/s. More generally, a cyclic prefix can be added (but it is not mandatory) to the transmitted signal, as explained in Section 5. In this case, the rate equals
$$R = \frac{KL}{(LN+\mu)T} \ \text{symbols/s}, \tag{4}$$
where μ is the cyclic prefix length in samples.
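As a small worked example of the rate computation, the sketch below counts K·L symbols per block of L·N + μ samples, so that μ = 0 gives back K/(NT). The function name `cbfmt_rate` and its argument order are hypothetical.

```python
def cbfmt_rate(K, N, L, mu=0, T=1.0):
    """Symbol rate in symbols/s: K*L symbols carried by (L*N + mu) samples.

    K  : number of sub-channels
    N  : interpolation factor
    L  : symbols per sub-channel block
    mu : cyclic prefix length in samples (0 means no cyclic prefix)
    T  : sampling period
    """
    return K * L / ((L * N + mu) * T)
```

For instance, with K = 64, N = 80 and L = 64, the rate is 0.8 symbols per sample without a cyclic prefix, and it decreases slightly as μ grows.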

Frequency domain implementation
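One of the goals in CB-FMT is the reduction of the computational complexity in the filtering operation w.r.t. the conventional FMT scheme. This can be achieved by exploiting a frequency domain implementation of the system as described in the following. Firstly, we define a constant integer Q subject to
$$M = KQ. \tag{5}$$
Then, the M-point DFT of the transmitted signal in (1) is computed. We obtain
$$X(i) = \sum_{n=0}^{M-1} x(n)\, W_M^{ni} = \sum_{n=0}^{M-1} \sum_{k=0}^{K-1} \sum_{\ell=0}^{L-1} a^{(k)}(\ell N)\, g((n-\ell N)_M)\, W_K^{-nk}\, W_M^{ni}, \quad i \in \{0, \ldots, M-1\}. \tag{6}$$
Under the assumption M = KQ, we have $W_K^{-nk} = W_M^{-nkQ}$ and we can write
$$X(i) = \sum_{k=0}^{K-1} \sum_{\ell=0}^{L-1} a^{(k)}(\ell N) \sum_{n=0}^{M-1} g((n-\ell N)_M)\, W_M^{n(i-kQ)}. \tag{7}$$
We now multiply and divide (7) by $W_M^{\ell N(i-kQ)}$ to obtain
$$X(i) = \sum_{k=0}^{K-1} \sum_{\ell=0}^{L-1} a^{(k)}(\ell N)\, W_M^{\ell N(i-kQ)}\, G(i-kQ), \tag{8}$$
where in (8) we denoted with G(i) the M-point DFT of the pulse g(n).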
Since M = LN, so that $W_M^{\ell N} = W_L^{\ell}$, it should be noted that the summation with index $\ell$ in (8) is the L-point DFT of the data block $a^{(k)}(\ell N)$ cyclically shifted by kQ, i.e.,
$$A^{(k)}(i-kQ) = \sum_{\ell=0}^{L-1} a^{(k)}(\ell N)\, W_L^{\ell(i-kQ)}. \tag{9}$$
Finally, substituting (9) in (8), we obtain
$$X(i) = \sum_{k=0}^{K-1} A^{(k)}(i-kQ)\, G(i-kQ), \quad i \in \{0, \ldots, M-1\}. \tag{10}$$
This suggests an implementation of the CB-FMT synthesis filter bank as shown in Figure 2. For each sub-channel block of data, we apply an L-point DFT (referred to as outer DFT). We extend cyclically the block of coefficients at the output of the outer DFT from L points to M points. Then, each sub-channel block of M coefficients is weighted with the M coefficients G(i) of the prototype pulse DFT. Next, each sub-channel block is cyclically shifted by a factor −kQ, where k is the sub-channel index. Finally, the shifted blocks are summed together, and an M-point IDFT (referred to as inner IDFT) is applied to obtain the signal to be transmitted.
If we assume the M-point DFT of the prototype pulse to be confined in Q points, i.e., $G(i) \neq 0$ for $i \in \{0, \ldots, Q-1\}$ and $G(i) = 0$ for $i \in \{Q, \ldots, M-1\}$, we can simplify (10) into
$$X(i) = A^{(k)}(i-kQ)\, G(i-kQ), \quad i \in \{kQ, \ldots, (k+1)Q-1\}, \; k \in \{0, \ldots, K-1\}. \tag{11}$$
This suggests an efficient implementation of the CB-FMT synthesis filter bank as shown in Figure 3. For each sub-channel block of data, we apply an L-point DFT (referred to as outer DFT). We extend cyclically the block of coefficients at the output of the outer DFT from L points to Q points. Then, each sub-channel block of Q coefficients is weighted with the Q non-zero coefficients G(i) of the prototype pulse DFT. Finally, we apply an M-point IDFT (referred to as inner IDFT) to obtain the signal to be transmitted.
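The efficient synthesis procedure (outer L-point DFTs, cyclic extension to Q points, pulse weighting, inner M-point IDFT) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the names `dft`, `idft`, and `cbfmt_synthesis_fd` are hypothetical, a naive O(M²) DFT is used so that the example needs only the standard library, and the pulse is assumed to be given through its Q non-zero DFT coefficients.

```python
import cmath

def dft(x):
    """Naive DFT: X(i) = sum_n x(n) W^{ni}, with W = exp(-j*2*pi/len(x))."""
    M = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * i / M) for n in range(M))
            for i in range(M)]

def idft(X):
    """Naive inverse DFT, including the 1/M normalization."""
    M = len(X)
    return [sum(X[i] * cmath.exp(2j * cmath.pi * n * i / M) for i in range(M)) / M
            for n in range(M)]

def cbfmt_synthesis_fd(a, G, N):
    """Frequency domain CB-FMT synthesis for a pulse confined to Q DFT bins.

    a : list of K lists with the L data symbols of each sub-channel
    G : the Q non-zero pulse DFT coefficients G(0), ..., G(Q-1)
    N : interpolation factor, with M = L*N = K*Q
    """
    K, L, Q = len(a), len(a[0]), len(G)
    M = L * N
    assert M == K * Q, "the parameters must satisfy M = L*N = K*Q"
    X = [0j] * M
    for k in range(K):
        A = dft(a[k])                       # outer L-point DFT
        for i in range(Q):
            # cyclic extension from L to Q points, then pulse weighting;
            # sub-channel k occupies the inner IDFT inputs kQ, ..., (k+1)Q-1
            X[k * Q + i] = A[i % L] * G[i]
    return idft(X)                          # inner M-point IDFT
```

For small parameters (e.g., K = 2, N = 3, L = 2, Q = 3), the output can be compared sample by sample with a direct evaluation of the circular convolutions using the pulse g(n) obtained as the M-point IDFT of the zero-padded coefficients.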
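The analysis filter bank can also be implemented in the frequency domain. We start from (3) and we substitute the signal $y(\ell)$ with the IDFT of its frequency response $Y(i)$ as follows:
$$z^{(k)}(nN) = \sum_{\ell=0}^{M-1} \frac{1}{M} \sum_{i=0}^{M-1} Y(i)\, W_M^{-\ell i}\, W_K^{\ell k}\, h((nN-\ell)_M). \tag{12}$$
Assuming M = KQ, we can write
$$z^{(k)}(nN) = \frac{1}{M} \sum_{i=0}^{M-1} Y(i) \sum_{\ell=0}^{M-1} h((nN-\ell)_M)\, W_M^{-\ell(i-kQ)}. \tag{13}$$
Now, we multiply and divide (13) by $W_M^{nN(i-kQ)}$ to obtain
$$z^{(k)}(nN) = \frac{1}{M} \sum_{i=0}^{M-1} Y(i)\, W_M^{-nN(i-kQ)} \sum_{\ell=0}^{M-1} h((nN-\ell)_M)\, W_M^{(nN-\ell)(i-kQ)}, \tag{14}$$
where in (14) we can recognize the shifted version of the M-point DFT of $h(\ell)$ defined as
$$H(i-kQ) = \sum_{\ell=0}^{M-1} h((nN-\ell)_M)\, W_M^{(nN-\ell)(i-kQ)} = \sum_{\ell=0}^{M-1} h(\ell)\, W_M^{\ell(i-kQ)}, \tag{15}$$
where the second equality holds since the signals are periodic with period M, and we can write
$$z^{(k)}(nN) = \frac{1}{M} \sum_{i=0}^{M-1} Y(i)\, H(i-kQ)\, W_M^{-nN(i-kQ)}. \tag{16}$$
Since all the terms in (16) are periodic in i with period M, the summation index can be shifted by kQ. The index i can then be decomposed into two indexes p and q as i = p + Lq, with $p \in \{0, \ldots, L-1\}$ and $q \in \{0, \ldots, N-1\}$, to obtain
$$z^{(k)}(nN) = \frac{1}{M} \sum_{p=0}^{L-1} \sum_{q=0}^{N-1} Y(p+Lq+kQ)\, H(p+Lq)\, W_L^{-np}. \tag{17}$$
In (17), we can recognize the L-point IDFT (up to the factor 1/N) of the signal $Z^{(k)}(p)$ that is given by
$$Z^{(k)}(p) = \sum_{q=0}^{N-1} Y(p+Lq+kQ)\, H(p+Lq), \quad p \in \{0, \ldots, L-1\}. \tag{18}$$
This is the periodic repetition with period L of the signal $Y(i)H(i-kQ)$, read starting from the index kQ. Therefore, the receiver can be summarized as follows (Figure 2): Firstly, the received signal y(n) is processed with an M-point DFT. Then, the output coefficients are weighted with the prototype pulse M-point DFT coefficients H(i). Next, a periodic repetition with period L is performed for each sub-channel block of coefficients to obtain (18), where Y(i) is the M-point DFT of the received signal. Finally, to obtain the k-th sub-channel output, an L-point IDFT is applied to (18), i.e.,
$$z^{(k)}(nN) = \frac{1}{M} \sum_{p=0}^{L-1} Z^{(k)}(p)\, W_L^{-np}. \tag{19}$$
Assuming that the analysis pulse has only Q non-zero coefficients, i.e., $H(i) = 0$ for $i \in \{Q, \ldots, M-1\}$, as when the pulse is matched to the synthesis pulse and equal to $G^*(i)$, the pulse weighting and the periodic repetition take place on each sub-channel as graphically depicted in Figure 3. In particular, assuming Q > L, (18) corresponds to adding, at the beginning of the block of coefficients, the weighted coefficients whose index exceeds L − 1, folded back with period L. With the orthogonal design that we describe in Section 3, i.e., when $H(i) = G^*(i)$ and assuming that the orthogonality conditions are satisfied, we obtain
$$z^{(k)}(nN) = a^{(k)}(nN) \sum_{m=0}^{M-1} |g(m)|^2 + w^{(k)}(nN), \tag{20}$$
where $w^{(k)}(nN)$ denotes the noise contribution. This shows that the output sample is equal to the n-th data symbol of the k-th sub-channel weighted by the pulse energy, plus a noise contribution.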
More in general, when the transmission medium is not ideal, equalization can be performed (see Section 5) so that in (20) the coefficient that weights the data symbol is given by the sub-channel energy. Also the FMT system can be implemented in the frequency domain. However, the presence of linear convolutions renders it more complex since an overlap-and-add operation has to be carried out as shown in [16].
As a final remark, the use of inner and outer DFTs appeared also in the concatenated OFDM-FMT architecture presented in [5].
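The frequency domain receiver described in this section (inner M-point DFT, pulse weighting, periodic repetition with period L, outer L-point IDFT) can be sketched together with the matching transmitter to check the back-to-back behavior over an ideal, noiseless channel. The sketch below is an illustration under stated assumptions: all function names are hypothetical, naive DFTs keep it within the standard library, and the pulses are given through their Q non-zero DFT coefficients.

```python
import cmath

def dft(x):
    M = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * i / M) for n in range(M))
            for i in range(M)]

def idft(X):
    M = len(X)
    return [sum(X[i] * cmath.exp(2j * cmath.pi * n * i / M) for i in range(M)) / M
            for n in range(M)]

def cbfmt_synthesis_fd(a, G, N):
    """Transmitter: outer DFTs, cyclic extension, weighting, inner IDFT."""
    K, L, Q = len(a), len(a[0]), len(G)
    M = L * N
    assert M == K * Q
    X = [0j] * M
    for k in range(K):
        A = dft(a[k])
        for i in range(Q):
            X[k * Q + i] = A[i % L] * G[i]
    return idft(X)

def cbfmt_analysis_fd(y, H, K, N):
    """Receiver: inner DFT, weighting, periodic repetition, outer IDFTs.

    y : block of M received samples
    H : the Q non-zero analysis pulse DFT coefficients H(0), ..., H(Q-1)
    """
    M, Q = len(y), len(H)
    L = M // N
    assert M == K * Q == L * N
    Y = dft(y)                                  # inner M-point DFT
    z = []
    for k in range(K):
        Z = [0j] * L
        for i in range(Q):
            # weight bin kQ+i with H(i) and fold it back with period L
            Z[i % L] += Y[k * Q + i] * H[i]
        # L-point IDFT; the extra 1/N yields the overall 1/M factor
        z.append([v / N for v in idft(Z)])
    return z
```

As a usage example, with K = 2, N = 3, L = 2, Q = 3 and the matched pair G = H = [√0.5, 1, √0.5], the folded energy is the same for every p, and each output equals the transmitted symbol scaled by the pulse energy (here 1/3).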

Relation with FMT and OFDM
In the following, we briefly highlight the main differences of CB-FMT w.r.t. FMT [3] and OFDM [2]:

Relation with FMT
Similarly to conventional FMT, CB-FMT targets the use of frequency confined sub-channels. However, while the filter banks in FMT deploy linear convolutions, in CB-FMT, the filter banks use cyclic convolutions. Therefore, the FMT transmitted signal does not read as in (1) but as follows:
$$x(n) = \sum_{k=0}^{K-1} \sum_{\ell \in \mathbb{Z}} a^{(k)}(\ell N)\, g(n-\ell N)\, W_K^{-nk},$$
where g(n) is the prototype pulse. Furthermore, while in FMT the transmission is typically continuous, in CB-FMT, data signals are transmitted in blocks: each of the K sub-channels transmits a block of L data symbols. In FMT, the rate equals R = K/(NT) symbols/s. The efficient implementation of FMT can exploit a polyphase DFT filter bank architecture [10]. Nevertheless, the FMT complexity is higher than the CB-FMT complexity, as shown in Section 2.3 assuming the same number of sub-channels and the same prototype pulse length.
In CB-FMT, a very simple frequency domain design can be followed to obtain an orthogonal solution, as shown in Section 3. In FMT, the orthogonal design is more convoluted [9,10]. Thus, non-orthogonal solutions are often adopted in FMT, for instance, the use of a truncated root-raised-cosine prototype pulse [3] or ad hoc frequency localized pulses [7,8].
When transmission is in time/frequency-selective fading channels, the good sub-channel frequency confinement in FMT provides robustness to ICI, while the residual ISI can be mitigated with sub-channel linear equalization [3] or maximum a posteriori sequence estimation [4,17]. In CB-FMT, instead, simple frequency domain equalization can be adopted as described in Section 5.

Relation with OFDM
In OFDM, the filter bank privileges the sub-channel time domain localization rather than the frequency domain localization. OFDM can be seen as a particular case of both FMT and CB-FMT. In fact, it corresponds to an FMT system with N = K and the prototype pulse being a rectangular window, i.e., g(n) is equal to 1 for $n \in \{0, \ldots, N-1\}$ and 0 otherwise. It follows that the transmitted signal can be expressed as
$$x(n) = \sum_{k=0}^{K-1} a^{(k)}(\ell N)\, W_K^{-nk}, \quad n \in \{\ell N, \ldots, (\ell+1)N-1\}.$$
Starting from CB-FMT, we obtain OFDM by setting N = K, L = Q = 1, and G(0) = 1, so that we also have K = M.
In OFDM, the rate equals R = K/((K + μ)T) symbols/s, where μ is the cyclic prefix length in samples. It should be noted that the cyclic prefix is not necessarily equal in CB-FMT and in OFDM. The same applies to the number of data sub-channels, which can be different in the two systems.
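The reduction of CB-FMT to OFDM can be illustrated with a few lines of code. With L = 1 the outer DFT of a single symbol is the symbol itself, and with Q = 1 and G(0) = 1 the weighting is transparent, so the block is just the K-point inner IDFT of the K data symbols. The function name `cbfmt_as_ofdm` is hypothetical, and the 1/K factor follows from the IDFT normalization assumed here.

```python
import cmath

def idft(X):
    M = len(X)
    return [sum(X[i] * cmath.exp(2j * cmath.pi * n * i / M) for i in range(M)) / M
            for n in range(M)]

def cbfmt_as_ofdm(symbols):
    """CB-FMT synthesis with N = K, L = Q = 1 and G(0) = 1 (so that M = K).

    symbols : one data symbol per sub-channel, K values in total
    """
    # Outer 1-point DFT: identity. Cyclic extension to Q = 1 point: identity.
    # Weighting by G(0) = 1: identity. Only the inner IDFT remains.
    X = [complex(s) for s in symbols]
    return idft(X)
```

For example, a single active sub-channel with index 1 out of K = 4 produces the samples of a complex exponential with one cycle per block, as expected from an OFDM symbol.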

Computational complexity
In this section, we evaluate the computational complexity of CB-FMT in terms of the number of complex operations [cop] (additions and multiplications) per sample. Let us assume that an M-point DFT (or IDFT) block has complexity equal to $\alpha M \log_2(M)$ [cop] where, for instance, $\alpha = 1.2$. At the transmitter, K outer DFTs of L points are used, together with one M-point inner IDFT. Furthermore, the inner IDFT input signals are weighted by the DFT components of the prototype pulse. The same applies at the receiver. The number of operations required by the cyclic extension is negligible. Let us assume the number of non-zero DFT coefficients of the prototype pulse to be equal to $Q_2 = QC$, $C \in \{1, \ldots, K\}$. Since the transmitted block comprises LN coefficients, the number of complex operations per sample is equal to
$$\frac{\alpha K L \log_2(L) + \alpha M \log_2(M) + S}{LN} \ \text{[cop/sample]},$$
where S = (2C − 1)M for the transmitter. At the receiver, the periodic repetition increases the complexity by S = 2MC − KL for (Q − L)C < L and by S = (M + KL)C otherwise. When the prototype pulse has only Q non-zero coefficients, i.e., C = 1, S = M for the transmitter and S = 2M − KL for the receiver. As a comparison, we consider the complexity of FMT efficiently implemented with a polyphase DFT filter bank as described in [10]. This is equal to
$$\frac{\alpha K \log_2(K) + 2L_g}{N} \ \text{[cop/sample]},$$
assuming K sub-channels, an interpolation factor N, and a prototype pulse with length $L_g$ coefficients. This complexity does not take into account the operations required by the equalization stage. If, for example, we assume K = 64 and N = 80 for both systems, FMT with a pulse length equal to 20N, and CB-FMT with L = K and Q = N, resulting in a longer filter length equal to M = 64N coefficients, the receiver complexity will be equal to {45.8, 21.7} [cop/sample] for FMT and CB-FMT, respectively. This shows the complexity gain of CB-FMT despite its longer pulse. More results about the complexity are reported in Section 6.2.
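The numerical example above can be reproduced with a short calculation. The sketch assumes a DFT cost of α·M·log2(M) with α = 1.2, a CB-FMT cost per sample of (α·K·L·log2(L) + α·M·log2(M) + S)/(L·N), and an FMT polyphase cost per sample of (α·K·log2(K) + 2·L_g)/N. These expressions and the value of α are assumptions, chosen because they match the figures quoted in the text, and the function names are hypothetical.

```python
import math

ALPHA = 1.2  # assumed constant of the DFT cost alpha * M * log2(M)

def cbfmt_cop_per_sample(K, N, L, S, alpha=ALPHA):
    """K outer L-point DFTs, one inner M-point DFT, S weighting operations."""
    M = L * N
    total = alpha * K * L * math.log2(L) + alpha * M * math.log2(M) + S
    return total / M

def fmt_cop_per_sample(K, N, Lg, alpha=ALPHA):
    """Polyphase DFT filter bank with a prototype pulse of Lg coefficients."""
    return (alpha * K * math.log2(K) + 2 * Lg) / N

K, N, L = 64, 80, 64
M = L * N
S_rx = 2 * M - K * L            # receiver, pulse with Q non-zero coefficients
cb_rx = cbfmt_cop_per_sample(K, N, L, S_rx)
fmt_rx = fmt_cop_per_sample(K, N, 20 * N)
print(f"FMT: {fmt_rx:.1f} cop/sample, CB-FMT: {cb_rx:.1f} cop/sample")
```

Under these assumptions, the script returns approximately 45.8 and 21.7 complex operations per sample for FMT and CB-FMT, respectively.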

Orthogonality conditions and prototype pulse design
The frequency domain implementation of CB-FMT allows us to deduce the system orthogonality conditions. A filter bank system is orthogonal when it has the perfect reconstruction property and the transmit-receive filters are matched, i.e., $g(n) = h^*(-n)$ and $H(i) = G^*(i)$, so that the system exhibits neither ISI nor ICI [18]. When the prototype pulse satisfies the following two conditions, the CB-FMT system will be orthogonal [19]:
1. The M-point DFT of the prototype pulse has only Q non-zero coefficients, i.e., $G(i) = 0$ for $i \in \{Q, \ldots, M-1\}$ (sufficient condition).
2. The correlation between g(n) and $g^*(-n)$, computed with the circular convolution and sampled by a factor N, is the Kronecker delta, i.e.,
$$\sum_{m=0}^{M-1} g(m)\, g^*((m-nN)_M) = \delta(n), \quad n \in \{0, \ldots, L-1\}, \tag{27}$$
where δ(n) is the Kronecker delta function, i.e., δ(n) is equal to 1 for n = 0 and 0 otherwise.

Proof of the orthogonality conditions
To prove the orthogonality conditions, we proceed in two steps. Firstly, we prove the condition to have orthogonality between different sub-channels, i.e., no ICI. Then, we prove the condition to have orthogonality between the data symbols of each sub-channel, i.e., no ISI.

Sub-channel orthogonality
The sub-channels will be orthogonal if the M-point DFT of the prototype pulse, G(i), is equal to zero for i ∈ {Q, . . . , M − 1}. In this case, there is no ICI.

Block orthogonality
No ISI will be present between the L data symbols in each sub-channel block when the prototype pulses are matched and (27) holds.
Proof. Under the sub-channel orthogonality (previous condition) and with matched filters, (28) becomes (29). To have perfect reconstruction, we need to have [19]
$$\sum_{q=0}^{N-1} G(p+qL)\, G^*(p+qL) = N, \quad p \in \{0, \ldots, L-1\}. \tag{30}$$
In (30), the summation represents a periodic repetition of $G(p+qL)G^*(p+qL)$. In the time domain, this translates into sampling by a factor M/L = N. Thus, if we apply an L-point IDFT to (30), we will obtain (27).
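Both forms of the block orthogonality condition can be checked numerically for a given set of pulse DFT coefficients. The sketch below folds |G(i)|² with period L and, separately, evaluates the circular autocorrelation of g(n) at the lags nN through the IDFT of |G(i)|². The function names are hypothetical, and the normalization (folded energy equal to N, unit pulse energy) is an assumption consistent with a Kronecker delta of unit amplitude.

```python
import cmath

def folded_energy(G, L):
    """sum_q |G(p + qL)|^2 for p = 0, ..., L-1 (missing bins count as zero)."""
    energy = [0.0] * L
    for i, coeff in enumerate(G):
        energy[i % L] += abs(coeff) ** 2
    return energy

def sampled_autocorrelation(G, M, N):
    """Circular autocorrelation of g at lags nN, n = 0, ..., M/N - 1.

    Computed as the M-point IDFT of |G(i)|^2 evaluated at the lags nN.
    """
    L = M // N
    return [sum(abs(c) ** 2 * cmath.exp(2j * cmath.pi * n * N * i / M)
                for i, c in enumerate(G)) / M
            for n in range(L)]
```

For example, with M = 6, N = 3, L = 2 and G = [√1.5, √3, √1.5], the folded energy equals N = 3 for both values of p, and the sampled autocorrelation is 1 at lag 0 and 0 at lag N.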

Orthogonal prototype pulse design
The frequency domain implementation and the orthogonality conditions suggest synthesizing the pulse in the frequency domain with a finite number of frequency components.
We start by choosing a pulse that belongs to the Nyquist class with roll-off β, Nyquist frequency 1/(2NT), total bandwidth 1/(KT), and frequency response $\hat{G}(f)$. Then, the Q non-zero DFT coefficients G(i) are obtained by sampling $\hat{G}(f)$. It should also be noted that there is a limiting condition on the choice of the roll-off β. In fact, the maximum roll-off to prevent the pulse tails from exceeding the bandwidth 1/(KT) of Q points is $\beta_{max} = (Q - L)/Q$. Therefore, we must choose $\beta \leq \beta_{max}$.
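To make the idea of a frequency domain design concrete, the sketch below builds Q real coefficients whose squared magnitudes sum to a constant when folded with period L. It uses a sine/cosine taper on the Q − L edge bins, which is a hypothetical construction chosen for simplicity; it is not the Nyquist pulse with roll-off β described above, and it assumes L ≤ Q ≤ 2L. The function name `sine_taper_pulse` is likewise an assumption.

```python
import math

def sine_taper_pulse(Q, L):
    """Q real DFT coefficients with |G(p)|^2 + |G(p+L)|^2 = 1 on the edges.

    The first Q-L bins rise as a sine, the last Q-L bins fall as the
    matching cosine, and the bins in between are equal to 1.
    """
    if not L <= Q <= 2 * L:
        raise ValueError("this sketch assumes L <= Q <= 2L")
    D = Q - L                                   # number of transition bins per edge
    G = [1.0] * Q
    for i in range(D):
        theta = 0.5 * math.pi * (i + 0.5) / D
        G[i] = math.sin(theta)                  # rising edge
        G[i + L] = math.cos(theta)              # falling edge, L bins later
    return G
```

Since sin² + cos² = 1, folding the squared coefficients with period L gives 1 for every p; scaling the coefficients by √N then yields a pulse with unit energy.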
We recall that the parameters of CB-FMT are related to each other through the relation M = LN = KQ, which can be rearranged as
$$\frac{N}{K} = \frac{Q}{L} = \frac{p}{q},$$
where p and q are relatively prime integers. Therefore, once we have chosen M, the number of sub-channels K, and the ratio N/K, we obtain the rest of the parameters N, Q, L. Some examples of pulse responses are reported in Section 6.1. An optimal pulse design method that targets the maximization of the in-band-to-out-of-band pulse energy has been recently presented in [19]. In particular, complex asymmetric pulses are also considered. It is also interesting to note that a trivial orthogonal solution is obtained by using a rectangular FD window of Q non-zero coefficients [20]. In such a case, the CB-FMT scheme becomes the dual of the OFDM system that uses, instead, a rectangular window in the time domain.
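The parameter derivation can be written as a small helper that starts from M, K, and the ratio N/K and returns N, Q, and L, rejecting combinations that do not give integers. The function name `cbfmt_parameters` and the use of `fractions.Fraction` for the ratio are choices made for this sketch.

```python
from fractions import Fraction

def cbfmt_parameters(M, K, ratio):
    """Return (N, Q, L) such that M = L*N = K*Q and N/K = Q/L = ratio."""
    N = K * Fraction(ratio)
    if N.denominator != 1:
        raise ValueError("K times the ratio N/K must be an integer")
    N = int(N)
    if M % K or M % N:
        raise ValueError("M must be a multiple of both K and N")
    return N, M // K, M // N
```

For example, M = 5120, K = 64 and N/K = 5/4 give N = 80, Q = 80 and L = 64.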

PSD-related aspects
In this section, we study the power spectral density (PSD) of the transmitted CB-FMT signal. The PSD is an important aspect to evaluate the confinement of the transmitted spectrum. The objective is to limit the out-of-band emissions. More generally, the PSD must comply with regulatory requirements that typically set an upper limit, also known as spectrum mask, e.g., as in the IEEE 802.11 WLAN standard [21] or in the HomePlug PLC system [22].
To derive an analytic expression for the PSD of CB-FMT, we can start by expressing the (continuous) transmission of samples x(n) as in (32), where the sub-channel symbol period $M_1 = M + \mu$ takes into account the fact that a cyclic prefix (CP) of length μ samples can be used for the equalization, as it will be explained in Section 5. The CP is appended to the block of coefficients at the inner IDFT output. Furthermore, $X^{(k)}(mM_1)$ are the coefficients at the input of the inner IDFT in the transmitter at time instant $mM_1$, and $g_P(n)$ is the rectangular window that is equal to 1 for $n \in \{0, \ldots, M_1 - 1\}$ and zero otherwise. Essentially, $X^{(k)}(mM_1)$ represents the block of coefficients defined in (11) that is transmitted in the m-th CB-FMT block. The data symbols $a^{(k)}(\ell N)$ are assumed to be independent and identically distributed, with zero mean and equal power.
To convert the signal in (32) from discrete time to continuous time, an interpolation filter with response $g_I(t)$ is needed. The interpolated signal can be expressed as
$$x(t) = \sum_{n} x(n)\, g_I(t - nT). \tag{33}$$
To obtain the signal PSD, the correlation of x(t), namely $r_x(t, \tau) = E[x^*(t)\, x(t+\tau)]$, where E[·] is the expectation operator, needs to be computed. Since the interpolated signal is cyclo-stationary, the correlation is periodic [23], i.e., $r_x(t + T, \tau) = r_x(t, \tau)$. To remove the dependency on the variable t, the mean correlation $\bar{r}_x(\tau) = \frac{1}{T} \int_0^T r_x(t, \tau)\, dt$ has to be computed, from which the mean PSD is obtained via a Fourier transform. The mathematical expressions involved in the PSD computation are convoluted. In the following, we report the main steps to obtain the PSD. The final result is given in (43) to (45). The correlation can be written as
$$r_x(t, \tau) = \sum_{n} \sum_{m} E[x^*(n)\, x(m)]\, g_I^*(t - nT)\, g_I(t + \tau - mT), \tag{34}$$
where $E[x^*(n)x(m)]$ represents the correlation of the discrete time transmitted signal before the analog interpolation filter. To compute (34), we rewrite (32) as in (35), where $g_P^{(k)}(n) = g_P(n)\, e^{j2\pi nk/M}$.
The correlation of (35) is given by (36), where the term $r_X(k_1, \ldots)$ represents the correlation between the signal coefficients at the input of the inner IDFT of the transmitter, which is given in (37). The correlation in (36) is cyclo-stationary, i.e., $r(n, m) = r(n + M_1, m)$. Thus, we compute the mean correlation as in (38). Consequently, the mean PSD is given by (39), where R(f) is the discrete Fourier transform of (38).
Assuming that the prototype pulse DFT has only Q non-zero coefficients, (37) can be rewritten as in (40). Then, the PSD can be written as in (41), where $f_{k,q} = (q + kQ)/(MT)$, $S = \{(x, y) \mid x, y \in \{0, \ldots, Q-1\},\ x - y \in \{-L, 0, L\}\}$, and $G_P(f)$ is the periodic sinc function defined in (42). In (41), we can split the summation with indexes $(q_1, q_2)$ into two sums and thus into two resulting terms, as in (43). For $q_1 = q_2$, we obtain the term $P_1(f)$ in (44), while for $q_1 \neq q_2$, we obtain the term $P_2(f)$ in (45), where $G_I(f)$ is the Fourier transform of the analog interpolation filter. The first term, $P_1(f)$, is a sum of sinc functions, each centered in $f_{k,q}$ and weighted by the prototype pulse DFT coefficients. The main lobe of the sinc function has a bandwidth equal to $1/((M + \mu)T)$. The second term, $P_2(f)$, is related to the correlation between the signal coefficients X(i) at the input of the inner IDFT, defined in (11). In detail, we may reconsider (11). In fact, such coefficients can be written as in (46) and (47), with $i \in \{kQ, \ldots, (k+1)Q-1\}$. The correlation between (46) and (47) will not be null if Q > L. This is due to the fact that the block of coefficients $A^{(k)}(i)$, at the output of the outer DFT, is cyclically extended. Equation (45) takes this correlation into account.

PAPR-related aspects
The peak-to-average power ratio (PAPR) is a measure of the transmitted signal x(t), defined as
$$\mathrm{PAPR} = \frac{\max_t |x(t)|^2}{E\left[|x(t)|^2\right]}. \tag{48}$$
The PAPR indicates how much the signal peak power is higher than the mean power value. A signal with high PAPR exhibits high dynamic range. Consequently, this poses a challenge to the analog components of the front end, which may introduce distortions. For example, if the signal exceeds the power amplifier dynamic range, the output signal will be clipped to the supply voltage level. In turn, unintentional out-of-band interference due to spurious emissions is generated, and the signal distortion may cause a performance loss in the receiver stage. In OFDM, the high PAPR is a known drawback. It grows with the number of sub-channels [24]. Generally, the PAPR cannot be expressed in closed form.
In OFDM, |x(t)| can be approximately modeled as a Rayleigh process as shown in [25] so that pseudo-closed expressions for the distribution of the PAPR can be derived. In CB-FMT, the problem is more complex, so that we have to resort to a numerical approach to evaluate the PAPR as it will be discussed in Section 6.3.
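A numerical PAPR estimate on one block of samples can be sketched as the ratio between the peak and the mean instantaneous power. This is only an illustration: the function name `papr_db` is hypothetical, the mean power is estimated over the given block instead of through an expectation, and discrete time samples approximate the continuous time PAPR well only if the signal is sufficiently oversampled.

```python
import math

def papr_db(x):
    """PAPR in dB of a block of (complex) samples."""
    powers = [abs(v) ** 2 for v in x]
    mean_power = sum(powers) / len(powers)
    return 10 * math.log10(max(powers) / mean_power)
```

A constant envelope block gives 0 dB, while a block with a single non-zero sample out of four gives 10·log10(4) ≈ 6.02 dB.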

Equalization in time-variant frequency-selective channels
The orthogonality conditions, described in Section 3, render CB-FMT free from interference when the channel is static and has a flat frequency response. When the channel is frequency selective and/or time variant, some interference may be present. However, orthogonality can be restored with an equalizer. From the frequency domain implementation of CB-FMT (Figure 3), we note that the chain comprising the M-point inner IDFT at the transmitter, the transmission medium, and the M-point inner DFT at the receiver is similar to the OFDM system. This suggests appending a CP of μ samples to the transmitted block of samples, as shown in Figure 4.
To proceed, let us denote the time-variant channel response as in (49), where P < μ is the channel impulse response length in samples, and $\alpha_s(n)$ is the s-th channel tap at time instant n. Then, the received signal can be written as
$$y(n) = \sum_{s=0}^{P-1} \alpha_s(n)\, x(n-s) + \eta(n). \tag{50}$$
To simplify the notation, we focus on the first received block, without loss of generality. After CP removal, under the assumption that the channel duration (in samples) is shorter than the CP, (50) becomes a circular convolution between the transmitted signal and the channel impulse response, which can be written in matrix form as in (51). If we apply an M-point DFT to (51), which is what the receiver does through the inner DFT, we will obtain (52) and (53), where $\hat{H}_{CH}$ is the frequency domain channel matrix and X is the vector of coefficients at the input of the inner IDFT at the transmitter side. Essentially, (53) describes the relation that exists between the coefficients at the input of the inner IDFT at the transmitter side and the coefficients at the output of the inner DFT at the receiver side. Such a relation suggests the use of a frequency domain equalizer applied at the output of the receiver inner DFT, as shown in Figure 4. To proceed, we need to derive an expression for the elements of the matrix $\hat{H}_{CH}$.
We start from (50). Without loss of generality, we can extend the sum from P to M by zero-padding the channel impulse response. The block of received samples can be written as in (54), where $H_1(p, n)$ is the M-point DFT of the channel impulse response at time instant n, i.e., computed along the tap index. By computing the M-point DFT of (54), we obtain the elements of the vector in (53), as given in (55). Finally, it follows that the elements of $\hat{H}_{CH}$ are defined as in (56). To derive the FD equalizer, we distinguish between the case of having a time-invariant channel and the case of having a time-variant channel.
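The role of the cyclic prefix can be verified with a short experiment: a block is extended with its last μ samples, passed through a time-invariant FIR channel (linear convolution), and stripped of the prefix; the result coincides with the circular convolution of the block with the channel taps. The function names are hypothetical, noise is omitted, and the channel is assumed static with memory not exceeding μ.

```python
def add_cyclic_prefix(block, mu):
    """Prepend the last mu samples of the block to the block itself."""
    return list(block[len(block) - mu:]) + list(block)

def linear_channel(tx, taps):
    """Noiseless time-invariant FIR channel, output truncated to len(tx)."""
    return [sum(taps[s] * tx[n - s] for s in range(len(taps)) if n - s >= 0)
            for n in range(len(tx))]

def circular_convolution(block, taps):
    """Circular convolution of the block with the channel taps."""
    M = len(block)
    return [sum(taps[s] * block[(n - s) % M] for s in range(len(taps)))
            for n in range(M)]

block = [1, 2, 3, 4]
taps = [1.0, 0.5, 0.25]          # P = 3 channel taps
mu = 3                           # cyclic prefix length in samples
rx = linear_channel(add_cyclic_prefix(block, mu), taps)
after_cp_removal = rx[mu:]
```

Here `after_cp_removal` matches `circular_convolution(block, taps)` sample by sample, which is what allows the inner DFT to diagonalize a static channel.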

Time-invariant channel
When the channel is time-invariant, the channel impulse response $h_{CH}$ does not depend on the time instant n. Thus, $H_2(p, q)$ is not null only for q = 0. Consequently, the channel matrix $\hat{H}_{CH}$ is a diagonal matrix, and each M-point DFT output coefficient at the receiver depends only on the corresponding coefficient at the input of the transmitter inner IDFT, scaled by the associated diagonal element of $\hat{H}_{CH}$, plus noise. This shows the absence of ICI, i.e., of interference among the sub-channels. Therefore, a simple 1-tap frequency domain equalizer can be applied [11].
In particular, with zero forcing, the i-th inner DFT output coefficient is multiplied by $H_{EQ,ZF}(i)$, the i-th coefficient of the FD zero forcing equalizer, i.e., the reciprocal of the i-th diagonal element of $\hat{H}_{CH}$. In Figure 4, the matrix $H_{EQ,k}$, associated with the k-th sub-channel equalizer, is diagonal. Its p-th diagonal element is equal to $H_{EQ,ZF}(p + kQ)$. In such a case, perfect orthogonality is achieved in the system. That is, after zero forcing equalization, the sub-channel signal is multiplied by the conjugate of the pulse frequency response $G^*(p)$, and it is finally processed by the other stages depicted in Figure 3. Then, the output reads as in (20). Alternatively, the equalizer coefficients can be designed according to the MMSE principle, in which case they also depend on the noise variance $\sigma^2$. This solution provides better performance at low signal-to-noise ratios than the zero forcing solution.
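The 1-tap equalization described above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes unit-power 4-PSK values at the inner IDFT input, a hypothetical static channel with random taps, noise added directly in the frequency domain, and the textbook ZF and MMSE coefficient expressions for unit signal power. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
M, mu, P = 320, 8, 5        # block size, CP length, channel length (P < mu)
sigma2 = 0.01               # assumed noise variance per FD coefficient

# Unit-power 4-PSK values stand in for the inner-IDFT input vector X.
X = (rng.choice([-1, 1], M) + 1j * rng.choice([-1, 1], M)) / np.sqrt(2)
x = np.fft.ifft(X)                         # inner IDFT
x_cp = np.concatenate([x[-mu:], x])        # cyclic prefix of mu samples

# Hypothetical static channel with P random taps.
h = (rng.standard_normal(P) + 1j * rng.standard_normal(P)) / np.sqrt(2 * P)
y = np.convolve(x_cp, h)[mu:mu + M]        # linear convolution, then CP removal
Y_clean = np.fft.fft(y)                    # inner DFT (noise-free)
H = np.fft.fft(h, M)                       # diagonal of the FD channel matrix

# Noise is added in the FD for simplicity (an assumption of this sketch).
N = np.sqrt(sigma2 / 2) * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
Y = Y_clean + N

W_zf = 1.0 / H                                      # zero forcing coefficients
W_mmse = np.conj(H) / (np.abs(H) ** 2 + sigma2)     # MMSE, unit signal power
X_zf = W_zf * Y
X_mmse = W_mmse * Y

print("ZF   mean squared error:", np.mean(np.abs(X_zf - X) ** 2))
print("MMSE mean squared error:", np.mean(np.abs(X_mmse - X) ** 2))
```

Because the CP is longer than the channel, `Y_clean` equals `H * X` element by element, which is the diagonal relation that makes the 1-tap equalizer sufficient in the static case.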

Time-variant channel
When the channel is time variant, the channel matrix $\hat{H}_{CH}$ has non-zero elements outside the main diagonal. The number of non-zero elements off the diagonal grows with the channel Doppler spread. The q-th inner DFT output coefficient at the receiver can be written as in (61). Relation (61) shows that a simple 1-tap equalizer cannot fully remove the interference introduced by the time-variant channel, which is represented by the second additive term in (61). Thus, we propose to use a sub-channel block equalizer that mitigates the interference among the L symbols transmitted in each of the K sub-channels, exploiting the fact that the ICI between distinct sub-channels is small due to their good frequency response confinement.
We start by splitting the matrix $\hat{H}_{CH}$ into blocks of Q × Q elements, so that (53) can be written in the block form (62), where $X_k$ and $Y_k$ are Q × 1 vectors whose elements are the Q coefficients associated with the k-th sub-channel, and N is the background noise vector. $B_{i,j}$ is the Q × Q block of $\hat{H}_{CH}$ in block-row i and block-column j. The k-th sub-vector in (62) can be written so as to separate the term of interest from the interference. The k-th sub-channel block equalizer output vector is then obtained by applying to $Y_k$ a sub-channel equalizer matrix computed according to the MMSE criterion. Such a matrix is obtained from the Q × Q correlation matrix between the vector $X_k$ and the vector $Y_k$, and from the Q × Q autocorrelation matrix of $Y_k$. It should be noted that the signals $X_i$, $i \neq k$, are treated as noise by the k-th sub-channel block equalizer, since the interference that they generate is small due to the sub-channel spectral confinement.
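The block equalizer can be sketched numerically as follows. This is an illustration under stated assumptions rather than the paper's exact procedure: the time variation is a toy frequency offset `eps` (a stand-in for Doppler, not Clarke's model), the coefficients in each $X_i$ are assumed i.i.d. with unit power, the FD noise is assumed white with variance `sigma2`, and the helper names `block_mmse` and `mse` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
M, K, P = 32, 4, 3          # toy sizes (assumed), Q = M / K
Q = M // K
sigma2 = 0.01               # assumed FD noise variance

# Toy time-variant taps alpha[s, n]: static taps rotated by a frequency offset.
n = np.arange(M)
eps = 0.3
taps = (rng.standard_normal(P) + 1j * rng.standard_normal(P)) / np.sqrt(2 * P)
alpha = taps[:, None] * np.exp(2j * np.pi * eps * n / M)[None, :]

# Time-domain channel matrix after CP removal:
# row n holds alpha[s, n] in column (n - s) mod M.
H_t = np.zeros((M, M), dtype=complex)
for s in range(P):
    H_t[n, (n - s) % M] = alpha[s, n]

F = np.fft.fft(np.eye(M))             # M-point DFT matrix
H_hat = F @ H_t @ F.conj().T / M      # FD channel matrix (F^{-1} = F^H / M)

def block_mmse(H_hat, k, Q, sigma2):
    """MMSE matrix R_xy R_yy^{-1} for sub-channel k (unit-power i.i.d. inputs)."""
    rows = slice(k * Q, (k + 1) * Q)
    B_row = H_hat[rows, :]                              # all blocks B_{k,i}
    R_xy = H_hat[rows, rows].conj().T                   # E[X_k Y_k^H] = B_{k,k}^H
    R_yy = B_row @ B_row.conj().T + sigma2 * np.eye(Q)  # E[Y_k Y_k^H]
    return R_xy @ np.linalg.inv(R_yy)

def mse(W, H_hat, k, Q, sigma2):
    """Expected squared error E||W Y_k - X_k||^2 under the same assumptions."""
    rows = slice(k * Q, (k + 1) * Q)
    B_row = H_hat[rows, :]
    R_yy = B_row @ B_row.conj().T + sigma2 * np.eye(Q)
    B_kk = H_hat[rows, rows]
    return np.real(np.trace(W @ R_yy @ W.conj().T) - 2 * np.trace(W @ B_kk)) + Q

k = 1
d = np.diag(H_hat)[k * Q:(k + 1) * Q]
W_1tap = np.diag(np.conj(d) / (np.abs(d) ** 2 + sigma2))   # 1-tap MMSE
W_block = block_mmse(H_hat, k, Q, sigma2)

off_diag_fraction = 1 - np.sum(np.abs(np.diag(H_hat)) ** 2) / np.sum(np.abs(H_hat) ** 2)
mse_1tap = mse(W_1tap, H_hat, k, Q, sigma2)
mse_block = mse(W_block, H_hat, k, Q, sigma2)
print(off_diag_fraction, mse_1tap, mse_block)
```

Since the 1-tap equalizer is a diagonal special case of a Q × Q linear equalizer, the block MMSE solution cannot have a larger expected error under the same model; the gap grows as more energy leaks off the diagonal of `H_hat`.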
After the sub-channel equalization, the output coefficients are weighted with the prototype pulse FD coefficients $G^*(i)$ and, finally, processed by the other stages, as shown in Figures 3 and 4.

Numerical results

Pulse design examples
In Figure 5, we report two examples of pulses obtained with the method described in Section 3.2. Several combinations of parameters are considered. The pulses have been obtained starting from a root-raised-cosine spectrum with roll-off equal to 0.2. The pulses are designed for M = 320 and M = 640. Furthermore, K = 8 and N = 10 or K = 16 and N = 20 are considered, respectively.
In Table 1, we report the ratio between the in-band and the out-of-band energy of the interpolated prototype pulse for several choices of the parameters. Despite the simple design method, the pulses exhibit good frequency confinement which increases for larger values of M.
In all the numerical results that follow, the common configuration is M = 320 and K ∈ {8, 16, 32, 64}.

Complexity comparisons
In Figure 6, we show the complexity of OFDM, FMT, and CB-FMT as a function of the prototype pulse length (in samples), assuming that the pulse has Q non-zero DFT coefficients. For FMT, OFDM, and CB-FMT alike, the pulse length $L_g$ is set equal to M. It should be noted that OFDM uses a rectangular window of length M equal to the number of sub-channels. The complexity is presented in terms of cop/sample for different combinations of N and K. For CB-FMT, we show the complexity at the receiver side when a 1-tap equalizer is used. The figure shows that CB-FMT has significantly lower complexity than conventional FMT. Clearly, OFDM is the simplest solution. However, CB-FMT and OFDM have comparable complexity, i.e., CB-FMT is more complex than OFDM by a factor of about 1.5. As will be shown in the next sections, this extra complexity pays off, since CB-FMT can offer better PSD confinement, lower PAPR, and better performance in fading channels.

Power spectral density and PAPR
In this section, we consider the PSD and the PAPR of CB-FMT. For the OFDM system, the PSD derivation is reported in [26]. In Figure 7, we report an example of the PSD of CB-FMT with parameters K = 8, N = 10, Q = 40, L = 32 (therefore M = 320), β = 0.2, and a cyclic prefix of μ = 8 samples. The interpolation filter is a root-raised-cosine (RRC) pulse with roll-off equal to 0.1. If the interpolation pulse were ideal, i.e., if the filter were a perfect low-pass filter, the out-of-band emissions would be null. However, a real interpolation filter introduces out-of-band emissions. For the parameters specified, the ratio between the useful signal power and the out-of-band emission power is equal to 25.48 dB. In OFDM, assuming a number of sub-channels equal to K = 320, this ratio is equal to 22.80 dB. CB-FMT has a slightly better in-band/out-of-band power ratio due to its higher sub-channel frequency selectivity with respect to OFDM, under a comparable complexity assumption. If we set the number of sub-channels in OFDM equal to K = 8, its in-band/out-of-band power ratio decreases even further, to 20.1 dB.

We now consider the PAPR. The complementary cumulative distribution function (CCDF) of the PAPR for CB-FMT and OFDM is shown in Figure 8. The PAPR is influenced by the inner IDFT block size. In Figure 8a, we perform a comparison under similar complexities, i.e., the number of sub-channels in OFDM is set equal to K = 320, and in CB-FMT, we set M = 320 and K ∈ {4, 8, 16, 32}. In CB-FMT, the PAPR is significantly lower than in OFDM for low values of K and N. In Figure 8b, we perform a comparison under an equal number of sub-channels. In CB-FMT, we keep the IDFT block size equal to M = 320. In this case, OFDM outperforms CB-FMT due to its smaller inner IDFT size. In the simulations, a 4-PSK constellation is used for both systems. As shown in the next section, CB-FMT can offer higher spectral efficiency than OFDM with a smaller number of sub-channels. In turn, this makes it possible to obtain a lower PAPR.
In Table 2, the mean PAPR for CB-FMT is shown for M = 320 and several combinations of parameters. In OFDM, the mean PAPR is equal to 11.28 dB, i.e., higher than in CB-FMT for all the parameter combinations considered here.
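For reference, an empirical PAPR CCDF of the kind plotted in Figure 8 can be estimated as in the minimal sketch below. It covers only the OFDM baseline with K = 320 and 4-PSK, evaluated at the critical sampling rate without the interpolation filter, so its numbers are not expected to match those reported above. The number of blocks and the threshold grid are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 320                 # OFDM sub-channels (IDFT size)
blocks = 2000           # number of simulated blocks (arbitrary)

# Unit-power 4-PSK symbols, one row per OFDM block.
syms = (rng.choice([-1, 1], (blocks, K))
        + 1j * rng.choice([-1, 1], (blocks, K))) / np.sqrt(2)
x = np.fft.ifft(syms, axis=1)            # time-domain blocks

# PAPR per block: peak instantaneous power over mean power, in dB.
power = np.abs(x) ** 2
papr_db = 10 * np.log10(power.max(axis=1) / power.mean(axis=1))

# Empirical CCDF: fraction of blocks whose PAPR exceeds each threshold.
thresholds = np.arange(4.0, 13.0, 0.5)
ccdf = np.array([(papr_db > t).mean() for t in thresholds])

print("mean PAPR [dB]:", papr_db.mean())
for t, c in zip(thresholds, ccdf):
    print(f"P(PAPR > {t:4.1f} dB) = {c:.4f}")
```

The same two steps (per-block peak-to-mean ratio, then a thresholded fraction) apply to CB-FMT once the transmitted block is generated with the inner IDFT of size M.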

Performance in fading channels
In order to evaluate the performance of CB-FMT, we consider the transmission over a wireless fading channel. We consider both static and time-variant channels. In particular, the channel coefficients $\alpha_s(n)$ in (49) are modeled according to Clarke's isotropic scattering model [27]. Therefore, they are assumed to be independent stationary zero-mean complex Gaussian processes whose autocorrelation at time lag τ is proportional to $J_0(2\pi f_D \tau)$, where $f_D$ and $J_0(\cdot)$ are the maximum Doppler frequency and the zero-order Bessel function of the first kind, respectively. We assume an exponential power decay profile, i.e., the average power of the s-th tap is proportional to $e^{-s/\gamma}$, with a normalization constant chosen to obtain unit average power, where γ is the delay spread normalized to the sampling period. The channel impulse response is truncated at −10 dB.
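The power delay profile can be sketched as follows. This is one plausible reading, not a confirmed detail of the simulation setup: the −10 dB truncation is assumed to be relative to the first tap, and the function name `exp_power_profile` is hypothetical. The Doppler-induced time correlation is not modeled here; the code draws a single static realization.

```python
import numpy as np

def exp_power_profile(gamma, floor_db=-10.0):
    """Tap powers proportional to exp(-s / gamma), kept while they are at
    least floor_db relative to the first tap, normalized to unit total power."""
    s_max = int(np.floor(-gamma * np.log(10 ** (floor_db / 10))))
    s = np.arange(s_max + 1)
    omega = np.exp(-s / gamma)
    return omega / omega.sum()

rng = np.random.default_rng(0)
for gamma in (1.0, 2.0, 3.0):
    omega = exp_power_profile(gamma)
    # One static Rayleigh realization: zero-mean complex Gaussian taps.
    taps = np.sqrt(omega / 2) * (rng.standard_normal(omega.size)
                                 + 1j * rng.standard_normal(omega.size))
    print(f"gamma = {gamma}: {omega.size} taps, powers = {np.round(omega, 3)}")
```

Under this reading, the three delay spreads considered in the text give 3, 5, and 7 taps, all shorter than the 8-sample CP used for CB-FMT.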
First, we show the performance in terms of average SER versus signal-to-noise ratio (SNR) varying the delay spread γ , considering CB-FMT and OFDM both using a CP and a 1-tap MMSE equalizer at the receiver. 4-PSK modulation is assumed. Then, we show the maximum achievable data rate as a function of Doppler spread for different SNR values.
In Figure 9a, the CB-FMT system has parameters K = 8, N = 10, M = 320, and a CP of length 8 samples. For the OFDM system, we consider a number of sub-channels equal to K = 64, as in the IEEE 802.11 WLAN standard. In OFDM, the CP length is set equal to 18 samples so that the two systems have an identical transmission rate assuming an identical transmission bandwidth. We consider channels with normalized delay spread γ ∈ {1, 2, 3} and no Doppler spread (static channels). The results reveal that CB-FMT can significantly lower the SER, especially for high values of the delay spread γ. A 10-dB SNR gain is found at SER = $10^{-4}$. This is due to the fact that CB-FMT in conjunction with the MMSE equalizer can exploit the frequency diversity introduced by the channel; thus, the more dispersive the channel, the higher the gain for CB-FMT. In OFDM, the performance is identical for all values of γ considered, since the sub-channels see flat Rayleigh fading [17].
In Figure 9b, we show the SER for several combinations of parameters in CB-FMT. In all cases, CB-FMT has M = 320, K ∈ {8, 16, 32, 64}, N ∈ {10, 20, 40, 80}, and a CP equal to 8 samples, so that it has the same data rate as OFDM. The normalized delay spread is set to γ = 2. The SER grows with the number of sub-channels K, and it approaches that of OFDM, i.e., the performance of 4-PSK in flat Rayleigh fading. This is because when K increases, Q = M/K decreases and, consequently, the ability to coherently capture the sub-channel energy (thus exploiting diversity) with the MMSE equalizer is reduced.
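The equal-rate claim for the CP lengths quoted above can be checked with a few lines of arithmetic. The sketch assumes that each CB-FMT block carries K·L data symbols with L = M/N (consistent with K = 8, N = 10, L = 32, M = 320 given earlier) and that each OFDM block carries one symbol per sub-channel; the function name is illustrative.

```python
import math

def symbols_per_sample(K, L, block_len, cp_len):
    """Data symbols carried per transmitted sample, CP included."""
    return K * L / (block_len + cp_len)

M, cp_cbfmt = 320, 8
configs = [(8, 10), (16, 20), (32, 40), (64, 80)]       # (K, N) pairs

# CB-FMT: K sub-channels, L = M / N symbols each, block of M + CP samples.
cbfmt_rates = {(K, N): symbols_per_sample(K, M // N, M, cp_cbfmt)
               for K, N in configs}

# OFDM: 64 sub-channels, one symbol each, block of 64 + 18 samples.
ofdm_rate = symbols_per_sample(64, 1, 64, 18)

for (K, N), r in cbfmt_rates.items():
    print(f"CB-FMT K={K:2d}, N={N:2d}: {r:.4f} symbols/sample")
print(f"OFDM   K=64, CP=18: {ofdm_rate:.4f} symbols/sample")
```

Every listed CB-FMT configuration carries 256 symbols in 328 samples, and OFDM carries 64 symbols in 82 samples; both ratios reduce to 32/41, which is why an 18-sample CP equalizes the rates.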
In Figure 10, we show the average maximum achievable rate (Shannon capacity [23]) assuming time-variant frequency-selective fading and additive white Gaussian noise. The system parameters for OFDM and CB-FMT are equal to those assumed for the SER analysis in Figure 9a. CB-FMT with single-tap equalization provides a higher achievable rate than OFDM for a Doppler below 400 Hz. For higher Doppler, the performance is dominated by the interference. Therefore, the MMSE sub-channel block equalizer provides significantly higher performance than the single-tap equalizer. In particular, at the maximum Doppler considered, which is equal to 4 kHz, the gain in achievable rate of CB-FMT over OFDM is 6% and 20% for an SNR equal to 15 and 25 dB, respectively. This shows that CB-FMT has the potential to outperform OFDM also in the presence of channel time variations introduced by the mobility of the nodes. The gains in Figure 10 are due to the fact that CB-FMT is more robust to channel time variations, thanks to the use of frequency-confined pulses that lower the ICI compared to OFDM. Furthermore, as also shown by the SER curves, CB-FMT with the FD equalizer can exploit the sub-channel frequency diversity.

Conclusions
In this paper, a filter bank architecture referred to as cyclic block filtered multitone (CB-FMT) modulation is presented. This scheme can be derived from the FMT architecture philosophy. However, linear convolutions are substituted with circular convolutions, and data are processed in blocks, which justifies the acronym CB-FMT. The efficient implementation of CB-FMT in the frequency domain has been discussed, and the performance analysis has been carried out. The main conclusions can be summarized as follows:
• The computational complexity analysis shows that CB-FMT can significantly lower the complexity compared to conventional FMT, even with longer pulses.
• The orthogonal CB-FMT design can be done in the frequency domain, and a simple pulse design procedure can be followed by sampling a band-limited Nyquist pulse in the FD. Optimal frequency-localized orthogonal pulses for CB-FMT can also be designed in the frequency domain, as recently shown in [19].
• The orthogonal CB-FMT transmitted signal shows high frequency compactness and potentially lower PAPR than OFDM if a lower number of data sub-channels is used (while still offering the same or higher spectral efficiency).
• Sub-channel FD MMSE equalization provides good performance in doubly selective fading channels. In particular, a lower symbol error rate and a higher spectral efficiency than OFDM in multipath time-variant fading channels have been found, depending on the choice of parameters.