Preamble-based channel estimation in single-relay networks using FBMC/OQAM

Preamble-based channel estimation in filter bank-based multicarrier (FBMC) systems using offset quadrature amplitude modulation (OQAM) has been extensively studied in the last few years, due to the many advantages this modulation scheme can offer over cyclic prefix (CP)-based orthogonal frequency division multiplexing (OFDM) and in view of the interesting challenges posed to the channel estimator by the interference effect inherent in such an FBMC system. In particular, preambles of short duration and of both the block (full) and comb (sparse) types were designed so as to minimize the channel estimation mean squared error (MSE) subject to a given transmit energy. In light of the important role that relay-based cooperative networks are expected to play in future wireless communication systems, it is of interest to consider FBMC/OQAM, and in particular questions associated with preamble-based channel estimation, in such a context as well. The goal of this paper is to address these problems and come up with optimal solutions that extend existing results to a single relay-based cooperative network. Both low and medium frequency selective channels are considered. In addition to optimal preamble and estimator design, the equalization/detection task is studied, shedding light on a relay-generated interference effect and proposing a simple way to overcome it. The reported simulation results corroborate the analysis and reveal interesting behavior with respect to channel frequency selectivity and signal-to-noise ratio.


Introduction
Future wireless communication systems are expected to adhere to very stringent requirements including high data rates, extended coverage, efficient interference handling and the quality of service anticipated by the end-user. The idea of cooperation is expected to play a key role towards meeting the aforementioned demands. Some examples are the cooperation of multiple base stations and the application of relays in order to mitigate interference and increase the service at the cell edges [1,2]. Cooperation is also expected to be present in infrastructure-less networks like in ad-hoc and sensor networks [3,4].
By utilizing relaying nodes, cooperative communication systems are able to offer capacity and spatial diversity gains with simple single-antenna terminals [3,5]. As in single-link systems, multipath is commonly combatted via the adoption of cyclic prefix (CP)-based orthogonal frequency division multiplexing (OFDM), which is known to be able (under ideal conditions) to transform the channel into a set of parallel flat subchannels with independent noises. This greatly simplifies the receiver's tasks such as channel estimation and equalization [6]. However, the use of CP entails a power and spectral efficiency loss (which could be as high as 25%). Moreover, the subcarrier filters, though perfectly localized in time, spread out in the frequency domain, resulting in spectral leakage. This is responsible for the system's increased sensitivity to frequency offsets, Doppler effects and difficulties in user synchronization. Notably, the latter is of great importance in cooperative systems, where synchronization is a very critical issue [7].
*Correspondence: maurokef@ceid.upatras.gr. 1 Computer Technology Institute and Press "Diophantus", Patras University Campus, Patras 26504, Greece. Full list of author information is available at the end of the article.
Multicarrier schemes based on filter banks (FBMC) have recently shown the potential of overcoming such drawbacks [8,9], thus providing an attractive alternative to OFDM, at the cost of some additional complexity and delay [10]. When combined with offset quadrature amplitude modulation (OQAM), prototype filters with good localization in both time and frequency are possible, resulting in the so-called FBMC/OQAM modulation scheme [11]. The latter avoids the use of CP and has the potential of a maximum spectral efficiency while facilitating the accommodation of multiple asynchronous users. Recently, impressive improvements in the throughput of cognitive radio relaying networks employing FBMC/OQAM were demonstrated over their CP-OFDM counterparts [12].
http://asp.eurasipjournals.com/content/2014/1/66
However, FBMC/OQAM suffers from an imaginary intercarrier/intersymbol interference, which complicates receiver tasks that can be straightforward in CP-OFDM. Channel estimation is one of them. A multitude of training designs and associated channel estimation methods are known today for FBMC/OQAM-based systems [13]. The design of optimal preambles for the purpose of estimating the channel in FBMC/OQAM single-antenna single-link channels was investigated in [14] (see also [13,15]). Both full (i.e., with pilots at all the subcarriers) and sparse (i.e., with isolated pilot subcarriers surrounded by nulls) preambles (see endnote a) were considered and their performances were analyzed. FBMC-based techniques were shown to outperform CP-OFDM, particularly when a full preamble is employed.
This paper aims at addressing this problem for the first time in a cooperative network. To this end, the simple yet important system of Figure 1 is considered. Single-antenna transmitters and receivers are assumed. A single one-way half-duplex relay is employed to assist the transmission, following a two-phase amplify-and-forward (AF) protocol. In the first phase, the source transmits to the relay and the destination. In the second phase, the source transmits a new piece of information to the destination and the relay forwards to the destination an amplified version of the signal transmitted by the source during the first phase. This allows the first-phase signal to be received through two different links, thus enhancing the diversity of the system. In a manner analogous to a CP-OFDM-based system, filter banks are employed at the relay terminal to help amplify the received signal per subcarrier. However, the aforementioned imaginary interference, along with the real nature of the input symbols versus the complex nature of the filter bank and the wireless channel, complicates processing at the relay and the destination, and hence, these need to be appropriately adapted to the characteristics of the FBMC modulation employed. The aim is to estimate the channels in both paths leading to the destination node.
The problem of the optimal sparse preamble design for a CP-OFDM-based system of this type was recently studied in [16]. Optimality was defined in terms of the mean squared error (MSE) of the least squares (LS) channel estimator subject to a constraint on the total transmitted energy. The same problem is addressed in this paper, but for the more challenging case where FBMC modulation is used. An approach similar to the one of [16] and [14] is followed. The resulting optimality conditions are analogous to those derived in [16] and dictate that the source should allocate the whole of its training energy to the first phase, to equispaced and equipowered pilots [17]. Moreover, the relay should also uniformly allocate its energy to the pilot subcarriers to forward the corresponding training signal. The reported simulation results corroborate the analysis and demonstrate a performance similar to that of the CP-OFDM system. An interesting question, which stems from the intrinsic interference effect, arises when detecting the transmitted signal at the destination in the second phase. To answer it, a simple interference cancellation idea is developed and tested.
In sparse preambles, the pilot symbols are guarded by the surrounding nulls and therefore do not interfere with each other. As a result, no pilot symbol energy increase is present at the received signals (as observed in the case of full preambles) and the system turns out to be similar to that based on CP-OFDM in terms of both design conditions and estimation performance. It is thus of interest to also investigate the full preamble case where the situation is quite different and more challenging. In such a scenario, neighboring pilot symbols interfere with each other, resulting, effectively, in an energy increase of each pilot symbol at the receiver [18]. This can be advantageous as it attenuates the noise and results in a more accurate estimate for the channel [19]. For the FBMC/OQAM single-link case, it was shown in [15] that equal symbols maximize this energy increase, offering an estimation performance superior to that of CP-OFDM for practical signal-to-noise ratio (SNR) values. Analogous results about the optimal values of the pilot symbols are shown to hold in the present context [20]. The commonly made assumption of (almost) flat subchannels [13] will be adopted here, for the sake of simplicity. Let us recall that this holds true for channels that are not too frequency selective relative to the size of the filter bank. An additional assumption underlying classical FBMC/OQAM channel estimation techniques and aiming at their simplification is that the coherence bandwidth of the channel is large enough to consider the channel frequency response (CFR) invariant over a neighborhood of the subcarrier of interest [18]. Solutions will also be given here when this assumption is relaxed and the differences in the results obtained will be discussed. As a byproduct, the optimal full preamble design for the CP-OFDM-based cooperative system will be derived, through its connection to the FBMC/OQAM system.
Simulation results are presented for both mildly and highly frequency selective channels, which corroborate the analysis and demonstrate significant performance gains of the FBMC/OQAM full preamble-based channel estimator over its CP-OFDM counterpart, particularly at practical SNR values.
The rest of the paper is organized as follows. The FBMC/OQAM modulation system and the cooperative communications system employed here are described in Section 2. The problems of (a) sparse and (b) full preamble design are addressed in Sections 3 and 4, respectively. Simulation results verifying our theoretical analysis are presented in Section 5 and concluding remarks are made in Section 6.
Notations: In the following, bold lower case and upper case letters denote column vectors and matrices, respectively, unless otherwise stated. $\mathbf{F}$ denotes the DFT matrix of appropriate order. $\mathbf{X}^T$, $\mathbf{X}^H$ and $\mathbf{X}^{-1}$ denote transposition, conjugate transposition and inversion of $\mathbf{X}$. Moreover, $\mathbf{X}^{-H} = (\mathbf{X}^{-1})^H$. $\mathrm{diag}(\mathbf{x})$ is a diagonal matrix with $\mathbf{x}$ on its main diagonal, $\mathrm{Tr}(\mathbf{X})$ is the trace of $\mathbf{X}$ and $E\{\cdot\}$ denotes statistical expectation. Moreover, $\|\cdot\|$ denotes the Euclidean norm of a vector, $c^*$ the conjugate of the complex number $c$ and $\odot$ means elementwise multiplication. $\mathbf{0}$ denotes a zero matrix or vector of appropriate size. Finally, $\mathbf{x} \sim \mathcal{CN}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ ($x \sim \mathcal{CN}(\mu, \sigma^2)$) denotes a complex Gaussian random vector (scalar) with mean $\boldsymbol{\mu}$ ($\mu$) and covariance matrix $\boldsymbol{\Sigma}$ (variance $\sigma^2$).

System description
The following two sections will provide all the information about FBMC/OQAM and the cooperative system under consideration that is required for the sequel.

The FBMC/OQAM system
The output signal of the synthesis filter bank (SFB) is given by [11]

$$ s(l) = \sum_{m=0}^{M-1} \sum_{n} a(m,n)\, g_{m,n}(l), \tag{1} $$

where $a(m,n)$ are real OQAM symbols, produced by the complex to real OQAM modulator (C2R block in Figure 1) and

$$ g_{m,n}(l) = g\left(l - n\frac{M}{2}\right) e^{j\frac{2\pi}{M} m \left(l - \frac{L_g - 1}{2}\right)} e^{j\varphi(m,n)}, \tag{2} $$

with $g$ being a real symmetric prototype filter impulse response of length $L_g = MK$ and unit energy. $M$ is the (even) number of subcarriers, $K$ is the overlapping factor and $\varphi(m,n) = (m+n)(\pi/2) - mn\pi$ [11]. Finally, the pair $(m,n)$ corresponds to a frequency-time (FT) point with subcarrier index $m$ and time index $n$. The signal $s(l)$ is transmitted through a frequency selective channel of length $L_h$ that is modeled by the impulse response $\mathbf{h} = [h(0), h(1), \ldots, h(L_h - 1)]^T$. Applying the commonly made assumption that the channel is (approximately) frequency flat at each subcarrier [18], the signal at the FT point $(p,q)$, after the receiver's analysis filter bank (AFB), is given by [14]

$$ y(p,q) = H(p)a(p,q) + j \sum_{(m,n) \neq (p,q)} H(m)\, a(m,n)\, \langle g \rangle_{m,n}^{p,q} + \eta(p,q), \tag{3} $$

where $H(p)$ is the CFR at the point $(p,q)$. The noise term $\eta(p,q) = \sum_l w(l) g^*_{p,q}(l)$ is a filtered version of the complex Gaussian channel noise $w(l)$ at the output of the $p$th subchannel. Assuming that $w(l)$ is independent and identically distributed as $w(l) \sim \mathcal{CN}(0, \sigma^2)$, then $\eta(p,q)$ is also $\mathcal{CN}(0, \sigma^2)$. However, now $\eta(p,q)$ is correlated among adjacent subcarriers (see, e.g., [15] and [21]).
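As a small illustration of the phase term $\varphi(m,n) = (m+n)(\pi/2) - mn\pi$ quoted above, the following sketch tabulates $e^{j\varphi(m,n)}$ on a 4 x 4 frequency-time grid and compares it with the equivalent closed form $j^{m+n}(-1)^{mn}$. The function name and the grid size are illustrative choices, not part of the paper.

```python
import numpy as np

def oqam_phase_factor(m, n):
    """Return exp(j*phi(m, n)) with phi(m, n) = (m + n)*pi/2 - m*n*pi."""
    phi = (m + n) * np.pi / 2 - m * n * np.pi
    return np.exp(1j * phi)

# Tabulate the factor on a small frequency-time grid (rows: m, columns: n).
m_idx, n_idx = np.meshgrid(np.arange(4), np.arange(4), indexing="ij")
phase_grid = oqam_phase_factor(m_idx, n_idx)

# Same quantity written as j^(m+n) * (-1)^(m*n).
closed_form = (1j) ** (m_idx + n_idx) * (-1.0) ** (m_idx * n_idx)

# The factor is real when m + n is even and purely imaginary otherwise.
even_points = (m_idx + n_idx) % 2 == 0
```

The alternation between real and imaginary factors on neighboring FT points is what makes the interference seen by a real symbol purely imaginary.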
The summation term in (3) is the associated intrinsic interference coming from the neighboring FT points and weighted by

$$ \sum_l g_{m,n}(l)\, g^*_{p,q}(l) = j \langle g \rangle_{m,n}^{p,q}. \tag{4} $$

It is often assumed that, for time-frequency well-localized pulses $g(\cdot)$, this interference is limited to the first-order neighborhood $\Omega_{p,q}$ around $(p,q)$, i.e. $\Omega_{p,q} = \{(p, q \pm 1), (p \pm 1, q), (p \pm 1, q \pm 1)\}$, and (3) becomes

$$ y(p,q) = H(p)a(p,q) + j \sum_{(m,n) \in \Omega_{p,q}} H(m)\, a(m,n)\, \langle g \rangle_{m,n}^{p,q} + \eta(p,q). \tag{5} $$

Finally, (5) can be further simplified if the neighboring CFRs are additionally assumed equal to each other ($H(p) \approx H(p-1) \approx H(p+1)$). This is the case when the channel order is much smaller than the number of subchannels or else for channels with large coherence bandwidth. In this case, (5) can be written as

$$ y(p,q) = H(p)\, b(p,q) + \eta(p,q), \tag{6} $$

where

$$ b(p,q) = a(p,q) + j \sum_{(m,n) \in \Omega_{p,q}} a(m,n)\, \langle g \rangle_{m,n}^{p,q} \tag{7} $$

is the virtually transmitted symbol.
For preamble-based channel estimation, the transmitter sends either a sparse or a full preamble at the beginning of a frame, to assist the receiver in estimating the channel. Preambles consisting of two FBMC symbols will be considered. The first one is a vector of pilot symbols $a(p,0)$ while the second one is a zero vector, i.e. $a(p,1) = 0$, for all $p$, which serves as a guard against interference from the data. For the sake of the analysis, and without loss of generality, the all-zeros FBMC symbol that is also commonly sent before the pilots will be omitted here (as in [15]). Its absence, in practice, can be justified, for example, in wireless transmissions that involve inter-frame gaps. It should also be noted here that the two-symbol FBMC preamble is of the same duration as one CP-free OFDM symbol; hence, no extra bandwidth is spent for training. In this case, Equations 6 and 7, with $q = 0$, correspond to the received signal and the virtually transmitted pilot, respectively, that are associated with channel estimation. It is pointed out here that, in case no such inter-frame gaps exist, a preceding all-zero symbol is necessary, which results in a preamble of 1.5 CP-free OFDM symbols, i.e., only slightly more than one CP-OFDM symbol. However, the preamble design analysis that follows will not be affected by such a change.

The cooperative system
The cooperative system under consideration is schematically shown in Figure 1. In analogy with an OFDM-based system, the source S and the destination D utilize an SFB and an AFB, respectively. In order to support per-subcarrier processing, the relay R receives through the AFB, amplifies the subcarrier signals and forwards them to the destination through its SFB.
A two-phase transmission protocol (first proposed in [22]) is adopted. As shown in [23], this protocol offers the optimal diversity/multiplexing trade-off among all the AF half-duplex protocols. The source, the destination, and the relay are single-antenna terminals. For the sake of simplicity, it is assumed that S and R are synchronized when transmitting to D during the second phase.
The channel impulse responses $\mathbf{h}_i$ are modeled as $L_i \times 1$ complex Gaussian random vectors with independent elements, i.e. $\mathbf{h}_i \sim \mathcal{CN}(\mathbf{0}, \mathbf{C}_i)$, where $\mathbf{C}_i$ is diagonal and $i \in \{SD, SR, RD\}$. For the sake of the analysis, these channels are assumed (almost) time invariant for the duration of the two phases. Moreover, they are assumed to be short enough to satisfy (6) above.
During the first phase, S transmits the symbols $a_1(p,q)$ to R and D. These are received at the outputs of the corresponding AFBs as

$$ y_R(p,q) = H_{SR}(p)\, b_1(p,q) + \eta_R(p,q), \tag{8} $$

$$ y_{D_1}(p,q) = H_{SD}(p)\, b_1(p,q) + \eta_{D_1}(p,q), \tag{9} $$

respectively, where $b_1(p,q)$ is defined as in (7). As presented in Section 2.1, the noise terms are correlated complex Gaussian with zero mean and variances $\sigma_R^2$, $\sigma_D^2$, respectively. The signal $y_R(p,q)$ is amplified by the relay as

$$ x_R(p,q) = \lambda(p,q)\, y_R(p,q), \tag{10} $$

where the per-subcarrier amplification factor $\lambda(p,q)$ is used to 'regulate' the transmitted energy per FT point $(p,q)$. It is pointed out here that the inputs to R's SFB are complex valued as opposed to the real $a_1(p,q)$ that are sent by S. This is feasible and it is adopted in order to keep the processing as simple as possible and in line with the AF paradigm.
In the second phase, S and R send the symbols $a_2(p,q)$ and $x_R(p,q)$, respectively, to D. These are received as

$$ y_{D_2}(p,q) = H_{SD}(p)\, b_2(p,q) + H_{RD}(p)\, b_R(p,q) + \eta_{D_2}(p,q), \tag{11} $$

where $\eta_{D_2}(p,q)$ is statistically described similarly to $\eta_{D_1}(p,q)$, and $b_2(p,q)$, $b_R(p,q)$ are as in (7) but with $a_2(p,q)$ and (complex) $x_R(p,q)$ now being the transmitted symbols.

Channel estimation using sparse preambles
As mentioned earlier, S is assumed to employ a 2-symbol sparse preamble at the beginning of each phase, to assist the estimation of the channels at D. The first FBMC symbol has non-zero pilots at some positions described by the index set $\mathcal{P} = \{p_1, p_2, \ldots, p_L\}$ and zeros everywhere else, i.e. $a(p,0) = 0$ for $p \notin \mathcal{P}$. The number of pilot symbols, $L$, is assumed to be the minimum possible one, namely $L = \max(L_{SD}, L_R)$, where $L_R = L_{SR} + L_{RD} - 1$ is the length of the S-R-D channel, and, of course, $L \ll M$. The second FBMC symbol is set to zero. This way, the interference term in (5) is zeroed.
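In the first phase, the training signals received at R and D are

$$ y_R(p,0) = H_{SR}(p)\, a_1(p,0) + \eta_R(p,0), $$

$$ y_{D_1}(p,0) = H_{SD}(p)\, a_1(p,0) + \eta_{D_1}(p,0), $$

respectively, with $p \in \mathcal{P}$. The AF operation at the relay during training is here defined as follows. The relay feeds its SFB with the amplified signals ((10) with $q = 0$) at the pilot subcarriers, whereas it loads the remaining subcarriers with nulls. Moreover, the next FBMC symbol at the relay (i.e. for $q = 1$) is transmitted as all zeros. This 'recovers' the original preamble structure as sent by the source, yet with complex-valued inputs at the pilot subcarriers. Thus, the received signal at the destination in the second phase and at the pilot subcarriers can be expressed as

$$ y_{D_2}(p,0) = H_{SD}(p)\, a_2(p,0) + \lambda(p,0)\, H_R(p)\, a_1(p,0) + w_2(p,0), \tag{14} $$

where

$$ H_R(p) = H_{SR}(p)\, H_{RD}(p) \tag{15} $$

is the CFR of the S-R-D channel (see endnote b) (of length $L_R$) and

$$ w_2(p,0) = \lambda(p,0)\, H_{RD}(p)\, \eta_R(p,0) + \eta_{D_2}(p,0). \tag{16} $$

The latter has zero mean and variance

$$ \sigma_{w_2}^2(p) = \lambda^2(p,0)\, \theta_{RD}^2(p)\, \sigma_R^2 + \sigma_D^2, \tag{17} $$

where $\theta_{RD}^2(p) = E\{|H_{RD}(p)|^2\} = \mathrm{Tr}(\mathbf{C}_{RD})$ and is therefore independent of $p$ (for uncorrelated channels as assumed here).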
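It will be convenient to write Equations 12 and 14 in the following compact form:

$$ \begin{bmatrix} \mathbf{y}_{D_1} \\ \mathbf{y}_{D_2} \end{bmatrix} = \begin{bmatrix} \mathbf{A}_1 \mathbf{F}_{L \times L}\, \mathbf{h}_{SD} \\ \mathbf{A}_2 \mathbf{F}_{L \times L}\, \mathbf{h}_{SD} + \boldsymbol{\Lambda} \mathbf{A}_1 \mathbf{F}_{L \times L}\, \mathbf{h}_R \end{bmatrix} + \begin{bmatrix} \boldsymbol{\eta}_{D_1} \\ \mathbf{w}_2 \end{bmatrix}, \tag{18} $$

where

$$ \mathbf{y}_{D_k} = [y_{D_k}(p_1,0), y_{D_k}(p_2,0), \ldots, y_{D_k}(p_L,0)]^T, \tag{19} $$

$$ \mathbf{A}_k = \mathrm{diag}(a_k(p_1,0), a_k(p_2,0), \ldots, a_k(p_L,0)) $$

for $k = 1, 2$ and

$$ \boldsymbol{\Lambda} = \mathrm{diag}(\lambda(p_1,0), \lambda(p_2,0), \ldots, \lambda(p_L,0)). $$

Moreover, $\boldsymbol{\eta}_{D_1}$ and $\mathbf{w}_2$ are defined similarly to (19). Additionally, in (18), the $L \times L$ matrix $\mathbf{F}_{L \times L}$ results from the $M$th-order DFT matrix $\mathbf{F}$ by keeping its first $L$ columns and its $L$ rows indexed by $\mathcal{P}$. Here, it is assumed for simplicity and without loss of generality that $L_{SD} = L_R = L$ and that $M/L$ is an integer. If necessary, these conditions can be satisfied by appending an appropriate number of zeros to the impulse responses. With a straightforward matching of terms, (18) can be written as

$$ \mathbf{y} = \mathbf{X}\mathbf{h} + \mathbf{w}, \tag{22} $$

where the matrix

$$ \mathbf{X} = \begin{bmatrix} \mathbf{A}_1 \mathbf{F}_{L \times L} & \mathbf{0} \\ \mathbf{A}_2 \mathbf{F}_{L \times L} & \boldsymbol{\Lambda} \mathbf{A}_1 \mathbf{F}_{L \times L} \end{bmatrix} \tag{23} $$

is square of order $2L$ and obviously nonsingular, and $\mathbf{h} = [\mathbf{h}_{SD}^T, \mathbf{h}_R^T]^T$. The noise term $\mathbf{w}$ is a zero mean random vector with covariance matrix

$$ \mathbf{C}_w = \begin{bmatrix} \sigma_D^2 \mathbf{I}_L & \mathbf{0} \\ \mathbf{0} & \mathbf{C}_{w_2} \end{bmatrix}, \tag{24} $$

where $\mathbf{C}_{w_2} = \mathrm{diag}(\sigma_{w_2}^2(p_1), \sigma_{w_2}^2(p_2), \ldots, \sigma_{w_2}^2(p_L))$. From (22), the LS estimate of $\mathbf{h}$ and its covariance matrix are expressed as [24]

$$ \hat{\mathbf{h}} = \mathbf{X}^{-1}\mathbf{y}, \tag{25} $$

$$ \mathbf{C}_{\hat{h}} = \mathbf{X}^{-1} \mathbf{C}_w \mathbf{X}^{-H}. \tag{26} $$

The per-subcarrier amplification factor during the preamble period will be set to (see also [16])

$$ \lambda(p,0) = \sqrt{\frac{E_R(p,0)}{\theta_{SR}^2\, a_1^2(p,0) + \sigma_R^2}}, \tag{27} $$

with $p \in \mathcal{P}$, where $E_R(p,0)$ is the energy assigned by the relay when forwarding the $p$th pilot signal and $\theta_{SR}^2$ is defined in a manner analogous to $\theta_{RD}^2$ (and is hence independent of $p$).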
In the following, the optimal preamble design for the aforementioned setup will be provided. The aim is to appropriately choose the pilot symbols $a_k(p,0)$ and their positions $p \in \mathcal{P}$, so that the normalized $\mathrm{MSE} = \frac{1}{2L}\mathrm{Tr}(\mathbf{C}_{\hat{h}})$ is minimized subject to a constraint on the total energy spent for transmitting (and forwarding) the preambles in the two phases. One can show that the MSE here only depends on the energies of the pilot symbols and not on their specific values. This fact will soon become apparent.
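The LS estimator described above can be sketched numerically. The snippet below is a minimal illustration that assumes the block lower-triangular model $\mathbf{y} = \mathbf{X}\mathbf{h} + \mathbf{w}$ with $\mathbf{X} = [\mathbf{A}_1\mathbf{F}, \mathbf{0}; \mathbf{A}_2\mathbf{F}, \boldsymbol{\Lambda}\mathbf{A}_1\mathbf{F}]$, equispaced pilots and an unnormalized DFT block; the pilot values, relay gains and helper names (`crandn`, `ls_estimate`, `ls_covariance`) are hypothetical choices made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
M, L = 64, 4                        # subcarriers and pilots (illustrative values)
pilots = (M // L) * np.arange(L)    # equispaced pilot subcarriers
# L x L block of the unnormalized M-point DFT: rows = pilots, first L columns.
F = np.exp(-2j * np.pi * np.outer(pilots, np.arange(L)) / M)

def crandn(n):
    """Draw n i.i.d. CN(0, 1) samples."""
    return (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)

h_sd, h_r = crandn(L), crandn(L)    # S-D and composite S-R-D impulse responses
A1 = np.diag(np.full(L, 1.5))       # phase-1 pilots (assumed values)
A2 = np.diag(np.full(L, 0.5))       # phase-2 source pilots (assumed values)
Lam = np.diag(np.full(L, 0.8))      # relay gains at the pilot subcarriers (assumed)

# Assumed block structure of the 2L x 2L training matrix.
X = np.block([[A1 @ F, np.zeros((L, L))],
              [A2 @ F, Lam @ A1 @ F]])
h = np.concatenate([h_sd, h_r])

def ls_estimate(X, y):
    """LS estimate h_hat = X^{-1} y for a square, nonsingular X."""
    return np.linalg.solve(X, y)

def ls_covariance(X, C_w):
    """Error covariance X^{-1} C_w X^{-H} of the LS estimate."""
    X_inv = np.linalg.inv(X)
    return X_inv @ C_w @ X_inv.conj().T

h_hat = ls_estimate(X, X @ h)       # noiseless observation: exact recovery expected
```

In the noiseless case the estimate coincides with the true channels, which is a quick sanity check that the assumed training matrix is nonsingular for these pilot positions.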

Problem formulation
Defining $\alpha_k(l) = a_k^2(p_l, 0)$ for $k = 1, 2$ and $e(l) = E_R(p_l, 0)$, one can formulate the preamble optimization problem as follows:

$$ \min_{\boldsymbol{\alpha}_1, \boldsymbol{\alpha}_2, \mathbf{e}, E_1, E_2, \mathcal{P}} \frac{1}{2L}\mathrm{Tr}(\mathbf{C}_{\hat{h}}) \tag{28} $$

such that (s.t.)

$$ \sum_{l=1}^{L} \alpha_k(l) \le E_k, \ k = 1, 2, \qquad E_1 + E_2 \le E_S, \qquad \sum_{l=1}^{L} e(l) \le E_R, \tag{29} $$

where $\boldsymbol{\alpha}_k$ and $\mathbf{e}$ are $L \times 1$ vectors containing the $\alpha_k$'s and $e$'s, respectively, $E_k$ is the energy allocated to training in phase $k$ and $E_S$, $E_R$ are given energy budgets at the source and the relay, respectively. The placement $\mathcal{P}$ of the pilot symbols is also to be optimized. A simplification of the above cost function will be quite helpful in the sequel. Using the formula for $\mathbf{C}_{\hat{h}}$ from (26) and the definitions from (22), one can write

$$ \mathrm{Tr}(\mathbf{C}_{\hat{h}}) = \mathrm{Tr}(\mathbf{X}^{-1}\mathbf{C}_w\mathbf{X}^{-H}) = \mathrm{Tr}\left((\mathbf{X}^H \mathbf{C}_w^{-1} \mathbf{X})^{-1}\right), $$

where the well-known property of the trace operator for matrix products has been employed. Next, the application of the matrix inversion lemma in the 2 × 2 block (with diagonal blocks) matrix $(\mathbf{X}^H \mathbf{C}_w^{-1} \mathbf{X})^{-1}$ results in (30), where $\mathbf{U}$ is a diagonal matrix with its $l$th diagonal element, $l = 1, 2, \ldots, L$, given by (31). The minimization can then be based on the equivalent expression (32) for the cost function.

Optimal energy allocation between phases
First, the optimal splitting of the total energy at the source between the two phases is investigated. Writing $\alpha_k(l)$ as a fraction of $E_k$ and setting $E_2 = E_S - E_1$, the MSE in (32) can be expressed as a function of $E_1$ only (namely, $\mathrm{MSE} = f(E_1)$) and the minimization is performed for $0 \le E_1 \le E_S$. Because $df(E_1)/dE_1$ is negative, this function is monotonically decreasing and its minimum value is attained for $E_1 = E_S$, implying that $E_2 = 0$ and $\boldsymbol{\alpha}_2 = \mathbf{0}$.
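The monotonic behavior of the MSE with respect to $E_1$ can be checked numerically. The sketch below evaluates $\frac{1}{2L}\mathrm{Tr}(\mathbf{X}^{-1}\mathbf{C}_w\mathbf{X}^{-H})$ directly, rather than the closed-form cost function, under assumptions that are made only for this example: equal pilots within each phase, equispaced positions, uniform relay energies, the relay gain rule $\lambda^2 = e/(\theta_{SR}^2\alpha_1 + \sigma_R^2)$, uncorrelated noise across pilot subcarriers and arbitrary parameter values.

```python
import numpy as np

def normalized_mse(E1, E_S, E_R, L=4, M=64, var_D=0.1, var_R=0.1,
                   theta2_SR=1.0, theta2_RD=1.0):
    """Normalized LS MSE when the source spends E1 in phase 1 and E_S - E1 in phase 2."""
    pilots = (M // L) * np.arange(L)
    F = np.exp(-2j * np.pi * np.outer(pilots, np.arange(L)) / M)
    alpha1 = np.full(L, E1 / L)                   # phase-1 pilot energies
    alpha2 = np.full(L, (E_S - E1) / L)           # phase-2 pilot energies
    e = np.full(L, E_R / L)                       # relay energies per pilot
    lam2 = e / (theta2_SR * alpha1 + var_R)       # squared relay gains (assumed rule)
    var_w2 = lam2 * theta2_RD * var_R + var_D     # composite noise variance in phase 2
    A1, A2 = np.diag(np.sqrt(alpha1)), np.diag(np.sqrt(alpha2))
    Lam = np.diag(np.sqrt(lam2))
    X = np.block([[A1 @ F, np.zeros((L, L))],
                  [A2 @ F, Lam @ A1 @ F]])
    C_w = np.diag(np.concatenate([np.full(L, var_D), var_w2]))
    X_inv = np.linalg.inv(X)
    C_h = X_inv @ C_w @ X_inv.conj().T
    return np.real(np.trace(C_h)) / (2 * L)

E_S, E_R = 10.0, 10.0
E1_grid = np.linspace(0.2 * E_S, E_S, 9)          # candidate phase-1 energies
mse_curve = np.array([normalized_mse(E1, E_S, E_R) for E1 in E1_grid])
```

Under these assumptions the curve decreases over the whole grid, so the smallest value is found at $E_1 = E_S$, in agreement with the conclusion above.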

Optimal pilot placement
After incorporating the results of the previous subsection, the minimization problem is transformed into (33). The optimization with respect to $\mathcal{P}$ can benefit from the lower bound (34) on the trace [25]. Equality holds in (34) when the positioning set $\mathcal{P}$ is constructed by equispaced pilot positions (e.g. $p_l = \frac{M}{L}(l - 1)$, for $l = 1, 2, \ldots, L$). This is true for any allocation of the pilot energies.
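The role of equispaced pilots can be illustrated with a short numerical check. Since the exact bound from [25] is not reproduced here, the sketch uses a standard stand-in: for a Hermitian positive definite $L \times L$ matrix $\mathbf{G}$, $\mathrm{Tr}(\mathbf{G}^{-1}) \ge L^2/\mathrm{Tr}(\mathbf{G})$, with equality when $\mathbf{G}$ is a scaled identity. It is applied to $\mathbf{G} = \mathbf{F}_{L\times L}^H\mathbf{F}_{L\times L}$ with unit pilot energies; the function names and the values of $M$ and $L$ are assumptions of the example.

```python
import numpy as np

def dft_block(pilots, L, M):
    """Rows `pilots` and first L columns of the unnormalized M-point DFT matrix."""
    return np.exp(-2j * np.pi * np.outer(pilots, np.arange(L)) / M)

def trace_inv_gram(pilots, L, M):
    """Tr((F^H F)^{-1}) for the DFT block selected by the pilot positions."""
    F = dft_block(np.asarray(pilots), L, M)
    return np.real(np.trace(np.linalg.inv(F.conj().T @ F)))

M, L = 64, 4
equispaced = (M // L) * np.arange(L)    # p_l = (M/L)(l - 1)
clustered = np.arange(L)                # adjacent pilots, for comparison

# Tr(F^H F) = L * L because all L*L entries have unit modulus.
lower_bound = L**2 / (L * L)
t_equi = trace_inv_gram(equispaced, L, M)
t_clus = trace_inv_gram(clustered, L, M)
```

With equispaced positions the selected block is an $L$-point DFT matrix, so $\mathbf{F}^H\mathbf{F} = L\mathbf{I}$ and the stand-in bound is met with equality, whereas clustered pilots give a much larger trace.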

Optimal allocation of energy to pilots
By considering equispaced pilot positions, the minimization problem reduces to (35). This can be readily solved using Lagrange multipliers. The conclusion is that the energy allocation should be uniform across the pilot symbols. From the previous analysis, the following remarks can be made:
1. The above results are in line with those derived in [16] for the CP-OFDM-based system. This is not unexpected in view of the fact that the type of the preamble considered (i.e., sparse) removes intrinsic interference and brings the problem quite close to that for CP-OFDM. Analogous results, for single-link (not relaying) FBMC/OQAM systems, were also shown in [14].
2. It is interesting to observe that the estimator in (25) takes a simple form when the optimal preamble is used. Indeed, Equation 25, assuming also for simplicity equal symbols, i.e. $a_1(p,0) = a$ and $\lambda(p,0) = \lambda$ for all $p \in \mathcal{P}$, becomes

$$ \hat{\mathbf{h}} = \frac{1}{a} \begin{bmatrix} \mathbf{F}_{L \times L}^{-1}\, \mathbf{y}_{D_1} \\ \lambda^{-1}\, \mathbf{F}_{L \times L}^{-1}\, \mathbf{y}_{D_2} \end{bmatrix}. \tag{36} $$

From (36), it is clear that the estimation of the two branches is actually decoupled.
3. In order to better appreciate the needs of D for channel information, a simple, per-subcarrier, single-tap zero forcing (ZF) equalizer for recovering the transmitted data at the destination node is considered. It turns out that an additional interference term is present at D, due to the use of FBMC/OQAM for forwarding at the relay. A simple way to cancel this term out is then described.
D needs to first estimate the virtual symbols $b_k(p,q)$ and from them detect the corresponding input symbols $a_k(p,q)$, $k = 1, 2$. The detected symbols are then OQAM-demodulated (real to complex (R2C) block in Figure 1). Data recovery is performed at the end of the second phase. To see how this can be done, rewrite first Equations 9 and 11:

$$ y_{D_1}(p,q) = H_{SD}(p)\, b_1(p,q) + \eta_{D_1}(p,q), \tag{37} $$

$$ y_{D_2}(p,q) = H_{SD}(p)\, b_2(p,q) + \lambda(p,q)\, H_R(p)\, b_1(p,q) + I(p,q) + w_2(p,q), \tag{38} $$

where

$$ I(p,q) = j H_{RD}(p) \sum_{(m,n) \in \Omega_{p,q}} \lambda(m,n)\, H_{SR}(m)\, b_1(m,n)\, \langle g \rangle_{m,n}^{p,q} \tag{39} $$

and

$$ w_2(p,q) = H_{RD}(p) \left[ \lambda(p,q)\, \eta_R(p,q) + j \sum_{(m,n) \in \Omega_{p,q}} \lambda(m,n)\, \eta_R(m,n)\, \langle g \rangle_{m,n}^{p,q} \right] + \eta_{D_2}(p,q) \tag{40} $$

is the composite noise at D in the second phase. As will be verified in the simulation results, $I(p,q)$ needs to be canceled out in (38) for a better detection performance.
However, Equation 39 implies that D would also need an estimate of the S-R and R-D channels in order to cope with this interference term. This is more than commonly required from the destination node in the channel estimation literature for such systems (cf. e.g., [16]), namely estimates of the overall channels in the two paths from S to D only. One can see, however, that $I(p,q)$ can be approximated by using the assumption (underlying (11)) that $H_{SR}(m) \approx H_{SR}(p)$, for $(m,n) \in \Omega_{p,q}$. This way, $H_{SR}(p)$ is factored out of the summation in (39) and the known channel $H_R(p)$ appears. Moreover, $b_1(m,n)$ can be estimated from (9) based on the $H_{SD}$ estimate, while the quantities $\langle g \rangle_{m,n}^{p,q}$ are a priori known from the adopted prototype filter $g$ [13]. Once $I(p,q)$ has been canceled out, and an estimate of $b_1(p,q)$ is available, a ZF equalizer can be applied in (38) too, to estimate $b_2(p,q)$.
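The cancellation idea described above can be sketched as follows. This is a noiseless toy example under explicit assumptions: the second-phase signal is modeled as $H_{SD} b_2 + H_{RD} b_R$ with $b_R$ built from the relayed symbols and their first-order neighbors, the S-R channel is taken exactly flat so that the approximation $H_{SR}(m) \approx H_{SR}(p)$ is exact, the relay gains are known at D, and the interference weights are arbitrary placeholder values (real weights depend on the prototype filter). All function and variable names are hypothetical.

```python
import numpy as np

# First-order neighborhood offsets (dm, dn) around a frequency-time point.
OFFSETS = [(dm, dn) for dm in (-1, 0, 1) for dn in (-1, 0, 1) if (dm, dn) != (0, 0)]

def toy_weight(dm, dn):
    """Placeholder interference weight; NOT derived from an actual prototype filter."""
    return 0.2 if dn == 0 else 0.1

def neighborhood_sum(values, weight, p, q):
    """Sum of values[m, n] * weight(dm, dn) over the first-order neighbors of (p, q)."""
    M, N = values.shape
    total = 0.0 + 0.0j
    for dm, dn in OFFSETS:
        m, n = (p + dm) % M, q + dn          # subcarriers wrap around; time does not
        if 0 <= n < N:
            total += values[m, n] * weight(dm, dn)
    return total

def detect(y_d1, y_d2, H_sd, H_r, lam, weight):
    """ZF estimate of b1, then cancel the relay-generated term and ZF-estimate b2."""
    M, N = y_d1.shape
    b1_hat = y_d1 / H_sd[:, None]
    relayed = lam * b1_hat
    I_hat = np.array([[1j * H_r[p] * neighborhood_sum(relayed, weight, p, q)
                       for q in range(N)] for p in range(M)])
    b2_hat = (y_d2 - H_r[:, None] * relayed - I_hat) / H_sd[:, None]
    return b1_hat, b2_hat

rng = np.random.default_rng(3)
M, N = 8, 6

def cn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H_sd, H_rd = cn(M), cn(M)
H_sr = np.full(M, 0.7 - 0.3j)                # flat S-R channel (assumption)
H_r = H_sr * H_rd                            # composite S-R-D channel
lam = np.full((M, N), 0.9)                   # relay gains, assumed known at D
b1, b2 = cn(M, N), cn(M, N)                  # virtual symbols of the two phases

x_r = lam * H_sr[:, None] * b1               # relay SFB inputs (noiseless)
b_r = np.array([[x_r[p, q] + 1j * neighborhood_sum(x_r, toy_weight, p, q)
                 for q in range(N)] for p in range(M)])
y_d1 = H_sd[:, None] * b1
y_d2 = H_sd[:, None] * b2 + H_rd[:, None] * b_r

b1_hat, b2_hat = detect(y_d1, y_d2, H_sd, H_r, lam, toy_weight)
b2_naive = (y_d2 - H_r[:, None] * lam * b1_hat) / H_sd[:, None]   # no cancellation
```

In this idealized setting the cancellation recovers $b_2$ exactly, while skipping it leaves a residual error, which is the effect the text attributes to the relay-generated interference.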

Channel estimation using full preambles
In the following, the full preamble design will be presented first under the commonly adopted assumption of equal neighboring subcarrier CFRs (e.g. [18]) that is valid for channels of large coherence bandwidth. As already mentioned, the energy increase at the pseudo-pilots that are generated at the receiver leads to a better estimation performance than CP-OFDM. Then, the more realistic case where no such assumption is made for the neighboring subcarrier CFRs will be addressed. Simulation results will show that this only brings improvement in the weak noise regime, where the model inaccuracy becomes apparent. At low SNRs, however, the strong noise hides the incorrectness of the assumption, while the (artificial) pseudo-pilots are of sufficiently large magnitude to successfully cope with this noise.

Channels of large coherence bandwidth
As in the case of the sparse preamble, the source transmits two known FBMC symbols. The first one is a vector of training symbols $a(p,0)$ (which is full) while the second one is a zero vector, i.e. $a(p,1) = 0$, for all $p$, which as previously serves as a guard against interference from the data. Due to the good frequency localization of the prototype filter employed in the filter bank, and in view of this preamble structure, the interference to the pilot at subcarrier $p$ only comes from its adjacent subcarriers, $p \pm 1$. In order to facilitate the presentation that follows, we will make a slight abuse of the OQAM definition, by incorporating the phase factors $e^{j\varphi(p,0)}$ in the training symbols, resulting in $x(p,0) = a(p,0)e^{j\varphi(p,0)}$ (this is adopted as in [15] in order to assist the forthcoming analysis and especially the minimization problem (54)). Of course, $g_{p,0}$ also needs to be accordingly modified, namely, in (2), the factor $e^{j\varphi(m,n)}$ is omitted for $(m,n) = (p,0)$. This results in

$$ \sum_l g_{p \pm 1, 0}(l)\, g^*_{p,0}(l) = \beta, \tag{41} $$

meaning that the interference corresponding to $j\langle g \rangle_{m,0}^{p,0}$ in (7) for $m = p \pm 1$ is purely real, with $\beta > 0$ defined in [13]. Some indicative values for the prototype filter used in the simulations with overlapping factor $K = 3$ are $\beta = 0.2497$ and $\beta = 0.25$ for $M = 64$ and $M = 256$, respectively. Hence, the corresponding virtual pilot symbol $b(p,0)$ is given by

$$ b(p,0) = x(p,0) + \beta\,[x(p-1,0) + x(p+1,0)]. \tag{42} $$

It is the presence of these interfering terms in a full preamble that, with an appropriate choice of the $x(p,0)$'s, can increase (preferably maximize) the energy of the $b(p,0)$'s and permit significant gains in estimation performance over both the FBMC/OQAM sparse preamble and the CP-OFDM full preamble. In the following, the time index will be dropped for convenience. The received signals in the first phase are

$$ y_R(p) = H_{SR}(p)\, b_1(p) + \eta_R(p), \tag{43} $$

$$ y_{D_1}(p) = H_{SD}(p)\, b_1(p) + \eta_{D_1}(p), \tag{44} $$

for R and D, where the $b_1(p)$'s are defined according to (42).
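The adjacent-subcarrier weight $\beta$ can be evaluated numerically for a given prototype filter. The sketch below is only illustrative: the filter actually used in the paper is not specified in this section, so a frequency-sampling design with the coefficients commonly quoted for $K = 3$ is assumed, and the filter is sampled on $KM + 1$ points so that it is exactly symmetric about $(L_g - 1)/2$ (the text uses $L_g = MK$). The pulses omit the phase factor $e^{j\varphi}$, as described above.

```python
import numpy as np

# Frequency-sampling coefficients commonly quoted for overlapping factor K = 3
# (an assumption of this example, not taken from the paper).
COEFFS_K3 = [0.911438, 0.411438]

def prototype_filter(M, K=3):
    """Real symmetric unit-energy prototype filter sampled on K*M + 1 points."""
    l = np.arange(K * M + 1)
    g = np.ones(K * M + 1)
    for k, H_k in enumerate(COEFFS_K3, start=1):
        g += 2 * (-1) ** k * H_k * np.cos(2 * np.pi * k * l / (K * M))
    return g / np.linalg.norm(g)

def adjacent_subcarrier_correlation(g, M, p=5):
    """Inner product of the pulses at subcarriers p+1 and p (time index 0, no phase factor)."""
    l = np.arange(len(g))
    c = (len(g) - 1) / 2                       # center of symmetry of the filter
    g_p = g * np.exp(2j * np.pi * p * (l - c) / M)
    g_p1 = g * np.exp(2j * np.pi * (p + 1) * (l - c) / M)
    return np.sum(g_p1 * np.conj(g_p))

corr_64 = adjacent_subcarrier_correlation(prototype_filter(64), 64)
corr_256 = adjacent_subcarrier_correlation(prototype_filter(256), 256)
```

For this assumed filter the correlation is real (its imaginary part vanishes by symmetry) and close to 0.25, which is in line with the indicative values of $\beta$ quoted in the text; small differences with respect to those values can be expected from the different sampling of the filter.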
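During the second phase, the source and the relay send a new two-symbol full preamble of the above structure, using this time $x_2(p)$ and $x_R(p)$ (in the first FBMC symbol). The relay employs the following amplification factors

$$ \lambda(p) = \sqrt{\frac{e(p)}{\theta_{SR}^2\, |b_1(p)|^2 + \sigma_R^2}}, \tag{45} $$

which set the mean energy per subcarrier $p$ at the input of the relay SFB to $e(p)$. The received signal at D, during the second phase, can be written as

$$ y_{D_2}(p) = H_{SD}(p)\, b_2(p) + H_R(p)\, b_3(p) + w_2(p), \tag{46} $$

where the $b_2(p)$'s are defined according to (42) and

$$ b_3(p) = \lambda(p)\, b_1(p) + \beta\,[\lambda(p-1)\, b_1(p-1) + \lambda(p+1)\, b_1(p+1)] \tag{47} $$

for all $p$. Moreover,

$$ w_2(p) = H_{RD}(p) \left\{ \lambda(p)\, \eta_R(p) + \beta\,[\lambda(p-1)\, \eta_R(p-1) + \lambda(p+1)\, \eta_R(p+1)] \right\} + \eta_{D_2}(p) \tag{48} $$

is a zero mean random variable with variance $\sigma_{w_2}^2(p)$ given by (49), where $\theta_{RD}^2$ is defined as in the sparse preamble case while our knowledge of the correlation of adjacent noise components (equal to $\sigma_R^2 \beta$ [15]) has also been used. Putting Equations 44 and 46 together results in

$$ \begin{bmatrix} \mathbf{y}_{D_1} \\ \mathbf{y}_{D_2} \end{bmatrix} = \begin{bmatrix} \mathbf{B}_1 \mathbf{H}_{SD} \\ \mathbf{B}_2 \mathbf{H}_{SD} + \mathbf{B}_3 \mathbf{H}_R \end{bmatrix} + \begin{bmatrix} \boldsymbol{\eta}_{D_1} \\ \mathbf{w}_2 \end{bmatrix}, \tag{50} $$

where $\mathbf{y}_{D_k} = [y_{D_k}(0), y_{D_k}(1), \ldots, y_{D_k}(M-1)]^T$ and similarly for $\mathbf{H}_i$, $\boldsymbol{\eta}_{D_1}$, $\mathbf{w}_2$, and $\mathbf{B}_l = \mathrm{diag}(b_l(0), b_l(1), \ldots, b_l(M-1))$, for $l = 1, 2, 3$. Equivalently, with straightforward matching of terms,

$$ \mathbf{y} = \mathbf{B}\mathbf{H} + \mathbf{w}, \qquad \mathbf{B} = \begin{bmatrix} \mathbf{B}_1 & \mathbf{0} \\ \mathbf{B}_2 & \mathbf{B}_3 \end{bmatrix}, \tag{51} $$

where it is natural to assume the matrix $\mathbf{B}$ to be nonsingular. The noise term $\mathbf{w}$ is zero mean with covariance $\mathbf{C}_w = \mathrm{diag}(\mathbf{C}_{\eta_{D_1}}, \mathbf{C}_{w_2})$. The diagonal blocks of $\mathbf{C}_w$ are not diagonal matrices. However, as will be observed later on, we are only interested in their diagonal elements, which are $[\mathbf{C}_{\eta_{D_1}}]_{pp} = \sigma_D^2$ and $[\mathbf{C}_{w_2}]_{pp} = \sigma_{w_2}^2(p)$, respectively. The LS estimate of $\mathbf{H}$ and the associated error covariance matrix are given by [24]

$$ \hat{\mathbf{H}} = \mathbf{B}^{-1}\mathbf{y}, \tag{52} $$

$$ \mathbf{C}_{\hat{H}} = \mathbf{B}^{-1} \mathbf{C}_w \mathbf{B}^{-H}. \tag{53} $$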
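The statement that the amplification factors set the mean energy per subcarrier at the relay SFB input to $e(p)$ can be checked with a small Monte Carlo experiment. The sketch assumes the gain rule $\lambda(p) = \sqrt{e(p)/(\theta_{SR}^2|b_1(p)|^2 + \sigma_R^2)}$, a zero-mean complex Gaussian $H_{SR}(p)$ with variance $\theta_{SR}^2$ and independent noise samples; the parameter values and names are illustrative.

```python
import numpy as np

def relay_gain(e, b1, theta2_sr, var_r):
    """Per-subcarrier gain targeting a mean relay-input energy e(p) (assumed rule)."""
    return np.sqrt(e / (theta2_sr * np.abs(b1) ** 2 + var_r))

rng = np.random.default_rng(1)
M, trials = 8, 20000
theta2_sr, var_r = 1.0, 0.5                     # E|H_SR(p)|^2 and relay noise variance
b1 = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # virtual pilots (arbitrary)
e = np.full(M, 2.0)                             # target energies per subcarrier
lam = relay_gain(e, b1, theta2_sr, var_r)

def cn(var, shape):
    """Zero-mean circular complex Gaussian samples with the given variance."""
    return np.sqrt(var / 2) * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

H_sr = cn(theta2_sr, (trials, M))               # independent channel draws
eta_r = cn(var_r, (trials, M))                  # relay noise
x_r = lam * (H_sr * b1 + eta_r)                 # amplified relay SFB inputs
mean_energy = np.mean(np.abs(x_r) ** 2, axis=0) # empirical energy per subcarrier
```

The empirical energies should agree with the targets up to Monte Carlo fluctuations; the noise correlation between adjacent subcarriers mentioned in the text does not affect this per-subcarrier average and is ignored here.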

Preamble design
The training design consists of (a) the source training energy allocation between the two transmission phases, (b) the determination of the source training symbols $x_k(p) = a_k(p)e^{j\varphi_k(p)}$, $k = 1, 2$, and (c) the distribution of the transmit energy available at the relay in the second phase among the subcarrier signals. In a manner analogous to the sparse preamble case, the preamble optimization criterion will be to minimize the normalized $\mathrm{MSE} = \frac{1}{2M}\mathrm{Tr}(\mathbf{C}_{\hat{H}})$ subject to sum energy constraints at the source and the relay, namely problem (54) with the constraints (55)-(57), where $\mathbf{x}_k$ and $\mathbf{e}$ are $M \times 1$ vectors containing the $x_k(p)$'s and $e(p)$'s, respectively, $E_k$ is the source energy allocated to training in phase $k$ and $E_R$, $E_S$ are given energy budgets. The two energy constraints, defined at the output of the source and the relay SFBs, follow the analysis that was presented in the extended version of [14]. In (57), the relay energy is constrained in a mean sense. It will be convenient to re-write the cost function above in an alternative form. Specifically, by applying the matrix inversion lemma to invert the 2 × 2 block matrix $\mathbf{B}$, whose blocks are diagonal, it can be shown that the trace in (54) is applied on a sum of diagonal matrices. The normalized MSE can then be written as

$$ \mathrm{MSE} = \frac{1}{2M} \sum_{p=0}^{M-1} \left[ \frac{\sigma_D^2}{|b_1(p)|^2} + \frac{\sigma_{w_2}^2(p)}{|b_3(p)|^2} + \frac{\sigma_D^2\, |b_2(p)|^2}{|b_1(p)|^2\, |b_3(p)|^2} \right]. \tag{58} $$

Some comments concerning the energies are in order. First, the constraints of the minimization problem correspond to the energies at the output of the SFBs of the source or the relay. Due to the interference effect not being negligible in this scenario of a full preamble, these are in general different from the energies at the SFB input (see also [14,15]). Second, due to the orthogonality of the SFBs, the energies at the inputs of the SFBs can be constrained as $\sum_{p=0}^{M-1} |x_1(p)|^2 \le E_1$ and $\sum_{p=0}^{M-1} |x_2(p)|^2 \le E_2$ for the source and $\sum_{p=0}^{M-1} e(p) \le E_R$ for the relay (see also [14,15]). Furthermore, the SFB output and input energies can be related as shown in [15] (Equation 59).

Optimal energy allocation between the phases
The optimal allocation of the total source energy (whether it is measured at the SFB output or at the SFB input) between the two phases dictates that $E_1 = E_S$, implying that $E_2 = 0$ and $\mathbf{x}_2 = \mathbf{0}$. The proof of these results follows the same steps as in the sparse preamble case and, hence, it is not repeated here.

Solution for the $x_1(p)$'s and $e(p)$'s
The optimization problem is now written as follows

$$ \min_{\mathbf{x}_1, \mathbf{e}} \frac{1}{2M} \sum_{p=0}^{M-1} \left[ \frac{\sigma_D^2}{|b_1(p)|^2} + \frac{\sigma_{w_2}^2(p)}{|b_3(p)|^2} \right] \tag{60} $$

subject to the source and relay energy constraints (61) and (62). The cost function in (60) has a complicated form with respect to the unknown parameters. This is due to the fact that the amplification factors $\lambda(m)$, for $m = p-1, p, p+1$, which appear at both the numerator and the denominator of the second term in (60), are in turn nonlinear functions of the $b_1$'s and $e$'s. It thus seems that an analytical, closed-form expression for the optimal parameters is difficult to find. However, targeting such a solution, we first derive a lower bound, which will suggest a simplification allowing us to come up with an analytical solution.
Indeed, by using the triangle inequality at the denominator of the second term in (60), we can write

$$ |b_3(p)| \le \lambda(p)\,|b_1(p)| + \beta\,[\lambda(p-1)\,|b_1(p-1)| + \lambda(p+1)\,|b_1(p+1)|], \tag{63} $$

where the equality holds iff the $b_1$'s have the same phase. Applying the Cauchy-Schwarz inequality in its numerator leads to (64), with equality iff $\lambda(p) = \lambda(p-1) = \lambda(p+1)$. The cost function in (60) can then be lower bounded as in (65), where the equality holds under the aforementioned conditions.
The above suggests that letting the λ's at a subcarrier p and its immediate neighbors be equal is a plausible choice. This is a first approach to be analyzed below. In a second approach to simplifying the problem, the relay is assumed to operate at a high SNR regime, i.e., σ 2 R ≈ 0.

Assuming $\lambda(p) = \lambda(p - 1) = \lambda(p + 1)$
The MSE can then be lower bounded as in (66). In view of the energy constraints, including $\sum_{p=0}^{M-1} e(p) \le E_R$, and resorting to the arithmetic-harmonic mean (AHM) inequality for the first and second terms, (66) is written as (67). Using Lagrange multipliers for the rest, the above lower bound is minimized for particular values of the $b_1(p)$'s, and a corresponding expression in terms of $|b_1(p)|^2$ can be written in a similar way.

The relay operates at high SNR
The MSE can then be lower bounded through a chain of two inequalities, where the first is due to the AHM inequality and the second to the above constraints. The bound holds with equality for particular values of the $b_1(p)$'s. These values can be obtained for $x_1(p) = \sqrt{E_S/M}\,e^{j\phi}$ and $e(p) = E_R/M$, for all $p$, a choice that also minimizes (60). This is the desired solution for (60) if the constraints (61), (62) hold with equality, which is true if, additionally, the energy budgets at the SFB inputs, multiplied by $(1 + 2\beta)$, equal the budgets $E_S$ and $E_R$ at the SFB outputs, respectively.
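The AHM step invoked above can be checked numerically. The following Python sketch assumes a generic cost term of the form $\sum_p 1/a(p)$ under the constraint $\sum_p a(p) = E$; this is an illustrative stand-in and not the exact cost function in (60). All function names are hypothetical. It compares the uniform allocation $a(p) = E/M$ with a randomly drawn allocation of the same total energy.

```python
import random

def inverse_sum(alloc):
    """Sum of reciprocals: the generic term bounded via the AHM inequality."""
    return sum(1.0 / a for a in alloc)

def uniform_allocation(total_energy, M):
    """Equal energy per subcarrier: a(p) = E / M for all p."""
    return [total_energy / M] * M

def random_allocation(total_energy, M, rng):
    """A strictly positive allocation with the same total energy (illustrative)."""
    weights = [rng.uniform(0.1, 1.0) for _ in range(M)]
    scale = total_energy / sum(weights)
    return [w * scale for w in weights]

M, E = 64, 64.0                      # assumed sizes, chosen for illustration only
rng = random.Random(0)
uniform_cost = inverse_sum(uniform_allocation(E, M))    # attains the bound M**2 / E
random_cost = inverse_sum(random_allocation(E, M, rng))
print(uniform_cost, random_cost)
```

By the AHM inequality, $\sum_p 1/a(p) \ge M^2 / \sum_p a(p)$, with equality only for equal $a(p)$'s, so the uniform allocation attains the bound and any other allocation with the same total energy costs more.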
Remark 1: In both of the cases analyzed above, the choice of all-equal $x_1(p)$'s is shown to be a solution. It is interesting to recall that this is in line with the optimal preamble design in single-link FBMC/OQAM systems [13], where it was shown to maximize the virtual pilot energies in (42). Moreover, this choice of the pilot symbols, in conjunction with the uniform energy allocation at the relay, also leads to all-equal $\lambda$'s, something that was only assumed in the first approach. Note also that the matrix $B$ in (51) becomes diagonal (and indeed nonsingular) and has a form similar to the one for the sparse case.
Remark 2: As indicated in [18], the condition of equal pilots can lead to a high peak-to-average power ratio (PAPR) for the modulated preamble signal. This is a well-known problem, already discussed in earlier works on FBMC/OQAM (see, e.g., [13]) and CP-OFDM [26].

The CP-OFDM case
It is of interest to note that the corresponding problem for CP-OFDM can be formulated as previously by simply setting β = 0. Hence, a solution to this problem can be easily derived here as a by-product.
Denoting the energies $|x_k(p)|^2$ by $\alpha_k(p)$ for $k = 1, 2$, the associated cost function is given by (68), with constraints $\sum_{p=0}^{M-1} \alpha_k(p) \le E_k$, for $k = 1, 2$, $\sum_{p=0}^{M-1} e(p) \le E_R$, and $E_1 + E_2 = E_S$. As observed, in the CP-OFDM case, only the pilot energies are of interest in the preamble design, not the pilot values themselves. This problem can be solved optimally, and the result is that S uniformly allocates all its energy to the first phase and R forwards the pilot signals by assigning uniform energy per subcarrier. A similar problem was studied in [27], although there the relay plays no significant role in the design, as its amplification is not performed per subcarrier as it is here.

Channels of low coherence bandwidth
Here, we examine more realistic scenarios, where the channel is more frequency selective than the model in (5) requires. Focusing on $x(p, 0)$ as in the previous section and dropping the time index, Equation 5 can be rewritten per subcarrier. By collecting all $y(p)$ into a single vector $y = [y(0)\ y(1)\ \cdots\ y(M-1)]^T$, a linear system can be written [15], in which $B$ is a circulant matrix specified by its first row.
In more detail, during the first phase and focusing on the first (non-zero) preamble FBMC symbol, S transmits the symbols $x_1(p)$ to R and D. These are received as in (73) and (74), respectively. The noise terms are described as $\eta_R \sim \mathcal{CN}(0, \sigma_R^2 B)$ and $\eta_{D_1} \sim \mathcal{CN}(0, \sigma_D^2 B)$. The remaining terms in (73) and (74) are defined as in (71).
Let the received signal at R be first multiplied by $B^{-1}$. This multiplication can be performed efficiently by exploiting the circulant nature of $B^{-1}$, i.e., by using FFT/IFFT operations and the (known) eigenvalues of the matrix. Finally, R amplifies the outcome by the factors $\Lambda = \mathrm{diag}(\lambda(0), \lambda(1), \ldots, \lambda(M-1))$. The resulting relay signal involves the diagonal entries $[B^{-1}]_{ii}$, which are equal for all $i$; this is a direct consequence of the fact that $B^{-1}$ is a circulant matrix.
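The FFT-based multiplication by $B^{-1}$ mentioned above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the first row used in the example is made up for demonstration (it is not the first row of the matrix $B$ of this paper), the function names are hypothetical, and the simple radix-2 FFT requires $M$ to be a power of two. The eigenvalues of a circulant matrix are the DFT of its first column, so solving $Bz = y$ reduces to a pointwise division in the DFT domain.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 (I)FFT without scaling; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def ifft(X):
    """Inverse FFT including the 1/n scaling."""
    n = len(X)
    return [v / n for v in fft(X, inverse=True)]

def circulant_matvec(first_row, x):
    """Direct O(M^2) product C x, with C[i][j] = first_row[(j - i) mod M]."""
    M = len(first_row)
    return [sum(first_row[(j - i) % M] * x[j] for j in range(M)) for i in range(M)]

def circulant_solve(first_row, y):
    """Solve C z = y via FFTs; the eigenvalues are the DFT of the first column."""
    M = len(first_row)
    assert M & (M - 1) == 0, "this sketch assumes M is a power of two"
    first_col = [first_row[(-i) % M] for i in range(M)]
    eig = fft(first_col)
    Y = fft(y)
    return ifft([Yk / ek for Yk, ek in zip(Y, eig)])

# Illustrative example: the first row below is an assumption, not the paper's B.
first_row = [1.0, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2]
x = [complex(k, -k) for k in range(8)]
y = circulant_matvec(first_row, x)
x_hat = circulant_solve(first_row, y)   # recovers x up to rounding errors
```

The direct product costs $O(M^2)$ operations, whereas the FFT-based solve costs $O(M \log M)$, which is the efficiency gain referred to in the text.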
The $p$th element $\lambda(p)$ of the diagonal matrix $\Lambda$ is expressed in terms of $e_1(p) = |x_1(p)|^2$, where, for future reference, we also define $e_2(p) = |x_2(p)|^2$. Moreover, $e_R(p)$ is the mean energy per subcarrier that is allocated by R. During the second phase and focusing again on the first (non-zero) preamble FBMC symbol, S transmits the symbols $x_2(p)$ to D and R transmits the symbols $x_R(p)$, i.e., the elements of $x_R$ (recall that these symbols are followed by an all-zeros FBMC symbol). The noise term $w_2$ in the received signal is a zero-mean random vector whose $p$th element has variance $\sigma_{w_2}^2(p)$. In compact form and with direct matching of terms, (78) can be written as a linear model whose noise term $w$ is zero mean with covariance $C_w = \mathrm{diag}(C_{\eta_{D_1}}, C_{w_2})$. The diagonal blocks of $C_w$ are not diagonal matrices. However, as will be seen later on, we are only interested in their diagonal elements, which are $[C_{\eta_{D_1}}]_{pp} = \sigma_D^2$ and $[C_{w_2}]_{pp} = \sigma_{w_2}^2(p)$, respectively.
Finally, the LS estimate of $H$ and the associated error covariance matrix follow from this model.

Preamble design
Similarly to the previous section, the training design problem is defined as the minimization of the normalized MSE under energy constraints, where $x_k$, $e_R$, $E_k$, $E_R$ and $E_S$ are defined as in (54). The cost function can also be simplified following the procedure of the previous sections. The resulting simplified cost function is similar to the one of CP-OFDM, i.e., (68). Using the corresponding energy constraints at the SFB inputs (i.e., $\sum_{p=0}^{M-1} e_1(p) \le E_1$, $\sum_{p=0}^{M-1} e_2(p) \le E_2$, $\sum_{p=0}^{M-1} e_R(p) \le E_R$ and $E_1 + E_2 \le E_S$), the minimization problem is identical to the corresponding problem defined for the CP-OFDM case. This is a direct consequence of the $B^{-1}$ operation at the relay, which removes the interchannel interference that is commonly present in FBMC/OQAM systems. The solution here is as previously described. In more detail, the source should set $E_1 = E_S$, $E_2 = 0$, $x_1(p) = \sqrt{E_S/M}\,e^{j\phi_1(p)}$ and $x_2(p) = 0$. Moreover, the relay should set $e_R(p) = E_R/M$. Finally, this solution is also the solution to the original problem, i.e., with the constraints at the outputs of the SFBs, when those constraints are tight. This is achieved when, additionally, the phases of the pilot symbols are equal to each other, namely, $x_1(p) = \sqrt{E_S/M}\,e^{j\phi}$.
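The training design described above can be written down directly. The following Python sketch builds the source pilots for both phases and the relay energy allocation; the function name and its arguments are hypothetical, and the relay energy is taken as $e_R(p) = E_R/M$ so that the per-subcarrier energies sum to the budget $E_R$.

```python
import cmath
import math

def build_full_preamble(E_S, E_R, M, phi=0.0):
    """Equal-pilot training design (illustrative helper, not from the paper).

    Returns the phase-1 pilots x1, the phase-2 pilots x2 (all zero) and the
    uniform per-subcarrier relay energies e_R.
    """
    amp = math.sqrt(E_S / M)                  # |x1(p)|^2 = E_S / M for all p
    x1 = [amp * cmath.exp(1j * phi)] * M      # common phase phi on all subcarriers
    x2 = [0j] * M                             # E_2 = 0: nothing sent in phase 2
    e_R = [E_R / M] * M                       # uniform relay energy allocation
    return x1, x2, e_R

# Example with unit mean energy per pilot symbol (budgets equal to M).
M = 64
x1, x2, e_R = build_full_preamble(E_S=float(M), E_R=float(M), M=M, phi=0.3)
source_energy = sum(abs(v) ** 2 for v in x1)
relay_energy = sum(e_R)
```

With these choices, the input-side constraints $\sum_p e_1(p) \le E_S$ and $\sum_p e_R(p) \le E_R$ are met with equality.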

Simulation results
In the following, the energy budgets for training at S and R are chosen equal to the number of pilot symbols used in each case, so as to have a mean energy per pilot symbol equal to 1. QPSK data are transmitted (with unit energy per bit). Moreover, the noise signals at the destination and the relay are assumed to be of equal power. Filter banks, designed as in [28], are employed, with overlapping factor $K = 3$. The performance of the corresponding CP-OFDM system is included for the sake of comparison, where a CP of minimum length (equal to the channel order) was assumed. Results are shown for two channel models, with all the channels undergoing Rayleigh block fading. In the first case, they are generated with an exponential profile (of unit decay) and lengths $L_{SD} = 4$, $L_{SR} = 3$, and $L_{RD} = 2$. The ITU Veh-A profile is assumed in the second case, with a sampling rate equal to 0.9 GHz. The resulting channels have the same lengths $L_{SD} = L_{SR} = L_{RD} = 11$. In this second case, the SRD channel is much longer than the direct one, namely $L_R = 21$. Thus, to conform with the assumptions made earlier, one may assume that the SD impulse response is padded with 10 zeros here.
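As a rough illustration of the first channel model, the following Python sketch draws Rayleign-free of any relay processing, i.e., plain Rayleigh block-fading taps with an exponential power delay profile of unit decay. The normalization of the profile to unit total average power and all function names are assumptions made here for illustration. The convolution helper only shows why a cascade of two length-11 responses has length $11 + 11 - 1 = 21$; it ignores the per-subcarrier processing at the relay.

```python
import math
import random

def exponential_pdp(length, decay=1.0):
    """Exponential power delay profile, normalized to unit total power (assumed)."""
    raw = [math.exp(-decay * l) for l in range(length)]
    total = sum(raw)
    return [v / total for v in raw]

def rayleigh_channel(length, rng, decay=1.0):
    """One block-fading realization: independent complex Gaussian taps."""
    taps = []
    for power in exponential_pdp(length, decay):
        sigma = math.sqrt(power / 2.0)        # per real/imaginary component
        taps.append(complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)))
    return taps

def convolve(a, b):
    """Linear convolution; the output has len(a) + len(b) - 1 taps."""
    out = [0j] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

rng = random.Random(1)
h_SD = rayleigh_channel(4, rng)               # L_SD = 4
h_SR = rayleigh_channel(3, rng)               # L_SR = 3
h_RD = rayleigh_channel(2, rng)               # L_RD = 2
h_cascade = convolve(h_SR, h_RD)              # 3 + 2 - 1 = 4 taps
```

A new realization would be drawn for every transmitted block, in line with the block-fading assumption.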

The sparse preamble case
$M = 256$ subcarriers are used. First, channels of low frequency selectivity are considered, using the exponential channel profile described above. This channel approaches the requirement of equal neighboring CFRs. In Figure 2, the normalized MSE (NMSE), i.e., the estimation error energy normalized by $\|h\|^2$, is plotted versus SNR, for both optimal ($E_1 = E_S$) and suboptimal ($E_1 = E_2 = 0.5E_S$) source energy allocations between the two phases. All other training conditions hold as dictated by the optimal training design. As expected, the performance is significantly better when the optimal design is employed. Moreover, the two multicarrier systems perform similarly.
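For concreteness, the NMSE metric can be computed as in the following Python sketch. It assumes the common definition $\|h - \hat{h}\|^2 / \|h\|^2$, averaged over channel realizations; the exact averaging used for the figures is not restated here, and the function names are hypothetical.

```python
def nmse(h_true, h_est):
    """Squared estimation error normalized by the energy of the true channel."""
    err = sum(abs(a - b) ** 2 for a, b in zip(h_true, h_est))
    ref = sum(abs(a) ** 2 for a in h_true)
    return err / ref

def average_nmse(pairs):
    """Average the per-realization NMSE over (h_true, h_est) pairs."""
    values = [nmse(h, h_hat) for h, h_hat in pairs]
    return sum(values) / len(values)

# Toy example: a perfect estimate and an all-zero estimate of the same channel.
h = [1 + 1j, 0.5j, -0.25]
perfect = nmse(h, h)                   # no estimation error
useless = nmse(h, [0j, 0j, 0j])        # error energy equals channel energy
```

Plotting such values against SNR on a logarithmic scale gives curves of the kind discussed for Figure 2.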
In Figures 3 and 4, the (uncoded) bit error rate (BER) performances at the destination detector with QPSK input are shown for phases 1 and 2, respectively. It is pointed out here that these results are given in order to observe the impact of the additional interfering term that was identified in (38) and not to propose a new detection scheme for the adopted transmission protocol. This is due to the fact that no diversity combining techniques were utilized to increase the detection performance, meaning that the signals received at the destination, in the two phases, are actually processed separately. The SNR loss incurred by the CP redundancy in CP-OFDM was taken into account when calculating the corresponding BER. In the FBMC/OQAM-based relay, the amplification factors were chosen so as to have unit energy per information bit at the channel inputs of the S-R-D chain. One can observe a significant performance gain (of about 2 to 3 dB) over the suboptimal source energy allocation. Moreover, and not unexpectedly, the two multicarrier systems perform similarly in the detection of the first phase data (cf. Figure 3). In Figure 4, however, the destructive effect of the identified interference term (see Equation 39) and the importance of its (approximate) cancellation are demonstrated.
Observe the severe error floor in the optimal case without cancellation. On the other hand, no cancellation seems to be the best choice at low SNR values, because of the errors then incurred in the interference approximation due to channel estimation errors and $a_1$ decision error propagation. FBMC/OQAM performs slightly worse in the weak-noise region. This is attributed to the imperfect interference cancellation (which becomes more apparent in this noise region) and the composite noise term $w_2(p, 0)$ in (14) at the FBMC destination receiver because of the interference effect. One should add to this the effect of the residual interference caused by the fact that the subchannels in (12) and (14) are only approximately frequency flat.
The same experiment was performed for the Veh-A channel profile and thus, in this case, $L = 32$. Figures 5, 6, and 7 show the NMSE and the BER curves for the first and the second phase transmissions. The conclusions drawn for Figures 2 to 4 are still valid here. However, as observed in Figure 7, the FBMC/OQAM performance starts to floor at lower SNR values than previously, because the overall associated channel (S-R-D) is now longer and hence the assumption that leads to (6) is less well approximated.

The full preamble case
In this section, simulation results are reported for the full preamble structure. Veh-A channels were assumed for M = 64 and M = 256 subcarriers.

Assuming equal neighboring CFRs
Three scenarios were examined. In the first one, the derived optimal training conditions were respected. In the second and third scenarios, $E_1 = E_2$. The third scenario also permits the relay to depart from the uniform energy allocation and employ randomly chosen $\lambda$'s. The results are depicted in Figures 8 and 9 for $M = 64$ and $M = 256$, respectively. The normalized MSE performance is plotted versus SNR. In Figure 8, as expected, the FBMC/OQAM performance is considerably better at practical SNRs. One can also see that violating the training conditions deteriorates the performance for both multicarrier systems. Moreover, in weak-noise regimes, the inaccuracies of the assumed input-output model, which relies on the assumption of relatively low channel frequency selectivity, become more apparent, resulting in the well-known error floors in the FBMC/OQAM performance [13]. In Figure 9, similar conclusions can be drawn. However, in this case, the error floors are no longer present, because increasing the number of subcarriers leads to lower channel frequency selectivity and hence higher model accuracy.

Dropping the previous assumption
Here, the FBMC estimation performance is examined when the assumption of invariant CFRs is dropped. The simulation results are plotted in Figures 10 and 11 for $M = 64$ and $M = 256$, respectively, namely for conditions of high and low relative frequency selectivity. As one can observe, the estimation performance of FBMC is better than that of CP-OFDM for practical values of SNR. Moreover, in both figures, and at low SNR values, the use of the assumption of locally invariant CFR provides better estimates. This entails the use of 'pseudo-pilots', whose magnifying effect on the pilots' magnitude attenuates the channel estimation error, something which is more important when the noise power is high. However, in the weak-noise regime, the model inaccuracies become apparent and the relaxation of the above assumption leads to a better performance (a lower error floor; see Figure 10) because it describes the system more accurately. Relying on the assumption of a constant CFR can be advantageous at higher SNRs too, provided that the channel meets this requirement closely enough (see Figure 11).

Conclusions
In this paper, and for the first time in such systems, preamble-based channel estimation was studied in an FBMC/OQAM-based cooperative network of a source-destination pair that is supported by an AF relay. Both sparse and full optimal preamble designs were addressed. In the former case, the solution was given for both the optimal power allocation and pilot placement problems. In the latter case, we considered the design of the channel estimator and the associated preamble for channels of both large and smaller coherence bandwidth. The corresponding problem for CP-OFDM was also addressed, viewing CP-OFDM as a special case of the FBMC/OQAM-based system. The effects of subchannel frequency selectivity on the attained estimation and detection performance were evaluated via simulation results.
Future research in this context will be directed towards the more realistic scenarios of frequency-selective subchannels and lack of synchronism among nodes in the network.

Endnotes
a Sometimes referred to in the OFDM literature as block type and comb type, respectively.