Channel estimation for MIMO multi-relay systems using a tensor approach
EURASIP Journal on Advances in Signal Processing volume 2014, Article number: 163 (2014)
Abstract
In this paper, we address the channel estimation problem for multiple-input multiple-output (MIMO) multi-relay systems exploiting measurements collected at the destination only. Assuming that the source, relays, and destination are multiple-antenna devices and considering a three-hop amplify-and-forward (AF)-based training scheme, new channel estimation algorithms capitalizing on a tensor modeling of the end-to-end communication channel are proposed. Our approach provides the destination with the instantaneous knowledge of all the channel matrices involved in the communication. Instead of using separate estimations for each matrix, we are interested in a joint estimation approach. Two receiver algorithms are formulated to solve the joint channel estimation problem. The first one is an iterative method based on a trilinear alternating least squares (TALS) algorithm, while the second one is a closed-form solution based on a Kronecker least squares (KRLS) factorization. A useful lower bound on the channel training length is derived from an identifiability study. We also show that the proposed tensor-based approach is applicable to two-way MIMO relaying systems. Simulation results corroborate the effectiveness of the proposed estimators and provide a comparison with existing methods in terms of channel estimation accuracy and bit error rate (BER).
1 Introduction
Cooperative communications are considered a promising concept for improving link performance in modern wireless communication systems due to spatial diversity gains, enhanced coverage, and increased capacity [1–4]. In this context, relaying has been commonly accepted as a key technique to improve system performance by overcoming channel impairments, such as fading, shadowing, and path loss, in wireless fading channel environments [4–6]. By resorting to relay-assisted cooperation, multiple wireless links between mobile stations and base stations are established to create a virtual multiple-input multiple-output (MIMO) system [7]. In the simplest relay processing strategy, the relay stations amplify and forward the received data towards the base station. In this work, we adopt amplify-and-forward (AF) relaying due to its simplicity of implementation [5]. This strategy is preferable when fixed relay stations have a limited computation capacity as opposed to the base station.
The overall link reliability of cooperative diversity schemes strongly depends on the accuracy of channel state information (CSI) associated with the multiple hops involved in the overall communication. Moreover, the use of common precoding techniques at the source and/or relays generally requires instantaneous CSI knowledge of the different channels to optimize transmission [8, 9]. In practice, however, the CSI is unknown and has to be estimated with the aid of training sequences [10, 11]. For two-hop relaying systems, the associated channel matrices can be estimated in separate least squares (LS) estimation stages that operate sequentially at the destination [10]. When the communication involves additional hops, such a sequential LS estimation approach still applies by using additional transmission phases. The main problem is that channel estimation errors accumulate across the consecutive stages. In [11], a closed-form solution was proposed for the joint estimation of the channel matrices in a two-hop MIMO relaying system, avoiding error propagation.
A few recent works have developed efficient receiver algorithms based on tensor analysis for channel estimation and/or symbol detection in cooperative systems [12–16]. In [12], a training sequence-based channel estimation algorithm is proposed for two-way relaying systems with multiple antennas at the relays. More recently, in [14], a channel estimation algorithm based on the parallel factor (PARAFAC) model [17, 18] was developed for two-hop MIMO relay systems. The approach allows estimation of the channel matrices associated with both hops by resorting to training sequences. A few other recent works have developed tensor-based receivers for one-way two-hop cooperative systems [13, 15, 16]. In particular, the approach of Ximenes et al. [16] assumes Khatri-Rao space-time (KRST) coding [19] at the source node, and a semi-blind receiver is proposed by assuming the existence of a direct link between the source and the destination.
The approaches of Roemer and Haardt [12] and Rong et al. [14] allow a joint estimation of the channel matrices by resorting to training sequences. With the idea of avoiding the use of training sequences at the users’ and relays’ transmissions, the work [13] proposed a blind receiver for uplink multiuser cooperative diversity systems based on a PARAFAC model for the received signal. However, [13] is limited to a clustered relaying scenario, where relays belonging to the same cluster have the same spatial signature. The common feature of all these works is the assumption of only two hops (source-to-relays and relays-to-destination). To further extend the coverage area and combat channel impairments such as path loss and shadowing, it may be advantageous to introduce an additional hop along with an extra communication phase by means of three-hop relaying [5]. We highlight that the interest of the proposed work is in the joint channel estimation problem (i.e., joint channel and symbol estimation is not addressed here). The joint channel estimation problem was addressed in [12] for a two-way relaying system and in [14] for a one-way two-hop system. From a tensor modeling viewpoint, the common feature of both works is the use of the PARAFAC model. Herein, we focus on a one-way three-hop multi-relay system, while resorting to a PARATUCK2 model to derive the proposed algorithms.
In this work, novel channel estimators are proposed for MIMO multi-relay systems. Assuming that the source, relays, and destination are multiple-antenna devices and considering a three-hop AF-based training scheme, new channel estimation algorithms capitalizing on a multilinear structure of the end-to-end communication channel are proposed. The proposed approach is based on a PARATUCK2 tensor model [20] of the data collected at the destination only, which allows the channel matrices to be jointly estimated at the destination. Two receiver algorithms are formulated to solve the channel estimation problem. The first one is an iterative channel estimation method based on a trilinear alternating least squares (TALS) algorithm derived from a PARATUCK2 tensor model of the received data, while the second one is a closed-form solution based on a Kronecker least squares (KRLS) factorization. The proposed approach provides an extension of the idea recently proposed in [14] to a more general scenario with two-tier relaying using MIMO AF relays. Identifiability of the channel matrices is also examined in this work, and a useful lower bound on the channel training length is derived. In contrast to conventional pilot-assisted LS channel estimation, where the channel matrices are estimated separately in consecutive stages, our proposed algorithms make a more efficient use of cooperative diversity by providing a joint estimation of all the channel matrices. As will be clear later, such a joint channel estimation is possible due to the use of the tensor approach to model the end-to-end system.
In comparison with conventional (multi-stage) LS channel estimation [10], the proposed tensor-based estimators have two distinguishing features: i) they avoid accumulation of channel estimation errors since all the channel matrices are estimated simultaneously (either iteratively or in closed form), and ii) they can operate under less restrictive (and more flexible) conditions on the required number of antennas at the relays and/or destination, as will be clear from our identifiability analysis. Our approach also includes the PARAFAC-based channel estimator of [14] as a particular case. We also show that the proposed tensor modeling approach copes with a two-way MIMO multi-relaying communication system, where the TALS and KRLS channel estimators can be applied.
This paper is organized as follows. In section 2, the system model and working assumptions are described. Section 3 formulates the proposed approach. The data model is recast using tensor analysis, and the two channel estimation algorithms (TALS and KRLS) are derived. Identifiability of the channel matrices is also examined in this section. In section 4, we provide an extension of the proposed tensor-based signal model to a two-way MIMO relaying scenario. Numerical results are presented and discussed in section 5, and the conclusions are drawn in section 6.
Notation: Scalars are denoted by lowercase letters (a,b,…), vectors by lowercase boldface letters (a,b,…), matrices by uppercase boldface letters (A,B,…), and tensors by calligraphic letters $(\mathcal{A},\mathcal{B},\dots )$. A^{T} and A^{†} stand for the transpose and pseudo-inverse of A, respectively. The element (i,j) of A is denoted by a(i,j). The i th row of $\mathbf{A}\in {\mathbb{C}}^{I\times R}$ is denoted by A_{(i,:)}, while its r th column is denoted by A_{(:,r)}. The operator D_{i}(A) forms a diagonal matrix out of the i th row of A. The Khatri-Rao (column-wise Kronecker) product between $\mathbf{A}\in {\mathbb{C}}^{I\times R}$ and $\mathbf{B}\in {\mathbb{C}}^{J\times R}$ is denoted by $\mathbf{A}\odot \mathbf{B}=\left[{\mathbf{A}}_{(:,1)}\otimes {\mathbf{B}}_{(:,1)},\dots ,{\mathbf{A}}_{(:,R)}\otimes {\mathbf{B}}_{(:,R)}\right]\in {\mathbb{C}}^{IJ\times R}$.
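As a quick numerical illustration of this notation, the following NumPy sketch implements the Khatri-Rao product and the row-to-diagonal operator. The function names `khatri_rao` and `D` are illustrative choices (they do not come from the paper), and row indices are 0-based in the code.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x R) and B (J x R), giving an (I*J x R) matrix."""
    I, R = A.shape
    J, R_b = B.shape
    if R != R_b:
        raise ValueError("A and B must have the same number of columns")
    # Row i*J + j of column r holds A[i, r] * B[j, r].
    return np.einsum("ir,jr->ijr", A, B).reshape(I * J, R)

def D(A, i):
    """Diagonal matrix formed from the i-th row of A (0-based index)."""
    return np.diag(A[i, :])

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0], [2, 2]])
P = khatri_rao(A, B)   # 6 x 2 matrix
```

Each column of `P` is the Kronecker product of the corresponding columns of `A` and `B`, which is the structure exploited throughout the tensor model.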
2 System model
We consider a three-hop MIMO AF communication system where the source node transmits information to the destination node with the aid of R_{1} relays in the first tier and R_{2} relays in the second tier. As shown in Figure 1, the source and destination nodes are equipped with N_{s}≥2 and N_{d}≥2 antennas, respectively, and half-duplex relays are considered. The q th relay of tier 1, which receives data from the source node, is equipped with I_{q} antennas, q=1,…,R_{1}, while the p th relay of tier 2, which receives data from tier 1 relays, is equipped with J_{p} antennas, p=1,…,R_{2}. The total numbers of antennas that transmit in the second and third phases are denoted by ${N}_{1}={I}_{1}+\cdots +{I}_{{R}_{1}}$ and ${N}_{2}={J}_{1}+\cdots +{J}_{{R}_{2}}$, respectively.
Some key assumptions are now given: (i) relays are synchronized at the symbol level. More specifically, the timing offset is assumed to be within one symbol period, so that timing information is acquired only through some form of coarse synchronization; (ii) fading is assumed to be frequency flat, and the data block size is smaller than the channel coherence time so that the channel is considered as time invariant; (iii) the direct links between the source (resp. tier 1 relays) and the destination node are not available^{a}. This situation is evidenced in the current uplink of IEEE 802.16j.
2.1 Data model
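The communication between source and destination is accomplished in three hops. In the first hop, the modulated signal vector ${\mathbf{u}}_{\mathrm{s}}\left(t\right)\in {\mathbb{C}}^{{N}_{s}\times 1}$ is transmitted to the R_{1} relays of tier 1. The received signal at the q th relay of tier 1 can be written as
$${\mathbf{y}}_{\text{sr}}^{(q)}(t)={\mathbf{H}}_{\text{sr}}^{(q)}{\mathbf{u}}_{\mathrm{s}}(t)+{\mathbf{v}}_{\text{sr}}^{(q)}(t),\tag{1}$$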
where ${\mathbf{y}}_{\text{sr}}^{\left(q\right)}\left(t\right)\in {\mathbb{C}}^{{I}_{q}\times 1}$ is the received signal vector at the q th relay of tier 1, ${\mathbf{H}}_{\text{sr}}^{\left(q\right)}\in {\mathbb{C}}^{{I}_{q}\times {N}_{s}}$ is the MIMO channel between the source and the q th tier 1 relay, and ${\mathbf{v}}_{\text{sr}}^{\left(q\right)}\left(t\right)\in {\mathbb{C}}^{{I}_{q}\times 1}$ is an additive noise vector. Noise samples are modeled as independent and identically distributed complex Gaussian random variables with zero mean and unit variance.
In the second hop, the source stops transmission and all the R_{1} relays of tier 1 amplify their received signals with diagonal AF matrices ${\mathbf{G}}^{\left(1\right)},\dots ,{\mathbf{G}}^{\left({R}_{1}\right)}$ and simultaneously forward the resulting signals to the tier 2 relays. The received signal vector at the p th relay of tier 2 is then given by
$${\mathbf{y}}_{\text{rr}}^{(p)}(t+1)=\sum_{q=1}^{{R}_{1}}{\mathbf{H}}_{\text{rr}}^{(p,q)}{\mathbf{G}}^{(q)}{\mathbf{y}}_{\text{sr}}^{(q)}(t)+{\mathbf{v}}_{\text{rr}}^{(p)}(t+1),\tag{2}$$
where ${\mathbf{H}}_{\text{rr}}^{(p,q)}\in {\mathbb{C}}^{{J}_{p}\times {I}_{q}}$ is the MIMO channel linking the q th tier 1 relay to the p th tier 2 relay, while ${\mathbf{v}}_{\text{rr}}^{\left(p\right)}\left(t+1\right)\in {\mathbb{C}}^{{J}_{p}\times 1}$ denotes the corresponding noise vector. In the third hop, the source and all tier 1 relays are silent, while the tier 2 relays process the received signal vector with the diagonal AF matrices ${\mathbf{J}}^{\left(1\right)},\dots ,{\mathbf{J}}^{\left({R}_{2}\right)}$ and forward their amplified signals to the destination. The received signal vector at the destination is then given by
$${\mathbf{y}}_{\text{rd}}(t+2)=\sum_{p=1}^{{R}_{2}}{\mathbf{H}}_{\text{rd}}^{(p)}{\mathbf{J}}^{(p)}{\mathbf{y}}_{\text{rr}}^{(p)}(t+1)+{\mathbf{v}}_{\text{rd}}(t+2),\tag{3}$$
where ${\mathbf{H}}_{\text{rd}}^{\left(p\right)}\in {\mathbb{C}}^{{N}_{d}\times {J}_{p}}$ is the MIMO channel linking the p th tier 2 relay to the destination, and ${\mathbf{v}}_{\text{rd}}(t+2)\in {\mathbb{C}}^{{N}_{d}\times 1}$ is the corresponding additive noise term.
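Let us define the multi-relay (block) channel matrices
$${\mathbf{H}}_{\text{sr}}\doteq {\left[{\mathbf{H}}_{\text{sr}}^{(1)\mathrm{T}},\dots ,{\mathbf{H}}_{\text{sr}}^{({R}_{1})\mathrm{T}}\right]}^{\mathrm{T}}\in {\mathbb{C}}^{{N}_{1}\times {N}_{s}},\tag{4}$$
$${\mathbf{H}}_{\text{rr}}\doteq \begin{bmatrix}{\mathbf{H}}_{\text{rr}}^{(1,1)} & \cdots & {\mathbf{H}}_{\text{rr}}^{(1,{R}_{1})}\\ \vdots & \ddots & \vdots \\ {\mathbf{H}}_{\text{rr}}^{({R}_{2},1)} & \cdots & {\mathbf{H}}_{\text{rr}}^{({R}_{2},{R}_{1})}\end{bmatrix}\in {\mathbb{C}}^{{N}_{2}\times {N}_{1}},\tag{5}$$
$${\mathbf{H}}_{\text{rd}}\doteq \left[{\mathbf{H}}_{\text{rd}}^{(1)},\dots ,{\mathbf{H}}_{\text{rd}}^{({R}_{2})}\right]\in {\mathbb{C}}^{{N}_{d}\times {N}_{2}},\tag{6}$$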
and let $\mathbf{G}\doteq \text{bdiag}\left[{\mathbf{G}}^{\left(1\right)},\dots ,{\mathbf{G}}^{\left({R}_{1}\right)}\right]\in {\mathbb{C}}^{{N}_{1}\times {N}_{1}}$ and $\mathbf{J}\doteq \text{bdiag}\left[{\mathbf{J}}^{\left(1\right)},\dots ,{\mathbf{J}}^{\left({R}_{2}\right)}\right]\in {\mathbb{C}}^{{N}_{2}\times {N}_{2}}$ be the two diagonal matrices that collect the AF coefficients of the overall multi-relay system. Using these definitions, together with (1) and (2), we can rewrite (3) as follows:
$${\mathbf{y}}_{\text{rd}}(t+2)={\mathbf{H}}_{\text{rd}}\mathbf{J}{\mathbf{H}}_{\text{rr}}\mathbf{G}{\mathbf{H}}_{\text{sr}}{\mathbf{u}}_{\mathrm{s}}(t)+{\overline{\mathbf{v}}}_{\text{rd}}(t+2),\tag{7}$$
where ${\overline{\mathbf{v}}}_{\text{rd}}\left(t+2\right)={\overline{\mathbf{v}}}_{\text{sr}}\left(t\right)+{\overline{\mathbf{v}}}_{\text{rr}}\left(t+1\right)+{\mathbf{v}}_{\text{rd}}\left(t+2\right)$ is the total noise at the destination, which contains the filtered noise contributions from the multiple relays, with ${\overline{\mathbf{v}}}_{\text{sr}}\left(t\right)={\mathbf{H}}_{\text{rd}}\mathbf{J}{\mathbf{H}}_{\text{rr}}\mathbf{G}{\mathbf{v}}_{\text{sr}}\left(t\right)$, ${\overline{\mathbf{v}}}_{\text{rr}}\left(t+1\right)={\mathbf{H}}_{\text{rd}}\mathbf{J}{\mathbf{v}}_{\text{rr}}\left(t+1\right)$, ${\mathbf{v}}_{\text{rr}}\left(t+1\right)\doteq {\left[{\mathbf{v}}_{\text{rr}}^{\left(1\right)\mathrm{T}}\left(t+1\right),\dots ,{\mathbf{v}}_{\text{rr}}^{\left({R}_{2}\right)\mathrm{T}}\left(t+1\right)\right]}^{\mathrm{T}}\in {\mathbb{C}}^{{N}_{2}\times 1}$, and ${\mathbf{v}}_{\text{sr}}\left(t\right)\doteq {\left[{\mathbf{v}}_{\text{sr}}^{\left(1\right)\mathrm{T}}\left(t\right),\dots ,{\mathbf{v}}_{\text{sr}}^{\left({R}_{1}\right)\mathrm{T}}\left(t\right)\right]}^{\mathrm{T}}\in {\mathbb{C}}^{{N}_{1}\times 1}$.
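The equivalence between the hop-by-hop description and the compact end-to-end form can be checked numerically. The sketch below is only an illustration under assumed sizes (two relays per tier with unequal antenna counts) and randomly drawn channels; all variable names are hypothetical. It propagates one symbol vector relay by relay and compares the result with the product H_{rd}J H_{rr}G H_{sr} applied to the source signal plus the filtered noise terms.

```python
import numpy as np

rng = np.random.default_rng(0)

def crandn(*shape):
    """Unit-variance circular complex Gaussian samples (illustrative helper)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# Assumed sizes, chosen only for illustration.
Ns, Nd = 2, 2
I = [2, 1]                      # antennas of the tier 1 relays (R1 = 2)
J = [1, 2]                      # antennas of the tier 2 relays (R2 = 2)
N1, N2 = sum(I), sum(J)
idx1 = np.split(np.arange(N1), np.cumsum(I)[:-1])   # antenna indices of each tier 1 relay
idx2 = np.split(np.arange(N2), np.cumsum(J)[:-1])   # antenna indices of each tier 2 relay

H_sr, H_rr, H_rd = crandn(N1, Ns), crandn(N2, N1), crandn(Nd, N2)
g_af, j_af = crandn(N1), crandn(N2)                 # diagonals of G and J
u = crandn(Ns)
v_sr, v_rr, v_rd = crandn(N1), crandn(N2), crandn(Nd)

# Hop 1: tier 1 relay q observes its own block of H_sr.
y_sr = [H_sr[q] @ u + v_sr[q] for q in idx1]
# Hop 2: tier 2 relay p receives the amplified signals of all tier 1 relays.
y_rr = [sum(H_rr[np.ix_(p, q)] @ (g_af[q] * y_sr[n]) for n, q in enumerate(idx1)) + v_rr[p]
        for p in idx2]
# Hop 3: the destination receives the amplified signals of all tier 2 relays.
y_rd = sum(H_rd[:, p] @ (j_af[p] * y_rr[m]) for m, p in enumerate(idx2)) + v_rd

# Compact end-to-end form with the filtered relay noise contributions.
G, Jm = np.diag(g_af), np.diag(j_af)
y_compact = (H_rd @ Jm @ H_rr @ G @ H_sr @ u
             + H_rd @ Jm @ H_rr @ G @ v_sr + H_rd @ Jm @ v_rr + v_rd)
```

Both computations agree to numerical precision, which is the content of the block-matrix rewriting.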
Note that, since this work is concerned with channel estimation, the AF matrices G and J cannot be optimized at the transmission (source and relays). Therefore, for simplicity, we have assumed that these matrices are diagonal. The use of non-diagonal AF matrices in the proposed approach is left for future work. Note also that, once the channels are estimated, the design of full AF matrices can be done, e.g., based on the SVD of the channel matrices, following the idea of [9], or on the mean-square error (MSE) criterion [21]. If simplified AF schemes are used, where only power allocation is done, G and J are diagonal matrices, the coefficients of which can be designed as a function of the mean channel and noise powers [5] or optimized from power allocation strategies, as shown recently in [22].
2.2 Conventional LS estimation method
The simplest approach to estimate the effective channel H_{eff}=H_{rd}J H_{rr}G H_{sr} (including the amplifying factors) is based on training sequences. If separate estimates of the multi-relay channels H_{rd}, H_{rr}, and H_{sr} are required, for instance, to optimize the source precoding matrix and the relays’ AF matrices, three separate LS estimation stages should operate sequentially at the destination. The method would work similarly to that of Kong and Hua [10]. Denote ${\mathbf{S}}_{0}\in {\mathbb{C}}^{{N}_{s}\times {L}_{0}}$ as the training sequence matrix sent by the source node, while ${\mathbf{S}}_{1d}\in {\mathbb{C}}^{{N}_{1}\times {L}_{1}}$ and ${\mathbf{S}}_{2d}\in {\mathbb{C}}^{{N}_{2}\times {L}_{2}}$ are the training sequence matrices sent by the relays at tiers 1 and 2, respectively. Assume that orthogonal training sequences are used in all stages, which implies training sequences of length L_{0}≥N_{s}, L_{1}≥N_{1}, and L_{2}≥N_{2} at the source, tier 1, and tier 2 relays, respectively. In the first stage, S_{2d} is transmitted from all tier 2 relays to the destination. The LS estimate of H_{rd} is obtained as
where ${\mathbf{Y}}_{1}\in {\mathbb{C}}^{{N}_{d}\times {L}_{2}}$ is the received signal matrix at the destination during the first training stage. In the second stage, S_{1d} is transmitted from all tier 1 relays to the destination via AF processing at the tier 2 relays. Defining ${\mathbf{Y}}_{2}\in {\mathbb{C}}^{{N}_{d}\times {L}_{1}}$ as the data received from tier 1 relays at the second training stage, an LS estimate of H_{rr} can be obtained as
Finally, S_{0} is transmitted from the source to the destination via the two tiers of relays. The destination collects the received data in ${\mathbf{Y}}_{3}\in {\mathbb{C}}^{{N}_{d}\times {L}_{0}}$. An estimate of H_{sr} is then found as
This method requires 6 transmission phases to provide the destination with all the channel matrices (1 phase for estimating H_{rd}, 2 phases for estimating H_{rr}, and 3 phases for estimating H_{sr}). Note that the channel estimation errors accumulate across the consecutive stages, due to the dependency between successive channel estimates. Moreover, this method requires N_{d}≥N_{2}≥N_{1} for the uniqueness of the LS estimates ${\hat{\mathbf{H}}}_{\text{rr}}$ and ${\hat{\mathbf{H}}}_{\text{sr}}$. In the following, we adopt a different path to solve this problem by capitalizing on tensor analysis. The idea is to provide the destination with a joint estimate of all the partial channels H_{rd}, H_{rr}, and H_{sr} by exploiting the tensor structure of the end-to-end signal model. The proposed approach allows channel estimation to be performed under less restrictive conditions on the number N_{d} of receive antennas at the destination compared with the conventional LS estimator, while avoiding error accumulation.
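The three sequential stages can be sketched as follows. This is a hedged illustration rather than the estimator of [10]: the pseudo-inverse expressions are the natural LS solutions implied by the description above and are an assumption, relay noise is ignored (only destination noise, scaled by the hypothetical parameter `sigma`, is modeled), and the training matrices are taken as rows of a unitary DFT matrix.

```python
import numpy as np

rng = np.random.default_rng(1)

def crandn(*shape):
    """Unit-variance circular complex Gaussian samples (illustrative helper)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def dft_training(n_rows, length):
    """First n_rows rows of a unitary DFT matrix, so that S @ S^H = I (assumed design)."""
    k = np.arange(length)
    return np.exp(-2j * np.pi * np.outer(np.arange(n_rows), k) / length) / np.sqrt(length)

Ns, N1, N2, Nd = 2, 2, 3, 4            # satisfies Nd >= N2 >= N1
H_sr, H_rr, H_rd = crandn(N1, Ns), crandn(N2, N1), crandn(Nd, N2)
G = np.diag(rng.uniform(0.5, 1.5, N1))  # AF gains of tier 1 (assumed known)
J = np.diag(rng.uniform(0.5, 1.5, N2))  # AF gains of tier 2 (assumed known)
S0, S1, S2 = dft_training(Ns, Ns), dft_training(N1, N1), dft_training(N2, N2)
sigma = 0.0                             # destination noise level (0 gives a noiseless check)

# Stage 1: tier 2 relays send S2 directly to the destination.
Y1 = H_rd @ S2 + sigma * crandn(Nd, N2)
H_rd_hat = Y1 @ np.linalg.pinv(S2)

# Stage 2: tier 1 relays send S1, forwarded by tier 2; uses the estimate of H_rd.
Y2 = H_rd @ J @ H_rr @ S1 + sigma * crandn(Nd, N1)
H_rr_hat = np.linalg.pinv(H_rd_hat @ J) @ Y2 @ np.linalg.pinv(S1)

# Stage 3: the source sends S0 through both tiers; uses both previous estimates.
Y3 = H_rd @ J @ H_rr @ G @ H_sr @ S0 + sigma * crandn(Nd, Ns)
H_sr_hat = np.linalg.pinv(H_rd_hat @ J @ H_rr_hat @ G) @ Y3 @ np.linalg.pinv(S0)
```

Because stages 2 and 3 reuse earlier estimates, any error in `H_rd_hat` feeds into `H_rr_hat` and `H_sr_hat` when `sigma` is nonzero, which is the error accumulation discussed above. The left pseudo-inverses also show why N_{d}≥N_{2}≥N_{1} is needed in this scheme.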
3 Proposed approach
In order to derive the proposed channel estimators, we first recast the formulation of the system model by resorting to multi-way (tensor) analysis. First, let us divide the overall training period into K time blocks. In every time block, the same training sequence matrix ${\mathbf{S}}_{0}\in {\mathbb{C}}^{{N}_{s}\times {L}_{0}}$ is transmitted by the source node. In the k th time block, the relays of tiers 1 and 2 use the AF matrices G_{k} and J_{k}, respectively, k=1,…,K. Let us define $\mathbf{E}\in {\mathbb{C}}^{K\times {N}_{1}}$ and $\mathbf{F}\in {\mathbb{C}}^{K\times {N}_{2}}$ as channel training matrices such that ${\mathrm{D}}_{k}\left(\mathbf{E}\right)\doteq {\mathbf{G}}_{k}$ and ${\mathrm{D}}_{k}\left(\mathbf{F}\right)\doteq {\mathbf{J}}_{k}$, where ${\mathrm{D}}_{k}(\cdot )$ forms a diagonal matrix out of the k th row of its matrix argument. Otherwise stated, the rows of E (resp. F) hold the AF coefficients of the R_{1} (resp. R_{2}) relays associated with the different time blocks. Then, the signal received at the destination during the k th time block can be written as:
$${\mathbf{Y}}_{k}={\mathbf{H}}_{\text{rd}}{\mathrm{D}}_{k}\left(\mathbf{F}\right){\mathbf{H}}_{\text{rr}}{\mathrm{D}}_{k}\left(\mathbf{E}\right){\mathbf{H}}_{\text{sr}}{\mathbf{S}}_{0}+{\mathbf{V}}_{k}+{\mathbf{V}}_{\text{rd},k},\tag{11}$$
where ${\mathbf{V}}_{k}={\mathbf{H}}_{\text{rd}}{\mathrm{D}}_{k}\left(\mathbf{F}\right){\mathbf{H}}_{\text{rr}}{\mathrm{D}}_{k}\left(\mathbf{E}\right){\mathbf{V}}_{\text{sr},k}+{\mathbf{H}}_{\text{rd}}{\mathrm{D}}_{k}\left(\mathbf{F}\right){\mathbf{V}}_{\text{rr},k}$, ${\mathbf{V}}_{\text{sr},k}\in {\mathbb{C}}^{{N}_{1}\times {L}_{0}}$ is the noise matrix at the tier 1 relays during the k th time block, ${\mathbf{V}}_{\text{rr},k}\in {\mathbb{C}}^{{N}_{2}\times {L}_{0}}$ is the noise matrix at the tier 2 relays during the k th time block, and ${\mathbf{V}}_{\text{rd},k}\in {\mathbb{C}}^{{N}_{d}\times {L}_{0}}$ is the noise matrix at the destination during the k th time block.
Regarding the structure of the channel training matrices $\mathbf{E}\in {\mathbb{C}}^{K\times {N}_{1}}$ and $\mathbf{F}\in {\mathbb{C}}^{K\times {N}_{2}}$, unless otherwise stated, their columns are chosen as length-K random sequences following a uniform distribution over [−1, 1]. These sequences are defined beforehand and known at the destination node. With such a choice, the signals transmitted by the relays across the K time blocks have random phases and are subject to limited power fluctuations. Clearly, this design is not optimal for minimizing the mean square error of the channel estimation. Determining an optimum design for these matrices is a difficult problem and is not pursued in this work. Nevertheless, extensive computer simulations have demonstrated that this choice yields very good results. We will come back later to the problem of choosing E and F from a channel identifiability viewpoint, where a more elaborate design of these matrices will be proposed.
Upon reception of the data matrix Y_{k}, k=1,…,K, an unstructured estimate of the end-to-end channel during the k th time block is first obtained at the destination. Multiplying both sides of (11) with the known training sequence matrix ${\mathbf{S}}_{0}^{\mathrm{H}}$ yields
k=1,⋯,K. Let us introduce
where
$${\mathbf{H}}_{k}\doteq {\mathbf{H}}_{\text{rd}}{\mathrm{D}}_{k}\left(\mathbf{F}\right){\mathbf{H}}_{\text{rr}}{\mathrm{D}}_{k}\left(\mathbf{E}\right){\mathbf{H}}_{\text{sr}}\in {\mathbb{C}}^{{N}_{d}\times {N}_{s}}\tag{14}$$
is the matrix of interest that represents the effective end-to-end channel, V_{k} is the total noise matrix, and ${\stackrel{~}{\mathbf{H}}}_{k}$ is the noisy observation of H_{k}. We can assemble the set {H_{1},⋯,H_{K}} to form a three-way array, or a third-order tensor, $\mathcal{H}\in {\mathbb{C}}^{{N}_{d}\times {N}_{s}\times K}$, whose dimensions are N_{d} (first dimension), N_{s} (second dimension), and K (third dimension).
Equation (14) corresponds to a PARATUCK2 model of the (noiseless) tensor $\mathcal{H}$ [23]. The PARATUCK2 model first appeared in [20]. A more comprehensive formulation is given in [23], which also details an alternating least squares procedure for estimating its matrix factors. Here, we show that this tensor model can be exploited to derive novel channel estimators for a cooperative MIMO relaying system.
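To make the slice structure concrete, the following sketch builds the noiseless channel tensor from its frontal slices, H_{k}=H_{rd}D_{k}(F)H_{rr}D_{k}(E)H_{sr}, once with an explicit loop and once with a single `numpy.einsum` contraction. The dimensions and random draws are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)

def crandn(*shape):
    """Complex Gaussian samples (illustrative helper)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

Ns, N1, N2, Nd, K = 2, 3, 3, 2, 5         # assumed sizes
H_sr, H_rr, H_rd = crandn(N1, Ns), crandn(N2, N1), crandn(Nd, N2)
E = rng.uniform(-1, 1, (K, N1))            # row k holds the tier 1 AF coefficients of block k
F = rng.uniform(-1, 1, (K, N2))            # row k holds the tier 2 AF coefficients of block k

# Slice-by-slice construction: one Nd x Ns matrix per time block.
H_loop = np.stack(
    [H_rd @ np.diag(F[k]) @ H_rr @ np.diag(E[k]) @ H_sr for k in range(K)], axis=2
)

# Same tensor in one contraction: a = destination, b = tier 2, c = tier 1, d = source.
H_tensor = np.einsum("ab,kb,bc,kc,cd->adk", H_rd, F, H_rr, E, H_sr)
```

The array `H_tensor` has shape (N_d, N_s, K); the diagonal weighting by the rows of E and F is what distinguishes PARATUCK2 from a plain Tucker2 product.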
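Now, let us define
$${\mathbf{H}}_{\left[1\right]}\doteq \left[\text{vec}\left({\mathbf{H}}_{1}\right),\dots ,\text{vec}\left({\mathbf{H}}_{K}\right)\right]\in {\mathbb{C}}^{{N}_{d}{N}_{s}\times K},\tag{15}$$
where H_{[1]} is a matrix ‘unfolding’ of the tensor $\mathcal{H}$ obtained by stacking column-wise its K vectorized slices. Define also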
Substituting (14) into (15), and applying property vec(A C B)=(B^{T}⊗A)vec(C), we get
where
${\mathbf{E}}_{(k,:)}\in {\mathbb{C}}^{1\times {N}_{1}}$ (resp. ${\mathbf{F}}_{(k,:)}\in {\mathbb{C}}^{1\times {N}_{2}}$) denotes the k th row of E (resp. F), and ⊙ is the Khatri-Rao (column-wise Kronecker) product.
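The two vectorization properties used in this derivation can be verified numerically. In the sketch below, the factorized form of H_{[1]} and the matrix Ω_{1} are written out from the slice model and the properties vec(A C B)=(B^{T}⊗A)vec(C) and vec(A diag(x)B)=(B^{T}⊙A)x; treating h_{rr} as vec(H_{rr}) and the exact layout of the intermediate expressions are assumptions of this illustration, as are all sizes and names.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product."""
    return np.einsum("ir,jr->ijr", A, B).reshape(A.shape[0] * B.shape[0], A.shape[1])

def vec(M):
    """Column-major vectorization."""
    return M.flatten(order="F")

rng = np.random.default_rng(3)
Ns, N1, N2, Nd, K = 2, 3, 2, 2, 4         # assumed sizes
H_sr = rng.standard_normal((N1, Ns)) + 1j * rng.standard_normal((N1, Ns))
H_rr = rng.standard_normal((N2, N1)) + 1j * rng.standard_normal((N2, N1))
H_rd = rng.standard_normal((Nd, N2)) + 1j * rng.standard_normal((Nd, N2))
E = rng.uniform(-1, 1, (K, N1))
F = rng.uniform(-1, 1, (K, N2))

# Unfolding H_[1]: vectorized slices stacked column-wise.
H_1 = np.column_stack(
    [vec(H_rd @ np.diag(F[k]) @ H_rr @ np.diag(E[k]) @ H_sr) for k in range(K)]
)

h_rr = vec(H_rr)                           # assumed definition of h_rr
Z = khatri_rao(E.T, F.T)                   # N1*N2 x K, column k = E[k]^T kron F[k]^T
A = np.kron(H_sr.T, H_rd)                  # Nd*Ns x N1*N2

# Mixed Kronecker / Khatri-Rao factorization of the unfolding.
H_1_fact = A @ np.diag(h_rr) @ Z

# Linear model for vec(H_[1]) in the unknown h_rr.
Omega_1 = khatri_rao(Z.T, A)               # Nd*Ns*K x N1*N2
```

The matrix `Omega_1` has the same Khatri-Rao structure and dimensions as the Ω_{1} that appears in the identifiability analysis.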
Applying property vec(A diag(x)B)=(B^{T}⊙A)x, we get from (17) the following expression:
where
In addition to the matrix unfolding H_{[1]}, it is useful to define two other matrix unfoldings, which collect the information of tensor $\mathcal{H}$. Therefore, let us now define
From (14) and (16), it follows that
and
or, more compactly,
where
3.1 Identifiability of channel matrices
Identifiability of H_{rr}, H_{sr}, and H_{rd} in the LS sense from H_{[1]}, H_{[2]}, and H_{[3]}, respectively (see Equations (19), (24), and (25)), requires that ${\mathit{\Omega}}_{1}=\left[{\left({\mathbf{E}}^{\mathrm{T}}\odot {\mathbf{F}}^{\mathrm{T}}\right)}^{\mathrm{T}}\odot \left({\mathbf{H}}_{\text{sr}}^{\mathrm{T}}\otimes {\mathbf{H}}_{\text{rd}}\right)\right]\in {\mathbb{C}}^{{N}_{d}{N}_{s}K\times {N}_{1}{N}_{2}}$, ${\mathbf{Z}}_{\left[2\right]}\doteq \left({\mathbf{I}}_{K}\otimes {\mathbf{H}}_{\text{rd}}\right){\mathit{\Omega}}_{2}\in {\mathbb{C}}^{{N}_{d}K\times {N}_{1}}$, and ${\mathbf{Z}}_{\left[3\right]}\doteq \left({\mathbf{I}}_{K}\otimes {\mathbf{H}}_{\text{sr}}^{\mathrm{T}}\right){\mathit{\Omega}}_{3}\in {\mathbb{C}}^{{N}_{s}K\times {N}_{2}}$ be full column-rank. These requirements come from the fact that Ω_{1}, Z_{[2]}, and Z_{[3]} must be left-invertible, from which the following necessary conditions are obtained:
$${N}_{d}{N}_{s}K\ge {N}_{1}{N}_{2},\qquad {N}_{d}K\ge {N}_{1},\qquad {N}_{s}K\ge {N}_{2}.\tag{27}$$
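From the three inequalities and from the fact that we must have K≥2, the lower bound on the number K of time blocks necessary for identifiability is given by
$$K\ge \max \left(\left\lceil \frac{{N}_{1}{N}_{2}}{{N}_{d}{N}_{s}}\right\rceil ,\left\lceil \frac{{N}_{1}}{{N}_{d}}\right\rceil ,\left\lceil \frac{{N}_{2}}{{N}_{s}}\right\rceil ,2\right),\tag{28}$$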
where ⌈x⌉ is equal to the smallest integer that is greater than or equal to x.
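The lower bound is easy to evaluate for a given antenna configuration. The helper below (the name `min_training_blocks` is hypothetical) takes the three necessary conditions N_{d}N_{s}K≥N_{1}N_{2}, N_{d}K≥N_{1}, and N_{s}K≥N_{2}, together with K≥2, and returns the smallest K that satisfies all of them. It reflects the necessary conditions only, not sufficiency.

```python
def min_training_blocks(Ns, N1, N2, Nd):
    """Smallest K meeting the necessary identifiability conditions and K >= 2."""
    def ceil_div(a, b):
        # Integer ceiling of a / b for positive integers.
        return -(-a // b)

    return max(
        ceil_div(N1 * N2, Nd * Ns),   # from Nd * Ns * K >= N1 * N2
        ceil_div(N1, Nd),             # from Nd * K >= N1
        ceil_div(N2, Ns),             # from Ns * K >= N2
        2,
    )

K_small = min_training_blocks(Ns=2, N1=2, N2=2, Nd=2)   # 2
K_mid = min_training_blocks(Ns=2, N1=4, N2=4, Nd=2)     # 4
K_dense = min_training_blocks(Ns=2, N1=6, N2=6, Nd=2)   # 9
```

The last case illustrates the trade-off: with many relay antennas and few destination antennas, the first condition dominates and the training length grows with the product N_{1}N_{2}.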
Note that the identifiability of the channel matrices H_{sr}, H_{rr}, and H_{rd} from the unstructured channel tensor $\mathcal{H}$ ensures that the compound channel ${\mathbf{H}}_{c}={\mathbf{H}}_{\text{rd}}{\mathbf{H}}_{\text{rr}}{\mathbf{H}}_{\text{sr}}\in {\mathbb{C}}^{{N}_{d}\times {N}_{s}}$ is strictly unique. Note also that conditions N_{d}N_{s}K≥N_{1}N_{2} and N_{d}K≥N_{1} are clearly much less restrictive in terms of the required number N_{d} of antennas at the destination node, in comparison with the conventional three-step LS estimator that requires N_{d}≥N_{2}≥N_{1}. Otherwise stated, estimation of the partial channels can be done even in situations where the number of receive antennas is much smaller than the number of relay antennas (provided that K satisfies condition (28)). This situation may arise in scenarios with denser deployments of relay stations, where the total number of relay antennas exceeds the number of source and/or destination antennas. As shown by these inequalities, the possibility of affording fewer receive antennas is compensated by an increase in the number K of training blocks, which represents a trade-off.
Condition (28), although necessary, is not sufficient for identifiability. Since ${\mathbf{Z}}_{\left[2\right]}\doteq \left({\mathbf{I}}_{K}\otimes {\mathbf{H}}_{\text{rd}}\right){\mathit{\Omega}}_{2}\in {\mathbb{C}}^{{N}_{d}K\times {N}_{1}}$ and ${\mathbf{Z}}_{\left[3\right]}\doteq \left({\mathbf{I}}_{K}\otimes {\mathbf{H}}_{\text{sr}}^{\mathrm{T}}\right){\mathit{\Omega}}_{3}\in {\mathbb{C}}^{{N}_{s}K\times {N}_{2}}$, we must additionally have rank(Ω_{2})=N_{1} and rank(Ω_{3})=N_{2}, i.e., both ${\mathit{\Omega}}_{2}\in {\mathbb{C}}^{{N}_{2}K\times {N}_{1}}$ and ${\mathit{\Omega}}_{3}\in {\mathbb{C}}^{{N}_{1}K\times {N}_{2}}$ must be full column-rank. Otherwise, Z_{[2]} and Z_{[3]} will be rank-deficient, even if (28) is respected.
Let us assume that the partial channels H_{sr}, H_{rr}, and H_{rd} are full rank matrices, which is a reasonable assumption when the wireless links are assumed to undergo scattering-rich multipath propagation. The following corollaries can then be obtained:

C1: If N_{1}=N_{2}, identifiability of the partial channels is guaranteed for N_{1}≤N_{s} and N_{2}≤N_{d};
C2: If N_{1}=1, identifiability of the partial channels is guaranteed for N_{2}≤N_{d} and N_{2}≤K;
C3: If N_{2}=1, identifiability of the partial channels is guaranteed for N_{1}≤N_{s} and N_{1}≤K.
Remark: For the first corollary, we can note that if N_{1}≤N_{s} and N_{2}≤N_{d}, then ${\mathbf{H}}_{\text{sr}}^{\mathrm{T}}\otimes {\mathbf{H}}_{\text{rd}}$ is full column-rank, which ensures that ${\mathit{\Omega}}_{1}\in {\mathbb{C}}^{{N}_{d}{N}_{s}K\times {N}_{1}{N}_{2}}$ is full column-rank due to its Khatri-Rao product structure [24]. Likewise, ${\mathit{\Omega}}_{2}\in {\mathbb{C}}^{{N}_{2}K\times {N}_{1}}$ and ${\mathit{\Omega}}_{3}\in {\mathbb{C}}^{{N}_{1}K\times {N}_{2}}$ are also full column-rank in this case, guaranteeing the identifiability of the channel matrices. The second corollary corresponds to a special case of our system model where the first relay tier reduces to a single-antenna relay. In this case, satisfying N_{2}≤N_{d} and N_{2}≤K ensures that Ω_{1}, Z_{[2]}, and Z_{[3]} are all full column-rank, so that the three partial channels are identifiable. The same reasoning is valid for the third corollary, which is analogous to the second one.
3.2 Essential uniqueness
Let $\left\{{\hat{\mathbf{H}}}_{\text{sr}},{\hat{\mathbf{H}}}_{\text{rr}},{\hat{\mathbf{H}}}_{\text{rd}}\right\}$ be an alternative set of matrices yielding the same unstructured channel tensor satisfying the PARATUCK2 model (14). If H_{sr}, H_{rr}, and H_{rd} are full rank and the identifiability conditions (27) are satisfied, then ${\hat{\mathbf{H}}}_{\text{sr}}$, ${\hat{\mathbf{H}}}_{\text{rr}}$, and ${\hat{\mathbf{H}}}_{\text{rd}}$ are essentially unique. In this case, we have ${\hat{\mathbf{H}}}_{\text{sr}}={\mathit{\Delta}}_{\text{sr}}{\mathbf{H}}_{\text{sr}}$, ${\hat{\mathbf{H}}}_{\text{rd}}={\mathbf{H}}_{\text{rd}}{\mathit{\Delta}}_{\text{rd}}$ and ${\hat{\mathbf{H}}}_{\text{rr}}={\mathit{\Delta}}_{\text{rr}}^{\left(2\right)}{\mathbf{H}}_{\text{rr}}{\mathit{\Delta}}_{\text{rr}}^{\left(1\right)}$, where the following relation holds:
Note that permutation ambiguity does not exist due to the knowledge of the training matrices E and F. The relation (29) can be obtained by substituting the alternative solutions ${\hat{\mathbf{H}}}_{\text{sr}}$, ${\hat{\mathbf{H}}}_{\text{rd}}$, and ${\hat{\mathbf{H}}}_{\text{rr}}$ into (14) and then applying some basic manipulations using properties of the Kronecker product. Equation (29) turns into the following relations: ${\mathit{\Delta}}_{\text{rd}}{\mathit{\Delta}}_{\text{rr}}^{\left(2\right)}=\alpha {\mathbf{I}}_{{N}_{2}}$ and ${\mathit{\Delta}}_{\text{sr}}{\mathit{\Delta}}_{\text{rr}}^{\left(1\right)}=(1/\alpha ){\mathbf{I}}_{{N}_{1}}$, where α is an arbitrary scalar factor. These two relations come from the fact that the Kronecker product between any two diagonal matrices is equal to the identity matrix if and only if these diagonal matrices are (scaled) identity matrices that compensate each other. Consequently, H_{sr}, H_{rr}, and H_{rd} can be recovered in an essentially unique manner up to scaling factors. The scaling ambiguity can be eliminated by normalizing the first column of H_{sr} or the first row of H_{rd} to one. Since these ambiguities compensate each other, the compound channel is strictly unique and we have ${\hat{\mathbf{H}}}_{c}={\hat{\mathbf{H}}}_{\text{rd}}{\hat{\mathbf{H}}}_{\text{rr}}{\hat{\mathbf{H}}}_{\text{sr}}={\mathbf{H}}_{\text{rd}}{\mathbf{H}}_{\text{rr}}{\mathbf{H}}_{\text{sr}}={\mathbf{H}}_{c}$.
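The compensation of the scaling ambiguities can be illustrated numerically. The sketch below draws arbitrary diagonal ambiguity matrices that satisfy the two relations above, builds the alternative channel set, and confirms that both the PARATUCK2 slices and the compound channel are unchanged. Sizes, seeds, and the value of α are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(4)

def crandn(*shape):
    """Complex Gaussian samples (illustrative helper)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

Ns, N1, N2, Nd, K = 2, 3, 3, 2, 6
H_sr, H_rr, H_rd = crandn(N1, Ns), crandn(N2, N1), crandn(Nd, N2)
E = rng.uniform(-1, 1, (K, N1))
F = rng.uniform(-1, 1, (K, N2))

def slices(H_rd, H_rr, H_sr):
    """Tensor with frontal slices H_rd D_k(F) H_rr D_k(E) H_sr."""
    return np.einsum("ab,kb,bc,kc,cd->adk", H_rd, F, H_rr, E, H_sr)

alpha = 0.7 - 0.2j                         # arbitrary nonzero scalar
d_rd, d_sr = crandn(N2), crandn(N1)        # arbitrary nonzero diagonals
Delta_rd, Delta_rr2 = np.diag(d_rd), np.diag(alpha / d_rd)          # product = alpha * I
Delta_sr, Delta_rr1 = np.diag(d_sr), np.diag(1.0 / (alpha * d_sr))  # product = (1/alpha) * I

# Alternative solution affected by the diagonal ambiguities.
H_sr_alt = Delta_sr @ H_sr
H_rd_alt = H_rd @ Delta_rd
H_rr_alt = Delta_rr2 @ H_rr @ Delta_rr1

same_tensor = np.allclose(slices(H_rd_alt, H_rr_alt, H_sr_alt), slices(H_rd, H_rr, H_sr))
same_compound = np.allclose(H_rd_alt @ H_rr_alt @ H_sr_alt, H_rd @ H_rr @ H_sr)
```

Both checks succeed even though each individual factor differs from the true one, because diagonal matrices commute with D_{k}(E) and D_{k}(F).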
3.3 Trilinear alternating least squares algorithm
The TALS algorithm is an iterative estimation method that alternates among the LS estimations of the channel matrices H_{sr}, H_{rr}, and H_{rd} by fitting a PARATUCK2 model to the noisy matrices ${\stackrel{~}{\mathbf{H}}}_{[i]}={\mathbf{H}}_{[i]}+{\mathbf{V}}_{[i]}$, i=1,2,3. Note that the noise term V_{[i]} is constructed in a way analogous to H_{[i]}, i=1,2,3, following Equations (15) and (21), respectively. The AF training matrices E and F are assumed to be known at the destination and are fixed during the estimation process. From (19), (24), and (25), we respectively obtain the following linear optimization problems:
These LS estimation problems can be solved alternately by estimating one channel matrix at a time, while fixing the other matrices to their values obtained in previous estimation steps. Therefore, each iteration of the algorithm has three estimation steps. The algorithm starts by randomly initializing two out of the three channel matrices and proceeds until convergence. In the following, a summary of the TALS algorithm is provided.
Define $\mathbf{e}\left(n\right)=\text{vec}\left({\stackrel{~}{\mathbf{H}}}_{\left[1\right]}\right)-\left[{\left({\mathbf{E}}^{\mathrm{T}}\odot {\mathbf{F}}^{\mathrm{T}}\right)}^{\mathrm{T}}\odot \left({\hat{\mathbf{H}}}_{\text{sr}}^{\mathrm{T}}\left(n\right)\otimes {\hat{\mathbf{H}}}_{\text{rd}}\left(n\right)\right)\right]{\hat{\mathbf{h}}}_{\text{rr}}\left(n\right)$. The sum of squared residuals (SSR) at the end of the n th iteration is defined as SSR(n)=e^{H}(n)e(n). We declare the convergence of the algorithm when |SSR(n)−SSR(n−1)|≤10^{−6}, meaning that the model reconstruction error does not significantly change between two successive iterations.
Generally, the ALS algorithm is sensitive to the initialization, and convergence to the global minimum can be slow when all the matrix factors of the model are unknown [25]. However, in our case, we have observed that convergence to the global minimum is always achieved (e.g., within 10 to 30 iterations for medium-to-high SNRs) due to the knowledge of the AF training matrices E and F.
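The alternating procedure described above can be sketched in NumPy as follows. This is a minimal illustration and not the authors' implementation: it assumes the per-block model H_k = H_rd D_k(F) H_rr D_k(E) H_sr, derives the three LS updates from that assumption, and uses illustrative dimensions. The helper names `khatri_rao`, `paratuck2_blocks`, and `tals` are hypothetical.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x R) and B (J x R)."""
    return np.einsum('ir,jr->ijr', A, B).reshape(A.shape[0] * B.shape[0], -1)

def paratuck2_blocks(H_sr, H_rr, H_rd, E, F):
    """Stack of K blocks H_k = H_rd D_k(F) H_rr D_k(E) H_sr (assumed model)."""
    return np.stack([H_rd @ (F[k][:, None] * H_rr * E[k][None, :]) @ H_sr
                     for k in range(E.shape[0])])

def tals(Ht, E, F, N1, N2, max_iter=500, tol=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    K, Nd, Ns = Ht.shape
    crandn = lambda *s: rng.standard_normal(s) + 1j * rng.standard_normal(s)
    H_sr, H_rd = crandn(N1, Ns), crandn(Nd, N2)  # random init of two factors
    Z = khatri_rao(E.T, F.T)                     # N1*N2 x K
    y = np.concatenate([Hk.reshape(-1, order='F') for Hk in Ht])  # vec of unfolding
    Yv = Ht.reshape(K * Nd, Ns)                     # blocks stacked vertically
    Yh = Ht.transpose(1, 0, 2).reshape(Nd, K * Ns)  # blocks stacked horizontally
    ssr = []
    for _ in range(max_iter):
        # Step 1: LS update of h_rr = vec(H_rr), with H_sr and H_rd fixed.
        A = khatri_rao(Z.T, np.kron(H_sr.T, H_rd))
        H_rr = np.linalg.lstsq(A, y, rcond=None)[0].reshape(N2, N1, order='F')
        # Step 2: LS update of H_sr, with H_rr and H_rd fixed.
        W = np.vstack([H_rd @ (F[k][:, None] * H_rr * E[k][None, :]) for k in range(K)])
        H_sr = np.linalg.lstsq(W, Yv, rcond=None)[0]
        # Step 3: LS update of H_rd, with H_sr and H_rr fixed.
        G = np.hstack([(F[k][:, None] * H_rr * E[k][None, :]) @ H_sr for k in range(K)])
        H_rd = np.linalg.lstsq(G.T, Yh.T, rcond=None)[0].T
        # Sum of squared residuals and stopping rule |SSR(n) - SSR(n-1)| <= tol.
        ssr.append(np.linalg.norm(Ht - paratuck2_blocks(H_sr, H_rr, H_rd, E, F)) ** 2)
        if len(ssr) > 1 and abs(ssr[-1] - ssr[-2]) <= tol:
            break
    return H_sr, H_rr, H_rd, ssr

# Illustrative usage on synthetic data (all sizes and noise level are assumptions).
rng = np.random.default_rng(1)
Ns, N1, N2, Nd, K = 2, 2, 2, 3, 8
c = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)
H_sr, H_rr, H_rd = c(N1, Ns), c(N2, N1), c(Nd, N2)
E, F = c(K, N1), c(K, N2)
Ht = paratuck2_blocks(H_sr, H_rr, H_rd, E, F) + 0.01 * c(K, Nd, Ns)
H_sr_hat, H_rr_hat, H_rd_hat, ssr = tals(Ht, E, F, N1, N2)
print(len(ssr), ssr[-1])
```

Since each step solves an exact LS problem for the same cost function, the SSR cannot increase from one iteration to the next; whether the iterations reach the global minimum depends on the data and the initialization.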
3.4 Kronecker least squares algorithm
We now derive a closed-form solution to our channel estimation problem by exploiting the mixed Kronecker/Khatri-Rao factorization structure of the matrix unfolding H_{[1]} defined in (17). Starting from (13), the noisy version of (15) is given by:
where ${\mathbf{V}}_{\left[1\right]}=\left[\text{vec}({\mathbf{V}}_{1}),\dots ,\text{vec}({\mathbf{V}}_{K})\right]\in {\mathbb{C}}^{{N}_{d}{N}_{s}\times K}$. Let $\mathbf{Z}={\mathbf{\text{E}}}^{\mathrm{T}}\odot {\mathbf{\text{F}}}^{\mathrm{T}}\in {\mathbb{C}}^{{N}_{1}{N}_{2}\times K}$ denote the combined AF training matrix and assume that $\mathbf{Z}{\mathbf{Z}}^{H}={\mathbf{I}}_{{N}_{1}{N}_{2}}$. Multiplying both sides of (33) from the right by Z^{H}, we have:
where ${\hat{\mathbf{X}}}_{\left[1\right]}={\mathbf{X}}_{\left[1\right]}+\left({\mathbf{S}}_{0}^{H}\otimes {\mathbf{I}}_{{N}_{d}}\right){\mathbf{V}}_{\left[1\right]}{\mathbf{Z}}^{H}$. From (17), we have:
Our goal is to directly identify the channel matrices from (35). However, let us first address the deterministic design of the AF training matrices E and F such that $\mathbf{Z}{\mathbf{Z}}^{H}\doteq \left({\mathbf{\text{E}}}^{\mathrm{T}}\odot {\mathbf{\text{F}}}^{\mathrm{T}}\right){\left({\mathbf{\text{E}}}^{\mathrm{T}}\odot {\mathbf{\text{F}}}^{\mathrm{T}}\right)}^{H}={\mathbf{I}}_{{N}_{1}{N}_{2}}$. Assuming K≥N_{1}N_{2}, this condition is satisfied by designing Z, for instance, as a discrete Fourier transform (DFT) matrix. Having fixed the structure of Z, we are left with the problem of factorizing this matrix as the Khatri-Rao product between E^{T} and F^{T}. This problem can easily be solved by means of K rank-one matrix factorizations, which admit unique solutions. Note that the kth column of Z can be written as
Defining a rank-one matrix ${\tilde{\mathbf{Z}}}_{k}\doteq \text{unvec}\left(\mathbf{Z}(:,k)\right)\in {\mathbb{C}}^{{N}_{2}\times {N}_{1}}$, it follows that
from which E(k,:) and F(k,:) can be determined as the unique right and left singular vectors of ${\tilde{\mathbf{Z}}}_{k}$, k=1,…,K. Note that the proposed design, although not optimized to minimize the mean square error of the channel estimation, ensures that the noise characteristics in (33) will not be changed when ${\hat{\mathbf{H}}}_{\left[1\right]}$ is post-multiplied by Z^{H} (i.e., inverse DFT transformation).
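The training design described above can be illustrated with a short NumPy sketch. It assumes that Z is taken as the first N_1N_2 rows of the normalized K-point DFT matrix and that unvec(·) uses column-major ordering, so that the kth reshaped column equals F(k,:)^T E(k,:). The sizes are illustrative, and splitting the singular value into F(k,:) is one possible convention, since the two factors are only defined up to a reciprocal scaling.

```python
import numpy as np

N1, N2 = 2, 3   # illustrative numbers of tier 1 and tier 2 relay antennas
K = 8           # number of training blocks, K >= N1*N2

# First N1*N2 rows of the normalized K-point DFT matrix, so that Z Z^H = I.
m = np.arange(N1 * N2)[:, None]
k = np.arange(K)[None, :]
Z = np.exp(-2j * np.pi * m * k / K) / np.sqrt(K)

E = np.zeros((K, N1), dtype=complex)
F = np.zeros((K, N2), dtype=complex)
second_sv = np.zeros(K)
for col in range(K):
    # unvec: reshape the column into an N2 x N1 matrix (column-major order).
    Zk = Z[:, col].reshape(N2, N1, order='F')
    U, s, Vh = np.linalg.svd(Zk)
    second_sv[col] = s[1]        # numerically zero when Zk has rank one
    F[col] = s[0] * U[:, 0]      # left singular vector (scaled) gives F(k,:)
    E[col] = Vh[0]               # right singular vector gives E(k,:)

# Rebuild Z column by column as E(k,:)^T kron F(k,:)^T (Khatri-Rao structure).
Z_rebuilt = np.stack([np.kron(E[col], F[col]) for col in range(K)], axis=1)

print(np.allclose(Z_rebuilt, Z), np.allclose(Z @ Z.conj().T, np.eye(N1 * N2)))
```

Each reshaped DFT column has rank one because its entries factor as a product of one term depending only on the tier 1 index and one depending only on the tier 2 index.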
Coming back to the channel estimation problem, from (35), let us define ${\mathbf{x}}_{{n}_{1},{n}_{2}}\in {\mathbb{C}}^{{N}_{s}{N}_{d}\times 1}$ as the [(n_{1}-1)N_{2}+n_{2}]th column of ${\mathbf{X}}_{\left[1\right]}\in {\mathbb{C}}^{{N}_{s}{N}_{d}\times {N}_{1}{N}_{2}}$, n_{1}=1,…,N_{1}, n_{2}=1,…,N_{2}. Note that
Defining ${\tilde{\mathbf{X}}}_{{n}_{1},{n}_{2}}\doteq \text{unvec}\left({\mathbf{x}}_{{n}_{1},{n}_{2}}\right)\in {\mathbb{C}}^{{N}_{d}\times {N}_{s}}$ as a rank-one matrix obtained by reshaping, we have
Consider the singular value decomposition (SVD) of ${\tilde{\mathbf{X}}}_{{n}_{1},{n}_{2}}$:
From the rank-one property of ${\tilde{\mathbf{X}}}_{{n}_{1},{n}_{2}}$, we have:
Final estimates of H_{rd}(:,n_{2}) and H_{sr}(:,n_{1}) can be obtained by averaging over the N_{1} and N_{2} independent estimates, respectively:
with
Note that the columns of the estimated ${\hat{\mathbf{H}}}_{\text{sr}}$ and ${\hat{\mathbf{H}}}_{\text{rd}}$ have unit energy while each entry of ${\hat{\mathbf{H}}}_{\text{rr}}$ concentrates all the energy of the wireless link connecting the source node to the destination node via a given tier 1-tier 2 relay pair. Such an interpretation is useful for designing transmit and receive spatial filters for system optimization as well as for power allocation purposes.
Discussion: The KRLS algorithm involves the computation of N_{1}N_{2} SVDs to provide rank-one approximations for the matrices ${\hat{\mathbf{X}}}_{1,1},\dots ,{\hat{\mathbf{X}}}_{{N}_{1},{N}_{2}}$, of dimensions N_{d}×N_{s}, which are constructed from the N_{1}N_{2} columns of ${\hat{\mathbf{X}}}_{\left[1\right]}$. The distinguishing feature of the KRLS-based estimator is its closed-form solution to the problem, as opposed to the TALS algorithm, which consists of iterative LS estimation steps and therefore has a higher computational complexity. However, note that the KRLS algorithm is only applicable under the condition K≥N_{1}N_{2}, which is necessary for Z=E^{T}⊙F^{T} to have orthogonal rows, leading to (35). In contrast, the TALS algorithm can operate under a much lower bound on K, as discussed in Section 3.1. There is clearly a trade-off between the two estimators in terms of identifiability conditions and computational complexity. As will be shown in the next section, both estimators provide satisfactory performance, and the choice of the best estimator depends on the design constraints of the system. For instance, the TALS estimator is preferable if processing power at the receiver is not too limited, as is often the case with base station reception in outdoor microcells or macrocells. The KRLS solution is more likely to be chosen in indoor scenarios, where the channel coherence time is long enough to allow for higher values of K.
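The closed-form procedure can be sketched as follows. This is a hedged illustration rather than the paper's exact estimator: it assumes X_[1] = (H_sr^T ⊗ H_rd) diag(vec(H_rr)) with column-major vec(·), and it resolves the scaling of each rank-one factor by forcing the first row of the H_rd estimate and the first column of the H_sr estimate to one, instead of the unit-energy normalization mentioned in the text. The function name `krls` and all sizes are hypothetical.

```python
import numpy as np

def krls(X, Ns, N1, N2, Nd):
    """Estimate (H_sr, H_rr, H_rd) from X = (H_sr^T kron H_rd) diag(vec(H_rr))."""
    H_sr = np.zeros((N1, Ns), dtype=complex)
    H_rd = np.zeros((Nd, N2), dtype=complex)
    H_rr = np.zeros((N2, N1), dtype=complex)
    blocks = {}
    for n1 in range(N1):
        for n2 in range(N2):
            # unvec of column n1*N2 + n2 (zero-based): an Nd x Ns matrix that is
            # rank one in the noiseless case.
            Xn = X[:, n1 * N2 + n2].reshape(Nd, Ns, order='F')
            U, s, Vh = np.linalg.svd(Xn)
            # Dominant singular vectors, normalized by their first entry so that
            # the N1 (resp. N2) estimates can be averaged coherently.
            H_rd[:, n2] += U[:, 0] / U[0, 0] / N1
            H_sr[n1, :] += Vh[0] / Vh[0, 0] / N2
            blocks[n1, n2] = Xn
    for (n1, n2), Xn in blocks.items():
        b, a = H_rd[:, n2], H_sr[n1, :]
        # Scalar LS fit of Xn ~ h * b a^T gives the (n2, n1) entry of H_rr.
        H_rr[n2, n1] = (b.conj() @ Xn @ a.conj()) / (
            np.linalg.norm(b) ** 2 * np.linalg.norm(a) ** 2)
    return H_sr, H_rr, H_rd

# Noiseless usage example with illustrative sizes.
rng = np.random.default_rng(0)
Ns, N1, N2, Nd = 2, 3, 3, 4
c = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)
H_sr, H_rr, H_rd = c(N1, Ns), c(N2, N1), c(Nd, N2)
X = np.kron(H_sr.T, H_rd) * H_rr.reshape(-1, order='F')[None, :]
H_sr_hat, H_rr_hat, H_rd_hat = krls(X, Ns, N1, N2, Nd)
print(np.allclose(H_rd_hat @ H_rr_hat @ H_sr_hat, H_rd @ H_rr @ H_sr))
```

With noisy data, X would be replaced by the filtered observation obtained after post-multiplication by Z^{H}, and the SVD then acts as a rank-one denoising step for each relay pair.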
4 Extension to two-way MIMO relaying systems
In the previous sections, we have focused on a multirelay cooperative scheme, where transmission is directed in one direction, i.e., from a specific source to a specific destination via two tiers of multiple relays. In this section, we show that the same modeling approach can be extended to a two-way MIMO relaying scenario, where pilot/data transmission takes place in both directions. In the first phase, two sources simultaneously transmit their data to the multiple relays. Note that, in the two-way case, the relays of each tier receive a superposition of ${N}_{{s}_{1}}+{N}_{{s}_{2}}$ signals coming from sources 1 and 2. In the second and third phases, inter-relay communication takes place. More specifically, in phase two, tier 1 relays transmit signals towards tier 2 relays, while tier 2 relays stay silent. In phase three, the opposite happens. Finally, in the fourth communication phase, all the relays transmit to the two sources, and each one of them receives a superposition of N_{1}+N_{2} signals.
In the first transmission phase, we assume that training symbol matrices ${\mathbf{S}}_{1}\in {\mathbb{C}}^{{N}_{{s}_{1}}\times L}$ and ${\mathbf{S}}_{2}\in {\mathbb{C}}^{{N}_{{s}_{2}}\times L}$ are transmitted from sources 1 and 2, respectively. We omit the additive noise terms for convenience of presentation. The signal received at the ith relay tier is given by:
where ${\mathbf{H}}^{\left(i\right)}\doteq \left[{\mathbf{H}}_{{\mathrm{s}}_{1}{\mathrm{r}}_{i}}\;{\mathbf{H}}_{{\mathrm{s}}_{2}{\mathrm{r}}_{i}}\right]\in {\mathbb{C}}^{{N}_{i}\times ({N}_{{s}_{1}}+{N}_{{s}_{2}})}$, and $\mathbf{S}\doteq {\left[{\mathbf{S}}_{1}^{T}\;{\mathbf{S}}_{2}^{T}\right]}^{T}\in {\mathbb{C}}^{({N}_{{s}_{1}}+{N}_{{s}_{2}})\times L}$. The training sequence S_{i} chosen by source i is designed to satisfy the following conditions:

(i) ${\mathbf{S}}_{i}{\mathbf{S}}_{i}^{H}={\mathbf{I}}_{{N}_{{s}_{i}}}$, i=1,2,
(ii) ${\mathbf{S}}_{1}{\mathbf{S}}_{2}^{H}={\mathbf{0}}_{{N}_{{s}_{1}}\times {N}_{{s}_{2}}}$.
A possible construction satisfying these two conditions is based on the normalized DFT matrix of size $L\times ({N}_{{s}_{1}}+{N}_{{s}_{2}})$, with $L\ge {N}_{{s}_{1}}+{N}_{{s}_{2}}$. This design allows the sources to eliminate the self-interference generated by their own transmission, when receiving the signal back from the relays.
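As a quick check of this construction, the sketch below builds S_1 and S_2 from the columns of a normalized L-point DFT matrix (used here in transposed form, so that each training matrix has one row per source antenna) and verifies conditions (i) and (ii). The sizes and the stand-in matrices `A` and `B`, which represent arbitrary effective channels, are illustrative assumptions.

```python
import numpy as np

Ns1, Ns2, L = 2, 3, 8   # illustrative sizes with L >= Ns1 + Ns2

# Rows of S are the first Ns1 + Ns2 columns of the normalized L-point DFT matrix.
n = np.arange(Ns1 + Ns2)[:, None]
l = np.arange(L)[None, :]
S = np.exp(-2j * np.pi * n * l / L) / np.sqrt(L)
S1, S2 = S[:Ns1], S[Ns1:]

# Condition (i): orthonormal rows within each source's training matrix.
cond_i = (np.allclose(S1 @ S1.conj().T, np.eye(Ns1))
          and np.allclose(S2 @ S2.conj().T, np.eye(Ns2)))
# Condition (ii): mutual orthogonality between the two training matrices.
cond_ii = np.allclose(S1 @ S2.conj().T, np.zeros((Ns1, Ns2)))

# Self-interference elimination: post-multiplying by S2^H removes the S1 term.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, Ns1)) + 1j * rng.standard_normal((4, Ns1))
B = rng.standard_normal((4, Ns2)) + 1j * rng.standard_normal((4, Ns2))
R = A @ S1 + B @ S2
B_recovered = R @ S2.conj().T

print(cond_i, cond_ii, np.allclose(B_recovered, B))
```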
In the second and third phases, where inter-relay communications happen, the signal received at the relays of tier i from the relays of tier j, (i,j)∈{(1,2),(2,1)}, can be written as:
k=1,…,K, where ${\mathbf{H}}_{{\mathrm{r}}_{j}{\mathrm{r}}_{i}}\in {\mathbb{C}}^{{N}_{i}\times {N}_{j}}$ is the MIMO channel linking the relays of tier j at transmission to the relays of tier i at reception, (i,j)∈{(1,2),(2,1)}. Note that channel reciprocity in the inter-relay communications is not a necessary assumption, which means that we may have ${\mathbf{H}}_{{\mathrm{r}}_{1}{\mathrm{r}}_{2}}\ne {\mathbf{H}}_{{\mathrm{r}}_{2}{\mathrm{r}}_{1}}$.
Finally, in the fourth transmission phase, the signals received at sources 1 and 2 are post-multiplied by ${\mathbf{S}}_{2}^{H}$ and ${\mathbf{S}}_{1}^{H}$, respectively, to accomplish self-interference elimination, yielding
and
where
Therefore, we can conclude that the signals received at sources 1 and 2 in the considered two-way MIMO relaying scenario (Equations (50) and (51)) follow a PARATUCK2 model. By analogy with the noiseless part of the one-way signal model (14), we have the following correspondences between the factor matrices:
Consequently, the tensor-based channel estimation algorithms proposed in the previous section can be equally applied at each source to estimate the channels ${\bar{\mathbf{H}}}^{(i,1)}$, ${\bar{\mathbf{H}}}^{(i,2)}$ and ${\mathbf{G}}_{\mathrm{r}\mathrm{r}}^{\left(i\right)}$, i=1,2, from Equations (50) and (51), respectively. If reciprocity is assumed in the two-way relay channels, we have:
which in turn implies ${\bar{\mathbf{H}}}^{(1,1)}={\left({\bar{\mathbf{H}}}^{(2,2)}\right)}^{T}={\mathbf{H}}_{{s}_{1}}$, ${\bar{\mathbf{H}}}^{(1,2)}={\left({\bar{\mathbf{H}}}^{(2,1)}\right)}^{T}={\mathbf{H}}_{{s}_{2}}$, and ${\mathbf{G}}_{\mathrm{r}\mathrm{r}}^{\left(1\right)}={\left({\mathbf{G}}_{\mathrm{r}\mathrm{r}}^{\left(2\right)}\right)}^{T}=\mathbf{G}$. In this particular case, the PARATUCK2 models (50) and (51) become essentially equal, i.e., they depend on the same unknown channel matrices ${\mathbf{H}}_{{s}_{1}}$, ${\mathbf{H}}_{{s}_{2}}$, and G to be estimated. Note, however, that such a reciprocity is not a necessary assumption of our modeling approach, which can be used in the general case of non-symmetrical two-way MIMO relay channels.
5 Numerical results
We now present computer simulation results for assessing the performance of the proposed channel estimators in selected system configurations. The estimators' performance is evaluated in terms of the normalized mean square error (NMSE) of the estimated channel matrices. From the estimated channels, the performance in terms of bit error rate (BER) is calculated by assuming a linear receive filter. The BER and NMSE curves are plotted as a function of the overall signal-to-noise ratio (SNR) at the destination. This SNR is given by the ratio between the powers of the useful signal component and the noise component in Equation (11). For each simulated SNR value, the results represent an average over L=5,000 Monte Carlo runs. At each run, the channel coefficients are drawn from a circularly symmetric complex-valued Gaussian distribution with zero mean and unit variance, while the transmitted symbols are drawn from a BPSK sequence. The SNR levels at the tier 1 and tier 2 relays are assumed to be 30 dB above the SNR level at the destination.
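The simulation ingredients described above can be sketched as follows. The NMSE is taken here as the squared Frobenius norm of the estimation error divided by that of the true channel, which is a common definition assumed for this example. The perturbed channel stands in for the output of an actual estimator, and the helper names, matrix sizes, and number of runs are illustrative.

```python
import numpy as np

def nmse(H_hat, H):
    """Normalized squared error ||H_hat - H||_F^2 / ||H||_F^2 (assumed definition)."""
    return np.linalg.norm(H_hat - H) ** 2 / np.linalg.norm(H) ** 2

def draw_channel(rng, rows, cols):
    """Circularly symmetric complex Gaussian entries, zero mean, unit variance."""
    return (rng.standard_normal((rows, cols))
            + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2)

def draw_bpsk(rng, rows, cols):
    """Equiprobable BPSK symbols in {-1, +1}."""
    return 2.0 * rng.integers(0, 2, size=(rows, cols)) - 1.0

rng = np.random.default_rng(0)
runs = 200  # far fewer than the 5,000 runs used in the paper, to keep the demo short
values = []
for _ in range(runs):
    H = draw_channel(rng, 6, 2)
    H_hat = H + 0.1 * draw_channel(rng, 6, 2)  # stand-in for an estimator output
    values.append(nmse(H_hat, H))

avg_nmse_db = 10 * np.log10(np.mean(values))
symbols = draw_bpsk(rng, 2, 100)
print(round(float(avg_nmse_db), 1))
```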
For purposes of performance evaluation, the scaling ambiguities affecting the estimates of the channel matrices are removed by assuming that all elements of the first column of H_{sr} and of the first row of H_{rd} are equal to one, similarly to [11, 14]. These scaling ambiguities can be determined as follows. First, we find ${\mathit{\Delta}}_{\text{sr}}={D}_{1}\left({\hat{\mathbf{H}}}_{\text{sr}}^{\mathrm{T}}\right)$ and ${\mathit{\Delta}}_{\text{rd}}={D}_{1}\left({\hat{\mathbf{H}}}_{\text{rd}}\right)$. Then, applying the property (AB)⊗(CD)=(A⊗C)(B⊗D) yields $\left({\mathit{\Delta}}_{\text{sr}}\otimes {\mathit{\Delta}}_{\text{rd}}\right)\left({\mathit{\Delta}}_{\text{rr}}^{\left(1\right)}\otimes {\mathit{\Delta}}_{\text{rr}}^{\left(2\right)}\right)={\mathbf{I}}_{{N}_{1}{N}_{2}}$, from which we obtain ${\mathit{\Delta}}_{\text{rr}}^{\left(1\right)}\otimes {\mathit{\Delta}}_{\text{rr}}^{\left(2\right)}={\mathit{\Delta}}_{\text{sr}}^{-1}\otimes {\mathit{\Delta}}_{\text{rd}}^{-1}$. A solution to this relation is then found as ${\mathit{\Delta}}_{\text{rr}}^{\left(1\right)}={\left[{D}_{1}\left({\hat{\mathbf{H}}}_{\text{sr}}^{\mathrm{T}}\right)\right]}^{-1}$ and ${\mathit{\Delta}}_{\text{rr}}^{\left(2\right)}={\left[{D}_{1}\left({\hat{\mathbf{H}}}_{\text{rd}}\right)\right]}^{-1}$.
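The ambiguity removal step can be illustrated with the NumPy sketch below. It assumes that the estimates are related to the true channels by diagonal scalings of the form Ĥ_sr = Δ_sr H_sr, Ĥ_rd = H_rd Δ_rd, and Ĥ_rr = Δ_rd^{-1} H_rr Δ_sr^{-1}, and that the true H_sr and H_rd have ones in their first column and first row, respectively. The dimensions and random scalings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
Ns, N1, N2, Nd = 2, 4, 4, 6
c = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)

# True channels with the normalization assumed for performance evaluation.
H_sr, H_rr, H_rd = c(N1, Ns), c(N2, N1), c(Nd, N2)
H_sr[:, 0] = 1.0   # first column of H_sr set to ones
H_rd[0, :] = 1.0   # first row of H_rd set to ones

# Estimates affected by unknown diagonal scalings (hypothetical values).
d_sr, d_rd = c(N1), c(N2)
H_sr_hat = d_sr[:, None] * H_sr
H_rd_hat = H_rd * d_rd[None, :]
H_rr_hat = H_rr / d_rd[:, None] / d_sr[None, :]

# D_1(H_sr_hat^T): diagonal built from the first row of H_sr_hat^T, i.e. the
# first column of H_sr_hat. D_1(H_rd_hat): first row of H_rd_hat.
delta_sr = H_sr_hat[:, 0]
delta_rd = H_rd_hat[0, :]

# Remove the ambiguities from the three estimates.
H_sr_fix = H_sr_hat / delta_sr[:, None]
H_rd_fix = H_rd_hat / delta_rd[None, :]
H_rr_fix = delta_rd[:, None] * H_rr_hat * delta_sr[None, :]

print(np.allclose(H_sr_fix, H_sr), np.allclose(H_rd_fix, H_rd),
      np.allclose(H_rr_fix, H_rr))
```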
In Figure 2, we depict the NMSE performance for the compound channel of our proposed estimators in comparison with the conventional LS estimator. The parameters are N_{s}=2, N_{1}=4, N_{2}=4, N_{d}=6, K=16, L_{0}=30, and the number of transmitted data symbols is N=1000. We can see that TALS and KRLS have similar performances, which are considerably better than that of the conventional (three-stage) LS estimator. The worse performance of the LS estimator comes from the error accumulation across successive channel estimation stages, which degrades its overall NMSE performance.
Figure 3 shows the NMSE performance of our proposed estimators in comparison with the two-hop bilinear alternating least squares (BALS) estimator of Rong et al. [14]. This estimator is a special case of the proposed one, where only one tier of relays is used. In this case, model (11) reduces to
and the channel matrices H_{sr} and H_{rd} are estimated by means of a BALS algorithm. The parameter setting is the same as that of Figure 2. It can be seen that the proposed estimator operates satisfactorily, being able to effectively estimate the three channel matrices. Figure 3 also indicates that the proposed estimator performs close to the BALS estimator operating in a two-hop system. A small performance degradation is observed, which is due to the presence of an additional AF transmission phase in our three-hop system, resulting in a higher overall noise contribution at the destination. Note also that the TALS estimator involves three estimation steps while the BALS one has two estimation steps only.
Figure 4 shows the BER performance of a linear zero-forcing (ZF) receiver designed from the estimated channel matrices, which are obtained from the TALS, KRLS, or the conventional LS estimators. The ZF receiver operates on the data block collected in the received data matrix $\mathbf{Y}\in {\mathbb{C}}^{K{N}_{d}\times N}$. The length of the data block is N=100 symbols, and the remaining system parameters are the same as those of the previous experiment. The ZF filter output is given by:
This figure shows similar BER performances for TALS and KRLS, which are better than that of the conventional LS algorithm. This result corroborates the effectiveness of our channel estimators when used with a linear receiver for symbol detection. In Figure 5, we evaluate the impact of the number of relay antennas on the BER performance of a linear ZF detector using the proposed TALS channel estimator. The fixed system parameters are N_{s}=2, N_{d}=6, L_{0}=30, and K=10. It can be seen that the BER performance is considerably improved as the number of relay antennas is increased, corroborating the expected gains of cooperative diversity. Although not plotted in this figure, the BER curves of the KRLS estimator are similar to those obtained with the TALS one.
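The ZF detection step used in these BER experiments can be sketched as follows. The example assumes that the same BPSK data block is observed through the K per-block channels H_k = H_rd D_k(F) H_rr D_k(E) H_sr, stacked into an effective channel of size KN_d×N_s, and that the ZF filter is the pseudo-inverse of this effective channel. This data model, the use of the true channels in place of estimated ones, and the noise level are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
Ns, N1, N2, Nd, K, N = 2, 4, 4, 6, 4, 100  # illustrative sizes

def crandn(*shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H_sr, H_rr, H_rd = crandn(N1, Ns), crandn(N2, N1), crandn(Nd, N2)
E, F = crandn(K, N1), crandn(K, N2)

# Effective channel: the K per-block channels stacked vertically (K*Nd x Ns).
H_eff = np.vstack([H_rd @ np.diag(F[k]) @ H_rr @ np.diag(E[k]) @ H_sr
                   for k in range(K)])

# BPSK data block and noisy received data matrix Y (K*Nd x N).
S = 2.0 * rng.integers(0, 2, size=(Ns, N)) - 1.0
Y = H_eff @ S + 0.01 * crandn(K * Nd, N)

# ZF filter (pseudo-inverse of the effective channel) and hard BPSK decisions.
W_zf = np.linalg.pinv(H_eff)
S_hat = np.sign((W_zf @ Y).real)

ber = np.mean(S_hat != S)
print(ber)
```

In an actual experiment, `H_eff` would be rebuilt from the TALS, KRLS, or LS channel estimates, so that the BER reflects the quality of the estimated channels.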
Figure 6 depicts the performance of the ZF receiver designed from the perfect CSI for all channel matrices. Two parameter settings are considered, where N_{d}=2 and 4, respectively. The other system parameters are fixed to N_{s}=2, N_{1}=N_{2}=3, L_{0}=6, and K=9. First, it can be seen that the BER performance is considerably improved as the number of antennas at the destination is increased, owing to the higher spatial diversity, as expected. From these results, we also find that the TALS and KRLS estimators provide similar results and, more interestingly, their performances are close to that of the perfect CSI case. For instance, for a target BER of 10^{-1}, the SNR gap with respect to the perfect CSI case is less than 2 dB.
6 Conclusions
We have proposed channel estimation algorithms for MIMO AF multirelay systems. The proposed estimators are designed to provide the destination (base station) with the instantaneous CSI of all the channels involved in the communication. In contrast to conventional pilot-assisted channel estimation, the proposed algorithms make a more efficient use of cooperative diversity by providing a joint estimation of all the channel matrices thanks to the use of a tensor modeling of the end-to-end system. Such a joint estimation can be accomplished either iteratively (using TALS) or in closed form (using KRLS). Our numerical results corroborate the effectiveness of the proposed algorithms. The TALS estimator has a higher computational complexity than the KRLS one due to its iterative nature. On the other hand, the minimum condition for operation of KRLS (K≥N_{1}N_{2}) is more restrictive than the identifiability conditions of TALS, which implies more training (i.e., a higher number of time blocks) to carry out the joint channel estimation. Both algorithms are suitable for the joint channel estimation problem, and a particular choice is mostly dictated by practical system requirements. We have also provided an extension of the proposed approach to two-way MIMO multirelay systems and verified that such an extension results in the same tensor model as the one-way scenario. Consequently, the proposed algorithms can be applied to both one-way and two-way multirelay MIMO schemes.
Endnote
^{a} Since our focus is on the relay channel, direct links are not considered for simplicity. However, the idea proposed in this work can be easily extended to include direct links.
References
 1.
Sendonaris A, Erkip E, Aazhang B: User cooperation diversity - part I: system description. IEEE Trans. Commun 2003, 51(11):1927–1938. 10.1109/TCOMM.2003.818096
 2.
Sendonaris A, Erkip E, Aazhang B: User cooperation diversity - part II: implementation aspects and performance analysis. IEEE Trans. Commun 2003, 51(11):1939–1948. 10.1109/TCOMM.2003.819238
 3.
Laneman JN, Tse DNC, Wornell GW: Cooperative diversity in wireless networks: efficient protocols and outage behavior. IEEE Trans. Inform. Theor 2004, 50(12):3062–3080. 10.1109/TIT.2004.838089
 4.
Cao L, Zhang J, Kanno N: Multiuser cooperative communications with relay-coding for uplink IMT-advanced 4G systems. In Proc. IEEE GLOBECOM’09. Honolulu, HI; November 2009:1–6.
 5.
Liu KJR, Sadek AK, Su W, Kwasinski A: Cooperative Communications and Networking. Cambridge University Press, New York, USA; 2009.
 6.
Pabst R, Walke BH, Schultz DC, Herhold P, Yanikomeroglu H, Mukherjee S, Viswanathan H, Lott M, Zirwas W, Dohler M, Aghvami H, Falconer DD, Fettweis GP: Relay-based deployment concepts for wireless and mobile broadband radio. IEEE Comm. Mag 2004, 42(9):80–89. 10.1109/MCOM.2004.1336724
 7.
Dohler M, Li Y: Cooperative Communications: Hardware, Channel and PHY. John Wiley & Sons, West Sussex, United Kingdom; 2010.
 8.
Rong Y, Tang X, Hua Y: A unified framework for optimizing linear non-regenerative multicarrier MIMO relay communication systems. IEEE Trans. Signal Process 2009, 57(12):4837–4851.
 9.
Toding A, Khandaker MRA, Rong Y: Joint source and relay optimization for parallel MIMO relay networks. EURASIP J. Adv. Signal Process 2012, 174: 17.
 10.
Kong T, Hua Y: Optimal design of source and relay pilots for MIMO relay channel estimation. IEEE Trans. Signal Process 2011, 59(9):4438–4446.
 11.
Lioliou P, Viberg M, Coldrey M: Efficient channel estimation techniques for amplify and forward relaying systems. IEEE Trans. Comm 2012, 60(11):3150–3155.
 12.
Roemer F, Haardt M: Tensor-based channel estimation and iterative refinements for two-way relaying with multiple antennas and spatial reuse. IEEE Trans. Signal Process 2010, 58(11):5720–5735.
 13.
Fernandes CAR, de Almeida ALF, Costa DB: Unified tensor modeling for blind receivers in multiuser uplink cooperative systems. IEEE Signal Process. Lett 2012, 19(5):247–250.
 14.
Rong Y, Khandaker MRA, Xiang Y: Channel estimation of dual-hop MIMO relay system via parallel factor analysis. IEEE Trans. Wireless Comm 2012, 11(6):2224–2233.
 15.
de Almeida ALF, Fernandes CAR, Benevides da Costa D: Multiuser detection for uplink DS-CDMA amplify-and-forward relaying systems. IEEE Signal Process. Lett 2013, 20(7):697–700.
 16.
Ximenes LR, Favier G, Almeida ALF, Silva YCB: PARAFAC-PARATUCK semi-blind receivers for two-hop cooperative MIMO relay systems. IEEE Trans. Signal Process 2014, 62(14):3604–3615.
 17.
Harshman RA: Foundations of the PARAFAC procedure: model and conditions for an ‘explanatory’ multimode factor analysis. UCLA Working Papers Phonetics 1970, 16: 1–84.
 18.
Carroll JD, Chang JJ: Analysis of individual differences in multidimensional scaling via an N-way generalization of “Eckart-Young” decomposition. Psychometrika 1970, 35(3):283–319. 10.1007/BF02310791
 19.
Sidiropoulos ND, Budampati RS: Khatri-Rao space-time codes. IEEE Trans. Signal Process 2002, 50(10):2396–2407. 10.1109/TSP.2002.803341
 20.
Harshman RA, Lundy ME: Uniqueness proof for a family of models sharing features of Tucker’s three-mode factor analysis and PARAFAC/CANDECOMP. Psychometrika 1996, 61(1):133–154. 10.1007/BF02296963
 21.
Chalise BK, Zhang YD, Amin MG: Joint Optimization of Source Beamformer and Relay Coefficients Using MSE Criterion. Proc. of SPIE’12 May 2012.
 22.
Mohammadi M, Ardebilipour M, Mobini Z, Zadeh RA: Performance analysis and power allocation for multi-hop multi-branch amplify-and-forward cooperative networks over generalized fading channels. EURASIP J. Wireless Commun. Networking 2013, 2013(1):1–13. 10.1186/1687-1499-2013-1
 23.
Bro R: Multi-way analysis in the food industry: Models, algorithms and applications. PhD thesis, University of Amsterdam, Amsterdam, 1998
 24.
Sidiropoulos ND, Liu X: Identifiability results for blind beamforming in incoherent multipath with small delay spread. IEEE Trans. Signal Process 2001, 49(1):228–236. 10.1109/78.890366
 25.
Smilde A, Bro R, Geladi P: Multi-way Analysis: Applications in the Chemical Sciences. John Wiley & Sons, West Sussex, England; 2004.
Acknowledgements
André L. F. de Almeida is partially supported by CNPq and CAPES. This work was supported by China’s Next Generation Internet Project (CNGI Project) (CNGI1203009) and DNSLAB. This work was also supported by the National Natural Science Foundation of China (Grant No. 61173017).
Additional information
Competing interests
The authors declare that they have no competing interests.
Xi Han, André LF de Almeida and Zhen Yang contributed equally to this work.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Han, X., de Almeida, A.L. & Yang, Z. Channel estimation for MIMO multirelay systems using a tensor approach. EURASIP J. Adv. Signal Process. 2014, 163 (2014) doi:10.1186/1687-6180-2014-163
Keywords
 Channel estimation
 Cooperative MIMO
 Relaying
 Tensor decomposition