Unified tensor model for space-frequency spreading-multiplexing (SFSM) MIMO communication systems

This paper presents a unified tensor model for space–frequency spreading-multiplexing (SFSM) multiple-input multiple-output (MIMO) wireless communication systems that combine space- and frequency-domain spreading, followed by space–frequency multiplexing. Spreading across space (transmit antennas) and frequency (subcarriers) adds resilience against deep channel fades and provides space and frequency diversity, while orthogonal space–frequency multiplexing enables multi-stream transmission. We adopt a tensor-based formulation for the proposed SFSM MIMO system that incorporates space, frequency, time, and code dimensions by means of the parallel factor model. The developed SFSM tensor model unifies the tensorial formulations of several existing multiple-access/multicarrier MIMO signaling schemes as special cases, and reveals tradeoffs among space, frequency, and time diversities that are of practical relevance for joint symbol-channel-code estimation. The performance of the proposed SFSM MIMO system, using either a zero forcing receiver or a semi-blind tensor-based receiver, is illustrated by means of computer simulation results under realistic channel and system parameters.


Introduction
Wireless communication systems employing multiple antennas at both ends of the link, commonly known as multiple-input multiple-output (MIMO) systems, are being considered as one of the key technologies to be deployed in current and upcoming wireless communication standards [1]. In this context, the integration of multiple-antenna systems with code-division multiple-access (CDMA) transmission and/or orthogonal frequency division multiplexing (OFDM) has also been the subject of several works over the past few years [2][3][4].
Different combinations of OFDM and CDMA have been reported in a number of works. Multi-carrier (MC)-CDMA performs spreading of the information symbols across the different subcarriers [5,6], but suffers from limited frequency diversity gains like conventional CDMA systems. MC direct-sequence (MCDS)-CDMA differs from MC-CDMA by performing the spreading operation in the time-domain at each subcarrier [7]. For combating frequency-selective fading, MCDS-CDMA requires forward error-correction coding and frequency-domain interleaving. In [8], a hybrid of MC-CDMA and OFDM systems with orthogonal transmission in the frequency-domain was proposed, which ensures interference-free transmission/reception regardless of the multipath channel profile. A related approach, called multicarrier block-spread (MCBS)-CDMA, was introduced in [9] by capitalizing on redundant block spreading and frequency-domain linear precoding to preserve orthogonal multiple-accessing and to enable full multipath diversity gains. The receiver is based on a low-complexity single-user equalization.
By introducing the spatial dimension at the transmit processing, jointly with time and/or frequency dimensions, a number of different space-frequency MIMO transceivers were proposed to enable orthogonal multiple-access in multiuser systems combining OFDM and CDMA techniques. A spread spectrum-based transmission framework was proposed in [10], therein called multicarrier spread space spectrum multiple access (MC-SSSMA), with the idea of fully spreading each user symbol over space, time, and frequency. MC-SSSMA is a generalization of its single-carrier counterpart proposed in [11]. Despite the achieved spectral efficiency gains, the design of [10] was restricted to the case where the number of transmit and receive antennas is equal to the spreading gain. In [12], space-time-frequency spreading was proposed for MC-CDMA based on the concatenation of a space-time spreading code with a frequency-domain spreading code.
*Correspondence: andre@gtel.ufc.br. Department of Teleinformatics Engineering, Federal University of Ceará, Fortaleza, Brazil. Full list of author information is available at the end of the article. http://asp.eurasipjournals.com/content/2013/1/48
A common characteristic of all these works is the assumption of perfect channel knowledge at the receiver. When the channel is not known, as is the case in practice, the receiver design is generally based on suboptimum (linear or nonlinear) filtering/equalization/signal separation structures that use training sequences for channel acquisition and tracking before decoding the transmitted data. However, practical limitations such as the receiver complexity and the training sequence overhead (which implies a reduction of the information rate) may be prohibitive in some cases.
Recently, tensor modeling has successfully been applied to the design of MIMO transceivers based on spatial multiplexing and/or space-time coding [13][14][15][16][17][18][19]. Relying on the use of spreading codes, the common feature of these works is the fact that the received signal can be modeled as a third-order tensor, the dimensions of which are associated with space, time, and code diversities [20]. Due to the uniqueness properties of tensor models, these tensor-based MIMO-CDMA transceivers afford blind multiuser detection and channel estimation under more relaxed conditions compared with conventional matrix-based receivers. The approach of [13] relies on pure spatial multiplexing by means of a parallel factor (PARAFAC) model [21]. The work of [14] deals with a multiple-access MIMO antenna system relying on a block tensor model [22]. In [15], a constrained "block-structured" PARAFAC model is proposed for allowing multiuser space-time spreading in the uplink. The multiuser downlink case is treated in [16]. More general tensor-based space-time spreading and multiplexing structures were also proposed relying on the constrained factor (CONFAC) model [17,18] and on PARATUCK-type models [19,23].
In this article, we present a unified tensor model for space-frequency spreading-multiplexing (SFSM) MIMO wireless communication systems combining both space and frequency spreadings along with a space-frequency multiplexing. On one hand, spreading across space (transmit antennas) and frequency (subcarriers) potentially provides robustness against frequency-selective fading and channel ill-conditioning while providing transmit diversity gains. On the other hand, an orthogonal space-frequency multiplexing enables interference-free multistream transmission. For this system, we adopt a tensorial formulation of the transmitted and received signals that jointly incorporates space, frequency, time, and code dimensions by means of a PARAFAC tensor model. From this tensorial formulation, we show how several existing multiple-antenna CDMA-based systems can be derived by making appropriate simplifications on the unified tensor model structure.
We also address the problem of joint symbol-channel-code estimation for the proposed system by capitalizing on the uniqueness properties of the PARAFAC model. By exploiting the space, time, frequency, and code diversities inherent to the unified SFSM tensor model, we obtain new results providing useful bounds on the required number of transmit and receive antennas, subcarriers, and spreading length for ensuring a unique recovery of users' symbols, channels, and codes. A performance evaluation of the SFSM MIMO system is also carried out considering a zero forcing (ZF) receiver and a semi-blind alternating least squares (ALS) receiver that only requires a single pilot symbol per transmitted data stream in order to remove the scaling factor introduced by the estimation process.
The remainder of this article is organized as follows. In Section 2, the main building blocks of the SFSM transmitter are detailed and the transmitted signal model is formulated. In Section 3, we present the received signal model and also derive the proposed unifying tensor model and its special cases. A ZF receiver with joint block-decoding and equalization is formulated in Section 4. Section 5 is dedicated to the problem of joint symbol-channel-code estimation for the unified SFSM MIMO system, where bounds on the required numbers of transmit/receive antennas, subcarriers, spreading length, and the number of symbols per data stream are provided. The semi-blind ALS receiver is also presented in this section. In Section 6, the performance of the SFSM MIMO system is evaluated by means of computer simulations under different system parameter settings. The article is concluded in Section 7.
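Notations: Some notations and properties are now defined. Scalars are denoted by lower-case letters ($a, b, \ldots$), vectors by boldface lower-case letters ($\mathbf{a}, \mathbf{b}, \ldots$), matrices by boldface capitals ($\mathbf{A}, \mathbf{B}, \ldots$), and tensors by calligraphic letters ($\mathcal{A}, \mathcal{B}, \ldots$). We use $a_{i,j} = [\mathbf{A}]_{i,j}$ to denote the entry $(i,j)$ of matrix $\mathbf{A}$, while $a_{i,j,k,l}$ refers to the entry $(i,j,k,l)$ of the tensor $\mathcal{A} \in \mathbb{C}^{I\times J\times K\times L}$. The $i$th row and $j$th column of $\mathbf{A} \in \mathbb{C}^{I\times J}$ are denoted by $\mathbf{A}_{i\cdot} \in \mathbb{C}^{1\times J}$ and $\mathbf{A}_{\cdot j} \in \mathbb{C}^{I\times 1}$, respectively. $\mathbf{A}^T$, $\mathbf{A}^{-1}$, and $\mathbf{A}^{\dagger}$ stand for the transpose, inverse, and pseudo-inverse of $\mathbf{A}$, respectively. The operator $\mathrm{diag}(\cdot)$ forms a diagonal matrix from its vector argument, while $\mathrm{blockdiag}(\cdot)$ forms a block-diagonal matrix from its matrix arguments. The operator $\mathrm{vecdiag}(\cdot)$ forms a column vector out of the main diagonal of its matrix argument, while $\mathbf{1}_R$ denotes the "all-ones" vector of dimension $R$. The operator $\mathrm{vec}(\cdot)$ forms a vector by stacking the columns of its matrix argument. $D_i(\mathbf{A})$ forms a diagonal matrix holding the $i$th row of $\mathbf{A}$ on its main diagonal. The Kronecker and the Khatri-Rao products are denoted by $\otimes$ and $\diamond$, respectively, the latter being the column-wise Kronecker product:
$$\mathbf{A} \diamond \mathbf{B} = [\mathbf{A}_{\cdot 1} \otimes \mathbf{B}_{\cdot 1}, \ldots, \mathbf{A}_{\cdot R} \otimes \mathbf{B}_{\cdot R}]. \tag{1}$$
We shall make use of the following properties of the Khatri-Rao product:
$$\mathrm{vec}(\mathbf{A}\,\mathrm{diag}(\mathbf{x})\,\mathbf{B}^T) = (\mathbf{B} \diamond \mathbf{A})\,\mathbf{x}, \tag{2}$$
$$(\mathbf{A} \diamond \mathbf{B})^H (\mathbf{A} \diamond \mathbf{B}) = (\mathbf{A}^H\mathbf{A}) * (\mathbf{B}^H\mathbf{B}), \tag{3}$$
with $\mathbf{A} \in \mathbb{C}^{I\times R}$, $\mathbf{B} \in \mathbb{C}^{J\times R}$ and $\mathbf{x} \in \mathbb{C}^{R}$, and where $*$ denotes the Hadamard (element-wise) matrix product.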
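The two Khatri-Rao identities can be checked numerically. The sketch below assumes the usual column-wise Kronecker definition of the Khatri-Rao product and a column-stacking vec(·); the helper name `khatri_rao` and the dimensions are illustrative choices, not taken from the article.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x R) and B (J x R), giving (I*J x R)."""
    I, R = A.shape
    J, R2 = B.shape
    assert R == R2, "A and B must have the same number of columns"
    # Row i*J + j of the result holds A[i, r] * B[j, r], as np.kron would per column.
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

rng = np.random.default_rng(0)
I, J, R = 4, 3, 5
A = rng.standard_normal((I, R)) + 1j * rng.standard_normal((I, R))
B = rng.standard_normal((J, R)) + 1j * rng.standard_normal((J, R))
x = rng.standard_normal(R) + 1j * rng.standard_normal(R)

# vec(A diag(x) B^T) = (B kr A) x, with vec() stacking columns (Fortran order).
lhs_vec = (A @ np.diag(x) @ B.T).reshape(-1, order="F")
rhs_vec = khatri_rao(B, A) @ x

# (A kr B)^H (A kr B) = (A^H A) * (B^H B), where * is the Hadamard product.
K = khatri_rao(A, B)
lhs_gram = K.conj().T @ K
rhs_gram = (A.conj().T @ A) * (B.conj().T @ B)

print(np.allclose(lhs_vec, rhs_vec), np.allclose(lhs_gram, rhs_gram))
```

The second identity is the one that later turns the ZF receiver into a per-stream combiner when the code matrix has orthonormal columns.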

SFSM: transmitted signal model
We consider the uplink of a single-cell multicarrier multiuser MIMO system with $Q$ active co-channel users transmitting data across the same set of $F$ subcarriers. Each user terminal is equipped with $M_t$ transmit antennas and transmits $R$ data streams. The base station is equipped with $M_r$ receive antennas. The proposed SFSM transmission structure is composed of three main operations: (i) space spreading, (ii) frequency spreading, and (iii) space-frequency block-coding. Figure 1 depicts the block diagram of the transmitter structure by focusing on the transmission of the $n$th symbol of the $r$th data stream. For notational simplicity, we begin with a single-user transmission model. Later on, we show that the multiuser signal model is readily obtained with minor changes in notation.

Space-domain spreading
The input symbol sequence is serial-to-parallel converted into $R$ data streams, each consisting of $N$ symbols. For the $n$th symbol period, let us define $s_{n,r}$ as the $n$th symbol of the $r$th data stream. The first operation is the space spreading, which consists in spreading each data stream over the $M_t$ transmit antennas using a different code. Let us define $\boldsymbol{\Omega} = [\boldsymbol{\Omega}_{\cdot 1}, \ldots, \boldsymbol{\Omega}_{\cdot r}, \ldots, \boldsymbol{\Omega}_{\cdot R}] \in \mathbb{C}^{M_t\times R}$ as the matrix collecting the code vectors of the $R$ data streams. The space-domain spread signal is defined by the third-order tensor $\bar{\mathcal{S}} \in \mathbb{C}^{M_t\times N\times R}$, the $(m_t, n, r)$th element of which is given by
$$\bar{s}_{m_t,n,r} = \omega_{m_t,r}\, s_{n,r},$$
and represents the $n$th space spread symbol of the $r$th data stream transmitted by the $m_t$th antenna. For the space-domain spreading matrix $\boldsymbol{\Omega}$, we choose a Vandermonde design with complex generators. As shown in [24], the Vandermonde structure minimizes an upper bound of the pairwise error probability at high signal-to-noise ratios (SNRs). Moreover, this structure yields a good coding gain and makes the transmission more robust to ill-conditioned/rank-deficient MIMO channels [25].

Frequency-domain spreading
The second operation consists in jointly spreading and coding each component $\bar{s}_{m_t,n,r}$ in the frequency-domain. This operation is implemented by means of linear precoding, which adds transmit redundancy in the frequency-domain before the multicarrier modulation. Each data symbol is transmitted simultaneously (in parallel) on different subcarriers in a way similar to an MC-CDMA system with frequency-domain spreading [26]. In addition to providing frequency diversity gains, frequency-domain spreading adds resilience to symbol detection even in the presence of a deep channel fade over one or more subcarrier channels. Let $\boldsymbol{\Theta} = [\boldsymbol{\Theta}_{\cdot 1}, \ldots, \boldsymbol{\Theta}_{\cdot r}, \ldots, \boldsymbol{\Theta}_{\cdot R}] \in \mathbb{C}^{F\times R}$ be the frequency spreading matrix. The output of this frequency spreading operation is given by
$$\tilde{s}_{f,m_t,n,r} = \theta_{f,r}\, \bar{s}_{m_t,n,r} = \theta_{f,r}\, \omega_{m_t,r}\, s_{n,r}, \tag{6}$$
which is the $(f, m_t, n, r)$th element of the fourth-order tensor $\tilde{\mathcal{S}} \in \mathbb{C}^{F\times M_t\times N\times R}$ representing the space-frequency spread signal $s_{n,r}$ associated with the $n$th symbol period and $r$th data stream. The frequency spreading can be redundant ($F > R$) or not ($F \le R$). As for the space-domain spreading, here we also choose $\boldsymbol{\Theta}$ as a Vandermonde matrix with complex generators. The reason for choosing the Vandermonde structure for the frequency spreading matrix follows that of the space spreading matrix. Some designs for $\boldsymbol{\Theta}$ have been reported in the literature (we refer the interested reader to [27] for further details).
Note that spreading in the space-domain consists in multiplying the symbol $s_{n,r}$ by a complex code that depends on the transmit antenna number $m_t$, while spreading in the frequency-domain results in a multiplication of the same symbol by a complex code that depends on the frequency number $f$, as shown in (6).

Space-frequency multiplexing
The third operation of the SFSM transmitter consists in a multiplexing of the $R$ space-frequency spread symbols. Using conventional direct sequence (DS) spreading, each space-frequency symbol $\tilde{s}_{f,m_t,n,r}$ is spread by a factor $P$ using a specific spreading code. Due to spectrum spreading at the subcarrier level, each subcarrier signal constitutes a DS spread signal. Consequently, the frequency spectra associated with the subcarriers are allowed to overlap in order to achieve high spectral efficiency.
Let us define $\mathbf{C} = [\mathbf{C}_{\cdot 1}, \ldots, \mathbf{C}_{\cdot r}, \ldots, \mathbf{C}_{\cdot R}] \in \mathbb{R}^{P\times R}$ as the spreading code matrix, the columns/rows of which belong to a (possibly truncated) Walsh-Hadamard (WH) code matrix. When $P \le R$, we form $\mathbf{C}$ by selecting the $P$ first rows of an $R \times R$ WH matrix. Each spreading code vector is applied with the chip period $T_c = T/P$, where $T$ corresponds to the OFDM symbol duration. The proposed space-frequency multiplexing operation consists in summing up $R$ DS spread signals, each of which is obtained by multiplying $\tilde{s}_{f,m_t,n,r}$ by the corresponding spreading code $c_{p,r}$. Therefore, this operation yields a multi-stream signal tensor $\mathcal{Z} \in \mathbb{C}^{F\times M_t\times N\times P}$ whose typical element is given by
$$z_{f,m_t,n,p} = \sum_{r=1}^{R} \tilde{s}_{f,m_t,n,r}\, c_{p,r}. \tag{8}$$
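The three transmitter operations (space spreading, frequency spreading, and code multiplexing) can be sketched with tensor contractions. This is a minimal illustration only: the unit-modulus Vandermonde generators used in `vandermonde` are a hypothetical choice (the article's generator design is not reproduced here), the Walsh-Hadamard matrix is built with the Sylvester recursion, and all dimensions are arbitrary.

```python
import numpy as np

def vandermonde(rows, cols):
    """Hypothetical Vandermonde design: entry (i, r) is g_r**i with unit-modulus g_r."""
    generators = np.exp(2j * np.pi * np.arange(cols) / cols)
    return generators[None, :] ** np.arange(rows)[:, None] / np.sqrt(rows)

def walsh_hadamard(order):
    """Sylvester construction of a Walsh-Hadamard matrix; order must be a power of two."""
    H = np.array([[1.0]])
    while H.shape[0] < order:
        H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
    return H

F, Mt, N, P, R = 4, 2, 6, 4, 3
rng = np.random.default_rng(1)
S = rng.choice(np.array([1, -1, 1j, -1j]), size=(N, R))  # QPSK-like symbols s_{n,r}
Omega = vandermonde(Mt, R)                  # space spreading matrix, Mt x R
Theta = vandermonde(F, R)                   # frequency spreading matrix, F x R
C = walsh_hadamard(P)[:, :R] / np.sqrt(P)   # P x R code matrix with orthonormal columns

# Step 1: space-domain spreading, s_bar[m, n, r] = omega[m, r] * s[n, r].
S_bar = np.einsum("mr,nr->mnr", Omega, S)
# Step 2: frequency-domain spreading, s_tilde[f, m, n, r] = theta[f, r] * s_bar[m, n, r].
S_tilde = np.einsum("fr,mnr->fmnr", Theta, S_bar)
# Step 3: code multiplexing, z[f, m, n, p] = sum_r s_tilde[f, m, n, r] * c[p, r].
Z = np.einsum("fmnr,pr->fmnp", S_tilde, C)

# The same tensor obtained in one shot from the four factor matrices.
Z_direct = np.einsum("fr,mr,nr,pr->fmnp", Theta, Omega, S, C)
print(Z.shape, np.allclose(Z, Z_direct))
```

Writing the chain as a single four-factor contraction makes the multilinear structure explicit, which is what the later PARAFAC formulation exploits.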

Multicarrier modulation
Before being transmitted, the space-frequency multiplexed signal passes through the OFDM modulator. Considering a frequency-selective wireless link between each transmit-receive antenna pair, define $L_{max}$ as the maximum length of the impulse response of all the channels, including the effects of the physical channel and of the pre-/post-filtering at transmitter and receiver. An inverse fast Fourier transform (IFFT) is applied and a cyclic prefix (CP) of $L_{max}$ chips is appended to the resulting time-domain samples. Let $\boldsymbol{\Xi} = \mathbf{T}_{cp}\mathbf{F}^H \in \mathbb{C}^{J\times F}$ be a matrix representing the combined IFFT and CP-adding operations, where $\mathbf{F}$ is the $F \times F$ FFT matrix, $\mathbf{T}_{cp} = [\mathbf{I}_{cp}^T, \mathbf{I}_F^T]^T \in \mathbb{C}^{J\times F}$ is the CP-adding matrix, $J = F + L_{max}$, and $\mathbf{I}_{cp}$ is the matrix formed from the $L_{max}$ last rows of $\mathbf{I}_F$, the identity matrix of order $F$. The output of the IFFT+CP-adding block corresponding to the transmitted signal is given by the following tensor transformation:
$$x_{j,m_t,n,p} = \sum_{k=1}^{F} \xi_{j,k}\, z_{k,m_t,n,p},$$
where $\xi_{j,k} = [\boldsymbol{\Xi}]_{j,k}$ and $x_{j,m_t,n,p}$ is a typical element of the transmitted signal tensor $\mathcal{X} \in \mathbb{C}^{J\times M_t\times N\times P}$.

SFSM: received signal model
The block diagram of the receiver is depicted in Figure 2.
We adopt a discrete-time baseband equivalent model for the received signal in the SFSM MIMO system, assuming perfect chip- and symbol-level synchronization at the receiver. Following the tensor notation used in the previous section, the fourth-order tensor $\mathcal{V} \in \mathbb{C}^{J\times M_r\times N\times P}$ representing the time-domain received signal in absence of noise$^{a}$ is defined by (10), where $\dot{h}_{j,m_r,m_t}$ is an element of the tensor $\dot{\mathcal{H}} \in \mathbb{C}^{J\times M_r\times M_t}$, $\dot{\mathbf{H}}_{\cdot m_r m_t} \in \mathbb{C}^{J\times 1}$ being the impulse response of the channel linking the $m_r$th receive antenna to the $m_t$th transmit antenna.
The time-domain samples $v_{j,m_r,n,p}$ pass through the combined FFT and CP-removal (CPR) block, represented by the matrix $\mathbf{F}\mathbf{R}_{cp} \in \mathbb{C}^{F\times J}$, where $\mathbf{R}_{cp} \in \mathbb{C}^{F\times J}$ is the CPR matrix. This yields the following received signal tensor $\mathcal{Y} \in \mathbb{C}^{F\times M_r\times N\times P}$:
$$y_{f,m_r,n,p} = \sum_{j=1}^{J} [\mathbf{F}\mathbf{R}_{cp}]_{f,j}\, v_{j,m_r,n,p}. \tag{11}$$
Using (10), we can rewrite (11) as
$$y_{f,m_r,n,p} = \sum_{m_t=1}^{M_t} \sum_{k=1}^{F} h_{f,k,m_r,m_t}\, z_{k,m_t,n,p}, \tag{12}$$
where $h_{f,k,m_r,m_t}$ corresponds to the end-to-end (frequency-domain) channel tensor $\mathcal{H} \in \mathbb{C}^{F\times F\times M_r\times M_t}$ that results from the combined FFT+CPR and IFFT+CP transformations at the receiver and transmitter, respectively. Note that $h_{f,k,m_r,m_t}$ is zero for all $f \neq k$. In matrix notation, this can be seen by noting that the matrix slice $\mathbf{H}_{\cdot\cdot m_r m_t} \in \mathbb{C}^{F\times F}$ has a diagonal structure [28]. Consequently, we can simplify (12) by eliminating the summation over index $k$, yielding
$$y_{f,m_r,n,p} = \sum_{m_t=1}^{M_t} h_{f,f,m_r,m_t}\, z_{f,m_t,n,p}. \tag{13}$$
Finally, using (6) and (8), we can rewrite (13) as:
$$y_{f,m_r,n,p} = \sum_{m_t=1}^{M_t} \sum_{r=1}^{R} h_{f,f,m_r,m_t}\, \theta_{f,r}\, \omega_{m_t,r}\, s_{n,r}\, c_{p,r}. \tag{14}$$
In the next section, we show how the tensor model (14) satisfied by the received signals can be cast into a PARAFAC model by contracting the first two modes of the transmitted and received signal tensors. Our motivation for using PARAFAC modeling comes from the possibility of studying identifiability by resorting to the well-known results available in the literature.
Figure 2: Receiver block diagram.
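The diagonal structure of the end-to-end frequency-domain channel can be illustrated for a single transmit-receive antenna pair. The sketch below is an assumption-laden toy: it uses a unitary FFT matrix, a CP-removal matrix that simply discards the first $L_{max}$ samples, and a lower-triangular Toeplitz matrix for the within-block linear convolution; the names `Fft`, `T_cp`, `R_cp`, and `H_time` are illustrative.

```python
import numpy as np

F, L_max = 8, 3
J = F + L_max
rng = np.random.default_rng(2)

Fft = np.fft.fft(np.eye(F)) / np.sqrt(F)               # unitary F x F FFT matrix
T_cp = np.vstack([np.eye(F)[-L_max:, :], np.eye(F)])   # CP-adding matrix, J x F
R_cp = np.hstack([np.zeros((F, L_max)), np.eye(F)])    # CP-removal matrix, F x J

# Linear convolution inside one block: lower-triangular Toeplitz matrix built
# from an impulse response with L_max taps (so the CP absorbs the channel memory).
h = rng.standard_normal(L_max) + 1j * rng.standard_normal(L_max)
H_time = np.zeros((J, J), dtype=complex)
for tap, coeff in enumerate(h):
    H_time += coeff * np.eye(J, k=-tap)

# End-to-end channel: (FFT + CP removal) * channel * (CP insertion + IFFT).
H_freq = Fft @ R_cp @ H_time @ T_cp @ Fft.conj().T
off_diagonal = H_freq - np.diag(np.diag(H_freq))
expected_diag = np.fft.fft(h, F)                       # channel frequency response

print(np.allclose(off_diagonal, 0), np.allclose(np.diag(H_freq), expected_diag))
```

CP insertion followed by CP removal turns the Toeplitz convolution into a circulant one, and a circulant matrix is diagonalized by the FFT, which is why the sum over the second frequency index can be dropped.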

PARAFAC model formulation
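In its general form, the PARAFAC decomposition amounts to decomposing a third-order tensor $\mathcal{X} \in \mathbb{C}^{I_1\times I_2\times I_3}$ into a sum of $R$ rank-one third-order tensors [21]. It has the following scalar representation
$$x_{i_1,i_2,i_3} = \sum_{r=1}^{R} a^{(1)}_{i_1,r}\, a^{(2)}_{i_2,r}\, a^{(3)}_{i_3,r}, \tag{15}$$
where $a^{(n)}_{i_n,r}$ is the entry $(i_n, r)$ of the $n$th mode matrix factor $\mathbf{A}^{(n)} \in \mathbb{C}^{I_n\times R}$, $n = 1, 2, 3$. When $R$ is minimal, it is called the rank of $\mathcal{X}$.
Starting from the space-frequency block-coded signal (8), let us contract the first two modes of the coded signal tensor $\mathcal{Z} \in \mathbb{C}^{F\times M_t\times N\times P}$ by defining $m = (f-1)M_t + m_t$, with $M = FM_t$, and define the space-frequency spreading matrix $\mathbf{U} \in \mathbb{C}^{M\times R}$ such that
$$\mathbf{U} = \boldsymbol{\Theta} \diamond \boldsymbol{\Omega}, \quad \text{i.e.,} \quad u_{m,r} = \theta_{f,r}\, \omega_{m_t,r}. \tag{16}$$
Then, Equations (6), (8), and (16) lead to the following contracted signal tensor:
$$\bar{z}_{m,n,p} = \sum_{r=1}^{R} u_{m,r}\, s_{n,r}\, c_{p,r}, \tag{17}$$
which corresponds to a third-order PARAFAC model for the transmitted signal tensor $\bar{\mathcal{Z}} \in \mathbb{C}^{M\times N\times P}$, with matrix factors $(\mathbf{U}, \mathbf{S}, \mathbf{C})$. Following the same reasoning, let us now contract the first two modes of the received signal tensor $\mathcal{Y} \in \mathbb{C}^{F\times M_r\times N\times P}$ by defining $i = (f-1)M_r + m_r$, with $I = FM_r$. Combining this contraction with the one introduced for the transmitted signal tensor $\bar{\mathcal{Z}} \in \mathbb{C}^{M\times N\times P}$ and using (17), we get the following contracted received signal tensor $\bar{\mathcal{Y}} \in \mathbb{C}^{I\times N\times P}$:
$$\bar{y}_{i,n,p} = \sum_{m=1}^{M} \sum_{r=1}^{R} \bar{h}_{i,m}\, u_{m,r}\, s_{n,r}\, c_{p,r}, \tag{18}$$
where $\bar{\mathbf{H}} \in \mathbb{C}^{I\times M}$ is a channel matrix obtained from a double contraction of the end-to-end channel tensor $\mathcal{H}$, i.e., $\bar{h}_{i,m} = h_{f,k,m_r,m_t}$ with $i = (f-1)M_r + m_r$ and $m = (k-1)M_t + m_t$. Defining $g_{i,r} = \sum_{m=1}^{M} \bar{h}_{i,m}\, u_{m,r}$ as the effective MIMO channel linking the $R$ multiplexed data streams at the transmitter to the $I = FM_r$ equivalent subchannel outputs at the receiver, the tensor $\bar{\mathcal{Y}}$ can be rewritten element-wise as
$$\bar{y}_{i,n,p} = \sum_{r=1}^{R} g_{i,r}\, s_{n,r}\, c_{p,r}, \tag{19}$$
which corresponds to a third-order PARAFAC model for the contracted received signal tensor $\bar{\mathcal{Y}}$. The final step is to determine an adequate expression for the factorization of the effective MIMO channel matrix $\mathbf{G}$. From the definition of $g_{i,r}$ and the expression (16) of $\mathbf{U}$, we get
$$\mathbf{G} = \bar{\mathbf{H}}\mathbf{U} = \bar{\mathbf{H}}(\boldsymbol{\Theta} \diamond \boldsymbol{\Omega}). \tag{20}$$
Note that the contracted received signal tensor $\bar{\mathcal{Y}} \in \mathbb{C}^{I\times N\times P}$ given by (19) follows a PARAFAC model with matrix factors $(\bar{\mathbf{H}}(\boldsymbol{\Theta} \diamond \boldsymbol{\Omega}), \mathbf{S}, \mathbf{C})$. In fact, models (17) and (19) for the transmitted and received signal tensors, respectively, differ only in their first-mode matrix factors, which are related by (20).
For the model (19), we have the following matrix representations:
$$\bar{\mathbf{Y}}_1 = (\mathbf{C} \diamond \mathbf{G})\,\mathbf{S}^T, \tag{21}$$
$$\bar{\mathbf{Y}}_2 = (\mathbf{G} \diamond \mathbf{S})\,\mathbf{C}^T, \tag{22}$$
$$\bar{\mathbf{Y}}_3 = (\mathbf{S} \diamond \mathbf{C})\,\mathbf{G}^T, \tag{23}$$
where $\bar{\mathbf{Y}}_1 \in \mathbb{C}^{PI\times N}$, $\bar{\mathbf{Y}}_2 \in \mathbb{C}^{IN\times P}$, and $\bar{\mathbf{Y}}_3 \in \mathbb{C}^{NP\times I}$ are unfolded matrix representations of $\bar{\mathcal{Y}}$.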
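The mode contraction and the three Khatri-Rao matrix representations of the trilinear model can be verified on random data. The sketch assumes a block-diagonal contracted channel with one $M_r \times M_t$ block per subcarrier and the unfolding conventions $\bar{\mathbf{Y}}_1 = (\mathbf{C} \diamond \mathbf{G})\mathbf{S}^T$, $\bar{\mathbf{Y}}_2 = (\mathbf{G} \diamond \mathbf{S})\mathbf{C}^T$, $\bar{\mathbf{Y}}_3 = (\mathbf{S} \diamond \mathbf{C})\mathbf{G}^T$; random factors stand in for the actual spreading, symbol, and code matrices.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    return (A[:, None, :] * B[None, :, :]).reshape(A.shape[0] * B.shape[0], A.shape[1])

F, Mr, Mt, N, P, R = 3, 2, 2, 5, 4, 3
I, M = F * Mr, F * Mt
rng = np.random.default_rng(3)

def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

Theta, Omega, S, C = crandn(F, R), crandn(Mt, R), crandn(N, R), crandn(P, R)
H_sub = crandn(F, Mr, Mt)                 # per-subcarrier channels h[f, f, m_r, m_t]

# Fourth-order received tensor: sum over transmit antennas and data streams.
Y4 = np.einsum("fab,fr,br,nr,pr->fanp", H_sub, Theta, Omega, S, C)

# Contracted quantities: U = Theta kr Omega and block-diagonal H_bar (I x M).
U = khatri_rao(Theta, Omega)
H_bar = np.zeros((I, M), dtype=complex)
for f in range(F):
    H_bar[f * Mr:(f + 1) * Mr, f * Mt:(f + 1) * Mt] = H_sub[f]
G = H_bar @ U                             # effective channel, I x R

# Third-order PARAFAC model and its three unfoldings.
Y = np.einsum("ir,nr,pr->inp", G, S, C)
Y1 = Y.transpose(2, 0, 1).reshape(P * I, N)   # rows indexed by (p, i)
Y2 = Y.reshape(I * N, P)                      # rows indexed by (i, n)
Y3 = Y.transpose(1, 2, 0).reshape(N * P, I)   # rows indexed by (n, p)

print(np.allclose(Y4.reshape(I, N, P), Y),
      np.allclose(Y1, khatri_rao(C, G) @ S.T),
      np.allclose(Y2, khatri_rao(G, S) @ C.T),
      np.allclose(Y3, khatri_rao(S, C) @ G.T))
```

The row ordering of each unfolding follows the left factor of the Khatri-Rao product: the first factor's index varies slowest.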

Multiuser case
The extension of the transmitted and received signal models to the multiuser MIMO case is straightforward. Let us assume that $Q$ users are transmitting to the base station (uplink transmission) and that all users have the same number $M_t$ of transmit antennas, $M_r$ denoting the number of receive antennas at the base station. The multiuser signal model follows that of the single-user case by considering a block-partitioned notation. In the multiuser case, the total number of transmitted data streams (summed over all the users) is equal to $R = R^{(1)} + \cdots + R^{(Q)}$, where $R^{(q)}$ denotes the number of space-frequency spread data streams transmitted by the $q$th user. With these definitions, the received signal model (19) can be rewritten as follows
$$\bar{y}_{i,n,p} = \sum_{q=1}^{Q} \sum_{r_q=1}^{R^{(q)}} g^{(q)}_{i,r_q}\, s^{(q)}_{n,r_q}\, c^{(q)}_{p,r_q}. \tag{24}$$
In this case, the mode-1 unfolded matrix representation of (24) is given by
$$\bar{\mathbf{Y}}_1 = (\mathbf{C} \diamond \mathbf{G})\,\mathbf{S}^T, \tag{25}$$
with the block-partitioned factors $\mathbf{G} = [\mathbf{G}^{(1)}, \ldots, \mathbf{G}^{(Q)}]$, $\mathbf{S} = [\mathbf{S}^{(1)}, \ldots, \mathbf{S}^{(Q)}]$, and $\mathbf{C} = [\mathbf{C}^{(1)}, \ldots, \mathbf{C}^{(Q)}]$. Therefore, the PARAFAC model (17) is equally valid for the multiuser case by simply interpreting its factor matrices as block matrices.
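The block-partitioned interpretation can be illustrated numerically: summing per-user trilinear models gives the same tensor as a single trilinear model whose factors are the column-wise concatenations of the per-user factors. The numbers of streams per user and the random factors below are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(4)
I, N, P = 6, 5, 4                  # I = F * M_r equivalent receive dimensions
streams_per_user = [2, 1, 3]       # R^(1), ..., R^(Q) for Q = 3 users

def crandn(rows, cols):
    return rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))

G_users = [crandn(I, Rq) for Rq in streams_per_user]
S_users = [crandn(N, Rq) for Rq in streams_per_user]
C_users = [crandn(P, Rq) for Rq in streams_per_user]

# Superposition of the Q per-user PARAFAC models.
Y_sum = sum(np.einsum("ir,nr,pr->inp", G, S, C)
            for G, S, C in zip(G_users, S_users, C_users))

# One PARAFAC model with block-partitioned (column-concatenated) factors.
G_all, S_all, C_all = (np.hstack(blocks) for blocks in (G_users, S_users, C_users))
Y_block = np.einsum("ir,nr,pr->inp", G_all, S_all, C_all)

print(G_all.shape[1], np.allclose(Y_sum, Y_block))
```

Because the model is a sum of rank-one terms, it does not distinguish between streams of the same user and streams of different users; only the total number of streams matters.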

Special cases
The proposed structured PARAFAC model (19) of the received signal is general in the sense that it incorporates several existing multiple-access/multiple-antenna signaling schemes. By making appropriate assumptions, the proposed model can gradually be simplified, so that we obtain different tensor-based transceiver models as special cases:
• Space-time spreading CDMA (STS-CDMA): For $F = 1$, which corresponds to a single-carrier transmission over a flat-fading channel, we can abandon the frequency-dependent index and eliminate the frequency spreading matrix ($\boldsymbol{\Theta} = \mathbf{1}_R^T$), so that $\mathbf{G} = \bar{\mathbf{H}}\boldsymbol{\Omega}$. Thus, the trilinear model (21) reduces to classical space-time spreading using multiple spreading codes and can be written as:
$$\bar{\mathbf{Y}}_1 = (\mathbf{C} \diamond \bar{\mathbf{H}}\boldsymbol{\Omega})\,\mathbf{S}^T. \tag{26}$$
This model is valid for modeling the multiple-antenna transmission systems proposed in [25,29].
• Spatial multiplexing CDMA (SM-CDMA): In SM-CDMA systems, the space spreading operation (which is responsible for spreading $R$ data streams across $M_t$ transmit antennas) is eliminated. In other words, each data stream is transmitted by a different transmit antenna. Still considering $F = 1$, in this case we have $R = M_t$, $\boldsymbol{\Omega} = \mathbf{I}_{M_t}$, and $\boldsymbol{\Theta} = \mathbf{1}_R^T$, which implies $\mathbf{G} = \bar{\mathbf{H}}$, and model (21) becomes:
$$\bar{\mathbf{Y}}_1 = (\mathbf{C} \diamond \bar{\mathbf{H}})\,\mathbf{S}^T. \tag{27}$$
This model covers a spatial multiplexing/multiple-access CDMA system using a different spreading code per transmit antenna [2], and is the same as the PARAFAC-CDMA model proposed in the seminal paper [20]. It also coincides with the Khatri-Rao space-time (KRST) coding model of [13].
• Multicarrier CDMA systems (MCBS-CDMA/MCDS-CDMA/MC-CDMA): We consider the transmission model of an MCDS-CDMA system where frequency-domain spreading and orthogonal multiplexing take place (e.g., see [26,30]). This is a single-input single-output (SISO) antenna system ($M_r = M_t = 1$), which means that the channel matrix $\bar{\mathbf{H}}$ reduces to an $F \times F$ diagonal matrix, and we can eliminate the space spreading matrix ($\boldsymbol{\Omega} = \mathbf{1}_R^T$) so that $\mathbf{G} = \bar{\mathbf{H}}\boldsymbol{\Theta} \in \mathbb{C}^{F\times R}$. Consequently, the general PARAFAC model (21) becomes:
$$\bar{\mathbf{Y}}_1 = (\mathbf{C} \diamond \bar{\mathbf{H}}\boldsymbol{\Theta})\,\mathbf{S}^T. \tag{28}$$
It is worth noting that this special model can be interpreted as the tensorial formulation of the MCBS-CDMA system proposed in [9]. In particular, if frequency-domain spreading is not used, we have $\boldsymbol{\Theta} = \mathbf{I}_R$ so that (28) reduces to a PARAFAC model for an MCDS-CDMA system with direct-sequence spectrum spreading at the subcarrier level [7]. In the SISO case, where $\bar{\mathbf{H}} \in \mathbb{C}^{F\times F}$ is diagonal, if space-frequency block-coding is not used ($P = 1$ and $\mathbf{C} = \mathbf{1}_R^T$), then (28) reduces to traditional MC-CDMA, and we have:
$$\bar{\mathbf{Y}}_1 = \bar{\mathbf{H}}\boldsymbol{\Theta}\,\mathbf{S}^T. \tag{29}$$
• Conventional spatial multiplexing: This is the well-known single-user single-carrier MIMO system with spatial multiplexing (such as the V-BLAST system of [31]). Then, we have $F = P = 1$, $R = M_t$, $\mathbf{C} = \boldsymbol{\Theta} = \mathbf{1}_R^T$, and $\boldsymbol{\Omega} = \mathbf{I}_{M_t}$. In this case, the general PARAFAC model (21) simplifies to the conventional matrix-based model:
$$\bar{\mathbf{Y}}_1 = \bar{\mathbf{H}}\,\mathbf{S}^T. \tag{30}$$
Table 1 summarizes the different special cases covered by the proposed tensor model. It allows us to deduce how the proposed tensor model parameters and the structure of the associated matrix factors are adjusted to model different existing systems in a tensorial form.
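As a small sanity check of how the general model collapses, the sketch below instantiates the conventional spatial multiplexing case ($F = P = 1$, $R = M_t$, no space or frequency spreading) and confirms that the Khatri-Rao structured model reduces to the familiar matrix model $\bar{\mathbf{H}}\mathbf{S}^T$. The dimensions and random channel are illustrative assumptions.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    return (A[:, None, :] * B[None, :, :]).reshape(A.shape[0] * B.shape[0], A.shape[1])

rng = np.random.default_rng(5)
Mr, Mt, N = 4, 3, 6
R = Mt                                    # one data stream per transmit antenna
H_bar = rng.standard_normal((Mr, Mt)) + 1j * rng.standard_normal((Mr, Mt))
S = rng.choice(np.array([1, -1, 1j, -1j]), size=(N, R))

Omega = np.eye(Mt)                        # no space spreading
Theta = np.ones((1, R))                   # F = 1: no frequency spreading
C = np.ones((1, R))                       # P = 1: no code multiplexing

G = H_bar @ khatri_rao(Theta, Omega)      # effective channel
Y1 = khatri_rao(C, G) @ S.T               # mode-1 unfolding of the general model

print(np.allclose(G, H_bar), np.allclose(Y1, H_bar @ S.T))
```

A Khatri-Rao product with an all-ones row vector leaves the other factor unchanged, which is why setting a dimension to one removes the corresponding diversity from the model.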
Table 1 footnotes: (a) Tensor models with diagonal channel matrix $\bar{\mathbf{H}} \in \mathbb{C}^{F\times F}$. (b) Systems in which the received signal model is reduced to a matrix (bilinear) decomposition.
Remark 1 (subcarrier grouping). In order to reduce the complexity of the receiver, we can resort to subcarrier grouping [32,33]. It consists in dividing the set of $F$ subcarriers into $\mu$ nonintersecting subsets of $K$ equispaced subcarriers, where $K$ can be chosen equal to the number of independent multipaths. Since both $F$ and $K$ can be viewed as system design parameters, we choose them so that $\mu = F/K$ is an integer. Information recovery can be carried out independently within each subcarrier group at the receiver (after FFT demodulation). This low-complexity detection strategy will be considered later in our simulations. We have chosen not to model subcarrier grouping explicitly in order to avoid unnecessarily complicated mathematical notation in the formulation of the transmitted and received signal models.

ZF receiver
Assuming that the channel ($\bar{\mathbf{H}}$), code ($\mathbf{C}$), and spreading ($\boldsymbol{\Omega}$, $\boldsymbol{\Theta}$) matrices are known at the receiver, we propose a ZF receiver that simultaneously estimates all the $R$ transmitted data streams by means of a joint block-decoding and equalization without de-spreading. The ZF receiver is based on Equation (21). It minimizes the least squares (LS) criterion $\|\bar{\mathbf{Y}}_1 - (\mathbf{C} \diamond \mathbf{G})\mathbf{S}^T\|_F^2$ with respect to the symbol matrix, giving a simultaneous estimate of the $R$ data streams as:
$$\hat{\mathbf{S}}^T = \mathbf{W}\,\bar{\mathbf{Y}}_1, \tag{31}$$
where
$$\mathbf{W} = (\mathbf{C} \diamond \mathbf{G})^{\dagger} = [(\mathbf{C} \diamond \mathbf{G})^H(\mathbf{C} \diamond \mathbf{G})]^{-1}(\mathbf{C} \diamond \mathbf{G})^H. \tag{32}$$
Since $\mathbf{C} \diamond \mathbf{G} \in \mathbb{C}^{PFM_r\times R}$ must be full column-rank to be left-invertible, the ZF receiver requires that $PM_rF \ge R$. From the structure of (32), we can observe that the ZF receiver does not require code-orthogonality to jointly estimate the transmitted signals. In Section 5, we propose a PARAFAC-based receiver that can operate blindly, i.e., without a priori knowledge of the space-frequency MIMO channel.
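A minimal noise-free sketch of the ZF receiver follows, assuming the mode-1 unfolding $\bar{\mathbf{Y}}_1 = (\mathbf{C} \diamond \mathbf{G})\mathbf{S}^T$ and using a pseudo-inverse as the left inverse. Random matrices stand in for the effective channel and codes, and the codes are deliberately not orthogonal.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    return (A[:, None, :] * B[None, :, :]).reshape(A.shape[0] * B.shape[0], A.shape[1])

rng = np.random.default_rng(6)
I, N, P, R = 6, 10, 4, 3                  # I = F * M_r, so P * I >= R holds
G = rng.standard_normal((I, R)) + 1j * rng.standard_normal((I, R))
C = rng.standard_normal((P, R))           # generic, non-orthogonal real codes
S = rng.choice(np.array([1, -1, 1j, -1j]), size=(N, R))

K = khatri_rao(C, G)                      # (P*I) x R, must be full column-rank
Y1 = K @ S.T                              # noise-free mode-1 unfolded received signal

W = np.linalg.pinv(K)                     # ZF filter: left inverse of C kr G
S_hat = (W @ Y1).T                        # joint estimate of all R data streams

print(np.linalg.matrix_rank(K) == R, np.allclose(S_hat, S))
```

The estimate is exact here only because no noise is added; with noise, the same filter gives the LS solution and a symbol decision step would follow.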

Space-frequency linear combiner
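Note that, under the condition $P \ge R$, the column-orthonormality of $\mathbf{C}$ turns the ZF receiver into a simpler space-frequency linear combiner that avoids matrix inversion and decodes each transmitted data stream separately. Indeed, if $\mathbf{C}$ has orthonormal columns, we have $\mathbf{C}^H\mathbf{C} = \mathbf{I}_R$. By expanding $\mathbf{W}$ in (32) and using property (3), we get
$$\mathbf{W} = [(\mathbf{C}^H\mathbf{C}) * (\mathbf{G}^H\mathbf{G})]^{-1}(\mathbf{C} \diamond \mathbf{G})^H = [\mathbf{I}_R * \mathbf{G}^H\mathbf{G}]^{-1}(\mathbf{C} \diamond \mathbf{G})^H. \tag{33}$$
Since the Hadamard product $\mathbf{I}_R * \mathbf{G}^H\mathbf{G}$ eliminates the off-diagonal elements of $\mathbf{G}^H\mathbf{G}$, we have
$$\mathbf{I}_R * \mathbf{G}^H\mathbf{G} = \mathrm{diag}(\|\mathbf{G}_{\cdot 1}\|^2, \ldots, \|\mathbf{G}_{\cdot R}\|^2),$$
so that
$$\hat{s}_{n,r} = \frac{(\mathbf{C}_{\cdot r} \otimes \mathbf{G}_{\cdot r})^H\, [\bar{\mathbf{Y}}_1]_{\cdot n}}{\|\mathbf{G}_{\cdot r}\|^2}. \tag{34}$$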
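The simplification can be checked numerically: when the code matrix has orthonormal columns, the pseudo-inverse of $\mathbf{C} \diamond \mathbf{G}$ coincides with a per-stream matched filter scaled by the squared column norms of $\mathbf{G}$. The sketch below builds an orthonormal $\mathbf{C}$ from a QR factorization, which is an illustrative stand-in for normalized Walsh-Hadamard codes.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    return (A[:, None, :] * B[None, :, :]).reshape(A.shape[0] * B.shape[0], A.shape[1])

rng = np.random.default_rng(7)
I, N, P, R = 6, 8, 4, 3
G = rng.standard_normal((I, R)) + 1j * rng.standard_normal((I, R))
C, _ = np.linalg.qr(rng.standard_normal((P, R)))   # P x R with orthonormal columns
S = rng.choice(np.array([1, -1, 1j, -1j]), size=(N, R))

K = khatri_rao(C, G)
Y1 = K @ S.T

W_zf = np.linalg.pinv(K)                           # general ZF filter
gains = np.sum(np.abs(G) ** 2, axis=0)             # squared column norms of G
W_combiner = K.conj().T / gains[:, None]           # no matrix inversion needed

S_hat = (W_combiner @ Y1).T                        # each stream decoded separately
print(np.allclose(W_zf, W_combiner), np.allclose(S_hat, S))
```

Row $r$ of `W_combiner` only involves the $r$th columns of $\mathbf{C}$ and $\mathbf{G}$, which is what allows stream-by-stream decoding.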

Semi-blind ALS receiver
The goal of the base station receiver is to separate the co-channel transmissions while recovering the data transmitted by each user. In our proposed SFSM MIMO system, co-channel transmissions are represented by the R data streams accessing simultaneously the space, time, and frequency channel resources. We are interested in a semi-blind receiver that neither requires prior knowledge, or estimation, of channel and antenna array responses, nor relies on statistical independence between the transmitted signals. These properties are distinguishing features of the PARAFAC modeling and constitute the main motivation for using the unified tensor model. Moreover, the proposed receiver is called semi-blind in the sense that it relies only on a single pilot symbol inserted at the beginning of each data stream. This pilot symbol is used to remove the scaling factor introduced by the estimation process.
We now study the joint symbol-code-channel recovery by capitalizing on the fundamental uniqueness property of the PARAFAC model (19). This property allows us to establish several practical corollaries, which provide lower bounds on the required number of transmit/receive antennas, subcarriers, symbol periods, and the spreading length for ensuring a semi-blind symbol-code-channel estimation. They also clearly illustrate the underlying tradeoffs involving space, frequency, and code diversities. Let us rewrite the three unfolded matrices of the received signal in (21), (22), and (23), in the following manner
$$\bar{\mathbf{Y}}_1 = \mathbf{Z}^{(c,g)}\mathbf{S}^T, \quad \bar{\mathbf{Y}}_2 = \mathbf{Z}^{(g,s)}\mathbf{C}^T, \quad \bar{\mathbf{Y}}_3 = \mathbf{Z}^{(s,c)}\bar{\mathbf{H}}^T, \tag{35}$$
where $\mathbf{Z}^{(c,g)} = \mathbf{C} \diamond \mathbf{G} \in \mathbb{C}^{PFM_r\times R}$, $\mathbf{Z}^{(g,s)} = \mathbf{G} \diamond \mathbf{S} \in \mathbb{C}^{FM_rN\times R}$, and $\mathbf{Z}^{(s,c)} = (\mathbf{S} \diamond \mathbf{C})(\boldsymbol{\Theta} \diamond \boldsymbol{\Omega})^T \in \mathbb{C}^{NP\times FM_t}$, where we have used the factorization of $\mathbf{G}$ defined in (20). Identifiability of the symbol, code, and channel matrices in the LS sense from factorizations (35) requires that $\mathbf{Z}^{(c,g)}$, $\mathbf{Z}^{(g,s)}$, and $\mathbf{Z}^{(s,c)}$ be full column-rank, which implies
$$\min(PFM_r, FM_rN) \ge R, \quad \text{and} \quad NP \ge R \ge FM_t. \tag{36}$$
The first inequality comes from the full column-rank requirement of $\mathbf{C} \diamond \mathbf{G}$ and $\mathbf{G} \diamond \mathbf{S}$, while the second one comes from the full column-rank requirement of $(\mathbf{S} \diamond \mathbf{C})(\boldsymbol{\Theta} \diamond \boldsymbol{\Omega})^T$. These necessary conditions are useful when one is interested in eliminating system configurations leading to a non-identifiable model. We emphasize that conditions (36) do not imply model identifiability, since they are necessary but not sufficient.
In the following, we start from Kruskal's condition for the essential uniqueness of the PARAFAC decomposition [34] and then deduce simplified conditions by considering different special cases of practical interest. Directly applied to model (19), Kruskal's condition states that $\mathbf{G}$, $\mathbf{S}$, and $\mathbf{C}$ can uniquely be estimated up to column permutation and scaling ambiguities$^{b}$ from the received data tensor $\bar{\mathcal{Y}}$ if
$$k_{\mathbf{G}} + k_{\mathbf{S}} + k_{\mathbf{C}} \ge 2(R + 1), \tag{37}$$
where $k_{(\cdot)}$ denotes the Kruskal-rank$^{c}$ of a matrix. Assume that $\mathbf{G} = \bar{\mathbf{H}}\mathbf{U}$ is full rank. If the number $N$ of symbols is large enough compared to the number $R$ of data streams, the symbol matrix $\mathbf{S}$ is likely to be full rank. Note also that the space-frequency multiplexing matrix $\mathbf{C}$ has orthogonal columns and is full rank by definition. Taking these considerations into account, Kruskal's condition can be written as [34,35]:
$$\mathrm{rank}(\mathbf{G}) + \min(N, R) + \min(P, R) \ge 2(R + 1). \tag{38}$$
We now use the fact that $\mathbf{G} = \bar{\mathbf{H}}\mathbf{U}$, with $\mathbf{U}$ given in (16), and consider particular cases leading to simplifications of (38) which are of practical relevance for the unified SFSM MIMO system. Interesting tradeoffs for joint symbol-channel-code estimation can explicitly be obtained.
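For small matrices, Kruskal's condition can be checked by brute force. The sketch below computes the Kruskal-rank (the largest $k$ such that every set of $k$ columns is linearly independent) by enumerating column subsets, which is only practical for small $R$; the function names and the tolerance are illustrative choices.

```python
import numpy as np
from itertools import combinations

def kruskal_rank(A, tol=1e-10):
    """Largest k such that every subset of k columns of A is linearly independent."""
    n_cols = A.shape[1]
    k = 0
    for size in range(1, n_cols + 1):
        subsets = combinations(range(n_cols), size)
        if all(np.linalg.matrix_rank(A[:, list(cols)], tol=tol) == size
               for cols in subsets):
            k = size
        else:
            break
    return k

def kruskal_condition(G, S, C):
    """Sufficient uniqueness condition: k_G + k_S + k_C >= 2(R + 1)."""
    R = G.shape[1]
    return kruskal_rank(G) + kruskal_rank(S) + kruskal_rank(C) >= 2 * (R + 1)

rng = np.random.default_rng(8)
I, N, P, R = 6, 10, 4, 3
G = rng.standard_normal((I, R)) + 1j * rng.standard_normal((I, R))
S = rng.standard_normal((N, R)) + 1j * rng.standard_normal((N, R))
C = rng.standard_normal((P, R))

# A matrix with two identical columns has rank 2 but Kruskal-rank 1.
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

print(kruskal_rank(B), kruskal_condition(G, S, C), kruskal_condition(G, S, C[:1, :]))
```

The last call uses a single-row code matrix ($P = 1$), whose Kruskal-rank is at most one, so the sufficient condition is no longer met for $R = 3$.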

Single-carrier transmission (F = 1)
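1. $M_r \ge M_t$. We have $\mathbf{G} = \bar{\mathbf{H}}\boldsymbol{\Omega}$. Assuming that $\bar{\mathbf{H}}$ is full column-rank and that $\boldsymbol{\Omega}$ is full rank due to its Vandermonde structure, it follows that $\mathrm{rank}(\mathbf{G}) = \mathrm{rank}(\boldsymbol{\Omega}) = \min(M_t, R)$, and (38) becomes:
$$\min(M_t, R) + \min(N, R) + \min(P, R) \ge 2(R + 1). \tag{39}$$
2. $R \ge M_t$. In this case $\boldsymbol{\Omega}$ is full row-rank due to its Vandermonde structure. Assuming that $\bar{\mathbf{H}}$ is modeled by i.i.d. entries (which corresponds to scattering-rich propagation) and thus is full rank, it follows that $\mathrm{rank}(\mathbf{G}) = \mathrm{rank}(\bar{\mathbf{H}}) = \min(M_r, M_t)$, which implies:
$$\min(M_r, M_t) + \min(N, R) + \min(P, R) \ge 2(R + 1). \tag{40}$$
These two conditions (39) and (40) have interesting practical corollaries. Assuming that the number of symbols and the code spreading factors are large enough (i.e., both $\mathbf{S}$ and $\mathbf{C}$ are full column-rank, so that $\min(N, R) = \min(P, R) = R$), they become, respectively,
$$\min(M_t, R) \ge 2 \tag{41}$$
and
$$\min(M_r, M_t) \ge 2, \tag{42}$$
and can be interpreted in the following way.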

Single-antenna transmission (M t = 1)
In this case, $\bar{\mathbf{H}} \in \mathbb{C}^{FM_r\times F}$ is full column-rank, and we have $\mathbf{G} = \bar{\mathbf{H}}\boldsymbol{\Theta}$. Moreover, considering that $\boldsymbol{\Theta}$ is full rank due to its Vandermonde structure, we have $\mathrm{rank}(\mathbf{G}) = \mathrm{rank}(\boldsymbol{\Theta}) = \min(F, R)$, which implies:
$$\min(F, R) + \min(N, R) + \min(P, R) \ge 2(R + 1). \tag{43}$$
Now, assuming that $\mathbf{S}$ and $\mathbf{C}$ are full column-rank (i.e., $N \ge R$ and $P \ge R$), condition (43) is equivalent to:
$$\min(F, R) \ge 2, \tag{44}$$
and we obtain:
Corollary 3. For $M_t = 1$, spreading across $F = 2$ subcarriers is sufficient for joint symbol-code-channel recovery, regardless of the number $R \ge 2$ of data streams, for a large enough number of symbols and large enough code spreading factors.
Note that this condition is independent of the number $M_r$ of receive antennas, which means that joint symbol-code-channel recovery is achieved even with one receive antenna. This clearly illustrates the tradeoff between frequency diversity and space diversity at the receiver, which is inherent to this trilinear PARAFAC model.

Small spreading lengths (P < R)
A different interpretation of (39) and (40) arises if $\mathbf{S}$ is full column-rank but $P < R$, i.e., the spreading length is smaller than the number $R$ of data streams. This is a challenging situation, since most multiuser receivers (as well as the single-user one) need $P \ge R$ in order to achieve multiuser interference rejection or de-spreading. In this case, for single-carrier transmissions ($F = 1$), we have $\min(N, R) = R$ and $\min(P, R) = P$, so that conditions (39) and (40) reduce, respectively, to the following ones:
$$\min(M_t, R) + P \ge R + 2 \tag{45}$$
and
$$\min(M_r, M_t) + P \ge R + 2. \tag{46}$$
The simplified condition (45) results in the following corollary:
Corollary 4. For $M_r \ge M_t \ge R$, spreading across $P = 2$ chips is sufficient for joint symbol-code-channel recovery, regardless of the number $R \ge 2$ of data streams and receive antennas.
This condition establishes a tradeoff between code diversity (spreading length) and space diversity afforded by the proposed trilinear PARAFAC modeling.
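The antenna/code tradeoffs for single-carrier transmission can be explored with a few lines of arithmetic. The helper below evaluates a Kruskal-type sufficient condition under the assumption that all factors are generic and full rank, so that each Kruskal-rank equals the smaller matrix dimension; in particular, taking the rank of the effective channel as $\min(M_r, M_t, R)$ is an assumption made here for illustration. The function names are hypothetical.

```python
def kruskal_sum_single_carrier(Mr, Mt, N, P, R):
    """Left-hand side of the Kruskal-type condition for F = 1 with generic factors."""
    rank_G = min(Mr, Mt, R)   # assumed generic rank of the Mr x R effective channel
    return rank_G + min(N, R) + min(P, R)

def is_identifiable(Mr, Mt, N, P, R):
    """True when the sufficient condition rank terms >= 2(R + 1) is met."""
    return kruskal_sum_single_carrier(Mr, Mt, N, P, R) >= 2 * (R + 1)

# Short codes (P < R): enough antennas compensate for a spreading length of 2.
short_code_ok = is_identifiable(Mr=4, Mt=4, N=10, P=2, R=4)
# A spreading length of 1 (no spreading) is not covered by the sufficient condition.
no_code = is_identifiable(Mr=4, Mt=4, N=10, P=1, R=4)
# Long codes and many symbols: two transmit and two receive antennas suffice.
long_code_ok = is_identifiable(Mr=2, Mt=2, N=10, P=8, R=8)
# With a single receive antenna the same configuration fails the test.
single_rx = is_identifiable(Mr=1, Mt=2, N=10, P=8, R=8)

print(short_code_ok, no_code, long_code_ok, single_rx)
```

Failing this test does not prove non-identifiability, since the condition is sufficient only; it merely indicates that uniqueness is not guaranteed by this argument.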

Remark 2.
When subcarrier grouping is used, receiver processing is parallelized into μ independent detection "layers", each one associated with K = F/μ subcarriers. For this reason, identifiability can be studied group-wise (i.e., what matters for identifiability is K and not F) since the results obtained for a given subcarrier group are equally valid for all the other groups.
It is worth mentioning that uniqueness conditions more relaxed than Kruskal's have been reported in [36,37], and can be applied to our PARAFAC model. For instance, it is common to assume that the symbol matrix $S$ is full column-rank for sufficiently large $N$. In this case, applying the sufficient condition derived in [37] to model (19) gives a uniqueness condition that is more relaxed than Kruskal's condition (37). In connection with [36], it is shown in [37] that this condition is valid if $G$ and $C$ are randomly sampled from an $(FM_r + P)R$-dimensional continuous distribution. In a recent work [38], a mathematical proof is provided for the case of non-random $G$ and $C$ matrices.
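To give a feel for how much the relaxed condition can gain over Kruskal's, the following sketch compares the two on one configuration. It rests on assumptions not stated in the text above: that the condition of [37] is the generic bound $R(R-1) \leq I(I-1)J(J-1)/2$ with the third factor full column-rank, applied here with $I = FM_r$ (rows of $G$), $J = P$ (rows of $C$), and $N \geq R$; and that Kruskal's condition is used in its generic-rank form. The function names are hypothetical.

```python
def relaxed_sufficient(f: int, m_r: int, p: int, r: int, n: int) -> bool:
    """Assumed form of the relaxed bound: N >= R and R(R-1) <= I(I-1)J(J-1)/2."""
    i = f * m_r  # number of rows of G
    return n >= r and r * (r - 1) <= i * (i - 1) * p * (p - 1) // 2


def kruskal_generic(f: int, m_r: int, p: int, r: int, n: int) -> bool:
    """Kruskal's bound with each k-rank replaced by its generic value."""
    return min(f * m_r, r) + min(n, r) + min(p, r) >= 2 * r + 2


# Illustrative configuration: F*M_r = 4 rows in G, P = 4 chips, N = 10 symbols.
for r in (6, 7, 8):
    print(r, kruskal_generic(2, 2, 4, r, 10), relaxed_sufficient(2, 2, 4, r, 10))
```

With these illustrative numbers, Kruskal's bound stops at $R = 6$, whereas the assumed relaxed bound still holds for $R = 7$ and $R = 8$.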

Receiver algorithm
The symbol-code-channel recovery is carried out by estimating each one of the three matrix factors $S$, $C$, and $G$ of the trilinear PARAFAC model (19) through the minimization of a nonlinear cost function. In this study, we propose the use of the ALS algorithm [20,39,40], which is the classical solution to minimize this cost function. It exploits the Khatri-Rao factorizations (21)-(23) of the unfolded matrix representations of the received signal tensor, by alternating among the estimation of $G$, $S$, and $C$. These estimates are found by optimizing the three LS criteria (49)-(51), in which $Y_i = \bar{Y}_i + B_i$, $i = 1, 2, 3$, is the noisy version of $\bar{Y}_i$, and $B_i$ is a matrix representing the additive complex-valued white Gaussian noise.$^{d}$

We can rely on the knowledge of the space and frequency spreading matrices to directly obtain an LS estimate of $H$, provided that the second inequality of (36) is satisfied, i.e., if $R \geq FM_t$. This estimate follows from (51) and (20). On the other hand, if $R < FM_t$, a unique estimate of $H$ is not guaranteed, although we can still estimate $S$, $C$, and $G$ from (49), (50), and (51), respectively.

The ALS algorithm always converges monotonically to (at least) a local minimum. Convergence to the global minimum can sometimes be slow if all the matrix factors $H$, $S$, and $C$ are unknown. Several alternative algorithms have been proposed in the literature to alleviate the slow convergence caused by a random initialization of the algorithm. For instance, an eigenanalysis solution based on compression of the tensor dimensions can be used [20]. The study of [37] proposes a generalization of the eigenanalysis solution by means of simultaneous matrix diagonalization. The convergence can also be improved by means of enhanced line search [41,42] or by using a nonlinear optimization algorithm such as the Levenberg-Marquardt algorithm [43]. The ALS algorithm converges rapidly when one of the three matrix factors of the model is known. This is typically the case in the SFSM MIMO system when relying on the knowledge of the code and spreading matrices.

After convergence of the ALS algorithm, the estimated matrix factors $\hat{S}$, $\hat{C}$, and $\hat{H}$ are affected by unknown scaling factors. In order to eliminate the scaling ambiguity from the columns of $\hat{S}$, thus leading to an unambiguous symbol recovery, we assume that "all ones" pilot symbols are introduced at the beginning of the transmission, i.e., at the first symbol of all the data streams. Mathematically speaking, this means that the first row of the symbol matrix is given by $S_{1\cdot} = [1\ 1\ \cdots\ 1] \in \mathbb{C}^{1 \times R}$. A final estimate of the symbol matrix is therefore obtained by normalizing each column of $\hat{S}$ by its first entry, i.e., by right-multiplying $\hat{S}$ by the inverse of $D_1(\hat{S})$, where $D_1(\hat{S})$ is the diagonal matrix formed from the first row of $\hat{S}$.

In principle, the ALS receiver is capable of processing a higher number of users as long as condition (38) is satisfied. Regarding the computational complexity, three matrix inverses are performed at each iteration of the algorithm. The asymptotic complexity is therefore $O(R^3)$ per iteration. Consequently, a joint detection of a very large number of users can be prohibitive. This is a common limitation of multiuser detection receivers. Note that the computational complexity can be reduced if the users' codes are mutually orthogonal. In this case, their symbol matrices can be estimated separately using (34).
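The alternating structure of the receiver and the pilot-based scaling removal can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used for the reported results: the unfolding order, the stopping rule on the raw residual norm, and the names `khatri_rao`, `als_parafac`, and `remove_scaling` are choices of this sketch and need not match (21)-(23). The toy data use a known code matrix with orthonormal Vandermonde (DFT) columns and no noise.

```python
import numpy as np


def khatri_rao(A, B):
    """Column-wise Kronecker product; result row index is a * B.shape[0] + b."""
    return (A[:, None, :] * B[None, :, :]).reshape(-1, A.shape[1])


def als_parafac(Y, R, C_known=None, max_iter=1000, tol=1e-4, seed=0):
    """Fit Y[i, n, p] ~ sum_r G[i, r] * S[n, r] * C[p, r] by alternating LS."""
    rng = np.random.default_rng(seed)
    I, N, P = Y.shape

    def random_factor(rows):
        return rng.standard_normal((rows, R)) + 1j * rng.standard_normal((rows, R))

    S = random_factor(N)
    C = random_factor(P) if C_known is None else C_known
    # Three unfoldings of the same tensor (ordering is an assumption of this sketch).
    Y1 = Y.reshape(I, N * P)
    Y2 = Y.transpose(1, 0, 2).reshape(N, I * P)
    Y3 = Y.transpose(2, 0, 1).reshape(P, I * N)
    prev_err = np.inf
    for t in range(1, max_iter + 1):
        G = Y1 @ np.linalg.pinv(khatri_rao(S, C).T)      # LS update of G
        S = Y2 @ np.linalg.pinv(khatri_rao(G, C).T)      # LS update of S
        if C_known is None:
            C = Y3 @ np.linalg.pinv(khatri_rao(G, S).T)  # LS update of C
        err = np.linalg.norm(Y1 - G @ khatri_rao(S, C).T)
        if abs(prev_err - err) < tol:                    # |e(t) - e(t-1)| < tol
            break
        prev_err = err
    return G, S, C, t


def remove_scaling(S_hat):
    """Divide each column by its first-row entry (all-ones pilot row assumed)."""
    return S_hat / S_hat[0, :]


# Toy, noiseless example with known codes (all sizes are illustrative).
rng = np.random.default_rng(1)
I, N, P, R = 4, 10, 8, 3
S_true = (rng.choice([-1, 1], (N, R)) + 1j * rng.choice([-1, 1], (N, R))) / np.sqrt(2)
S_true[0, :] = 1  # pilot row of ones
G_true = rng.standard_normal((I, R)) + 1j * rng.standard_normal((I, R))
C_true = np.exp(-2j * np.pi * np.outer(np.arange(P), np.arange(R)) / P) / np.sqrt(P)
Y = np.einsum("ir,nr,pr->inp", G_true, S_true, C_true)

G_hat, S_hat, _, n_iter = als_parafac(Y, R, C_known=C_true)
S_final = remove_scaling(S_hat)
print(n_iter, np.max(np.abs(S_final - S_true)))
```

Because the codes are known and orthonormal in this toy setting, there is no permutation ambiguity and the pilot row suffices to remove the column scaling. With unknown codes, the same loop also updates `C` and typically needs many more iterations.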

Simulation results
We simulated a system operating at a transmission rate of $R_c = 1/T_c = 4.096 \times 10^6$ chips per second (cps), using a total of $F = 64$ subcarriers divided into $\mu$ groups of $K$ subcarriers each. Note that $F = 64$ is a fixed parameter, while $K$ is a transmission design parameter (now representing the frequency spreading length) that will be varied in our simulations. Due to subcarrier grouping, at each symbol period, $R$ symbols belonging to $R$ different data streams are transmitted using $\mu$ groups of $K$ subcarriers. Unless stated otherwise, we assume the transmission of $N = 10$ symbols per data stream. In order to avoid interference between adjacent subcarriers, a guard interval in the form of a cyclic prefix (CP) is appended to each OFDM symbol [5]. Perfect time and frequency synchronization is assumed. Table 2 summarizes the SFSM MIMO system parameters. At each run, the transmitted symbols are randomly drawn from a quaternary phase shift keying (QPSK) alphabet. The channel is assumed quasi-static, which means that the channel impulse responses do not change during the $N$ symbol periods.

Each plotted bit error rate (BER) curve is shown as a function of an overall SNR measure defined with respect to the additive noise tensor $\mathcal{B} \in \mathbb{C}^{F \times M_r \times N \times P}$, whose entries are circularly symmetric complex Gaussian random variables. Note that this SNR measure takes all the received signal dimensions into account, i.e., the number $F$ of subcarriers, the number $M_r$ of receive antennas, the number $N$ of symbol periods, and the spreading length $P$. At each run, the additive noise power is set according to this SNR measure. The BER curves represent the performance averaged over the $R$ transmitted data streams and 1,000 independent Monte Carlo runs.

We adopt two frequency-selective channel models for the channel between each pair of transmit and receive antennas. Both are ITU's outdoor-to-indoor models, and are valid for typical urban propagation environments: (i) the 4-ray pedestrian channel A and (ii) the 6-ray pedestrian channel B [44]. The channel parameters are summarized in Tables 3 and 4. Note that, for channel A, the maximum multipath delay is $\tau_{\max} = 410$ ns, so that the maximum channel impulse response memory is $L_{\max} = \tau_{\max}/T_c \approx 2$ chip samples. We chose a CP length of 5 chips when considering channel A. For channel B, the maximum multipath delay is $\tau_{\max} = 3700$ ns, so that the maximum channel impulse response memory is $L_{\max} = \tau_{\max}/T_c \approx 15$ chip samples. We chose a CP length of 20 chips when channel B is simulated.
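The channel memory figures quoted above follow from the chip rate and the maximum delays. The short sketch below reproduces them, assuming that $L_{\max}$ is obtained by rounding $\tau_{\max}/T_c$ to the nearest integer (the text does not state the rounding rule); the function name `channel_memory` is hypothetical.

```python
CHIP_RATE = 4.096e6      # R_c in chips per second, from the simulation setup
T_C = 1.0 / CHIP_RATE    # chip period in seconds (about 244 ns)


def channel_memory(tau_max_seconds: float) -> int:
    """Channel memory in chips, rounding tau_max / T_c to the nearest integer."""
    return round(tau_max_seconds / T_C)


l_max_a = channel_memory(410e-9)    # pedestrian channel A
l_max_b = channel_memory(3700e-9)   # pedestrian channel B
cp_len_a, cp_len_b = 5, 20          # CP lengths chosen in the text

print(l_max_a, l_max_b)
print(cp_len_a >= l_max_a, cp_len_b >= l_max_b)
```

Both CP lengths exceed the corresponding channel memory, which is what the guard interval requires.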
In the following simulation results, the maximum number of iterations allowed for the ALS algorithm is fixed to 1,000. Thus, for each Monte Carlo run, we assume that the algorithm has converged at the $t$th iteration when $|e(t) - e(t-1)| < 10^{-4}$ for $t \leq 1000$, where $e(t)$ is the error between the received signal tensor and its reconstructed version obtained from the estimated matrices $\hat{S}(t)$, $\hat{C}(t)$, and $\hat{H}(t)$. By exploiting the knowledge of the spreading codes, convergence is typically achieved within a few iterations. In a more challenging situation where the spreading codes are unknown, the convergence speed is much slower. In this situation, we make use of eigenanalysis to initialize the ALS algorithm [20], and we have discarded 1% of the total number of runs for the BER calculation, corresponding to inevitable non-convergent runs, typical in ALS-type algorithms due to their sensitivity to initialization [40]. As an illustrative example, we have simulated a system with $M_t = M_r = 2$, $K = 2$, $P = 8$, $N = 10$, $R = Q = 8$ (i.e., $R^{(q)} = 1$, $q = 1, \ldots, Q$) and SNR = 30 dB. For this system configuration, Figure 3 depicts a histogram of the required number of iterations for convergence of the ALS algorithm. The histogram was based on 100 Monte Carlo runs. In this example, 92% of the runs have converged within the first 1,000 iterations.

Semi-blind ALS versus ZF receivers
The following simulation results illustrate the performance of the SFSM MIMO system using the ALS receiver. One of the objectives is to evaluate the channel estimation accuracy as a function of the SNR.
All the simulations were performed assuming $F = 64$ subcarriers divided into groups of $K = 2$ or $K = 4$ subcarriers. In Figure 4, we compare the performance of the semi-blind ALS receiver with that of the ZF receiver described in Section 4, which assumes perfect channel and code knowledge and serves as a reference. Our aim is to determine the performance loss due to semi-blind receiver processing in the SFSM MIMO system. We assume $M_t = M_r = 2$, $K = 2$, $P = 8$, $N = 10$, $Q = 4$, and $R^{(q)} = 2$, $q = 1, \ldots, 4$. We can observe that the performance loss of the proposed receiver in comparison with the perfect ZF receiver is around 5 dB for channel A and 2 dB for channel B, for a BER equal to $10^{-3}$. In particular, the slope of the BER curves is approximately the same, which means that the proposed receiver presents the same BER improvement as the ZF receiver as a function of the SNR. Also, both receivers perform better with channel B due to the increased multipath diversity.

Performance for different system loads
The next results illustrate the performance of the proposed receiver for different system loads. From now on, the ITU channel B is considered in all the simulations. We assume $M_t = 2$, $K = 2$, $P = 16$, and $N = 20$, while the number of users is varied ($Q = 4$, 6, and 8). Each user transmits two data streams ($R^{(q)} = 2$, $q = 1, \ldots, Q$). We assume $M_r = 1$ or 2. Note that these configurations are challenging in terms of receiver spatial diversity, since $M_r$ is always smaller than $Q$. Our aim is, therefore, to show that joint symbol-channel-code estimation is still possible in this situation thanks to the joint use of SFSM and PARAFAC modeling. Note that the sufficient uniqueness condition (38) is satisfied in the chosen configurations. In fact, as can be observed from Figure 5, semi-blind recovery of symbols, channel, and codes is achieved even when $M_r = 1$. For instance, with $M_r = 2$ receive antennas, increasing the number of users from $Q = 4$ to $Q = 6$, or from $Q = 6$ to $Q = 8$, implies nearly a 2-dB increase in the required SNR for a target BER of $10^{-2}$. We can also note that the BER performance is more sensitive to a variation in the system load when $M_r = 2$ receive antennas are used.

Comparison with the MCDS-CDMA system
The MCDS-CDMA system is a multicarrier extension of the classical DS-CDMA to frequency-selective channels, by performing the spreading operation in the time-domain at each subcarrier [7]. As shown in Section 3.3, the PARAFAC modeling is also valid to model the MCDS-CDMA system, which is a special case of the proposed SFSM MIMO system, where space and frequency spreadings are not used (i.e., $M_t = 1$ and $K = 1$). We now compare the performance of both systems using the same PARAFAC-based ALS receiver with knowledge of the spreading codes. The perfect ZF receiver was also simulated for both systems as a reference for comparison. By comparing SFSM with MCDS-CDMA, we can verify the impact of space and frequency spreadings as a distinguishing feature of the SFSM MIMO system. Here, we assume $M_r = 2$, $P = 8$, $N = 50$, and $Q = 8$, each user transmitting a single data stream (i.e., $R^{(q)} = 1$, $q = 1, \ldots, 8$). Figure 6 shows the substantial performance gain obtained with the proposed system, which corroborates the advantages of space and frequency spreadings. We can also note that the gap between ALS and ZF receivers is smaller when SFSM MIMO is used.

Comparison with the SSSMA system
In [10], an MC-SSSMA system was proposed to provide space and frequency diversities in the forward link of a MIMO wireless system. The space-frequency spreading model proposed therein is a generalization of [45] to frequency-selective channels. The multicarrier SSSMA system has some similarity with the proposed SFSM MIMO system in the sense that space- and frequency-domain spreadings are performed. In [10], a joint space-time spreading is used by means of Hadamard codes (its structure is detailed in [11]), while our approach uses separate space and frequency spreadings using Vandermonde codes. In Figure 7, the performances of SSSMA and SFSM MIMO are compared. We assume $M_t = 2$ transmit antennas, $M_r = 1$ or 2, $F = 64$, and $K = 2$. For a fair comparison, we adjust the transmit parameters and the modulation to keep the same data rate for both systems. The SSSMA scheme assumes $R = 8$, $P = 2$, and BPSK. For the proposed SFSM scheme, we have $R = 4$, $P = 4$, and 16-QAM. In this case, both schemes have a rate of 2 bits per channel use. For the SSSMA system, a ZF receiver with perfect channel knowledge is used. For the proposed SFSM MIMO system, a semi-blind estimation without channel knowledge is used. The spreading codes are assumed to be known at the receiver for both systems. Note that for $M_r = 1$, SSSMA exhibits a poor performance. This is due to the fact that multiuser detection in the SSSMA system requires $M_r \geq M_t$. This constraint is not necessary in the SFSM MIMO system, which makes efficient use of the frequency diversity to separate the transmitted data streams when spatial diversity is not available at the receiver. For $M_r = 2$, SSSMA outperforms SFSM MIMO over the low-to-medium SNR range. For higher SNR values, the proposed system has superior performance. The slope of the BER curves indicates that the proposed SFSM scheme has a higher diversity gain.
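The equal-rate claim can be checked with a simple accounting. The sketch below assumes that the rate is $R \log_2(M) / (PK)$ bits per channel use, where $M$ is the constellation size; this formula is an assumption of the sketch, chosen because it reproduces the stated 2 bits per channel use for both schemes, and it is not given in the text. The function name `bits_per_channel_use` is hypothetical.

```python
from math import log2


def bits_per_channel_use(n_streams: int, spreading_len: int,
                         group_size: int, constellation_size: int) -> float:
    """Assumed rate accounting: R * log2(M) bits spread over P chips and K subcarriers."""
    return n_streams * log2(constellation_size) / (spreading_len * group_size)


# SSSMA: R = 8, P = 2, K = 2, BPSK (M = 2)
rate_sssma = bits_per_channel_use(8, 2, 2, 2)
# SFSM: R = 4, P = 4, K = 2, 16-QAM (M = 16)
rate_sfsm = bits_per_channel_use(4, 4, 2, 16)

print(rate_sssma, rate_sfsm)
```

Under this accounting, halving the number of streams and doubling the spreading length in SFSM is offset by the fourfold increase in bits per symbol of 16-QAM relative to BPSK.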

Channel estimation performance
The channel estimation accuracy of the semi-blind ALS receiver is now evaluated from a root mean square error (RMSE) measure obtained from 100 Monte Carlo runs. The overall RMSE is computed over these runs, where $\hat{H}(i)$ denotes the channel matrix estimated at the $i$th Monte Carlo run. The following system configuration is considered for the SFSM MIMO system: $Q = 1$, $M_t = 2$, $P = K = 2$, $R = 4$, $N = 10$, and $M_r = 1$ or 2. We can observe from Figure 8 that the RMSE decreases linearly as a function of the SNR in both cases. Using $M_r = 2$ antennas provides a performance gain of 3 dB over the single receive antenna case. Such a gain comes from the increased receiver spatial diversity, which helps the separation of the data streams despite the larger number of parameters to estimate.

Conclusion
We have proposed a unified tensor model for MIMO communication systems with SFSM. The proposed model unifies several existing multiple-access/multiple-antenna communication systems. We have shown that the received signal can be formulated as a trilinear PARAFAC model and, capitalizing on its uniqueness property, we have derived lower bounds on the design parameters (number of transmit/receive antennas, subcarriers, symbols per data stream, and spreading length) for a joint symbol-code-channel recovery. The obtained conditions help the understanding of the existing tradeoffs involving space, frequency, and code diversities that are inherent to the SFSM MIMO system. The performance of the proposed receiver using a semi-blind ALS algorithm has been illustrated by means of computer simulations under realistic channel models and system parameters, and a comparison with other multiple-antenna CDMA-based systems has been made. Perspectives of this work include an investigation of the impact of different transmit antenna, spreading code, and subcarrier allocation schemes on the design and performance of the proposed tensor-based receiver. We believe that these features could be integrated into the SFSM system by modeling the received signals using a CONFAC tensor model [18]. In this case, identifiability can be investigated using the recently established results on the partial uniqueness of constrained tensor decompositions [46,47]. The impact of imperfect user synchronization on the receiver performance is also a subject for future work.

Endnotes
a For notational convenience, we omit the noise terms in the following developments. They will be added later, when the receiver algorithm is presented.
b This means that any alternative triplet $\{\tilde{G}, \tilde{S}, \tilde{C}\}$ satisfying model (19) is related to the true triplet $\{G, S, C\}$ by the following equalities: $\tilde{G} = G \Pi \Delta_1$, $\tilde{S} = S \Pi \Delta_2$, $\tilde{C} = C \Pi \Delta_3$, where $\Pi$ is a permutation matrix and $\Delta_i$, $i = 1, 2, 3$, are diagonal (scaling) matrices such that $\Delta_1 \Delta_2 \Delta_3 = I_R$.
c The Kruskal-rank of $A$ is equal to $\kappa$ if $\kappa$ is the largest integer such that every subset of $\kappa$ columns of $A$ is linearly independent.
d See [20,40] for further details about the ALS algorithm.