EURASIP Journal on Applied Signal Processing 2002:3, 275–288
© 2002 Hindawi Publishing Corporation

Low Complexity Receiver Structures for Space-Time Coded Multiple-Access Systems

Multiuser detection for space-time coded synchronous multiple-access systems in the presence of independent Rayleigh fading is considered. Under the assumption of quasi-static fading, it is shown that optimal (full diversity-achieving) space-time codes designed for single-user channels can still provide full diversity in the multiuser channel. The joint optimal maximum likelihood multiuser detector, which can be implemented with a Viterbi-type algorithm, is derived for such space-time coded systems. Low complexity, partitioned detector structures that separate the multiuser detection and space-time decoding into two stages are also developed. Both linear and nonlinear multiuser detection schemes are considered for the first stage of these partitioned space-time multiuser receivers. Simulation results show that these latter methods achieve performance competitive with the single-user bound for space-time coded systems.


1. INTRODUCTION
Most previous work in space-time coded systems has been concerned with single-user channels [1,2,3]. For example, a performance criterion for single-user space-time code construction was given in [3]. In this paper, multiuser detection for space-time coded multiple-access systems is considered. We first investigate the design of space-time codes for multiple-access channels subject to quasi-static, independent Rayleigh fading. We show that the code design criterion derived for the single-user channel in [3] can still be used in this multiuser case. In particular, we show that diversity-achieving codes in single-user channels are capable of providing the full diversity in such multiuser channels. We also consider detection and decoding of space-time coded multiuser systems. As we will see, the joint maximum likelihood (ML) decoder for such systems has prohibitively large computational complexity, motivating us to consider low-complexity, sub-optimal detector structures. In particular, we propose partitioned space-time multiuser detectors that separate the multiuser detection and space-time decoding into two stages. We consider both linear and nonlinear schemes for the first stage of the partitioned receiver and examine the performance of these detectors.
Low complexity multiuser receiver structures for space-time coded systems have previously been described in [4,5]. For example, a multi-stage receiver suitable for a code-division multiple-access (CDMA) system employing both turbo and space-time coding was proposed in [5]. However, [5] considers only space-time block coding, whereas we are concerned here with space-time trellis coded multiuser systems. Receiver structures for multiple-access systems with both space-time block and trellis coded systems have been presented in [4]. In particular, [4] has proposed an iterative multiuser receiver based on interference cancellation and minimum mean square error (MMSE) filtering for a space-time trellis coded CDMA system. Though this receiver has the same basic structure as one of the partitioned receivers we propose in Section 4 below, there are some important differences between the two schemes, as we will point out in Section 4.
This presentation is organized as follows: in Section 2, we present our signal model, while in Section 3, we derive the jointly optimal ML detector/decoder. In Section 4, we propose low complexity receiver structures for space-time coded multiuser systems by separating the multiuser detection stage from the space-time decoder stage. We consider both linear and nonlinear multiuser detection stages.
In particular, in Section 4, we consider partitioned space-time multiuser receivers based on the linear decorrelator [6] and on the linear MMSE estimator [6], as well as two partitioned receiver structures based on nonlinear interference cancelling multiuser detection stages. We propose an iterative receiver structure based on the turbo principle, similar to that developed in [7] for a convolutionally coded CDMA channel, and demonstrate how near-single-user performance is achievable with only a few iterations. Section 5 details a soft-input/soft-output (SISO) maximum a posteriori probability (MAP) decoder [8] that can be used as the second stage of these interference cancelling receivers. Finally, in Section 6 we give performance results for the proposed receiver architectures for representative space-time trellis codes.

2. SIGNAL MODEL
Consider a system of K independent users, each employing an independent space-time code with N_T transmitter antennas. The binary information sequence {d_k(n)}, n = 0, 1, ..., of user k, for k = 1, ..., K, is first encoded by a space-time encoder, and then the encoded data is divided into N_T streams by passing it through a serial-to-parallel converter. The code bits in each parallel stream are block interleaved, BPSK symbol-mapped, modulated by an appropriate signature waveform, s_k(t), and are transmitted simultaneously from the N_T transmitter antennas.
The kth user's transmitted signal at time t can thus be written as in (1), where {b_{k n_T}(i) ∈ {+1, −1}}, i = 0, ..., B − 1, is the symbol-mapped space-time encoder output of the kth user on transmitter antenna n_T at time i, and B is the number of channel symbols per user in a data frame, which is assumed to be the same as the length of a space-time codeword. We assume that the signature waveform of each user is supported only on the interval 0 ≤ t ≤ T, and is normalized as in (2). Thus, A_k^2 represents the transmitted energy per bit of user k, independent of the number of transmitter antennas. Note that the model of (1) is otherwise general with regard to the signalling format, and so the following results can be applied to any signalling scheme. However, we are interested here in nonorthogonal signalling schemes such as code-division multiple-access (CDMA).
Assuming that the fading is sufficiently slow to be constant over a received data frame, the corresponding signal received at a single receive antenna can be written as in (3), where n(t) is complex white Gaussian noise with zero mean and variance N_0/2 per dimension. The complex fading coefficient, α_{k,n_T}, between the kth user's n_T-th transmitter antenna and the receiver, is assumed to be a zero-mean unit variance complex Gaussian random variable with independent real and imaginary parts. Equivalently, α_{k,n_T} has uniform phase and Rayleigh amplitude; that is, the so-called Rayleigh fading model. These fading coefficients are assumed to be mutually independent with respect to both k and n_T. In what follows, we assume that all parameters of the model (3) are known to the receiver. Only the transmitted symbols are unknown.

3. JOINT ML MULTIUSER DETECTION AND DECODING FOR SPACE-TIME CODED MULTIUSER SYSTEMS
In this section, we consider the joint maximum-likelihood detection and decoding of the symbols in the model of Section 2. To do so, we first establish some notation. We denote the kth user's transmitted symbol vector (on N_T antennas) at time i by the row vector b_k(i), and the kth user's codeword by D_k, for k = 1, ..., K. Note that D_k ∈ {+1, −1}^{B×N_T}, for k = 1, ..., K. We will call the joint codeword, D, of all users, the super codeword. The space-time coded output from all the users at time i is the K × KN_T matrix denoted as D(i). The fading coefficients of the kth user can be collected into a vector α_k, and we can combine all these fading coefficient vectors into one vector α. With this notation, the output y(i) of a bank of K matched filters (each matched to a user signature waveform s_k(t)) at the ith symbol interval can be written as y(i) = R A D(i) α + η(i), (7) where A is a diagonal matrix, R is the cross-correlation matrix of the users' signature waveforms and η(i) ∼ 𝒩(0, N_0 R). Denote the B-vector of the kth matched filter outputs corresponding to the complete received codeword as y_k = [y_k(0) ··· y_k(B − 1)]^T and the BK-vector of outputs of all the matched filters corresponding to a complete codeword as y = [y_1^T ··· y_K^T]^T. Then we can write y as in (9) and (10), where η ∼ 𝒩(0, N_0 R ⊗ I_B), I_B denotes the B × B identity matrix and ⊗ denotes the Kronecker product. In going from (9) to (10), we have used the fact that, for general matrices A, B, C, and D, we have (A ⊗ B)(C ⊗ D) = (AC ⊗ BD), provided the dimensions of the matrices A, B, C, and D are such that the various matrix products are well defined [9]. The joint ML multiuser decision rule for the space-time coded CDMA system is then obtained by maximizing the likelihood of y, where the maximization is over all the valid super codewords. Note that this joint ML detector and decoder searches over a super trellis made up by combining all the users' space-time code trellises.
Next, we investigate the diversity gain of the space-time codes in the multiple-access channel. We show that space-time codes designed to achieve full diversity in single-user channels will also be able to achieve full diversity asymptotically, as the noise N_0 vanishes, in the multiuser case.
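The matched-filter model (7) can be sketched numerically as follows. This is a minimal illustration, not the authors' simulation code: it assumes A = diag(A_1, ..., A_K) with no antenna-dependent scaling, omits the noise term, and all function names and numerical values are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def matvec(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def symbol_matrix(symbol_rows):
    """Build the K x (K*N_T) matrix D(i) with b_k(i) in the kth diagonal block."""
    K, n_t = len(symbol_rows), len(symbol_rows[0])
    D = [[0] * (K * n_t) for _ in range(K)]
    for k, row in enumerate(symbol_rows):
        D[k][k * n_t:(k + 1) * n_t] = list(row)
    return D


def matched_filter_output(R, amplitudes, symbol_rows, alpha):
    """Noise-free part of (7): R A D(i) alpha (A assumed to be diag of amplitudes)."""
    K = len(amplitudes)
    A = [[amplitudes[k] if j == k else 0 for j in range(K)] for k in range(K)]
    return matvec(matmul(matmul(R, A), symbol_matrix(symbol_rows)), alpha)


# Two users, two transmit antennas each, cross-correlation 0.4 (illustrative values).
R = [[1.0, 0.4], [0.4, 1.0]]
alpha = [1, 1j, 0.5, 0.5]           # stacked fading coefficients of users 1 and 2
y = matched_filter_output(R, [1.0, 1.0], [(+1, -1), (+1, +1)], alpha)
print(y)
```

Each matched-filter output mixes the desired user's fading-modulated symbol vector with a cross-correlation-weighted copy of the other user's, which is the multiple-access interference the later receivers try to remove.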
For convenience, we define the following quantity: Suppose that the super codeword D is transmitted. Then, conditioned on α, the pairwise error probability (PEP) that the ML decision rule erroneously decides in favor of another valid super codeword D′ is given by where the random variable V is given by It can be shown that V is a Gaussian random variable with where H = ARA and Thus, letting E = D − D′, we have where the last step follows from the standard approximation of the unit Gaussian tail distribution. Note that we have introduced the notation Clearly, Γ is Hermitian. Hence we can decompose it as where U is an N_T K × N_T K unitary matrix, whose columns are orthonormal eigenvectors of Γ, and Σ = diag(λ_1, ..., λ_{N_T K}), where {λ_i}, i = 1, ..., N_T K, are the eigenvalues of Γ, which are all nonnegative. Exactly N_T K − r of these eigenvalues will be zero, where r is the rank of Γ.
Substituting (19) into the upper bound for the conditional pairwise probability (conditioned on the fading coefficients) given in (17), we obtain the following: where, Since α is a vector of independent Gaussian random variables with identical distributions, that is, 𝒩(0, I_{N_T K}), and since U is an N_T K × N_T K unitary matrix, it then follows that β is also an independent Gaussian random vector having the same distribution, that is, β ∼ 𝒩(0, I_{N_T K}). Thus, the |β_i|'s are independent and identically distributed (i.i.d.) Rayleigh random variables. By averaging (21) over the |β_i|'s, we can upper bound the PEP of the joint ML decision rule as To investigate the dependence of the above upper bound on the individual user space-time codes, we proceed as follows.
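The averaging step over the Rayleigh amplitudes can be made explicit with a short worked calculation. This is a sketch under an assumption: the conditional bound is taken to be a product of exponentials of the form exp(−c λ_i |β_i|²), where c > 0 is a placeholder constant proportional to 1/N_0 whose exact value depends on the conditional bound and is not fixed here.

```latex
% For \beta_i \sim \mathcal{CN}(0,1), the variable x = |\beta_i|^2 is exponential with unit mean:
\mathbb{E}\!\left[e^{-c\lambda_i|\beta_i|^2}\right]
  = \int_0^\infty e^{-c\lambda_i x}\,e^{-x}\,dx
  = \frac{1}{1 + c\lambda_i}.
% By independence of the |\beta_i|, and keeping only the r nonzero eigenvalues:
\mathbb{E}\!\left[\prod_{i=1}^{N_T K} e^{-c\lambda_i|\beta_i|^2}\right]
  = \prod_{i=1}^{r}\frac{1}{1 + c\lambda_i}
  \le \Bigl(\prod_{i=1}^{r}\lambda_i\Bigr)^{-1} c^{-r}.
```

Since c grows like 1/N_0, the averaged bound decays as N_0^r, which is why the rank r of Γ plays the role of the diversity order and the product of its nonzero eigenvalues that of the coding gain.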
If we assume that all the users are assigned different signature waveforms, then the cross-correlation matrix R will be symmetric and positive definite. In this case, by performing a Cholesky decomposition we can find a K × K lower triangular matrix L such that R = L^H L. Then we also have R ⊗ I_B = L_0^H L_0, (26) where we have set L_0 = L ⊗ I_B. Since L_0 is a lower triangular matrix, (26) defines the Cholesky factorization for the matrix R ⊗ I_B. Substituting (26) into the matrix Γ we get Γ = (L_0 (A ⊗ I_B) E)^H (L_0 (A ⊗ I_B) E). We define the BK × BK matrix Γ̃ = L_0 (A ⊗ I_B) E E^H (A ⊗ I_B) L_0^H. Now we make the observation that if B < N_T, then the eigenvalues of (L_0 (A ⊗ I_B) E)^H (L_0 (A ⊗ I_B) E) are the same as the eigenvalues of (L_0 (A ⊗ I_B) E)(L_0 (A ⊗ I_B) E)^H together with (N_T − B)K additional zero eigenvalues, while if B ≥ N_T the roles of the two products are reversed. In either case, the nonzero eigenvalues of the N_T K × N_T K matrix Γ will be the same as the nonzero eigenvalues of the BK × BK matrix Γ̃. We now make use of the following theorem due to Ostrowski (cf. [10, page 224]).
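For reference, Ostrowski's theorem is commonly stated in the following textbook form. It is given here as an assumption about the version being invoked; the symbols θ_k and the increasing ordering of eigenvalues follow that standard statement.

```latex
\textbf{Theorem 1 (Ostrowski).}
Let $A$ be an $n \times n$ Hermitian matrix and let $S$ be an $n \times n$
nonsingular matrix. Arrange the eigenvalues of $A$, $SAS^{H}$, and $SS^{H}$
in increasing order. Then, for each $k = 1, \ldots, n$, there exists a
positive real number $\theta_k$ such that
\[
  \lambda_{1}(SS^{H}) \;\le\; \theta_k \;\le\; \lambda_{n}(SS^{H})
  \qquad\text{and}\qquad
  \lambda_k(SAS^{H}) \;=\; \theta_k\,\lambda_k(A).
\]
```

In words, congruence by a nonsingular S rescales each eigenvalue of A by a factor lying between the smallest and largest eigenvalues of S S^H, so zero eigenvalues stay zero and the rank is preserved.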
We may apply the above theorem to (29), with S replaced by L_0 and A replaced by (A ⊗ I_B) E E^H (A ⊗ I_B). Again we note that the eigenvalues of L_0 L_0^H are the same as the eigenvalues of L_0^H L_0 = R ⊗ I_B (from (26)). Also, for an N × N matrix A and an M × M matrix B, the eigenvalues of A ⊗ B are given by λ_n(A) λ_m(B), for n = 1, ..., N and m = 1, ..., M, where {λ_n(A)} and {λ_m(B)} are the eigenvalues of A and B, respectively. Thus, if we denote the eigenvalues of the cross-correlation matrix R, in ascending order, by {λ_k(R)}, k = 1, ..., K, then the eigenvalues of R ⊗ I_B will be the same eigenvalues, each repeated B times.
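The two Kronecker-product facts used in this section, the mixed-product rule and the repetition of the eigenvalues of R in R ⊗ I_B, can be checked on small matrices. The sketch below uses plain Python lists; the helper names and the test matrices are illustrative choices, not taken from the paper.

```python
def kron(X, Y):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in xrow for y in yrow] for xrow in X for yrow in Y]


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def identity(n):
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def is_eigenpair(M, v, lam, tol=1e-12):
    """Check M v = lam v for a column vector v given as a list of 1-element rows."""
    Mv = matmul(M, v)
    return all(abs(a[0] - lam * b[0]) < tol for a, b in zip(Mv, v))


# Mixed-product property (A (x) B)(C (x) D) = (AC) (x) (BD), exact on integers.
A, Bm = [[1, 2], [3, 4]], [[0, 1], [1, 0]]
C, Dm = [[2, 0], [1, 1]], [[1, 1], [0, 1]]
mixed_ok = matmul(kron(A, Bm), kron(C, Dm)) == kron(matmul(A, C), matmul(Bm, Dm))

# A 2 x 2 correlation matrix has eigenvalues 1 + rho and 1 - rho with
# eigenvectors [1, 1] and [1, -1]; R (x) I_B repeats each eigenvalue B times.
rho, B = 0.4, 3
R = [[1.0, rho], [rho, 1.0]]
M = kron(R, identity(B))
eigen_ok = all(
    is_eigenpair(M, kron([[1], [sign]], [[1 if p == b else 0] for p in range(B)]),
                 1 + sign * rho)
    for sign in (+1, -1)
    for b in range(B)
)
print(mixed_ok, eigen_ok)
```

The eigenvector check builds, for each eigenvector u of R and each unit vector e_b of length B, the vector u ⊗ e_b, giving B independent eigenvectors per eigenvalue of R.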
For k = 1, ..., K, denote the rank of the kth user's codeword error matrix E_k = D_k − D′_k by r_k. Then the nonzero eigenvalues of the matrix (A ⊗ I_B) E E^H (A ⊗ I_B) are given by the collection Theorem 1 allows us to specify the eigenvalues of the matrix Γ̃ in terms of the eigenvalues of the individual user codeword error matrices and the eigenvalues of the cross-correlation matrix. Specifically, we may write the set of nonzero eigenvalues of the matrix Γ̃ as where λ_min(R) and λ_max(R) denote the minimum and maximum eigenvalues, respectively, of the cross-correlation matrix R.
As noted previously, since the nonzero eigenvalues of the matrix Γ are the same as the nonzero eigenvalues of the matrix Γ̃, the above set in (32) also gives the nonzero eigenvalues of Γ, which we may substitute into (23) to obtain, Next we introduce the following notation. Let f_k(·) denote the kth user's space-time encoding function. That is, f_k(I) maps an information bit sequence I ∈ {+1, −1}^{B×1} into a valid codeword D_k ∈ {+1, −1}^{B×N_T}. Define the following sets: 𝒟_k = set of all valid codewords corresponding to the kth user's space-time encoder output, for k = 1, ..., K; ℰ_k = set of all the error code matrices that affect . . .
For each valid super codeword D ∈ 𝒟, let 𝒜(D) and 𝒜_k(D) denote the set of all the valid error code matrices corresponding to D ∈ 𝒟 and the set of valid error code matrices corresponding to D that affect the kth user, respectively; that is, We may now upper bound the frame error probability, P_e^k, for the kth user as, where in the last step we have used the bound obtained in (33) and P[D] denotes the a priori probability that the super codeword D is transmitted. Next, define, for k = 1, ..., K, Note that Λ_k(E_k) may be interpreted as the coding gain of the kth user's space-time code in the multiuser channel, corresponding to the codeword pair D_k and D′_k. Then, as N_0 → 0, we can upper bound P_e^k as where r = Σ_{k=1}^{K} r_k. For small N_0, we may take the dominating terms in the last sum to be the terms with the smallest exponent r. For E ∈ 𝒜_k(D), the smallest value of r is obtained when E is such that E_j = 0_{B×N_T} for j ≠ k. In this case, it is easily seen that θ_{k,n} = 1 for n = 1, ..., r_k and thus as N_0 → 0, we get From (39) we observe that, at least when the SNR is sufficiently large, the diversity advantage offered by the kth user's space-time code is the same as that in a single-user channel. That is, if the minimum rank of all the valid error codewords E_k is r_k, then the asymptotic diversity advantage in the multiuser channel is equal to r_k. In particular, if the kth user's space-time code achieves the full diversity N_T in a single-user environment, then it will also achieve the full diversity N_T in the multiuser channel, at least asymptotically in SNR, as long as the signature cross-correlation matrix is nonsingular.
It is easily seen that the joint ML decision rule can be implemented as a maximum-likelihood path search, using the Viterbi algorithm, over a super trellis formed by combining all the users' space-time code trellises. This is similar to the optimal decoder for convolutionally coded CDMA channels derived in [11]. Assuming (for simplicity) that all the users employ space-time codes based on underlying convolutional codes that have a constraint length ν, this super trellis will have a total of 2^{K(ν−1)} states, resulting in a total complexity per user of about O(2^{Kν}/K), which is exponential in Kν. Note also that, in order to achieve full diversity gain N_T in an N_T transmitter antenna system, we must have ν ≥ N_T [3,12]. Hence, it is clear that even for a small number of users this could easily become a prohibitively large computational burden at the receiver. This motivates us to look for sub-optimal, low complexity receiver structures for space-time coded multiuser systems.
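To give a feel for how quickly the super trellis grows, the following sketch counts states, assuming (as in the text) binary-input codes of constraint length ν so that each user's trellis has 2^(ν−1) states. The function names are illustrative.

```python
def user_trellis_states(nu):
    """States in one user's trellis for constraint length nu (binary input)."""
    return 2 ** (nu - 1)


def super_trellis_states(num_users, nu):
    """States in the super trellis: one state per combination of user states."""
    return user_trellis_states(nu) ** num_users


# Constraint length 5 as in the simulated codes, for 1 to 4 users.
for K in (1, 2, 3, 4):
    print(K, super_trellis_states(K, 5))
```

With ν = 5, a single user needs 16 states, whereas four users already need 16^4 = 65536 joint states.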
In Section 4, we develop partitioned receiver structures in order to reduce the computational complexity of joint multiuser detection and space-time decoding, while still achieving competitive performance against the joint ML decision rule. Specifically, we separate the multiuser detection and the space-time decoding into two stages, as is done in [13] for the case of (single-antenna) convolutionally coded CDMA channels. At the first stage of the partitioned receiver, multiuser detection is performed. The outputs from the multiuser detection stage are then passed on to a bank of single-user space-time decoders corresponding to the K users in the system. Thus, each user's space-time decoder operates independently from the others. Of course, it is possible to employ either an ML or a maximum a posteriori probability (MAP) decoder as the single-user space-time decoder at the second stage of the receiver. Also, it is possible to use any reasonable multiuser detection strategy at the first stage of the receiver. In the following we consider both linear and nonlinear multiuser detectors as the first stage of the partitioned space-time multiuser receiver, and compare the performance of these receivers against the best possible performance.

4. PARTITIONED LOW COMPLEXITY DETECTOR STRUCTURES FOR SPACE-TIME CODED MULTIUSER SYSTEMS
We first consider the linear multiuser detector based partitioned receiver architecture, followed by the nonlinear multiuser detection approaches. For linear multiuser detectors, we investigate both decorrelator and linear MMSE detectors [6]. For nonlinear approaches we consider both a simple iterative receiver based on interference cancellation and the turbo principle, and an improved iterative receiver based on instantaneous MMSE filtering after the interference cancellation step [7].

4.1. Decorrelator based partitioned space-time multiuser receiver
The decorrelator output at the ith symbol time is given by [6] ŷ(i) = R^{-1} y(i) = A D(i) α + R^{-1} η(i), (40) where the noise term R^{-1} η(i) is distributed as 𝒩(0, N_0 R^{-1}). The first stage of the receiver computes soft outputs corresponding to each user's transmitted symbol vectors at time i. The soft outputs are the a posteriori probabilities (APPs) of each user's transmitted symbol vectors, defined as below for l = 1, ..., L, k = 1, ..., K, and i = 0, ..., B − 1, where L = 2^{N_T} is the number of possible transmitted symbol vectors: (Note that s_l is a row vector.) From (40), we can write this a posteriori probability as, where (R^{-1})_{kk} is the (k, k)th element of the matrix R^{-1}, ŷ_k(i) is the kth component of the vector ŷ(i) and C_1 is a normalizing constant.
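The soft-output computation at the decorrelator can be sketched as follows. This is a minimal illustration under stated assumptions: the kth decorrelator output is modelled as A_k s_l α_k plus complex Gaussian noise of total variance N_0 (R^{-1})_kk, the symbol vectors are taken to be equiprobable a priori, and the exact scaling of the exponent is an assumption that should be checked against the paper's expression. All names are hypothetical.

```python
import itertools
import math


def symbol_vectors(n_t):
    """All L = 2**n_t BPSK symbol vectors s_l of length n_t."""
    return list(itertools.product((+1, -1), repeat=n_t))


def decorrelator_apps(y_hat_k, amp_k, alpha_k, noise_var_k):
    """Normalized APPs of user k's symbol vectors given decorrelator output y_hat_k.

    noise_var_k plays the role of N_0 * (R^{-1})_kk (assumed scaling).
    """
    vectors = symbol_vectors(len(alpha_k))
    metrics = []
    for s in vectors:
        mean = amp_k * sum(b * a for b, a in zip(s, alpha_k))
        metrics.append(-abs(y_hat_k - mean) ** 2 / noise_var_k)
    shift = max(metrics)                      # subtract the max for numerical stability
    weights = [math.exp(m - shift) for m in metrics]
    total = sum(weights)                      # plays the role of the constant C_1
    return vectors, [w / total for w in weights]


# Noise-free observation of s = (+1, -1) through fading alpha_k = (1, 1j).
vectors, probs = decorrelator_apps(1 - 1j, 1.0, [1, 1j], 1.0)
best = vectors[probs.index(max(probs))]
print(best, probs)
```

The resulting probabilities are what the kth user's single-user space-time decoder would consume as its branch information.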
The second stage of the partitioned receiver employs a bank of single-user space-time Viterbi decoders that use these a posteriori probabilities as inputs. The kth user's decoder uses only the symbol vector probabilities corresponding to the kth user. This results in a decentralized implementation of the receiver. Clearly this partitioned receiver is equivalent to a single-user space-time coded system, except for a different noise variance value. Thus we may also easily obtain the following upper bound for the pairwise error probability of the decorrelator-based partitioned space-time multiuser receiver:

4.2. Linear MMSE based partitioned space-time multiuser receiver
As is well known, the decorrelator performance degrades when the background noise is dominant, since it completely ignores the presence of background noise [6]. A better compromise between suppressing the multiple-access interference (MAI) and the background noise is obtained by employing a linear MMSE filter at the first stage of the space-time receiver. The linear MMSE multiuser detector output at symbol time i is given by [6] The decision statistic corresponding to the kth user can then be written as, where In order to compute the soft output a posteriori probabilities at the end of the first stage, we make the assumption that the noise at the output of an MMSE multiuser detector (residual MAI plus the background noise) can be modeled as being Gaussian [14]. Therefore, we may model (45) as with η̃_k(i) ∼ 𝒩(0, ν_k^2(i)). It can be shown that Using this model, the soft output a posteriori probabilities at the output of the linear MMSE multiuser stage can be written as where C_2 is a normalizing constant. The second stage of this receiver operates in exactly the same way as that in the decorrelator-based partitioned receiver. When the second stage employs ML decoding in the kth user's space-time decoder, the pairwise error probability corresponding to the kth user's decisions, conditioned on the fading coefficients, may be given as, where the λ_{k,n}'s are the nonzero eigenvalues of the matrix (D_k − D′_k)^T (D_k − D′_k), and as mentioned before, Here |γ_{k,n}| and |α_{j,n}| are independent sets of unit-variance independent Rayleigh random variables. By averaging the right-hand side of the last inequality (50) over the joint distribution of the random variables {|γ_{k,n}|, |α_{j,n}|; j ≠ k, n = 1, ..., N_T}, we can obtain the upper bound (52) for the kth user's pairwise error probability for the linear MMSE-based partitioned space-time multiuser receiver.

4.3. Iterative MUD with interference cancellation for space-time coded CDMA
In this section, we present a simple iterative receiver structure based on interference cancellation and the turbo principle (cf. [7]). Suppose that at the first stage of the receiver, we have available a priori probabilities of all users' transmitted symbol vectors, [p_{k,l}(i)]_2^p, for l = 1, ..., L, k = 1, ..., K, and i = 0, ..., B − 1. Note that the subscript 2 and superscript p indicate that these a priori probabilities were in fact generated by the second stage of the receiver (i.e., the single-user space-time decoders) at the previous iteration. Using these a priori probabilities [p_{k,l}(i)]_2^p, the interference-cancelling multiuser detector at the first stage of the receiver computes soft estimates b̃_k(i) of the transmitted symbol vectors of all the users. These soft estimates are used to suppress the multiple-access interference at the output of the kth user's matched filter. Thus, the interference cancelled output corresponding to the kth user is obtained as the kth component of the vector in (54), where D̃_k(i) = diag(b̃_1(i), ..., b̃_{k−1}(i), 0, b̃_{k+1}(i), ..., b̃_K(i)). From (54), with ŷ_k(i) denoting the kth element of that vector, we obtain (55). Since the noise term in (55) is distributed as 𝒩(0, N_0), assuming all the previous estimates of the symbol vectors were correct, the iterative interference-cancelling space-time multiuser detector (IC-ST-MUD) computes the soft output a posteriori probabilities of the transmitted symbol vectors of user k, for k = 1, ..., K, as where ŷ_k(i) is given by (55), and C_3 is a normalizing constant. Following turbo decoding terminology, we call the term [p_{k,l}(i)]_1 the extrinsic a posteriori probability as computed by the space-time multiuser detector. These extrinsic a posteriori probabilities, [p_{k,l}(i)]_1, are de-interleaved and passed on to a bank of K single-user soft-input/soft-output (SISO) space-time MAP decoders, described in Section 5 below. The kth user's SISO space-time MAP decoder computes a posteriori probabilities of the transmitted symbol vectors for all the symbols in a given frame [7]. The extrinsic components of these symbol vector APPs, [p_{k,l}(i)]_2, are then interleaved and fed back to the first stage of the IC-ST-MUD, to be used as the a priori probabilities [p_{k,l}(i)]_2^p in the next iteration. At the final iteration, the space-time MAP decoders output hard decisions on the information symbols.
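The first-stage soft estimation and cancellation can be sketched as below. Two assumptions are made and should be read as such: the soft estimate is taken to be the prior-weighted mean of the candidate symbol vectors, and the cancelled output is taken as the kth component of y(i) − R A D̃_k(i) α with A = diag(A_1, ..., A_K). Function names and numbers are illustrative only.

```python
import itertools


def symbol_vectors(n_t):
    """All 2**n_t BPSK symbol vectors of length n_t."""
    return list(itertools.product((+1, -1), repeat=n_t))


def soft_estimate(vectors, priors):
    """Prior-weighted mean of the candidate symbol vectors (assumed form)."""
    n_t = len(vectors[0])
    return [sum(p * s[n] for s, p in zip(vectors, priors)) for n in range(n_t)]


def cancel_interference(y, R, amplitudes, soft, alpha, k):
    """kth component of y - R A D~_k alpha, where D~_k zeroes user k's own block.

    soft[j] is the soft symbol vector of user j and alpha[j] that user's fading vector.
    """
    total = y[k]
    for j in range(len(y)):
        if j == k:
            continue  # user k's own contribution is not cancelled
        faded = amplitudes[j] * sum(b * a for b, a in zip(soft[j], alpha[j]))
        total -= R[k][j] * faded
    return total


vectors = symbol_vectors(2)
print(soft_estimate(vectors, [0.7, 0.1, 0.1, 0.1]))

# Noise-free two-user example with perfect soft estimates (illustrative values):
R = [[1.0, 0.4], [0.4, 1.0]]
alpha = [[1, 1j], [0.5, 0.5]]
y = [1.4 - 1j, 1.4 - 0.4j]          # R A D(i) alpha for b_1 = (+1,-1), b_2 = (+1,+1)
cleaned = cancel_interference(y, R, [1.0, 1.0], [[1, -1], [1, 1]], alpha, 0)
print(cleaned)
```

With perfect soft estimates the cancelled output reduces to user 1's own fading-modulated symbol vector; with poor estimates, residual interference remains, which is the motivation for the additional filtering in Section 4.4.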
The conventional matched filter complexity is O(1). At each iteration, the first stage of the receiver needs to compute 2^{N_T} symbol vector a posteriori probabilities. Hence, the computational complexity of this partitioned receiver is O(2^{N_T} + 2^ν) per user per iteration. Note that even though both MAP and ML decoding have the same O(2^ν) complexity order, MAP decoding in general requires more computations than ML decoding. It has been shown that MAP decoding can be done with a complexity roughly 4 times that of ML decoding [15].

4.4. Iterative MUD with interference cancellation and instantaneous MMSE filtering for space-time coded multiuser systems
In this section, we modify the iterative receiver proposed in Section 4.3 with the addition of an instantaneous filter. This becomes similar to the iterative decoder proposed in [7] for a convolutionally coded CDMA channel. However, in the following we modify it for the space-time coded multiple-access channel and explicitly derive the form of the instantaneous MMSE filter.
As will be seen from the simulation results below, the performance of the iterative IC-ST-MUD receiver, proposed in Section 4.3, degrades considerably for medium to large cross-correlation values. Especially when the user cross-correlations are high, the soft estimates at the initial iteration are very poor and thus the performance does not improve significantly on subsequent iterations. In order to overcome this shortcoming, we may apply a linear filter to the interference suppressed output. Specifically, we choose a linear MMSE filter that minimizes the mean square error between the interference-suppressed output and the kth user's fading-modulated transmitted symbol vector. Clearly, when the soft estimates of the multiple-access interference are very poor or they are not available at all (as in the case of the first iteration), this filtering helps the receiver to still maintain an acceptable performance level, as we will see from the simulation results given in Section 6.
The kth user's linear MMSE filter at symbol time i applies weights w_k(i) to the interference-suppressed output ŷ_k(i) of (54), where w_k(i) is designed as in (57). It can easily be shown that the solution to (57) is given by where we have defined the matrix V_k(i) as Denoting the matrix (R V_k(i) R + N_0 R)^{-1} by M_k(i), we can write the instantaneous linear MMSE filter corresponding to the kth user at symbol time i as in (61). Now, we again model the residual noise at the linear MMSE filter output as having a Gaussian distribution [7,14]. Thus, we have the model (62) for z_k(i), the output of the linear MMSE filter corresponding to the kth user at symbol time i. The soft-output interference-cancelling multiuser detector with instantaneous MMSE filtering makes use of the model in (62) in order to compute the a posteriori probabilities of the transmitted symbol vectors corresponding to the kth user. Specifically, we have where µ_k(i) and ν_k^2(i) are given by (63) and C_4 is a normalizing constant.
The second stage of this modified iterative receiver is a SISO space-time MAP decoder which operates in exactly the same way as in the receiver described in Section 4.3. This decoder is described briefly in Section 5.
The K × K matrix inversion required in the first stage of this receiver can be performed iteratively using the matrix inversion lemma, similarly to what is done in [7]. Hence, the MMSE-based interference cancelling partitioned receiver has a total complexity of O(K^2 + 2^{N_T} + 2^ν) per user per iteration.
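The complexity orders quoted so far can be compared side by side. The sketch below simply evaluates the expressions inside the O(·) terms; these are order-of-growth proxies that ignore constant factors (for example, the roughly fourfold cost of MAP over ML decoding), so the numbers indicate scaling only. Function names are illustrative.

```python
def joint_ml_order(num_users, nu):
    """Order of the joint ML detector per user: 2^(K*nu) / K."""
    return 2 ** (num_users * nu) / num_users


def ic_order(n_t, nu):
    """Order of the interference-cancelling receiver per user per iteration."""
    return 2 ** n_t + 2 ** nu


def ic_mmse_order(num_users, n_t, nu):
    """Order of the IC receiver with instantaneous MMSE filtering."""
    return num_users ** 2 + 2 ** n_t + 2 ** nu


K, N_T, NU = 4, 2, 5  # four users, two transmit antennas, constraint length 5
print(joint_ml_order(K, NU), ic_order(N_T, NU), ic_mmse_order(K, N_T, NU))
```

For these values the joint ML proxy is 262144, against 36 and 52 for the two iterative receivers, and only the first grows exponentially with the number of users.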
As mentioned earlier, a similar type of iterative receiver has been proposed in [4] for a space-time trellis coded CDMA system. Specifically, [4] has also proposed a partitioned iterative receiver where an instantaneous MMSE linear filter is applied to suppress the residual multiple-access interference (MAI) and noise present in the interference cancelled channel outputs. However, the MMSE filter employed in that receiver is a spatial MMSE filter, in that the MAI is cancelled by exploiting the receiver diversity. In contrast, we exploit the knowledge of the temporal structure of the multiuser signal in our MMSE filter and, in particular, do not rely on the availability of receiver diversity. It is also worth mentioning that the soft information used in the iterative process is different in the two schemes.
As a consequence of relying on a spatial filter to suppress the multiple-access interference, the receiver proposed in [4] requires N_R > N_T K, where N_R is the number of receiver antennas, for proper interference suppression. Even at a base station with multiple receiver antennas, this criterion could be difficult to satisfy. Our proposed detector does not rely on spatial diversity since it exploits only the multiuser signal structure, which is more likely to be available at a base station receiver, in order to suppress the interference. Also, as a result, the complexity of the MMSE filter in [4] is dominated by the inversion of an N_R × N_R matrix, which can be done with O(N_R^2) complexity; since N_R > N_T K, this exceeds O(N_T^2 K^2). Comparing with our approach, we see that we only need to compute the inverse of a K × K matrix which, as mentioned above, can be done with O(K^2) complexity.
In Table 1 we have summarized the properties of the different receiver structures considered in this paper.

5. SINGLE-USER SOFT-INPUT/SOFT-OUTPUT SPACE-TIME MAP DECODER
We assume that the space-time encoder of each user appends ν − 1 zero bits to a given information bit block of size B′, so that the trellis is always terminated in the zero state. Thus, the actual space-time code block length is B = B′ + ν − 1 (since we assume that the rate of the space-time code is 1), where ν is the constraint length of the underlying convolutional code.
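As a worked instance of the block-length relation, take the values used in the simulations of Section 6 (an information block of 128 bits and constraint length ν = 5), reading the 128-bit block as B′:

```latex
B \;=\; B' + \nu - 1 \;=\; 128 + 5 - 1 \;=\; 132,
\qquad
\frac{B'}{B} \;=\; \frac{128}{132} \;\approx\; 0.97 .
```

So trellis termination costs about 3% of the frame in this configuration.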
In this section, we use the MAP decoding algorithm [8] to compute the a posteriori probabilities of all the symbol vectors and the information bits.
Similarly to the notation in [7], we will denote the state of the space-time trellis at time i by a (ν − 1)-tuple, as , where d_i is the input information bit to the space-time encoder at time i. The corresponding output code symbol vector is denoted by b_i. (Note that here we are using the subscripts to denote the time index.) Let d(s′, s) be the input information bit that causes the state transition from S_{i−1} = s′ to S_i = s and b(s′, s) be the corresponding output bit vector, which is of length N_T.
Define the forward and backward recursions [8] as in (65). Initial conditions for (65) are given as α_0(0) = 1, α_0(s ≠ 0) = 0, β_τ(0) = 1, and β_τ(s ≠ 0) = 0. The summations are over all the states s′ where the state transition (s′, s) is allowed in the code trellis. Normalization of the forward and backward variables is done as in [7] to avoid numerical instabilities, though we do not elaborate on it here. For each l, consider the set of state pairs (s′, s) such that the output symbol vector corresponding to this transition is s_l. The SISO ST MAP decoder of user k updates the a posteriori symbol vector probabilities as in (66). Again, the extrinsic part of the above a posteriori symbol vector probability, [p_{k,l}(i)]_2, is interleaved and fed back to the interference-cancelling space-time multiuser detector, to be used as the a priori probability [p_{k,l}(i)]_2^p in the next iteration. In the final iteration the SISO ST MAP decoder also computes the a posteriori log-likelihood ratio (LLR) of the information bits. Again, similarly to the notation in [7], let 𝒰^+ denote the set of state pairs (s′, s) such that the corresponding input information bit is +1; 𝒰^− is defined similarly. Based on these a posteriori LLRs, the decoder outputs a final hard decision on the information bit d_k(i), for i = 1, ..., B′ − 1, at the last iteration.
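The forward-backward computation can be sketched on a toy trellis. Everything specific here is an assumption made for illustration: the two-state delay-diversity trellis (output vector = current bit and previous bit, BPSK mapped), the branch metric taken as one half times the prior of the branch's output vector (equiprobable information bits), and the per-step normalization. The sketch returns full a posteriori probabilities and does not separate out the extrinsic part that the iterative receiver feeds back.

```python
def bpsk(bit):
    """Map bit 0 -> +1 and bit 1 -> -1."""
    return 1 - 2 * bit


# Hypothetical toy trellis: (previous state, next state, input bit, output vector).
TOY_TRELLIS = [(prev, d, d, (bpsk(d), bpsk(prev))) for prev in (0, 1) for d in (0, 1)]


def siso_map(trellis, n_states, priors):
    """A posteriori symbol-vector probabilities via forward-backward recursions.

    priors[i] maps each output vector to its a priori probability at time i.
    The trellis is assumed to start and end in state 0; priors that give zero
    probability to every valid path would cause a division by zero.
    """
    T = len(priors)

    def gamma(i, branch):
        return 0.5 * priors[i].get(branch[3], 0.0)

    alpha = [[0.0] * n_states for _ in range(T + 1)]
    beta = [[0.0] * n_states for _ in range(T + 1)]
    alpha[0][0] = 1.0
    beta[T][0] = 1.0
    for i in range(T):                                   # forward recursion
        for br in trellis:
            alpha[i + 1][br[1]] += alpha[i][br[0]] * gamma(i, br)
        norm = sum(alpha[i + 1])
        alpha[i + 1] = [a / norm for a in alpha[i + 1]]
    for i in range(T - 1, -1, -1):                       # backward recursion
        for br in trellis:
            beta[i][br[0]] += beta[i + 1][br[1]] * gamma(i, br)
        norm = sum(beta[i])
        beta[i] = [b / norm for b in beta[i]]
    apps = []
    for i in range(T):                                   # combine per output vector
        post = {}
        for br in trellis:
            post[br[3]] = post.get(br[3], 0.0) + alpha[i][br[0]] * gamma(i, br) * beta[i + 1][br[1]]
        total = sum(post.values())
        apps.append({v: p / total for v, p in post.items()})
    return apps


# Frame of two information bits (1, 0) plus one tail bit, with soft priors.
true_path = [(-1, 1), (1, -1), (1, 1)]
all_vectors = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
priors = [{v: (0.85 if v == t else 0.05) for v in all_vectors} for t in true_path]
apps = siso_map(TOY_TRELLIS, 2, priors)
decided = [max(p, key=p.get) for p in apps]
print(decided)
```

Even with uninformative priors, the trellis constraints alone rule out symbol vectors that are inconsistent with the known start and end states, which is part of what the decoder contributes back to the first stage.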

6. SIMULATION RESULTS
In this section, we simulate the proposed receiver structures for some representative situations. We consider a synchronous multiuser system and always set the number of receiver antennas to one, ignoring the possibility of exploiting receiver diversity since our primary concern here is to investigate the transmitter diversity schemes. We will consider two systems: one with two transmit antennas and another with four transmit antennas. We make use of full diversity BPSK space-time trellis codes with constraint length ν = 5, given in [12], for both systems. Specifically, we employ space-time codes based on the underlying rate-1/2 convolutional code with octal generators (46, 72), and the underlying rate-1/4 convolutional code with octal generators (52, 56, 66, 76), both given in [12], for the two and four antenna systems, respectively. In all simulations, the information block size is set to 128 bits.
Figure 1 shows the performance results for the joint maximum likelihood detector in a space-time-coded multiple-access system with two equal-power users and two transmitter antennas. We use the Frame Error Rate (FER) as the measure of performance. Plots (a) and (b) correspond to the cases where user cross-correlations are 0.4 and 0.9, respectively. Also shown in these figures is the performance of equivalent systems without space-time coding. These results reveal the significant gains that can be achieved with space-time coding in multiuser systems, even with only two transmitter antennas. For example, at 0.1 FER, there is more than 6 dB gain in employing the two-antenna space-time code, against a system that does not employ transmitter antenna diversity.
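For concreteness, the FER measure can be computed as in the short sketch below, which assumes that a frame is counted as erroneous whenever at least one of its decoded information bits differs from the transmitted one. The function name frame_error_rate is hypothetical.

```python
def frame_error_rate(sent_frames, decoded_frames):
    """Fraction of frames containing at least one bit error."""
    if len(sent_frames) != len(decoded_frames):
        raise ValueError("frame lists must have the same length")
    errors = sum(1 for sent, decoded in zip(sent_frames, decoded_frames)
                 if list(sent) != list(decoded))
    return errors / len(sent_frames)


# Four toy frames, one of which is decoded with a single bit error.
example_sent = [[1, 0], [1, 1], [0, 0], [0, 1]]
example_decoded = [[1, 0], [1, 0], [0, 0], [0, 1]]
example_fer = frame_error_rate(example_sent, example_decoded)
```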
Figure 2 shows the performance of the joint ML receiver, again for an equal power two-user system, when each user employs a space-time code with four transmitter antennas. Comparing this figure with Figure 1a, we see that, at an FER of 0.01, there is more than 4 dB SNR gain over the two-antenna system. Also shown in Figure 2, for comparison purposes, is the performance of a similar multiuser system but without the space-time coding. It is clear from these results that space-time coding can offer significant SNR improvement in multiuser channels.
Plots (a) and (b) in Figure 3 show the FER performance of the partitioned space-time multiuser receiver based on a decorrelating multiuser detector and ML single-user decoders, in a four-user system with two and four transmit antennas, respectively. User cross-correlations are assumed to be ρ_jk = 0.4 for all k ≠ j. For the same two systems, Figure 4 shows the FER performance for user cross-correlations of ρ = 0.75. From Figures 3 and 4, it is seen that the decorrelator-based partitioned space-time receivers may offer some diversity gain over single-antenna systems, though they fail to capture the full gains achievable with space-time coding. This is especially clear from the large performance gap between the decorrelator-based partitioned receiver and the single-user bound in Figure 4. This performance degradation becomes severe with increasing user cross-correlations, as one would expect. These results also justify our iterative approach, which is capable of providing near single-user performance even in severe MAI environments (as we will see below).
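One way to see why the degradation grows with the cross-correlation is the classical noise enhancement of a decorrelator, which for a synchronous system with correlation matrix R scales the noise variance of user k by the k-th diagonal entry of the inverse of R. The sketch below evaluates this factor for equal cross-correlations. It is an illustration under simplifying assumptions (single antenna, no fading) and is not the performance expression of the partitioned receiver studied here; the function names are hypothetical.

```python
import math


def equicorrelated(num_users, rho):
    """Correlation matrix with unit diagonal and rho everywhere else."""
    return [[1.0 if j == k else rho for j in range(num_users)]
            for k in range(num_users)]


def equicorrelated_inverse(num_users, rho):
    """Closed-form inverse of (1 - rho) I + rho 1 1^T (Sherman-Morrison)."""
    c = rho / (1.0 + (num_users - 1) * rho)
    return [[((1.0 if j == k else 0.0) - c) / (1.0 - rho)
             for j in range(num_users)]
            for k in range(num_users)]


def decorrelator_noise_enhancement_db(num_users, rho):
    """Noise variance increase, in dB, at the decorrelator output of any user."""
    return 10.0 * math.log10(equicorrelated_inverse(num_users, rho)[0][0])


loss_at_0_40 = decorrelator_noise_enhancement_db(4, 0.4)
loss_at_0_75 = decorrelator_noise_enhancement_db(4, 0.75)
```

Under these assumptions the enhancement for four users is roughly 1.3 dB at ρ = 0.4 and roughly 4.9 dB at ρ = 0.75, which is consistent with the trend of a widening gap at higher cross-correlations.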
Figure 5 shows the corresponding FER performance results for a partitioned space-time multiuser receiver based on linear MMSE multiuser detection at the first stage. These results are included primarily to further motivate the use of iterative partitioned receivers in space-time coded multiuser systems. We observe that for the given cross-correlation values, the MMSE first stage performance is no better than that with a decorrelator first stage. Of course, with smaller MAI than what we have simulated, the MMSE first stage would outperform the decorrelator-based receiver, since in this case the background noise would be the dominant noise source. In either case, these linear detectors fail to exploit the large performance gains available with space-time coding.
The FER performance of the iterative receiver based on interference cancellation, but without linear MMSE filtering, is shown in Figure 6 for a system with four equal-power users. In this figure, plots (a) and (b) correspond to two and four transmit antenna systems, respectively. In both cases, we have assumed that the user cross-correlations are 0.4 between any two users. From these simulation results, we observe that with only about four iterations we can achieve most of the gain available from the iterative decoding process. Significantly, we see that for medium values of ρ, this simple interference cancellation scheme can achieve near single-user performance with few iterations, which is not possible with linear first stages, as we observed earlier.
However, this simple interference-cancellation-based iterative detector fails when the cross-correlations between users begin to increase. In this case, the performance becomes almost insensitive to the number of iterations. This is not surprising, since when the user cross-correlations are high our estimates at the end of the initial iteration are very poor (which of course is the same as a system employing a single-user matched filter front end), and thus the subsequent iterations will be based on these poor estimates.
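The behaviour described above can be reproduced with a simplified sketch of soft interference cancellation. It assumes scalar BPSK symbols on a single antenna with matched-filter outputs y_k = Σ_j ρ_kj A_j b_j + n_k, and it forms each soft symbol estimate as 2p − 1 from the a priori probability p that the bit is +1. The receiver in this paper operates on space-time symbol vectors, so this is only an illustration; the function names are hypothetical.

```python
def soft_symbols(prior_plus):
    """Soft BPSK estimates E[b] = 2 P(b = +1) - 1 from a priori probabilities."""
    return [2.0 * p - 1.0 for p in prior_plus]


def cancel_interference(y, R, amplitudes, prior_plus):
    """Subtract the estimated multiple-access interference from each output."""
    soft = soft_symbols(prior_plus)
    num_users = len(y)
    cleaned = []
    for k in range(num_users):
        mai = sum(R[k][j] * amplitudes[j] * soft[j]
                  for j in range(num_users) if j != k)
        cleaned.append(y[k] - mai)
    return cleaned


# Noiseless four-user example with cross-correlations of 0.75.
demo_R = [[1.0 if j == k else 0.75 for j in range(4)] for k in range(4)]
demo_bits = [1, -1, 1, 1]
demo_amps = [1.0] * 4
demo_y = [sum(demo_R[k][j] * demo_amps[j] * demo_bits[j] for j in range(4))
          for k in range(4)]
# First iteration: uniform priors give zero soft estimates, so nothing is cancelled.
first_pass = cancel_interference(demo_y, demo_R, demo_amps, [0.5] * 4)
# Ideal feedback: perfect priors remove all interference.
ideal_pass = cancel_interference(demo_y, demo_R, demo_amps, [1.0, 0.0, 1.0, 1.0])
```

With uniform priors the output equals the matched-filter output, which is why the first iteration, and hence every later one, suffers when the cross-correlations are high.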
Figure 7 shows the FER performance of the interference-cancelling space-time multiuser receiver with instantaneous linear MMSE filtering. Figures 7a and 7b correspond to two and four transmitter antenna systems, respectively. In both cases, there are four users in the system with equal cross-correlations of 0.75 among them. We observe that this modified iterative receiver provides excellent performance and is able to achieve near single-user performance with only a few iterations (2 to 3), even in the presence of considerable MAI.

CONCLUSIONS
We have considered space-time coding for multiple-access systems in the presence of quasi-static Rayleigh fading. By analyzing the joint ML receiver for space-time coded multiuser systems, we have shown that codes that achieve full diversity advantage in single-user channels will also be able to provide full diversity gain in multiple-access channels. In order to obtain a better tradeoff between performance and computational complexity at the receiver, we have proposed low-complexity receiver structures by partitioning the multiuser detection and space-time decoding into two stages. In particular, we have shown that a nonlinear iterative receiver based on interference cancellation and instantaneous MMSE filtering is capable of capturing most of the gains available with space-time coding in multiple-access channels, with only a few iterations. Our simulation results reveal the gains achievable with space-time coding in multiuser channels and the favorable performance tradeoffs offered by the proposed partitioned space-time multiuser receivers.

Figure 6: FER performance versus E_b/N_0 (in dB) of the partitioned iterative space-time receiver based on interference cancelling multiuser detection. K = 4 and ρ = 0.4 in both cases. (a) N_T = 2. (b) N_T = 4.

Figure 7: FER performance versus E_b/N_0 (in dB) of the partitioned iterative space-time receiver based on an interference cancelling and linear MMSE filtering multiuser detection stage. K = 4 and ρ = 0.75 in both cases. (a) N_T = 2. (b) N_T = 4.

Table 1: Properties of different receiver structures for space-time coded multiuser systems.