Improved Multiuser Detectors Employing Genetic Algorithms in a Space-Time Block Coded System

Enhanced genetic algorithms (GA) applied to space-time block coded (STBC) multiuser detection (MUD) systems in Rayleigh flat-fading channels are reported in this paper. Firstly, an improved objective function, designed to speed up the search for the optimal solution, is introduced. Secondly, a decorrelating detector (DD) and a minimum mean square error (MMSE) detector are added to the GA STBC MUD receiver to create the seed chromosome of the initial population. This further improves the receiver performance because some signal information is intentionally embedded in the initial population. Simulation results show that the receiver employing the improved objective function and the DD or MMSE detector converges faster, at the same bit error rate (BER) performance, than the receiver with a randomly chosen initial population. The total signal-to-noise ratio (SNR) improvement contributed by these two modifications can reach 4 dB. Hence the proposed GA receiver is a promising solution to the STBC MUD problem.


INTRODUCTION
In wireless communications, space-time block coding (STBC) with diversity gains has been widely studied in multiuser detection (MUD) [1,2,3] because STBC can utilize the information in the spatial and time domains simultaneously [4]. From the coding point of view, the single-user performance of STBC has been studied in [5,6] and some STBC code designs have been established. Generally speaking, STBC is a coding technique designed particularly for applications with multiple transmit antennas [5,6]. It introduces temporal and spatial correlation into the signals transmitted from different antennas. The signals transmitted from the different transmit antennas of an STBC transmitter can be treated as if they originated from different virtual users, so that STBC detection in the single-user case can be regarded simply as a MUD problem.
Conventional detection (CD) of multiuser signals uses a bank of chip-level matched filters to detect each user signal separately while treating all other user signals as interference. However, owing to multiaccess interference (MAI), this single-user detection method suffers severely from the near-far effect. Consequently, MUD techniques, which employ a combinatorial optimization process to exploit the information of all the users in order to detect a target user signal, have been proposed to mitigate the near-far effect [7]. Among all the MUD techniques, the maximum likelihood (ML) optimal detector can achieve satisfactory bit error rate (BER) performance, but its computational complexity grows exponentially with the number of users. Even with the maximum a posteriori (MAP) method described in [8], the computational complexity is still of the order of $2^l$, where $l$ is the length of the codeword and is a linear function of the number of users. Optimal MUD is therefore a nondeterministic polynomial (NP)-hard problem [9], and finding a global optimum can require prohibitive computing power. For such NP-hard problems, it is necessary to search for good approximation algorithms that yield solutions close to the optimum, although they do not guarantee that a global optimum is obtained for every instance. Such approximate algorithms are also based on the ML method, but the final decision is reached by a simpler route at the expense of some performance degradation. Some general approximation algorithms that achieve a reasonable balance of system performance and computational speed have been reported: simulated annealing (SA) [10], tabu search (TS) [11], and genetic algorithms (GA) [9,12,13,14,15,16].
Figure 1: Schematic of an uplink STBC MUD system.
In the TS method, the move attribute, which is a set of important parameters stored in the tabu list, must be chosen so that it is neither too permissive nor too restrictive. Otherwise, the method is more likely to converge to a local optimum. On the other hand, the GA method, which can be transformed into the SA method by gradually modifying some GA parameters, will only take longer to converge, rather than be more likely to converge to a local optimum, even if the GA parameters have been improperly selected. It is because of this greater tolerance in the selection of parameters that GA has been chosen to investigate alternative MUD solutions in this paper.
The GA comes originally from the schemata theory [12]. It was inspired by the observations of the natural process of evolution of plant and animal species. These living species constantly explore new possibilities in building new living organisms as well as skillfully exploit the "knowledge" accumulated in the current living organisms to create new species that are as capable of surviving as their ancestors [9]. These remarkable characteristics of the process of creating new forms of life have caught the interest of computer science researchers and led to the creation of the GA in 1975 [12].
It has been shown in [13,14,15,16] that the application of GA in MUD systems can significantly reduce the computational complexity with comparable BER performance in Rayleigh flat-fading channels. In particular, the application of GA in an STBC MUD system was proposed for the first time in [13], where all the chromosomes in the initial population are randomly chosen. However, the results shown in [13] indicate that the number of generations needed for the output to converge to a satisfactory performance is still fairly large, so the computation remains long even though it is already much shorter than that of the ML method.
In this paper, a modified objective function is first introduced to shorten the GA computation. Furthermore, a decorrelating multiuser detector (DD) and a minimum mean square error (MMSE) detector [17] are proposed to provide the seed chromosome of the initial population in the STBC MUD receiver of a wireless uplink transmission system. The DD decouples the received signal and the linear MMSE detector maximizes the receiver output signal-to-noise ratio (SNR). The seed chromosome provided by the DD or MMSE detector is perturbed randomly to generate all the chromosomes of the initial population. Hence some signal information is embedded in the initial population. Consequently, we expect the algorithm to converge more quickly and the computational burden to be reduced. The only cost of this technique is the time consumed to obtain the seed chromosome, which is small compared with the benefit it brings. The simulation results confirm the validity of the proposed receiver.

SYSTEM MODEL
The schematic of a multiuser STBC system is shown in Figure 1. To simplify the analysis, we omit the description of the outer code and the channel estimation by assuming that the channel state information (CSI) is perfectly known and remains constant during an entire block period. In practice, it is not difficult to include such techniques in STBC MUD systems. There are K users in the uplink channel. For each user, the bit stream is M-PSK modulated and STBC encoded before being transmitted from N antennas. The size of the PSK modulation set is Q. All users are assumed to be synchronous and mutually independent. The signals from all users reach the M receive antennas through a Rayleigh flat-fading channel.
Without loss of generality, the STBC $G_2$ code described in [5,6] is selected in this study. The $G_2$ code represents one category of orthogonal STBC, where the subscript "2" refers to $N$, the number of transmit antennas. It is straightforward to extend this study to the case of more transmit antennas at other transmission rates, whether or not the STBC design is orthogonal. $G_2$ is given as
$$G_2 = \begin{pmatrix} c^1_{1,k} & c^2_{1,k} \\ c^1_{2,k} & c^2_{2,k} \end{pmatrix} = \begin{pmatrix} x^k_1 & x^k_2 \\ -(x^k_2)^* & (x^k_1)^* \end{pmatrix}, \tag{1}$$
where the superscript "k" represents the kth user, and $x^k_1$ and $x^k_2$ are the symbols to be transmitted per block (i.e., the number of symbols per block is $K_0 = 2$). The symbol $c^i_{t,k}$ (where $i = 1, \ldots, N$ and $t = 1, \ldots, P$) in the ith column for the kth user, with $P = 2$ time slots per block, is transmitted by the ith transmit antenna in the tth time slot. The transmission rate of $G_2$ is $R = K_0/P = 1$, which is the highest rate among the orthogonal STBC designs of [5]; the orthogonal designs for complex constellations with more than two transmit antennas have $R < 1$.
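The mapping performed by the $G_2$ encoder can be illustrated with a few lines of Python. This is only a minimal sketch: it assumes the standard Alamouti form of $G_2$ from [5,6] (rows are time slots, columns are transmit antennas), and the helper names `psk_symbol`, `g2_encode`, and `column_inner_product` are hypothetical, introduced here for illustration.

```python
import cmath

def psk_symbol(index, q=8):
    """Return the unit-energy Q-PSK constellation point with the given index."""
    return cmath.exp(2j * cmath.pi * index / q)

def g2_encode(x1, x2):
    """Map two symbols to a 2x2 block (assumed Alamouti form of G_2).

    Row t holds the symbols sent in time slot t; column i is transmit antenna i.
    """
    return [[x1, x2],
            [-x2.conjugate(), x1.conjugate()]]

def column_inner_product(block):
    """Inner product of the two antenna columns; zero for an orthogonal design."""
    return sum(row[0].conjugate() * row[1] for row in block)

block = g2_encode(psk_symbol(1), psk_symbol(6))
K0, P = 2, len(block)   # symbols per block, time slots per block
rate = K0 / P           # transmission rate R = K0 / P
print(rate, abs(column_inner_product(block)))
```

The vanishing inner product between the two antenna columns is the orthogonality property that makes simple linear combining possible at the receiver.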
The received signal from all the users at the jth ($j = 1, 2, \ldots, M$) receive antenna (in the base station) in the tth time slot is
$$r^j_t = \sum_{k=1}^{K} \sum_{i=1}^{N} \alpha^k_{i,j}\, c^i_{t,k} + \eta^j_t, \tag{2}$$
where $\alpha^k_{i,j}$ is the path gain between the ith transmit antenna and the jth receive antenna for the kth user and $\eta^j_t$ is the noise. As usual, $\alpha^k_{i,j}$ is taken as an independent complex Gaussian random variable with zero mean and a variance of 0.5 per dimension.
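For the kth user, the path gain matrix for the jth receive antenna is (the superscript k is dropped until (9) to simplify the notation)
$$\boldsymbol{\alpha}_j = [\alpha_{1,j}, \alpha_{2,j}, \ldots, \alpha_{N,j}]^T. \tag{3}$$
The signal symbols per block can be denoted as
$$\mathbf{x} = [x_1, x_2, \ldots, x_{K_0}, x_1^*, x_2^*, \ldots, x_{K_0}^*]^T. \tag{4}$$
The signal in the jth receive antenna is
$$\mathbf{r}^j = [r^j_1, r^j_2, \ldots, r^j_P]^T \tag{5}$$
and the noise in the jth receive antenna is
$$\boldsymbol{\eta}^j = [\eta^j_1, \eta^j_2, \ldots, \eta^j_P]^T. \tag{6}$$
Using STBC encoding, the received signal in the jth antenna can be obtained from the following general matrix transformation, which is applicable to any STBC design:
$$\mathbf{r}^j = H_j \mathbf{x} + \boldsymbol{\eta}^j, \tag{7}$$
where $H_j$ for the $G_2$ design is defined as
$$H_j = \begin{pmatrix} \alpha_{1,j} & \alpha_{2,j} & 0 & 0 \\ 0 & 0 & \alpha_{2,j} & -\alpha_{1,j} \end{pmatrix}. \tag{8}$$
The purpose of introducing the redundant vector $\mathbf{x}$ in (4), which contains the conjugates of the signal symbols, is to make the formulation applicable to the generalized STBC design. When a nonredundant signal vector $[x_1, x_2, \ldots, x_{K_0}]^T$ is used, the cost of conjugating half of the received signals and CSI has to be taken into account as well [3]. For example, instead of the transformation in (7), the received signal for the Alamouti design $G_2$ [6] is represented as follows [3]:
$$\begin{pmatrix} r^j_1 \\ (r^j_2)^* \end{pmatrix} = \begin{pmatrix} \alpha_{1,j} & \alpha_{2,j} \\ \alpha_{2,j}^* & -\alpha_{1,j}^* \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} \eta^j_1 \\ (\eta^j_2)^* \end{pmatrix}. \tag{9}$$
However, it is impossible to apply (9) to the STBC design $S$ given in (10). The $H_{3,4}$ design in [5] is similar to $S$. In this situation, the redundant form given by (4) and the transformation shown in (7) have to be adopted, which then eliminates the need to conjugate the received signals and the CSI [18].
Hence the combined received signal of the M receive antennas for the kth user can be written as
$$\mathbf{r}_{MP \times 1} = H^k_{MP \times 2K_0}\, \mathbf{x}^k_{2K_0 \times 1} + \boldsymbol{\eta}_{MP \times 1}, \quad H^k = [(H^k_1)^T, \ldots, (H^k_M)^T]^T, \tag{11}$$
where $MP \times 1$ and similar notations in the following equations refer to the dimensions of the matrices. Therefore, including all of the K users, the signal in the jth receive antenna should be
$$\mathbf{r}^j = \sum_{k=1}^{K} H^k_j \mathbf{x}^k + \boldsymbol{\eta}^j. \tag{12}$$
Hence the received signal vector is
$$\mathbf{r} = \sum_{k=1}^{K} H^k \mathbf{x}^k + \boldsymbol{\eta}, \tag{13}$$
where
$$H = [H^1, H^2, \ldots, H^K], \quad \mathbf{x} = [(\mathbf{x}^1)^T, (\mathbf{x}^2)^T, \ldots, (\mathbf{x}^K)^T]^T. \tag{14}$$
In matrix form, the received signal is
$$\mathbf{r}_{MP \times 1} = H_{MP \times 2KK_0}\, \mathbf{x}_{2KK_0 \times 1} + \boldsymbol{\eta}_{MP \times 1}. \tag{15}$$
In the conventional receiver, each user is detected separately [14]. Therefore, the detection rule [6] for a single user can also be applied to the MUD with matched filtering. The estimate $\tilde{r}^j_{t,k}$ of $r^j_{t,k}$ for the kth user based on (7) is obtained after the matched filter from the received signal $\mathbf{r}$.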
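The equivalence between the per-antenna sum and the matrix transformation with $H_j$ can be checked numerically for one user and one receive antenna. The sketch below rests on two assumptions: the Alamouti form of $G_2$ from [5,6], and the ordering $[x_1, x_2, x_1^*, x_2^*]^T$ for the redundant signal vector. The function names are hypothetical, and the noise is omitted so that the two results can be compared directly.

```python
import random

def g2_block(x1, x2):
    """Assumed Alamouti form of G_2: rows are time slots, columns are antennas."""
    return [[x1, x2], [-x2.conjugate(), x1.conjugate()]]

def received_direct(alpha, block):
    """Noise-free r_t = sum_i alpha_i * c_t^i for one user, one receive antenna."""
    return [sum(a * c for a, c in zip(alpha, row)) for row in block]

def h_matrix(alpha):
    """H_j for G_2 acting on the redundant vector [x1, x2, x1*, x2*]."""
    a1, a2 = alpha
    return [[a1, a2, 0, 0],
            [0, 0, a2, -a1]]

def received_via_h(alpha, x1, x2):
    """Noise-free r = H_j x with the redundant signal vector x."""
    x = [x1, x2, x1.conjugate(), x2.conjugate()]
    return [sum(h * s for h, s in zip(row, x)) for row in h_matrix(alpha)]

rng = random.Random(0)
sigma = 0.5 ** 0.5  # standard deviation for a variance of 0.5 per dimension
alpha = [complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(2)]
x1, x2 = complex(1, 0), complex(0, 1)

direct = received_direct(alpha, g2_block(x1, x2))
via_h = received_via_h(alpha, x1, x2)
max_gap = max(abs(d - v) for d, v in zip(direct, via_h))
print(max_gap)
```

Because the conjugated symbols are carried inside the signal vector, neither the received samples nor the path gains need to be conjugated, which is the point of the redundant form.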
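For example, the decisions based on the estimation in [6] for two receive antennas with the $G_2$ design are
$$\hat{x}^k_1 = \sum_{j=1}^{2} \left[ (\alpha^k_{1,j})^*\, \tilde{r}^j_{1,k} + \alpha^k_{2,j}\, (\tilde{r}^j_{2,k})^* \right], \quad \hat{x}^k_2 = \sum_{j=1}^{2} \left[ (\alpha^k_{2,j})^*\, \tilde{r}^j_{1,k} - \alpha^k_{1,j}\, (\tilde{r}^j_{2,k})^* \right], \tag{16}$$
where $\tilde{r}^j_{t,k}$ is the matched-filter estimate of $r^j_{t,k}$ for the kth user. In this paper, if we let $\hat{\mathbf{x}}$ be the estimation of the original signal $\mathbf{x}$, the estimation error can be obtained assuming perfect CSI:
$$f_1 = \| \mathbf{r} - H \hat{\mathbf{x}} \|^2. \tag{17}$$
As mentioned earlier, a DD or MMSE detector is applied at the receiver side in the MUD to create the seed chromosome for the subsequent GA operations in order to improve the receiver performance. Firstly, the estimation of the original signal is calculated by either the DD or the MMSE detector and designated as the seed chromosome. Then this seed chromosome is perturbed randomly to form the initial population, which undergoes further GA operations. The DD estimation of the original signal is
$$\hat{\mathbf{x}}_{\mathrm{DD}} = (H^H H)^{-1} H^H \mathbf{r}, \tag{18}$$
where $H$ is assumed to be of full column rank. The corresponding MMSE detector estimation is
$$\hat{\mathbf{x}}_{\mathrm{MMSE}} = H^H R_{rr}^{-1} \mathbf{r}, \tag{19}$$
where $R_{rr}$ is the self-correlation of the received signals from all the receive antennas. The inversion operations in (18) and (19) are performed only once and the subsequent operations do not require such inversions, so the total computation time is little affected when the number of evaluations of (17) is fairly large.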
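The seeding procedure described above (a DD estimate, a hard decision, then random perturbation) can be sketched as follows. Several points are assumptions made purely for illustration: the DD is taken as the least-squares solution $(H^H H)^{-1} H^H \mathbf{r}$, a small random tall matrix stands in for the STBC channel matrix so that it has full column rank, a nonredundant symbol vector is used, the channel is noise-free, and the perturbation rule (move a symbol to an adjacent constellation point with probability `p`) is one possible reading of "perturbed randomly". All function names are hypothetical.

```python
import cmath
import random

Q = 8  # 8-PSK, as in the simulations

def psk(index):
    """Unit-energy 8-PSK constellation point."""
    return cmath.exp(2j * cmath.pi * index / Q)

def nearest_index(z):
    """Hard decision: index of the closest 8-PSK point."""
    return round(cmath.phase(z) * Q / (2 * cmath.pi)) % Q

def solve(a, b):
    """Solve the square complex system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def dd_estimate(h, r):
    """Least-squares (decorrelating) estimate via the normal equations."""
    n = len(h[0])
    gram = [[sum(row[i].conjugate() * row[j] for row in h) for j in range(n)]
            for i in range(n)]
    matched = [sum(row[i].conjugate() * ri for row, ri in zip(h, r)) for i in range(n)]
    return solve(gram, matched)

def f1(h, r, indices):
    """Squared error ||r - H x_hat||^2 for a candidate chromosome of symbol indices."""
    return sum(abs(ri - sum(hij * psk(s) for hij, s in zip(row, indices))) ** 2
               for row, ri in zip(h, r))

def perturb(seed, rng, p=0.1):
    """Move each symbol to an adjacent constellation point with probability p."""
    return [(s + rng.choice((-1, 1))) % Q if rng.random() < p else s for s in seed]

rng = random.Random(1)
sigma = 0.5 ** 0.5
true_symbols = [3, 7, 0]
# Toy 6x3 channel matrix (assumption); full column rank with probability one.
h = [[complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(3)]
     for _ in range(6)]
r = [sum(hij * psk(s) for hij, s in zip(row, true_symbols)) for row in h]

seed = [nearest_index(z) for z in dd_estimate(h, r)]
population = [seed] + [perturb(seed, rng) for _ in range(7)]
print(seed, f1(h, r, seed))
```

The linear solve is carried out once per block; every later evaluation of the objective function needs only matrix-vector products.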

THE GENETIC ALGORITHM
In general, a GA is composed of the following steps.
(1) Initial population generation: all initial chromosomes are encoded at the bit level to simplify the following GA recombination operations. They are either generated randomly or derived from the DD decision or the MMSE decision. Each chromosome is a combination of the probable solutions for all users. Normally, the population size is taken as the product of $K$, the number of users, and $Q^{K_0}$, the number of all possible solutions for each user.
(2) Fitness value calculation: the MSE shown in (17) can act as the objective function to evaluate the fitness of each chromosome. The optimal solution of (17) should yield a minimum value. In fact, the ML method evaluates every possible combination of bits, so its computation time varies exponentially with the number of users. As explained before, the use of GA in an STBC MUD system can reduce the number of evaluations of (17) significantly. To make the search more effective, an improved objective function $f_2$, defined by (20), is proposed here in place of (17). The expression calculates the phase difference between $\mathbf{r}$ and $H\hat{\mathbf{x}}$ and should therefore be more sensitive to changes of either or both vectors. The best chromosome in a generation should have the least value of the objective function. If the value of the best chromosome in the present generation is larger than its counterpart in the previous generation, the chromosome with the largest value of the objective function in the present generation is replaced by the best chromosome of the previous generation. This operation ensures that at least the useful information contained in the present generation is passed on to the next. A chromosome is usually considered to be better if it has a larger fitness value. Hence a fitness value can be defined with the help of (17) or (20) as follows:
$$f = f_0 - f_i, \quad i = 1, 2, \tag{21}$$
where $f_0$ is a sufficiently large constant and can be taken as the largest value of $f_1$ or $f_2$ within the whole population. Obviously, the larger the fitness value, the better the chromosome is.
(3) If the optimal criterion is satisfied, that is, when any one of all $f_i$ in the population is less than a predetermined threshold, or if the generation number has exceeded a predefined value, which is also commonly taken as the product of $K$ and $Q^{K_0}$, then go to Step (9). Otherwise, go to Step (4).
(4) Selection: this operation is based on the Roulette-wheel rule [9] and the probability of each chromosome being selected is calculated using the fitness value obtained with (21) in Step (2). It serves to provide the chromosomes for the subsequent recombination operations.
(5) Reproduction: this step is intended to replace the chromosome with the largest objective function by the best chromosome of the same generation.
(6) Crossover: this operation exchanges some parts of the chromosomes to give a chromosome a chance to include more signal information. Since the objective function is calculated at the symbol (8 PSK) level and erroneous symbols are normally detected adjacent to the correct symbols, the crossover operation is carried out at the symbol level. In this paper, a single-point crossover is adopted.
(7) Mutation: this operation can enhance the convergence of the GA in case the original signal information has not been included in the initial population. For example, if the second bit of all chromosomes is "0" whereas the real signal bit is "1," then the only way leading to the right solution is by mutation. The mutation operation is also carried out at the symbol level, where a symbol may be mutated to its adjacent symbol in the constellation according to a certain selected mutation probability. Generally, the crossover probability $p_c$ is close to 1 and the mutation probability $p_m$ is close to 0.
(8) Go to Step (2) for the next generation.
(9) End and output the decisions.
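The steps above can be sketched as a short Python loop. This is a simplified illustration rather than the receiver used in the simulations: chromosomes are lists of 8-PSK symbol indices instead of bit strings, the elitism of Step (2) and the reproduction of Step (5) are merged into a single "overwrite the worst chromosome with the best one found so far" rule, and the objective is a toy function (`toy_objective`, hypothetical) standing in for (17) or (20). The names `roulette` and `run_ga` are likewise hypothetical.

```python
import cmath
import random

Q = 8  # 8-PSK

def psk(index):
    return cmath.exp(2j * cmath.pi * index / Q)

def roulette(population, fitness, rng):
    """Pick one chromosome with probability proportional to its fitness."""
    total = sum(fitness)
    if total == 0:                        # all chromosomes are equally good
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for chrom, fit in zip(population, fitness):
        running += fit
        if running >= pick:
            return chrom
    return population[-1]

def run_ga(objective, population, generations, threshold, rng, p_c=0.95, p_m=0.05):
    population = [c[:] for c in population]           # leave the caller's list intact
    best = min(population, key=objective)[:]
    for _ in range(generations):
        costs = [objective(c) for c in population]
        # Steps (2) and (5), merged (assumption): keep the best chromosome so far
        # by overwriting the worst chromosome of the present generation.
        worst = max(range(len(population)), key=costs.__getitem__)
        leader = min(range(len(population)), key=costs.__getitem__)
        if costs[leader] < objective(best):
            best = population[leader][:]
        population[worst], costs[worst] = best[:], objective(best)
        if objective(best) < threshold:               # Step (3): stop criterion
            break
        f0 = max(costs)
        fitness = [f0 - c for c in costs]             # fitness as in (21)
        children = []
        while len(children) < len(population):
            a = roulette(population, fitness, rng)[:]     # Step (4): selection
            b = roulette(population, fitness, rng)[:]
            if rng.random() < p_c:                    # Step (6): single-point crossover
                cut = rng.randrange(1, len(a))
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):                      # Step (7): adjacent-symbol mutation
                for i in range(len(child)):
                    if rng.random() < p_m:
                        child[i] = (child[i] + rng.choice((-1, 1))) % Q
                children.append(child)
        population = children[:len(population)]
    return min(population + [best], key=objective)

def toy_objective(chromosome, target=(3, 7, 0, 5)):
    """Stand-in for (17): squared distance to a known symbol vector."""
    return sum(abs(psk(c) - psk(t)) ** 2 for c, t in zip(chromosome, target))

rng = random.Random(0)
initial = [[rng.randrange(Q) for _ in range(4)] for _ in range(16)]
result = run_ga(toy_objective, initial, generations=50, threshold=1e-9, rng=rng)
print(result, toy_objective(result))
```

Because the best chromosome found so far is always carried forward, the returned solution is never worse than the best member of the initial population, which is why a good seed chromosome can only help.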
Sometimes a bit inversion operation is performed in the GA, but it can be regarded as a special kind of crossover and is therefore not considered here. The operations of Step (6) and Step (7) may also lead to chromosomes with less information, but the operation in Step (5) can reverse this degenerative effect. For a given predefined number of generations and population size, the number of evaluations of (17) varies linearly with $(KQ^{K_0})^2$, which is much smaller than the factor of $Q^{KK_0}$ in ML detection. The improvement is clearly significant.
The significant features of the GA proposed in this paper are the introduction of an improved objective function and the preparation of an initial population that already contains some signal information from the output of a DD or MMSE detector given by (18) or (19) instead of a blind and random selection of the initial population as suggested in [13].

SIMULATIONS
In the following simulations, 8 PSK modulation and the $G_2$ STBC of rate 1 are adopted and the number of users is $K = 4$. Hence $Q = 8$, $N = 2$, and the signal vector length, that is, the chromosome length, is $L = K K_0 \log_2 Q = 24$. The number of receive antennas is $M = 2$. The predefined number of generations and the population size are both 256, since the number of all possible solutions for each user is $Q^{K_0} = 64$. The GA STBC MUD therefore needs 256 times fewer evaluations of the objective function than ML detection ($256^2$ versus $Q^{KK_0} = 8^8$). The recombination operation parameters are $p_m = 0.05$ and $p_c = 0.95$. The Roulette-wheel selection rule and the single-point crossover are adopted. The channel is Rayleigh flat fading and is kept constant during the whole block period. The path gain is taken as an independent complex Gaussian random variable with zero mean and a variance of 0.5 per dimension.
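The parameter values quoted above follow from a few lines of arithmetic, reproduced here as a quick check. The variable names are chosen for this sketch only.

```python
import math

K, K0, Q = 4, 2, 8                               # users, symbols per block, 8-PSK
bits_per_symbol = int(math.log2(Q))              # 3 bits per 8-PSK symbol
chromosome_bits = K * K0 * bits_per_symbol       # L = K * K0 * log2(Q)
per_user_solutions = Q ** K0                     # candidate symbol pairs per user
population_size = K * per_user_solutions         # K * Q^K0
generations = population_size                    # same predefined limit
ga_evaluations = population_size * generations   # (K * Q^K0)^2
ml_evaluations = Q ** (K * K0)                   # exhaustive search over Q^(K*K0)
speedup = ml_evaluations // ga_evaluations

print(chromosome_bits, population_size, ga_evaluations, ml_evaluations, speedup)
```

The factor of 256 is an upper bound on the GA cost ratio, since a run that meets the threshold early stops before the last generation.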
The computation times per block, averaged over 1000 Monte Carlo runs in Matlab, are given in Table 1 for the various simulation schemes. In the table, "$f_1$" and "$f_2$" represent the receivers based on the objective functions given by (17) and (20), respectively. "RR" refers to the receiver with the seed chromosome chosen randomly. "RDD" refers to the receiver with its seed chromosome created by the DD detector. "RMD" refers to the receiver with its seed chromosome created by an MMSE detector. "ML (K = 1)" represents the ML detector for the single-user case. The time per block needed for "ML (K = 4)" is so large that only several blocks are simulated for reference, and therefore the BER performance of this case is not shown in the following figures. The same notations are also used in the following figures. The algorithm terminates when the objective function from (17) or (20) is less than a predefined threshold. This threshold is selected from the smallest value of (17) or (20), where the training signal is obtained by setting $\hat{\mathbf{x}}$ to $\mathbf{x}$. The resulting threshold is 0.5 for $f_1$ or 0.1 for $f_2$ in this paper. The table shows clearly that the time needed for the proposed GA STBC MUD is only about 1/5 of that of the single-user receiver with ML detection, and the improvement over multiuser ML detection is more than 256 times, since the GA operations for some blocks terminate before the last generation. The computation time required when the seed chromosome is prepared by either the DD detector or the MMSE detector is several times smaller again than that when the initial chromosomes are randomly chosen. The computation time for the modified objective function $f_2$ is about 10% less than that for the original function $f_1$ because of its quicker convergence.
The BER performance versus the number of generations for the various detection schemes employing the objective function $f_1$ or $f_2$ is shown in Figure 2 at SNR = 6 dB. For comparison, the BER curves for the iterative MAP method suggested in [3] after the 6th iteration (6th iter. of IMAP in the figure) and for the CD, DD, MMSE, and ML detectors are also given. By comparing Figure 2a with Figure 2b, it can be observed that the improvement of RDD and RMD over RR is more significant with $f_2$ than with $f_1$. Besides, the performance of RDD is comparable to that of RMD. All proposed GA receivers outperform the CD after 30 or 10 generations with $f_1$ or $f_2$, respectively. The RDD or RMD detector initially performs the same as the DD or MMSE detector but gradually achieves a better BER performance after the GA operations.
Figure 3 gives a comparison between the two objective functions when SNR = 6 dB for RR, RDD, and RMD, respectively. The receiver with the objective function $f_2$ converges to a BER of $10^{-2}$ about 80 generations sooner than that with $f_1$ for both RDD and RMD, which is also the reason why the computation time for the final decision is smaller with $f_2$ than with $f_1$. A similar behavior is observed with RR, but the improvement is not so significant. It is also obvious that the performance of the receivers with $f_2$ approaches that of single-user ML detection faster and more closely than that of the receivers with $f_1$.
Figure 4 shows the BER versus SNR performance of the final GA output of the various receivers with the objective function $f_1$ or $f_2$, respectively. Figure 4a shows that RDD and RMD with the objective function $f_1$ outperform RR by about 2 dB at the BER of $10^{-2}$. Here the comparison is referenced at the BER of $10^{-2}$ instead of $10^{-3}$ because the results do not reach $10^{-3}$. Figure 4b shows that RDD and RMD with the objective function $f_2$ outperform RR by about 3 dB at the BER of $10^{-2}$ and can outperform the 6th iteration results of the iterative MAP detection. The SNR degradation compared with the single-user ML performance is about 2 dB for RMD and RDD at the BER of $10^{-3}$. Furthermore, RMD outperforms RDD by a very small amount with both objective functions. The improvement of all the proposed receivers over the CD, DD, and MMSE is significant. However, Figure 4 also shows that there may be a bound for the detectors with $f_1$, which will limit any further application of such detectors.
Figure 5 shows the detailed performance comparison between $f_1$ and $f_2$ for RR, RDD, and RMD, respectively. Clearly the receivers with $f_2$ outperform those with $f_1$. The SNR improvement at the BER of $10^{-2}$ is about 1.3 dB or 1 dB for RDD or RMD, respectively.

CONCLUSIONS
The GA has previously been shown to be a feasible technique for solving the STBC MUD problem with fewer computing resources. To further improve its performance, two modifications have been proposed in this paper. Firstly, a new objective function is introduced, which includes the phase information of the relevant signal vectors in order to make the decision more accurate. It contributes about a 10% reduction of detection time and about 1.3 or 1 dB SNR improvement at the BER of $10^{-2}$ for the GA receiver with the DD or MMSE detector, respectively. It also requires fewer generations to converge. Secondly, the DD and MMSE detectors have been embedded into the GA STBC MUD system to generate the seed chromosome and thus provide some signal information to the first generation. The simulations confirm that the receivers thus designed converge faster than those with a randomly chosen initial population. The improvement in SNR is about 2-3 dB at the BER of $10^{-2}$. Therefore, the total SNR improvement of the best receiver proposed here can reach 3-4 dB at the BER of $10^{-2}$ when compared with the previously reported GA STBC MUD receiver. This receiver also performs better than the DD, MMSE, or MAP detector. The degradation in SNR when compared with a single-user ML detector is limited to about 2 dB at the BER of $10^{-3}$. All the above results suggest that the proposed improved GA receiver is a promising solution to the STBC MUD problem, with reasonable computational complexity and fairly good performance.