EURASIP Journal on Applied Signal Processing 2002:12, 1401–1414
© 2002 Hindawi Publishing Corporation

Adaptive Near-Optimal Multiuser Detection Using a Stochastic and Hysteretic Hopfield Net Receiver

This paper proposes a novel adaptive multiuser detection (MUD) algorithm for a wide variety (practically any kind) of interference-limited systems, for example, code division multiple access (CDMA). The algorithm is based on recently developed neural network techniques and can perform near-optimal detection in the case of unknown channel characteristics. The proposed algorithm consists of two main blocks: one estimates the symbols sent by the transmitters and the other identifies each channel of the corresponding communication links. The estimation of symbols is carried out either by a stochastic Hopfield net (SHN), by a hysteretic neural network (HyNN), or by both. The channel identification is based on either the self-organizing feature map (SOM) or learning vector quantization (LVQ). The combination of these two blocks yields a powerful real-time detector with near-optimal performance. The performance is analyzed by extensive simulations.


INTRODUCTION
Recently, multiuser detection (MUD) has gained much attention in telecommunication research. The need for MUD primarily arises in interference-limited systems, such as code division multiple access (CDMA), which has been adopted as the main multiple access method of the third generation universal mobile telecommunication system (UMTS). Without novel MUD techniques, conventional receiver structures suffer severe performance degradation in high bit rate applications [1,2]. MUD carries out joint detection for a group of users, or single-user detection for a specific user, in the presence of other users in the channel. In the 1990s, a large number of articles were published in this field, and many different approaches to MUD have been proposed [1]: some authors regard it as a joint detection task, others apply signal processing methods to remove unwanted interference, while a third group regards it as a classification or hypothesis testing problem. Nevertheless, we should keep in mind that the purpose of MUD is to provide a robust, low-cost, reliable, and fast method to separate signals arriving from different sources over the same medium. In this paper, we propose an alternative method which unites the fast convergence property of hysteretic neural networks (HyNN) [3] with the optimization power of stochastic Hopfield nets (SHN) [4]. Furthermore, an adaptive detection technique is applied in the neural network receiver, while the existing RAKE receiver structure is retained in the detector.
This paper is organized as follows: in Section 2, the applied baseband equivalent system model is introduced; in Section 3, some neural network-based MUD techniques are discussed; in Section 4, some novel multiuser detector structures are introduced based on the modified recurrent neural network model; in Section 5, the applied adaptation technique is demonstrated; in Section 6, the structure of the proposed adaptive multiuser detector is described; in Section 7, the performance of the novel algorithm is analyzed by extensive simulations; and finally in Section 8, its computational complexity is discussed focusing on real time applications.

SYSTEM MODEL
One of the major attributes of CDMA systems is the multiple usage of the same frequency band and time slot. Although the system is susceptible to multiple access interference, theoretically the users do not jam each other because their waveforms are uncorrelated. In practice, however, this property cannot be sustained because multipath propagation or asynchronous transmission makes the waveforms correlated. Two different channel models of the uplink are introduced. In the first part of this section, the synchronous channel is described, whereas in the second part the asynchronous multipath environment is considered.
However, in both cases, the baseband equivalent output signal of the $k$th user, denoted by $q_k(t)$, can be written in the same form. For the sake of simplicity, we apply BPSK modulation, although the equations can also describe more sophisticated multivalued modulation schemes (e.g., QPSK, 16QAM). User $k$ transmits binary symbols $b_k[i] \in \{-1, +1\}$, where $i$ refers to the time instant. The output signal is given as

$$q_k(t) = \sum_{i} A_k\, b_k[i]\, s_k(t - iT - \theta_k), \tag{1}$$

where the sum runs over the $N_B$ symbols of a block, $A_k$ denotes the signal amplitude associated with the $k$th user, $T$ refers to the time period of one symbol, $\theta_k$ denotes the delay of user $k$, and $N_B$ is the block size. The spreading waveform of user $k$ is denoted by $s_k(t)$.

In the case of direct sequence (DS) multiple access, it can be written as follows:

$$s_k(t) = \sum_{i} S_k[i]\, e(t - iT_c), \tag{2}$$

where the sum runs over the chips of the signature. Here $S_k[i]$ denotes the $i$th time chip of user $k$, and $e(t)$ is the elementary waveform in the system. The chip duration is denoted by $T_c$. Each user transmits over a specific channel which can be characterized by its impulse response function, denoted by $h_k(t)$. The received signal is the sum of the arriving signals plus Gaussian noise, which can be written as

$$r(t) = \sum_{k=1}^{K} h_k(t) * q_k(t) + n(t), \tag{3}$$

where $K$ refers to the number of users and $n(t)$ is white Gaussian noise with a constant one-sided spectral density $N_0$. Depending on the properties of $h_k(t)$, we get different received signals $r(t)$, but the resulting channel matched filter (CMF) output can be expressed in the same mathematical form, as will be pointed out in the following sections.

Synchronous model
In a symbol synchronous channel, all users are forced to transmit at the same time instant (θ k = 0), their channels are characterized by single-path attenuation factors h s k (t) = α k δ(t), where δ(t) is the Dirac delta function.The received signal in (3) is given as where the "s" in the superscript refers to synchronous transmission.In the case of signatures limited to one symbol length (s , there is no intersymbol interference, consequently, without loss of generality, index i can be omitted which yields ( The conventional detector consists of k = 1, 2, . . ., K filters matched with the signature waveforms and channels, generating the following output for the kth user: In the case of BPSK modulation, the traditional single-user detector (SUD) simply calculates the signum of expression (6) yielding bSUD k = sign{ bs k }.This will, however, severely deteriorate the detector's performance, as can be seen from the expanded version of formula (6) bs where R s km is defined as follows: and ñs dt is a zero mean colored Gaussian noise due to linear transformation.The output of the matched filter in vector form is where

Asynchronous model
Synchronous high bit rate communication cannot be ensured in practice due to multipath propagation. Thus, more sophisticated channel models are required for both analytical investigations and computer simulations. For the sake of generality, we assume a general multipath propagation channel. In this case, the channel impulse response function of the $k$th user, $h^a_k(t)$, is a general continuous function, as opposed to the previous case where $h^s_k(t)$ was a simple attenuation factor. For instance,

$$h^a_k(t) = \sum_{j=1}^{5} \alpha_{kj}\, \delta(t - \tau_{kj})$$

is a five-path propagation model where the attenuation and delay of the $j$th path of user $k$ are $\alpha_{kj}$ and $\tau_{kj}$, respectively. Here the superscript "a" refers to asynchronous transmission.
To ensure simple notations, a new function is defined Substituting into (3), we obtain The kth matched filter performs a convolutional product on the incoming stream with v * k (−t), which results in a continuous signal ra k (t), which can be written as ra The output of the channel matched filters is given then in the form of where A s is defined previously, and Furthermore, ña (t) = v * (−t) * n(t) and Φ a vv (t) is a K × K matrix with all correlation functions between v i (t) and v j (t).It can be written as a dyadic convolution product where Sampling with (iT), expression (13) results in a discrete-time model for mapping b where b[i] = r(iT), ña [i] = ña (iT), and R a [i] = Φ a vv (iT), which is the discrete-time channel matrix.To describe the different detection schemes in Section 3, it is helpful to use the following block notation where the components of all vectors ba [i], b[i], and ña [i] are written into column vectors with KN B elements ba = ba 1 [1], ba 2 [1], . . ., ba The convolutional form in ( 16) can be rewritten as a multiplication resulting in the same form as in ( 9) but with different parameters.
Here Since both the synchronous and asynchronous case can be treated by the same mathematical formula (see (9) and (18)) in the forthcoming discussion, we use b, ñ, A, and R without indices which can either refer to the synchronous or the asynchronous case.
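The synchronous model can be illustrated with a minimal chip-rate sketch. It assumes rectangular chips, unit channel gains, and three hypothetical length-4 signatures (`SIGNATURES`, not the Gold codes used later in the simulations); the function names are illustrative only. With the noise switched off, the matched filter bank output equals the product of the correlation matrix, the amplitudes, and the transmitted symbols.

```python
import random


def correlation_matrix(signatures):
    """R[k][m]: normalized inner product of the chip sequences of users k and m."""
    n_chips = len(signatures[0])
    return [
        [sum(a * c for a, c in zip(s_k, s_m)) / n_chips for s_m in signatures]
        for s_k in signatures
    ]


def matched_filter_output(signatures, amplitudes, bits, noise_std, rng):
    """Chip-rate synchronous uplink followed by a bank of matched filters."""
    n_chips = len(signatures[0])
    # Received chips: superposition of all users plus white Gaussian noise.
    received = [
        sum(a * b * s[c] for a, b, s in zip(amplitudes, bits, signatures))
        + rng.gauss(0.0, noise_std)
        for c in range(n_chips)
    ]
    # Correlate the received chips with every signature.
    return [
        sum(s[c] * received[c] for c in range(n_chips)) / n_chips
        for s in signatures
    ]


# Hypothetical toy signatures, chosen only to give visible cross-correlation.
SIGNATURES = [[1, 1, 1, 1], [1, 1, 1, -1], [1, -1, 1, 1]]
R = correlation_matrix(SIGNATURES)
y = matched_filter_output(SIGNATURES, [1.0, 1.0, 1.0], [1, -1, -1], 0.0, random.Random(0))
print(R)  # [[1.0, 0.5, 0.5], [0.5, 1.0, 0.0], [0.5, 0.0, 1.0]]
print(y)  # [0.0, -0.5, -0.5]
```

In this toy case the first user's matched filter output is exactly zero although a +1 was sent, which shows how multiple access interference alone can defeat a sign decision.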

MULTIUSER DETECTION
Based on the linear model introduced in (9) and in (18) applying BPSK modulation (only the real part of all matrices and vectors are taken into account), we can derive the optimal MUD as a maximum likelihood sequence estimation (MLSE) problem which can then be obtained in the following fashion: which yields [1,2] bopt This equation involves the global minimization of a quadratic form where the perfect knowledge of system parameters A and R is required.In practice, the RAKE receiver is used to keep track with the incoming signal and to estimate the components of matrix R. A power control mechanism is applied to ensure that diag[RA] tends to the unit matrix.
The optimal solution can be found, for example, by exhaustive search using (21); however, this implies a computational complexity that increases exponentially with the number of users. In real-life implementations, we cannot afford such a time-consuming procedure, thus many suboptimal MUD schemes have been studied in the recent past [1]. Among them, neural networks, which are widely accepted and used for classification and optimization tasks, have received considerable interest in suboptimal MUD algorithms too. Analyzing (20), we can apply a feed-forward network for MUD. The first feed-forward neural network-based MUD was proposed by Aazhang et al. in [5]. Since that time, dozens of articles have appeared, for example, [6,7,8,9,10,11], including further studies on the subject. The papers treating feed-forward neural network-based MUD can be distinguished mainly by the applied training method.
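The exponential cost of the exhaustive search can be made concrete with a short sketch. It assumes the quadratic cost $y^T A R A y - 2 y^T A \tilde{b}$, which is the usual maximum likelihood metric for a linear Gaussian model of this kind and is taken here as an assumption about the form of (21); the function names and the toy matrix are hypothetical.

```python
from itertools import product


def quadratic_cost(y, R, amplitudes, b_tilde):
    """Assumed metric: y^T A R A y - 2 y^T A b_tilde, with diagonal A given as a list."""
    ay = [a * y_i for a, y_i in zip(amplitudes, y)]
    n = len(ay)
    quad = sum(ay[i] * R[i][j] * ay[j] for i in range(n) for j in range(n))
    lin = sum(ay_i * b_i for ay_i, b_i in zip(ay, b_tilde))
    return quad - 2.0 * lin


def exhaustive_mlse(R, amplitudes, b_tilde):
    """Try all 2^K candidate symbol vectors and keep the one with minimal cost."""
    candidates = product((-1, 1), repeat=len(b_tilde))
    return min(candidates, key=lambda y: quadratic_cost(y, R, amplitudes, b_tilde))


# Hypothetical 3-user example: noise-free matched filter output for b = (1, -1, -1).
R_TOY = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.0], [0.5, 0.0, 1.0]]
print(exhaustive_mlse(R_TOY, [1.0, 1.0, 1.0], [0.0, -0.5, -0.5]))  # (1, -1, -1)
```

The search visits $2^K$ candidates, so the 17-user scenario considered later would already require 131072 cost evaluations per symbol interval, which motivates the suboptimal schemes.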
In this paper, we focus on recurrent neural network detection, which is characterized by strong feedback mechanisms. Recurrent neural networks are built up from computing elements termed neurons. In each neuron, the simple updating process (22) is realized, where $Y_j[\ell]$ is the output of the $j$th neuron at the $\ell$th iteration. Parameter $V_l$ denotes the threshold of the $l$th neuron and $W_{lj}$ is the connection strength between the output of the $j$th neuron and the input of the $l$th neuron. The size of the network is denoted by $M$. It has been proven [12,13] that the Hopfield net drives the quadratic form (23), often referred to as the Lyapunov function, into a local minimum in the $\{-1, +1\}^M$ state space, where $y$, $V$, and $W = [W_{ij}]$ collect the neuron outputs, the thresholds, and the connection strengths, respectively [14]. For a two-dimensional example see Figure 1. In the figure, the Lyapunov function is depicted as a discrete function of the neuron outputs. We assume that the two-dimensional neural network starts from the initial state $y[0]$. In each step, the net decreases the Lyapunov function; thus the first iteration produces $y[1] = [-1, +1]^T$, and in the second iteration the network arrives at the steady state $y[2]$.

Due to the similarity between (21) and (23), we can use the Hopfield net for MUD (as was done in [15,16,17]). In order to do this, the parameters in (22) have to be chosen according to (24) and (25), where matrix $A$ and matrix $R$ are the previously defined amplitude and discrete-time channel matrices, respectively. The number of performed iterations is denoted by $T_r$, which is limited by the speed of the hardware implementation and the symbol period $T$. Thus, the estimated symbols are given by the outputs of the neural network in the last iteration (26).

The first implicit Hopfield neural network-based realization of MUD was introduced by Varanasi and Aazhang [18]. They defined a multistage detection scheme without mentioning the term neural. Later many authors, for example, Miyajima et al. [15], Kechriotis and Manolakos [16], and Teich and Seidl [17], have shown that this structure is identical to, and can be replaced with, the recurrent neural network detector, often referred to as the Hopfield net detector.
Since the Lyapunov function in (23) can only be locally minimized by traditional Hopfield neural networks, the optimal solution is not reached in most cases. Avoiding the local minima of the Lyapunov function results in better performance. Many articles introduced modified recurrent neural network structures to achieve improved performance in MUD by avoiding local minima. For instance, Yoon and Rao [19] and Chen et al. [20] have proposed an annealed neural network multiuser receiver, which is based on the normal Hopfield network but applies an optimized sigmoid function as the nonlinearity in (22). This modification resulted in remarkable performance enhancement, although the optimality of this strategy has not yet been verified analytically. Kechriotis and Manolakos proposed another modified structure, named the hybrid detector, which consists of a reduced detector followed by a normal Hopfield neural network [21]. Although there exists some analytical description of the behavior of this hybrid receiver structure, its optimality has not yet been proven. Wang et al. [22] proposed a transiently chaotic neural network-based multiuser receiver scheme which originates from chaos theory. This deterministic network can reach the global optimum; however, its performance is strongly influenced by the choice of initial parameters, which are set on the basis of experimental results.
To achieve performance enhancement, we use stochastic sources inside the neurons, and to speed up the network, we apply hysteretic type nonlinearity. These two methods are introduced in the next section.

THE MODIFIED HOPFIELD NETWORK
For the sake of generality, we use the updating function (27) instead of (22), where $\nu_l[\ell]$ is a stochastic term with distribution $F(x, \ell)$ at time instant $\ell$, that is, $\Pr\{\nu_l[\ell] \le x\} = F(x, \ell)$. Parameter $h_l$ denotes the bound of the hysteresis function related to the $l$th neuron, and $\mathrm{sght}_h\{\cdot\}$ is a hysteretic decision function with decision boundary $h$ (see Figure 2). If one sets $h_l = 0$ and $\nu_l[\ell] = 0$, then the operation of the original Hopfield net is obtained.
In Figure 3, the above defined modified recurrent neural network is depicted. This structure combines the advantages of two previously investigated methods, namely the stochastic and the hysteretic type recurrent neural networks. The former is able to escape from local minima due to its stochastic nature and to find the optimal solution of the corresponding quadratic form; in exchange, additional iterations are required [4]. The latter is able to reduce the number of iterations and thus provides faster detection [3]. Both are detailed in the forthcoming two subsections.
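The combined update can be sketched as follows. Since the parameter rules are not reproduced here, the sketch assumes perfect power control ($A = I$), the multistage form in which the input of neuron $l$ is $\tilde{b}_l - \sum_{j \ne l} R_{lj} y_j$, sequential neuron updates, and an initial state taken from the sign of the matched filter output. All names are hypothetical, and `noise` is an optional callable standing in for the stochastic term.

```python
def hysteretic_sign(u, previous, h):
    """Decide +1 above h, -1 below -h, and keep the previous output in between."""
    if u > h:
        return 1
    if u < -h:
        return -1
    return previous


def modified_hopfield(R, b_tilde, n_iter, h=0.0, noise=None):
    """Sequential recurrent detector with optional hysteresis and additive noise."""
    # Assumed initial state: the conventional single-user decision.
    y = [1 if v >= 0 else -1 for v in b_tilde]
    n = len(y)
    for step in range(1, n_iter + 1):
        for l in range(n):
            # Matched filter output minus the reconstructed interference.
            u = b_tilde[l] - sum(R[l][j] * y[j] for j in range(n) if j != l)
            if noise is not None:
                u += noise(step)
            y[l] = hysteretic_sign(u, y[l], h)
    return y


# Hypothetical example: user 1 sent +1 but its matched filter output is slightly negative.
R_TOY = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.0], [0.5, 0.0, 1.0]]
print(modified_hopfield(R_TOY, [-0.1, -0.5, -0.5], n_iter=5))         # [1, -1, -1]
print(modified_hopfield(R_TOY, [-0.1, -0.5, -0.5], n_iter=5, h=1.0))  # [-1, -1, -1]
```

With `h=0` and no noise the plain Hopfield recursion corrects the first user's decision, while an overly large hysteresis bound freezes the initial state, which illustrates the speed versus error trade-off of the hysteretic nonlinearity.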

Stochastic recurrent neural net
The additional term $\nu[\ell]$ is expected to be symmetric, to provide equiprobability of movements in both directions:

$$F(x, \ell) = 1 - F(-x, \ell). \tag{28}$$

Figure 3: The structure of the modified recurrent neural network.

Furthermore, in order to obtain a stable solution, the variance of $\nu[\ell]$ should decrease with $\ell$.

It can be shown that if we use the distribution

$$F(x, \ell) = \frac{1}{1 + e^{\gamma[\ell] x}}, \tag{29}$$

termed the logistic distribution, to generate the values of $\nu[\ell]$, where $\gamma[\ell]$ is a negative parameter that decreases with $\ell$, then the recurrent neural network defined in (27) finds the optimum of the corresponding quadratic form defined in (21) with the highest probability. This statement is the subject of a separate paper and is not proven here. However, the cooling schedule defining the function $\gamma[\ell]$ remains open for further investigation. Previously, we used $\gamma[\ell] = -1.5 \cdot \ell$, which is applicable in the case of randomly generated channel matrices [23]. In other cases, where the channel matrices are constant, the resulting detection is slow [24]. In general, the choice of $\gamma[0]$ and of the function $\gamma[\ell]$ is a trade-off between performance and speed: a rapidly decreasing $\gamma[\ell]$ results in quick detection but worse performance, whereas a slowly decreasing $\gamma[\ell]$ can yield almost optimal performance, although the network may require more iterations to find the steady state.
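A minimal sketch of the noise source follows. It assumes the logistic distribution function has the form $F(x, \ell) = 1/(1 + e^{\gamma[\ell] x})$ with negative $\gamma[\ell]$, uses the linear cooling schedule $\gamma[\ell] = -A \cdot \ell$ with $A = 1.5$, and starts at $\ell = 1$ to avoid a zero parameter. The function names are hypothetical, and inverse transform sampling is a choice made here, not one prescribed by the text.

```python
import math
import random


def gamma_schedule(step, A=1.5):
    """Linear cooling schedule gamma[step] = -A * step, intended for step >= 1."""
    return -A * step


def logistic_cdf(x, gamma):
    """Assumed distribution function F(x) = 1 / (1 + exp(gamma * x)), gamma < 0."""
    z = gamma * x
    if z > 700.0:  # exp would overflow; the distribution function is ~0 here
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


def sample_logistic(gamma, rng):
    """Draw one noise value by inverting the distribution function."""
    u = rng.random()
    while u == 0.0:  # rng.random() may return 0.0, which cannot be inverted
        u = rng.random()
    return math.log((1.0 - u) / u) / gamma


rng = random.Random(0)
for step in (1, 5, 25):
    draws = [sample_logistic(gamma_schedule(step), rng) for _ in range(2000)]
    spread = sum(abs(d) for d in draws) / len(draws)
    print(step, round(spread, 3))  # the mean absolute noise shrinks as step grows
```

Because the sample is the same logit value divided by $\gamma[\ell]$, the noise scale is inversely proportional to the iteration index under this schedule, so the network gradually settles into a deterministic recursion.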

Neural net with hysteretic type nonlinearity
Recurrent neural networks with hysteretic type nonlinearities can provide fast convergence, since hysteresis prevents output changes in the case of small input values. Hysteretic type recurrent neural networks were introduced by Levendovszky et al. [3]. Here the parameters are still defined by rules (24) and (25), but the power control is assumed to be perfect, that is, $A = I$. In the following, $R$ is used instead of $W$ and $b$ instead of $V$. For special $R$ matrices, the uniqueness of the steady state, which is at the global optimum of the corresponding Lyapunov function, can be proven [3]. First, we define the so-called eye-openness parameter $D$ as given in (30), where $D$ must obviously be positive. Secondly, we define the internal double cube (DC) with parameter $\varepsilon$ ($0 < \varepsilon < 1$), denoted by $DC_\varepsilon$ and given in (31). Finally, we introduce vector $m$ as the continuous solution of (21), that is, $m$ satisfies $b = Rm$.
Theorem 1. If there exists a parameter $0 < \varepsilon < 1$ such that $m \in DC_\varepsilon$, and if $\kappa$, defined in (32), is positive, then the corresponding Hopfield recursion converges to the global optimum. Moreover, reaching the optimal state can be accelerated by using a hysteretic decision function with the parameter given in (33).

Proof. For a detailed proof see [3]. The outline of the proof can be summarized as follows. Firstly, if $m$ is in the double cube $DC_\varepsilon$, then $y = \mathrm{sign}\{m\}$ is a steady state solution of the corresponding quadratic form. Secondly, there is no other state $y \neq \mathrm{sign}\{m\}$ which could be a steady state. Thirdly, $y = \mathrm{sign}\{m\}$ is shown to be a stable solution.
Note that introducing hysteretic nonlinearity results in fewer state changes, thus the hysteretic neural network is more stable than the original Hopfield neural network. As a multiuser detector, the Hopfield network reaches the steady state in less than ten iterations (even if the number of users and the packet sizes are large). Thus, applying the hysteretic Hopfield network alone as a multiuser detector is generally not advisable. The need for it arises primarily in overloaded systems, where computation time must be minimized to handle all connections.
Sometimes, however, $\kappa > 0$ cannot be satisfied. Since in (32) only the term $(\varepsilon(1 + D) - 3)$ can be negative, we investigate its behavior. Two variables are considered: $D$ and $\varepsilon$. Parameter $D$ is deterministic; it depends on the structure of the discrete-time channel matrix $R$ and can be computed easily. On the other hand, due to the existence of unbounded thermal noise, $b$ can take an arbitrarily large value (see (9) and (18)). Thus, the vector $m$ can also be arbitrarily large, the boundary of the DC tends to infinity, and hence $\varepsilon$ cannot be kept in the interval $(0, 1)$. Nevertheless, if some error is allowed in the system, we can choose parameter $\varepsilon$ to provide a sufficiently small fault probability. In the next section, an example shows how to compute the fault probability, that is, the probability that $m$ escapes from the double cube $DC_\varepsilon$.

A simple example
Let the following system be investigated, which is defined by its discrete-time channel matrix $R$ given in (34). It is easy to see that the eye-openness parameter is $D = 24/7$. To sustain a positive $\kappa$, we must choose $\varepsilon$ to be at least $21/31$ (see (32)). The output of the RAKE receiver (9) is given as $b = [-0.85, 0.8, -0.75]^T + \tilde{n}$, where $\tilde{n}$ is a zero mean Gaussian noise with covariance matrix $R N_0$. Based on the above statements, $m = R^{-1} b$, where $m$ is still a Gaussian random vector, with $b$ mean and $R^{-1} N_0$ covariance. Taking into account the expression of the three-dimensional normal distribution, the probability of satisfying the condition defined in Theorem 1 can be expressed as the integral of the corresponding normal density over $DC_\varepsilon$. The results for different $E_b/N_0$ scenarios are shown in Table 1. The event $m \in DC_\varepsilon$ gives a sufficient, but not necessary, condition for finding the optimal solution. Thus, the value of $\Pr\{m \in DC_\varepsilon\}$ gives a lower bound on the probability of finding the global optimum. For instance, if $E_b/N_0 = 15$ dB, then the probability that $m$ is not in the double cube is equal to 0.08977. In other words, for the discrete-time channel matrix $R$ defined in (34), the optimal solution is missed with a probability of at most 9%. We must note that the probability of falling into the DC is not equal to the probability of finding the global optimum. In most cases, under faulty circumstances, it is still possible to start from a neighboring state and reach the global optimum.
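The threshold on $\varepsilon$ quoted in the example can be checked with exact arithmetic. The sketch below relies only on the statement that $(\varepsilon(1 + D) - 3)$ is the sole term that can be negative; the helper name `min_epsilon` is hypothetical.

```python
from fractions import Fraction


def min_epsilon(D):
    """Boundary value of epsilon at which epsilon * (1 + D) - 3 changes sign."""
    return Fraction(3) / (1 + D)


D = Fraction(24, 7)
eps_min = min_epsilon(D)
print(eps_min)         # 21/31
print(float(eps_min))  # roughly 0.677
print(eps_min < 1)     # True
```

Under the same reading, the boundary $3/(1 + D)$ falls below one only when $D > 2$, so an admissible $\varepsilon$ in $(0, 1)$ exists only for channel matrices with sufficiently wide eye-openness.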

ADAPTATION METHOD
In this paper, the adaptation is regarded as the approximation of the discrete-time channel matrix $R$, whereas the amplitude matrix $A$ is assumed to be known in the receiver. Furthermore, for the sake of simplicity, the amplitude matrix is assumed to be a unit matrix ($A = I$) in the sequel. However, the equations can be easily expanded to the more general case.
The RAKE receiver is designed to synchronize to, and keep track of, the incoming signal. The fingers of the RAKE receiver are assigned to different multipath components and are locked to the signal once synchronization is sensed. Assuming perfect knowledge of the system parameters ($R$), the optimal decision rule for single-user scenarios is to apply a signum function to (6) and (9). However, perfect estimation of the channel parameters may not be ensured due to thermal noise and multiuser interference. In our case, the RAKE receiver is modeled as a completely erroneous estimator (this situation is termed the noncoherent scenario), yielding $\hat{R} = I$ (i.e., the unit matrix instead of $R$) as the input of the adaptation method. In spite of the erroneous estimator, we still consider the output of the channel matched filter to be $b$, which is the perfect case. This may seem to contradict the previous statement. The explanation is simple: in this paper, the proposed method is analyzed in its simplest form, and the case of a faulty channel matched filter remains a subject of further investigation.
To carry out the detection, the detector algorithm needs the exact form of matrix $R$ (see, e.g., (24)). Due to the unknown channel and noise, the elements of this matrix must be estimated recursively from the incoming signal by means of adaptation. The adaptation method applied in the paper is based on Kohonen's self-organizing map (SOM) [25,26]. This method was first implemented for the purpose of MUD by Hottinen [27], who used SOM followed by a two-stage detector. We investigate two types of adaptation which are commonly used in mobile communication (see [25,26]): supervised and unsupervised algorithms. When the decisions are assumed to be correct at the detector (near-optimal performance), a training set can be obtained by matching the detected symbols against the received sequence at the input of the receiver. In this case, supervised methods such as learning vector quantization (LVQ) can be of use. However, when the detected sequence contains errors with high probability (the detector is far from optimal performance), blind equalization techniques, such as SOM, are needed.

Supervised learning based on LVQ
To apply supervised equalization, we assume that a training set is given where b[z] represents the received block number z (see ( 9) and ( 18)).In the case of communication session (following the training period), we can replace b [z] with the detected sequence b[z] .If the detector works in a near optimal fashion, we can track further changes in the channel or noise characteristics based on this set.LVQ equalizer operates in the following way (see [27]): where ω b [z] is the Voronoi tessellation of vector b [z] .The learning rate is defined as where c [z] = +1 in the case of correct classification, that is, b[z] = b [z] , and c [z] = −1 otherwise.In the expression above, b[z] / ∈ ω b [z] denotes misclassification when b[z] does not fall into the corresponding Voronoi cell denoted by ω b [z] .

Blind learning based on SOM
In the case of blind learning, we can apply the SOM algorithm described as follows: where b[z] represents the detected binary vector at the output of MUD, x denotes a generic binary vector and Ab[z] denotes a set of rival decisions.Rival decisions refer to a set of possible classifications excluding the chosen one (which is b[z] ).
Choosing the weight sequence η [z] should satisfy the conditions to adapt to a stationary distribution.Similar conditions are imposed on β [z]  [27].
One further structure should be considered, which is a modification of the SOM-based blind learning. Here (39) is used with $\beta^{[z]} = 0$. This method is often termed adaptive vector quantization (AVQ) or the on-line k-means method. It is worth emphasizing that AVQ is a blind method, whereas LVQ is a supervised one, despite the similar names. For both the synchronous and the asynchronous case, to avoid an increase in computational complexity, we focus only on the AVQ technique, that is, $\beta^{[z]} = 0$, but we term it SOM to avoid misunderstandings.
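One possible reading of the AVQ step is sketched below, under clearly stated assumptions: the centroid belonging to a decision $\hat{y}$ is taken to be $\hat{R}\hat{y}$, and the estimate $\hat{R}$ is changed by a rank-one correction that moves this centroid toward the observed matched filter output. This is an illustrative on-line k-means style update, not necessarily the exact form of (39); the function names, the toy matrix, and the noise-free test signal are hypothetical.

```python
import random


def learning_rate(step, eta0=0.02, n_steps=300):
    """Linearly decaying weight eta0 * (1 - step / n_steps)."""
    return eta0 * (1.0 - step / n_steps)


def avq_update(R_hat, b_tilde, y_hat, eta):
    """Rank-one correction moving the centroid R_hat @ y_hat toward b_tilde."""
    n = len(y_hat)
    centroid = [sum(R_hat[i][j] * y_hat[j] for j in range(n)) for i in range(n)]
    return [
        [R_hat[i][j] + eta * (b_tilde[i] - centroid[i]) * y_hat[j] for j in range(n)]
        for i in range(n)
    ]


def track_channel(R_true, n_steps=300, seed=0):
    """Noise-free toy run starting from R_hat = I and using correct decisions."""
    rng = random.Random(seed)
    n = len(R_true)
    R_hat = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for step in range(n_steps):
        y = [rng.choice((-1, 1)) for _ in range(n)]
        b_tilde = [sum(R_true[i][j] * y[j] for j in range(n)) for i in range(n)]
        R_hat = avq_update(R_hat, b_tilde, y, learning_rate(step, n_steps=n_steps))
    return R_hat


R_TRUE = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.0], [0.5, 0.0, 1.0]]
R_EST = track_channel(R_TRUE)
print([[round(v, 2) for v in row] for row in R_EST])
```

In an actual blind run the decisions would come from the detector output rather than from the transmitted symbols, so decision errors would slow the adaptation relative to this idealized run.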

THE STRUCTURE OF THE PROPOSED DETECTOR
In Figure 4, the structure of the proposed detector is depicted. The RAKE receiver block is assumed to be incorporated; it receives the incoming signal and outputs the values $b$ and $\hat{R}$. At the top of the figure, a random number generator is shown, which generates noise values for the neurons of the stochastic recurrent neural network based on the distribution function (29). This block has two inputs: $\ell$ indicates the iteration instant, and $A$ is the initial variance parameter of the noise, $\gamma[\ell] = -A \cdot \ell$. On the right, the modified neural network is shown, which has already been depicted in Figure 3. The input matrix $R$ and vector $b$ are used to set the network parameters based on rules (24) and (25), and vector $h = [h_1, h_2, \ldots, h_N]^T$ determines the hysteresis parameters of the network. The output $\hat{b}^{MNN}$ is the estimate of the transmitted symbols. The controller unit, on the one hand, adaptively estimates matrix $R$; on the other hand, it controls the entire detector. Depending on the structure of matrix $R$, this unit decides which operation the detector should follow: normal, stochastic, hysteretic, or combined stochastic and hysteretic. In the following subsections, these four different operations are detailed.

Normal operation
If the eye-openness parameter is close to, or less than, one and there is a limited amount of time to iterate, then normal operation is performed; thus both $A$ and vector $h$ are set to zero. Neither stochastic nor hysteretic properties are exhibited in this case. The controller unit focuses on the adaptive estimation of matrix $R$.

Stochastic operation
If there is enough computational capacity (i.e., the mobile network is lightly loaded) or the discrete-time channel matrix $R$ shows poor eye-openness ($D$ is small), then the controller can switch to stochastic operation. The limit vector $h$ is set to zero, thus hysteresis does not appear in the decision function of the neurons. Parameter $A$ is set to a proper value, which can be a pre-programmed constant (e.g., 2.5) or can be adaptively updated in the course of operation based on bit error ratio (BER) measurements.

Hysteretic operation
To perform hysteretic operation, matrix $R$ must show wide eye-openness. If the receiver serves many users and computational time is scarce, the controller drives the detector to perform hysteretic operation. Vector $h$ depends on the time available for demodulation: a larger $h$ value produces faster detection but a worse bit error ratio. Note that the minimal value of $h$ is zero; a negative value causes the detector to malfunction. Parameter $A$ is set to zero, thus the stochastic nature disappears.

Stochastic and hysteretic operation
The best performance can be achieved by applying both stochastic and hysteretic properties in the state transition rule (27). The advantage of hysteresis lies in shortening the convergence time of the detector. The controller unit determines which $h$ and $A$ are to be used. This operation is the most effective, although there is no general recipe for setting the corresponding parameters. A separate paper could be dedicated to this problem, thus it is not considered here. When simulating the detector, we focus on the stochastic operation.

SIMULATION RESULTS
To show the applicability of the proposed structure, simulations of both synchronous and asynchronous uplink are considered in the following two subsections. In the synchronous case, we show the results for all adaptive scenarios. For the SOM structure, we have omitted the terms which take into consideration the possible rival decisions ($\beta^{[z]} = 0$ in (39)). In this way, a considerable computational simplification became possible. Although this simplification results in performance degradation, the simplified structure is still able to yield a performance close to the coherent case, as will be demonstrated later.

Synchronous case
Uplink with synchronous transmission can be regarded as a system where intersymbol interference (ISI) is somehow avoided. If we assign limited bandwidth to the communication link, the waveform becomes infinite in the time domain, which gives rise to ISI. Thus, synchronous transmission models are not realistic, but they provide an easy way to simulate and to gain insight into the applicability of different methods. Section 7.2 deals with more sophisticated channel models.
Here the signal is modeled as a pulse amplitude modulated (PAM) stream, so $e(t) = \Delta_{T_c}(t)$, where $\Delta_{T_c}(t)$ is equal to one if $0 \le t < T_c$ and zero otherwise. An AWGN scenario is investigated ($\alpha_k = 1$ for all $k$) and the amplitudes of the users are equal ($A = I$). Seventeen users are considered ($K = 17$) with Gold codes of length $PG = 15$. Due to synchronous transmission, all $\theta_k$ are equal to zero. The resulting discrete-time channel matrix $R$ from (8) is visualized in Figure 5. Each row and each column is assigned to one user, and the boxes represent the absolute value of the cross correlation between the corresponding users. Dark tones refer to stronger correlations, while light tones show weaker correlations. The greatest elements of the matrix are located on the diagonal; these are the autocorrelation values and are equal to one.
For the synchronous scenario, the SHN was ordered to perform 1000 iterations with initial noise value A = 0.02 with linearly decreased γ.For the noncoherent methods, 300 iterations were performed and the initial learning weight was set to δ [0] = 0.01, η [0] = 0.02, and η [z] = η [0] (1 − τ/300), where τ refers to the iteration instance.The initial estimate of channel matrix R was set to R = I which corresponds to the worst case.In Figure 6, BER is exhibited as a function of E B /N 0 , the case of different detector structures and adaptation methods.The optimal performance can be lower  bounded by the theoretical BPSK limit where denotes the complementary error function [28].This BPSK bound is the lowest curve on the figure.The optimal detector is also simulated (assuming the full knowledge of R) denoted by MLSE-coherent on the figure.The corresponding curve is just beyond the BPSK bound.Next, the performance of the stochastic recurrent neural network was simulated assuming known channel characteristics.It is denoted by SHNcoherent on Figure 6.The corresponding curve is located above the MLSE-coherent as expected.Hereby, we must note  that the SHN-coherent detection algorithm provides the best performance among all suboptimal detection methods [24].For example, if we want to achieve BER = 10 −3 , then only 3 dB additional signal-to-noise ratio (SNR) is needed compared with the MLSE-coherent, although the system is overloaded (serving 17 users with processing gain of length 15).It may be noteworthy that the SUD method cannot reach BER = 10 −1 which is grossly unacceptable.Some simulation results related to the noncoherent case are also depicted in Figure 6.Namely, the MLSE method was implemented with the LVQ algorithm to adaptively identify matrix R. We see that the corresponding curve termed as MLSE-LVQ goes above the one of MLSE-coherent scenario due to the information loss on R. 
Next, the MLSE detector with the self-organizing map is denoted by MLSE-SOM. Using an unsupervised adaptation method results in a slight loss of performance (which can only be significant in the case of strong noise). The performance of SHN was also evaluated using both the LVQ and the SOM algorithms. Note that both curves are very close to the corresponding MLSE curves. An important conclusion can be drawn from the figure: for applying SHN with SOM, we pay only approximately 4 dB of additional SNR relative to the coherent SHN case.
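The reference curve in Figure 6 is the textbook BPSK limit over AWGN, commonly written as $P_b = \frac{1}{2}\operatorname{erfc}(\sqrt{E_b/N_0})$. The following sketch evaluates it together with the linearly decaying learning weight described for the noncoherent methods. The function names `bpsk_ber` and `learning_rate` are illustrative assumptions, not part of the original simulation code.

```python
import math

def bpsk_ber(ebn0_db):
    """Theoretical BPSK bit error rate over AWGN for a given E_b/N_0 in dB."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)          # convert dB to a linear ratio
    return 0.5 * math.erfc(math.sqrt(ebn0))

def learning_rate(tau, eta0=0.02, n_iter=300):
    """Linearly decaying learning weight: eta0 * (1 - tau / n_iter)."""
    return eta0 * (1.0 - tau / n_iter)

for snr_db in (0, 4, 8, 12):
    print(f"{snr_db:2d} dB -> BER bound {bpsk_ber(snr_db):.3e}")

print(learning_rate(0), learning_rate(150), learning_rate(300))
```

Any detector curve in the figure should lie on or above the values returned by `bpsk_ber`, since a single-user AWGN link is the interference-free best case.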
In Figure 7, the Frobenius norm of the difference between the discrete-time channel matrix $R$ and its estimated counterpart $\hat{R}$ is given as a function of the number of iterations (performed by the corresponding learning algorithms). The Frobenius norm is defined as follows:

$$\left\| R - \hat{R} \right\|_F = \sqrt{\sum_{i}\sum_{j} \left| R_{ij} - \hat{R}_{ij} \right|^2}. \qquad (43)$$

In this case, the $E_b/N_0 = 12$ dB scenario is considered. As can be seen, the LVQ algorithm yields faster convergence and a smaller deviation from the true channel matrix $R$ because it is a supervised method. The decision rule causes a further difference in learning performance, as the detected symbols are needed both by the SOM algorithm (39) and to reconstruct the Voronoi cells in the case of LVQ (37). Because of this effect, the learning algorithm (be it LVQ or SOM) converges more slowly when working with SHN than when working with MLSE.
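The deviation measure plotted in Figure 7 can be sketched as follows, using the standard definition of the Frobenius norm (the square root of the sum of squared entry-wise differences). The helper name `frobenius_deviation` and the toy 3 × 3 matrix are illustrative assumptions only.

```python
import numpy as np

def frobenius_deviation(R, R_hat):
    """Frobenius norm of the difference between R and its estimate R_hat."""
    diff = np.asarray(R, dtype=float) - np.asarray(R_hat, dtype=float)
    return float(np.sqrt(np.sum(np.abs(diff) ** 2)))

# Toy 3x3 correlation matrix (values are illustrative only).
R = np.array([[1.0, 0.2, -0.1],
              [0.2, 1.0, 0.3],
              [-0.1, 0.3, 1.0]])
R_hat = np.eye(3)  # initial estimate R_hat = I, as in the synchronous setup

print(frobenius_deviation(R, R_hat))
# Cross-check against NumPy's built-in Frobenius norm.
print(np.linalg.norm(R - R_hat, "fro"))
```

Starting from the identity estimate, only the off-diagonal cross correlations contribute to the initial deviation, so the curve in such a plot starts at the energy of the multiuser interference terms and decreases as the estimate improves.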
Figure 8 illustrates the cumulated BER as a function of iterations for $E_b/N_0 = 12$ dB. As the figure demonstrates, the BER decreases with time as the Frobenius norm of the deviation becomes smaller and smaller. We conclude that the LVQ algorithm provides the fastest adaptation; however, there is only a slight difference between SOM and LVQ in the performance of SHN. This implies that unsupervised learning can yield almost as good performance as supervised learning in the case of SHN. In the sequel, the asynchronous scenario is considered, for which the MLSE method cannot feasibly be simulated. To save computational time, only the unsupervised SHN-SOM method is simulated, which is expected to perform almost as well as the supervised SHN-LVQ algorithm.

Asynchronous case
The asynchronous uplink of a DS-CDMA transmission system is considered. We assumed BPSK modulation with a carrier frequency of 5.2 GHz, a chip duration of $T_c = 31.25$ ns, a measurement bandwidth of 128 MHz (which is equal to four samples per chip), a mobile velocity of 3 m/s, and a time-invariant channel impulse response. For $e(t)$, we used a root raised cosine pulse with roll-off parameter 0.22; the processing gain was set to $PG = 16$ in all the simulations. We simulated $K = 16$ users with both Walsh-Hadamard and extended Gold code sets. The delay configuration was $\theta_k = (k - 1)T_c$ for the $k$th user.
We have investigated the behavior of this complex structure in two typical types of indoor channel models [29]: (i) HPLOS: hard partitioned (HP) office scenario with line of sight (LOS) propagation, average RMS delay

The simple AWGN model was omitted here, as the other models represent more relevant channel scenarios. When the discrete-time channel matrices are calculated for these cases based on (15), the $R[i]$ values are nonzero for every $i$ because of the infinite duration of $e(t)$. However, for $|i| > 6$ they are negligible, and thus they can be omitted. The resulting discrete-time channel matrices can be found in [30]. For all four scenarios, only the SOM adaptation method is considered, with 100 iterations and $\eta[\tau] = (1 - \tau/100)\eta[0]$, where $\tau$ refers to the iteration instance. For the detector part, only SHN was investigated due to the unmanageable computational requirements of the MLSE. In the neurons, the same $F(x)$ distribution function was used (29) but with parameter γ[ ] = −2.5 • , and only 100 iterations were performed.
In Figure 9, the BER is depicted as a function of $E_b/N_0$ for the asynchronous HPLOS channel with extended Gold codes. We can see that both SHN with SOM and SHN-coherent yield reasonable detection performance. Of course, SHN-SOM is slightly worse than SHN-coherent: it needs an additional 2 dB of SNR to achieve the same performance. This stresses the advantage of SHN detection with SOM, as only a marginal increase in SNR yields the same performance as SHN-coherent. Thus, LVQ adaptation has not been simulated for the asynchronous scenarios, in order to save computational time. As can be seen in the figure, SUD also results in very poor performance in this case. Other detection schemes such as MLSE were not implemented due to the high computational complexity required by the algorithm.
Figure 10 depicts the same quantities as Figure 9; only the channel model is different (OPOBS instead of HPLOS). The results show the same tendencies as discussed for the previous figure. Figures 11 and 12 show the BER with respect to $E_b/N_0$ when Walsh-Hadamard codes are applied. Since these codes exhibit worse correlation properties in the asynchronous case than the extended Gold codes [31], the performance deteriorates. However, the same tendency can be observed as in the previous figures (but the BER is definitely higher).
Figure 13 shows the Frobenius norm of the deviation between the true discrete-time channel matrix $R$ and its estimated version $\hat{R}$, as given in (43). In all cases, the SHN detector with the SOM adaptation method is applied. The transient behavior of the norm can be studied for four different configurations: HPLOS with extended Gold codes, HPLOS with Walsh-Hadamard codes, OPOBS with extended Gold codes, and OPOBS with Walsh-Hadamard codes. In all cases, the adaptation method quickly reaches (in approximately ten iterations) a minimal Frobenius norm around 10, independently of the code sequences and the propagation model. It may seem surprising that the convergence of the norm (i.e., the convergence of the adaptation algorithm) is faster here than in the synchronous case in Figure 7. The reason lies in the parallel, block-wise update of the matrix elements, in contrast with the synchronous case. As a result, the matrix elements are updated 250 times within one iteration in the asynchronous case due to the block size $N_B = 250$.
Further simulation results, including dynamic behaviour of the structure (for instance, change in the number of active users), can be found in [32].

COMPUTATIONAL COMPLEXITY
The computational complexity of the proposed detector is evaluated as the complexity of the different blocks in Figure 4. The RAKE receiver is already implemented and widely used in spread spectrum systems. The only additional requirement related to the RAKE receiver is the accessibility of the vector b (i.e., soft output values), which is assumed to be implemented. To determine the computational requirements of the other three blocks, we must take into consideration some system parameters, for example, the bit rate of communication, chip rates, block size, etc. We consider the third generation universal mobile telecommunication system, where CDMA is already standardized as the basic modulation scheme. In our model, we assume packets at the input (18), thus we focus on one slot. Following the standard, one slot is defined to last for 0.625 ms and to contain 2560 chips. Each user chooses the rate of communication from 32 kbps up to 2 Mbps, thus the number of bits in one slot varies from 20 up to 1280. Of course, with higher speed communication, fewer active users can be accommodated in the system. Namely, only two 2 Mbps users are allowed to transmit at the same time, and the maximal number of users is 128 at 32 kbps. However, in a fully loaded system, the task is to demodulate 2560 symbols per slot. This implies 2560 decisions per slot, equivalent to approximately 4 million decisions per second. According to [23], 20 iterations should be sufficient. This results in approximately 80 million iterations per second. Thus, to generate random values for the internal noise sources in the neurons, the noise chip in the upper part of Figure 4 must be capable of generating 80 million values per second. This is an easy task for today's DSP devices.
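The slot arithmetic above can be checked with a few lines of Python. This is only a sketch of the bookkeeping; the constant and function names are illustrative, and the 2 Mbps rate is taken as 2.048 Mbit/s, an assumption made here because it is the value that reproduces the quoted 1280 bits per slot.

```python
SLOT_S = 0.625e-3        # slot duration in seconds
SYMBOLS_PER_SLOT = 2560  # symbols to demodulate per slot in a fully loaded system
ITERATIONS = 20          # SHN iterations per decision, as quoted from [23]

def bits_per_slot(bit_rate_bps):
    """Number of bits one user sends in a single slot."""
    return bit_rate_bps * SLOT_S

decisions_per_s = SYMBOLS_PER_SLOT / SLOT_S          # about 4.1 million
iterations_per_s = decisions_per_s * ITERATIONS      # about 82 million

low_rate_bits = bits_per_slot(32_000)                # 32 kbps user
high_rate_bits = bits_per_slot(2_048_000)            # assumed 2.048 Mbit/s user
max_low_rate_users = SYMBOLS_PER_SLOT / low_rate_bits
max_high_rate_users = SYMBOLS_PER_SLOT / high_rate_bits

print(decisions_per_s, iterations_per_s)
print(low_rate_bits, high_rate_bits, max_low_rate_users, max_high_rate_users)
```

The exact products are slightly above the rounded figures in the text (4.096 million decisions and 81.92 million iterations per second), which is consistent with the word "approximately".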
The stochastic Hopfield network should process these noise values, that is, the SHN must compute 80 million iterations per second. In one iteration, each neuron computes the updating rule defined in (27), which entails 2559 multiplications and 2561 additions. This adds up to 5120 operations per iteration per neuron. The resulting computational requirement is approximately 410 billion operations per second, which seems to be a huge number. However, it can be handled with distributed computing. Namely, we can take into account that (1) we can assign one specific DSP device to every symbol of each user, that is, one DSP for every neuron; (2) there are zero elements in the matrix W, that is, the intersymbol interference occurs locally, thus only neighboring symbols can overlap each other. Zero elements in W decrease the required computation.
Depending on the circumstances (available hardware infrastructure, network load), one DSP can also serve several neurons; our results must then be multiplied by the number of neurons computed by the DSP device. Based on the second condition, 3 × 128 symbols interfere in the worst case (128 users on the channel), which entails 383 multiplications, 385 additions, and one random number generation in each neuron, or equivalently 769 operations per slot per iteration. Using the above-mentioned values, approximately 25 million operations per second per neuron are required for SHN MUD. For instance, the Texas Instruments TMS320C5502 integrated circuit is capable of reaching 100 million operations per second, thus four neurons fit in one IC. Moreover, it is very cheap (approximately $8), which yields a low-cost implementation of the proposed detection scheme.
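The operation counts for the SHN can be reproduced as follows. The sketch only redoes the arithmetic quoted in the text; the variable names are illustrative, and the DSP throughput of 100 million per second is the figure stated above rather than a datasheet value.

```python
SLOT_S = 0.625e-3            # slot duration in seconds
ITERATIONS = 20              # iterations per decision

# Fully connected update rule: every neuron sees all other symbols.
full_ops = 2559 + 2561                   # multiplications + additions
neuron_iterations_per_s = 80e6           # rounded figure used in the text
full_total = full_ops * neuron_iterations_per_s

# Local interference only: 3 x 128 symbols interact in the worst case.
local_ops = 383 + 385 + 1                # mult + add + one random number
local_per_neuron = local_ops * ITERATIONS / SLOT_S

DSP_THROUGHPUT = 100e6                   # per-second figure quoted for the DSP
neurons_per_dsp = int(DSP_THROUGHPUT // local_per_neuron)

print(f"fully connected: {full_total:.3e} operations/s in total")
print(f"local coupling:  {local_per_neuron:.3e} operations/s per neuron")
print(f"neurons per DSP: {neurons_per_dsp}")
```

Exploiting the zero elements of W reduces the per-neuron cost from 5120 to 769 operations per iteration, which is what brings a single neuron within reach of one inexpensive processor.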
The HyNN has the same complexity; only the decision function differs, which does not need special attention. As has already been seen, the adaptation method contains linear operations at the end of every packet, resulting in the update of the matrix $R$, which comprises 2560 × 2560 elements. For the update of each component, 2562 multiplications and two additions are required every 0.625 ms (see, e.g., (39)). This results in approximately 4 million operations per second per matrix element; thus all elements cannot be updated using the same hardware, and a distributed architecture is needed for the adaptation method.
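A short sketch of the same bookkeeping for the adaptation block follows. The per-element figure reproduces the number in the text; the total over all matrix elements is derived here for illustration and is not a value reported in the paper.

```python
SLOT_S = 0.625e-3                 # one update per slot
N = 2560                          # R has N x N elements

ops_per_update = 2562 + 2         # multiplications + additions per element
ops_per_element_per_s = ops_per_update / SLOT_S

# Derived quantity (not stated in the paper): load if every element
# of R were updated on a single device.
total_ops_per_s = ops_per_element_per_s * N * N

print(f"{ops_per_element_per_s:.3e} operations/s per matrix element")
print(f"{total_ops_per_s:.3e} operations/s for the whole matrix")
```

The size of the total illustrates why the text calls for a distributed architecture rather than a single processor for the adaptation method.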

CONCLUSIONS AND FURTHER WORKS
In this paper, a novel MUD scheme was presented based on the theory of neural networks and adaptive detectors. The new detector is also capable of serving many users in heavily loaded environments. The robustness of the method lies in the application of the stochastic recurrent neural network and the self-organizing feature map. As an improvement, a control unit is also proposed which could switch between stochastic and hysteretic operations depending on the circumstances. However, the rules for doing so have not yet been investigated. The system has been tested by extensive simulations. The simulation results are promising and show that the performance of the proposed scheme is close to that of the optimal detector.
For further work, the authors plan to provide a more rigorous mathematical analysis of this system, including the cooling schedule, which plays an important role in the stochastic recurrent neural network. Furthermore, a separate SOM seems to be feasible in the controller part too, which could recognize which operation the detector should follow: stochastic, hysteretic, or both. For this, continuous adaptation will be required.

Figure 4: The structure of the proposed detector.

Figure 5: The structure of the channel matrix in the synchronous case.

Figure 7: The synchronous case: Frobenius norm versus number of iterations.

Figure 8: The synchronous case: cumulated BER versus number of iterations.

Figure 9: The HPLOS channel with extended Gold codes.

Figure 10: The OPOBS channel with extended Gold codes.

János Levendovszky received the M.S. degree in electrical engineering at the Budapest University of Technology and Economics, Hungary, in 1986. He obtained his Ph.D. degree (Candidate of Science) from the Hungarian Academy of Sciences in 1989 in the field of adaptive signal processing. Since then, he has been visiting scholar at Oxford University, UK, and conducted research on neural network theory at the Department of Mathematics, Katholieke Universiteit Leuven, Belgium. Presently, he is an Associate Professor of electrical engineering at the Department of Telecommunications at the Budapest University of Technology and Economics. He teaches and researches in the area of information and communication theory, networking, and soft computing.

László Pap graduated in 1967 from the Telecommunication Branch of the Electrical Engineering Faculty at the Technical University of Budapest. He received the candidate of sciences and the academic doctor of sciences degrees in 1980 and 1992, respectively. Since 1992, he has been a Professor and Head of the Telecommunications Department, and since 1994, he has been the dean of the Electrical Engineering and Informatics Faculty. His main interest includes communication theory, encoding, modulation, synchronization, further mobile communications, and the theory of modern telecommunication systems.

E. C. van der Meulen received his M.S. degree in mathematics from the University of Leiden, The Netherlands, in 1962 and his Ph.D. degree in statistics from the University of California, Berkeley, in 1968. In 1993, he received the Doctor honoris causa degree from the Technical University of Budapest, Hungary. From 1968 to 1975, he served on the faculty of the Department of Statistics at the University of Rochester, Rochester, NY. From September 1972 to January 1974, he was on leave at the Mathematical Center, Amsterdam, The Netherlands. He has been a Professor of Mathematics at the Katholieke Universiteit Leuven, Belgium, since September 1975. His research interests include multiuser information theory and information-theoretic statistical inference. He is a member of the editorial board of the American Journal of Mathematical and Management Sciences. Dr. van der Meulen is a member of IMS, ISI, and the Dutch Society for Statistics. Since 1990, he has served as Chairman of the Information Theory Chapter in the Benelux Section of the IEEE. The American Journal of Mathematical and Management Sciences awarded him the 1997 Jacob Wolfowitz Prize. In 1998, the Czech Academy of Sciences awarded him the B. Bolzano Honorary Medal. He is a Fellow of the IEEE.

Table 1: Probability of being in the DC.