It was observed [22–25] that (4) can be written as
where is a column vector with elements, is an matrix, and denotes the Hermitian transpose; (5) corresponds to the HNN energy function in (2). In order to use the HNN to perform MLSE equalization, the cost function (4) that is minimized by the Viterbi MLSE equalizer must be mapped to the energy function (5) of the HNN. This mapping is performed by expanding (4) for a given block length and a number of CIR lengths , starting from and increasing until a definite pattern emerges in and in (5). The emergence of a pattern in and enables the realization of an MLSE equalizer for the general case, that is, for systems with any and , yielding a generalized HNN MLSE equalizer that can be used in a single-carrier communication system.
Assuming that , , and contain complex values, these variables can be written as [22–25]
where and are column vectors of length , and is an matrix, where subscripts and denote the respective in-phase and quadrature components. is the cross-correlation matrix of the complex received symbols such that
implying that it is Hermitian. Therefore, is symmetric and is skew-symmetric [22, 23]. By using the symmetry properties of , (5) can be expanded and rewritten as
which in turn can be rewritten as [22–25]
It is clear that (9) is in the form of (5), where the variables in (5) are substituted as follows:
Equation (9) will be used to derive a general model for MQAM equalization.
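Because the symbols in (5) and (9) are elided above, the following sketch uses generic names: it evaluates the standard Hopfield energy form E(s) = -(1/2) sᵀTs - Iᵀs and exhaustively finds the bipolar state of minimum energy for a toy network. The matrix T and vector I below are arbitrary illustrative values, not taken from the paper.

```python
import numpy as np

def hnn_energy(s, T, I):
    """Standard Hopfield energy form: E = -1/2 s^T T s - I^T s."""
    return -0.5 * s @ T @ s - I @ s

# Illustrative 3-neuron example with a symmetric, zero-diagonal connection matrix.
T = np.array([[0.0, 0.5, -0.2],
              [0.5, 0.0, 0.3],
              [-0.2, 0.3, 0.0]])
I = np.array([1.0, -0.5, 0.2])

# For a tiny network the energy-minimising bipolar state can be found by brute force;
# the HNN iteration described later approximates this search at far lower cost.
states = [np.array([a, b, c]) for a in (-1, 1) for b in (-1, 1) for c in (-1, 1)]
best = min(states, key=lambda s: hnn_energy(s, T, I))
```

For realistic block lengths this exhaustive search is infeasible, which is precisely why the iterative HNN minimization of Section 4.3 is used instead.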
4.1. Systematic Derivation
The transmitted and received data block structures are shown in Figure 2, where it is assumed that known tail symbols are appended and prepended to the block of payload symbols. (The transmitted tails are to and to and are equal to )
The expressions for the unknowns in (9) are found by expanding (4) for a fixed data block length and increasing CIR length, and mapping the result to (9). Therefore, for a single-carrier system with a data block of length and a CIR of length , with the data block initiated and terminated by known tail symbols, and are given by
where and are respectively determined by
where and and denote the real and imaginary components of the CIR coefficients. Also,
where and is determined by
and is determined by
where with and again denoting the real and imaginary components of the respective vectors.
4.2. Training
Since the proposed equalizer is based on a neural network, it has to be trained. Unlike conventional supervised neural networks, however, the HNN MLSE equalizer does not have to be trained on a set of training examples [28]. Rather, it is trained anew in an unsupervised fashion for each received data block by using the coefficients of the estimated CIR to determine in (13) and in (14), for , which serve as the connection weights between the neurons. , , , and , determined according to (11), (12), (15), and (16) using the estimated CIR and the received symbol sequence, fully describe the structure of the equalizer for each received data block. and therefore describe the connection weights between the neurons, while and represent the input of the neural network.
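As a rough illustration of this per-block "training", the sketch below builds a Hermitian connection matrix from the autocorrelation of an estimated CIR, with zero self-connections. This is a hypothetical construction for illustration only; the actual weight expressions are those referenced in (11)–(14).

```python
import numpy as np

def cir_connection_matrix(h, N):
    """Hypothetical sketch: off-diagonal weights taken from the (negated)
    autocorrelation of the estimated CIR h; self-connections are zero."""
    L = len(h)
    # Autocorrelation of the CIR at lags 0 .. L-1.
    r = np.array([np.sum(h[:L - d] * np.conj(h[d:])) for d in range(L)])
    T = np.zeros((N, N), dtype=complex)
    for i in range(N):
        for j in range(N):
            d = abs(i - j)
            if 0 < d < L:
                T[i, j] = -r[d] if i < j else -np.conj(r[d])
    return T

# A fresh set of connection weights is computed for every received block
# from that block's estimated CIR -- no training examples are needed.
h_est = np.array([1.0 + 0.2j, 0.5 - 0.1j, 0.25 + 0.05j])
T = cir_connection_matrix(h_est, N=6)
```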
4.3. The Iterative System
In order for the HNN to minimize the energy function (5), the following dynamic system is used
where is an arbitrary constant and is the internal state of the network. An iterative solution for (19) is given by
where again is the internal state of the network, is the vector of estimated symbols, is the decision function associated with each neuron and indicates the iteration number. is a function used for optimization.
To determine the MLSE estimate for a data block of length with CIR coefficients for a MQAM system, the following steps are executed:

(1) Use the received symbols and the estimated CIR to calculate , , and according to (11), (12), (15), and (16).
(2) Initialize all elements in to .
(3) Calculate .
(4) Calculate .
(5) Go to step and repeat until , where is the predetermined number of iterations. ( iterations are used for the proposed equalizer.)
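Since the update expressions themselves are elided above, the following sketch assumes a common form of the HNN iteration: the internal state is taken as u = Ts + I and the estimates as s = g(β(n)u), with a tanh decision function and an assumed exponential annealing schedule. All names, the schedule, and the default iteration count are illustrative.

```python
import numpy as np

def hnn_equalize(T, I, Z=30, tau=5.0):
    """Schematic HNN MLSE iteration (assumed forms; cf. steps (1)-(5))."""
    s = np.zeros(len(I))                 # step (2): initialise all estimates
    for n in range(1, Z + 1):            # steps (3)-(5): iterate Z times
        beta = 1.0 - np.exp(-n / tau)    # assumed annealing gain, tends to 1
        u = T @ s + I                    # assumed internal-state update
        s = np.tanh(beta * u)            # decision function applied per neuron
    return s

# Two-neuron toy example: the signs of the inputs dominate the final estimates.
T_toy = np.array([[0.0, 0.2], [0.2, 0.0]])
I_toy = np.array([1.0, -1.0])
s_hat = hnn_equalize(T_toy, I_toy)
```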
As is clear from the algorithm, the estimated symbol vector is updated with each iteration. contains the best linear estimate for (it can be shown that contains the output of a RAKE receiver used in DSSS systems) and is therefore used as input to the network, while contains the cross-correlation information of the received symbols. The system solves (4) by iteratively mitigating the effect of ISI and produces the MLSE estimates in after iterations.
4.4. The Decision Function
4.4.1. Bipolar Decision Function
When BPSK modulation is used, only two signal levels are required to transmit information, so a bipolar decision function is used in the HNN BPSK MLSE equalizer. This function, also called a bipolar sigmoid function, is expressed as
and is shown in Figure 3. Note that the bipolar decision function can also be used in the MQAM model for equalization in 4QAM systems. This is an exception: although 4QAM modulation uses four signal levels, there are only two signal levels per dimension. By using the model derived for MQAM modulation, 4QAM equalization can therefore be performed with the bipolar decision function in (21), with the output scaled by a factor .
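A minimal sketch of such a bipolar sigmoid, assuming the common tanh form with a slope parameter β (the exact expression is the one in (21)):

```python
import numpy as np

def bipolar_decision(x, beta=10.0):
    """Bipolar sigmoid (assumed tanh form): saturates at -1 and +1."""
    return np.tanh(beta * x)
```

For 4QAM the same function would be applied independently to the in-phase and quadrature components, with the output scaled as described above.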
4.4.2. Multilevel Decision Function
Apart from 4QAM modulation, all other MQAM modulation schemes use multiple amplitude levels to transmit information as the "AM" in the acronym MQAM implies. A bipolar decision function will therefore not be sufficient; a multilevel decision function with distinct signal levels must be used, where is the modulation alphabet size.
A multilevel decision function can be realized by adding several bipolar decision functions, shifting each by a predetermined value, and scaling the result accordingly [29, 30]. To realize a level decision function for use in an MQAM HNN MLSE equalizer, the following function can be used:
where is the modulation alphabet size and is the value by which the respective bipolar decision functions are shifted. Figure 4 shows the four-level decision function used for the 16QAM HNN MLSE equalizer, for , and .
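One way to build such a staircase from shifted bipolar sigmoids is sketched below, assuming plateaus at ±A, ±3A, … with √M levels per dimension; the shift spacing of 2A and the tanh form are assumptions, with the exact function given in (22).

```python
import numpy as np

def multilevel_decision(x, M=16, A=1.0, beta=10.0):
    """Sum of shifted bipolar sigmoids giving sqrt(M) output levels
    (..., -3A, -A, +A, +3A for M = 16). Illustrative form only."""
    levels = int(np.sqrt(M))                              # levels per dimension
    # Centre the sigmoid shifts at multiples of 2A around the origin.
    shifts = 2.0 * A * (np.arange(levels - 1) - (levels - 2) / 2.0)
    return A * sum(np.tanh(beta * (x - c)) for c in shifts)
```

With M = 16 and A = 1 this yields the four plateaus -3, -1, +1, +3, matching the per-dimension amplitudes of a 16QAM constellation.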
Due to the time-varying nature of a mobile wireless communication channel and energy losses caused by absorption and scattering, the total power in the received signal is also time-variant. This complicates equalization when using the MQAM HNN MLSE equalizer, since the value by which the respective bipolar decision functions are shifted, , depends on the power in the channel and will therefore have a different value for every new data block arriving at the receiver. For this reason, the level decision function in (22) will change slightly for every data block. is determined by the Euclidean norm of the estimated CIR and is given by
where and are the th respective in-phase and quadrature components of the estimated CIR of length as before.
Figure 4 shows the four-level decision function for different values of to demonstrate the effect of varying power levels in the channel. Higher power in will cause the outer neurons to move away from the origin, whereas lower power will cause them to move towards the origin. Therefore, upon reception of a complete data block, is determined according to the power of the CIR, after which equalization commences.
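The dependence of the shift value on channel power can be sketched as follows, taking the shift as proportional to the Euclidean norm of the estimated CIR; the constant `scale` stands in for the exact scaling in (23), which is not reproduced here.

```python
import numpy as np

def decision_shift(h_i, h_q, scale=1.0):
    """Sketch: shift value proportional to the CIR's Euclidean norm."""
    h_i, h_q = np.asarray(h_i), np.asarray(h_q)
    return scale * np.sqrt(np.sum(h_i**2 + h_q**2))
```

A stronger channel therefore yields a larger shift, moving the outer plateaus of the decision function away from the origin, as described above.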
4.5. Optimization
Because MLSE is an NP-complete problem, there are a number of possible "good" solutions in the multidimensional solution space. By enumerating every possible solution it would be possible to find the best solution, that is, the sequence of symbols that minimizes (4) and (5), but this is not computationally feasible for systems with large and . The HNN is used to minimize (5) to find a near-optimal solution at very low computational cost. Because the HNN usually gets stuck in suboptimal local minima, it is necessary to employ optimization techniques as suggested in [31]. To aid the HNN in escaping less optimal basins of attraction, simulated annealing and asynchronous neuron updates are often used.
Markov Chain Monte Carlo (MCMC) algorithms are used together with Gibbs sampling in [32] to aid optimization in the solution space. According to [32], however, the complexity of the MCMC algorithms may become prohibitive due to the so-called stalling problem, which results from low-probability transitions in the Gibbs sampler. To remedy this problem, an optimization variable referred to as the "temperature" can be adjusted in order to avoid these small transition probabilities. This idea is similar to simulated annealing, where the temperature is adjusted to control the rate of convergence of the algorithm as well as the quality of the solution it produces.
4.5.1. Simulated Annealing
Simulated annealing has its origin in metallurgy, where annealing is the process used to temper steel and glass by heating them to a high temperature and then gradually cooling them, thus allowing the material to coalesce into a low-energy crystalline state [28]. In neural networks, this process is imitated to ensure that the network escapes less optimal local minima and converges to a near-optimal solution in the solution space. Because the network starts to iterate at a high temperature, it is able to escape the less optimal local minima early in the iteration cycle. As the temperature decreases, the network can still escape less optimal local minima, but it gradually converges towards the global minimum in the solution space to minimize the energy. This state of minimum energy corresponds to the optimal solution.
The output of the function in (20) is used for simulated annealing. As the system iterates, is incremented with each iteration and produces a value according to an exponential function, ensuring that the system converges to a near-optimal local minimum in the solution space. This function is given by
and is shown in Figure 5. This causes the output of to start at a near-zero value and to converge exponentially to 1 with each iteration.
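A sketch of such a schedule, assuming the simple exponential form β(n) = 1 − e^(−n/τ); the exact expression is the one in (24), and the time constant τ here is illustrative.

```python
import numpy as np

def annealing_gain(n, tau=5.0):
    """Assumed exponential annealing schedule: near 0 at n = 0,
    converging monotonically to 1 as the iteration number grows."""
    return 1.0 - np.exp(-n / tau)
```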
The effect of annealing on the bipolar and four-level decision functions during the iteration cycle is shown in Figures 6 and 7, respectively, with the slope of the decision function increasing as is updated with each iteration. Simulated annealing ensures near-optimal sequence estimation by allowing the system to escape less optimal local minima in the solution space, leading to better system performance.
Figures 8 and 9 show the neuron outputs, for the real and imaginary symbol components, of the 16QAM HNN MLSE equalizer for each iteration of the system, with and without annealing. It is clear that annealing allows the outputs of the neurons to gradually evolve and converge to near-optimal values in the multidimensional solution space, whereas without annealing the neurons do not produce reliable transmitted symbol estimates.
4.5.2. Asynchronous Updates
In artificial neural networks, the neurons can be updated using either parallel or asynchronous updates. Consider the iterative solution of the HNN in (20). Assume that , and each contain elements and that is an matrix with iterations as before.
When parallel neuron updates are used, all elements in are calculated before the elements in are determined, for each iteration. This implies that the output of the neurons will only be a function of the neuron outputs from the previous iteration. On the other hand, when asynchronous neuron updates are used, one element in is determined for every corresponding element in . This is performed times per iteration, once for each neuron. Asynchronous updates allow the changes of the neuron outputs to propagate to the other neurons immediately [31], whereas with parallel updates the outputs of the neurons are only propagated to the other neurons after all of them have been updated.
With parallel updates, the effect of the updates propagates through the network only after one complete iteration cycle. Because all of the neurons are updated together, the energy of the network might change drastically, causing the state of the neural network to "jump" around in the solution space due to the abrupt changes in the internal state of the network. This leads to degraded performance, since the network is not allowed to gradually evolve towards an optimal, or at least near-optimal, basin of attraction.
With asynchronous updates the state of the network changes after each element in is determined. This means that the state of the network undergoes gradual changes during each iteration. This ensures that the network traverses the solution space using small steps while searching for the global minimum. The computational complexity is identical for both parallel and asynchronous updates [31]. Asynchronous updates are therefore used for the HNN MLSE equalizer. The neurons are updated in a sequential order: .
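The two update schemes can be contrasted as follows; the update rule s = tanh(β(Ts + I)) is the same assumed form used earlier, and all names are illustrative.

```python
import numpy as np

def parallel_update(s, T, I, beta=1.0):
    """All neurons are updated from the previous iteration's outputs only."""
    return np.tanh(beta * (T @ s + I))

def asynchronous_update(s, T, I, beta=1.0):
    """Neurons are updated sequentially; each immediately sees the
    already-updated outputs of the neurons before it."""
    s = s.copy()
    for i in range(len(s)):          # sequential order: neuron 1, 2, ..., N
        s[i] = np.tanh(beta * (T[i] @ s + I[i]))
    return s

# Two coupled neurons starting from the zero state.
s0 = np.zeros(2)
T2 = np.array([[0.0, 1.0], [1.0, 0.0]])
I2 = np.array([1.0, 1.0])
```

With nonzero coupling the two schemes diverge within a single iteration: the second neuron in the asynchronous case already sees the first neuron's new output.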
4.6. Convergence and Performance
The rate of convergence and the performance of the HNN MLSE equalizer are dependent on the number of CIR coefficients as well as the number of iterations . Firstly, the number of CIR coefficients determines the level of interconnection between the neurons in the network. A long CIR will lead to a densely populated connection matrix in (10), consisting of in (11) and (12), which translates to a high level of interconnection between the neurons in the network. This enables the HNN MLSE equalizer to converge faster while producing better maximum likelihood sequence estimates, which is ultimately the result of the high level of diversity provided by a highly dispersive channel. Similarly, a short CIR will result in a sparse connection matrix , with which the HNN MLSE equalizer will converge more slowly while yielding less optimal maximum likelihood sequence estimates.
Secondly, simulated annealing, which forces the neuron outputs to discrete decision levels as the iteration number reaches the end of the iteration cycle (when ), ensures that the HNN MLSE equalizer will have converged by the last iteration (as dictated by ). This is clear from Figure 9. For small , the output of the HNN MLSE equalizer will be less optimal than for large . It was found that the HNN MLSE equalizer produces acceptable performance without excessive computational complexity for .
4.7. Soft Outputs
To enable the HNN MLSE equalizer to produce soft outputs, in (24) is scaled by a factor . This allows the outputs of the equalizer to settle between the discrete decision levels instead of being forced to settle on the decision levels.
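A minimal sketch of this soft-output idea: scaling the final annealing gain by a factor ρ < 1 keeps the tanh outputs away from the saturation levels, so they retain reliability information. Both ρ and the tanh form are illustrative assumptions; the actual scaling is the one applied to in (24).

```python
import numpy as np

def output(u, beta_final=10.0, rho=1.0):
    """rho = 1 drives the output close to the hard decision levels;
    rho < 1 leaves soft values between the levels (assumed tanh form)."""
    return np.tanh(rho * beta_final * u)

hard = output(0.5)            # saturates near +1
soft = output(0.5, rho=0.2)   # settles between 0 and +1
```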