It was observed [22–25] that (4) can be written in the form of (5), where s is a column vector with N elements, X is an N × N matrix, and (·)^H denotes the Hermitian transpose; (5) corresponds to the HNN energy function in (2). In order to use the HNN to perform MLSE equalization, the cost function (4) that is minimized by the Viterbi MLSE equalizer must be mapped to the energy function (5) of the HNN. This mapping is performed by expanding (4) for a given block length N and a number of CIR lengths L, starting from a short CIR and increasing L until a definite pattern emerges in the matrix and vector terms of (5). The emergence of this pattern enables the realization of an MLSE equalizer for the general case, that is, for systems with any N and L, yielding a generalized HNN MLSE equalizer that can be used in a single-carrier communication system.
Assuming that s, λ, and X contain complex values, these variables can be written in terms of their in-phase and quadrature components as in [22–25], where s_I, s_Q, λ_I, and λ_Q are column vectors of length N, X_I and X_Q are N × N matrices, and the subscripts I and Q denote the respective in-phase and quadrature components. X is the cross-correlation matrix of the complex received symbols such that X = X^H, implying that it is Hermitian. Therefore, X_I is symmetric and X_Q is skew-symmetric [22, 23]. By using these symmetry properties of X, (5) can be expanded and rewritten as (9) [22–25]. It is clear that (9) is in the form of (5), with the variables in (5) substituted as shown in (10).
Equation (9) will be used to derive a general model for M-QAM equalization.
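To make the real-valued rewriting concrete, the following is a minimal numpy sketch of the standard embedding of a complex quadratic form with a Hermitian matrix into an equivalent real form; the variable names and the exact arrangement used in (9) and (10) are assumptions here, not the paper's notation.

```python
import numpy as np

# Minimal sketch: embed the complex quadratic form s^H X s (X Hermitian) into
# an equivalent real-valued quadratic form, as assumed for the mapping in (10).
rng = np.random.default_rng(0)
N = 4

A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
X = (A + A.conj().T) / 2                  # Hermitian: X = X^H
X_I, X_Q = X.real, X.imag                 # X_I symmetric, X_Q skew-symmetric

s = rng.standard_normal(N) + 1j * rng.standard_normal(N)
s_ri = np.concatenate([s.real, s.imag])   # stacked in-phase/quadrature vector (length 2N)

T = np.block([[X_I, -X_Q],                # real block matrix built from X_I and X_Q
              [X_Q,  X_I]])

# The real quadratic form reproduces the (purely real) complex quadratic form
assert np.allclose(s_ri @ T @ s_ri, (s.conj() @ X @ s).real)
```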
4.1. Systematic Derivation
The transmitted and received data block structures are shown in Figure 2, where it is assumed that known tail symbols are appended and prepended to the block of payload symbols. (The transmitted tail symbols at both ends of the block are set to a fixed, known value.)
The expressions for the unknowns in (9) are found by expanding (4), for a fixed data block length N and increasing CIR length L, and mapping the result to (9). Therefore, for a single-carrier system with a data block of length N and a CIR of length L, with the data block initiated and terminated by known tail symbols, X_I and X_Q are given by (11) and (12), where their elements are respectively determined by (13) and (14), with h_I and h_Q denoting the real (in-phase) and imaginary (quadrature) components of the CIR coefficients. Also, λ_I and λ_Q are determined by (15) and (16), respectively, with r_I and r_Q again denoting the real and imaginary components of the received symbol vector.
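The closed-form expressions in (11)–(16) are not reproduced here, but the quantities they describe can be illustrated with a short numpy sketch. It assumes the usual quadratic expansion of the MLSE metric, in which the weight matrix is built from the estimated CIR and λ is the matched-filter (RAKE-like) statistic of the received block; the exact indexing and tail-symbol handling in (11)–(16) may differ.

```python
import numpy as np

def build_hnn_quantities(r, h, N):
    """Illustrative construction (an assumption, not the paper's (11)-(16)):
    expanding ||r - H s||^2 gives X = H^H H and lambda = H^H r, where H is the
    channel convolution matrix for a block of N symbols."""
    L = len(h)
    H = np.zeros((N + L - 1, N), dtype=complex)
    for n in range(N):
        H[n:n + L, n] = h                 # each column is a shifted copy of the CIR
    X = H.conj().T @ H                    # Hermitian; X.real symmetric, X.imag skew-symmetric
    lam = H.conj().T @ r                  # matched-filter (RAKE-like) statistic
    return X.real, X.imag, lam.real, lam.imag

# Hypothetical usage with a random 4-QAM block and a 3-tap CIR
rng = np.random.default_rng(1)
N, h = 8, np.array([1.0, 0.5 - 0.3j, 0.2j])
s = (2 * rng.integers(0, 2, N) - 1) + 1j * (2 * rng.integers(0, 2, N) - 1)
r = np.convolve(s, h) + 0.05 * (rng.standard_normal(N + 2) + 1j * rng.standard_normal(N + 2))
X_I, X_Q, lam_I, lam_Q = build_hnn_quantities(r, h, N)
```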
4.2. Training
Since the proposed equalizer is based on a neural network, it has to be trained. However, the HNN MLSE equalizer does not have to be trained by providing a set of training examples, as is done for conventional supervised neural networks [28]. Rather, the HNN MLSE equalizer is trained anew, in an unsupervised fashion, for each received data block by using the coefficients of the estimated CIR to determine the elements in (13) and (14), which serve as the connection weights between the neurons. X_I, X_Q, λ_I, and λ_Q fully describe the structure of the equalizer for each received data block; they are determined according to (11), (12), (15), and (16), using the estimated CIR and the received symbol sequence. X_I and X_Q therefore describe the connection weights between the neurons, while λ_I and λ_Q represent the input of the neural network.
4.3. The Iterative System
In order for the HNN to minimize the energy function (5), the dynamic system in (19) is used, where u is the internal state of the network and the remaining scaling constant is arbitrary. An iterative solution of (19) is given by (20), where again u is the internal state of the network, s is the vector of estimated symbols, g(·) is the decision function associated with each neuron, and k indicates the iteration number. β(·) is a function used for optimization.
To determine the MLSE estimate for a data block of length N with L CIR coefficients in an M-QAM system, the following steps are executed:
(1) Use the received symbols r and the estimated CIR h to calculate X_I, X_Q, λ_I, and λ_Q according to (11), (12), (15), and (16).
(2) Initialize all elements of the estimated symbol vector [s_I; s_Q].
(3) Calculate the internal state u of the network according to (20).
(4) Calculate the updated symbol estimates [s_I; s_Q] from u by applying the decision function, as in (20).
(5) Go to step (3) and repeat until k = Z, where Z is the predetermined number of iterations.
As is clear from the algorithm, the estimated symbol vector is updated with each iteration. λ contains the best linear estimate of the transmitted symbols (it can be shown that λ corresponds to the output of a RAKE receiver used in DSSS systems) and is therefore used as the input to the network, while X contains the cross-correlation information of the received symbols. The system solves (4) by iteratively mitigating the effect of ISI and produces the MLSE estimates after Z iterations.
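As a minimal sketch of this loop, the common Hopfield-type update u ← Tv + b followed by v ← g(β(k)u) is assumed as a stand-in for (20), with T the connection matrix from (10) and b the stacked input vector; these names are chosen here for illustration and are not the paper's.

```python
import numpy as np

def hnn_equalize(T, b, g, beta, Z):
    """Sketch of the iteration: T is the connection matrix from (10), b the input
    vector (lambda_I and lambda_Q stacked), g the decision function, beta the
    annealing schedule, and Z the number of iterations."""
    v = np.zeros_like(b)                  # estimated symbol vector [s_I; s_Q]
    for k in range(1, Z + 1):
        u = T @ v + b                     # internal state of the network
        v = g(beta(k, Z) * u)             # neuron outputs through the decision function
    return v

# Hypothetical usage for 4-QAM (bipolar decision per dimension):
# v_hat = hnn_equalize(T, b, g=np.tanh, beta=lambda k, Z: np.exp(-(Z - k) / 5.0), Z=20)
```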
4.4. The Decision Function
4.4.1. Bipolar Decision Function
When BPSK modulation is used, only two signal levels are required to transmit information. Since only two signal levels are used, a bipolar decision function is employed in the HNN BPSK MLSE equalizer. This function, also called a bipolar sigmoid function, is expressed in (21) and shown in Figure 3. It must also be noted that the bipolar decision function can be used in the M-QAM model for equalization in 4-QAM systems. This is an exception since, although 4-QAM modulation uses four constellation points, there are only two signal levels per dimension. By using the model derived for M-QAM modulation, 4-QAM equalization can be performed with the bipolar decision function in (21), with the output appropriately scaled.
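For illustration, a bipolar sigmoid of this kind can be written as below; the exact form and scaling of (21) are assumptions here.

```python
import numpy as np

def bipolar_decision(u, scale=1.0):
    """Bipolar sigmoid decision function (illustrative form of (21)): saturates at
    -scale and +scale; usable for BPSK and, per dimension, for 4-QAM."""
    return scale * np.tanh(u)
```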
4.4.2. Multilevel Decision Function
Apart from 4-QAM modulation, all other M-QAM modulation schemes use multiple amplitude levels to transmit information as the "AM" in the acronym M-QAM implies. A bipolar decision function will therefore not be sufficient; a multilevel decision function with
distinct signal levels must be used, where
is the modulation alphabet size.
A multilevel decision function can be realized by adding several bipolar decision functions and shifting each by a predetermined value, and scaling the result accordingly [29, 30]. To realize a
-level decision function for use in an M-QAM HNN MLSE equalizer, the following function can be used:
where
is the modulation alphabet size and
is the value by which the respective bipolar decision functions are shifted. Figure 4 shows the four-level decision function used for the 16-QAM HNN MLSE equalizer, for
,
and
.
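A possible realization of the four-level function for 16-QAM, built from shifted and scaled bipolar sigmoids as described above, is sketched below; the particular shifts and the slope parameter are assumptions, not the exact form of (22).

```python
import numpy as np

def four_level_decision(u, A, slope=1.0):
    """Four-level decision function for 16-QAM (illustrative form of (22)): a sum of
    three shifted bipolar sigmoids that saturates at -3A, -A, A and 3A."""
    shifts = np.array([-2.0 * A, 0.0, 2.0 * A])
    return A * sum(np.tanh(slope * (u - c)) for c in shifts)
```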
Due to the time-varying nature of a mobile wireless communication channel, and energy losses caused by absorption and scattering, the total power in the received signal is also time-variant. This complicates equalization with the M-QAM HNN MLSE equalizer, since the value A by which the respective bipolar decision functions are shifted depends on the power in the channel and will therefore differ for every new data block arriving at the receiver. For this reason the multilevel decision function in (22) changes slightly for every data block. A is determined by the Euclidean norm of the estimated CIR and is given by (23), where h_I(i) and h_Q(i) are the ith in-phase and quadrature components of the estimated CIR of length L, as before.
Figure 4 shows the four-level decision function for different values of A to demonstrate the effect of varying power levels in the channel. Higher power in the CIR causes the outer levels of the decision function to move away from the origin, whereas lower power causes them to move towards the origin. Therefore, upon reception of a complete data block, A is determined according to the power of the CIR, after which equalization commences.
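As a small illustration, A can be recomputed from the estimated CIR before each block is equalized; the exact normalization in (23) is assumed here to be the plain Euclidean norm.

```python
import numpy as np

def shift_value(h_I, h_Q):
    """Shift value A from the Euclidean norm of the estimated CIR (an assumed
    reading of (23)); recomputed for every received data block."""
    return np.sqrt(np.sum(h_I ** 2 + h_Q ** 2))
```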
4.5. Optimization
Because MLSE is an NP-complete problem, there are a number of possible "good" solutions in the multidimensional solution space. By enumerating every possible solution it is possible to find the best solution, that is, the sequence of symbols that minimizes (4) and (5), but this is not computationally feasible for systems with large N and L. The HNN is used to minimize (5) and find a near-optimal solution at very low computational cost. Because the HNN tends to get stuck in suboptimal local minima, it is necessary to employ optimization techniques, as suggested in [31]. To aid the HNN in escaping less optimal basins of attraction, simulated annealing and asynchronous neuron updates are often used.
Markov Chain Monte Carlo (MCMC) algorithms are used together with Gibbs sampling in [32] to aid optimization in the solution space. According to [32], however, the complexity of the MCMC algorithms may become prohibitive due to the so-called stalling problem, which results from low-probability transitions in the Gibbs sampler. To remedy this problem, an optimization variable referred to as the "temperature" can be adjusted in order to avoid these small transition probabilities. This idea is similar to simulated annealing, where the temperature is adjusted to control the rate of convergence of the algorithm as well as the quality of the solution it produces.
4.5.1. Simulated Annealing
Simulated annealing has its origin in metallurgy, where annealing is the process used to temper steel and glass by heating them to a high temperature and then gradually cooling them, thus allowing the material to coalesce into a low-energy crystalline state [28]. In neural networks, this process is imitated to ensure that the network escapes less optimal local minima and converges to a near-optimal solution in the solution space. As the neural network starts to iterate, there are many candidate solutions in the solution space, but because the network starts iterating at a high temperature, it is able to escape the less optimal local minima. As the temperature decreases, the network can still escape less optimal local minima, but it gradually converges towards the global minimum in the solution space, minimizing the energy. This state of minimum energy corresponds to the optimal solution.
The output of the function β(·) in (20) is used for simulated annealing. As the system iterates, k is incremented with each iteration, and β(k) produces a value according to an exponential function to ensure that the system converges to a near-optimal local minimum in the solution space. This function is given by (24) and shown in Figure 5. It causes the output of β(·) to start at a near-zero value and to converge exponentially to 1 with each iteration.
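One exponential schedule with this behavior, starting near zero and saturating at 1 by the last iteration, is sketched below; the exact expression and the rate constant in (24) are assumptions.

```python
import numpy as np

def annealing_schedule(k, Z, tau=5.0):
    """Illustrative annealing function beta(k): near zero for the first iterations
    and converging exponentially to 1 as k approaches Z (the rate tau is assumed)."""
    return np.exp(-(Z - k) / tau)
```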
The effect of annealing on the bipolar and four-level decision functions during the iteration cycle is shown in Figures 6 and 7, respectively, with the slope of the decision function increasing as β(k) is updated with each iteration. Simulated annealing ensures near-optimal sequence estimation by allowing the system to escape less optimal local minima in the solution space, leading to better system performance.
Figures 8 and 9 show the neuron outputs, for the real and imaginary symbol components, of the 16-QAM HNN MLSE equalizer for each iteration of the system, with and without annealing. It is clear that annealing allows the outputs of the neurons to evolve gradually and converge to near-optimal values in the 2N-dimensional solution space, whereas without annealing the equalizer does not produce reliable transmitted symbol estimates.
4.5.2. Asynchronous Updates
In artificial neural networks, the neurons can be updated using either parallel or asynchronous updates. Consider the iterative solution of the HNN in (20). Assume that the internal state u, the estimated symbol vector [s_I; s_Q], and the input vector [λ_I; λ_Q] each contain 2N elements, and that the connection matrix in (10) is a 2N × 2N matrix, with Z iterations as before.
When parallel neuron updates are used, all 2N elements of u are calculated before the 2N elements of [s_I; s_Q] are determined, for each iteration. This implies that the output of the neurons is only a function of the neuron outputs from the previous iteration. On the other hand, when asynchronous neuron updates are used, one element of [s_I; s_Q] is determined for every corresponding element of u. This is performed 2N times per iteration, once for each neuron. Asynchronous updates allow changes in the neuron outputs to propagate to the other neurons immediately [31], whereas with parallel updates the outputs of all 2N neurons are only propagated to the other neurons after all of them have been updated.
With parallel updates the effect of the updates propagates through the network only after one complete iteration cycle. This implies that the energy of the network might change drastically, because all of the neurons are updated together. This will cause the state of the neural network to "jump" around in the solution space, due to the abrupt changes in the internal state of the network. This leads to degraded performance, since the network is not allowed to gradually evolve towards an optimal, or at least a near-optimal, basin of attraction.
With asynchronous updates the state of the network changes after each element of [s_I; s_Q] is determined. This means that the state of the network undergoes 2N gradual changes during each iteration, ensuring that the network traverses the solution space in small steps while searching for the global minimum. The computational complexity is identical for parallel and asynchronous updates [31]. Asynchronous updates are therefore used for the HNN MLSE equalizer, with the neurons updated in a fixed sequential order.
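The difference between the two update schemes can be sketched as follows, reusing the notation of the earlier sketch; the sequential order and the exact update rule are assumptions.

```python
import numpy as np

def iterate_parallel(T, b, v, g, beta_k):
    """Parallel update: every internal state is computed from the previous outputs
    before any neuron output changes."""
    return g(beta_k * (T @ v + b))

def iterate_async(T, b, v, g, beta_k):
    """Asynchronous update: neurons are updated one at a time in a fixed order, so
    each change propagates immediately to the neurons updated after it."""
    v = v.copy()
    for n in range(len(v)):
        u_n = T[n] @ v + b[n]             # internal state of neuron n
        v[n] = g(beta_k * u_n)
    return v
```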
4.6. Convergence and Performance
The rate of convergence and the performance of the HNN MLSE equalizer depend on the number of CIR coefficients L as well as the number of iterations Z. First, the number of CIR coefficients determines the level of interconnection between the neurons in the network. A long CIR leads to a densely populated connection matrix in (10), consisting of X_I in (11) and X_Q in (12), which translates to a high level of interconnection between the neurons. This enables the HNN MLSE equalizer to converge faster while producing better maximum likelihood sequence estimates, which is ultimately the result of the high level of diversity provided by a highly dispersive channel. Similarly, a short CIR results in a sparse connection matrix, in which case the HNN MLSE equalizer converges more slowly and yields less optimal maximum likelihood sequence estimates.
Second, simulated annealing, which forces the neuron outputs to discrete decision levels as the iteration number reaches the end of the iteration cycle (when k approaches Z), ensures that the HNN MLSE equalizer will have converged by the last iteration (as dictated by Z). This is clear from Figure 9. For small Z, the output of the HNN MLSE equalizer will be less optimal than for large Z. It was found that the HNN MLSE equalizer produces acceptable performance without excessive computational complexity for the number of iterations used in this work.
4.7. Soft Outputs
To enable the HNN MLSE equalizer to produce soft outputs, β(k) in (24) is scaled by a constant factor so that the decision function does not saturate fully. This allows the outputs of the equalizer to settle between the discrete decision levels instead of being forced onto the decision levels.
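A minimal sketch of this idea, with a hypothetical scaling factor gamma applied to the annealing schedule:

```python
import numpy as np

def soft_annealing_schedule(k, Z, gamma=0.8, tau=5.0):
    """Annealing schedule scaled by a hypothetical factor gamma so the decision
    function never fully saturates, yielding soft (between-level) outputs."""
    return gamma * np.exp(-(Z - k) / tau)
```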