A primer on equalization, decoding and non-iterative joint equalization and decoding

In this article, a general model for non-iterative joint equalization and decoding is systematically derived for use in systems transmitting convolutionally encoded BPSK-modulated information through a multipath channel, with and without interleaving. Optimal equalization and decoding are discussed first, by presenting the maximum likelihood sequence estimation and maximum a posteriori probability algorithms and relating them to equalization in single-carrier channels with memory, and to the decoding of convolutional codes. The non-iterative joint equalizer/decoder (NI-JED) is then derived for the case where no interleaver is used, as well as for the case where block interleavers of varying depths are used, and a complexity analysis is performed in each case. Simulations are performed to compare the performance of the NI-JED to that of a conventional turbo equalizer (CTE), and it is shown that the NI-JED outperforms the CTE, although at much higher computational cost. This article serves to explain the state of the art to students and professionals in the field of wireless communication systems, presenting these fundamental topics clearly and concisely.


Introduction
Equalization and decoding are two essential aspects of any wireless communication system. The equalizer is tasked with reversing the effect of the communication channel on the transmitted information signal, while the decoder receives the equalized symbol sequence and attempts to correct errors that might have been caused during transmission.
The Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm, also known as the maximum a posteriori probability (MAP) algorithm, is useful when designing equalizer and decoder algorithms [1,2]. The BCJR algorithm receives soft probabilistic information regarding the input and produces a posteriori probabilistic information regarding the output, and is aptly called a soft-input soft-output (SISO) algorithm. The availability of reliable soft information at the input allows for more accurate a posteriori estimates to be produced at the output, which improves overall system performance [3-5].
The equalizer can be designed by using the Viterbi algorithm (VA), also known as the maximum likelihood sequence estimation (MLSE) algorithm, which makes use of the min-sum algorithm to find the most probable transmitted sequence [2,6,7]. The VA performs optimally but it is only able to produce hard estimates at the output, and is therefore not an attractive choice when the equalizer is followed by a SISO decoder. The VA can be modified to produce suboptimal posterior probabilistic information at the output, resulting in the soft output Viterbi algorithm (SOVA) in [8], but because of its suboptimal nature the overall system performance will also be suboptimal. Using the MAP algorithm, the equalizer is able to produce optimal posterior probabilistic information regarding the transmitted information, which can be exploited by the SISO decoder.
While the VA and SOVA can also be used for decoding, the MAP algorithm is the algorithm of choice when the output of the decoder is fed back to be used by the equalizer, a technique known as turbo equalization [3,4]. However, when the estimates of the uncoded transmitted symbols are taken directly from the output of the decoder, i.e., when no turbo equalization is performed, the MLSE algorithm will suffice.

http://asp.eurasipjournals.com/content/2013/1/79

Joint equalization and decoding can be performed by the MLSE or MAP algorithms by employing a super-trellis, as shown in [9], provided that the interleaver depth is kept small enough for the computational complexity to remain manageable. The joint equalizer and decoder achieves optimal non-iterative equalization and decoding. Since the input of the joint equalizer and decoder is the ISI-corrupted coded transmitted symbols, and the output is the equalized and decoded symbol estimates, no posterior probabilistic information is required at the output. Therefore, the MLSE algorithm can also be used. The MAP algorithm will perform just as well, albeit with approximately twice the computational complexity. It has also been shown in [10] that joint decoding of turbo codes can be done on a super-trellis as well.
Figure 1 shows a block diagram of the communication system considered in this article. The source information is encoded, after which an interleaver is used to separate adjacent coded bits temporally. The coded bits are mapped to modulation symbols chosen from a modulation alphabet $D$, after which the symbols are used to modulate the carrier before transmission. The transmitted information passes through a multipath white Gaussian noise channel and is received by the receiver antenna. After reception the signal is demodulated and matched filtered in order to produce a received symbol sequence. The channel impulse response (CIR) is estimated using a number of known pilot symbols, and is provided as input to the equalizer together with the received symbol sequence. The equalizer reverses the effect of the multipath channel and produces optimal symbol estimates (MAP) or an optimal sequence estimate (MLSE) regarding the transmitted coded bits after interleaving. The output of the equalizer is deinterleaved and provided as input to the decoder, which produces optimal estimates regarding the uncoded transmitted information in the form of log-likelihood ratios (LLRs), which are mapped back to bits to produce a final estimate of the source information. When joint equalization and decoding is performed, the equalizer, deinterleaver, and decoder are replaced by one functional block, as indicated by the dashed line around these functional blocks. Equalization and decoding are performed simultaneously, producing LLR estimates of the source information at the output.
Equalization, decoding, and joint equalization and decoding will be discussed in the context of the MAP algorithm, but for completeness the MLSE algorithm will also be discussed. The MLSE and MAP algorithms are discussed next, after which they are related to equalization and decoding.

The MLSE and MAP algorithms
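The MLSE and MAP algorithms are used for equalization as well as for the decoding of convolutional codes [1,2,6,7,11]. The MLSE algorithm optimally estimates the most probable sequence of transmitted symbols/codewords (depending on equalization/decoding), while the MAP algorithm exactly estimates the probability of each transmitted symbol/codeword. These algorithms are underpinned by Bayes' rule of conditional probability, which states that

$$P(d_t = d^{(m)} \mid r_t) = \frac{P(r_t \mid d_t = d^{(m)})\, P(d_t = d^{(m)})}{P(r_t)}, \tag{1}$$

where $P(d_t = d^{(m)})$ is the prior probability of transmitting symbol/codeword $d^{(m)}$ at time instant $t$. For equalization, the symbols $d^{(m)}$ are chosen from a given modulation alphabet, while for decoding the codewords $d^{(m)}$ are chosen from the list of codewords produced by the convolutional encoder. For a Gaussian noise channel, the likelihood of the received symbol/codeword is

$$P(r_t \mid d_t = d^{(m)}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{\Delta_t^{(m)}}{2\sigma^2}\right), \tag{2}$$

where $\sigma$ is the noise standard deviation and $\Delta_t^{(m)}$ is a cost function for the purpose of minimizing the Euclidean distance between the received symbol/codeword and the symbol/codeword to be estimated, which will be discussed in Sections 3 and 4 for equalization and decoding, respectively. Since $P(d_t = d^{(m)})$ and $P(r_t)$ are independent of the choice of $d^{(m)}$, they can be absorbed, together with the constant factor of the Gaussian density, into a normalization constant $\beta$, so that

$$P(d_t = d^{(m)} \mid r_t) = \beta \exp\left(-\frac{\Delta_t^{(m)}}{2\sigma^2}\right). \tag{3}$$

In order to apply Bayes' rule over a transmitted block of $N$ symbols/codewords, the product rule can be used to express (1) for a sequence of length $N$ such that

$$P(\mathbf{d} \mid \mathbf{r}) = \beta \prod_{t=1}^{N} \exp\left(-\frac{\Delta_t^{(m)}}{2\sigma^2}\right) = \beta \exp\left(-\frac{1}{2\sigma^2} \sum_{t=1}^{N} \Delta_t^{(m)}\right), \tag{4}$$

where any symbol/codeword $d_t$ in the symbol/codeword sequence $\mathbf{d} = \{d_1, d_2, \ldots, d_N\}$ can be substituted for any $d^{(m)}$, and where $\mathbf{r} = \{r_1, r_2, \ldots, r_N\}$ is the received symbol/codeword sequence.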
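As a minimal sketch of how Bayes' rule with a Gaussian likelihood turns a single received sample into symbol posteriors, the following Python function may help. The name `symbol_posteriors`, the squared-distance cost, and the BPSK example values are illustrative assumptions and are not taken from the article.

```python
import math

def symbol_posteriors(r, candidates, sigma, priors=None):
    """Posterior P(d = d_m | r) for each candidate symbol via Bayes' rule.

    Assumes a Gaussian likelihood with cost |r - d_m|^2 (an illustrative
    choice) and equal priors unless `priors` is given.
    """
    if priors is None:
        priors = [1.0 / len(candidates)] * len(candidates)
    # Cost: squared Euclidean distance between observation and candidate.
    costs = [abs(r - d) ** 2 for d in candidates]
    unnorm = [p * math.exp(-c / (2 * sigma ** 2)) for p, c in zip(priors, costs)]
    beta = 1.0 / sum(unnorm)  # normalization constant
    return [beta * u for u in unnorm]

# BPSK example: one received sample, candidates -1 and +1, unit noise deviation.
post = symbol_posteriors(0.3, [-1.0, 1.0], 1.0)
```

With equal priors the posterior depends only on the cost, so the candidate closest to the received sample receives the largest probability.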

The MLSE algorithm
In 1972, Forney [11] showed that the VA [6,7,11], first developed by Viterbi in 1967 to decode convolutional error correction codes, can be used to determine the most likely transmitted sequence. This is done by using a trellis, a special graph, or remerging tree structure, representing all possible combinations of transmitted symbols, to determine the solution with the lowest cost through the trellis. The sequence of symbols with the lowest cost maximizes the probability that said sequence was transmitted, thus producing the optimal estimate for the transmitted sequence [2,11]. The VA, or MLSE algorithm, attempts to minimize the cumulative cost function $\Delta = \sum_{t=1}^{N} \Delta_t^{(m)}$ in (4), which in turn maximizes $P(\mathbf{d} \mid \mathbf{r})$ in (4). After minimizing $\Delta$ in (4), there is no sequence of transmitted symbols/codewords that is more likely to have been transmitted. However, the probability of each estimated symbol/codeword in the sequence will not necessarily be maximized among all possible transmitted symbols/codewords. As stated before, the MAP algorithm is used to calculate exact posterior probabilities on each symbol/codeword. The min-sum algorithm is used to find the MLSE solution and is discussed next.

The min-sum algorithm
In order to perform optimal sequence estimation, the min-sum algorithm is used. Viterbi [6] and Forney [11] showed that the min-sum algorithm can be used by constructing a trellis and tracing a path, corresponding to the most likely transmitted symbol/codeword sequence, through the trellis. The min-sum algorithm allows for the elimination of more costly contending paths at each state on a trellis, thereby greatly reducing the computational complexity relative to complete enumeration. For equalization, there will always be $M^{L-1}$ possible paths at every stage in the trellis, where $M$ is the modulation alphabet size and $L$ is the channel memory length. Similarly, for decoding, there will always be $2^{K-1}$ remaining paths at every stage in the trellis (if convolutional encoding is performed in GF(2)), where $K$ is the encoder constraint length, which introduces memory to the transmitted codewords. $M^{L-1}$ and $2^{K-1}$ also correspond to the number of states in the trellis at each time instant $t$, for the equalizer and the decoder, respectively.
Figure 2 shows a trellis with $M^{L-1} = 2^{K-1} = 4$ states, corresponding to an equalizer used in a system where a BPSK modulated symbol sequence of length $N$ is transmitted through a multipath channel with a CIR of length $L = 3$, or to a decoder used to decode a coded sequence of $N$ codewords where a convolutional encoder with constraint length $K = 3$ is used. The trellis is initiated by assuming that it starts at state $S_1$ at time instant $t = 0$ and it is terminated at state $S_1$ at time instant $t = N$. For equalization, each state represents a unique combination of modulation symbols $s_m$, where $m = 1, 2, \ldots, M$, from a modulation alphabet $D$ of size $M$, and for decoding each state represents a possible codeword at the output of the convolutional encoder. The edges or transitions from state $S_i$, $i = 1, 2, 3, 4$, at time instant $t$ to any other state $S_j$, $j = 1, 2, 3, 4$, at time instant $t + 1$ describe the likelihood of this occurrence, which will be a maximum if $\Delta_t^{(m)}$ in (2) is a minimum. Table 1 shows the values associated with states $S_j$, $j = 1, 2, 3, 4$, for equalization and decoding, respectively.
For equalization, each state will have $M$ incoming transitions (from the left), while there will be two incoming transitions for decoding.ᵃ But as stated above, there will only be $2^{L-1} = 2^{K-1} = 4$ remaining paths at each time instant in the trellis, which means that all but one path has to be eliminated at each state. To eliminate the more costly paths at each state in the trellis at time $t$, the cumulative cost function $\Delta$ in (4), accumulated up to time $t$, is calculated for all possible paths leading to that state. The costs of the contending paths are compared, and the path with the highest cost (corresponding to the lowest probability) is eliminated. Therefore, at each state all the previous $\Delta_{t,j \to i}$'s are accumulated, and where there are contending paths, the path with the largest accumulated cost is eliminated.
There will therefore be $2^2 = 4$ surviving paths in each stage on the trellis for stages beyond $t = 2$. When stages $t = N - 2$ to $t = N$ are considered, the number of allowed transitions decreases because of the known tail symbols at the end of the transmitted symbol sequence. For the last few stages in the trellis the contending paths are also eliminated, until only one possible path remains at stage $t = N$. This path is then traced back to determine the most probable sequence of transmitted symbols/codewords.
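The path elimination described above can be sketched as a small min-sum (Viterbi) equalizer for BPSK. This is only an illustration under stated assumptions: the function name `viterbi_equalize`, the real-valued channel taps, and the convention that the symbols before the block and the last `len(h) - 1` symbols of the block are known and equal to `tail` are hypothetical choices, not the article's implementation.

```python
import itertools

def viterbi_equalize(r, h, tail=1):
    """Min-sum (Viterbi) MLSE for BPSK over a known channel h.

    Assumption: symbols before the block and the last len(h)-1 symbols
    of the block are known and equal to `tail`.
    """
    L, N = len(h), len(r)
    # A state holds the L-1 most recent symbols (s_{t-1}, ..., s_{t-L+1}).
    states = list(itertools.product((-1, 1), repeat=L - 1))
    start = (tail,) * (L - 1)
    cost = {s: (0.0 if s == start else float("inf")) for s in states}
    path = {s: [] for s in states}
    for t in range(N):
        allowed = (tail,) if t >= N - (L - 1) else (-1, 1)
        new_cost = {s: float("inf") for s in states}
        new_path = {s: [] for s in states}
        for prev in states:
            if cost[prev] == float("inf"):
                continue  # state not reachable yet
            for sym in allowed:
                window = (sym,) + prev
                expected = sum(hl * sl for hl, sl in zip(h, window))
                c = cost[prev] + abs(r[t] - expected) ** 2
                nxt = window[:-1]
                if c < new_cost[nxt]:  # keep only the cheapest path per state
                    new_cost[nxt] = c
                    new_path[nxt] = path[prev] + [sym]
        cost, path = new_cost, new_path
    return path[start]  # the trellis terminates in the known tail state

# Noiseless example with a hypothetical 3-tap channel.
h = [1.0, 0.5, 0.2]
sent = [1, -1, -1, 1, -1, 1, 1]      # last two symbols are the known tail
padded = [1, 1] + sent               # symbols before the block assumed +1
r = [sum(h[l] * padded[t + 2 - l] for l in range(3)) for t in range(len(sent))]
estimate = viterbi_equalize(r, h)
```

Only one survivor per state is stored at each stage, so the work per stage is proportional to the number of states times the alphabet size rather than to the number of candidate sequences.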
The MLSE equalizer/decoder produces outputs from a set of $M$ symbols for equalization or from a set of uncoded bits for decoding. These estimates are called hard outputs, since the estimates do not contain any probabilistic information regarding the reliability of those estimates. The MAP equalizer can be used to produce probabilistic information as an indication of the reliability of the estimates. The MAP algorithm is discussed next.

Table 1 Equalizer/decoder state values

The MAP algorithm
Following the development of the Viterbi MLSE decoding algorithm [6], the BCJR algorithm [1], named after its developers Bahl, Cocke, Jelinek, and Raviv and also known as the MAP algorithm, was developed in 1974, also for the decoding of convolutional codes. In the artificial intelligence community, this algorithm was developed independently by Pearl [12] and is called belief propagation (BP). The MAP algorithm is able to produce the posterior probability of each symbol in the estimated transmitted sequence, as opposed to maximizing the probability of the whole transmitted sequence, as done by the Viterbi MLSE equalizer [2,6,7,11].
The aim of the MAP algorithm is to maximize the posterior probability distribution for each transmitted symbol/codeword. While the MLSE algorithm assumes the prior probabilities $P(d)$ of the transmitted symbols to be equal, the MAP algorithm is able to exploit the prior probabilities $P(d)$, if necessary, in order to enhance the quality of posterior probabilistic information on each individual transmitted symbol. Like the MLSE algorithm, the MAP algorithm uses the model in (4) on a trellis, but unlike the MLSE algorithm, the MAP algorithm propagates the transition probabilities forward from past states to future states, as well as backwards from future states to past states, after which the marginalized probability for each estimated symbol/codeword $d_t$ is produced, given past and future information. That is [1,13]

$$P(d_t = d^{(m)} \mid \mathbf{r}) = \sum_{\mathbf{d}:\, d_t = d^{(m)}} P(\mathbf{d} \mid \mathbf{r}), \qquad t = 1, 2, \ldots, N, \tag{5}$$

where again $d^{(m)}$, $m = 1, 2, \ldots, M$, is the $m$th symbol chosen from a modulation alphabet $D$ of size $M$ for equalization, or $d^{(m)}$ is the $m$th codeword chosen from a list of $M$ codewords produced by a convolutional encoder for decoding, $\mathbf{r}$ is the received sequence, and $N$ is the number of symbols/codewords in the received sequence.
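Referring to Figure 2, the probability of a transition from state $S_{j,t-1}$ (at time $t - 1$) to state $S_{i,t}$ (at time $t$), where $i, j = 1, 2, \ldots, M^{L-1}$ for equalization and $i, j = 1, 2, \ldots, 2^{K-1}$ for decoding, is given by

$$\omega_{j \to i,t}(S_{j,t-1}, S_{i,t}) = \beta\, P(r_t \mid S_{j,t-1}, S_{i,t})\, P(d_t), \tag{6}$$

where $\beta$ is a normalization constant and $P(r_t \mid S_{j,t-1}, S_{i,t})$ is given by

$$P(r_t \mid S_{j,t-1}, S_{i,t}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{\Delta_{t,j \to i}}{2\sigma^2}\right). \tag{7}$$

By considering the relevant transition valueᵇ $d_\xi$, $P(d_t)$ can be written as [3,4]

$$P(d_t) = \frac{\exp\left(\tfrac{1}{2} d_\xi L(\hat{d}_t)\right)}{\exp\left(\tfrac{1}{2} L(\hat{d}_t)\right) + \exp\left(-\tfrac{1}{2} L(\hat{d}_t)\right)}, \tag{8}$$

where $L(\cdot)$ denotes the LLR operation and $\hat{d}_t$ is an estimate of $d_t$. Therefore, substituting (7) and (8) in (6), and absorbing all factors that do not depend on the transition into $\beta$, yields

$$\omega_{j \to i,t}(S_{j,t-1}, S_{i,t}) = \beta \exp\left(-\frac{\Delta_{t,j \to i}}{2\sigma^2}\right) \exp\left(\tfrac{1}{2} d_\xi L(\hat{d}_t)\right), \tag{9}$$

which completely describes the probability of a transition from $S_{j,t-1}$ (or $d_j$) at time $t - 1$ to $S_{i,t}$ (or $d_i$) at time $t$, based on all available information. Thus, for a transition labeled $d_\xi = 1$ and $L(\hat{d}_t)$ equal to any large positive value,ᶜ $P(d_t = d^{(m)})$ will be large, confirming the transition. Similarly, for a transition labeled $d_\xi = -1$ and $L(\hat{d}_t)$ equal to any large negative value, $P(d_t = d^{(m)})$ will be large, also confirming the transition. However, for a transition labeled $d_\xi = 1$ and $L(\hat{d}_t)$ equal to any large negative number, or for a transition labeled $d_\xi = -1$ and $L(\hat{d}_t)$ equal to any large positive number, $P(d_t = d^{(m)})$ will be small. Thus, if the prior information $L(\hat{d}_t)$ is in agreement with the transition value $d_\xi$, the probability of that transition will increase, confirming that transition. Otherwise, if the prior information $L(\hat{d}_t)$ contradicts the transition value $d_\xi$, the probability of that transition will decrease.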
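The way prior LLRs raise or lower a transition probability can be illustrated with a short sketch. The function names `prior_from_llr` and `transition_probability` are hypothetical, and the LLR convention assumed here is $L = \ln[P(d = +1)/P(d = -1)]$, chosen to agree with the sign behaviour described above.

```python
import math

def prior_from_llr(d_xi, llr):
    """P(d_t = d_xi) for d_xi in {-1, +1}, given the prior LLR of d_t.

    Assumes llr = ln(P(d_t = +1) / P(d_t = -1)).
    """
    return math.exp(0.5 * d_xi * llr) / (math.exp(0.5 * llr) + math.exp(-0.5 * llr))

def transition_probability(cost, sigma, d_xi, llr):
    """Unnormalized transition probability: likelihood term times prior term."""
    return math.exp(-cost / (2 * sigma ** 2)) * prior_from_llr(d_xi, llr)

# A positive prior LLR favours the transition labelled +1 over the one labelled -1.
agree = transition_probability(1.0, 1.0, +1, 2.0)
contradict = transition_probability(1.0, 1.0, -1, 2.0)
```

For two transitions with the same cost, the ratio of their probabilities equals the exponential of the prior LLR, which is why a prior that agrees with the transition label confirms that transition.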
Propagating the transition probabilities $\omega_{j \to i,t}(S_{j,t-1}, S_{i,t})$ across the whole sequence from left to right and from right to left, respectively, and marginalizing according to (5), will produce the posterior probabilities of each estimated $d_t$. The sum-product algorithm achieves exactly this [13].

The sum-product algorithm
The sum-product algorithm, also known as the forward-backward algorithm, uses the trellis in Figure 2 to perform marginalization. This algorithm follows three steps [13]:
1. Determine the forward pass messages from left to right on the trellis.
2. Determine the backward pass messages from right to left on the trellis.
3. Multiply, scale and accumulate (marginalize) probabilities at each stage of the trellis.
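To determine the forward pass messages on the trellis, let a counter $t$ run from left to right (from 1 to $N$) on the trellis and compute for each state in the trellis

$$\alpha_{i,t} = \sum_{j} \omega_t(S_{j,t-1}, S_{i,t})\, \alpha_{j,t-1}, \tag{10}$$

where $j$ represents the parent states of the current state $i$ at stage $t$ of the trellis and $\omega_t(S_{j,t-1}, S_{i,t})$ is the probability associated with the transition from $S_{j,t-1}$ to $S_{i,t}$. Note that $\alpha_{j,0} = 1$. Similarly, to determine the backward pass messages on the trellis, let a counter $t$ run from right to left (from $N - 1$ to 1) on the trellis and compute for each state in the trellis

$$\beta_{j,t} = \sum_{i} \omega_t(S_{j,t}, S_{i,t+1})\, \beta_{i,t+1}, \tag{11}$$

where $i$ represents the child states of the current state $j$ at stage $t$ of the trellis and $\omega_t(S_{j,t}, S_{i,t+1})$ is the probability associated with the transition from $S_{j,t}$ to $S_{i,t+1}$. Note that $\beta_{i,N} = 1$. Finally, the exact marginalized symbol probability is determined by summing over all states at each time instant $t$ corresponding to a transition of either $d_\xi = 1$ or $d_\xi = -1$,

$$P(d_t = d^{(m)} \mid \mathbf{r}) \propto \sum_{(j,i):\, d_\xi = d^{(m)}} \alpha_{j,t-1}\, \omega_t(S_{j,t-1}, S_{i,t})\, \beta_{i,t}, \tag{12}$$

scaled so that the probabilities at each time instant sum to one. The MAP algorithm can also produce soft bits. The soft bits, also called LLRs, can be determined by

$$L(\hat{d}_t) = \ln \frac{P(d_t = 1 \mid \mathbf{r})}{P(d_t = -1 \mid \mathbf{r})},$$

where the sign of $L(\hat{d}_t)$ indicates whether $\hat{d}_t = -1$ or $\hat{d}_t = 1$, and $|L(\hat{d}_t)|$ is a measure of the confidence of that estimate.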
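The three steps of the sum-product algorithm can be sketched for a BPSK equalizer as follows. This is a hedged illustration rather than the article's implementation: the name `map_equalize`, the real-valued channel, the known all-`start_symbol` initial state, and the per-stage rescaling of the messages are assumptions made to keep the example small and numerically stable.

```python
import itertools
import math

def map_equalize(r, h, sigma, prior_llr=None, start_symbol=1):
    """Forward-backward (sum-product) LLRs for BPSK over a known channel h."""
    L, N = len(h), len(r)
    states = list(itertools.product((-1, 1), repeat=L - 1))
    prior_llr = prior_llr or [0.0] * N

    def omega(t, prev, sym):
        # Transition weight: Gaussian likelihood times prior, up to a constant.
        expected = sum(hl * sl for hl, sl in zip(h, (sym,) + prev))
        return math.exp(-abs(r[t] - expected) ** 2 / (2 * sigma ** 2)
                        + 0.5 * sym * prior_llr[t])

    def step(prev, sym):
        return ((sym,) + prev)[:-1]  # next state after transmitting sym

    def normalize(msg):
        total = sum(msg.values())
        return {s: v / total for s, v in msg.items()}

    # Step 1: forward pass (trellis assumed to start in a known state).
    start = (start_symbol,) * (L - 1)
    alpha = [{s: float(s == start) for s in states}]
    for t in range(N):
        nxt = {s: 0.0 for s in states}
        for prev in states:
            for sym in (-1, 1):
                nxt[step(prev, sym)] += alpha[t][prev] * omega(t, prev, sym)
        alpha.append(normalize(nxt))

    # Step 2: backward pass (all end states treated as equally likely).
    beta = [None] * N + [{s: 1.0 for s in states}]
    for t in range(N - 1, -1, -1):
        cur = {s: 0.0 for s in states}
        for prev in states:
            for sym in (-1, 1):
                cur[prev] += omega(t, prev, sym) * beta[t + 1][step(prev, sym)]
        beta[t] = normalize(cur)

    # Step 3: marginalize over transitions labelled +1 and -1 at each stage.
    llrs = []
    for t in range(N):
        p = {-1: 0.0, 1: 0.0}
        for prev in states:
            for sym in (-1, 1):
                p[sym] += (alpha[t][prev] * omega(t, prev, sym)
                           * beta[t + 1][step(prev, sym)])
        llrs.append(math.log(p[1] / p[-1]))
    return llrs

# Memoryless sanity check: with h = [1.0] the LLR reduces to 2 r / sigma^2.
llr_flat = map_equalize([0.3, -1.2], [1.0], 1.0)
```

Rescaling the forward and backward messages at every stage does not change the LLRs, because the same scale factor multiplies both hypotheses in the ratio.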

Equalization
A mobile communication system transmission channel is characterized by multipath and fading. Multipath is the phenomenon resulting from time spreading of the transmitted signal as it is transmitted through the channel. Fading, on the other hand, results from time variations in the structure of the transmission medium, causing the nature of the multipath channels to vary with time [2]. During transmission, the transmitted symbols pass through the channel, which acts like a filter. The channel has a continuous impulse response, which is estimated at the receiver, to aid in the estimation of the transmitted information.
Each coefficient, or tap, in the impulse response of a multipath fading channel is modeled as a continuous function of time, where each coefficient in the impulse response corresponds to symbol period intervals $tT_s$. As such, a tapped delay line is used to model the behavior of this channel, as shown in Figure 3. Figure 3 indicates that the $t$th transmitted symbol $s_t$ is delayed by $T_s$ seconds $L - 1$ times, where $L$ is the CIR length and $T_s$ is the symbol period. Each delayed copy of $s_t$ is multiplied by $h_l^{(t)}$, $l = 0, 1, \ldots, L - 1$, corresponding to the $l$th delay branch at time $t$. Therefore, the $t$th received symbol can be described by [2,11]

$$r_t = \sum_{l=0}^{L-1} h_l^{(t)} s_{t-l} + n_t, \tag{13}$$

where $n_t$ is the $t$th noise sample from the distribution $\mathcal{N}(\mu = 0, \sigma^2 = 1)$. Each $h_l^{(t)}$ is a sample taken from one of $L$ time-varying functions, where $t$ corresponds with the $t$th transmitted symbol and $l = 0, 1, \ldots, L - 1$ is the CIR tap number. Each CIR tap is modeled as an independent uncorrelated Rayleigh fading sequence, using the model in [14].
If it is assumed that the CIR is time-invariant for the duration of a data block, (13) can be rewritten as

$$r_t = \sum_{l=0}^{L-1} h_l s_{t-l} + n_t, \tag{14}$$

where $s_t$ denotes the $t$th complex symbol in the transmitted sequence of $N$ symbols chosen from an alphabet $D$ containing $M$ complex symbols, $r_t$ is the $t$th received symbol, $n_t$ is the $t$th noise sample from the distribution $\mathcal{N}(\mu = 0, \sigma^2 = 1)$, and $h_l$ is the $l$th coefficient of the estimated CIR. Equalization is performed under the assumption that each CIR coefficient $h_l$ is time-invariant for the duration of a data block. The CIR $\mathbf{h} = \{h_0, h_1, \ldots, h_{L-1}\}$ therefore completely describes the multipath channel for a given received data block, assuming that the data block is sufficiently short so as to render the CIR time-invariant.
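The time-invariant tapped delay line model can be simulated in a few lines. The sketch below assumes real-valued BPSK symbols and taps (the article allows complex symbols), a noise standard deviation `sigma` passed as a parameter, and symbols before the block equal to `pre`; the function name `multipath_channel` is hypothetical.

```python
import random

def multipath_channel(symbols, h, sigma, rng, pre=1):
    """Received samples r_t = sum_l h[l] * s[t-l] + n_t for a static CIR h.

    Assumption: symbols transmitted before the block are all equal to `pre`.
    """
    L = len(h)
    padded = [pre] * (L - 1) + list(symbols)
    received = []
    for t in range(len(symbols)):
        # padded[t + L - 1 - l] is the symbol s_{t-l} seen by tap l.
        isi = sum(h[l] * padded[t + L - 1 - l] for l in range(L))
        received.append(isi + rng.gauss(0.0, sigma))
    return received

# Example: three BPSK symbols through a hypothetical 3-tap channel.
rng = random.Random(0)
noiseless = multipath_channel([1, -1, 1], [1.0, 0.5, 0.2], 0.0, rng)
noisy = multipath_channel([1, -1, 1], [1.0, 0.5, 0.2], 1.0, rng)
```

Setting `sigma` to zero isolates the intersymbol interference, which is convenient when checking an equalizer against a known channel.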
The equalizer takes as input the received symbol sequence r as well as the CIR h.
To estimate the transmitted sequence of length $N$ optimally in a wireless communication system transmitting modulated symbols through a multipath channel, the cumulative cost function in (4) must be minimized [2,11], where for equalization

$$\Delta = \sum_{t=1}^{N} \left| r_t - \sum_{l=0}^{L-1} h_l s_{t-l} \right|^2. \tag{15}$$

Here $\mathbf{s} = \{s_1, s_2, \ldots, s_N\}$ is the most likely transmitted sequence that will maximize (4). Although the minimization of (15) will maximize (4), the resulting sequence estimates will only be optimal in the sequence sense. Depending on the application and whether the equalizer is followed by a decoder, either optimal sequence estimation or exact posterior probabilistic symbol estimation can be performed using the MLSE or MAP algorithms discussed above.

Decoding
In a mobile communication system, the transmitted signal is subjected to energy losses due to multipath and fading as well as interference due to thermal noise, resulting in unreliable estimates of source information in the receiver. In order to correct errors in the receiver, error control coding (ECC) is used to introduce controlled redundancy to the source information. The decoder exploits the structure introduced to the encoded symbol sequence by the encoder, to reconstruct the source information from the estimated coded symbols, correcting errors while doing so.
ECC plays an important role in digital communication systems. As the name suggests, ECC is used to allow for the correction of errors in the received symbol sequence. ECC is performed by adding controlled redundancy to the information being transmitted. Since the redundant information is mathematically related to the original information, errors can be corrected [2]. Although the redundancy adds an overhead to the transmitted data, the performance increase overshadows this drawback. During the last few decades, ECC has been the subject of much research and significant contributions and advancements have been made.
The coding scheme considered in this article is convolutional encoding, as it is most often used in turbo equalization. A convolutional code is generated by passing the information sequence to be transmitted through a linear finite state shift register, consisting of $K$ stages.ᵈ The binary input is shifted into the shift register $k$ bits at a time, producing $n$ output bits. The code rate is therefore $R_c = k/n$. The encoder can be expressed in terms of $n$ generator polynomials, describing the connections between the states of the encoder shift register. Figure 4 shows the rate $R_c = k/n = 1/3$, constraint length $K = 3$, convolutional encoder considered in this article.
The generator polynomials that describe the connections between the elements in the shift register are given in sequence form as $g_1 = [1\,0\,0]$, $g_2 = [1\,1\,0]$, and $g_3 = [0\,1\,1]$, which can also be written in octal form as $G = [4, 6, 3]$ or in polynomial form as $G = [1,\; 1 + X,\; X + X^2]$. For every input source bit $s_t$ the encoder produces $n = 3$ output bits $c_t^{(1)}$, $c_t^{(2)}$, and $c_t^{(3)}$, where

$$c_t^{(1)} = s_t, \qquad c_t^{(2)} = s_t \oplus s_{t-1}, \qquad c_t^{(3)} = s_{t-1} \oplus s_{t-2}, \tag{16}$$

and $\oplus$ is the exclusive OR operation. It is assumed that the encoder starts in the all-zero state for every data block that is encoded. Also, the encoder is forced into the zero state after the data block is encoded by appending $K - 1$ zero bits to the data block. This is done to enable decoding using an MLSE or a MAP decoder [1,6,7].
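The rate 1/3, constraint length 3 encoder with generators $[1\,0\,0]$, $[1\,1\,0]$, and $[0\,1\,1]$ can be sketched as a shift register in Python. The function name `conv_encode` and the tuple representation of codewords are illustrative assumptions; the all-zero start state and the $K - 1$ zero tail bits follow the description above.

```python
def conv_encode(bits, generators=((1, 0, 0), (1, 1, 0), (0, 1, 1))):
    """Rate 1/n convolutional encoder with zero start state and zero tail.

    Returns one n-bit codeword (as a tuple) per shifted-in bit.
    """
    K = len(generators[0])
    reg = [0] * K  # reg[0] holds the newest bit, reg[-1] the oldest
    codewords = []
    for b in list(bits) + [0] * (K - 1):  # append K-1 zero tail bits
        reg = [b] + reg[:-1]              # shift the new bit in
        # Each output bit is the XOR of the register taps selected by a generator.
        codewords.append(tuple(sum(g * x for g, x in zip(gen, reg)) % 2
                               for gen in generators))
    return codewords

coded = conv_encode([1, 0, 1])
```

For an input block of $N_u$ bits the sketch returns $N_u + K - 1$ codewords, the last $K - 1$ of which drive the register back to the zero state.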
Each convolutional encoder has a corresponding state diagram, which is used to map state transitions to encoder outputs, which in turn are used to determine the most likely state transitions during decoding. The state diagram of the encoder in Figure 4 is shown in Figure 5. Each state contains two bits representing the two leftmost elements in the encoder shift register. As bits are shifted through the encoder $k = 1$ bit at a time, transitions occur, with each state transition producing $n = 3$ output bits. The dashed lines are associated with transitions resulting from a zero at the input of the encoder, and the solid lines are associated with a one at the input of the encoder. The decoding of convolutional codes is closely related to the equalization discussed in Section 3, in that the min-sum (Viterbi MLSE) and sum-product (MAP/BCJR) algorithms can also be used for decoding. The MAP algorithm is an attractive choice for use in iterative equalization/decoding algorithms for two reasons: first, because it includes prior probabilistic information in the estimation, and second, because it provides soft posterior estimates regarding individual coded or uncoded symbols, which in turn can be used as prior information in subsequent iterations.
To decode a convolutional code using the MLSE or MAP algorithms discussed above, the cumulative cost function in (4) is minimized [6,11], with

$$\Delta = \sum_{t=1}^{N} \sum_{i=1}^{n} \left| r_t^{(i)} - c_t^{(i)} \right|^2, \tag{17}$$

where $r_t = \{r_t^{(1)}, \ldots, r_t^{(n)}\}$ is the received codeword and $c_t = \{c_t^{(1)}, \ldots, c_t^{(n)}\}$ is the candidate codeword at time instant $t$, selected based on the encoder output generated by the relevant state transition.ᵉ Here, $n$ is the number of output bits generated by the encoder. The MLSE or MAP algorithms can be applied to find either the most probable transmitted sequence of codewords, or the most probable transmitted uncoded and/or coded symbols. The output of the latter can be used in a turbo equalizer as feedback to aid the equalizer in making more informed decisions.
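A compact soft-input Viterbi (MLSE) decoder for the rate 1/3 code with generators $[1\,0\,0]$, $[1\,1\,0]$, and $[0\,1\,1]$ might look as follows. The names `GENS`, `branch_output`, and `viterbi_decode`, and the BPSK mapping of bit 0 to $+1$ and bit 1 to $-1$, are assumptions made for this sketch and are not specified by the article.

```python
import itertools

GENS = ((1, 0, 0), (1, 1, 0), (0, 1, 1))  # generators of the rate 1/3 code

def branch_output(bit, state):
    """BPSK-mapped codeword (0 -> +1, 1 -> -1) for input `bit` from `state`."""
    reg = (bit,) + state
    return [1 - 2 * (sum(g * x for g, x in zip(gen, reg)) % 2) for gen in GENS]

def viterbi_decode(received, n_info):
    """MLSE decoding; `received` holds n_info + K - 1 soft codewords."""
    K = len(GENS[0])
    states = list(itertools.product((0, 1), repeat=K - 1))
    zero = (0,) * (K - 1)
    cost = {s: (0.0 if s == zero else float("inf")) for s in states}
    path = {s: [] for s in states}
    for t, r_t in enumerate(received):
        inputs = (0, 1) if t < n_info else (0,)  # tail bits are known zeros
        new_cost = {s: float("inf") for s in states}
        new_path = {s: [] for s in states}
        for prev in states:
            if cost[prev] == float("inf"):
                continue
            for b in inputs:
                # Squared Euclidean distance between received and candidate codeword.
                c = cost[prev] + sum((r - x) ** 2
                                     for r, x in zip(r_t, branch_output(b, prev)))
                nxt = ((b,) + prev)[:-1]
                if c < new_cost[nxt]:
                    new_cost[nxt], new_path[nxt] = c, path[prev] + [b]
        cost, path = new_cost, new_path
    return path[zero][:n_info]

# Codewords for the bits [1, 0, 1] plus tail, with the first sample sign-flipped.
rx = [(1, -1, 1), (1, -1, -1), (-1, -1, -1), (1, -1, -1), (1, 1, -1)]
decoded = viterbi_decode(rx, 3)
```

The decoder has the same structure as a Viterbi equalizer; only the branch labels change, from channel outputs to encoder outputs.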

Non-iterative joint equalization and decoding (NI-JED)
In conventional single-carrier wireless communication systems, where the coded information is transmitted through a multipath channel, equalization and decoding, as explained in Sections 3 and 4, are performed separately. However, joint equalization and decoding can also be performed on a single trellis when it is assumed that the coded symbols are not interleaved before transmission, or when certain assumptions are made regarding the structure of the interleavers used to interleave the coded symbols. Interleavers allow for the temporal separation of adjacent coded symbols during transmission, so that when burst errors occur in response to a loss of signal power during fading, the errors in an error burst will be distributed throughout the data block after deinterleaving. This leads to performance gains in the receiver, as the equalizer and the decoder are able to infer information and correct single errors more accurately than consecutive, or burst, errors. Therefore, interleavers are employed in a wireless communication system where the transmitted signal is subject to multipath and fading [2].
In [9], an optimal joint equalizer and decoder is presented, where the equalizer and the decoder are jointly modeled on one trellis, a super-trellis according to [9], and a block interleaver is used to interleave the coded symbols before transmission. The computational complexity of this joint equalizer and decoder, however, is not only exponentially related to the channel memory length and the encoder constraint length, but also to the interleaver depth. The interleaver depth determines the degree of separation between the coded symbols, which has a direct effect on system performance. In order to improve system performance the interleaver depth has to be increased, which results in an exponential increase in computational complexity. Ideally a random interleaver should be used for maximum performance gains, but a random interleaver has a devastating effect on the causality relationship between transmitted and deinterleaved received symbols. This joint equalizer/decoder is therefore only feasible when limited depth block interleavers are used. Joint equalization and decoding using a super-trellis is discussed next, first for systems where no interleaving is performed, and then for systems where depth-limited interleavers are used.

No interleaving
Noting the striking similarities between MAP equalization and MAP decoding in Section 2, a possible joint equalization and decoding algorithm can be envisioned. In order to equalize and decode a received symbol sequence jointly, the cumulative costs for equalization in (15) and decoding in (17) must be combined in order to equalize while decoding. Either the MLSE or MAP algorithms can be used for this purpose, since the super-trellis encapsulates all the available information and therefore no exchange of information between equalizer and decoder is necessary. No prior information is therefore required when calculating state transitions on a super-trellis.
Consider a system transmitting non-interleaved coded information through a multipath channel of length $L$. A rate $R_c = k/n$, constraint length $K$ convolutional encoder is used to produce a sequence $\mathbf{c}$ of coded bits of length $N_c$ from an uncoded bit sequence $\mathbf{s}$ of length $N_u$. The number of super-trellis states required to perform joint equalization and decoding is the product of the number of states of the individual trellises required to perform equalization and decoding separately. Therefore, the number of super-trellis states is $M^{L-1} M^{K-1} = M^{(L-1)+(K-1)}$, where $M = 2$ due to BPSK modulation as well as GF(2) encoding.
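To make the growth of the super-trellis concrete, the state count stated above (the product of the equalizer and decoder state counts) can be evaluated directly. The helper name `super_trellis_states` and the example parameter values are illustrative assumptions.

```python
def super_trellis_states(L, K, M=2):
    """Number of super-trellis states as the product of the separate trellises."""
    equalizer_states = M ** (L - 1)  # channel memory of L - 1 symbols
    decoder_states = M ** (K - 1)    # encoder memory of K - 1 bits
    return equalizer_states * decoder_states

# Doubling per extra unit of channel memory or constraint length (M = 2).
for L, K in [(3, 3), (5, 3), (5, 7), (10, 7)]:
    print(f"L={L}, K={K}: {super_trellis_states(L, K)} states")
```

Each additional unit of channel memory or encoder memory multiplies the number of states by $M$, which is the source of the exponential complexity of joint equalization and decoding.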
When the MAP equalizer was considered, the probability of a transition from uncoded symbol $s_j$ transmitted at time $t - 1$ to $s_i$ transmitted at time $t$ was calculated (see (6)), and when the MAP decoder was considered, the probability of a transition from codeword $c_j$ transmitted at time $t - 1$ to $c_i$ transmitted at time $t$ was calculated (although $d_j$ and $d_i$ were used instead for both the equalization of symbols and decoding of codewords). When modeling the equalizer and decoder jointly, however, transitions between superstates are calculated on a super-trellis, where each state represents the uncoded symbols that constitute a codeword, together with interfering symbols of previous codewords due to multipath. Suppose a communication system encodes a source bit sequence $\mathbf{s}$ of length $N_u$ with a rate $R_c = k/n = 1/3$, constraint length $K = 3$, convolutional encoder with arbitrary generator polynomials $g_1 = \{g_1^{(1)}, g_1^{(2)}, g_1^{(3)}\}$, $g_2 = \{g_2^{(1)}, g_2^{(2)}, g_2^{(3)}\}$, and $g_3 = \{g_3^{(1)}, g_3^{(2)}, g_3^{(3)}\}$, in order to produce a coded bit sequence $\mathbf{c}$ of length $N_c = N_u / R_c$, which is BPSK modulated and transmitted through a multipath channel with a CIR $\mathbf{h}$ of length $L$. The encoder produces the coded bit sequence of length $N_c$ from the length $N_u$ uncoded bit sequence $\mathbf{s} = \{s_1, s_2, \ldots, s_{N_u-1}, s_{N_u}\}$.

Channel length L
Given any channel memory length $L$, encoder constraint length $K$, and code rate $R_c = k/n = 1/3$, the number of bits needed to be represented by a superstate can be determined by $Q + (K - 1)$, where $Q$ is the number of additional uncoded bits needed to account for the channel memory, which in turn can be used to determine the number of super-trellis states by $M = 2^{Q+(K-1)}$. Keeping track of the ISI undergone by the coded symbols due to multipath, and replacing those symbols with their corresponding uncoded symbols, a model for the received symbols can be derived. The probability of a transition from superstate $S_{j,t-1}$ to superstate $S_{i,t}$ is given by

$$\omega_{j \to i,t}(S_{j,t-1}, S_{i,t}) = P(r_t^{(1)}, r_t^{(2)}, r_t^{(3)} \mid S_{j,t-1}, S_{i,t})\, P(s_t), \tag{31}$$

which can be rewritten in terms of the uncoded symbols represented by each state

$$\omega_{j \to i,t}(S_{j,t-1}, S_{i,t}) = P(r_t^{(1)}, r_t^{(2)}, r_t^{(3)} \mid s_t, s_{t-1}, \ldots, s_{t-\log_2(M)})\, P(s_t), \tag{32}$$

where $P(r_t^{(1)}, r_t^{(2)}, r_t^{(3)} \mid s_t, s_{t-1}, \ldots, s_{t-\log_2(M)})$ can be determined by minimizing the receiver equations, which are built from ISI terms in which each CIR coefficient $h_l$ weights a coded symbol expressed through the uncoded symbols, of the form $g_u^{(1)} s_{t-m} \oplus g_u^{(2)} s_{(t-m)-1} \oplus g_u^{(3)} s_{(t-m)-2}$ and $g_w^{(1)} s_{t-o} \oplus g_w^{(2)} s_{(t-o)-1} \oplus \cdots$ [...]. Finally, since there are always only two equiprobable codewords that can follow each codeword, $P(s_t) = 0.5$. Therefore,

$$\omega_{j \to i,t}(S_{j,t-1}, S_{i,t}) = 0.5\, P(r_t^{(1)}, r_t^{(2)}, r_t^{(3)} \mid s_t, s_{t-1}, \ldots, s_{t-\log_2(M)}),$$

which completely describes the transition from state $S_{j,t-1}$ to $S_{i,t}$. The model derived for the NI-JED without interleaving can be used to jointly equalize and decode BPSK modulated coded information transmitted through a multipath channel with CIR length $L$, where the convolutional encoder has constraint length $K$ and the code rate is $R_c = k/n = 1/3$. This model can be extended for higher order modulation alphabets, larger encoder constraint lengths, and different code rates.ᶠ

Computational complexity
The computational complexity is determined by counting the number of computations performed for each received data block, and is expressed in terms of the uncoded block length $N_u$, the CIR length $L$, the encoder constraint length $K$, and the modulation alphabet size $M$. The computational complexity of the NI-JED without interleaving, for the MLSE and MAP algorithms, is determined by [...] and [...], respectively. Figure 6 shows the normalized computational complexity (see endnote g) of this joint equalizer and decoder, employing the MLSE algorithm, for CIR lengths from $L = 2$ to $L = 10$, constraint lengths $K = 2$, $K = 4$, and $K = 16$, and modulation alphabet sizes of $M = 2$, $M = 4$, $M = 16$, and $M = 64$. Figure 7 shows the complexity of employing the MAP algorithm for the same parameters, where the respective complexity curves are only slightly higher than their corresponding MLSE complexity curves in Figure 6. The computational complexity grows exponentially with an increase in channel memory, encoder constraint length, and modulation alphabet size, so that for long channel memories and large encoder constraint lengths it is too high for feasible implementation.

Block interleaving
In [9], it was demonstrated that equalization and decoding can be performed jointly by transforming the convolutional decoder trellis into a super-trellis, and then using this super-trellis to perform equalization while decoding. This is a novel idea in the sense that all the available information is processed as a whole, and there is no exchange of information between independent subunits. This approach achieves optimal performance, because all calculations on the super-trellis are performed with complete information. The only limitation is that the interleaver has to be a block interleaver, and that its depth is limited in practice by the exponential relationship between the interleaver depth and the computational complexity.
Contrary to this approach, turbo equalization makes use of two independent subunits, namely a MAP equalizer and a MAP decoder, with each unit being supplied with information regarding the decisions made by the other unit [3][4][5]. This results in assumptions and estimations being made by the respective subunits based on incomplete information, which in turn results in suboptimal calculations during subsequent iterations, ultimately leading to suboptimal performance. Only by iteratively exchanging extrinsic information between the equalizer and decoder, where this information is used as prior information for the calculations in the next iteration, can the performance be improved. However, the turbo equalizer is not limited by the structure of the interleaver, since interleaving and deinterleaving are performed during each turbo iteration.
It was shown in [9] that the NI-JED performs optimally and therefore outperforms the turbo equalizer, but its computational complexity grows exponentially as the interleaver depth and the CIR length increase. Despite its excellent performance, the NI-JED is therefore not a viable solution for practical systems. Although the computational complexity of a conventional turbo equalizer (CTE) is not as high, it is still exponentially related to the encoder constraint length and to the channel memory length, but it is not influenced by the structure of the interleaver.
In this section, a general model is derived for the NI-JED proposed in [9], which can be presented as a function of encoder constraint [...]. The task of the NI-JED is to produce the MAP estimate of each uncoded source symbol at time $t$ from the deinterleaved received symbol sequence, or [...]. The probability of a transition from state $S_{j,t-1}$ to state $S_{i,t}$, where $i, j = 1, 2, \ldots, M$, is given by

$\omega_{j \to i,t}(S_{j,t-1}, S_{i,t}) = P(r_t^{(1)}, r_t^{(2)}, r_t^{(3)} \mid S_{j,t-1}, S_{i,t}) \, P(s_t)$,

where the conditional probability can be determined by minimizing the receiver equations in (39). The error term for the first encoder output is $r_t^{(1)} - \sum_{l=0}^{L-1} h_l (s_{t-l})$, and the calculation of the error term for each encoder output $i$, $i = 1, 2, \ldots, n$, is determined by the structure of the encoder. Finally, since there are only two equiprobable bits that can be transmitted, $P(s_t) = 0.5$. Therefore, [...], which completely describes the transition from state $S_{j,t-1}$ to $S_{i,t}$.

Interleaver depth: D = 2n = 6
In the above explanation, the depth of the block interleaver was $D = n$. It is subsequently assumed that an interleaver with depth $D = 2n$ is used. Applying the $2n \times N_u/2$ interleaver in Figure 10 to the coded symbols [...] (64) and $P(s_t) = 0.5$ as before.
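To make the interleaver geometry concrete, the sketch below writes the coded sequence column by column into a $D \times N_c/D$ matrix and reads it out row by row, then inverts the operation. The write and read order and the function names are assumptions made for illustration; Figure 10 defines the arrangement actually used in the article. With $n = 3$ and $D = 2n = 6$, the matrix has $N_c/D = N_u/2$ columns, matching the $2n \times N_u/2$ dimensions quoted above.

```python
import numpy as np

def block_interleave(c, depth):
    """Write c column-wise into a depth x (len(c)/depth) matrix, read it row-wise."""
    c = np.asarray(c)
    if len(c) % depth:
        raise ValueError("sequence length must be a multiple of the interleaver depth")
    # order="F" fills the matrix column by column; flatten() then reads row by row.
    return c.reshape((depth, -1), order="F").flatten()

def block_deinterleave(x, depth):
    """Inverse of block_interleave: write row-wise, read column-wise."""
    x = np.asarray(x)
    if len(x) % depth:
        raise ValueError("sequence length must be a multiple of the interleaver depth")
    return x.reshape((depth, -1)).flatten(order="F")
```

Under this assumed ordering, symbols that were adjacent before interleaving end up $N_c/D$ positions apart in the transmitted sequence, and `block_deinterleave(block_interleave(c, D), D)` recovers `c`.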

Computational complexity
The computational complexity of the NI-JED with block interleaving is determined by counting the number of computations performed for each data block received, and is expressed in terms of the uncoded block length $N_u$, the CIR length $L$, the encoder constraint length $K$, the modulation alphabet size $M$, and the interleaver depth $D$. The computational complexity of the NI-JED with block interleaving, using the MAP algorithm, is determined by [...]. Figure 12 shows the normalized computational complexity for CIR lengths from $L = 2$ to $L = 10$, encoder constraint lengths $K = 2$, $K = 4$, and $K = 6$, a modulation alphabet size of $M = 2$, and interleaver depths of $D = 3$, $D = 6$, $D = 9$, and $D = 12$. It is clear that the computational complexity grows exponentially with an increase in channel memory, encoder constraint length, and interleaver depth.
Figure 13 shows the normalized computational complexity for the same parameters but with a fixed interleaver depth of D = 6 and varying modulation alphabet sizes of M = 2, M = 4, M = 16, and M = 64, resulting in an increase in complexity with an increase in modulation alphabet size.

Simulation results
The performance of the NI-JED is evaluated for BPSK modulation with and without interleaving, and it is compared to that of a CTE, where the number of CTE iterations is [...]. Figure 14 shows the BER performance of the NI-JED compared to that of the CTE, for various interleaver depths and a channel length of $L = 2$. From Figure 14 it can be seen that the performance of both algorithms improves with an increase in the interleaver depth, except for an increase from $D = 1$ to $D = 3$. It is clear that the NI-JED outperforms the CTE, even though the CTE uses multiple iterations, while the NI-JED performs close to the coded additive white Gaussian noise (AWGN) bound when the interleaver depth is $D = 9$. In Figures 15 and 16, the performance of the NI-JED is again compared to that of the CTE for various interleaver depths, for $L = 3$ and $L = 4$, respectively. As before, the NI-JED outperforms the CTE. From Figures 14, 15, and 16, it is clear that the NI-JED is superior in terms of performance, and it is in fact optimal. Even though the NI-JED outperforms the CTE, its vast computational complexity prevents it from finding practical application. The CTE is therefore used as an alternative solution in practical systems.

Conclusions
In this article, optimal equalization and decoding using the MLSE (min-sum) and MAP (sum-product) algorithms were discussed. It was shown how the MLSE algorithm can be used to determine the most likely sequence of estimates, while the MAP algorithm can be used to determine optimal posterior probabilities regarding the transmitted symbols or codewords. NI-JED was also discussed, first assuming no interleaving and then assuming that special block interleavers were used for interleaving. As a result, a general model was derived for systems transmitting convolutionally encoded BPSK-modulated information through a multipath channel of length $L$, where the information is interleaved with an interleaver of depth $D = dk$, and where the uncoded information is encoded with a rate $R_c = 1/k$ encoder with constraint length $K$. The computational complexity was analyzed by counting the number of computations performed, given certain system parameters. The complexity of the NI-JED without interleaving grows exponentially with an increase in either channel memory length or encoder constraint length, while the complexity of the NI-JED with block interleaving is exponentially related to the channel memory length, the encoder constraint length, and the interleaver depth. From these analyses, it is clear that NI-JED is extremely expensive in terms of the number of computations required, even for moderate channel memory lengths, encoder constraint lengths, and interleaver depths. In order to achieve acceptable performance in a multipath fading environment, block interleavers with depths of several multiples of the encoder output length are required to separate adjacent coded symbols sufficiently. Under these conditions the NI-JED becomes infeasible. Ideally, a random interleaver will allow for maximum performance gains, but the NI-JED cannot be applied when interleaving is performed with a random interleaver. The CTE is therefore applied in systems transmitting randomly interleaved coded information through a multipath channel.
Turbo equalization is used as an alternative to optimal NI-JED, which is not feasible because of the computational complexity constraints discussed before, to the extent that approximate inference via the iterative exchange of information is the last resort. Turbo equalization is not optimal, as demonstrated in this article, but it is the best alternative amongst all iterative joint equalization and decoding solutions, as its constituent parts (the MAP equalizer and the MAP decoder) produce optimal posterior estimates about the respective coded and uncoded transmitted symbols.

Endnotes
a It is assumed that GF(2) decoding is used as usual.
b $d_\xi \in \{-1, 1\}$ for BPSK.
c The magnitude of $L(d_t)$ gives an indication of the confidence of that estimate.
d $K$ is also known as the constraint length.
e During decoding, the output bits $c_t$ are made bipolar because the elements of $c_t$ are compared to received symbols, and not bits.
f Joint equalization and decoding using a higher-order modulation alphabet will require convolutional encoding to be performed in GF($M$), where $M$ is the modulation alphabet size.
g The computational complexity is normalized by the number of coded transmitted symbols $N_c$.
h Note that the channel coefficients are not shown in Figure 9.
i This step is not mentioned, nor implied, in [9], but has been inferred by the author of this article.
j The min-sum algorithm can also be used.

Figure 1 Wireless communication system block diagram.

Figure 2 Trellis diagram used to aid the explanation of the equalizer and the decoder.

Figure 3 Tapped delay line used to simulate ISI in a multipath fading channel.

Figure 4 Rate 1/3 convolutional encoder used in this article.

In the receiver equations of the NI-JED without interleaving, $v = ((u+1) \bmod 3) + 1$ and $w = ((u+2) \bmod 3) + 1$. Also, $m$, $n$, and $o$ start at 0 and increase by 1 each time uncoded symbols from the previous time instant are used to create the coded symbol interfering with the current received symbol.
In the simulations, the uncoded and coded block lengths are $N_u = 600$ and $N_c = 1800$, respectively, the channel lengths are $L = 2$, $L = 3$, and $L = 4$, and interleaver depths of $D = 1$, $D = k$, $D = 2k$, and $D = 3k$ were used, where $k = 3$ is the number of encoder output bits.
alphabet $D$ of size $M$, where $m = 1, 2, \ldots, M$, and for decoding, codeword symbols $d^{(m)}$ are chosen from a list of $M$ possible codewords. Since it is assumed that the received symbol/codeword sequence is corrupted by white Gaussian noise, the probability of receiving $r_t$ at time instant $t$ having transmitted d