Code-Aided Estimation and Detection on Time-Varying Correlated MIMO Channels: A Factor Graph Approach

This paper concerns channel tracking in a multiantenna context for correlated flat-fading channels obeying a Gauss-Markov model. It is known that data-aided tracking of fast-fading channels requires a lot of pilot symbols in order to achieve sufficient accuracy, and hence decreases the spectral efficiency. To overcome this problem, we design a code-aided estimation scheme which exploits information from both the pilot symbols and the unknown coded data symbols. The algorithm is derived based on a factor graph representation of the system and application of the sum-product algorithm. The sum-product algorithm reveals how soft information from the decoder should be exploited for the purpose of estimation and how the information bits can be detected. Simulation results illustrate the effectiveness of our approach.


INTRODUCTION
Communication over time-varying fading channels has been studied intensively during the last decade [1][2][3]. The introduction of turbo coding and channel interleaving gave rise to astounding performance results. In particular, channel interleaving [2][3][4] combined with coding can combat the adverse conditions originating from the time-varying nature of the channel by spreading channel errors, caused by deep fades, over the full length of the frame. When multiple transmit and receive antennas are further applied (resulting in a so-called MIMO transmission), high data rates and high diversity gains can be achieved simultaneously. However, in order to fully exploit these advantages, accurate knowledge of the channel state is required. Although a lot of research effort has been focused on this subject [5][6][7][8][9][10][11], estimation and tracking of fading channels remains a major challenge.
The Kalman filter/smoother [12] is a powerful tool to obtain the minimum mean-squared error (MMSE) estimate of a parameter varying according to a discrete-time linear model. This technique is particularly convenient for pilot-assisted estimation of a time-varying channel [7][8][9]. However, estimating a time-varying channel in the presence of unknown data symbols is not possible by straightforward Kalman filtering/smoothing. This has led to the introduction of several modified approaches for the estimation of a time-varying channel (see [7, 10, 11] and references therein). The problem related to the unknown symbols was circumvented by introducing an iterative decision-directed structure.
It was recognized several years ago that Kalman filtering can be interpreted as a message-passing algorithm (the sum-product (SP) algorithm) on a factor graph [13]. Ever since, the SP algorithm has been applied to a variety of estimation problems [14][15][16][17], capitalizing on the concepts from [18, 19]: the algorithm iterates between decoding and estimation, whereby the estimator accepts information from the decoder about the unknown data symbols. In [14], the estimation of a linear dynamical noise process is considered. In [15], the authors consider the tracking of a time-varying complex gain for single-input single-output (SISO) channels. A similar problem, namely phase noise estimation, is considered in [16, 17]. As elaborated upon in these works, the SP algorithm runs into practical difficulties in the presence of unknown data symbols. The problems are alleviated by representing and computing the messages in an efficient fashion.
In this paper, we apply these ideas to the factor graph of a flat-fading correlated multiple-input multiple-output (MIMO) system with bit-interleaved coded modulation (BICM). The temporal behavior of our channel is modeled as a first-order autoregressive model [11, 20], whereas the spatial correlation abides by the findings from [21, 22]. As we will show, the complexity of the SP algorithm, in its exact form, is exponential in the block length. To overcome this problem, we introduce a suitable approximation. The resulting code-aided estimator exploits information about the received signal as well as soft information from the decoder in a systematic manner. This paper is organized as follows. A short introduction to factor graphs is given in Section 2. The system model is described in Section 3. This is followed by a factor graph representation of the receiver and a derivation of the SP algorithm on this graph. In Section 5, the practical estimation algorithm is derived. Before conclusions are drawn, the performance of the proposed algorithm is illustrated in Section 6.

FACTOR GRAPHS AND THE SUM-PRODUCT ALGORITHM
In this section, we briefly outline the basic ideas behind factor graphs and the sum-product algorithm. We refer to [13, 23] for a more thorough treatment.

Factor graph
A factor graph is an elegant way to express the factorization of a function of many variables. As an example, consider the factor graph depicted in Figure 1, which represents the factorization of a function into a product of local factors. We observe two types of nodes: function nodes (indicated by squares) and variable nodes (indicated by circles). When a factor depends on some variable, there is an edge between the corresponding function node and variable node.
It is interesting to note that any type of function is suitable for a factor graph representation; however, throughout this paper we will only consider the factorization of probability density functions.

Sum-product algorithm
In addition to visualizing the factorization of a (complicated) function, factor graphs also allow us to compute the marginals of that function in a systematic manner. The marginal of a function f(x_1, ..., x_N) with respect to the variable x_i is defined as

g_i(x_i) = Σ_{∼{x_i}} f(x_1, ..., x_N),   (2)

where ∼{x_i} represents the set containing all variables except x_i. If (some of) the variables are continuous, the summations with respect to these variables in (2) should be replaced by integrals.
The SP algorithm is a message-passing algorithm that provides an efficient way to compute the marginals (2). Messages are computed in the different nodes based on the incoming messages at these nodes. Depending on the type of node, the outgoing messages are computed according to

variable node: μ_{x→f}(x) = Π_{h ∈ n(x)\{f}} μ_{h→x}(x),   (3)

function node: μ_{f→x}(x) = Σ_{∼{x}} f(n(f)) Π_{y ∈ n(f)\{x}} μ_{y→f}(y),   (4)

where n(v) denotes the set of neighbors of node v. The message-passing algorithm is initiated at nodes of degree 1, that is, nodes which are connected to one neighboring node only. Messages travel on the graph until all ingoing and outgoing messages of all nodes have been computed. If the graph contains no cycles, the algorithm is assured to converge, and the marginal with respect to a certain variable is obtained as the product of a pair of in- and outgoing messages at the corresponding variable node:

g_i(x_i) = μ_{x_i→f}(x_i) μ_{f→x_i}(x_i).   (5)

If the graph does contain cycles, the algorithm becomes iterative and the computed marginals are no longer assured to be exact. The larger the cycles are, the more accurately the computed marginals will approximate the true marginals.
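As a minimal numerical sketch of the SP rules above, consider a toy cycle-free chain with two factors and three binary variables; the factors fA and fB and all variable names are illustrative choices, not taken from the paper. The marginal obtained by message passing is checked against brute-force enumeration.

```python
import numpy as np

# Toy sum-product on a cycle-free chain f(x1, x2, x3) = fA(x1, x2) * fB(x2, x3),
# all variables binary.  Messages are length-2 vectors; the marginal of x2 is
# the product of the two incoming messages, as stated in the text.
fA = np.array([[0.9, 0.1], [0.2, 0.8]])   # fA[x1, x2], arbitrary illustrative factor
fB = np.array([[0.7, 0.3], [0.4, 0.6]])   # fB[x2, x3], arbitrary illustrative factor

# Leaf-initiated messages: x1 and x3 have degree 1, so they send all-ones messages.
mu_x1_to_fA = np.ones(2)
mu_x3_to_fB = np.ones(2)

# Function-node rule (4): sum out all variables except the target variable x2.
mu_fA_to_x2 = fA.T @ mu_x1_to_fA          # sum over x1
mu_fB_to_x2 = fB @ mu_x3_to_fB            # sum over x3

# Marginal of x2 (5): product of the in- and outgoing messages, then normalize.
marginal_x2 = mu_fA_to_x2 * mu_fB_to_x2
marginal_x2 /= marginal_x2.sum()

# Brute-force check: enumerate all (x1, x2, x3) and marginalize directly as in (2).
joint = fA[:, :, None] * fB[None, :, :]
brute = joint.sum(axis=(0, 2))
brute /= brute.sum()
assert np.allclose(marginal_x2, brute)
```

Because the chain is cycle-free, a single forward/backward sweep suffices and the computed marginal is exact, as stated above.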

SYSTEM MODEL
We consider a flat-fading MIMO channel with N_T transmit and N_R receive antennas. The transmitter, based on BICM (as illustrated in Figure 2), encodes and interleaves a sequence of L information bits b = [b_1, ..., b_L]. The resulting coded bits are mapped to a sequence of K coded symbol vectors a_k, k = 1, ..., K, each of dimension N_T × 1. The nth entry of a_k denotes the coded symbol transmitted by the nth antenna at instant k. The mapping is described by a bijective mapping function M : {0, 1}^{M N_T} → Ω^{N_T}, where Ω denotes a 2^M-ary signal set, that is,

a_k = M(a_k[1], ..., a_k[M N_T]),   (6)

with {a_k[m], m = 1, ..., M N_T} denoting the M N_T coded bits that are contained in the symbol vector a_k. Irrespective of the type of mapping function, whether it concerns a single- or multidimensional [24] mapping, we can generally state that each symbol vector a_k depends on M N_T bits. Note that inserting a bit interleaver between the encoder and the modulator spreads the burst errors introduced by the time-selective fading channel. This way, the channel appears to be uncorrelated from the decoder's point of view and the time diversity provided by the fading channel is fully exploited. Assuming a flat-fading channel, the received signal after matched filtering can be captured in the following discrete-time model:

y_k = H_k a_k + w_k,   (7)

where y_k is an N_R × 1 vector of received signal samples at time instant k, H_k denotes the N_R × N_T channel matrix, a_k denotes the N_T × 1 transmitted symbol vector, with an average energy per symbol equal to E_s, and w_k is an N_R × 1 vector of independent white complex Gaussian noise samples with independent real and imaginary parts, each with a variance equal to N_0/2. We introduce the matrix of received samples Y = [y_1, ..., y_K]. In practice, the channel coefficients corresponding to the links between the different transmit and receive antennas will not be (totally) uncorrelated. The impact of this spatial correlation can be modeled by decomposing the channel matrix at each time instant as follows [21, 22]:

H = Σ_R^{1/2} N Σ_T^{1/2},   (8)
where Σ_T and Σ_R denote the transmit and receive array correlation matrices and where N denotes an N_R × N_T matrix containing i.i.d. zero-mean, unit-variance complex Gaussian elements. Various models have been proposed to characterize the temporal behavior of fading channels. Capitalizing on the information-theoretic results from [25], we adopt a first-order autoregressive model, or Gauss-Markov model, in this paper. Accordingly, our fading channel can be modeled as

H_k = α H_{k−1} + sqrt(1 − α²) Σ_R^{1/2} N_k Σ_T^{1/2},   (9)

where N_k represents an N_R × N_T matrix containing i.i.d. zero-mean, unit-variance complex Gaussian elements. We further assume that the channel retains the steady-state statistics given by (8) at instant k = 1. Thus, H_k will be a stationary process with the following properties, for all time instants k:

E[H_k^{(n,m)}] = 0,   E[H_k^{(n,m)} (H_k^{(n',m')})*] = Σ_R^{(n,n')} Σ_T^{(m,m')},   (10)

where X^{(n,m)} denotes the (n, m)th entry of the matrix X. The coefficient α (with |α| < 1) is related to the Doppler spread f_d according to the first-order approximation of Jakes' channel model [26]:

α = J_0(2π f_d T),   (11)

where T is the symbol period and J_0(·) denotes the zeroth-order Bessel function of the first kind. The closer α is to 1, the smaller the Doppler spread and the slower the fading. Channel model (9) is general and permits both temporal and spatial correlation. Note that a similar channel model was adopted in [20] for single-input multiple-output (SIMO) channels. Several other channel models can be considered as special cases of our model. The quasi-static correlated fading model from [21, 22] is obtained by setting α = 1. The fast-fading model from [11] with uncorrelated antennas can be cast into this general model by setting Σ_T = I and Σ_R = I.
To facilitate the analysis in the remainder of the paper, we introduce a vector notation of the channel matrix, h = vec(H^T), where the rows of H are transposed and stacked in the N_T N_R × 1 column vector h. Based on this new notation, we can rewrite the channel state (9) in the following manner:

h_k = α h_{k−1} + sqrt(1 − α²) Σ^{1/2} n_k,   (12)

where we introduced the array correlation matrix Σ ≜ Σ_R ⊗ Σ_T (with ⊗ denoting the Kronecker product) and where the N_T N_R × 1 vector n_k contains i.i.d. zero-mean, unit-variance Gaussian elements.
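The Gauss-Markov model (9) with the spatial correlation structure (8) can be sketched numerically as follows; this is our own illustrative simulation (all parameter values are hypothetical choices), checking empirically that the per-entry channel variance stays at its steady-state value of one.

```python
import numpy as np

# Sketch of the Gauss-Markov channel
#   H_k = alpha * H_{k-1} + sqrt(1 - alpha^2) * Sigma_R^{1/2} N_k Sigma_T^{1/2},
# initialized from the steady-state distribution (8).
rng = np.random.default_rng(0)
N_R, N_T, K, alpha, rho = 2, 2, 50000, 0.99, 0.8   # hypothetical parameters
Sigma_T = np.array([[1.0, rho], [rho, 1.0]])
Sigma_R = np.eye(N_R)

def sqrtm_psd(S):
    # Matrix square root of a positive semidefinite matrix via eigendecomposition.
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.T

def cgauss(shape):
    # i.i.d. zero-mean, unit-variance complex Gaussian entries.
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

St, Sr = sqrtm_psd(Sigma_T), sqrtm_psd(Sigma_R)
H = Sr @ cgauss((N_R, N_T)) @ St          # steady-state draw at k = 1, as in (8)
var = []
for k in range(K):
    H = alpha * H + np.sqrt(1 - alpha**2) * (Sr @ cgauss((N_R, N_T)) @ St)
    var.append(np.mean(np.abs(H)**2))

# The per-entry variance should hover around the steady-state value of 1.
assert abs(np.mean(var) - 1.0) < 0.15
```

The same recursion in the vectorized form (12) is obtained by replacing the matrix products with Σ^{1/2} n_k, where Σ = Σ_R ⊗ Σ_T.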

DETECTION AND ESTIMATION USING THE SP ALGORITHM
The main objective of the receiver in a digital communication system is to detect the transmitted information bits. In order to do so, the receiver requires an accurate estimate of the channel matrix at each time instant. In this section, we apply the concepts introduced in Section 2 to the detection and estimation problem at hand. The resulting algorithm yields channel estimates and reveals how these estimates should be applied in order to detect the information bits. The theoretical derivations from this section are transformed into a practical algorithm in Section 5.

Factor graph
Considering the information bits b, the data symbol matrix A = {a_k}_{k=1,...,K}, and the set of all channel gain matrices H = {H_k}_{k=1,...,K} as variables, we can write their joint a posteriori distribution as

p(b, A, H | Y) ∝ p(Y | A, H) p(A | b) p(b) p(H),   (13)

where we assumed that the transmitted symbols are independent of the channel. This is a reasonable assumption, since it is hard to obtain accurate channel knowledge at the transmitter side in fast-fading channels, and it is therefore difficult to exploit channel knowledge for selecting optimal transmission strategies. Observing the Markov chain behavior of the channel (9), we can factor the joint probability of the channel matrices at the different time instants 1, ..., K as follows:

p(H) = p(H_1) Π_{k=2}^{K} p(H_k | H_{k−1}),   (14)

where p(H_k | H_{k−1}) is fully determined by (9):

p(h_k | h_{k−1}) ∝ exp( −(h_k − α h_{k−1})^H ((1 − α²)Σ)^{−1} (h_k − α h_{k−1}) ),   (15)

with h_k = vec(H_k^T). The flat-fading channel model (7) further implies that

p(Y | A, H) = Π_{k=1}^{K} p(y_k | a_k, H_k) ∝ Π_{k=1}^{K} exp( −‖y_k − H_k a_k‖² / N_0 ),   (16)

where ‖·‖ denotes the Frobenius norm. Interpreting p(b, A, H | Y) as a function of the variables b, A, and H, and taking the factorizations (13), (14), and (16) into account, we obtain the factor graph depicted in Figure 3; for more clarity, a detail of Figure 3 is presented in Figure 4. We assume that the information bits are independent. The node marked C represents the constraint on the coded bits, enforced by the code. Together with the interleaver and the mapper nodes, this part of the graph represents the factorization of p(A | b).

Sum-product algorithm
The SP algorithm permits us to compute the marginals of p(b, A, H | Y). The purpose of the receiver is to detect the information bits; hence, the only relevant marginals are the a posteriori probabilities of the information bits, p(b_l | Y) for all l. In order to recover these, we compute the corresponding messages on the factor graph.
Unfortunately, the graph from Figure 3 contains cycles. It is well known [13] that in this scenario (i) the SP algorithm produces approximations of the marginals instead of the exact marginals, and (ii) the SP algorithm becomes iterative. Although suboptimal, the SP algorithm still produces good results, as long as the cycles are not too short and sufficiently many iterations are performed [23].
We will distinguish two phases within the iterative algorithm: a detection phase and an estimation phase. Information about the coded symbols and the channel is exchanged between these two stages. The detection phase serves two purposes: (1) to compute the extrinsic information of the coded symbol vectors, P_e(a_k), which is required for the channel estimation, as explained in Section 4.2.2; and (2) to return the a posteriori probabilities of the information bits after convergence of the SP algorithm.
A typical iterative detector operates according to the turbo principle by exchanging so-called extrinsic information between the demapper and the decoder. Although a thorough investigation of these parts is not within the scope of the present paper, we provide a short overview of their interaction; the interested reader is referred to [4, 13, 24] for more details. At the start of the detection phase, we receive channel information from the estimator by means of the messages P_e(H_k) defined in Figure 4 (Section 4.2.2 considers how P_e(H_k) is computed). Together with the information obtained from the observation y_k, we compute the messages P_LH(a_k) according to the SP rule (4), that is,

P_LH(a_k) ∝ ∫ p(y_k | a_k, H_k) P_e(H_k) dH_k.   (17)

The message P_LH(a_k) can be interpreted as the likelihood (LH) of the observation y_k given the transmitted symbol vector a_k and the a priori distribution P_e(H_k) of the channel. The operation referred to as demapping converts these symbol likelihoods into coded-bit likelihoods P_LH(a_k[m]) by accepting from the decoder extrinsic information on the coded bits, where P_e(a_k[m']) denotes the extrinsic information with respect to the m'th bit of the kth symbol vector, provided by the decoder. A description of F_{M→D}(·) can be found in [4, 24]. Similarly, the decoder accepts a deinterleaved version of the bit likelihoods P_LH(a_k[m]) and a priori information P_a(b_l) on the information bits to update the extrinsic information P_e(a_k[m]). For various codes, evaluation of F_{D→M}(·) can be done in a computationally efficient manner [27][28][29]. Iterations between the demapper and decoder are performed until convergence. The detection phase ends by returning the extrinsic symbol vector probabilities P_e(a_k) = Π_m P_e(a_k[m]) to the estimator.
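The demapping step can be sketched as follows for BPSK with N_T = 2 (so each symbol vector carries two bits). The symbol-vector likelihoods and extrinsic probabilities below are hypothetical numbers chosen for illustration, and the marginalization shown is the standard BICM demapping rule, not a verbatim reproduction of F_{M→D}(·) from the paper.

```python
import numpy as np

# Demapping sketch: convert symbol-vector likelihoods P_LH(a_k) into per-bit
# likelihoods by summing over all symbol vectors consistent with a bit value,
# weighted by the decoder's extrinsic probabilities of the *other* bits.
bits = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # bit labels of the 4 BPSK vectors
P_LH = np.array([0.5, 0.2, 0.2, 0.1])               # hypothetical symbol likelihoods
P_e_bits = np.array([0.5, 0.5])                     # extrinsic from decoder (uniform here)

def bit_likelihood(m, b):
    # Likelihood of bit m taking value b, marginalizing the other bit positions.
    sel = bits[:, m] == b
    w = np.ones(len(bits))
    for j in range(bits.shape[1]):
        if j != m:
            w *= np.where(bits[:, j] == 1, P_e_bits[j], 1 - P_e_bits[j])
    return np.sum(P_LH[sel] * w[sel])

L0, L1 = bit_likelihood(0, 0), bit_likelihood(0, 1)
# The high-likelihood vectors have bit 0 equal to 0, so L0 exceeds L1.
assert L0 > L1
```

With uniform extrinsic input, the per-bit likelihood reduces to a plain marginalization of P_LH over the symbol set; nonuniform decoder feedback sharpens it in later iterations.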
When the entire SP algorithm has converged, the decoder computes the extrinsic probabilities P_e(b_l) of the information bits in an efficient manner. The resulting a posteriori probabilities of the information bits are obtained as

p(b_l | Y) ∝ P_a(b_l) P_e(b_l).   (21)

Based on (21), final decisions with respect to the information bits are made. Algorithm 1 summarizes the operation of the detector.

Estimation
The estimation phase corresponds to the SP operations at the nodes p(H) and p(Y | A, H). At the beginning of the estimation phase, we have the extrinsic symbol vector probabilities P_e(a_k) at our disposal. The goal of the estimator is to update the extrinsic channel probabilities P_e(H_k) and feed these back to the detector. We distinguish two types of messages in the evaluation of the sum-product algorithm: forward and backward messages.

Forward message passing
In the forward message-passing phase, we compute the messages P^f_{k|k−1}(H_k), P^f_{k|k}(H_k), and P_LH(H_k), which are defined in Figure 4. The relation between these messages is found by a straightforward application of the sum-product rules (3) and (4). Based on (4), we deduce the following relations:

P^f_{k|k−1}(H_k) = ∫ p(H_k | H_{k−1}) P^f_{k−1|k−1}(H_{k−1}) dH_{k−1},   (22)

P_LH(H_k) = Σ_a P_e(a_k = a) p(y_k | a_k = a, H_k).   (23)

From (3), we obtain

P^f_{k|k}(H_k) ∝ P^f_{k|k−1}(H_k) P_LH(H_k).   (24)

Combining (22), (23), and (24), we obtain a recursive relation between P^f_{k|k}(H_k) and P^f_{k−1|k−1}(H_{k−1}) of the form

P^f_{k|k}(H_k) ∝ P_LH(H_k) ∫ p(H_k | H_{k−1}) P^f_{k−1|k−1}(H_{k−1}) dH_{k−1}.   (25)

Note that when a variable is defined over a continuous domain (C^{N_R × N_T} in the case of H_k), the representation and computation of the messages is a major complexity issue in the SP algorithm. In Section 5, we will tackle this particular problem.
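For Gaussian messages and known symbols, the forward recursion (22)-(24) is exactly a Kalman filter, as Section 5 makes precise. The following is our own minimal real-valued sketch of that data-aided case (all names and parameter values are illustrative, and the observation rows stand in for known symbol vectors):

```python
import numpy as np

# Minimal sketch of the forward recursion (22)-(24) as a Kalman filter on the
# vectorized channel h_k, with Sigma = I and known symbols (data-aided case).
rng = np.random.default_rng(1)
d, K, alpha, N0 = 2, 200, 0.99, 0.1       # hypothetical dimensions and parameters
h = rng.standard_normal(d)                # true channel state (real-valued toy)
m, P = np.zeros(d), np.eye(d)             # filter mean and covariance
err = []
for k in range(K):
    # Channel evolves per the Gauss-Markov model (12).
    h = alpha * h + np.sqrt(1 - alpha**2) * rng.standard_normal(d)
    A = rng.standard_normal((1, d))       # known "symbol" row (observation matrix)
    y = A @ h + np.sqrt(N0) * rng.standard_normal(1)
    # Prediction step: message P^f_{k|k-1}, eq. (22).
    m, P = alpha * m, alpha**2 * P + (1 - alpha**2) * np.eye(d)
    # Update step: multiply by the likelihood message P_LH, eq. (24).
    S = A @ P @ A.T + N0                  # innovation variance
    G = P @ A.T / S                       # Kalman gain
    m = m + (G * (y - A @ m)).ravel()
    P = P - G @ A @ P
    err.append(np.sum((m - h)**2))

# After the initial transient, the filter tracks the channel closely.
assert np.mean(err[50:]) < 0.5
```

With unknown symbols, the sum over a in (23) turns each update into a Gaussian mixture, which is precisely the intractability addressed in Section 5.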

Backward message passing
Based on the SP rules, we can also compute the backward messages from Figure 4:

P^b_{k|k}(H_k) ∝ P^b_{k|k+1}(H_k) P_LH(H_k),   (26)

P^b_{k|k+1}(H_k) = ∫ p(H_{k+1} | H_k) P^b_{k+1|k+1}(H_{k+1}) dH_{k+1}.   (27)

Again, we obtain a backward recursive relation between these messages:

P^b_{k|k}(H_k) ∝ P_LH(H_k) ∫ p(H_{k+1} | H_k) P^b_{k+1|k+1}(H_{k+1}) dH_{k+1}.   (28)

Information to the detector
As readily seen from Figure 4, P_e(H_k) follows from (3):

P_e(H_k) ∝ P^f_{k|k−1}(H_k) P^b_{k|k+1}(H_k).   (29)

Finally, the estimator returns this extrinsic information about the channel matrix to the detector. The operation of the entire estimation phase is summarized in Algorithm 2.

Regarding complexity
An important issue with respect to factor graphs is how the messages are scheduled along the graph during the SP computation. A proper scheduling of the messages can reduce the computational complexity of the receiver. As outlined in Section 4.2.1, the detector itself is iterative. Iterations occur between the demapper and decoder or within the decoder itself (e.g., for turbo-like codes). To minimize the overhead caused by the estimation, we propose to embed the estimation into this iterative detection process. Our intent is to perform only a single demapping or decoding iteration within each detection stage and to maintain, rather than reset, state information at the beginning of the detection phase. More specifically, the value I_MAX in Algorithm 1 is set equal to I_MAX = 1 and the initialization P_e(a_k[m]) = 1/2, for all k, m, is ignored. Furthermore, when the decoding process itself is iterative, only one decoding iteration per detection iteration is performed.

Algorithm 2: Description of estimator operation. (1) Input: P_e(a_k), for all k (from the detector). (2) Initialize P^f_{0|0}(H_0).

PRACTICAL ESTIMATION ALGORITHM
In this section, we derive a practical iterative estimation algorithm based on the results from the previous section. Before we evaluate the SP algorithm, we recall that the representation and computation of the messages in the SP algorithm is not always straightforward. In particular, messages that operate on continuous variables are often difficult to represent or can lead to intractable update rules (e.g., an intractable integration in (22) or (27)). However, a few message types admit a fairly easy representation. Gaussian probability density functions (pdfs), for example, are entirely defined by their mean vectors and covariance matrices, which allows a very straightforward representation. As we observe from (23), P_LH(H_k), and consequently also P^f_{k|k}(H_k) and P^b_{k|k}(H_k), are not Gaussian pdfs, but rather mixtures of Gaussian pdfs. Furthermore, the number of terms in these mixtures grows exponentially with increasing time index k for P^f_{k|k}(H_k) and with decreasing k for P^b_{k|k}(H_k). Hence, the exact representation and computation of these messages becomes intractable. In order to solve this problem, we perform a well-chosen approximation. The idea is to approximate each of these messages, again, by a single Gaussian pdf (instead of a mixture of Gaussian pdfs).
In order to do so, we approximate the distribution P_LH(H_k) by

P_LH(H_k) ≈ p(y_k | a_k = ā_k, H_k),   (30)

where ā_k is defined as the soft-symbol decision based on the extrinsic probabilities:

ā_k = Σ_a a P_e(a_k = a).   (31)

The error induced by this approximation is minor when the distribution P_e(a_k) has a pronounced peak, that is, when P_e(a_k = a) ≈ 1 for a particular a and P_e(a_k = a*) ≪ 1 for a* ≠ a. Hence, as long as the detector provides reliable information, the approximation is accurate. We conjecture that the approximation is quite accurate in any relevant context, since, in general, code-aided estimation schemes only perform well when they have access to sufficiently reliable information about the unknown symbols.
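The soft-symbol decision (31) is simply the expectation of the symbol vector under the extrinsic distribution. A small sketch for BPSK with N_T = 2, using a hypothetical set of extrinsic probabilities with a pronounced peak at (+1, +1):

```python
import numpy as np

# Soft-symbol decision (31): a_bar = sum_a a * P_e(a_k = a), for BPSK on
# N_T = 2 antennas.  The extrinsic probabilities are illustrative values.
symbols = np.array([[+1, +1], [+1, -1], [-1, +1], [-1, -1]], dtype=float)
P_e = np.array([0.90, 0.04, 0.04, 0.02])        # extrinsic probabilities, sum to 1
a_bar = P_e @ symbols                           # expectation over the symbol set

# With a pronounced peak at (+1, +1), the soft symbol lies close to it,
# illustrating when the approximation (30) is accurate.
assert np.all(np.abs(a_bar - np.array([1.0, 1.0])) < 0.25)
```

When P_e(a_k) is flat (unreliable decoder feedback), ā_k shrinks toward zero and the likelihood message carries little channel information, consistent with the discussion above.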
The approximation in formula (30) allows us to represent P_LH(H_k) by a Gaussian pdf. Since the product of Gaussian pdfs (as in (24) and (26)) and the marginalization of a Gaussian pdf (as in (22), (23), and (27)) result in a Gaussian pdf again, all forward and backward messages on the graph turn out to be Gaussian pdfs. Hence, all messages within the SP algorithm can easily be represented by their mean vectors and covariance matrices.
In the next two paragraphs, we tackle the actual computation of these messages. We consider two scenarios: correlated receive antennas and uncorrelated receive antennas.

Correlated receive antennas (Σ_R ≠ I)
As shown in Figure 3, the estimation phase corresponds to the upper part of the factor graph. It is readily seen from (12) and (30) that this part of the factor graph represents the following state-space model:

h_k = α h_{k−1} + sqrt(1 − α²) Σ^{1/2} n_k,
y_k = Ā_k h_k + w_k,   (32)

where we introduced the N_R × N_T N_R matrix

Ā_k = I_{N_R} ⊗ ā_k^T.   (33)

The evaluation of the SP algorithm on a factor graph representing a state-space model similar to (32) has been considered in [13, 23]. The main conclusion was that the SP algorithm boils down to a straightforward Kalman smoother. As elaborated upon above, all messages on the factor graph are Gaussian pdfs. The recursive relations between them are obtained by evaluating (25) and (28) for Gaussian pdfs. This results in a Kalman smoother, which defines the relations between the means and covariance matrices of these Gaussian pdfs. We refer to [12, 23] for the Kalman filter/smoother update rules.
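The structure of the observation matrix in (32)-(33) rests on the identity (I_{N_R} ⊗ a^T) vec(H^T) = H a, which the following check (our own sketch, with arbitrary random dimensions and values) verifies numerically:

```python
import numpy as np

# Verify that, with h = vec(H^T) (rows of H stacked), the observation matrix
# A = I_{N_R} (x) a^T satisfies A h = H a, so y_k = A_k h_k + w_k reproduces
# the original flat-fading model (7).
rng = np.random.default_rng(2)
N_R, N_T = 3, 2
H = rng.standard_normal((N_R, N_T)) + 1j * rng.standard_normal((N_R, N_T))
a = rng.standard_normal(N_T) + 1j * rng.standard_normal(N_T)

h = H.T.ravel(order='F')                  # vec(H^T): stacks the rows of H
A = np.kron(np.eye(N_R), a.reshape(1, -1))  # block-diagonal with rows a^T

assert A.shape == (N_R, N_T * N_R)
assert np.allclose(A @ h, H @ a)
```

The block-diagonal shape of Ā_k is also what makes the decoupling in the uncorrelated-receive case below immediate: each block acts on one row of H_k only.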

Uncorrelated receive antennas (Σ_R = I)
When receive correlation is absent or ignored, the section of the factor graph corresponding to p(H) decouples. We can factorize the nodes corresponding to p(y_k | a_k, H_k) and p(H_k | H_{k−1}) over the receive antennas as follows:

p(y_k | a_k, H_k) = Π_{n=1}^{N_R} p(y_k^{(n)} | a_k, h_k^{(n)}),
p(H_k | H_{k−1}) = Π_{n=1}^{N_R} p(h_k^{(n)} | h_{k−1}^{(n)}),   (34)

where h_k^{(n)} denotes the nth column of H_k^T and y_k^{(n)} the nth entry of y_k. Similarly, we can decouple the approximation for P_LH(H_k) in (30):

P_LH(H_k) = Π_{n=1}^{N_R} P_LH(h_k^{(n)}).   (35)

Note that the latter factorization is valid for any Σ_R. We can easily take these factorizations into account by replacing the grey area in our original factor graph from Figure 3 with the grey area from Figure 5. The state-space equations that correspond to this part of the factor graph are now given by

h_k^{(n)} = α h_{k−1}^{(n)} + sqrt(1 − α²) Σ_T^{1/2} n_k^{(n)},
y_k^{(n)} = ā_k^T h_k^{(n)} + w_k^{(n)},   (36)

for n = 1, ..., N_R. Again, the evaluation of the SP algorithm boils down to Kalman smoothing. However, compared with the general case Σ_R ≠ I, the complexity is reduced significantly. Instead of one large Kalman smoother, we encounter a bank of N_R parallel Kalman smoothers. Furthermore, the bulk of the required computations are common to all these Kalman smoothers. As seen from (36), only the observations y_k^{(n)} differ among the state equations for the different antennas.
The other inputs remain the same, and the Kalman smoothers share common covariance matrices (whereas the mean vectors differ). Breaking up the state equations according to (36) yields a reduction in the computational complexity proportional to N_R².
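The shared-covariance property can be checked directly: with Σ_R = I, one joint covariance update with the block-diagonal observation matrix (33) equals N_R copies of the small per-antenna update. This is our own numerical sketch (arbitrary prior and parameters), not code from the paper:

```python
import numpy as np

# With Sigma_R = I, the joint Kalman covariance update with A = I (x) a^T
# decouples: the updated joint covariance is block diagonal with N_R identical
# blocks, so a single small covariance recursion can be shared by all antennas.
rng = np.random.default_rng(3)
N_R, N_T, N0 = 2, 2, 0.2
a = rng.standard_normal(N_T)                   # soft-symbol row (illustrative)
P_small = np.eye(N_T)                          # per-antenna prior covariance
P_big = np.kron(np.eye(N_R), P_small)          # joint prior covariance
A = np.kron(np.eye(N_R), a.reshape(1, -1))

# Joint covariance update (one large Kalman step).
S = A @ P_big @ A.T + N0 * np.eye(N_R)
G = P_big @ A.T @ np.linalg.inv(S)
P_big_upd = P_big - G @ A @ P_big

# Per-antenna update (shared by every antenna).
s = a @ P_small @ a + N0
g = P_small @ a / s
P_small_upd = P_small - np.outer(g, a @ P_small)

assert np.allclose(P_big_upd, np.kron(np.eye(N_R), P_small_upd))
```

Only the mean recursions differ across antennas (through the observations y_k^(n)), which is the source of the complexity reduction stated above.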

Known data symbols: initialization
If all the transmitted symbols are known to the receiver, the message P_e(a_k = a) is 1 when a equals the actual value of the kth transmitted symbol vector, and 0 otherwise. Thus, the resulting factor graph contains only the parts p(H) and p(Y | A, H) from Figure 3, along with the input messages P_e(a_k = a). This graph is cycle-free; hence, the a posteriori probability functions computed by the SP algorithm are exact. Naturally, this algorithm amounts to a standard data-aided Kalman smoother.
In practice, of course, we wish to transmit unknown coded symbols over the fading channel. Still, we periodically insert some known symbols to provide initial channel estimates and to prevent the algorithm from diverging. Divergence can occur due to the inherent ambiguities between the channel parameters and the unknown symbols (as mentioned in [5, 11]).
In the first iteration, no information is available about the unknown symbols, and estimation is performed based on these pilot symbols only. More specifically, for instants k corresponding to unknown data symbol vectors, the messages P_LH(H_k) are ignored. This is equivalent to equating the soft symbols to zero in the state-space models (32) and (36). For each instant k that corresponds to a pilot symbol, ā_k is replaced by the actual value of the transmitted pilot symbol vector a_k.
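The equivalence between ignoring P_LH(H_k) and setting the soft symbol to zero can be seen in one Kalman step: a zero observation row yields a zero gain, so the update collapses to a pure prediction. A small sketch of this (our own toy step, with arbitrary parameter values):

```python
import numpy as np

# First-iteration pilot handling: at data positions the soft symbol is zero,
# which makes the Kalman update a pure prediction (zero observation matrix
# gives zero gain), exactly equivalent to ignoring the message P_LH(H_k).
d, alpha, N0 = 2, 0.95, 0.1
m, P = np.ones(d), 0.5 * np.eye(d)

def kalman_step(m, P, a_soft, y):
    m, P = alpha * m, alpha**2 * P + (1 - alpha**2) * np.eye(d)   # predict
    A = a_soft.reshape(1, -1)
    S = A @ P @ A.T + N0
    G = P @ A.T / S
    m = m + (G * (y - A @ m)).ravel()
    P = P - G @ A @ P
    return m, P

# Data position with no symbol information yet: soft symbol = 0.
m1, P1 = kalman_step(m, P, np.zeros(d), np.zeros(1))

# Equivalent pure prediction step.
m_pred = alpha * m
P_pred = alpha**2 * P + (1 - alpha**2) * np.eye(d)
assert np.allclose(m1, m_pred) and np.allclose(P1, P_pred)
```

At pilot positions, a_soft is replaced by the known pilot vector and the update contributes its full correction.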

SIMULATION RESULTS
We present simulation results for a MIMO BICM scheme [24, 30] with N_T = 2 transmit antennas and N_R = 2 receive antennas. At the transmitter side, we use a rate-1/2 recursive convolutional code with octal polynomials (37, 31)_8, a random interleaver, and a BPSK symbol mapper. The channel is generated according to (9) and (11) for two different fading rates, f_d T = 0.02 and f_d T = 0.005. The results shown in Figures 6 and 7 are for spatially uncorrelated channels (Σ_T = Σ_R = I), whereas the impact of antenna correlation is considered in Figure 8. Frames consist of 1440 coded information bits, and a number of pilot symbols are periodically inserted to provide initial channel estimates and to avoid divergence. The pilot symbol energy is set equal to the average data symbol energy. The bit energy to noise ratio (E_b/N_0) is computed without taking the energy required for pilot symbol transmission into account.
Figure 6 illustrates the channel-tracking performance by comparing the mean value of the messages P_e(H_k) with the true channel H_k. In the first iteration, only information about the pilot symbols is used, so that the algorithm corresponds to a pure data-aided Kalman smoother. As we observe, the ability to track the channel substantially improves after a few iterations. As expected, exploiting information from the decoder about the unknown coded symbols in the second and further iterations improves the channel estimation.
The curves in Figure 7 correspond to the BER performance on our MIMO time-varying fading channel (f_d T = 0.005 on the left and f_d T = 0.02 on the right, both without antenna correlation). We compare the performance of the iterative detector where the channel estimates are provided solely based on pilot symbols with that of the iterative code-aided estimation scheme, where the code-aided estimator is embedded in the iterative detector (as explained in Section 4.2.2). In Figure 7 (left), we inserted one pilot symbol (on each antenna) every 20 coded symbols, which corresponds to a 5% pilot overhead. The performance of the iterative algorithm after convergence is close to the known-channel performance. Comparing the BER after the first iteration to the BER after convergence, we observe a 2 dB gain that results from iterating between detection and estimation. The code-aided estimator also yields more than 1 dB gain compared to the pilot-based estimator after convergence. In Figure 7 (right), the fading rate is higher (a larger f_d T, hence a smaller α in (9)), and the BER performance benefits from the additional time diversity. This gain is obtained thanks to the interleaver, which spreads the error bursts, caused by occasional deep fades, over the entire codeword. This property has been widely examined [2, 3] and emphasizes the benefit of using BICM for fading channels. As far as the estimation is concerned, we increased the pilot overhead to 10% (insertion of one pilot symbol every 10 coded symbols) to avoid divergence of the iterative SP algorithm. The gain from exploiting the code is apparent again.
Finally, we consider the BER performance on a fading 2 × 2 MIMO channel with transmit antenna correlation. We assume that the transmit correlation matrix is given by

Σ_T = [ 1  ρ ; ρ  1 ].

Simulation results are shown for ρ = 0.8 and ρ = 0.95. We further consider two different scenarios: (i) the receiver knows the transmit correlation ρ and takes it into account in the SP computation; (ii) the receiver does not know the correlation and assumes ρ = 0. Figure 8 shows the BER performance after 3 iterations for a fading rate of f_d T = 0.02. For ρ = 0.8, the difference between the two scenarios is minor. Only for tightly coupled antennas (ρ = 0.95) is a significant performance gain observed when the correlation is taken into account. Observe also the well-known result that less correlated channels exhibit better performance than more correlated ones.

CONCLUSIONS
By means of factor graph theory, we have derived an iterative algorithm for joint code-aided estimation and detection on a time-varying flat-fading MIMO channel with spatial correlation. The tightly coupled estimation and detection algorithms exchange messages in accordance with the SP algorithm. The estimation algorithm boils down to a Kalman smoother that uses soft-symbol information provided by the decoder. Since MIMO detection often involves iterative decoding, we can limit the computational overhead caused by the estimation by embedding the estimation stages into the detection stages. When the receive antennas do not exhibit correlation, we can split the Kalman smoother into a bank of parallel Kalman smoothers, which significantly reduces the complexity.
Simulation results have shown that a significant performance improvement (in terms of BER) is obtained by exploiting information from the unknown transmitted symbols, compared to estimation based on pilot symbols only. Also, ignoring the spatial correlation leads to only a minor performance degradation, as long as the correlation is not too high.

Figure 3: Factor graph representation of p(b, A, H | Y), up to a multiplicative constant. The grey area is shown in detail in Figure 4.

Figure 4: Details of the grey area from Figure 3, including messages.

Figure 5: Details of the grey area from Figure 3, when the receive antennas are uncorrelated (Σ_R = I).

Figure 6: Comparison of the estimated channel and the true channel, in a convolutionally encoded system with f_d T = 0.02, E_b/N_0 = 6 dB, and 10% pilot symbols (Σ_T = Σ_R = I).

Figure 7: BER performance of the convolutional code with f_d T = 0.005 and 5% pilot symbols (left) and f_d T = 0.02 and 10% pilot symbols (right).