Linear precoding in distributed MIMO systems with partial CSIT

We study the transmission problem in a distributed multiple-input multiple-output (MIMO) system consisting of several distributed transmitters and a common receiver. Assuming partial channel state information at the transmitter (CSIT), we propose a low-cost weighted channel matching and scattering (WCMS) linear precoding strategy. The proposed precoder can be decomposed into two parallel modules: channel matching (CM) and energy scattering. The signals generated by the CM modules from different transmitters provide a coherent gain with improved power efficiency. The use of the scattering modules provides robustness against CSIT uncertainty. By properly combining these two modules, WCMS can achieve coherent gain proportional to the accuracy of the available CSIT as well as robustness against CSIT error. WCMS is simple and fully decentralized and thus is highly suitable for a distributed MIMO system. Numerical results demonstrate that WCMS indeed achieves significant gains in distributed MIMO environments with partial CSIT.


Introduction
Consider a distributed multiple-input multiple-output (MIMO) system involving several distributed transmitters and a common receiver [1-8]. All the transmitters share the same information to be transmitted. Every transmitter is equipped with multiple antennas, and so is the receiver. Thus, the channel between a transmitter and the receiver forms a local MIMO link, and the overall system consists of several such local links. Examples of this system include distributed MIMO systems [1-4] and relaying systems in which parallel relays serve the same destination after recovering the data from the source [5-8].
If full channel state information at the transmitter (CSIT) for all the local links is available at every transmitter, the overall system can be regarded as an equivalent MIMO system and optimized using a centralized strategy [9,10]. However, this full CSIT assumption can be very costly due to its heavy requirement on feedback links. The problem becomes very stringent when the total number of antennas involved is large (e.g., in a so-called large-scale MIMO setting [11,12]).
Partial CSIT is a more realistic assumption. In particular, each transmitter can acquire the CSIT of its local reverse link (from the receiver) using channel estimation. With time division duplex (TDD) and based on channel reciprocity, such CSIT can also be used for the local forward link [13]. This can greatly reduce the burden on the feedback requirement, but then a decentralized transmission method is necessary. The problem becomes more complicated when the CSIT on each local link is not reliable, which can be caused by, e.g., channel variation due to high mobility [14]. Improving channel estimation accuracy alone cannot solve the problem completely.
The standard singular value decomposition (SVD)-based techniques [15] do not perform well when CSIT is not reliable. Conventional space-time coding techniques [16,17], on the other hand, are not efficient in making use of the available CSIT. It is a challenging task to develop transmission techniques that exploit partial CSIT efficiently and, at the same time, are robust against CSIT error.
Linear precoding techniques have been widely studied for MIMO systems, mostly in centralized scenarios [14,18-24]. Optimization techniques have been applied to provide robust performance in the case of imperfect CSIT [14,23,24]. These precoding techniques usually involve complicated matrix operations, such as matrix inversion, QR decomposition, or SVD. The related complexity is quite high, which can be a major obstacle in practice, especially for a large-scale MIMO setting.
In this article, we study a weighted channel matching and scattering (WCMS) strategy for distributed MIMO systems with partial CSIT. It involves a linear precoder at each transmitter that can be decomposed into two parallel modules, one for channel matching (CM) and the other for energy scattering. The signals generated by the CM modules from different transmitters add coherently at the receiver, resulting in a coherent gain with improved power efficiency. On the other hand, the use of the scattering modules provides robustness against CSIT uncertainty. By properly combining these two modules, we can achieve the coherent gain proportional to the accuracy of the available CSIT as well as robustness against CSIT error. The WCMS scheme is fully decentralized since each transmitter only needs to know the CSIT about its own link. The scheme is also very simple and there is no need for SVD or matrix inversion, and thus it is highly suitable for large-scale MIMO systems.
Our discussions will be based on both mutual information analysis and numerical simulation. For the former, the channel capacity of the distributed MIMO system under consideration is still unknown. We will show that the WCMS scheme can perform close to a reference centralized scheme. We conjecture that this centralized scheme can perform close to or even beyond the capacity of the decentralized system, which implies that the WCMS performance is not far from the optimum. For numerical simulation, we focus on an iterative linear minimum mean squared error (LMMSE) detection technique. An extra precoding stage is introduced to reduce the potential performance loss in practically coded systems due to the fluctuation in symbol reliabilities. Simulation results show that the proposed scheme indeed achieves significant gain in distributed MIMO environments with partial CSIT.
The remainder of the article is organized as follows. Section 2 introduces the system model and Section 3 provides mutual information analysis results for centralized MIMO systems with imperfect CSIT. In Section 4, a distributed WCMS scheme is presented and its outage performance is analyzed. The implementation of the WCMS method in a low-density parity-check (LDPC)-coded system with iterative detection is discussed in Section 5. Section 6 concludes the article.

System model
Consider $K$ transmitters $\{T_1, \ldots, T_k, \ldots, T_K\}$ serving a common receiver cooperatively to deliver a common information sequence $d$, as illustrated in Figure 1. The receiver has $N_r$ antennas and each transmitter has $N_t$ antennas. The received signal is given by
$$r = \sum_{k=1}^{K} H_k y_k + \eta, \qquad (1a)$$
where $r$ is an $N_r \times 1$ received signal vector, $H_k$ is the $N_r \times N_t$ channel transfer matrix between the transmitter $T_k$ and the receiver, $\eta$ is a vector of complex additive white Gaussian noise samples with zero mean and covariance $\sigma^2 I$, and $y_k$ is an $N_t \times 1$ signal vector transmitted by $T_k$ with zero mean and a power constraint $P_k$, i.e.,
$$\mathrm{tr}\{Q_k\} \le P_k, \qquad (1b)$$
where $\mathrm{E}[\cdot]$ is the expectation operation and $Q_k = \mathrm{E}[y_k y_k^H]$ is the transmission covariance matrix of $T_k$. Note that all $\{y_k\}$ in (1) carry the same information sequence $d$, as mentioned previously. This is different from the transceiver design in the conventional multiple access channel, where different transmitters have different information sequences to transmit [25,26].
We assume that partial channel state information (CSI) is available at the transmitters, and full CSI is available at the receiver. Following [14,27], we model the link between $T_k$ and the receiver as in (2), where $H_k$ is the actual channel transfer matrix and is perfectly known at the receiver, $\bar{H}_k$ and $\tilde{H}_k$ represent, respectively, the known and unknown parts of the channel matrix at $T_k$, and $\alpha_k \in [0,1]$ is a measure of the CSIT quality. In practice, $\bar{H}_k$ in (2) can be obtained by channel estimation, and $\tilde{H}_k$ is related to channel variation during the channel estimation period [14]. In TDD systems, $\bar{H}_k$ can be estimated from the signal received in the last frame, and $\alpha_k$ in this case is the correlation coefficient of the channel matrices in two adjacent frames [27]. Throughout the article, we assume that the entries of $\{\bar{H}_k\}$ and $\{\tilde{H}_k\}$ are independent and identically distributed (i.i.d.) circularly symmetric complex Gaussian random variables with zero mean and unit variance.
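As a minimal sketch of this partial-CSIT model, the following Python snippet draws channel matrices under one assumed form of (2), namely $H_k = \alpha_k \bar{H}_k + \sqrt{1-\alpha_k^2}\,\tilde{H}_k$. This weighting is an assumption, chosen because it keeps unit-variance entries and makes $\alpha_k$ the correlation between the actual and the known channel; the function names are illustrative only.

```python
import math
import random


def crandn(rng):
    """One circularly symmetric complex Gaussian sample (zero mean, unit variance)."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def partial_csit_channel(n_r, n_t, alpha, rng):
    """Return (H, H_bar) for one link.

    ASSUMED model: H = alpha * H_bar + sqrt(1 - alpha^2) * H_tilde,
    with i.i.d. unit-variance entries in H_bar and H_tilde.
    """
    w = math.sqrt(1.0 - alpha * alpha)
    H_bar = [[crandn(rng) for _ in range(n_t)] for _ in range(n_r)]
    H = [[alpha * hb + w * crandn(rng) for hb in row] for row in H_bar]
    return H, H_bar


def empirical_stats(alpha, trials=5000, seed=0):
    """Estimate E|h|^2 and E[h * conj(h_bar)] over all entries of many draws."""
    rng = random.Random(seed)
    power, corr, count = 0.0, 0.0 + 0.0j, 0
    for _ in range(trials):
        H, H_bar = partial_csit_channel(2, 2, alpha, rng)
        for row, row_bar in zip(H, H_bar):
            for h, hb in zip(row, row_bar):
                power += abs(h) ** 2
                corr += h * hb.conjugate()
                count += 1
    return power / count, corr / count
```

Calling `empirical_stats(0.8)` should return an average entry power near 1 and a correlation near 0.8, which is the sense in which $\alpha_k$ measures CSIT quality under this assumed form.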
The following is a basic assumption in this article (referred to as the distributive assumption).

Centralized transmissions
The capacity of the distributed system in (1) is generally unknown. To circumvent the difficulty, we first remove the distributive assumption temporarily and consider several centralized transmission strategies in this section. We then use the obtained results as an approximate upper bound to evaluate the performance of the distributed transmission strategy developed in the next section.
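With centralized control, the overall system can be regarded as an equivalent MIMO system. For simplicity, we assume here a common CSIT quality index $\alpha$ for all $\{T_k\}$, i.e., $\alpha_1 = \cdots = \alpha_K = \alpha$. Substituting (2) into (1a), we obtain the equivalent model (3), in which the received signal is $r = Hy + \eta$ with $H = [H_1, \ldots, H_K]$ and $y = [y_1^T, \ldots, y_K^T]^T$, and the known part of $H$ is $\bar{H} = [\bar{H}_1, \ldots, \bar{H}_K]$. We assume that $\bar{H}$ is known by all transmitters, which implies a centralized transmission control. We also assume a total power constraint for $y$ as
$$\mathrm{tr}\{Q\} \le P, \qquad (4)$$
where $Q = \mathrm{E}[yy^H]$ and $P$ is the total power constraint. Take the decomposition $Q = FF^H$. Then $Q$ can be realized by a linear precoding scheme as
$$y = Fx. \qquad (5)$$
Here we assume that $x$ is a coded sequence with i.i.d. symbols with $\mathrm{E}[x] = 0$ and $\mathrm{E}[xx^H] = I$.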

Optimal centralized transmission with perfect CSIT
Let us first consider the extreme case of $\alpha = 1$. (We will consider $0 \le \alpha < 1$ in Section 3.3.) In this case, the CSIT is perfect and we have $H = \bar{H}$. Consider the eigen-decomposition $M = V \Lambda V^H$ of $M = \bar{H}^H \bar{H}$, with $V$ a unitary matrix consisting of the eigenvectors of $M$ and $\Lambda$ a diagonal matrix consisting of the eigenvalues of $M$. Then the optimal precoder is given by the so-called water-filling (WF) precoder (6a). $W$ in (6a) is a diagonal matrix obtained by WF [28] over the diagonal entries of $\Lambda$, where $w(i,i)$ and $\lambda(i,i)$ are, respectively, the $i$th diagonal elements of the matrices $W$ and $\Lambda$, $[a]^+ \equiv \max\{0, a\}$, and $\mu$ is a constant chosen to meet the total power constraint $P$.
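The water-filling step can be sketched in a few lines of Python. This is the textbook allocation $p_i = [\mu - \sigma^2/\lambda_i]^+$ with $\sum_i p_i = P$, not necessarily the exact normalization used for $W$ in (6a); whether $W$ holds these powers or their square roots is not assumed here. The function name `water_filling` and its arguments are illustrative.

```python
def water_filling(eigenvalues, total_power, noise_var=1.0):
    """Per-eigenmode powers p_i = max(0, mu - noise_var / lambda_i), sum(p) = total_power.

    eigenvalues: non-negative eigenvalues lambda_i of H^H H.
    Modes with a zero eigenvalue receive no power.
    """
    # "Floor heights" noise_var / lambda_i, strongest eigenmodes first.
    floors = sorted(noise_var / lam for lam in eigenvalues if lam > 0)
    if not floors or total_power <= 0:
        raise ValueError("need positive power and at least one positive eigenvalue")

    # Try to activate all modes, then drop the weakest until the water
    # level mu lies above the floor of every active mode.
    mu = 0.0
    for m in range(len(floors), 0, -1):
        mu = (total_power + sum(floors[:m])) / m
        if mu > floors[m - 1]:
            break

    return [max(0.0, mu - noise_var / lam) if lam > 0 else 0.0
            for lam in eigenvalues]


powers = water_filling([4.0, 1.0, 0.25], total_power=1.0)
```

In this example the weakest eigenmode is switched off and the remaining power is split in favor of the strongest mode, which is the behavior the CM precoder is later argued to approximate.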

Suboptimal CM precoder
We now consider the following suboptimal CM precoder:
$$y = c\,\bar{H}^H x, \qquad (7)$$
where $\bar{H}^H$ is the conjugate transpose of the channel matrix $\bar{H}$ and $c$ is a constant for power adjustment. Clearly, this is a very simple solution. Figure 2 compares its average mutual information performance (based on Gaussian signaling) with WF for Rayleigh-fading MIMO channels. The channels are assumed to be unchanged during a frame but change independently from frame to frame. The performance in Figure 2 is averaged over different frames. It is seen that the CM precoder can achieve performance close to the capacity in the median rate region. This observation can be explained as follows. Using a decomposition of $\bar{H}$ with diagonal factor $\bar{D}$, (7) can be rewritten as (8). Equation (8) can also be obtained by replacing $W$ in the optimal WF precoder (6a) by $c\bar{D}$, up to a transform that does not affect mutual information. Therefore, the performance difference between (6a) and (8) is caused only by the difference between $W$ and $c\bar{D}$. Like $W$, $c\bar{D}$ steers more energy to the good eigenmodes. This implies that $c\bar{D}$ has an approximate WF effect similar to $W$ and explains the close performance of (6a) and (8).
For comparison, we also show in Figure 2 the results of equal power (EP) allocation, in which the transmitted signal is given by $y = c'x$ with $c'$ a constant to meet the power constraint. It is seen that CM outperforms EP in the low-to-median rate regions. When the rate becomes very high, EP becomes superior. This is because EP is asymptotically optimal when the rate $R \to \infty$. In practice, the median rate region (e.g., $R \approx \min\{N_t, N_r\}$) is of most interest, which indicates the attractiveness of the CM option.

Centralized control with partial CSIT
As illustrated in Figure 2, the CM precoder performs well when CSIT is reliable. On the other hand, when there is no CSIT (i.e., $\alpha = 0$), according to the conjecture in [29], the precoding operation given below is optimal (for a sufficiently small outage probability):
$$y = c'x, \qquad (9)$$
with $c'$ a constant to meet the power constraint. The signal power in the above scheme is scattered evenly over all directions and so we will refer to it as scattering precoding.
In the general case of $0 < \alpha < 1$, the optimal precoder structure that achieves the ergodic capacity of the system in (3) (under the assumption that the entries of $\{\tilde{H}_k\}$ are i.i.d. circularly symmetric complex Gaussian random variables) is given by $y = V\Omega x$, where $V$ is a unitary matrix determined by the known channel part $\bar{H}$ and $\Omega$ is a real diagonal matrix [27].
To the best of the authors' knowledge, the optimal precoder structure that achieves the outage capacity is still an open problem, even for centralized transmission. The following conjecture is given to circumvent the difficulty.

Conjecture 1:
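The optimal precoder $F$ in (5) that achieves the outage capacity of the system in (3) is given by $F = V\Omega$, where $V$ is a unitary matrix determined by the known channel part $\bar{H}$ and $\Omega$ is a real diagonal matrix.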
Under the above conjecture, to find the outage capacity, we only need to find the real diagonal matrix Ω, which is a simpler task than finding a full matrix F. It can be done by exhaustive search when the size of H is relatively small. However, the search becomes complicated as the size of H increases.
Alternatively, consider the suboptimal but simple WCMS precoder defined in (10), where $\beta \in [0, 1]$ is a variable to be optimized according to the CSIT quality $\alpha$ and the power constraint $P$, and $c$ is a constant for power adjustment. In (10), $x_0$ and $x_d$ are two statistically independent coded sequences obtained by encoding the same information sequence $d$ with different coding schemes, and both of them consist of i.i.d. symbols with zero mean and unit variance. Here, the use of $x_0$ provides a beamforming effect as it steers more energy to the good eigenmodes. On the other hand, the use of $x_d$ provides a scattering effect since it scatters the energy evenly over all eigenmodes. Clearly, the scheme in (10) provides a weighted combination of these two effects.
Figure 2 The average mutual information performance of WF and CM precoders in centralized MIMO systems. SNR = $P/\sigma^2$, $K = 1$, and $\alpha = 1$.
Since $x_0$ and $x_d$ in (10) are uncorrelated, $Q$ in (4) is given by (11), a weighted combination of $Q_{CM}$ and $Q_{ES}$, where $Q_{CM}$ and $Q_{ES}$ are, respectively, the transmission covariance matrices for the CM and energy scattering schemes in (7) and (9). Clearly, the scheme in (10) reduces to (7) and (9), respectively, when $\beta = 1$ for $\alpha = 1$, and $\beta = 0$ for $\alpha = 0$. For a general $\alpha$, the optimal $\beta$ can be obtained by a one-dimensional exhaustive search. This search is based on channel statistical distributions and so the complexity involved is modest. In practice, it can be performed at the receiver and then the result is fed back to the transmitters.
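The one-dimensional search over $\beta$ can be illustrated with a small Monte Carlo sketch. Several simplifying assumptions are made here that go beyond the text: a single-antenna receiver (so the mutual information is a scalar expression), the partial-CSIT form $h = \alpha\bar{h} + \sqrt{1-\alpha^2}\,\tilde{h}$, a covariance $Q = \beta Q_{CM} + (1-\beta) Q_{ES}$ with both terms normalized to trace $P$ in every frame, and ties between equal outage values broken toward the larger $\beta$. All function names and default parameters are hypothetical.

```python
import math
import random


def crandn(rng):
    """Circularly symmetric complex Gaussian sample, zero mean, unit variance."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def draw_channels(n_t, alpha, trials, seed=1):
    """Draw (h, h_bar) row-vector pairs under the ASSUMED partial-CSIT model."""
    rng = random.Random(seed)
    w = math.sqrt(1.0 - alpha * alpha)
    pairs = []
    for _ in range(trials):
        h_bar = [crandn(rng) for _ in range(n_t)]
        h = [alpha * hb + w * crandn(rng) for hb in h_bar]
        pairs.append((h, h_bar))
    return pairs


def outage(beta, pairs, power, rate, noise_var=1.0):
    """Fraction of frames whose mutual information falls below `rate`."""
    fails = 0
    for h, h_bar in pairs:
        norm2 = sum(abs(x) ** 2 for x in h_bar)
        # Channel matching: all power sent along h_bar^H / ||h_bar||.
        matched = abs(sum(a * b.conjugate() for a, b in zip(h, h_bar))) ** 2 / norm2
        # Energy scattering: power spread evenly over the transmit antennas.
        scattered = sum(abs(x) ** 2 for x in h) / len(h)
        snr = power * (beta * matched + (1.0 - beta) * scattered) / noise_var
        if math.log2(1.0 + snr) < rate:
            fails += 1
    return fails / len(pairs)


def best_beta(alpha, power=4.0, rate=2.0, n_t=4, trials=2000, grid=11):
    """Exhaustive search over a uniform beta grid using common channel draws."""
    pairs = draw_channels(n_t, alpha, trials)
    betas = [i / (grid - 1) for i in range(grid)]
    outs = [outage(b, pairs, power, rate) for b in betas]
    # Lowest outage wins; ties go to the larger beta (an arbitrary choice).
    k = min(range(grid), key=lambda i: (outs[i], -betas[i]))
    return betas[k], outs[k]
```

Under these assumptions, `best_beta(1.0)` selects pure channel matching, while for `alpha = 0` the scattering end of the grid gives a lower outage than `beta = 1`, in line with the two extreme cases discussed above.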
Replacing the CM precoder in (10) with the WF one, we can obtain a weighted WF and scattering (WWFS) precoder, given in (12). Compared with (10), the WWFS approach in (12) involves SVD and thus is more complicated than the WCMS approach. Despite its higher complexity, the former cannot be guaranteed to outperform the latter except in the extreme cases of $\alpha = 1$ and $\alpha = 0$, as seen from the numerical results provided in Figure 3. This is possible since both approaches are suboptimal in general.
Figure 3 compares the outage performance of the WCMS scheme (assuming Gaussian signaling) with the outage capacity (under Conjecture 1) obtained using full search (FS) for a centralized MIMO channel. The channels are assumed to be unchanged in a frame but change independently from frame to frame. The instantaneous mutual information of each frame is calculated and compared with the target rate $R$ to obtain the outage probability. The curves are obtained based on the system in (3) without the distributive assumption. Therefore, the outage probability in Figure 3 is a lower bound on the outage probability of the distributive system in (1). It is seen that the WCMS performance is quite close to the outage capacity. We have made similar observations for other system parameters within the range of our computing power. We expect that WCMS also performs well in more general cases, but we do not have a rigorous justification so far.
As it is very time consuming to compute the outage capacity for relatively large system sizes (even using Conjecture 1), we will use the centralized WCMS performance as a coarse estimate of the lower bound of the outage performance of the distributed schemes in our discussions hereafter.

Distributed linear precoding and outage analysis
In this section, we discuss a design technique for distributed linear precoding and study its theoretical performance based on mutual information analysis.

Distributed linear precoding
Return to the system in (10). Let us segment $y$, $\bar{H}$, and $x_d$ as $y = [y_1^T, \ldots, y_K^T]^T$, $\bar{H} = [\bar{H}_1, \ldots, \bar{H}_K]$, and $x_d = [x_1^T, \ldots, x_K^T]^T$. Then the transmitted signal $y_k$ from $T_k$ is given by (13a). This construction almost meets Assumption 1, since only $\bar{H}_k$ is required to generate $y_k$. A subtle point is that the power constraint $\mathrm{tr}\{Q\} \le P$ in (4) involves global coordination. We can change (4) to the local constraints
$$\mathrm{tr}\{Q_k\} \le P_k, \quad k = 1, \ldots, K. \qquad (13b)$$
Correspondingly, we rewrite (13a) as (13c), where $c_k$ is a constant to meet the power constraint $P_k$ at $T_k$. Recall that the common weighting coefficient $\beta$ for all $\{T_k\}$ in (13c) results from the assumption that the CSIT quality $\alpha$ is the same for all $\{T_k\}$. In a more general case, $\alpha$ may vary with $T_k$, and so does $\beta$. In the latter case, we rewrite (13c) as (13d), in which $\beta$ is replaced by a transmitter-specific $\beta_k$. Then the transmission scheme becomes fully distributed. We will refer to the scheme in (13d) as the distributed WCMS precoding.

Coherent gain
A distributed WCMS precoder can provide the so-called coherent gain when $\alpha_k > 0$, $\forall k$, as analyzed below. With the matrices $\{\bar{A}_k\}$ defined from the known channel parts $\{\bar{H}_k\}$, it follows from (1), (2), and (13d) that the received signal can be written as in (14). We call $r_{coh}$ the coherent component of $r$. It is important to note from the definition that all $\{\bar{A}_k\}$ are Hermitian and positive semi-definite. The terms in (16) satisfy the inequalities (17a) and (17b). The inequality in (17a) can be verified from the definition of the trace [30]. The inequality in (17b) holds since $\{\bar{A}_k\}$ are Hermitian and positive semi-definite, so that the trace of the product of any two of them is lower-bounded by a sum of products of their eigenvalues [30], where $\lambda_n(A)$ denotes the $n$th eigenvalue of a matrix $A$ in increasing order. From (17a) and (17b), all the terms in (16) are non-negative and add "coherently", which is the cause of the coherent gain. We can see such gain more clearly in the special case of $N_r = N_t = 1$, when $\bar{A}_k$ reduces to a scalar. We emphasize that the coherent gain is the consequence of the inequality in (17b), which holds for positive semi-definite Hermitian matrices. Such gain is highly desirable from the power efficiency point of view, which is a noticeable advantage of CM precoding. We will demonstrate this advantage using numerical results in Section 4.3. The key to such gain is that $\{\bar{A}_k\}$ in (16) are positive semi-definite.
Coherent gain can also be demonstrated by examining the system capacity in the special case of $\beta_1 = \cdots = \beta_K = 1$ for $\alpha_1 = \cdots = \alpha_K = 1$. In this case, (14) simplifies, and the capacity $C$ is given by (20). Let $C_k$ denote the capacity of the single link from transmitter $T_k$ alone. If the matrices $\{\bar{A}_k\}$ were not positive semi-definite, there would be a possibility that $C < C_k$, and distributed cooperation would not necessarily lead to a capacity improvement. On the other hand, (20) can be rewritten as (21) [30]. Furthermore, when $\{\bar{A}_k\}$ are all positive semi-definite, so is their sum, which gives (23). From (21) and (23), CM precoding guarantees that $C \ge C_k$, which implies a coherent effect.
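The scalar case $N_r = N_t = 1$ gives a quick numerical feel for the coherent gain. The sketch below assumes perfect CSIT, pure channel matching, and unit scaling constants (power normalization is ignored), so that transmitter $T_k$ contributes a received amplitude $a_k = |h_k|^2$ with zero phase; the channel values are arbitrary illustrative numbers. It compares the resulting power with the average power when the same amplitudes arrive with random phases.

```python
import cmath
import math
import random


def received_power(amplitudes, phases):
    """Power of sum_k a_k * exp(j * theta_k) for a unit-power symbol."""
    total = sum(a * cmath.exp(1j * t) for a, t in zip(amplitudes, phases))
    return abs(total) ** 2


# Illustrative scalar channels for K = 3 transmitters (arbitrary values).
h = [1 + 1j, 0.5 - 2j, -1.5 + 0.5j]

# Channel matching: T_k sends conj(h_k) * x, which arrives as |h_k|^2 * x.
a = [abs(x) ** 2 for x in h]
coherent = received_power(a, [0.0] * len(a))

# Reference: the same amplitudes with independent random phases.
rng = random.Random(0)
trials = 20000
noncoherent = sum(
    received_power(a, [rng.uniform(0.0, 2.0 * math.pi) for _ in a])
    for _ in range(trials)
) / trials
```

Here `coherent` equals $(\sum_k a_k)^2$, while `noncoherent` is close to $\sum_k a_k^2$; the difference is the sum of non-negative cross terms $a_k a_{k'}$, the scalar counterpart of the trace inequality in (17b).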

Numerical results
Now we present some numerical results to demonstrate the performance of the proposed scheme. Figure 4 provides the outage performance of distributed WCMS for Rayleigh fading and common CSIT quality. Centralized WCMS is used as the reference. From Figure 4, the distributed and centralized WCMS schemes have quite close performance. Recall (from Figure 3) that centralized WCMS can perform close to the capacity with global CSIT. (The performance of centralized WWFS is included in Figure 4 for reference.) It is thus reasonable to conjecture that the distributed WCMS performance is also not far from the optimum, although we are not able to provide a more rigorous analysis.
Figure 5 provides the outage performance of distributed WCMS considering large-scale fading and individual CSIT quality. The number of transmitters is $K = 3$. For simplicity, we fix the large-scale fading factor seen by $T_k$ to be a constant $g_{LS}(k)$, $k = 1, 2, \ldots, K$. (The values of $\{g_{LS}(k)\}$ are given in the caption of Figure 5.) We assume that $\{g_{LS}(k)\}$ are known by all transmitters and the power constraints $\{P_k\}$ can be adjusted correspondingly. For distributed WCMS, this power allocation is given by
$$P_k = P \cdot g_{LS}(k) \Big/ \sum_{k'=1}^{K} g_{LS}(k'), \quad k = 1, 2, \ldots, K.$$
The Rayleigh fading factors are the same as in Figure 4. Two search methods are used to find $\{\beta_k\}$ in (13d). Method 1 is a $K$-dimensional exhaustive search. Method 2 assumes that $\beta_k/\beta_1 = \alpha_k/\alpha_1$, $k = 2, \ldots, K$, and finds the optimal $\beta_1$ using an exhaustive search. From Figure 5, we can see that the two methods obtain almost the same performance.
Next we consider the performance of the proposed scheme when the CSIT quality varies. For simplicity, we assume that $\alpha_k = \alpha$ for all $\{T_k\}$. Figure 6 provides the outage performance of distributed WCMS considering large-scale fading and common CSIT quality. The system settings are the same as those in Figure 5 except that the same CSIT quality index $\alpha$ is assumed for all $\{T_k\}$. We can make observations from Figure 6 similar to those from Figure 4: the distributed WCMS obtains a significant performance improvement as the CSIT quality increases, and it performs close to centralized WCMS.
Figure 7 examines the performance of distributed WCMS in a multicarrier OFDM system. The system settings are as follows. The number of transmitters is $K = 3$. The large-scale fading factors are the same as in Figure 6. For the Rayleigh fading, each transmitter sees a MIMO ISI channel with $N_t = 2$ transmit antennas, $N_r = 2$ receive antennas, and $L = 4$ delay taps that are uniformly distributed. The total average gain over the $L$ paths between each transmit-receive antenna pair is normalized to 1. The number of OFDM subcarriers is $J = 32$. The extended channel model in Section 5.1 is adopted to represent the MIMO OFDM system with a single channel transfer matrix. Comparing Figures 6 and 7, we can see that the gaps between the distributed and centralized WCMS schemes are narrower in the multicarrier case than in the single-carrier one, implying that the performance loss incurred by the distributive requirement becomes smaller.
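The power allocation used in these experiments splits the total power in proportion to the large-scale fading factors, as also stated in the caption of Figure 9. A minimal sketch, with an illustrative function name and a total power of 1 chosen only for the example:

```python
def allocate_power(total_power, g_ls):
    """Split total_power across transmitters in proportion to g_LS(k)."""
    total_gain = sum(g_ls)
    return [total_power * g / total_gain for g in g_ls]


# Large-scale fading factors quoted for the K = 3 setting: 0.5, 1, 2.
p = allocate_power(1.0, [0.5, 1.0, 2.0])
```

With these factors the three transmitters receive 1/7, 2/7, and 4/7 of the total power, so the transmitter with the strongest large-scale gain transmits with the most power.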

Transmission in systems with practical coding
The precoder design discussed in Section 4 is assessed by mutual information analysis. In this section, we will consider practical coding and decoding issues. Recall that $y_k$ in the distributed WCMS precoder in (13) is generated from the coded sequences $\{x_k\}$, which must meet a set of requirements, (i)-(iii). Meeting these requirements alone is not sufficient to guarantee good performance in practically coded systems. For example, we may simply generate $\{x_k\}$ by segmenting a single FEC-coded sequence $c$. In this way, different elements in $c$ undergo different channel conditions and suffer from different levels of interference. Such variation of channel conditions may have an adverse effect on performance. Below, we outline a treatment of the problem using an extra stage of linear precoding [31,32].

Diversity linear precoding
We first introduce an extended channel model as follows. Let $H_k^{(t)}$, $t = 1, \ldots, T$, be the transfer matrix of the MIMO channel seen by the transmitter $T_k$ in the $t$th channel use. We define an extended channel model, for which $H_k$ is obtained by sampling the channel $T$ times as
$$H_k = \mathrm{diag}\{H_k^{(1)}, \ldots, H_k^{(T)}\}. \qquad (25)$$
In practice, $H_k$ can be obtained by using the MIMO channel $T$ times consecutively in the time domain or over $T$ subcarriers in the frequency domain [31,32]. Clearly, all the precoding techniques discussed so far can be applied to $H_k$ defined in (25). The size of the extended $H_k$ is $N_r T \times N_t T$. The matrix augmentation in (25) does not lead to a significant complexity increase, as explained in Section 5.2 later.
With the extended channel model, we can ensure that all the elements in c will undergo the same channel condition, which leads to more robust performance. We will discuss the related details in the subsequent sections.
Let $c$ be a length-$(KN_t + N_r)T$ coded sequence generated by the underlying FEC coding scheme using $d$ as its input. Without loss of generality, the average power per entry of $c$ is normalized to 1. With random interleaving, we can treat $c$ as a sequence with i.i.d. symbols [33-37]. Now we generate $\{x_k\}$ by an extra stage of linear precoding of $c$ as follows:
$$x_k = S_k P c, \quad k = 0, 1, \ldots, K, \qquad (26a)$$
where $\{S_k = I_T \otimes s_k\}$ are used to segment the vector $Pc$, with $\otimes$ being the Kronecker product. We construct $s_0$ and $s_k$, $k = 1, 2, \ldots, K$, as selection matrices, each containing one identity block and zeros elsewhere. These matrices segment $Pc$ in a straightforward way. Note that the size of $s_0$ is $N_r \times (KN_t + N_r)$ while the size of each $s_k$, $k \ge 1$, is $N_t \times (KN_t + N_r)$. This is to match the sizes of $\bar{H}_k^H$ and $I_{N_t}$ in (13), respectively. [The Kronecker product $S_k = I_T \otimes s_k$ means that all the matrices $\{S_k\}$ are block diagonal, which is consistent with the extended channel model in (25).] It can be verified that $\{x_k\}$ defined in (26) meet requirements (i)-(iii) mentioned above. (Recall that $\{x_k\}$ are constructed from $c$, which is a coded sequence using the information sequence $d$ as input. Hence, all $\{x_k\}$ are related to the information sequence $d$.) We will explain the function of $P$ below.
The matrix $P$ in (26) is for diversity coding. In this article, we select $P$ to be a normalized Hadamard matrix of size $(KN_t + N_r)T \times (KN_t + N_r)T$ for the following reasons [31,32]. First, a normalized Hadamard matrix is unitary and hence of full rank. No capacity loss is caused by such $P$. Second, for a MIMO channel, different beams have different channel gains in general. The use of a Hadamard matrix can result in a diversity advantage. (We will return to this issue in Section 5.3.) Third, the fast Hadamard transform (FHT) [38] allows efficient calculation, which is discussed in Section 5.2.
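For illustration, the product $Pc$ can be computed without forming $P$ explicitly. The sketch below implements a normalized fast Hadamard transform for the Sylvester construction, which assumes the length $(KN_t + N_r)T$ is a power of two (for example, $K = 3$ and $N_t = N_r = 2$ give $8T$); the paper does not state which Hadamard construction is used, and the function name `fht` is illustrative.

```python
import math


def fht(x):
    """Normalized fast Hadamard transform (Sylvester ordering).

    Computes P x for the n-by-n normalized Hadamard matrix P in
    O(n log n) additions, where n = len(x) must be a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    y = list(x)
    h = 1
    while h < n:
        # Butterfly stage: combine entries that are h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = y[i], y[i + h]
                y[i], y[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in y]


c = [1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0]
x = fht(c)          # spread symbols, same total energy as c
c_back = fht(x)     # the normalized transform is its own inverse
```

Because the normalized Sylvester-Hadamard matrix is symmetric and unitary, applying `fht` twice returns the original sequence, and the energy of `x` equals that of `c`; this is the sense in which $P$ adds no redundancy and causes no capacity loss.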
The diversity effect provided by P follows the discussions in [39,40] in essence. A main feature of the approach taken in this article is the use of {F k } in Section 4 to best exploit the available CSIT so as to further improve the performance.

Detection principles
Substituting (26) and (13d) into (1), we have
$$r = Ac + \eta, \qquad (27)$$
where $A$ is the overall linear transform matrix. The optimal detection of $c$ involves both the linear channel constraint and the nonlinear FEC coding constraint. This means a very high complexity in general. A suboptimal way is to consider the two constraints separately, which incurs a certain performance loss. This loss can be compensated by the iterative detection outlined below.
The iterative detection consists of an elementary signal estimator (ESE) for the channel constraint and a decoder (DEC) for the FEC coding constraint [36], as illustrated in Figure 8. Based on the received signal $r$ and the a priori information of $c$ (in the form of mean $\mathrm{E}[c]$ and covariance $vI$), the ESE first performs the standard LMMSE detection of $c$, with output given by [41]
$$\hat{c} = \mathrm{E}[c] + vA^H R^{-1}\left(r - A\,\mathrm{E}[c]\right), \qquad (28)$$
where $R \equiv \mathrm{Cov}(r, r) = vAA^H + \sigma^2 I$. Then $c(i)$ can be estimated symbol-by-symbol using (28). The DEC performs standard a posteriori probability decoding using the output of the ESE as a priori information. The decoding output is used to update the values of $\mathrm{E}[c]$ and $v$ for the sequence $c$. Then the ESE performs LMMSE detection again. This process continues iteratively.
Figure 8 The structure of an iterative receiver.
From (28), we can see that the complexity of LMMSE detection mainly lies in the calculation of $R^{-1}$. It can be verified that $R$ is a block diagonal matrix with $T$ blocks and each block has a size of $N_r \times N_r$. This means that $R^{-1}$ can be calculated with a complexity of $O(T \cdot N_r^3)$ [42]. All the other operations in (28) can be performed by FHT or block-by-block. Hence, the related complexity is modest for practical systems.
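The block-diagonal structure of $R$ can be checked with a short derivation. The sketch below assumes that the overall transform in (27) factors as $A = BP$, where $B$ collects the extended channels, the WCMS precoders, and the segmentation matrices, and is block diagonal with $T$ blocks of $N_r$ rows each. The symbols $B$ and $B_t$ are introduced here for illustration only and do not appear in the original equations.

```latex
\begin{align*}
R &= v A A^H + \sigma^2 I
   = v\, B P P^H B^H + \sigma^2 I
   = v\, B B^H + \sigma^2 I
   && \text{(since } P P^H = I\text{)}, \\
B &= \mathrm{diag}\{B_1, \ldots, B_T\}
   \;\Longrightarrow\;
R = \mathrm{diag}\bigl\{ v B_t B_t^H + \sigma^2 I_{N_r} \bigr\}_{t=1}^{T}, \\
R^{-1} &= \mathrm{diag}\bigl\{ (v B_t B_t^H + \sigma^2 I_{N_r})^{-1} \bigr\}_{t=1}^{T}.
\end{align*}
```

Under this assumption, inverting $R$ amounts to $T$ independent $N_r \times N_r$ inversions of cost $O(N_r^3)$ each, which matches the stated $O(T \cdot N_r^3)$ complexity.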
Note that the above LMMSE detection is suboptimal for the ESE. However, combined with the iterative detection, it can obtain near-optimal performance as shown in [43].

Diversity gains
In general, the elements of the received vector $r = \{r(i)\}$ in (27) have different signal-to-interference-plus-noise ratios (SINRs). This is caused by the fluctuation of channel gains and interference levels in different antenna-to-antenna links. Consequently, the elements of $\hat{c} = \{\hat{c}(i)\}$ in (28) also have different SINRs. Such SINR fluctuation may lead to a noticeable performance loss when $\{\hat{c}(i)\}$ are used as the inputs to a decoder.
A treatment of the problem is outlined in [31,32] using a linear precoding matrix. This is the function of the matrix multiplication $Pc$ in (26a). It is shown in [31,32] that all the LMMSE estimation outputs have equal SINR when $P$ is a Hadamard matrix with a sufficiently large size. This can be seen intuitively as follows. The $i$th column of $P$ can be regarded as a spreading sequence and so each $c(i)$ is spread into a sequence after the matrix multiplication $Pc$. This clearly has a diversity effect. Notice that there is no redundancy involved here since $P$ is a square unitary matrix. Also, the related precoding and detection complexity is kept low when a Hadamard matrix is used for $P$, thanks to the FHT. The related discussions can be found in [31,32].
Alternatively, we may also consider a conventional space-time code for diversity gain. However, this is not straightforward. Note that two stages of linear precoding are involved in (27). The first stage is the use of P and the second is the WCMS precoding discussed in Section 4. The linear operations in these two stages are combined to form the overall linear transform matrix A in (27). An efficient iterative LMMSE detection algorithm is available for the system in (27) based on FHT [31,32]. On the other hand, detection complexity may become a serious problem when a conventional space-time coding technique is combined with the linear precoding technique discussed in Section 4.

Simulation in LDPC-coded systems
We now present some numerical examples to demonstrate the efficiency of the proposed scheme. Figure 9 shows the simulated frame error rate (FER) performance of the proposed scheme in a distributed MIMO OFDM system. The system setting is the same as in Figure 7. The number of transmitters is $K = 3$ and the large-scale fading factors are $g_{LS}(1) = 0.5$, $g_{LS}(2) = 1$, $g_{LS}(3) = 2$. For Rayleigh fading, each transmitter sees a MIMO ISI channel with $N_t = 2$ transmit antennas, $N_r = 2$ receive antennas, and $L = 4$ delay taps. In simulation, the FEC code is a (3, 6) regular LDPC code followed by length-4 spreading, random interleaving, and QPSK modulation, and the transmission rate is $R = 2$ bits/symbol. The frame length is set to 4,096 information bits.
Figure 9 Simulated FER performance of the distributed WCMS precoding in a distributed MIMO OFDM system with large-scale fading and common CSIT quality $\alpha$. Transmission rate $R = 2$ bits/symbol. $N_t = N_r = 2$, $L = 4$, $J = 32$, $K = 3$, $P_k = P \cdot g_{LS}(k)/\sum_{k'} g_{LS}(k')$, $k = 1, 2, \ldots, K$. $g_{LS}(1) = 0.5$, $g_{LS}(2) = 1$, $g_{LS}(3) = 2$.
From Figure 9, we can see that the system performance improves progressively as the CSIT quality improves, and trends similar to those in Figure 7 can be observed. Compared with the no-CSIT case, a considerable performance gain of about 3.9 dB (at FER = $10^{-2}$) is obtained in the case of perfect CSIT, which is slightly less than the analytical result (i.e., 4.6 dB) in Figure 7. We expect that more sophisticated coding schemes can be used to improve performance and we are still working on this.

Conclusions
In this article, a WCMS linear precoding strategy is proposed for distributed MIMO systems with partial CSIT. The WCMS precoder consists of two parallel modules, one for CM to obtain a coherent gain, and the other for energy scattering to provide robustness against CSIT uncertainty. Both analytical and simulation results demonstrate that the proposed strategy can achieve the coherent gain proportional to the accuracy of the available CSIT as well as robustness against CSIT error. The strategy is very simple and fully decentralized and thus is highly suitable for distributed MIMO systems.