On base station cooperation using statistical CSI in jointly correlated MIMO downlink channels

This article studies the transmission of a single cell-edge user's signal using statistical channel state information (CSI) at cooperative base stations (BSs) under a general jointly correlated multiple-input multiple-output (MIMO) channel model. We first present an optimal scheme that maximizes the ergodic sum capacity under per-BS power constraints, revealing that the transmitted signals of all BSs are mutually independent and that the optimum transmit directions for each BS align with the eigenvectors of the BS's own transmit correlation matrix. Then, we employ matrix permanents to derive a closed-form tight upper bound for the ergodic sum capacity. Based on these results, we develop a low-complexity power allocation solution using convex optimization techniques and a simple iterative water-filling algorithm (IWFA). Finally, we derive a necessary and sufficient condition under which beamforming achieves capacity for all BSs. Simulation results demonstrate that the upper bound on the ergodic sum capacity is tight and that the proposed cooperative transmission scheme increases the downlink system sum capacity considerably.


Introduction
Multi-antenna systems, widely known as multiple-input multiple-output (MIMO) systems, have shown considerable gains in spectral efficiency and attracted much attention in recent years, e.g., [1]. There has also been strong interest in utilizing MIMO to cope with multiuser scenarios. However, achieving the theoretical capacity gains in practical cellular environments is problematic because of interference. More recently, base station (BS) cooperation [2][3][4][5][6][7][8][9][10] was proposed as a means to improve the performance of cell-edge users and mitigate the problem of inter-cell interference. This is largely motivated by the fact that BSs may be connected via a wired backbone, so that channel state information (CSI) can be shared among the BSs for coordinated transmission. Such BS cooperation in the downlink in particular leads to enormous throughput gains compared to conventional single-BS (or single-cell) signal processing, where the co-channel interference (often from other cells) is treated as noise.
Coordinated BS transmission in the downlink is often analyzed using a large MIMO Gaussian broadcast channel (BC) model, with the challenge of incorporating per-antenna or per-BS power constraints. The MIMO BC capacity region with a sum-power constraint has been well established in [11][12][13][14] using the uplink-downlink duality. The achievable rate region of the MIMO BC under per-antenna power constraints has also been studied in [15]. Recently, this result has been extended to cope with general linear transmit power constraints in [16,17]. If the dirty-paper-coding based optimal nonlinear precoder or the minimum-mean-squared-error based optimal linear precoder is used, the results in [15][16][17] can be directly applied to BS cooperation with per-BS power constraints. However, the gain offered by BS cooperation depends greatly on the level of CSI that can be exploited in the optimization. To investigate the full diversity and multiplexing benefits, most previous studies [15][16][17] assumed perfect CSI at the transmitter, but such gains may be severely offset by the expensive overhead of acquiring the CSI [18][19][20]. User mobility also increases the fading rate and makes accurate CSI difficult to maintain. For this reason, exploiting statistical CSI at the transmitter side is often more appealing because of its much lower overhead. The uplink and downlink statistical CSI are also usually reciprocal in both frequency-division-duplex and time-division-duplex systems [21,22].
Capacity analysis and transceiver designs using statistical CSI at the transmitter are highly dependent on the assumed channel model. The conventional modeling approach has been the Kronecker model [23][24][25], which separates the spatial correlation at the transmitter and receiver ends. Recent measurement campaigns, however, demonstrated that the mutual correlation between the transmitter and receiver may be important, which makes the Kronecker model inadequate [26,27]. The jointly correlated channel model, in contrast, not only accounts for the correlation at both ends, but also characterizes their mutual dependence. Recently, [28] derived a closed-form upper bound for the ergodic capacity of the jointly correlated MIMO channel. Beamforming is a simple linear precoding strategy in which the transmit covariance matrix has unit rank. The optimality of beamforming for some single-user MIMO channels has been studied for the Kronecker model [29,30], the double-scattering model [31], and the virtual representation model [32], when the transmitter has partial knowledge of the channel. These results have also been extended to the MIMO multiple access channel (MAC) in [32][33][34][35][36].
In this article, we aim to investigate coordinated downlink transmission with cooperative BSs assuming that the mobile user has perfect CSI but the BSs know only statistical CSI. Our main contribution is that the jointly correlated MIMO channel model in [26] is adopted to account for the spatial correlations of the antennas at the BSs and the user and between them. We first present an optimal transmission scheme to maximize the ergodic sum capacity of this channel with per-BS power constraints, from which two important results are revealed: (i) the transmit signals of all BSs are mutually independent; and (ii) the optimal transmit directions for each BS align with the eigen-directions of the BS's own transmit-side correlation matrix. We then employ matrix permanents to derive a closed-form tight upper bound for the ergodic sum capacity of the jointly correlated MIMO channel. Based on this bound, we propose an iterative power allocation algorithm using convex optimization techniques, which converges within only a few iterations. Also, we establish the beamforming optimality conditions for all the BSs. Our study for BS cooperation in the jointly-correlated MIMO downlink channel generalizes the result in [33].
The rest of this article is organized as follows. We present the system model in Section 2. In Section 3, we propose the capacity-achieving optimal transmit scheme and derive the ergodic sum capacity upper bound. Utilizing the bound, we then develop the optimal power allocation policies. Finally, we establish the beamforming optimality conditions for all BSs. Simulation results are presented in Section 4, and we conclude the article in Section 5.

Notations
We use uppercase and lowercase boldface letters to denote matrices and vectors, respectively. $\mathbf{I}_N$ is the $N \times N$ identity matrix, $\mathbf{0}$ denotes an all-zero matrix, and $\mathbf{1}$ denotes an all-one matrix. The matrix inequality $\succeq$ denotes positive semi-definiteness. The superscripts $(\cdot)^H$, $(\cdot)^T$, and $(\cdot)^*$ represent the conjugate-transpose, transpose, and conjugate operations, respectively. We use $E\{\cdot\}$ to denote expectation with respect to all random variables within the brackets, and $\mathbf{A} \odot \mathbf{B}$ to denote the Hadamard product of $\mathbf{A}$ and $\mathbf{B}$. We use $[\mathbf{A}]_{kl}$ or the lowercase representation $a_{kl}$ to denote the $(k,l)$th entry of $\mathbf{A}$, and $a_k$ denotes the $k$th entry of the column vector $\mathbf{a}$. The operators $\mathrm{tr}(\cdot)$, $\det(\cdot)$, and $\mathrm{Per}(\cdot)$ represent the matrix trace, determinant, and permanent, respectively, and $\mathrm{diag}(\mathbf{x})$ denotes a diagonal matrix with $\mathbf{x}$ along its main diagonal.

System model
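We consider a downlink cellular network consisting of $m$ BSs, labeled as BS$_1$, ..., BS$_m$, which are equipped with $N_t^{(1)}, \ldots, N_t^{(m)}$ antennas, respectively, and a single cell-edge user with $N_r$ antennas. To improve the performance of the cell-edge user, the BSs are connected by a wired backbone that allows information to be reliably exchanged among them. At the mobile user in the baseband, the received signal can be written in vector form as

$$\mathbf{y} = \sum_{i=1}^{m} \sqrt{\Gamma_i}\,\mathbf{H}_i \mathbf{x}_i + \mathbf{n}, \qquad (1)$$

where $\mathbf{x}_i$ is the $N_t^{(i)} \times 1$ signal vector transmitted by BS$_i$; $\mathbf{n}$ is the $N_r \times 1$ noise vector whose entries have variance $N_0$; $\mathbf{H}_i$ is the $N_r \times N_t^{(i)}$ MIMO channel matrix between BS$_i$ and the user; and $\Gamma_i$ denotes the large-scale fading between BS$_i$ and the user. It is assumed that $\mathbf{x}_i$ and $\mathbf{H}_i$ satisfy the following power constraints:

$$E\{\mathbf{x}_i^H \mathbf{x}_i\} = P_i \qquad (2)$$

and

$$E\{\mathrm{tr}(\mathbf{H}_i \mathbf{H}_i^H)\} = N_r N_t^{(i)}. \qquad (3)$$

We find it useful to define the total transmitted power as $P \triangleq \sum_{i=1}^{m} P_i$, the total number of transmit antennas as $N_t \triangleq \sum_{i=1}^{m} N_t^{(i)}$, and the transmit signal-to-noise ratio (SNR) as $\rho \triangleq P/N_0$. In this article, we consider the jointly correlated MIMO channel model [28], given by

$$\mathbf{H}_i = \mathbf{U}_r \tilde{\mathbf{H}}_i \mathbf{U}_{t,i}^H = \mathbf{U}_r \left(\mathbf{D}_i + \mathbf{M}_i \odot \mathbf{H}_{\mathrm{iid},i}\right) \mathbf{U}_{t,i}^H, \qquad (4)$$

where $\tilde{\mathbf{H}}_i = \mathbf{D}_i + \mathbf{M}_i \odot \mathbf{H}_{\mathrm{iid},i}$; $\mathbf{U}_r$ and $\mathbf{U}_{t,i}$ are deterministic unitary matrices; $\mathbf{D}_i$ is an $N_r \times N_t^{(i)}$ deterministic matrix; $\mathbf{M}_i$ is an $N_r \times N_t^{(i)}$ deterministic matrix with nonnegative elements; and $\mathbf{H}_{\mathrm{iid},i}$ for $i = 1, 2, \ldots, m$ are statistically independent $N_r \times N_t^{(i)}$ random matrices of independent and identically distributed (i.i.d.) zero-mean unit-variance entries. Note that $\mathbf{H}_{\mathrm{iid},i}$ is not necessarily Gaussian. The matrices $\mathbf{D}_i$ and $\mathbf{M}_i$ reflect the line-of-sight (LOS) and scattering components of the channel, respectively.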
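Also, we define $\mathbf{D} \triangleq \left[\sqrt{\Gamma_1}\,\mathbf{D}_1, \ldots, \sqrt{\Gamma_m}\,\mathbf{D}_m\right]$, which has at most one nonzero element in each row and each column. Without loss of generality, we assume that the nonzero elements of $\mathbf{D}$ are real, with indices $(l, l)$ for $1 \le l \le \min(N_t, N_r)$.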
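From (4), the transmit correlation matrices of the BSs and the receive correlation matrix of the user can be expressed, respectively, as $\mathbf{R}_{t,i} = \mathbf{U}_{t,i} \mathbf{F}_{t,i} \mathbf{U}_{t,i}^H$ and $\mathbf{R}_r = \mathbf{U}_r \mathbf{F}_r \mathbf{U}_r^H$, where $\mathbf{F}_{t,i}$ and $\mathbf{F}_r$ are diagonal matrices, and $\mathbf{U}_{t,i}$ and $\mathbf{U}_r$ are the eigenvector matrices of the transmit and receive correlation matrices, respectively.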
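For ease of exposition, the channel coupling matrix is usually defined as [26]

$$\boldsymbol{\Omega}_i = E\{\tilde{\mathbf{H}}_i \odot \tilde{\mathbf{H}}_i^*\} = \mathbf{D}_i \odot \mathbf{D}_i + \mathbf{M}_i \odot \mathbf{M}_i,$$

and the power constraint (3) can be rewritten as

$$\sum_{k=1}^{N_r} \sum_{l=1}^{N_t^{(i)}} \omega_{kl}^{(i)} = N_r N_t^{(i)}.$$

In the above, the $(k,l)$th element of $\boldsymbol{\Omega}_i$, i.e., $\omega_{kl}^{(i)}$, corresponds to the average power of the $(k,l)$th element of $\tilde{\mathbf{H}}_i$, i.e., $\tilde{h}_{kl}^{(i)}$, which captures the average coupling between the $k$th receive eigenmode and the $l$th transmit eigenmode of BS$_i$.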
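The channel model of this section can be illustrated numerically. The following is a minimal Python sketch for one BS-user link, not the authors' simulation code. It assumes Gaussian entries for the i.i.d. matrix, randomly drawn unitary eigenvector matrices, the coupling matrix taken as the entrywise average power $\mathbf{D} \odot \mathbf{D} + \mathbf{M} \odot \mathbf{M}$, and a normalization in which the entries of the coupling matrix sum to $N_r N_t$; the function names are hypothetical.

```python
import numpy as np

def random_unitary(n, rng):
    """Draw an n x n unitary matrix via QR of a complex Gaussian matrix (illustrative choice)."""
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, _ = np.linalg.qr(z)
    return q

def jointly_correlated_channel(U_r, U_t, D, M, rng):
    """One realization of H = U_r (D + M o H_iid) U_t^H, with 'o' the Hadamard product."""
    H_iid = (rng.standard_normal(M.shape) + 1j * rng.standard_normal(M.shape)) / np.sqrt(2)
    return U_r @ (D + M * H_iid) @ U_t.conj().T

def coupling_matrix(D, M):
    """Average power of each entry of D + M o H_iid (D real, H_iid zero-mean, unit-variance)."""
    return D * D + M * M

rng = np.random.default_rng(0)
N_r, N_t = 4, 4
U_r, U_t = random_unitary(N_r, rng), random_unitary(N_t, rng)
D = np.zeros((N_r, N_t))                       # no LOS component
M = rng.uniform(0.2, 1.0, size=(N_r, N_t))     # arbitrary nonnegative scattering profile
M *= np.sqrt(N_r * N_t / np.sum(M ** 2))       # assumed normalization: sum of Omega = N_r * N_t
Omega = coupling_matrix(D, M)

# Empirical check: the average power of U_r^H H U_t should approach Omega entry by entry.
n_samples = 20000
H_tilde_power = np.zeros((N_r, N_t))
for _ in range(n_samples):
    H = jointly_correlated_channel(U_r, U_t, D, M, rng)
    H_tilde_power += np.abs(U_r.conj().T @ H @ U_t) ** 2
H_tilde_power /= n_samples
```

Comparing `H_tilde_power` with `Omega` shows how the coupling matrix summarizes the average power transferred between each pair of receive and transmit eigenmodes.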

Statistical CSI-aided coordinated BS transmission
Here, we first devise the optimal transmit scheme for each BS, and then derive a closed-form upper bound for the ergodic sum capacity using the matrix permanents. Based on the capacity bound, we develop low-complexity power allocation solutions using convex optimization techniques, followed by discussion of the beamforming optimality conditions for the BSs.

Optimal transmit scheme
We here assume that the mobile receiver has perfect instantaneous CSI, whereas the BSs only have the statistical CSI, including $\mathbf{U}_{t,i}$, $\mathbf{U}_r$, $\mathbf{D}_i$, and $\mathbf{M}_i$ (and thus $\boldsymbol{\Omega}_i$) for $i = 1, 2, \ldots, m$, and this information can be exchanged among the BSs via the wired backbone. Under these assumptions, the ergodic sum capacity of the downlink system is achieved by selecting the transmitted signal vector $\mathbf{x}$ to follow a zero-mean proper Gaussian distribution [1].
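The power constraint (2) can be rewritten as $\mathrm{tr}(\mathbf{Q}_{ii}) = P_i$, for $i = 1, 2, \ldots, m$, where $\mathbf{Q}_{ij} = E\{\mathbf{x}_i \mathbf{x}_j^H\}$ is the $(i,j)$th block of the covariance matrix $\mathbf{Q} = E\{\mathbf{x}\mathbf{x}^H\}$ of the stacked signal $\mathbf{x} = [\mathbf{x}_1^T, \ldots, \mathbf{x}_m^T]^T$, and the ergodic sum capacity is given by

$$C = \max_{\mathbf{Q} \succeq \mathbf{0},\ \mathrm{tr}(\mathbf{Q}_{ii}) = P_i} E\left\{\log\det\left(\mathbf{I}_{N_r} + \frac{1}{N_0}\mathbf{H}\mathbf{Q}\mathbf{H}^H\right)\right\}, \qquad (12)$$

where $\mathbf{H} = [\sqrt{\Gamma_1}\,\mathbf{H}_1, \ldots, \sqrt{\Gamma_m}\,\mathbf{H}_m]$. We write the eigenvalue decomposition of $\mathbf{Q}_{ii}$ as $\mathbf{Q}_{ii} = \mathbf{U}_i \boldsymbol{\Lambda}_i \mathbf{U}_i^H$, with $\mathbf{U}_i$ being the eigenvector matrix and $\boldsymbol{\Lambda}_i$ the diagonal matrix of the corresponding eigenvalues. The following theorem addresses the optimal transmit direction of each BS.

Theorem 1 The ergodic sum capacity is achieved if the BS transmit signals are all mutually independent (i.e., $\mathbf{Q}_{ij}^{\mathrm{opt}} = \mathbf{0}$ for $i \neq j$), and the eigenvector matrix of $\mathbf{Q}_{ii}^{\mathrm{opt}}$ for the jointly correlated channel (4) is given by $\mathbf{U}_i = \mathbf{U}_{t,i}$. The ergodic sum capacity is then expressed as

$$C = \max_{\boldsymbol{\Lambda}} E\left\{\log\det\left(\mathbf{I}_{N_r} + \frac{1}{N_0}\tilde{\mathbf{H}}\boldsymbol{\Lambda}\tilde{\mathbf{H}}^H\right)\right\}, \qquad (13)$$

where $\boldsymbol{\Lambda} = \mathrm{diag}(\boldsymbol{\Lambda}_1, \ldots, \boldsymbol{\Lambda}_m)$ with $\mathrm{tr}(\boldsymbol{\Lambda}_i) = P_i$.

Proof: From (4) and (5), the channel matrix $\mathbf{H}$ can be expressed as

$$\mathbf{H} = \mathbf{U}_r \tilde{\mathbf{H}} \mathbf{U}_t^H, \qquad (14)$$

where

$$\tilde{\mathbf{H}} = \mathbf{D} + \mathbf{M} \odot \mathbf{H}_{\mathrm{iid}}, \qquad (15)$$

with the block-diagonal matrix $\mathbf{U}_t = \mathrm{diag}(\mathbf{U}_{t,1}, \ldots, \mathbf{U}_{t,m})$, $\mathbf{M} = [\sqrt{\Gamma_1}\,\mathbf{M}_1, \ldots, \sqrt{\Gamma_m}\,\mathbf{M}_m]$, and $\mathbf{H}_{\mathrm{iid}} = [\mathbf{H}_{\mathrm{iid},1}, \ldots, \mathbf{H}_{\mathrm{iid},m}]$. Defining $\tilde{\mathbf{Q}} \triangleq \mathbf{U}_t^H \mathbf{Q} \mathbf{U}_t$ and substituting (14) into (12), the capacity becomes the maximum of $I(\tilde{\mathbf{Q}})$ over $\tilde{\mathbf{Q}}$, where

$$I(\tilde{\mathbf{Q}}) = E\left\{\log\det\left(\mathbf{I}_{N_r} + \frac{1}{N_0}\tilde{\mathbf{H}}\tilde{\mathbf{Q}}\tilde{\mathbf{H}}^H\right)\right\}. \qquad (17)$$

Note that the optimization condition is met since $\mathrm{tr}(\tilde{\mathbf{Q}}_{ii}) = \mathrm{tr}(\mathbf{U}_{t,i}^H \mathbf{Q}_{ii} \mathbf{U}_{t,i}) = \mathrm{tr}(\mathbf{Q}_{ii}) = P_i$. Let $\boldsymbol{\Pi}_l$, for $l = 1, \ldots, N_t$, be diagonal matrices all of which have their diagonal entries being all 1s except for the $(l,l)$th entry, which is $-1$. As $\boldsymbol{\Pi}_l$ is a unitary matrix, (17) can be written as

$$I(\tilde{\mathbf{Q}}) = E\left\{\log\det\left(\mathbf{I}_{N_r} + \frac{1}{N_0}\left(\boldsymbol{\Pi}_l \tilde{\mathbf{H}} \boldsymbol{\Pi}_l\right)\left(\boldsymbol{\Pi}_l^H \tilde{\mathbf{Q}} \boldsymbol{\Pi}_l\right)\left(\boldsymbol{\Pi}_l \tilde{\mathbf{H}} \boldsymbol{\Pi}_l\right)^H\right)\right\},$$

where the left factor of $\tilde{\mathbf{H}}$ is understood as the $N_r \times N_r$ matrix of the same form (the identity if $l > N_r$). Note that $\tilde{\mathbf{H}}$ is given by (15) and $\boldsymbol{\Pi}_l \tilde{\mathbf{H}} \boldsymbol{\Pi}_l$ has the same distribution as $\tilde{\mathbf{H}}$: since $\mathbf{D}$ is a diagonal matrix, and the entries of $\mathbf{M} \odot \mathbf{H}_{\mathrm{iid}}$ are independent with symmetric distributions, reversing the sign of some rows and columns does not alter the distribution. Thus, we have

$$I(\tilde{\mathbf{Q}}) = I\left(\boldsymbol{\Pi}_l \tilde{\mathbf{Q}} \boldsymbol{\Pi}_l^H\right).$$

From Jensen's inequality, it follows that [37-39]

$$I(\tilde{\mathbf{Q}}) = \frac{1}{2} I(\tilde{\mathbf{Q}}) + \frac{1}{2} I\left(\boldsymbol{\Pi}_l \tilde{\mathbf{Q}} \boldsymbol{\Pi}_l^H\right) \le I\left(\frac{1}{2}\left(\tilde{\mathbf{Q}} + \boldsymbol{\Pi}_l \tilde{\mathbf{Q}} \boldsymbol{\Pi}_l^H\right)\right),$$

where the matrix $\frac{1}{2}(\tilde{\mathbf{Q}} + \boldsymbol{\Pi}_l \tilde{\mathbf{Q}} \boldsymbol{\Pi}_l^H)$ has entries equal to those of $\tilde{\mathbf{Q}}$ except for the off-diagonals in the $l$th row and $l$th column, which are zero. In particular, its trace is identical to that of $\tilde{\mathbf{Q}}$. As a result, nulling the off-diagonal entries of any column and the corresponding row of $\tilde{\mathbf{Q}}$ can only increase $I(\tilde{\mathbf{Q}})$. Using the same process $N_t$ times, (17) is maximized with a diagonal $\tilde{\mathbf{Q}}$, i.e., $\tilde{\mathbf{Q}} = \boldsymbol{\Lambda}$. Because $\mathbf{U}_t$ is block-diagonal, $\mathbf{Q} = \mathbf{U}_t \boldsymbol{\Lambda} \mathbf{U}_t^H$ is then block-diagonal with $\mathbf{Q}_{ii} = \mathbf{U}_{t,i} \boldsymbol{\Lambda}_i \mathbf{U}_{t,i}^H$, which completes the proof.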
Theorem 1 reveals that the transmitted signals of all BSs should be mutually independent and that the optimal signaling directions of the $i$th BS align with the eigenvectors of the transmit-side correlation matrix of the MIMO channel of the $i$th BS. These results extend the prior results in [29,37,40,41] to the more general channel model given by (4).
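The sign-flip averaging step in the proof of Theorem 1 can be checked numerically. The Python sketch below is an illustration under stated assumptions, not part of the original derivation: it uses a randomly generated positive semi-definite matrix as a stand-in for the rotated covariance, a single random channel realization, unit noise power, and hypothetical helper names. It verifies that averaging a matrix with its sign-flipped version zeroes the off-diagonals of one row and column at a time, and that the log-determinant of the averaged matrix is no smaller than the average of the two log-determinants (concavity).

```python
import numpy as np

def sign_flip(n, l):
    """Diagonal matrix with all ones on the diagonal except -1 at position (l, l)."""
    d = np.ones(n)
    d[l] = -1.0
    return np.diag(d)

def mutual_info(H, Q, noise_power=1.0):
    """log det(I + H Q H^H / N_0) in nats for a single channel realization H."""
    n_r = H.shape[0]
    G = np.eye(n_r) + H @ Q @ H.conj().T / noise_power
    _, logdet = np.linalg.slogdet(G)
    return logdet

rng = np.random.default_rng(1)
n_r, n_t = 3, 4
A = rng.standard_normal((n_t, n_t)) + 1j * rng.standard_normal((n_t, n_t))
Q = A @ A.conj().T                      # a full (non-diagonal) PSD covariance
H = rng.standard_normal((n_r, n_t)) + 1j * rng.standard_normal((n_r, n_t))

Q_k = Q.copy()
gaps = []                               # concavity gaps, one per nulling step
for l in range(n_t):
    Pi = sign_flip(n_t, l)
    Q_flipped = Pi @ Q_k @ Pi.conj().T
    Q_avg = 0.5 * (Q_k + Q_flipped)     # off-diagonals of row/column l become zero
    gaps.append(mutual_info(H, Q_avg)
                - 0.5 * (mutual_info(H, Q_k) + mutual_info(H, Q_flipped)))
    Q_k = Q_avg

Q_diag = Q_k                            # after n_t steps only the diagonal of Q remains
```

After the loop, `Q_diag` equals the diagonal part of `Q` with the same trace, which mirrors the argument that a diagonal rotated covariance is optimal once the expectation over the channel is taken.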

Ergodic sum capacity upper bound
With the optimal transmit directions of the BSs known, the remaining challenge is to determine the eigenvalues of the capacity-achieving input covariance matrix $\mathbf{Q}_{ii}$ for $i = 1, \ldots, m$. This is equivalent to optimally allocating the available transmit power over the optimized transmit eigen-directions that are determined by Theorem 1.
In the most general case, it is difficult to derive exact closed-form solutions for the power allocation problem. The main obstacle lies in the complexity in evaluating the expectation in (13) which is usually done by stochastic averaging over a large number of random samples. In this section, our approach is to derive a tight upper bound for the expectation in (13) which can serve as an approximation to the ergodic capacity. Based on this, we develop closed-form power allocation solutions which will be presented in Section 3.3.
Due to the concavity of the $\log(\cdot)$ function, $C$ is upper bounded by

$$C \le \max_{\boldsymbol{\Lambda}} \log E\left\{\det\left(\mathbf{I}_{N_r} + \frac{1}{N_0}\tilde{\mathbf{H}}\boldsymbol{\Lambda}\tilde{\mathbf{H}}^H\right)\right\}, \qquad (22)$$

which can be rewritten as (23). The derivation of the expectation in (23) relies heavily on linear-algebraic concepts and the properties of matrix permanents. The permanent of a matrix is defined in a similar fashion to the determinant. The primary difference is that when taking the expansion over minors, all signs are positive. The permanents of $M \times N$ matrices have been investigated in [28,42]. We introduce the definitions and properties of matrix permanents in Appendix 1.
From these definitions, we extend the results of [28] to the case of multiple BSs and derive a closed-form expression for the upper bound on the ergodic sum capacity.
From Theorem 2, we can see that the upper bound on the ergodic sum capacity depends on the average SNR and the eigenmode channel coupling matrices $\boldsymbol{\Omega}_i$, for $i = 1, 2, \ldots, m$. Low-complexity algorithms for computing the matrix permanent were developed in [28].
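To make the notion of a matrix permanent concrete, the following Python sketch evaluates the permanent of a small $M \times N$ matrix directly from the definition, as a sum of products over all selections of distinct columns with every sign positive. This is a brute-force illustration only; it is not the low-complexity method of [28], and the function name `permanent` is a hypothetical choice.

```python
from itertools import permutations
from math import prod

def permanent(A):
    """Permanent of an M x N matrix given as a list of rows.

    For M <= N, sums a[0][c0] * ... * a[M-1][c_{M-1}] over all ordered
    selections of M distinct columns. Unlike the determinant, every term
    enters with sign +1. Cost grows factorially, so use only for small sizes.
    """
    m, n = len(A), len(A[0])
    if m > n:
        # Per(A) = Per(A^T), so work with the wide orientation.
        A = [list(col) for col in zip(*A)]
        m, n = n, m
    return sum(
        prod(A[k][cols[k]] for k in range(m))
        for cols in permutations(range(n), m)
    )

per_square = permanent([[1, 2], [3, 4]])        # 1*4 + 2*3
per_wide = permanent([[1, 1, 1], [1, 1, 1]])    # number of ordered column pairs
```

For the $2 \times 2$ example the determinant would be $1 \cdot 4 - 2 \cdot 3 = -2$, whereas the permanent adds both products and gives 10.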

Optimizing the power allocation policies
We now consider the transmitter power allocation optimization problem. Based on the upper bound in Theorem 2, we develop low-complexity power allocation solutions using convex optimization techniques and then propose a simple iterative water-filling algorithm (IWFA) for approaching the optimal power allocation policy.
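From (25), the power allocation optimization problem can be formulated as the maximization of the upper bound $\bar{C}_u(\boldsymbol{\lambda})$ over the power allocation $\boldsymbol{\lambda}$ subject to the per-BS power constraints. This is a convex optimization problem (the maximization of a concave objective) [28], and the solution can be evaluated by employing standard convex optimization algorithms. In the following, we derive necessary and sufficient conditions for the optimal solution using the Karush-Kuhn-Tucker (KKT) conditions.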
Theorem 3 The expected mutual information upper bound $\bar{C}_u(\boldsymbol{\lambda})$ is concave with respect to $\boldsymbol{\lambda}$, and the necessary and sufficient conditions for the optimal power allocation are given by (31) and (32).

Proof: See Appendix 2.
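Since the right-hand side of (31) is independent of $\lambda_j^{(i)}$, we propose a simple IWFA to evaluate the optimal power allocation policy that satisfies (31). Simulation results, to be given in Section 4, will demonstrate that the proposed approach works very well and is highly efficient, typically converging after only a few iterations, with the first iteration achieving near-optimal performance. The proposed algorithm includes the following steps:
Step 1 Initialize $\boldsymbol{\lambda}^0 = \mathbf{1}$, $\bar{C}_u(\boldsymbol{\lambda}^k) = \log \mathrm{Per}(\gamma \mathbf{B}^k)$, and $k = 0$.
Step 2 Calculate the variables $p(\boldsymbol{\lambda}_{i(j)})$ and $q(\boldsymbol{\lambda}_{i(j)})$ required by the water-filling in Step 3.
Step 3 Calculate $\boldsymbol{\lambda}^{k+1}$ by water-filling for $j = 1, \ldots, N_t^{(i)}$ and $i = 1, \ldots, m$, with the power constraints.
Step 4 Calculate $\bar{C}_u(\boldsymbol{\lambda}^{k+1}) = \log \mathrm{Per}(\gamma \mathbf{B}^{k+1})$.
Step 5 Carry out the step of [28] that guarantees convergence.
Step 6 Set $k := k + 1$ and return to Step 2 until the algorithm converges or the iteration number is equal to a predefined value.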
In the above, the superscript $k$ specifies the corresponding variable in the $k$th iteration, so that $\boldsymbol{\lambda}^k$ stands for the value of $\boldsymbol{\lambda}$ in the $k$th iteration. In Step 1 of the first iteration, $\boldsymbol{\lambda}$ is initialized to $\mathbf{1}$, i.e., equal-power allocation. Note, however, that $\boldsymbol{\lambda}$ could also be initialized in a different way. For example, when the channel statistics change smoothly from frame to frame, a more appropriate initialization would be the optimal value of $\boldsymbol{\lambda}$ from the previous frame. In Step 3, the conventional water-filling algorithm is performed with the required variables $p(\boldsymbol{\lambda}_{i(j)})$ and $q(\boldsymbol{\lambda}_{i(j)})$ calculated in Step 2. Following the calculation of $\bar{C}_u(\boldsymbol{\lambda})$ in Step 4, Step 5 is performed to guarantee convergence [28]. In Step 6, the convergence of the algorithm can be determined by checking whether $\bar{C}_u(\boldsymbol{\lambda}^{k+1}) - \bar{C}_u(\boldsymbol{\lambda}^k)$ or $\|\boldsymbol{\lambda}^{k+1} - \boldsymbol{\lambda}^k\|$ is less than some predefined value for a given precision.
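The conventional water-filling operation used inside each iteration can be sketched in Python as follows. This is a generic building block under stated assumptions, not the exact update (31): the list `floors` is a hypothetical stand-in for the per-eigenmode terms that the iteration would compute from $p(\cdot)$ and $q(\cdot)$, and `budget` stands for the power available to one BS. In a multi-BS setting the function would be called once per BS with that BS's own budget.

```python
def water_filling(floors, budget):
    """Classical water-filling: powers[j] = max(0, mu - floors[j]) with sum(powers) == budget.

    floors: per-eigenmode floor levels (hypothetical stand-in for the terms in (31)).
    budget: total power to distribute (nonnegative).
    """
    order = sorted(range(len(floors)), key=lambda j: floors[j])
    powers = [0.0] * len(floors)
    # Try the largest active set first and drop the highest floor until
    # the water level mu lies above every active floor.
    for n_active in range(len(floors), 0, -1):
        active = order[:n_active]
        mu = (budget + sum(floors[j] for j in active)) / n_active
        if mu > floors[active[-1]]:
            for j in active:
                powers[j] = mu - floors[j]
            break
    return powers

# The third eigenmode has too high a floor and receives no power.
example = water_filling([1.0, 2.0, 5.0], 3.0)
# Equal floors lead to equal-power allocation.
equal = water_filling([1.0, 1.0, 1.0], 3.0)
```

In the first example the water level settles at 3, so the allocation is 2, 1, and 0; eigenmodes with weak coupling are switched off, which is the mechanism that leads to beamforming when only one eigenmode remains active.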

Optimality of beamforming
Here, we investigate the optimality of beamforming (i.e., rank-one transmission) [29][30][31][32][33][34] in the context of multi-BS cooperation systems, and derive a necessary and sufficient condition for it. For BS$_i$, we assume that the transmit eigenmodes satisfy the following conditions where τ and ω

For multi-BS cooperation systems, the transmit covariance matrices of all the BSs that achieve the sum capacity are of unit rank (i.e., beamforming is optimal for all the BSs) if and only if the following inequality is fulfilled: Note that the proof is a nontrivial generalization of the techniques in [33] to jointly correlated MIMO multi-BS cooperation systems. We can make the following observations.
• If $\tilde{\mathbf{h}}_{ij}$ for $j = 1, \ldots, N_t^{(i)}$ are i.i.d., the left-hand side of (36) remains unchanged when $j$ varies from 2 to $N_t^{(i)}$, and the right-hand side of (36) is maximized for $j = 2$. This means that if the condition holds for $j = 2$, then it also holds for all other $j$. Thus, inserting $j = 2$ into (36) gives the following condition:

Numerical results
In this section, we present numerical results to evaluate the tightness of the capacity bound, and demonstrate the efficiency and performance of the proposed transmitter optimization approach. We consider downlink transmission for coordinated cellular networks with two BSs and a single user. We assume that in all cases the BSs have the same transmit power, i.e., $P_1 = P_2$. For the jointly correlated channel, we set the LOS component $\mathbf{D} = \mathbf{0}$. The spatial channel model (SCM) [43,44] is used to generate the channel matrices of two independent links, from which $\boldsymbol{\Omega}_1$ and $\boldsymbol{\Omega}_2$ are obtained. The simulation environment is set to urban microcell, with the BS antenna spacing $d_{\mathrm{BS}} = 0.5\lambda$ and the user antenna spacing $d_{\mathrm{UE}} = 0.5\lambda$. The number of distinguishable paths is set to 1, i.e., flat fading. For each link, 1,000 time samples are generated for the calculation of the statistical CSI. The path loss model is $31.5 + 35\log_{10}(d)$ dB, with $d$ in meters, and the site-to-site distance is set to 1,000 m. In all of the following figures, the horizontal axis (SNR) indicates the received SNR. Figure 1 illustrates the ergodic sum capacity of the joint IWFA and the individual IWFA (where the BSs cannot cooperate) with $N_t^{(i)} = 4$, $N_r = 4$. As we can see, the capacity of the joint IWFA is greater than that of the individual IWFA. For comparison, the results for the exact ergodic sum capacity are also shown, which were obtained by numerically evaluating (13) using constrained optimization. In addition, equal-power allocation and beamforming are optimal in the high- and low-SNR regimes, respectively. Figure 2 demonstrates the convergence of the proposed IWFA for optimal power allocation. In this figure, the SNR $\rho$ is set to 20 dB, and the algorithm is initialized using $\boldsymbol{\lambda}^0 = \mathbf{1}$. From these results, we see that the proposed IWFA converges after only a few iterations.
Figure 3 compares the ergodic sum capacity upper bound in (25) with Monte Carlo simulation results of the ergodic mutual information, obtained by averaging over a large number of independent realizations of $\mathbf{H}$ with $N_t^{(i)} = 4$, $N_r = 8$. It shows that the ergodic sum capacity upper bound is tight. As a result, the proposed IWFA, which optimizes the power allocation on the basis of this upper bound, can closely approach the true channel capacity.
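The comparison between a Monte Carlo estimate of the ergodic mutual information and a Jensen-type upper bound can be reproduced in miniature with the Python sketch below. It is an illustration under stated assumptions rather than the simulation setup of this section: the scattering profile `M` is drawn at random instead of from the SCM, the i.i.d. entries are complex Gaussian, the LOS component is zero, power is split equally, and the bound is evaluated by sample averaging instead of the closed-form permanent expression.

```python
import numpy as np

rng = np.random.default_rng(2)
N_r, N_t, n_samples = 4, 4, 2000
snr = 10.0                                   # linear transmit SNR (assumed value)
M = rng.uniform(0.2, 1.0, size=(N_r, N_t))   # hypothetical scattering profile
lam = np.full(N_t, 1.0 / N_t)                # equal-power allocation with unit total power

dets = np.empty(n_samples)
for s in range(n_samples):
    H_iid = (rng.standard_normal((N_r, N_t))
             + 1j * rng.standard_normal((N_r, N_t))) / np.sqrt(2)
    H_tilde = M * H_iid                      # jointly correlated channel with D = 0
    # (H_tilde * lam) scales column l by lam[l], i.e. H_tilde @ diag(lam).
    G = np.eye(N_r) + snr * (H_tilde * lam) @ H_tilde.conj().T
    dets[s] = np.linalg.det(G).real          # determinant of a Hermitian PD matrix is real

ergodic_mc = np.mean(np.log2(dets))          # Monte Carlo ergodic mutual information (bit/s/Hz)
jensen_bound = np.log2(np.mean(dets))        # log of the averaged determinant
```

By Jensen's inequality `jensen_bound` is never below `ergodic_mc`, and the gap between the two gives a rough feel for how tight a bound of this type is for a given coupling profile and SNR.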
To investigate the beamforming optimality condition, the inequality (36) can be rewritten as (38). We see that the region where beamforming is optimal gets larger with an increasing number of BSs. Note that these curves lie below the $\tau^{(1)}$ line, where $\tau^{(1)}$ is the largest transmit eigenmode. It can be seen that as $m$ increases, the curves get closer to the $\tau^{(1)}$ line. Figure 5 illustrates the beamforming optimality condition (38) for different antenna spacings. We set the number of BSs to $m = 5$. The condition is plotted as a function of ρ 1 τ (1) 1 2 . We see that the region where beamforming is optimal gets larger as the user antenna spacing decreases and remains almost unchanged as the BS spacing changes.

Conclusion
We optimized the transmission for jointly correlated MIMO channels using statistical CSI over cooperative cellular networks with multiple BSs and a single cell-edge user. We proposed an optimal transmit scheme to maximize the ergodic sum capacity and showed that the transmitted signals of all the BSs should be mutually independent and that the optimal transmit directions of each BS are the eigenvectors of the BS's own transmit correlation matrix. We also derived a closed-form tight upper bound for the ergodic sum capacity, based on which we developed low-complexity power allocation solutions using convex optimization techniques and a simple IWFA. Finally, we derived the beamforming optimality conditions for all the BSs.
Definition 2 The extended permanent of $\mathbf{A}$ is defined as According to the above definition, one can easily establish a number of important properties of the matrix permanent, as given in the following lemma [28].

where denotes the partial derivative of $\bar{C}_u(\boldsymbol{\lambda})$ with respect to λ where $p(\boldsymbol{\lambda}_{i(j)})$ and $q(\boldsymbol{\lambda}_{i(j)})$ are given by (33) and (34), respectively. Thus, (50) becomes Substituting (52) into (48) and eliminating the slack variable $\mu$, the KKT conditions become (31) and (32), where $(a)^+ = \max\{0, a\}$ and $\bar{\nu}_i = 1/\nu_i$.

Beamforming is optimal if (56) is satisfied as an equality for $j = 1$, and as a strict inequality for all other $j$. This means that beamforming of BS$_i$ along the strongest transmit eigenmode is optimal. That is, where Equivalently, the beamforming optimality conditions for BS$_i$ can be written as We now use the matrix inversion formula [45] to give and as a result, we get where Therefore, (57) and (58) can be rewritten as and Combining (60), (64), and (67), we obtain the desired condition in (36).