Optimal Power Allocation with Channel Inversion Regularization-Based Precoding for MIMO Broadcast Channels

Zero-forcing (ZF) precoding can asymptotically achieve the same sum capacity as dirty-paper coding (DPC) in the multiple-input multiple-output broadcast (MIMO-BC) channel when the number of users, K, approaches infinity. However, the gap between ZF and DPC is not negligible in a practical range of K, that is, K ≤ 100. The capacity loss is partly due to the excessive transmission power penalty incurred by ZF when the channel matrix of the selected user subset is poorly conditioned. To avoid this power penalty, we propose to use a variation of ZF, channel inversion regularization (CIR), as a precoding scheme in MIMO-BC channels. Unlike with the interference-free ZF, however, the problem of maximizing the sum-rate capacity using CIR precoding is nonconvex and cannot be solved by the water-filling strategy. Thus, we propose an efficient algorithm based on gradient projection (GP) as the optimal power allocation strategy for the selected users, and show that the proposed CIR precoding scheme can asymptotically achieve the optimum sum-rate of the DPC strategy. Moreover, simulation results show that the CIR precoding scheme with the proposed optimal power allocation scheme achieves better sum-rate performance than ZF for a wide range of K.


INTRODUCTION
Multiple-input multiple-output (MIMO) systems demonstrate considerable capacity gain compared to single-input single-output (SISO) systems in single-user communication [1,2]. But the application of MIMO is not limited to the single-user case. Recently, a growing body of research has investigated the application of MIMO to multiuser systems in wireless networks.
This paper considers a MIMO broadcast channel (MIMO-BC), where a transmitter equipped with multiple antennas at the base station communicates with several users equipped with multiple antennas. The capacity region of MIMO-BC channels has recently been characterized in [3-6]. In [5,6], the duality between MIMO-BC and MIMO multiaccess (MIMO-MA) channels is established and exploited to calculate the capacity of MIMO-BC in an efficient manner. Moreover, the sum capacity of MIMO-BC channels can be achieved theoretically by applying dirty-paper coding (DPC) [7] to serve multiple users simultaneously at the transmitter [4]. Based on interference precancellation at the transmitter, the multiuser encoding DPC is the optimal capacity-achieving strategy for MIMO-BC. Though several practical implementations of DPC were proposed in [8,9], the DPC strategy remains very difficult to implement in practical systems due to the high computational complexity incurred by successive encoding and decoding.
To circumvent the complexity of DPC, a simple suboptimal linear zero-forcing (ZF) precoding scheme is investigated in [10] and shown to achieve asymptotically the same sum capacity as DPC in the MIMO-BC channel as the number of users K approaches infinity. However, achieving the optimum ZF sum-rate requires an exhaustive search over all possible user subsets to select the subset that yields the highest sum-rate, which is clearly computationally infeasible for large K. Hence, [10] proposed a low-complexity suboptimum algorithm based on semiorthogonal user selection (SUS). The basic idea of the SUS algorithm is to find a subset of users with the best semiorthogonal channel vectors. As the search space of SUS is greatly reduced compared to the exhaustive search, its complexity is significantly lower, yet it can still achieve the sum-rate of optimal ZF asymptotically as K → ∞. However, the SUS scheme requires perfect channel state information (CSI) at the base station, demanding a large amount of feedback from the users to the base station. To alleviate the CSI feedback burden imposed by SUS, two other suboptimum schemes [11,12] have been proposed that approach the performance of the SUS scheme with much less CSI feedback and computational complexity.
All these schemes based on ZF, however, incur a nonnegligible performance loss compared to DPC when K is relatively small. This sum-rate performance loss is partly due to the excess transmission power penalty incurred at the transmitter by ZF. More specifically, when the number of users K is relatively small, for example, less than 100, it is not always possible to find a group of M users with nearly orthogonal channels, and the channel matrix of the selected user subset is thus often poorly conditioned. It is known that inverting a poorly conditioned matrix unavoidably reduces the effective channel gain [10]. In addition, the sum-rate loss caused by this reduced effective channel gain is more prominent at low SNR.
In this paper, we propose to use channel inversion regularization (CIR) [13] as an alternative precoding scheme to alleviate the reduction of effective channel gain for MIMO-BC channels. Rather than applying the pseudoinverse directly, the CIR scheme adds a multiple of the identity matrix before inverting, which regularizes the inverse. Hence, unlike with the interference-free ZF, the zero-interference condition is no longer satisfied with the CIR precoding scheme. As a result, maximizing the sum-rate of the CIR scheme cannot be reduced to a simple convex optimization problem as it can for the ZF scheme, and hence the well-known transmission power "water-filling" strategy may not be applicable.
Instead, the power allocation problem of the CIR scheme becomes a nonlinear nonconvex optimization problem. To solve this nonconvex optimization problem globally, we can resort to the global difference of convex (d.c.) optimization technique by recognizing that its objective function can be written as the difference of two convex functions. In [14], a similar global d.c. optimization approach has been proposed to find the global optimum of the power allocation problem in digital subscriber line (DSL) interference channels with much less complexity than the existing exhaustive search method. But this global optimization approach is not the ideal choice for the highly efficient algorithm required by wireless fading channels, where the power allocation has to be computed for every fading block. As an alternative, we propose to use a low-complexity gradient-projection (GP) method, which is an extension of the unconstrained steepest descent method, as the power allocation scheme at the transmitter for the selected user subset. Because the feasible region of our particular power allocation optimization problem is a geometrically simple simplex, the GP method is well suited for solving it efficiently. The asymptotic convexity analysis in Section 4 shows that the nonconvex optimization problem reduces to a convex optimization problem when the number of users in the system is sufficiently large. Thus, the local GP method can find the global optimum solution when K is sufficiently large. Moreover, simulation results in various wireless scenarios also indicate that the sum-rate performance loss of the local GP method is negligible compared to that of the global d.c. approach, justifying the use of the local GP method for MIMO-BC channels.
The remainder of this paper is organized as follows. Section 2 begins with the system model of MIMO-BC and describes the ZF precoding scheme. Section 3 presents the proposed scheme based on CIR with optimal power control, and considers both global d.c. optimization approach and local GP method for solving the nonconvex optimization problem. The analysis of asymptotic sum-rate performance and convexity is presented in Section 4, showing that the proposed CIR precoding scheme can achieve asymptotically the optimum sum-rate offered by the DPC strategy. Section 5 provides simulation results on the performance of the proposed scheme under various conditions. Finally, concluding remarks are made in Section 6.

SYSTEM MODEL
In this paper, we consider a MIMO broadcast system using one transmitter to serve K users in a single-cell wireless scenario. The transmitter at the base station is equipped with M transmit antennas, and user k with $N_k$ receive antennas. The channel between user k and the base station is modeled as a zero-mean circularly symmetric Gaussian matrix $H_k$. The received signal vector of user k is written as
$$y_k = H_k x + n_k, \quad k = 1, \ldots, K, \tag{1}$$
where $x \in \mathbb{C}^{M \times 1}$ is the transmitted signal from the base station, $H_k \in \mathbb{C}^{N_k \times M}$ the channel gain matrix between the base station and user k, $n_k \in \mathbb{C}^{N_k \times 1}$ the additive white Gaussian noise (AWGN) at the kth user with covariance matrix $E\{n_k n_k^H\} = I_{N_k}$, and $y_k$ the received signal vector of user k.
Assume that the transmitter at the base station has an average power constraint P over M antennas, that is, $E\{\mathrm{tr}(x x^H)\} \le P$. The entries of $H_k$ are assumed to be independent, and $H_k$ is assumed to be unchanged for the duration of a frame. Also, the channel matrix and AWGN are normalized such that the entries of $H_k$ and $n_k$ all have unit variance. For ease of presentation, $N_k = 1$, k = 1, ..., K, is assumed in the following discussions without loss of generality. As argued in [10, Section VIII], the most straightforward extension to multiple receive antennas is to treat each antenna of the kth user with $N_k > 1$ as a separate user, assuming that the $N_k$ receive antennas do not coordinate. This uncoordinated receiver strategy results in $\sum_{k=1}^{K} N_k$ single-antenna users. Thus, in the limit of K → ∞, we can extend all the algorithm development and asymptotic analysis presented in this paper to multiple receive antennas without loss of generality.
Using linear precoding at the base station, the transmitted signal vector x can be expressed as a linear combination of the transmitted symbols for each user. User streams are separated by different precoding vectors, namely, beamforming directions. Let $s_k$, $w_k$, and $P_k$ be the data symbol, precoding column vector, and allocated power for user k, respectively, and let S denote the subset of users selected for transmission. The transmitted signal of linear precoding schemes can thus be expressed as
$$x = \sum_{i \in S} \sqrt{P_i}\, w_i s_i. \tag{2}$$
As a result, the received signal of user k can be written as
$$y_k = \sqrt{P_k}\, h_k w_k s_k + \sum_{j \in S,\, j \ne k} \sqrt{P_j}\, h_k w_j s_j + n_k, \tag{3}$$
where $h_k$ is a row vector referring to the kth user's channel when $N_k = 1$, and the second term represents the interference caused by all other users to user k.
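Treating the interference from all other users as additive white Gaussian noise (AWGN) and assuming that single-user detection is employed, the achievable sum-rate of linear precoding for MIMO-BC channels is given by [10]
$$R = \max_{w_i,\, P_i} \sum_{i \in S} \log_2\left(1 + \frac{P_i |h_i w_i|^2}{1 + \sum_{j \in S,\, j \ne i} P_j |h_i w_j|^2}\right) \quad \text{subject to} \quad \sum_{i \in S} \|w_i\|^2 P_i \le P. \tag{4}$$
In contrast, the sum-rate capacity of the MIMO-BC channel achieved by the DPC strategy was shown in [3,6] to be
$$R_{DPC} = \max_{P_k \ge 0,\ \sum_{k=1}^{K} P_k \le P} \log_2 \left| I_M + \sum_{k=1}^{K} P_k h_k^H h_k \right|, \tag{5}$$
where |·| denotes the determinant of the matrix inside.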
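As an illustration of the linear-precoding sum-rate, the following minimal sketch evaluates the sum over users of log2(1 + SINR) for a fixed set of precoding vectors and powers, assuming unit noise variance and the usual SINR form $P_i |h_i w_i|^2 / (1 + \sum_{j \ne i} P_j |h_i w_j|^2)$. The function name `sum_rate` and the list-of-lists data layout are hypothetical choices made for this example only.

```python
import math

def sum_rate(H, W, P):
    """Sum-rate (bits/s/Hz) of linear precoding with single-user detection.

    H[i] is the channel row vector h_i of user i (length M),
    W[j] is the precoding vector w_j of user j (length M),
    P[j] is the power allocated to user j. Noise variance is assumed to be 1.
    """
    n = len(P)
    total = 0.0
    for i in range(n):
        # |h_i w_j|^2 for every stream j, as seen at the receiver of user i
        gains = [abs(sum(h * w for h, w in zip(H[i], W[j]))) ** 2 for j in range(n)]
        signal = P[i] * gains[i]
        interference = sum(P[j] * gains[j] for j in range(n) if j != i)
        total += math.log2(1.0 + signal / (1.0 + interference))
    return total

# Two users with orthogonal channels: no interference, rates add up.
H_orth = [[1.0, 0.0], [0.0, 1.0]]
W_eye = [[1.0, 0.0], [0.0, 1.0]]
print(sum_rate(H_orth, W_eye, [1.0, 3.0]))

# User 2 now also hears the stream of user 1, which lowers its rate.
H_leaky = [[1.0, 0.0], [1.0, 1.0]]
print(sum_rate(H_leaky, W_eye, [1.0, 1.0]))
```

The second call shows how a nonzero cross term $|h_i w_j|^2$ enters the denominator of the SINR, which is the effect that the zero-forcing condition is designed to remove.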

Zero-forcing precoding scheme (see [10])
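The ZF precoding scheme selects |S| ≤ M precoding vectors such that the zero-interference condition $h_i w_j = 0$ holds for all $j \ne i$. With H(S) denoting the channel matrix whose rows are the channel vectors $h_i$, i ∈ S, of the selected users, the precoding vectors are the columns of
$$W(S) = H(S)^{\dagger} = H(S)^H \left( H(S) H(S)^H \right)^{-1}, \tag{6}$$
where † denotes the Moore-Penrose pseudoinverse of the matrix.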
Substituting the ZF precoding vector $w_i \in W(S)$ into (4), the achievable sum-rate of the ZF scheme for MIMO-BC channels is
$$R_{ZF}(S) = \max_{P_i \ge 0,\ \sum_{i \in S} P_i \le P}\ \sum_{i \in S} \log_2 (1 + \gamma_i P_i), \tag{7}$$
where $\gamma_i$ is the effective gain of subchannel i given by
$$\gamma_i = \frac{1}{\|w_i\|^2} = \frac{1}{\left[ \left( H(S) H(S)^H \right)^{-1} \right]_{i,i}}. \tag{8}$$
Under the condition of zero interference among the parallel subchannels, the water-filling strategy [15] gives the optimal power allocation $P_i$ for the optimization problem (7). Since there are many possible choices of selected user subsets, the achievable sum-rate capacity of optimum ZF is defined in [10] as
$$R_{ZF} = \max_{S \subset \{1, \ldots, K\} :\ |S| \le M} R_{ZF}(S).$$
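The water-filling allocation for interference-free parallel subchannels can be sketched as follows. This is a minimal illustration that assumes the objective $\sum_i \log_2(1 + \gamma_i P_i)$ under a total power budget and strictly positive gains; the function name `water_filling` and its arguments are hypothetical.

```python
def water_filling(gains, total_power):
    """Maximize sum(log2(1 + g_i * p_i)) subject to p_i >= 0, sum(p_i) <= total_power.

    gains: strictly positive effective subchannel gains g_i.
    Returns the list of allocated powers p_i.
    """
    # Sort subchannels by decreasing gain; the weakest ones are dropped first.
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    powers = [0.0] * len(gains)
    for n_active in range(len(order), 0, -1):
        active = order[:n_active]
        # Water level mu such that sum(mu - 1/g_i) over active subchannels
        # equals the total power budget.
        mu = (total_power + sum(1.0 / gains[i] for i in active)) / n_active
        # The allocation is valid once the weakest active subchannel
        # still receives nonnegative power.
        if mu - 1.0 / gains[active[-1]] >= 0.0:
            for i in active:
                powers[i] = mu - 1.0 / gains[i]
            break
    return powers

print(water_filling([2.0, 1.0], 1.0))   # both subchannels active
print(water_filling([1.0, 0.1], 1.0))   # the weak subchannel is switched off
```

A poorly conditioned H(S) shows up here as one or more very small gains, for which water-filling either spends a large share of the budget or switches the corresponding user off entirely.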

PROPOSED SCHEME
Despite its simplicity and asymptotic sum capacity optimality, as a suboptimal linear precoding scheme, the main disadvantage of ZF scheme is the excess transmission power penalty incurred at the transmitter. In particular, the effective channel gain is greatly reduced when the selected H(S) is poorly conditioned, which is often the case with small number of users in the system. A channel matrix is called poorly conditioned if the spread of its singular values is large.
In this paper, to alleviate the reduction of effective channel gain, we propose to use CIR [13] as an alternative linear precoding scheme in conjunction with user selection algorithms, for example, SUS [10]. Rather than performing pseudoinverse on the channel matrix of selected users H(S) directly as conducted in ZF scheme, the CIR scheme adds a multiple of identity matrix before inverting to regularize an inverse for MIMO-BC channels.
Instead of performing Moore-Penrose pseudoinverse to obtain W(S) using (6), we compute the CIR precoding matrix as

$$W(S) = H(S)^H \left( H(S) H(S)^H + \beta I \right)^{-1}, \tag{9}$$
where I is an S × S identity matrix and β ≥ 0 is a load factor introduced for controlling the amount of interference allowed. Clearly, by choosing β sufficiently large, the inverse of a poorly conditioned H(S) in (6) can be made to behave as well as desired.
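The regularizing effect of β can be made explicit with a short worked derivation. It assumes that the CIR precoder has the form $H(S)^H (H(S) H(S)^H + \beta I)^{-1}$ and uses the thin singular value decomposition of H(S); the symbols U, Σ, V, and $\sigma_i$ are introduced here for illustration only.

```latex
% Thin SVD of the selected channel matrix (|S| <= M), with singular values sigma_i >= 0,
% U unitary of size |S| x |S|, and V of size M x |S| with orthonormal columns
H(S) = U \Sigma V^H, \qquad \Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_{|S|})

% CIR precoder expressed in the SVD basis
W(S) = H(S)^H \left( H(S) H(S)^H + \beta I \right)^{-1}
     = V \Sigma U^H \left( U (\Sigma^2 + \beta I) U^H \right)^{-1}
     = V \, \mathrm{diag}\!\left( \frac{\sigma_i}{\sigma_i^2 + \beta} \right) U^H

% beta = 0 gives the factors 1/sigma_i, which grow without bound as sigma_i -> 0.
% For beta > 0, sigma_i^2 + beta >= 2 sigma_i sqrt(beta), hence
\frac{\sigma_i}{\sigma_i^2 + \beta} \le \frac{1}{2\sqrt{\beta}}
```

In this view, each singular direction of the precoder is scaled by a factor that is bounded for any β > 0, whereas the plain inverse amplifies the directions associated with small singular values.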
Note that in the case of β = 0, (9) is equivalent to the simple channel inverse of the ZF precoding scheme, given by the pseudoinverse in (6). The amount of interference increases with β, and can be controlled by β as desired. One reasonable choice of β, which maximizes the SINR at the receiver under the assumption of equal power across users, is given in [13].

EURASIP Journal on Advances in Signal Processing
Substituting the CIR precoding vector $w_i \in W(S)$ in (9) into (4), we can obtain the achievable sum-rate of MIMO-BC channels with CIR as the linear precoding scheme as
$$R_{CIR}(S) = \max_{P_i \ge 0,\ \sum_{i \in S} \|w_i\|^2 P_i \le P}\ \sum_{i \in S} \log_2\left(1 + \frac{P_i |h_i w_i|^2}{1 + \sum_{j \in S,\, j \ne i} P_j |h_i w_j|^2}\right). \tag{11}$$
It is worth noting that when β > 0, the zero-interference condition $h_i w_j = 0$, for all $i \ne j$, is no longer satisfied with the CIR precoding scheme. In other words, the received signal at user i contains the interference from other users, denoted by $\sum_{j \in S,\, j \ne i} P_j |h_i w_j|^2$. This residual interference from other users makes the optimization problem (11) dramatically different from the convex optimization problem (7) resulting from the ZF precoding scheme. In fact, the optimization problem in (11) cannot be reduced to a simple convex optimization problem, and is essentially a nonlinear nonconvex optimization problem with many local optima. Unlike the closed-form water-filling solution of the convex optimization problem (7), the optimum power allocation for the nonlinear nonconvex optimization problem (11) defies a closed-form solution, and requires a constrained numerical optimization algorithm to find the optimal power allocation for the selected user subset S.
The global optimum of nonconvex optimization problem generally cannot be found by conventional optimization techniques which are only capable of finding one of the local optima. To solve the nonconvex optimization problem (11) globally, we can resort to global optimization techniques that are developed to solve globally multiextreme optimization problems arising from important practical applications [16]. Yet the successful application of global optimization methods to solve practical problems depends heavily on exploiting the specific structure of a problem.

Global d.c. optimization approach
The objective function in optimization problem (11) can be decomposed into the following form:
$$R(P) = g(P) - f(P),$$
where
$$f(P) = -\sum_{i \in S} \log_2\left(1 + \sum_{j \in S} P_j |h_i w_j|^2\right), \qquad g(P) = -\sum_{i \in S} \log_2\left(1 + \sum_{j \in S,\, j \ne i} P_j |h_i w_j|^2\right),$$
and $P = [P_1, \ldots, P_{|S|}]^T$ is the power allocation vector for the selected users. Because the sum of convex functions is convex, $g(P) = -\sum_{i \in S} \log_2(u_i(P))$ is convex if $-\log_2(u_i(P))$ is convex, that is, if $\log_2(u_i(P))$ is concave, where $u_i(P) = 1 + \sum_{j \in S,\, j \ne i} P_j |h_i w_j|^2$. Since $\log_2 x$ is a nondecreasing concave function, and $u_i(P)$ is concave because it is an affine function of P, $\log_2(u_i(P))$ is concave: the composition $w = v \circ u$ is concave if v is concave and nondecreasing and u is concave [17]. Similarly, we can show that f(P) is also convex. Hence, −R(P) = f(P) − g(P) is a d.c. function. It follows that
$$\max_{P \in D} R(P) \equiv \min_{P \in D} \left( -R(P) \right) = \min_{P \in D} \left[ f(P) - g(P) \right], \tag{14}$$
where $D = \{ P \in \mathbb{R}^{|S|} : 0 \le P_i,\ i \in S,\ \text{and } \sum_{i \in S} \|w_i\|^2 P_i \le P \}$ is a convex set in $\mathbb{R}^{|S|}$. Thus, max R(P) in (11) is a global d.c. optimization problem [16] similar to that formulated for optimal spectrum balancing (OSB) in digital subscriber line (DSL) interference channels [14]. This particular class of d.c. optimization problems with only linear constraints can be transformed into equivalent global concave minimization problems [18]. The prismatic branch and bound (PBnB) algorithm was introduced in [14] for solving the concave minimization formulation of OSB in DSL; it finds the global optimum efficiently by solving only a sequence of linear programming (LP) subproblems. The detailed description of the PBnB algorithm was presented in [14] and is omitted in this paper for the sake of brevity.
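The d.c. structure can be checked numerically with a small sketch. It assumes that f collects the terms with signal plus interference plus noise and g the terms with interference plus noise only, so that −R(P) = f(P) − g(P). The names `dc_terms` and `rate`, and the gain matrix G with entries $|h_i w_j|^2$, are hypothetical.

```python
import math

def dc_terms(G, P):
    """Return (f, g) such that -R(P) = f(P) - g(P).

    G[i][j] stands for |h_i w_j|^2 (effective gain of stream j at user i),
    P[j] is the power of stream j, and the noise variance is 1.
    """
    n = len(P)
    f = 0.0
    g = 0.0
    for i in range(n):
        # u_i(P): noise plus interference seen by user i
        u_i = 1.0 + sum(P[j] * G[i][j] for j in range(n) if j != i)
        f -= math.log2(u_i + P[i] * G[i][i])  # noise + interference + signal
        g -= math.log2(u_i)                   # noise + interference only
    return f, g

def rate(G, P):
    """Sum-rate computed directly from the per-user SINR."""
    n = len(P)
    total = 0.0
    for i in range(n):
        interference = sum(P[j] * G[i][j] for j in range(n) if j != i)
        total += math.log2(1.0 + P[i] * G[i][i] / (1.0 + interference))
    return total

G = [[1.0, 0.2], [0.3, 2.0]]
P = [1.0, 0.5]
f, g = dc_terms(G, P)
print(-rate(G, P), f - g)  # the two values agree up to rounding
```

Both f and g are negative sums of logarithms of affine functions of P, which is why each is convex on its own while their difference generally is not.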

Local gradient projection approach
To solve the resulting nonlinear optimization problem of the CIR scheme efficiently, we propose to use the local gradient-projection (GP) method [19], which is an extension of the unconstrained steepest descent method that is particularly suitable for constrained optimization problems with a convex feasible region. Since the feasible region of our particular power allocation optimization problem (11) is a geometrically simple simplex, the GP method can find the optimum solution of optimization problem (11) very efficiently.
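Recall that the feasible region of optimization problem (11) is defined by
$$S(P) = \left\{ P : P_i \ge 0,\ i \in S,\ \sum_{i \in S} \|w_i\|^2 P_i \le P \right\}.$$
In order to apply the GP method to our particular optimization problem in (11), we have to first show how to perform the projection onto the constraint set given by the simplex S(P) and how to compute the gradient of the objective function in (11).
A point outside the simplex, that is, one with $\sum_{i \in S} \|w_i\|^2 P_i > P$, is projected onto the boundary hyperplane $\sum_{i \in S} \|w_i\|^2 P_i = P$. Then, the projection of a point X onto this hyperplane is
$$X - \frac{a^T X - P}{\|a\|^2}\, a, \qquad a = \left[ \|w_1\|^2, \ldots, \|w_{|S|}\|^2 \right]^T.$$
The gradient $g_i = \partial R(P) / \partial P_i$ is derived as
$$g_i = \frac{1}{\ln 2} \left[ \sum_{k \in S} \frac{|h_k w_i|^2}{1 + \sum_{j \in S} P_j |h_k w_j|^2} - \sum_{k \in S,\, k \ne i} \frac{|h_k w_i|^2}{1 + \sum_{j \in S,\, j \ne k} P_j |h_k w_j|^2} \right]. \tag{18}$$
The proposed algorithm using the GP method is presented in Algorithm 1.
Y. Xu and T. Le-Ngoc
Algorithm 1: Optimal power allocation based on GP method.
Initialization: [...]
(1) Compute the gradient $g_i^l$ using (18), i ∈ S
(2) $\tilde{P}_i^l = P_i^l + s_i g_i^l$, i ∈ S
(3) $\bar{P}_i^l$ obtained by projecting $\tilde{P}_i^l$ onto S(P), i ∈ S
(4) Apply the Armijo rule along the feasible direction to compute the step size $\delta^l$
(5) $P_i^{l+1} = P_i^l + \delta^l \left( \bar{P}_i^l - P_i^l \right)$, i ∈ S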
The GP algorithm is guaranteed to converge to a stationary point [19]. In most of the numerical simulations we conducted for CIR precoding with optimal power control, the GP algorithm converged to a local maximum in fewer than 40 iterations.
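The iteration can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not a reproduction of the procedure above: the projection is the exact Euclidean projection onto {p ≥ 0, Σ a_i p_i ≤ budget} computed by bisection on a Lagrange multiplier, the starting point splits the budget equally, and the parameters `s`, `sigma`, and `shrink` are hypothetical defaults. G[k][i] stands for $|h_k w_i|^2$ and a[i] for $\|w_i\|^2$.

```python
import math

def rate_and_gradient(G, P):
    """Sum-rate R(P) in bits and its gradient with respect to P."""
    n = len(P)
    t = [1.0 + sum(G[k][j] * P[j] for j in range(n)) for k in range(n)]  # noise + all streams
    u = [t[k] - G[k][k] * P[k] for k in range(n)]                        # noise + interference
    R = sum(math.log2(t[k] / u[k]) for k in range(n))
    grad = []
    for i in range(n):
        d = sum(G[k][i] / t[k] for k in range(n))
        d -= sum(G[k][i] / u[k] for k in range(n) if k != i)
        grad.append(d / math.log(2.0))
    return R, grad

def project(x, a, budget):
    """Euclidean projection of x onto {p >= 0, sum(a_i * p_i) <= budget}, a_i > 0."""
    p = [max(v, 0.0) for v in x]
    if sum(ai * pi for ai, pi in zip(a, p)) <= budget:
        return p
    # Otherwise p_i = max(x_i - lam * a_i, 0) with lam > 0 found by bisection.
    lo, hi = 0.0, max(v / ai for v, ai in zip(x, a))
    for _ in range(100):
        lam = 0.5 * (lo + hi)
        used = sum(ai * max(v - lam * ai, 0.0) for v, ai in zip(x, a))
        if used > budget:
            lo = lam
        else:
            hi = lam
    return [max(v - hi * ai, 0.0) for v, ai in zip(x, a)]

def gradient_projection(G, a, budget, s=1.0, sigma=1e-4, shrink=0.5,
                        max_iter=200, tol=1e-9):
    """Gradient projection with an Armijo rule along the feasible direction."""
    n = len(a)
    P = [budget / (n * ai) for ai in a]  # feasible start (an assumption)
    for _ in range(max_iter):
        R, grad = rate_and_gradient(G, P)
        P_bar = project([p + s * g for p, g in zip(P, grad)], a, budget)
        d = [pb - p for pb, p in zip(P_bar, P)]        # feasible direction
        slope = sum(g * di for g, di in zip(grad, d))  # directional derivative
        if slope < tol:
            break  # (approximately) stationary point
        delta, accepted = 1.0, False
        for _ in range(60):
            trial = [p + delta * di for p, di in zip(P, d)]
            if rate_and_gradient(G, trial)[0] >= R + sigma * delta * slope:
                accepted = True
                break
            delta *= shrink
        if not accepted:
            break
        P = trial
    return P

# Without interference the result should agree with water-filling.
print(gradient_projection([[2.0, 0.0], [0.0, 1.0]], [1.0, 1.0], 1.0))
```

Because every trial point is a convex combination of two feasible points, the iterates remain inside the simplex, and the Armijo test ensures that the sum-rate does not decrease from one iteration to the next.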

ASYMPTOTIC ANALYSIS
In this section, we first study the asymptotic sum-rate performance of our proposed CIR precoding in conjunction with SUS user selection algorithm in MIMO-BC channels in the limit of K → ∞. Then, we show that the proposed local GP algorithm can achieve the global optimum of the nonconvex optimization problem (11) as K → ∞. Throughout the proof in this section, we assume that |S| = M is almost surely true with sufficiently large K as argued in [10].

Asymptotic sum-rate analysis
It has been shown in [10] that the ZF precoding scheme with the SUS algorithm has the same asymptotic sum-rate as that of DPC. Thus, we only need to prove that the CIR precoding vectors W(S) are equivalent to the ZF precoding vectors as K → ∞. We define the measure of orthogonality between two arbitrary row vectors a and b as
$$\rho(a, b) = \frac{|a b^H|}{\|a\| \, \|b\|}.$$
Clearly, as a and b become nearly orthogonal to each other, ρ(a, b) approaches zero.
To maximize the sum-rate using the ZF precoding scheme for MIMO-BC channels, the SUS user selection algorithm in [10] chooses a group of users with nearly orthogonal channel responses and with sufficiently large channel magnitudes.
The SUS algorithm proposed in [10] introduces a small positive constant α to control the degree of semiorthogonality among the selected users. More specifically, at iteration number i, user k is considered in the next iteration if and only if $\rho(g_i, h_k) < \alpha$ is satisfied, where $g_i$ is the orthogonal vector derived from $h_k$ by using the Gram-Schmidt procedure [10]. Using some simple but tedious algebraic operations, we can show that $\rho(g_i, h_k) < \alpha$ implies an α-orthogonal H(S). A set of selected channel vectors H(S) is called ε-orthogonal if $\rho(h_i, h_j) \le \varepsilon$ for all i, j ∈ S, $i \ne j$. It can be shown that, for an arbitrarily small positive ε, an ε-orthogonal H(S) can be constructed by the SUS algorithm with probability one for a sufficiently large number of users K [10]. Hence, the channel vectors of the selected users obtained by the SUS algorithm become increasingly orthogonal as K increases, and eventually these M vectors form an orthogonal basis for the space $\mathbb{C}^M$ in the limit of K → ∞.
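The ε-orthogonality test can be illustrated with a short sketch. It assumes that ρ is the normalized inner product $|a b^H| / (\|a\| \|b\|)$ of two row vectors; the helper names `rho` and `is_eps_orthogonal` are hypothetical, and the sketch covers only the pairwise check, not the SUS selection procedure itself.

```python
import math

def rho(a, b):
    """Normalized inner product |a b^H| / (||a|| ||b||) of two row vectors."""
    inner = sum(x * y.conjugate() for x, y in zip(a, b))
    norm_a = math.sqrt(sum(abs(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(abs(y) ** 2 for y in b))
    return abs(inner) / (norm_a * norm_b)

def is_eps_orthogonal(H, eps):
    """True if every pair of distinct rows of H satisfies rho <= eps."""
    n = len(H)
    # rho is symmetric, so checking each unordered pair once is enough.
    return all(rho(H[i], H[j]) <= eps for i in range(n) for j in range(i + 1, n))

H = [[1.0, 0.1], [0.0, 1.0]]
print(rho(H[0], H[1]))
print(is_eps_orthogonal(H, 0.2), is_eps_orthogonal(H, 0.05))
```

The measure is invariant to the scaling of either vector, so it captures only the angle between two channels and not their magnitudes.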
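Recall that the CIR precoding matrix is computed by
$$W_{CIR}(S) = H(S)^H \left( H(S) H(S)^H + \beta I \right)^{-1}. \tag{20}$$
Compared with the ZF precoding matrix,
$$W_{ZF}(S) = H(S)^H \left( H(S) H(S)^H \right)^{-1},$$
the CIR precoding matrix differs only in the regularization term βI. When the rows of H(S) are orthogonal, $H(S) H(S)^H$ is diagonal with entries $\|h_i\|^2$, so the ith column of $W_{CIR}(S)$ is $h_i^H / (\|h_i\|^2 + \beta)$ and the ith column of $W_{ZF}(S)$ is $h_i^H / \|h_i\|^2$. It is then evident that the normalized precoding vectors of both the CIR and ZF schemes are exactly the same. Therefore, the expected sum-rate performance of CIR is equivalent to that of the ZF precoding scheme as K → ∞. Since the ZF precoding scheme was shown in [10] to achieve asymptotically the optimum sum-rate, the proposed CIR precoding with SUS can thus asymptotically achieve the optimum sum-rate capacity equal to that of the optimum DPC strategy as K → ∞.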

Asymptotic convexity analysis
The asymptotic convexity analysis for the nonconvex optimization problem (11) reveals that the objective −R(P) of the equivalent minimization problem can be regarded as a convex function of the users' power allocation vector P when K → ∞. The d.c. representation in (14) establishes the equivalence between the original sum-rate maximization problem (11) and its d.c. formulation, that is, max R(P) ≡ min(−R(P)) = min[f(P) − g(P)]. As previously discussed, H(S) obtained by SUS can be made as close to orthogonal as desired with sufficiently large K. In other words, the CIR precoding vectors $W_{CIR}(S)$ given in (20) are just a scaled rotation of H(S) in the limit of K → ∞. Hence, the term g(P) can be bounded through a chain of relations labeled (a) to (c), where $x \sim y$ indicates that $\lim_{K \to \infty} x / y = 1$. In (a), we use $W_{CIR}(S)$ in (20); (b) follows from $\log_2(1 + x) \sim x \log_2 e$ as x → 0; inequality (c) is obtained from $\rho(h_i, h_j) \le \varepsilon$ for all i, j ∈ S, $i \ne j$, from $P_i \le P$, and from replacing the denominator term $(\|h_j\|^2 + \beta)$ with a smaller quantity. Therefore, the nonconvex component −g(P) can be essentially ignored in the d.c. formulation −R(P) = f(P) − g(P). As f(P) was shown to be convex, the optimization problem max R(P) ≡ min(−R(P)) in (14) can be regarded as a convex optimization problem as K → ∞. This implies that the local optimum obtained by the GP method is actually the global optimum of the optimization problem (11), because standard optimization techniques for finding local solutions yield the global optimum of a convex optimization problem [17]. Moreover, it provides the theoretical justification for using the local GP method to solve our sum-rate maximization problem, as the performance loss compared to that of the global d.c. approach is negligible while the reduction in computational complexity is significant.

NUMERICAL RESULTS
In this section, illustrative numerical sum-rate results of the CIR-SUS scheme are presented to evaluate and compare its performance with that of the ZF-SUS scheme in MIMO-BC channels. In all simulations, the number of transmit antennas is M = 4, and α = 0.4 is used for controlling semiorthogonality [10] in all SUS algorithms (ZF-SUS, CIR-SUS-GP, and CIR-SUS-DC). As a result, all SUS algorithms operate on the same, fixed selected user subsets. All the plots are obtained by averaging over 500 independent channel realizations.
Figure 1 first compares the sum-rate performance versus the number of users in the system for DPC, the ZF-SUS scheme, the proposed CIR-SUS scheme with the local GP method, and CIR-SUS with the global d.c. method. The SNR is set to 10 dB, which is equal to the transmit power P = 10 dB. It shows that the proposed scheme can achieve a higher sum-rate than ZF with the same selected user subset obtained by the SUS algorithm. Note that the gain of CIR over ZF precoding decreases with the number of users. This reduction occurs because the channel matrix of the selected user subset tends to be closer to orthogonal with a higher number of users. As a result, the effective channel gain reduction encountered by ZF becomes less prominent when the number of users is larger.
Figure 1 also shows that CIR-SUS with the global d.c. optimization approach leads to only a slight ergodic sum-rate gain over CIR-SUS with the local GP method when the number of users is small. Furthermore, the gain offered by the global d.c. approach quickly diminishes as the number of users becomes large. The simulation results support our claim in Section 4.2 that the nonconvex optimization problem can be essentially regarded as a convex optimization problem with a sufficiently large number of users. Hence, in the simulation results presented in Figures 2 and 3, the sum-rate performance of CIR-SUS is obtained by using the local GP optimization method.
Figure 2 compares the sum-rate performance of CIR-SUS and ZF-SUS versus the number of users K at three different levels of the base station transmission power, that is, P = 0 dB, 5 dB, and 10 dB. Figure 2 shows that the gain of CIR over ZF increases with lower transmission power. This is not surprising because the rate function log(1 + x) ≈ x when x is small. In other words, when transmission power is low, the rate function behaves like a linear function rather than as a logarithmic function. Hence, any reduction of effective channel gain experienced by ZF is directly translated into a linear loss of sum-rate when SNR is low.
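The low-SNR argument can be made concrete with a short worked approximation. It assumes a per-user rate of the form $\log_2(1 + \gamma_i P_i)$ and introduces a hypothetical factor c, with 0 < c ≤ 1, by which the effective channel gain is reduced.

```latex
% Low SNR (gamma_i P_i << 1): the rate is approximately linear in the effective gain
\log_2(1 + \gamma_i P_i) \approx \gamma_i P_i \log_2 e
\quad \Rightarrow \quad
\frac{\log_2(1 + c\,\gamma_i P_i)}{\log_2(1 + \gamma_i P_i)} \approx c

% High SNR (gamma_i P_i >> 1): the same reduction costs only an additive offset
\log_2(1 + c\,\gamma_i P_i) \approx \log_2(\gamma_i P_i) - \log_2\frac{1}{c}
```

A gain reduction by a factor c therefore scales the low-SNR rate by roughly the same factor, whereas at high SNR it subtracts a fixed number of bits that becomes small relative to the total rate.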
The sum-rate performance of CIR-SUS and ZF-SUS versus the transmission power is compared in Figure 3 for both the cases of 20 users and 50 users. Clearly, the gain of CIR over ZF increases when the number of users K is smaller or the transmit power is lower.

CONCLUSIONS
In this paper, for MIMO-BC channels, we proposed to use CIR as the precoding scheme in conjunction with a user selection algorithm to alleviate the transmission power penalty incurred by ZF. Unlike with the interference-free ZF, the optimization problem of maximizing the sum-rate capacity using CIR precoding is nonlinear and nonconvex. A local GP method is then proposed to solve this optimization problem efficiently, with negligible sum-rate performance loss compared to the global d.c. optimization technique but with much less computational complexity. Asymptotic analysis shows that the proposed CIR precoding scheme in conjunction with a user selection algorithm can achieve the optimum ergodic sum-rate equal to that of DPC in the limit of a large number of users. Simulation results under various conditions show that the proposed CIR precoding scheme with optimal power allocation achieves better sum-rate performance than ZF in MIMO-BC channels, especially in the practical range of the number of users, for example, K ≤ 100, and at relatively low SNR.