A limited feedback scheme for massive MIMO systems based on principal component analysis
EURASIP Journal on Advances in Signal Processing, volume 2016, Article number: 64 (2016)
Abstract
Massive multiple-input multiple-output (MIMO) is becoming a key technology for future 5G cellular networks. Channel feedback for massive MIMO is challenging due to the substantially increased dimension of the channel matrix. This motivates us to explore a novel feedback reduction scheme based on the theory of principal component analysis (PCA). The proposed PCA-based feedback scheme exploits the spatial correlation characteristics of massive MIMO channels, which arise because the transmit antennas are deployed compactly at the base station (BS). In the proposed scheme, the mobile station (MS) generates a compression matrix by applying PCA to the channel state information (CSI) observed over a long-term period, and uses the compression matrix to compress the spatially correlated high-dimensional CSI into a low-dimensional representation. The compressed low-dimensional CSI is then fed back to the BS in every short-term period. To allow the BS to recover the high-dimensional CSI, the compression matrix is refreshed and fed back from the MS to the BS every long-term period. The information distortion of the proposed scheme is also investigated, and a closed-form expression for an upper bound on the normalized information distortion is derived. The overhead analysis and numerical results show that the proposed scheme offers a worthwhile trade-off between system capacity and implementation complexity, including feedback overhead and codebook search complexity.
Introduction
The massive multiple-input multiple-output (MIMO) system, which deploys a large number of transmit antennas at the base station (BS), has been listed as one of the key techniques for fifth-generation (5G) cellular networks [1]. The deployment of numerous antennas enables massive MIMO systems to achieve not only higher system capacity, but also higher spectrum and energy efficiency than conventional MIMO systems [2, 3].
The superior performance of massive MIMO systems relies on spatial multiplexing and low multiuser interference. As is the case for conventional MIMO systems, this in turn requires the BS to have perfect knowledge of the downlink channel state information (CSI) [4]. In a time division duplexing (TDD) system, channel reciprocity can be exploited to acquire the downlink CSI at the BS [5]. However, things become more challenging when the system operates in frequency division duplexing (FDD) mode, where channel reciprocity no longer holds. In that case, a mobile station (MS) needs to feed back the downlink CSI through a rate-limited uplink channel. The authors in [6] concluded that the required feedback rate per user should increase in proportion to the number of transmit antennas in order to obtain the full multiplexing gain. Feedback overhead therefore becomes a key challenge in massive MIMO systems.
The foundation of the works on feedback overhead reduction for MIMO systems is the correlation feature of MIMO channels. Limited feedback techniques for correlated MIMO channels were designed in [7–9]. A modified Grassmannian line packing codebook was proposed in [7], and the authors in [8, 9] rotated the codebook for i.i.d. channels with a unitary matrix to obtain the codebook for correlated MIMO channels. A systematic codebook was designed for quantized beamforming in [10], which was implemented by maps that can rotate and scale spherical caps on the Grassmannian manifold.
Furthermore, a codebook for uniform rectangular arrays (URA) for massive MIMO antennas was designed in [11, 12]. It was derived from the Kronecker product of two uniform linear array (ULA) codebooks. The authors in [13] proposed a feedback framework for FDD massive MIMO systems that divides the coverage area into subsectors, where each subsector is formed by a set of narrow beams that covers a preassigned area in azimuth and elevation. Noncoherent trellis-coded quantization and trellis-extended codebooks for massive MIMO systems were proposed in [14, 15], which exploited a Viterbi decoder for CSI quantization and a convolutional encoder for CSI reconstruction. A projection-based feedback compression was used in [16] to project the high-dimensional channel space into a lower-dimensional subspace. However, [16] did not explain how to feed back the projection matrix.
Compressive sensing (CS)-based limited feedback schemes for massive MIMO were proposed to reduce the feedback overhead by exploiting the spatial correlation of CSI [17–20]. The authors in [17] introduced CS to massive MIMO for limited feedback. Their key insight was that closely packed massive antenna arrays exhibit strong spatial correlation, so channel vectors can be represented in sparse form in the spatial-frequency domain. Subsequently, a compressed analog feedback strategy for spatially correlated massive MIMO channels was proposed in [18]. In contrast to the strategy in [18], in [19] and [20] the low-dimensional CSI was quantized with a codebook and the preferred index was fed back.
The choice of orthogonal basis, which is intended for the sparse representation of the original signal, plays an important role in the recovery of the original high-dimensional signal at the BS. Two kinds of orthogonal basis construction, the discrete cosine transform (DCT) and the Karhunen-Loève transform (KLT), are usually employed [21]. If the channel correlation matrix is known at neither the MS nor the BS, the signal-independent DCT basis is the better option. On the one hand, because of its signal-independent nature, using the DCT basis does not require the MS to inform the BS of the channel correlation matrix. On the other hand, this makes the DCT basis incapable of tracking real-time changes of the channel state, which has a negative effect on system capacity. In contrast, the KLT basis adapts well to CSI changes. Therefore, when the MS and the BS both know the instantaneous channel correlation matrix, the KLT basis can provide the optimal sparse representation, which promises accurate recovery even if only a small number of measurements are available. Unfortunately, the signal-dependent nature of the KLT basis requires the MS to feed back the channel correlation matrix instantaneously [22]. This can hardly be implemented in practical systems because of the heavy feedback overhead.
In this case, principal component analysis (PCA) can offer a trade-off between system capacity and practical implementation [23, 24]. Compared with a DCT basis, PCA adapts better to changes of the channel state, since PCA is signal-dependent [23]. This gives PCA better system capacity than a DCT basis. Compared with a KLT basis, PCA only requires the MS and the BS to know the channel correlation matrix over a long-term period. This gives PCA a much lower feedback overhead than a KLT basis. Moreover, the most attractive characteristic of PCA is that it is effective for dimensionality reduction of high-dimensional data whose elements are correlated [24]. PCA therefore has great potential for compressing high-dimensional CSI with strong spatial correlation, so as to reduce feedback overhead in massive MIMO systems. To the best of our knowledge, no existing work addresses a practical feedback scheme based on PCA.
This paper proposes a PCA-based feedback scheme for massive MIMO systems. In the proposed scheme, the MS uses a compression matrix, obtained by applying PCA to the CSI observed over a long-term period, to compress spatially correlated high-dimensional CSI into a low-dimensional representation. After the low-dimensional CSI is quantized with a random vector quantization (RVQ) codebook, the index of the preferred codeword is fed back to the BS in each short-term period. In order to track channel changes and enable the BS to recover the high-dimensional CSI, the MS must refresh and feed back the compression matrix every long-term period. Through the dimensionality reduction performed by PCA, both feedback overhead and codebook search complexity can be reduced. The contributions of the paper are summarized as follows.

A PCA-based feedback scheme for FDD massive MIMO systems is proposed. The operation procedures at the BS and the MS are divided into two types: long-term period operations and short-term period operations. The exact operation procedures at both the BS and the MS, as well as the derivation of the compression matrix at the MS, are presented. The distortion of the proposed scheme is analyzed, and an upper bound on the normalized distortion is derived.

System performance comparisons of the PCA-based feedback scheme, the DCT-based CS scheme and the KLT-based CS scheme are presented. The feedback overhead and the codebook search complexity are analyzed, and the system capacity is simulated. Taking the simulation results and the feedback overhead analysis together, we conclude that the proposed scheme achieves a compromise between system capacity and implementation complexity (feedback overhead and codebook search complexity).
The remainder of this paper is organized as follows. In Section 2, the massive MIMO system model is described. Section 3 first reviews the PCA method itself in Subsection 3.1 and then provides the details of the proposed scheme in Subsection 3.2. Moreover, the distortion of the proposed scheme is analyzed in Subsection 3.3. The comparison of feedback overhead and codebook search complexity and the numerical results follow in Section 4 and Section 5, respectively. Finally, the conclusion of this paper is presented in Section 6.
Notation: Throughout this paper, uppercase and lowercase boldface letters denote a matrix A and a vector a, respectively. We denote the transpose and the conjugate transpose of matrix A (or vector a) by A ^{T} (a ^{T}) and A ^{H} (a ^{H}), respectively. In addition, A ^{−1} denotes the inverse of a square matrix A.
System model
We consider a single-cell downlink massive MIMO system in which the BS, equipped with N _{t} antennas, serves K single-antenna MSs.
Spatially correlated massive MIMO channel
A massive MIMO broadcast channel is modeled in this section. For simplicity, but without loss of generality, a large-scale uniform linear array (ULA) with a large number of compactly deployed antenna elements is assumed. The massive MIMO channel exhibits spatial correlation because of the insufficient inter-element spacing. Additionally, a poor scattering environment may also contribute to the spatial correlation. Unlike previous works, which consider either insufficient inter-element spacing or a poor scattering environment alone, this paper combines the well-known Kronecker correlation model [25] with the geometrical one-ring model [26, 27], so as to describe the spatial correlation of the massive MIMO channel more precisely. Since the MS is equipped with a single antenna, the channel between the k ^{th} MS and the BS is denoted by a 1×N _{t} row vector h _{ k } (k=1,2,…,K). Based on the Kronecker correlation model, h _{ k } can be modeled as
where $\textbf {R}_{\text {Tx}}^{\frac {1}{2}}$ is the square root of the correlation matrix at the transmitter, which captures the impact of insufficient inter-element spacing, and h _{one−ring} is derived from the one-ring model, which describes the spatial correlation caused by the scattering environment. Note that the correlation of the channel is time-varying, due to changes in both the relative positions of the scatterers and the correlation matrix at the transmitter.
In more detail, the entry in the u ^{th} row and the v ^{th} column of R _{Tx} (the correlation coefficient between the u ^{th} and the v ^{th} elements of the BS transmit antenna array) follows the correlation model given by the zeroth-order Bessel function of the first kind [18], that is
where d _{ uv } is the distance between the two antenna elements and λ denotes the carrier wavelength.
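To make the correlation model concrete, the following sketch builds R _{Tx} for a ULA. It assumes the common form $[\mathbf{R}_{\text{Tx}}]_{u,v} = J_{0}(2\pi d_{uv}/\lambda)$ for the Bessel model, which is an assumption rather than a quotation of (2). The function names, the power-series evaluation of $J_{0}$ and the quarter-wavelength spacing are illustrative choices, not the paper's settings.

```python
import math

def bessel_j0(x, terms=60):
    """Zeroth-order Bessel function of the first kind via its power series.

    Adequate for the moderate arguments used here; not a general-purpose
    implementation for large x.
    """
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -(x * x / 4.0) / ((k + 1) ** 2)
    return total

def tx_correlation(n_antennas, spacing_in_wavelengths):
    """Assumed model: R[u][v] = J0(2*pi*d_uv/lambda) for a uniformly spaced ULA."""
    R = []
    for u in range(n_antennas):
        row = []
        for v in range(n_antennas):
            d_over_lambda = abs(u - v) * spacing_in_wavelengths
            row.append(bessel_j0(2.0 * math.pi * d_over_lambda))
        R.append(row)
    return R

# Hypothetical example: 4 elements spaced a quarter wavelength apart.
R = tx_correlation(4, 0.25)
for row in R:
    print(["%.3f" % value for value in row])
```

Reducing the spacing pushes the off-diagonal entries toward 1, which is the strong spatial correlation that the compression scheme relies on.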
As for the one-ring model, we assume that each MS is surrounded by Q scatterers, which are uniformly distributed on a circle of radius r, as shown in Fig. 1. Then h _{one−ring} can be given as follows [27],
In (3), h _{ kq } is the channel vector of the MS k over the q ^{th} scattering path, as given by
where d _{ kqm } is the distance between the q ^{th} scatterer of the k ^{th} MS and the m ^{th} (m=1,2,…,N _{t}) BS antenna, while d _{ kqm }+r denotes the path length from the k ^{th} MS to the m ^{th} antenna via the q ^{th} path. Also, $e^{j\varphi_{kq}}$ represents the random common phase resulting from either random perturbations of the MS location or the phase shift due to reflection at the scatterer, and β _{ kqm } denotes the path loss of the q ^{th} scattering path, which is modeled by
where α is a constant and γ is the path loss exponent.
Downlink signal model
In the downlink transmission, ${s_{k} \in \mathbb{C}}$ and ${\mathbf{w}_{k} \in \mathbb{C}^{N_{\mathrm{t}} \times 1}}$ denote the transmit signal with power constraint ${\mathbb{E}\left|s_{k}\right|^{2} = 1}$ and the column precoding vector intended for the k ^{th} MS, respectively. In this paper, zero-forcing precoding is adopted to eliminate multiuser interference [28]. Also, let n _{ k } be additive Gaussian noise with zero mean and unit variance at MS k. Then the received signal of the k ^{th} MS can be expressed as
where P _{t} is the total transmit power of the BS. Equal power allocation is assumed with ${\frac {{{P_{\mathrm {t}}}}}{K}}$ being the power distributed to each MS.
As seen in (6), y _{ k } contains two main terms. The first term is the desired signal, while the other is the interfering signal and noise. From (6), we can derive the system capacity as
Feedback scheme for massive MIMO
Review of principal component analysis
Suppose that there are a data samples, each of which contains b characteristics. The b characteristics are correlated with each other, which makes dimensionality reduction with PCA possible. For convenience of description, let the a×b matrix X denote the original data containing the a data samples. The key point of PCA is how to derive a b×l (l<b) compression matrix ${\bar {\boldsymbol {\Psi }}}$, which is used to compress the high-dimensional a×b data matrix X into a low-dimensional a×l matrix ${\bar {\mathbf {X}}}$ as follows,
in which ${\bar {\boldsymbol {\Psi }}}$ is composed of l dominating eigenvectors, the so-called principal components, which are selected from all b eigenvectors of the b×b covariance matrix of X.
For the sake of determining which components are to be selected, the concept of contribution rate is introduced. Consider a descending ordering of the b eigenvalues λ _{1},λ _{2}…,λ _{ b }. Then, the contribution rate of the g ^{th} eigenvalue λ _{ g } is defined as ${\frac {{{\lambda _{g}}}}{{{\sum \nolimits }_{g = 1}^{b} {{\lambda _{g}}} }}}$, while the cumulative contribution rate of the top l eigenvalues can be expressed by ${\frac {{{\sum \nolimits }_{g = 1}^{l} {{\lambda _{g}}} }}{{{\sum \nolimits }_{g = 1}^{b} {{\lambda _{g}}} }}}$. Generally, when the cumulative contribution rate of the chosen l principal components exceeds a certain level, the information loss is acceptable.
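The selection rule above can be sketched as follows. The threshold value and the example eigenvalues are hypothetical, the function name is illustrative, and treating "exceeds a certain level" as a greater-than-or-equal test is an assumption.

```python
def choose_num_components(eigenvalues, threshold):
    """Return the smallest l whose cumulative contribution rate reaches threshold.

    Also returns the cumulative contribution rate achieved by those l eigenvalues.
    """
    lam = sorted(eigenvalues, reverse=True)   # descending order, as in the text
    total = sum(lam)
    cumulative = 0.0
    for l, value in enumerate(lam, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return l, cumulative / total
    return len(lam), 1.0

# Hypothetical eigenvalues; the individual contribution rates are lambda_g / sum.
example = [5.5, 2.0, 1.5, 0.6, 0.2, 0.2]
l, rate = choose_num_components(example, threshold=0.95)
print(l, round(rate, 3))
```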
Finally, the original X can be recovered from ${\bar {\mathbf {X}}}$ by
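The review above can be sketched in code under a few stated assumptions: real-valued data, compression of the form $\bar{\mathbf{X}} = \mathbf{X}\bar{\boldsymbol{\Psi}}$, recovery of the form $\mathbf{X} \approx \bar{\mathbf{X}}\bar{\boldsymbol{\Psi}}^{T}$, and eigenvectors taken from the uncentered second-moment matrix $\frac{1}{a}\mathbf{X}^{T}\mathbf{X}$. The Jacobi eigen-solver and all function names are illustrative stand-ins chosen to keep the example free of external libraries.

```python
import math
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

def jacobi_eigh(C, sweeps=100, tol=1e-20):
    """Eigen-decomposition of a real symmetric matrix by cyclic Jacobi rotations.

    Returns eigenvalues in descending order and the matching eigenvectors as columns.
    """
    n = len(C)
    A = [row[:] for row in C]
    V = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                theta = 0.5 * math.atan2(2.0 * A[p][q], A[q][q] - A[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for mat in (A, V):            # right-multiply by the rotation
                    for k in range(n):
                        mp, mq = mat[k][p], mat[k][q]
                        mat[k][p], mat[k][q] = c * mp - s * mq, s * mp + c * mq
                for k in range(n):            # left-multiply A by its transpose
                    ap, aq = A[p][k], A[q][k]
                    A[p][k], A[q][k] = c * ap - s * aq, s * ap + c * aq
    order = sorted(range(n), key=lambda i: -A[i][i])
    return [A[i][i] for i in order], [[V[r][i] for i in order] for r in range(n)]

def pca_compress(X, l):
    """Compress a x b data X into a x l using the l dominating eigenvectors."""
    a = len(X)
    Xt = [list(col) for col in zip(*X)]
    C = [[v / a for v in row] for row in matmul(Xt, X)]   # b x b second-moment matrix
    eigvals, eigvecs = jacobi_eigh(C)
    Psi = [row[:l] for row in eigvecs]                    # b x l compression matrix
    return matmul(X, Psi), Psi, eigvals

def pca_recover(X_bar, Psi):
    """Approximate recovery: X_bar times the transpose of the compression matrix."""
    return matmul(X_bar, [list(col) for col in zip(*Psi)])

# Hypothetical strongly correlated data: 200 samples with 4 characteristics each.
random.seed(0)
X = []
for _ in range(200):
    g = random.gauss(0.0, 1.0)
    X.append([g * w + random.gauss(0.0, 0.05) for w in (1.0, 0.9, 0.8, 0.7)])
X_bar, Psi, eigvals = pca_compress(X, l=1)
X_rec = pca_recover(X_bar, Psi)
mse = sum((x - y) ** 2 for rx, ry in zip(X, X_rec) for x, y in zip(rx, ry)) / len(X)
print(mse, sum(eigvals[1:]))
```

In this setting the per-sample squared reconstruction error equals the sum of the discarded eigenvalues, which is why a high cumulative contribution rate implies low information loss.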
Proposed PCAbased feedback scheme
A PCA-based feedback scheme for massive MIMO is proposed in this subsection. In the proposed scheme, operations at the MS and the BS run on two time scales: the long-term period T _{l} and the short-term period T _{s}. Every long-term period T _{l} contains several short-term periods T _{s}. In every T _{s}, the MS uses the compression matrix to compress high-dimensional CSI into a low-dimensional representation. Then, the compressed low-dimensional CSI is quantized with the RVQ codebook and the index of the preferred codeword is fed back to the BS. Because of the signal-dependent nature of PCA, the compression matrix is derived by applying PCA to the CSI obtained through continuous channel estimation during a whole long-term period.
Compression matrix derivation
First of all, the detailed procedure for deriving the compression matrix in the n ^{th} long-term period T _{l} ^{(n)} with the PCA method is given in Table 1. We assume that MS k can obtain S high-dimensional channel vectors $\left(\mathbf{h}_{k}^{(n,1)}, \mathbf{h}_{k}^{(n,2)}, \ldots, \mathbf{h}_{k}^{(n,S)}\right)$ through ideal channel estimation in T _{l} ^{(n)}. Here, the S channel vectors can be viewed as S data samples, each of which contains N _{t} characteristics. In order to compress the high-dimensional CSI (channel vectors), we choose M (M≪N _{t}) dominating eigenvectors to compose the compression matrix ${\bar{\mathbf{U}}^{(n)} \in \mathbb{C}^{N_{\mathrm{t}} \times M}}$.
The compression matrix obtained in the long-term period T _{l} ^{(n)} will be used in the period T _{l} ^{(n+1)}, both by the MS to compress 1×N _{t} channel vectors into 1×M vectors and by the BS to perform recovery.
MS operation
The main operations at the MS can be classified into two types: long-term period operations and short-term period operations.
In the long-term period ${T_{\mathrm{l}}^{(n)}}$, the MS performs continuous channel estimation to obtain S high-dimensional channel vectors, and derives the compression matrix ${\bar{\mathbf{U}}^{(n)}}$. Then, each column of the compression matrix is quantized with another RVQ codebook. After quantization, the compression matrix ${\bar{\mathbf{U}}^{(n)}}$ is fed back to the BS at the end of ${T_{\mathrm{l}}^{(n)}}$.
The operation in the s ^{th} (s=1,2,…,S) short-term period of the n ^{th} long-term period, ${T_{\mathrm{s}}^{(n,s)}}$, is described as follows:
Step 1. Channel estimation is performed to obtain a 1×N _{t} channel vector ${\textbf {h}_{k}^{\left ({n,s} \right)}}$.
Step 2. Multiply $\mathbf{h}_{k}^{(n,s)}$ by the compression matrix derived in the previous long-term period
By this step, the original high-dimensional CSI (1×N _{t}) is compressed into a low-dimensional representation (1×M). The compression ratio is ${\frac{M}{N_{\mathrm{t}}}}$.
Step 3. Quantize the low-dimensional CSI ${\bar{\mathbf{h}}_{k}^{(n,s)}}$ with the RVQ codebook and obtain the index of the codeword that best fits ${\bar{\mathbf{h}}_{k}^{(n,s)}}$, that is
where c _{ j } is the j ^{th} codeword of the codebook. Compared with quantizing the high-dimensional CSI directly, the RVQ codebook used here can be designed to be much smaller. This not only reduces the feedback overhead, but also decreases the codebook search complexity.
Step 4. The index of the preferred codeword is fed back to the BS.
BS operation
Similarly, the main operations at the BS can also be classified into long-term period operations and short-term period operations. At the end of ${T_{\mathrm{l}}^{(n)}}$, the BS receives the compression matrix ${\bar{\mathbf{U}}^{(n)}}$, which it uses to recover the high-dimensional CSI in the next long-term period ${T_{\mathrm{l}}^{(n+1)}}$. Meanwhile, the short-term period operation follows the steps below:
Step 1. The codeword index j ^{(n,s)} is received in each short-term period;
Step 2. As the BS and the MS share the same codebook, the BS finds the quantized low-dimensional CSI ${\hat{\mathbf{h}}_{k}^{(n,s)}}$ by letting ${\hat{\mathbf{h}}_{k}^{(n,s)} = \mathbf{c}_{j^{(n,s)}}}$;
Step 3. The high-dimensional CSI ${\stackrel{\frown}{\mathbf{h}}}_{k}^{(n,s)}$ can be recovered by
where ${\bar{\mathbf{U}}^{(n-1)}}$ is derived in the period ${T_{\mathrm{l}}^{(n-1)}}$.
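The short-term steps at the MS and the BS can be sketched as follows. The sketch assumes that compression is the product of the channel row vector with the previous compression matrix, that the preferred codeword maximizes $\left|\bar{\mathbf{h}}\mathbf{c}_{j}^{H}\right|$, and that recovery multiplies the selected codeword by the conjugate transpose of the compression matrix. The function names, the truncated identity used as a stand-in for the compression matrix, and the parameter values are hypothetical.

```python
import math
import random

def unit_vector(dim, rng):
    """Random complex unit-norm row vector, used here as an RVQ codeword."""
    v = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]

def score(h_bar, c):
    """Magnitude of the inner product between h_bar and codeword c."""
    return abs(sum(x * y.conjugate() for x, y in zip(h_bar, c)))

def ms_short_term(h, U_prev, codebook):
    """MS side: compress h (1 x N_t) with U_prev (N_t x M), then pick a codeword index."""
    m_dim = len(U_prev[0])
    h_bar = [sum(h[i] * U_prev[i][m] for i in range(len(h))) for m in range(m_dim)]
    j = max(range(len(codebook)), key=lambda idx: score(h_bar, codebook[idx]))
    return j, h_bar

def bs_short_term(j, U_prev, codebook):
    """BS side: look up codeword j and expand it with the conjugate transpose of U_prev."""
    h_hat = codebook[j]
    return [sum(h_hat[m] * U_prev[i][m].conjugate() for m in range(len(h_hat)))
            for i in range(len(U_prev))]

rng = random.Random(1)
N_t, M, B1 = 8, 2, 4                      # hypothetical sizes: 8 antennas, 2 components, 4 bits
# Stand-in compression matrix with orthonormal columns (first M columns of the identity).
U_prev = [[1.0 + 0j if i == m else 0j for m in range(M)] for i in range(N_t)]
codebook = [unit_vector(M, rng) for _ in range(2 ** B1)]   # shared by MS and BS
h = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(N_t)]

j, h_bar = ms_short_term(h, U_prev, codebook)   # only the index j is fed back
h_rec = bs_short_term(j, U_prev, codebook)      # recovered 1 x N_t vector
print(j, len(h_rec))
```

Because the codewords are unit-norm, this sketch conveys only the direction of the low-dimensional CSI. The search loop runs over 2^{B1} codewords of length M instead of length N _{t}, which is where the reduction in search complexity comes from.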
Distortion analysis of proposed scheme
The distortion of our proposed scheme consists of three components: the distortion resulting from the PCA processing, from the quantization of the low-dimensional CSI ${\bar {\mathbf {h}}}$, and from the quantization of the compression matrix ${\bar {\mathbf {U}}}$. To facilitate the distortion analysis below, the different representations of the CSI at different stages are enumerated in Table 2.
Distortion analysis of PCA
First, we consider the distortion caused by the PCA method itself, without taking into account the quantization errors of the low-dimensional CSI or the compression matrix. That is, we measure the mean square error between h and ${\tilde {\mathbf {h}}}$, where ${\tilde {\mathbf {h}} = {\mathbf {h}\bar {\mathbf {U}}}{\bar {\mathbf {U}}^{H}}}$. In this paper, h is an N _{t}-dimensional row vector, which can be viewed as a point in the N _{t}-dimensional space. Therefore, h can be expressed as a linear combination of a set of orthogonal basis vectors u _{ i },
where u _{ i } denotes the i ^{th} basis vector.
In the PCA method, the high-dimensional CSI h is compressed into the low-dimensional (M-dimensional) ${\bar {\mathbf {h}}}$, the components of which are derived by projecting h onto the M dominating basis vectors. Given this, the reconstructed high-dimensional CSI ${\tilde {\mathbf {h}}}$ can be modeled as a combination of the M dominating basis vectors and the other N _{t}−M less dominating vectors,
Therefore, the information distortion caused by the PCA itself, which is defined as the mean square error between h and ${\tilde {\mathbf {h}}}$, can be expressed by
where S denotes the number of short-term periods in a long-term period.
Proposition.
The PCA-caused information distortion J can be expressed as a linear sum of the N _{t}−M less dominating eigenvalues of the channel covariance matrix.
Proof.
See Appendix A.
Distortion analysis of quantization
According to [6], to measure the quantization error, ${\bar {\mathbf {h}}}$ can be modeled as
where ${\hat {\mathbf {h}}}$ is the quantization of ${\bar {\mathbf {h}}}$ and e is a unit-norm vector isotropically distributed in the nullspace of ${\hat {\mathbf {h}}}$. The parameter d denotes the quantization error, which is independent of e and satisfies ${\mathbb{E}\left[d^{2}\right] \le 2^{-\frac{B_{1}}{M-1}}}$. Here, M represents the number of principal components and B _{1} is the number of feedback bits for ${\bar {\mathbf {h}}}$. Similarly, ${\bar {\mathbf {U}}}$ can be modeled as
where ${\hat {\mathbf {U}}}$ is the quantization of ${\bar {\mathbf {U}}}$ and E is composed of M unit norm vectors isotropically distributed in the nullspace of ${\hat {\mathbf {U}}}$. Moreover, quantization error D is independent of ${\hat {\mathbf {U}}}$ satisfying
where B _{2} denotes the number of feedback bits of the compression matrix.
Before analyzing the distortion between the original high-dimensional CSI h and the reconstructed h⌢, we first focus on how to express h⌢ in terms of h.
Proposition.
The reconstructed high-dimensional CSI h⌢ can be expressed in terms of h as
where ${\mathbf{I}_{M} = \left[\begin{array}{cc} \mathbf{I}_{M \times M} & \mathbf{0}_{M \times (N_{\mathrm{t}}-M)} \\ \mathbf{0}_{(N_{\mathrm{t}}-M) \times M} & \mathbf{0}_{(N_{\mathrm{t}}-M) \times (N_{\mathrm{t}}-M)} \end{array}\right]}$.
Proof.
See Appendix B.
Having derived an expression for h⌢ in terms of h, our purpose is to analyze the distortion of the proposed scheme. We derive an upper bound to the normalized distortion (denoted by δ) between h⌢ and h. Instead of calculating δ directly, we first calculate the normalized similarity (denoted by ρ) between h⌢ and h. Then, δ can be conveniently obtained by δ=1−ρ.
Before calculating the normalized similarity ρ between h⌢ and h, it is insightful to look at the non-normalized similarity ${\mathbb{E}\left[\left|\mathbf{h}{\stackrel{\frown}{\mathbf{h}}}^{H}\right|\right]}$.
Proposition.
A lower bound on the non-normalized similarity between h⌢ and h is given by
where $A=\left(N_{\mathrm{t}} - \sum\limits_{i=M+1}^{N_{\mathrm{t}}} \lambda_{i}\right)$.
Proof.
Since we have derived an expression for h⌢ in terms of h, as given in Proposition 2, the non-normalized similarity can be expressed by
Now, because d ^{2}<1 and D ^{2}<1, ${\sqrt{1-d^{2}}}$ and ${\sqrt{1-D^{2}}}$ are also smaller than 1. Additionally, since the quantization of the low-dimensional CSI and that of the compression matrix are independent, ${\mathbb{E}\left[\mathbf{E}\mathbf{e}^{H}\right] = 0}$. Therefore, the non-normalized similarity can be bounded as in (22).
Further, we take advantage of the equations
which are proved in Appendix C. Moreover, the upper bounds of ${\mathbb{E}\left[D^{2}\right]}$ and ${\mathbb{E}\left[d^{2}\right]}$ are ${2^{-\frac{B_{2}/M}{N_{\mathrm{t}}-1}}}$ and ${2^{-\frac{B_{1}}{M-1}}}$, respectively. Consequently, we obtain (20).
From (23), we can observe that when there is no distortion caused by PCA (J=0), and no quantization error from either the low-dimensional CSI (${2^{-\frac{B_{1}}{M-1}} \to 0}$) or the compression matrix (${2^{-\frac{B_{2}/M}{N_{\mathrm{t}}-1}} \to 0}$), the lower bound reaches its maximum value N _{ t }. So the normalized similarity can be expressed as
Finally, the upper bound of the normalized distortion of our proposed scheme can be obtained,
According to the expression (26), we give the theoretical upper bound of the distortion and the simulated distortion in Fig. 2.
Implementation complexity analysis
This section analyzes the feedback overhead and the codebook search complexity of the proposed scheme. For comparison, the existing CS-based schemes using the KLT basis and the DCT basis are also taken into account.
The number of feedback bits per user increases linearly with the number of transmit antennas, as modeled by [6]
where ρ denotes the received signal-to-noise ratio (SNR) in decibels at the MS.
Consider a long-term period containing S short-term periods. When the proposed scheme is adopted, the number of feedback bits can be represented by
where M is the number of principal components, the first term denotes the number of feedback bits for quantizing the low-dimensional CSI, and the second term is caused by the quantization of the compression matrix.
For the DCT-based CS scheme, there is no need for the MS to inform the BS of the channel correlation matrix, due to the signal-independent nature of the DCT basis. So the number of feedback bits for DCT-based CS is given by ${B_{\text{DCT}} = S \cdot \frac{(M-1)\rho}{3}}$.
In the case of the KLT-based CS scheme, the number of feedback bits increases dramatically. Because the complete correlation matrix must be fed back in every short-term period, the second term of Eq. (28) needs to be modified to express the number of feedback bits of the KLT-based CS scheme, which is given by
As for the codebook search complexity, it is proportional to the number of conjugate multiplications performed when searching for the best codeword. So the search complexity of the proposed scheme, the DCT-based CS scheme and the KLT-based CS scheme in a long-term period can be expressed as ${S \cdot M \cdot 2^{\frac{(M-1)\rho}{3}} + M \cdot N_{\mathrm{t}} \cdot 2^{\frac{(N_{\mathrm{t}}-1)\rho}{3}}}$, ${S \cdot M \cdot 2^{\frac{(M-1)\rho}{3}}}$ and ${S \cdot M \cdot 2^{\frac{(M-1)\rho}{3}} + S \cdot N_{\mathrm{t}}^{2} \cdot 2^{\frac{(N_{\mathrm{t}}-1)\rho}{3}}}$, respectively.
Table 3 illustrates the comparison in detail. Based on the analysis above, we can observe that both the number of feedback bits and the codebook search complexity of the proposed scheme fall between those of the DCT-based and KLT-based CS schemes.
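The comparison can be made concrete with a small calculation. The search complexity expressions and the DCT bit count are taken from the text above. The bit counts for the proposed scheme and for KLT-based CS are inferred from the codebook sizes that appear in those complexity expressions, so they should be read as assumptions rather than as quotations of (28) and (29). The function name and parameter values are hypothetical and deliberately small.

```python
def overhead(n_t, m, s, snr_db):
    """Feedback bits and codebook search complexity per long-term period.

    n_t: transmit antennas, m: principal components,
    s: short-term periods per long-term period, snr_db: SNR in decibels.
    """
    low_bits = (m - 1) * snr_db / 3.0      # bits per low-dimensional CSI vector
    col_bits = (n_t - 1) * snr_db / 3.0    # bits per N_t-dimensional column (assumed)
    bits = {
        "DCT": s * low_bits,
        "PCA": s * low_bits + m * col_bits,            # inferred form
        "KLT": s * low_bits + s * n_t * col_bits,      # inferred form
    }
    search = {
        "DCT": s * m * 2 ** low_bits,
        "PCA": s * m * 2 ** low_bits + m * n_t * 2 ** col_bits,
        "KLT": s * m * 2 ** low_bits + s * n_t ** 2 * 2 ** col_bits,
    }
    return bits, search

# Hypothetical small configuration, chosen so the numbers stay readable.
bits, search = overhead(n_t=8, m=2, s=10, snr_db=6)
for name in ("DCT", "PCA", "KLT"):
    print(name, bits[name], search[name])
```

For these values the proposed scheme sits between the two CS schemes on both measures, and the gap to KLT-based CS widens as S or N _{t} grows.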
Simulation results
In this section, we present simulation results. A single-cell scenario is considered, where the BS deploys a uniform linear array with N _{t}=128 antennas serving K=6 single-antenna MSs. Table 4 lists the detailed simulation parameters.
Feasibility validation
We first verify whether PCA can be used to compress spatially correlated high-dimensional CSI into a low-dimensional representation. To this end, we simulate the eigenvalue distribution of the spatially correlated channels defined in (1). As shown in Fig. 3, the eigenvalue distribution of the spatially correlated channel is far from uniform. The eigenvalues are sorted by their contribution rate in descending order. The contribution rate of the largest eigenvalue exceeds 50 %, while the fourth largest eigenvalue contributes only 5.9 %. The cumulative contribution rate of the top four eigenvalues exceeds 95 %. We can conclude that the spatially correlated channel vectors can be expressed by a few principal components with low information distortion.
Evaluations of the proposed scheme
We show the simulation results of channel compression of the proposed scheme in Figs. 4 and 5. It is assumed that the SNR is 20 dB and there is no quantization error of the lowdimensional CSI.
Figure 4 shows the effect of the compression ratio $\left ({\eta = \frac {M}{{{N_{\mathrm {t}}}}}}\right)$ on the system capacity. The comparison is among the proposed scheme, DCTbased CS and KLTbased CS. We can observe that whether in low or high compression ratio regimes, the KLTbased CS has the best performance, while the DCTbased CS performs the worst. To be emphasized, the best performance of the KLTbased CS is at a sacrifice of increased feedback overhead, as shown in Table 3. In this sense, the proposed scheme can offer a useful tradeoff. Additionally, as Fig. 4 shows, the proposed scheme performs much better than DCTbased CS in low compression ratio regimes.
Figure 5 illustrates the recovery performance of the highdimensional CSI at the BS under the circumstances that the BS has perfect knowledge of the lowdimensional CSI without quantization. We take the proposed scheme and DCTbased CS for comparison. The 1×N _{t} original CSI is compressed into 1×M _{CS} lowdimensional information, where M _{CS}=20 and the compression ratio is ${{\eta _{\text {CS}}} = \frac {{{M_{\text {CS}}}}}{{{N_{\mathrm {t}}}}} \approx 0.16}$, while in the proposed scheme, the number of principal components is M _{PCA}=4 with the compression ratio being ${{\eta _{{\text {PCA}}}} = \frac {{{M_{\text {PCA}}}}}{{{N_{\mathrm {t}}}}} \approx 0.03}$.
As can be seen, the reconstructed highdimensional CSI is considerably close to the original data when the PCA is utilized. But there still exists distortion because the PCA itself inevitably introduces information loss. However, the recovery performance gets poorer in the case of the DCTbased CS. The reason is that the proposed scheme takes advantage of the signaldependent nature of PCA, which makes it possible for the compression matrix to change adaptively in every longterm period according to the variation of the original data.
Figure 6 shows a system capacity (defined as the sum of all the users’ rates in the system) comparison. We choose four principal components to form the compression matrix, so the compression ratio of PCA is 0.03. For reference, we first consider the ideal situation, where the BS can acquire perfect CSI with neither recovery distortion nor quantization error. As illustrated in Fig. 6, the best system capacity can be achieved only when the BS acquires perfect CSI. Meanwhile, the proposed scheme outperforms the existing DCTbased CS scheme whether there is quantization error resulting from lowdimensional CSI or not. But, it performs a little poorer than the KLTbased CS scheme.
When the RVQ codebook is used to quantize the low-dimensional CSI, the system capacity decreases in both cases because of the quantization error. Based on the results in Fig. 6 and the feedback overhead analysis in Subsection 3.3, we conclude that the proposed scheme offers a worthwhile design tradeoff between system capacity and feedback overhead.
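To make the quantization step concrete, the following is a minimal sketch of random vector quantization (RVQ) applied to a low-dimensional CSI vector. It assumes the common textbook form of RVQ (unit-norm codewords drawn isotropically, selection by largest inner-product magnitude); the number of feedback bits, the function names and the stand-in CSI vector are hypothetical and are not taken from the paper.

```python
import numpy as np

def rvq_codebook(num_bits, dim, rng):
    """Return 2**num_bits random unit-norm complex codewords, one per row."""
    size = 2 ** num_bits
    C = rng.standard_normal((size, dim)) + 1j * rng.standard_normal((size, dim))
    return C / np.linalg.norm(C, axis=1, keepdims=True)

def rvq_quantize(z, codebook):
    """Return (index, correlation) of the codeword best aligned with z."""
    z_dir = z / np.linalg.norm(z)                  # only the direction is quantized
    corr = np.abs(codebook.conj() @ z_dir)         # |<c_k, z_dir>| for every codeword
    k = int(np.argmax(corr))
    return k, float(corr[k])

rng = np.random.default_rng(0)
m, num_bits = 4, 8                                 # hypothetical: M = 4 components, B = 8 bits
codebook = rvq_codebook(num_bits, m, rng)
z = rng.standard_normal(m) + 1j * rng.standard_normal(m)   # stand-in low-dimensional CSI
index, corr = rvq_quantize(z, codebook)
print(f"fed-back index = {index}, quantization loss = {1 - corr ** 2:.3f}")
```

The search touches all $2^{B}$ codewords of length $M$, so quantizing the compressed vector rather than the full $N_{\mathrm{t}}$-dimensional one reduces the cost of each inner product.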
Conclusions
In this paper, a PCA-based feedback scheme for massive MIMO was proposed. In the proposed scheme, two kinds of feedback information, the quantized low-dimensional CSI index and the compression matrix used for both compression and recovery, are fed back hierarchically. Moreover, we obtained a closed-form expression for an upper bound to the normalized information distortion, and we analyzed the feedback overhead and codebook search complexity of the proposed scheme. Simulation results showed that, without quantization of the low-dimensional CSI, the proposed scheme outperforms the existing DCT-based CS scheme in terms of compression ratio. When an RVQ codebook is adopted to quantize the low-dimensional CSI, worthwhile system capacity and recovery performance can still be achieved. We conclude that the proposed scheme achieves a useful tradeoff between system capacity and feedback overhead, which gives it high potential for implementation in practical massive MIMO systems.
Appendix
A Proof of Proposition 1
The PCA-caused information distortion J is
In order to minimize J, we take partial derivatives with respect to $z_{ni}$ and $b_{i}$ separately, as given by
Letting ${\frac{\partial J}{\partial z_{ni}} = 0}$, we obtain $z_{ni} = \alpha_{ni}$, that is
Similarly, letting ${\frac{\partial J}{\partial b_{i}} = 0}$, we obtain
where $\stackrel{\smile}{\mathbf{h}}$ denotes the mean vector of all the $S$ high-dimensional channel vectors estimated in a long-term period, as given by
Substituting (33) and (34) into (14), ${\tilde{\mathbf{h}}}$ can be rewritten as
Then,
Therefore, J can be re-expressed as
In (38), the expression ${\frac{1}{S}\sum\limits_{n = 1}^{S} \left(\mathbf{h}_{n}\mathbf{u}_{i}^{H} - \stackrel{\smile}{\mathbf{h}}\mathbf{u}_{i}^{H}\right)^{2}}$ can be viewed as the variance of $\mathbf{h}_{n}\mathbf{u}_{i}^{H}$. If we let $\mathbf{C}_{h}$ be the covariance matrix of $\mathbf{h}_{n}$, then J can be further given by
Our target is to minimize the PCA-caused information distortion J, which can be done with the Lagrange multiplier (LM) method. After applying the LM method, we can observe that the base vector must satisfy
Equation (40) indicates that the base vector $\mathbf{u}_{i}$ should be chosen as an eigenvector of the channel covariance matrix $\mathbf{C}_{h}$, with $\lambda_{i}$ the corresponding eigenvalue. As a result, the PCA-caused information distortion J is
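The standard PCA identity that this argument leads to, namely that the mean squared reconstruction error equals the sum of the eigenvalues belonging to the discarded eigenvectors, can be checked numerically. The sketch below is an illustration only: the synthetic data, the sizes and the variable names are hypothetical, and the row-vector convention (sample covariance built as $\frac{1}{S}\sum_n (\mathbf{h}_n - \stackrel{\smile}{\mathbf{h}})^{H}(\mathbf{h}_n - \stackrel{\smile}{\mathbf{h}})$) is an assumption about how the quantities in this appendix are arranged.

```python
import numpy as np

rng = np.random.default_rng(1)
S, n_t, m = 500, 16, 4                             # hypothetical sizes

# Hypothetical long-term data: S correlated complex row vectors with non-zero mean.
A = rng.standard_normal((n_t, n_t)) + 1j * rng.standard_normal((n_t, n_t))
W = rng.standard_normal((S, n_t)) + 1j * rng.standard_normal((S, n_t))
H = W @ A + (1.0 + 2.0j)

h_mean = H.mean(axis=0)                            # mean vector over the S samples
Hc = H - h_mean
C = Hc.conj().T @ Hc / S                           # sample covariance (Hermitian)

lam, V = np.linalg.eigh(C)                         # ascending eigenvalues
lam, V = lam[::-1], V[:, ::-1]                     # reorder to descending
U_bar = V[:, :m]                                   # m dominant eigenvectors

# Compress, reconstruct, and measure the average squared error over the samples.
H_rec = (Hc @ U_bar) @ U_bar.conj().T + h_mean
J = np.mean(np.sum(np.abs(H - H_rec) ** 2, axis=1))
J_eig = lam[m:].sum()                              # sum of discarded eigenvalues

print(f"J from reconstruction = {J:.6f}, sum of discarded eigenvalues = {J_eig:.6f}")
```

For sample data centred with the sample mean, the two quantities agree up to floating-point rounding, whichever $M$ is chosen.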
B Proof of Proposition 2
As mentioned in Table 3, $\stackrel{\frown}{\mathbf{h}}$ is the reconstructed high-dimensional CSI recovered from ${\hat{\mathbf{h}}}$, which is given by
where $\left\| \mathbf{h}\bar{\mathbf{U}} \right\|$ represents the norm of the low-dimensional CSI, since normalization was performed at the very beginning, that is
Substituting (16), (17) and (43) into (42), we can rewrite $\stackrel{\frown}{\mathbf{h}}$ as
Moreover, because of the independence between ${\hat {\mathbf {U}}}$ and E, when multiplying ${\bar {\mathbf {U}}}$ in (17) by E ^{H}, one obtains
where $\mathbf{I}_{M \times M}$ represents the $M \times M$ identity matrix, and $\mathbf{0}_{M \times (N_{\mathrm{t}} - M)}$, $\mathbf{0}_{(N_{\mathrm{t}} - M) \times M}$ and $\mathbf{0}_{(N_{\mathrm{t}} - M) \times (N_{\mathrm{t}} - M)}$ denote the $M \times (N_{\mathrm{t}} - M)$, $(N_{\mathrm{t}} - M) \times M$ and $(N_{\mathrm{t}} - M) \times (N_{\mathrm{t}} - M)$ zero matrices, respectively. Defining ${\mathbf{I}_{M} = \left[ \begin{array}{cc} \mathbf{I}_{M \times M} & \mathbf{0}_{M \times (N_{\mathrm{t}} - M)} \\ \mathbf{0}_{(N_{\mathrm{t}} - M) \times M} & \mathbf{0}_{(N_{\mathrm{t}} - M) \times (N_{\mathrm{t}} - M)} \end{array} \right]}$, the following expression results,
Substituting (46) into (44), (44) can be rewritten as
C Proof of Eqs. (23) and (24)
First, we focus on Eq. (23). Let ${\mathbf{U} = \left[ \bar{\mathbf{U}} \;\vdots\; \Delta\mathbf{U} \right]}$, where ${\bar{\mathbf{U}}}$ is composed of the $M$ dominant eigenvectors, while $\Delta\mathbf{U}$ is composed of the remaining $N_{\mathrm{t}} - M$ less dominant eigenvectors. In the proposed scheme, only the $M$ dominant eigenvectors are chosen to compose the compression matrix ${\bar{\mathbf{U}}}$, which is used to compress the high-dimensional CSI into a low-dimensional representation. In particular, if all of the $N_{\mathrm{t}}$ eigenvectors are chosen, that is ${\bar{\mathbf{U}} = \mathbf{U}}$, the distortion disappears, as given by
Substituting ${\mathbf{U} = \left[ \bar{\mathbf{U}} \;\vdots\; \Delta\mathbf{U} \right]}$ into (48), we obtain
As mentioned above, the choice of M dominating eigenvectors inevitably leads to information distortion. According to (49), the distortion caused by PCA can be expressed by
Meanwhile,
Because of the orthogonality between ${\bar{\mathbf{U}}}$ and $\Delta\mathbf{U}$, one has
Therefore, Eq. (51) can be rewritten as
Since each element of the channel vector $\mathbf{h}$ obeys the Gaussian distribution with unit variance, ${\mathbb{E}\left[ \left\| \mathbf{h} \right\|^{2} \right] = N_{\mathrm{t}}}$. So we can easily obtain ${\mathbb{E}\left[ \left\| \mathbf{h}\bar{\mathbf{U}} \right\|^{2} \right] = N_{\mathrm{t}} - J}$.
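The last step can be spelled out as follows. This is a sketch reconstructed from the surrounding text rather than the authors' typeset equations. It assumes that $\mathbf{U} = [\bar{\mathbf{U}} \;\vdots\; \Delta\mathbf{U}]$ is unitary, that the channel elements are zero-mean, and that $J$ denotes the expected energy in the discarded directions, $J = \mathbb{E}\left[\left\|\mathbf{h}\Delta\mathbf{U}\right\|^{2}\right]$.

$$
\begin{aligned}
\left\|\mathbf{h}\right\|^{2} = \left\|\mathbf{h}\mathbf{U}\right\|^{2}
  &= \left\|\mathbf{h}\bar{\mathbf{U}}\right\|^{2} + \left\|\mathbf{h}\Delta\mathbf{U}\right\|^{2}, \\
N_{\mathrm{t}} = \mathbb{E}\left[\left\|\mathbf{h}\right\|^{2}\right]
  &= \mathbb{E}\left[\left\|\mathbf{h}\bar{\mathbf{U}}\right\|^{2}\right] + J
  \quad\Longrightarrow\quad
  \mathbb{E}\left[\left\|\mathbf{h}\bar{\mathbf{U}}\right\|^{2}\right] = N_{\mathrm{t}} - J .
\end{aligned}
$$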
As for Eq. (24), ${\mathbb {E}\left [ {\textbf {h}{\textbf {I}_{M}}^{H}{\textbf {h}^{H}}} \right ]}$ can be expressed by
where $h_{(m)}$ ($m = 1, 2, \ldots, M$) denotes the $m$th element of $\mathbf{h}$. As mentioned above, each element of the channel vector $\mathbf{h}$ obeys the Gaussian distribution with unit variance. Therefore,
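Written out, the computation for Eq. (24) is short. The display below is a reconstruction from the block definition of $\mathbf{I}_{M}$ in Appendix B, not the authors' typeset equation. It additionally assumes that the elements of $\mathbf{h}$ are zero-mean, so that unit variance means $\mathbb{E}\left[|h_{(m)}|^{2}\right] = 1$, and it uses the fact that $\mathbf{I}_{M}$ is real and diagonal, so $\mathbf{I}_{M}^{H} = \mathbf{I}_{M}$.

$$
\mathbb{E}\left[\mathbf{h}\mathbf{I}_{M}^{H}\mathbf{h}^{H}\right]
= \mathbb{E}\left[\sum_{m=1}^{M}\left|h_{(m)}\right|^{2}\right]
= \sum_{m=1}^{M}\mathbb{E}\left[\left|h_{(m)}\right|^{2}\right]
= M .
$$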
Additional information
Competing interests
The authors declare that they have no competing interests.
Funding
This work was supported by the NSF of China (No. 61271177 and No. 61461029) and Fundamental Research Funds for the Central Universities (2014ZD0301).
Rights and permissions
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Keywords
 Massive MIMO
 Limited feedback
 Principal component analysis
 Information distortion analysis