GAMP-SBL-based channel estimation for millimeter-wave MIMO systems

Based on the finite scattering characters of the millimeter-wave multiple-input multiple-output (MIMO) channel, the mmWave channel estimation problem can be considered as a sparse signal recovery problem. However, most traditional channel estimation methods depend on grid search, which may lead to considerable precision loss. To improve the channel estimation accuracy, we propose a high-precision two-stage millimeter-wave MIMO system channel estimation algorithm. Since the traditional expectation–maximization-based sparse Bayesian learning algorithm can be applied to handle this problem, it spends lots of time to calculate the E-step which needs to compute the inversion of a high-dimensional matrix. To avoid the high computation of matrix inversion, we combine damp generalized approximate message passing with the E-step in SBL. We then improve a refined algorithm to handle the dictionary matrix mismatching problem in sparse representation. Numerical simulations show that the estimation time of the proposed algorithm is greatly reduced compared with the traditional SBL algorithm and better estimation performance is obtained at the same time.

theory was developed by Candes et al. [9] and Donoho [10] which can be very useful when signals are sparse or compressible. Several algorithms have been presented in [5,[11][12][13] to exploit such hidden sparsity. The above methods contribute to achieving a significant reduction in the training overhead. However, there are still some limits in the algorithms. For example, in the expectation-maximization (EM) sparse Bayesian learning algorithm, the computation complexity increases exponentially with the signal dimension [14], which may meet great challenges in practical implementations. Another generally existed problem is that although the sparse algorithms can achieve super estimation performance for sparse signals, there also exists a grid mismatch issue. Some super-resolution (off-grid) compressed sensing methods and two-stage channel estimation algorithms have been applied to improve channel estimation accuracy [14][15][16][17][18][19][20] aiming to overcome the grid mismatch caused by conventional compressed sensing techniques. However, the super-resolution dictionary learning algorithm has two main shortcomings: the convergence of the results cannot be guaranteed; a large amount of prior training is required for the sparse dictionary. These two shortcomings limit the practical design and implementation.
In this paper, we adopt a two-step algorithm. The first step is coarse channel estimate by the SBL algorithm, which simplifies operations using DGAMP. The second step is an accurate angle estimate by the interpolation of three 2-DFT spectral lines. The following summarizes the contributions of this paper.
• We formulate a sparse recovery problem and develop a DGAMP-SBL algorithm for the coarse estimation channel of the hybrid MIMO system. Based on the algorithm, we accelerate the computation speed of the original EM sparse Bayesian learning algorithm for mmWave channel estimation. The computation complexity shows that we significantly reduce operation time comparing to the original SBL algorithm with large-scale transmission and reception antennas. • To address the sparse reconstruction based channel estimation schemes suffering from the grid-off problem, we improve a fast refined algorithm without iteration. In the discrete Fourier transform domain, the influence of Gaussian noise decreases with the increasing antenna number, which helps obtain a more accurate angle estimation.
The rest of the paper is organized as follows. In Sect. 2, we outline our algorithm and briefly explain the operation steps. The system and channel model are discussed in Sect. 3. We formulate the sparse recovery-based channel estimation with the proposed DGAMP-SBL algorithm and get more accurate channel estimation using a refined algorithm in Sect. 4. The performance of the proposed algorithm is compared with that of the existing algorithms, and the superiority of the proposed algorithm is concluded in V. At the end of the paper, a conclusion is given in Sect. 6. The notations related to this paper are shown in Table 1.

Method
In this section, an off-grid channel model is implied for sparse representation of massive mmWave MIMO systems and was propose a computationally efficient and accurate mmWave channel estimation algorithm. Generalized approximate message passing (GAMP) is an extension of the approximate message passing algorithm [21] to more comprehensive scenarios involving arbitrary priors and observation noise [22]. We use a GAMP based low complexity SBL algorithm to avoid the high-dimension matrix inversion. This algorithm can be embedded in the expectation-maximization (EM) framework to approximate the true posterior distribution of sparse. Then we improve a fast refined algorithm that utilizes the interpolation of three 2-DFT spectral lines to improve the accuracy of the algorithm. Significantly, the improved algorithm can get accurate angles of departures/arrivals (AoAs/AoDs) estimation without iterations. Consequently, we can obtain both computational efficiency and estimation precision. Finally, the lower bound of normalized mean-squared error of the proposed algorithm is derived analytically: the lower bound of estimation is derived by the least square (LS) algorithm with the assumption that the angles of AoAs/AoDs are known. And then, we estimate the channel by using the LS algorithm. The result indicates our proposed algorithm is close to the lower bound at high signal-to-noise ratios.

System model
In this paper, we consider a typical hybrid mmWave MIMO systems consisting of a BS and a MS, where the BS and MS are equipped with N t and N r antennas, respectively. Let N RF t and N RF r be the number of transmitter radio frequency (RF) chains and receiver RF chains, respectively. We consider a hybrid analog-digital communication system that has inherent sparse characteristics as shown in Fig. 1. For mmWave massive MIMO with hybrid precoding, the amount of RF chains is far less than the amount of antennas, i.e., N RF t < N t , N RF r < N r [23][24][25]. We assume that F = F RF F BB is the N t × N t s hybrid s denote the analog and digital precoders, respectively. W = W RF W BB ∈ C N r ×N r s is the hybrid combiner. W RF ∈ C N r ×N RF r and W BB ∈ C N RF r ×N r s denote the analog and digital combiners, respectively, and the received signal r ∈ C N r s ×1 can be rewritten as where r ∈ C N r s ×1 is the received vector, H ∈ C N r ×N t is the channel matrix, s ∈ C N t s ×1 and n ∈ C N r s ×1 is the Gaussian noise matrix. Use x = Fs ∈ C N t ×1 to denote the transmit signal, and the transmit vector elements correspond to the transmit antennas position. Firstly the channel estimation is considered in M time slots, and the channel matrix remains unchanged during the M time slots. N x (N x < N t ) pilot sequences x 1 , x 2 , . . . x N x are sent by transmitters, and we also use N y dimensional received pilot y p by receiving antennas for each transmits pilot sequence and N represents the noise matrix with each entry following the distribution CN (0, σ 2 n ) . In this paper, we suppose that the matrices W r and X are known as prior knowledge at the receiver without loss of generality. As mentioned in [16], the (1) entries in W r and X are optimally chosen from the set w i,j = 1 N t e jβ i,j ,x i,j = ρ N t e jβ i,j where β i,j is the random phase uniformly distributed in [0, 2π) . This assumption is also employed in [26,27]. Further details of how to design W r and X can be found from [28]. It is not available to observe H directly. Instead, a noisy edition W H r HX can be observed from the receiver. This is referred to as channel subspace sampling and a detailed analysis can be found in [29,30]. Fortunately, as mentioned above, the channel estimation problem can be solved as a sparse signal recovery. The typical model of mmWave channel is modeled as [31] where L denotes the number of communication paths. There are usually a few reflected path clusters in mmWave multipath propagation channel [26][27][28]. Therefore, we usually have L ≪ min{N r , N t } . α l is the complex-valued gain corresponding to the l-th communication path. θ l ∈ [−π , π] and φ l ∈ [−π , π] are the corresponding azimuth AoAs and AoDs [32][33][34], respectively, and a r ∈ C N r ×1 a t ∈ C N t ×1 is the array response vector corresponding to the receiver (transmitter).
In this paper, we employ ULA both at the transmitter and the receiver. Then, the steering vectors at the BS and MS can be written as where is the signal wavelength and d denotes the spacing between every two adjacent antenna elements with d = /2 . For convenience, each scatterer is assumed to possess a transmission path, and the channel gains are modeled as independent identically distributed random variables with the distribution CN (0, σ 2 α ) . By defining the steering matrices � r = [a r (θ 1 ), . . . , a r (θL)] ∈ C N r ×L , � t = [a t (φ 1 ), . . . , a t φL ] ∈ C N t ×L in a matrix form, the mmWave channel matrix H in (4) can be rewritten in a matrix form as follows where H v is a diagonal matrix with diagonal elements α = [α 1 , . . . , αL] T . To transform the signal estimation problem into the sparse recovery problem, we firstly express the channel in a sparse matrix form are overcomplete beamforming matrices

are assumed as channel-invariant unitary DFT matrices, and
, respectively. C ∈ C N 1 ×N 2 is a sparse representation matrix of mmWave channel with the L nonzero elements correspond to each scatterer under the condition that each AoA/AoD is defined in quantization grids. Note that the practical direction is usually invalid to locate on the predefined grids, so it is actually that the quantizing error exists in the sparse channel matrix. Hence, a refined channel estimation scheme is given in the later sections to solve the off-grid problem. What is more, it is worth noting that C is no longer diagonal.
The signal model in (3) can be further denoted as follows . The estimation of c can be handled as a sparse signal recovery problem with 2D dictionary matrix A * t ⊗ A r , and the sensing matrix also can be denoted as We consider the number of grids N 1 = N 2 for simplicity, and the channel estimation problem can be summarized as compressed sensing problem as follows where ζ is a tolerance constant which is determined by noise power. Since recovery signal from l 0 -norm is a NP hard question, we replace the l 1 -norm with l 0 -norm as follows

Proposed GAMP-SBL-based high precision channel estimation
This section describes the proposed SBL scheme for sparse beamspace channel vector estimation. We begin by introducing the proposed hierarchical prior model. Then, we use the DGAMP-SBL to estimate the mmWave channel. After that, we propose a refined algorithm to get the accurate estimation of AoAs and AoDs. And then we propose a refined algorithm to estimate the AoAs and AoDs accurately. Finally, we analyze the computation complexity of the proposed algorithm and make a comparison to the complexity of SBL.

Hierarchical prior model
We have the following results using the assumption of circular symmetric complex Gaussian noises (9)  (12) p(y|c, ) = CN y| c, −1 I , where = σ −2 stands for the noise precision. In order to leverage the sparse structure of the underlying mmWave channel, we enforce sparsity constraints on the channel vector which is commonly used in the sparse Bayesian model. The sparse channel vector c is assumed to follow a two hierarchical prior Gaussian distribution. In the first layer, c is assumed that the entries of c are independent and identically distributed, i.e.
where γ i denotes the inverse variance, and γ = [γ 1 , γ 2 , . . . , γ N 1 N 2 ] . Meanwhile, a Gamma prior is employed for the inverse variance vector where a, b are parameters associated with the above distribution, and Ŵ(a) = ∞ 0 t a−1 e −t is the Gamma function. Besides, ⌢ n is Gaussian noise with zero mean and covariance matrix (1/ )I . We set a Gamma hyperprior over : To obtain a broad hyperprior, we set a, b → 0 [35,36]. This two-stage hierarchical prior gives which is helpful to obtain sparse solutions due to the sharp peak and heavy tails with small a and b. Actually, according to paper [37], the maximum posterior estimation of c is consistent with l 0 -norm solution in formula (10) by FOCUSS with p → 0 . To update the parameter θ = {γ , } , we can also use maximum posteriori estimation to achieve the most probable values, i.e., or, equivalently, Then, the EM-SBL algorithm is employed to learn the sparse vector c and iteratively update the hyper-parameters θ = {γ , } . Note that the key step is to update the hyperparameters p(γ , | y) by maximizing the posterior probability when the EM-SBL is in the updating phase. And in the stage of E-step, the likelihood function can be written as follows (13) ln p(γ , , y). As noted before, the conditional density p(c|r) shows c is Gaussian distribution. When we treat c as a hidden variable, its posterior distribution we need to compute conditioned on the observed vector y and the updated hyper-parameters θ is a complex Gaussian [35] function where the posterior covariance c and mean µ c are given, respectively, where D = diag γ 0 , γ 1 , . . . , γ N 1 N 2 , c and µ c are the posterior mean and variance with relevant for p c | y, γ (t) , (t) , respectively. We assume that τ c is the vector whose elements are composed of the diagonal of the covariance matrix c . As mentioned above, in the EM algorithm iterative process, the hyper-parameters are updated by iteratively maximizing the R-function, i.e., Using Bayesian rule, (21)) can be rewritten by ignoring part unrelated to θ as follows Firstly. the algorithm carries out the M-step for the hyper-parameters {γ n } . We take the partial derivative of the R-function with respect to γ n with eliminating independent terms. Since the first term in (22) does not depend on γ , it can be ignored as it will not be relevant for the M-Step. The objective function in (22) becomes γ n ĉ 2 n + τ c n 2 − 1 2 log γ n + a log γ n − bγ n . We take the partial derivative of the R-function with respect to γ n with eliminating independent terms, and the iteration of γ n can be denoted by According to the hyperprior p(γ n ) which possesses a non-informative when the parameter a and b tend to zero, we can simplify the update formalization as Similarly, we then compute the estimation of the scalar hyper-parameter and it can be updated as where N = N x N y represents the dimension of y . According to the references and practical implementation tips, we can consistently set the constant as follows: a = b = c = d = 0.0001 , since the result of constant initialization has little effect on estimation performance.

Update by DGAMP
As mentioned earlier, it is obvious that the calculation of the posterior mean and posterior variance which involve the inversion of high-dimensional matrices is extensive. Hence, the high computational complexity characteristic of the EM-SBL algorithm causes that it is impractical to be adopted by the massive MIMO channel estimation.
To simplify the calculation, we replace posterior calculation with the GAMP algorithm which is a very-low-complexity Bayesian iterative technique. It is noted that the hyperparameters {γ , } are considered as known constants during the iterative process of the GAMP algorithm. GAMP is a fast heuristic algorithm and can be utilized for simplifying matrix inversion within the SBL framework [38,39]. GAMP algorithm obtains the maximum posterior estimation of c by Taylor approximation. Specifically, the process of iteratively computing the marginal posterior p c n | y, γ , is performed by message passing on the GAMP factor graph. By utilizing the condition that all posteriors are Gauss, the process can be simplified by replacing posterior probability with expectation and variance of the sparse variables {c n } and mixture variables {z m } whose elements are denoted by z = c . To detour the convergence of GAMP whose measurement matrix satisfies independent Gaussian distribution, paper [40] proposes a DGAMP algorithm to improve the robustness of the measurement matrix through importing damping factors ρ s , ρ c ∈ (0, 1] , but it will also slow down the convergence speed. Nevertheless, the process is computationally  [22] to learn more about details of the derivation of DGAMP. There are two versions which are the max-sum version and sum-product versions damp-GAMP algorithm. The input and output functions g s p, τ p and output functions g x (r, τ r ) in Algorithm 1 are distinguished according to whether the max-sum or the sumproduct version of GAMP. Coincidentally, both the sum-product and max-sum version simplify the same equation. We only introduce the functions of the input and output of the sum-product, and readers can refer [41] to get furthermore detail. The intermediate variables r and p are explained as approximations of Gaussian noise corrupted of c and z = c with the noise levels of τ r and τ p , respectively. The difference between the sum-product and max-sum version is the estimation strategy. The sum-product version uses the vector minimum mean-squared error (MMSE) estimation which is reduced to a sequence of scalar MMSE estimation. The input and output functions are shown as follows According to the (12)(13)(14)(15) which are the prior information and the posterior information we imposed on c and p y|c , respectively, the input function and its derivative function can be rewritten as follows Similarly, the output function and its derivative function can be rewritten as follows Upon convergence, DGAMP-SBL yields a sparse estimate of c according to the prior information given by the posterior mean. This also gets a coarse estimate of the channel matrix, which can be represented by a paired frequency of AoAs and AoDs using multiplying the steering vectors. The initial channel estimation can be obtained as where Ĉ denote order by columns as H.
It is worth noting that GAMP is a low complexity algorithm that transforms the vector estimation into the scalar estimation; therefore, (27), (28) and the operations in the Algorithm 1, all vector squares, divisions and multiplications are taken element-wise. Figure 2 represents sparse channel matrix based on discrete Fourier basis. Figure 3 represents the SBL-GAMP algorithm estimation without considering the sparse off-grid problem when the number of grids is chosen as N 1 = N 2 = 180 for comparison.

Refined estimation
In the following, we propose an exact and fast 2D frequency estimation method base on the interpolation of three 2-DFT spectral lines [42]. The SBL-DGAMP algorithm, (27) g s p, τ p m = z m p(y m |z m )N z m ; p m τ pm , 1 τ pm dz m p(y m |z m )N z m ; p m τ pm , 1 τ pm dz m , (28) [g c (r, τ r )] n = c n p(c n )N c n ; r n , τ r n dc n p(c n )N c n ; r n , τ r n dc n .
(29) g c (r, τ r ) = γ γ + τ r r,  where δ is a real number, and we formulate the D(k, j), D(k ± δ, j) , D(k, j ± δ) as D, D K +δ , D K −δ , D J +δ , D J −δ , respectively. Then, we propose the interpolation algorithm as follows where ˆ x ,ˆ y are frequency deviation normalized by grid distance f which is defined by �f = 1/(N DFT /2). And then we could get the frequency estimations that and According to [43], in order to get a more accurate frequency estimate, we adopt a parabolic interpolation algorithm. For each frequency dimension d ∈ x, y , we calculate three periodogram sample D d1 , D d2 and D d3 at frequency θ d1 =f d − � d , θ d2 =f d and θ d1 =f d + � d . Middle frequency f d given by (39) and (40), and the sides of frequency θ d1 and θ d3 are excursed by d which is chosen to satisfy � d ∈ (0, 1 2N DFT ) , as well as high estimation accuracy. The last step of frequency estimation along the dth dimension is achieved by calculating the vertex of a parabola fitted through points

Reconstruct mmWave MIMO channel
In this subsection, according to the obtained AoAs θ l L l=1 and AoDs φ l L l=1 , the mmWave MIMO channel will be reconstructed as follow. Firstly, we recover the steering vectors a r (θ l ) and a t (φ l ) in (5) and (6) using the estimated AoAs and AoDs. Secondly, we rewrite the expression (6) as follows Thus, the receive signal also can be rewritten using vector form as follows The estimator estimates h v in the LS sense. From (35), the LS estimate of h v , denoted as ĥ v , is given as follows Finally, according to the obtained exact angle estimation, and the gains of path ĥ v above, we can recover the high-dimensional mmWave MIMO channel as Ĥ = r diag(h v ) H t .

Analysis of computational complexity
Apparently, the complexity of the DGAMP-SBL algorithm is dominated by the E-step, and the matrix multiplications are a big part of it which matrix multiplications by S , S T , , and T at each iteration. The complexity of each iteration is O(4 · N 1 N 2 (N x N y )) , since we should convert the complex signal to a real signal. It is worth noting that separate operations can reduce the single time, so we also can neglect the coefficient 4. Nonetheless, it is mentioned above that the multiplications operation in Algorithm 1 is taken element-wise. The complexity is much smaller than O N 1 N 2 (N x N y ) 2 . The complexity of SBL iteration when the dimension of N x N y is large. And the refined part does not need iteration so that it can ignore the complexity. For the OMP scheme, the main computational complexity is O(N 1 N 2 (N x N y )) which lies in the correlation operation. According to [16], a preprocessing step is proposed to reduce the computational cost of each iteration greatly. For the iterative reweight (IR) scheme, the main computational cost that is contributed by gradient operation is O(L 2 · (N x N y (N r + N t ))) . Actually, according to the author's description in [16], the first L maximum correlation angles extracted only according to preprocessing are likely to miss the real angle, especially when L is large. That means the first 2L maximum correlation angles are often taken so that the computational complexity can be written as O(4L 2 · (N x N y (N r + N t ))) . Finally, for the least square algorithm, the total complexity of LS is O((N x N y ) 2 (N r N t ) + (N x N y ) 3 )) . Due to the large antenna dimension of the channel, it is difficult to use LS for mm-Wave channel estimation. According to the above analysis, we can find that the computational complexity of DGAMP-SBL is proportional to N 1 N 2 (N x N y ) , which is similar to the OMP algorithm. Obviously, for large N x N y , the computational costs of DGAMP-SBL are much smaller than SBL. When L is small, the computational complexity of IR is the smallest, but as the value of L becomes larger, the complexity of IR becomes close to that of DGAMP-SBL and OMP.

Results and discussion
In this section, we will prove the performance of the proposed algorithm superioritybased channel estimation scheme through MATLAB simulation with the following parameters. The transmitter and the receiver are equipped with N r = N t ∈ {32, 64} , N RF = N RF t = N RF r = 2 , N x = N y ∈ {24, 32} and N 1 = N 2 = 120 . Each item of the transmitted pilots X is defined as x i,j = ρ N t e jβ i,j , where ρ is the transmitted power, β i,j is the random phase uniformly dirstributed in [0, 2π ) . The signal-to-noise (SNR) is defined by SNR= ρσ 2 α σ 2 . We consider the ULA geometry, and we take the following algorithm as comparison algorithms, i.e., the OMP-based channel estimation [44], the IR-based super-resolution channel estimation scheme [16], the Oracle estimator in [44] and the LS estimator. The normalized mean-squared error (NMSE) which is denoted as  [45,46]. Figures 4 and 5 compare NMSE performance against SNR with N r = N t = 32 , N x = 24 and N r = N t = 64 , N x = 32 , respectively. In both scenarios, it is apparent that the curves of the proposed algorithm are below that of comparing algorithms under most circumstances. Comparison results indicate that the proposed scheme has remarkable performance improvement. Note that our proposed scheme almost coincides with the curve of Oracle in the condition of high SNR. Under the same SNR, our algorithm achieves a huge advantage since our algorithm considers the offgrid influence which can estimate a more accurate angle. By comparing Fig. 4, our proposed algorithm achieves a great advantage for all of SNR. And with the increase in SNR, the advantage of our proposed algorithm performance over OMP becomes greater. Moreover, our proposed algorithm is superior to the IR algorithm, and it almost coincides with the Oracle curve. The NMSE performance degrades in low SNR levels because the suffer noise interference exists. And under the low SNR, our proposed algorithm coarse estimation faces a challenge which is brought by noise and off-grid, so the performance does not achieve the expectation. Although the number of antennas and pilots decline, our proposed algorithm still have superior performance. Taking the SNR of 10dB, 15dB, 20dB as an example, our proposed algorithm is also very close to the Oracle curve. It can be seen that the channel estimation accuracy based on our proposed algorithm is better than that of comparing algorithm, which verifies the advantages of our proposed schemes. To summarize, the proposed scheme is superior to another scheme in channel estimation accuracy regardless of the antennas and pilots are more or less. Figure 6 compares the NMSE performance against the number of training beams when SNR=10dB, N t = N r = 64 . As the number of training beams with N x increases, the NMSE curves of all algorithms decrease monotonically. It is obvious that the comparing algorithm curves tends to decrease slowly. Even so, our proposed algorithm still achieves a huge advantage under different pilots. When the SNR is 10dB, our proposed algorithm achieves high estimation accuracy, and it is very close to the Oracle curve. And we have tested that our algorithm may get an accurate result when N x is very small. Still, a probability of failure exists comparing with high N x . Again, among the comparing algorithm, the proposed algorithm still performs better outperforms than others. Figure 7 compares the difference in runtime between the original SBL and the proposed scheme. In order to facilitate comparison, we fix the number of iterations 200 times. Note that although we apply 200 iterations, the algorithm convergence only needs 100 times and even less in most cases. It is obvious that the proposed algorithm needs much less time than the original SBL algorithm as the dimension increases, which suggests that the proposed algorithm is more practical for large-size sparse matric recovery. Figure 8 shows the NMSE curves against the number of grids N 1 = N 2 when the SNR is 10 dB. We refine the mesh and compare the performance of different algorithms at different mesh spacing. In order to compare the advantages of the second refining algorithm, we add the unrefined DGAMP-SBL to the comparison. It is obvious that the performance of the refined results is greatly improved under any grid spacing. The proposed algorithm outperforms the IR when N 1 = N 2 ≥ 70 . After the number of grids is greater than 70, the NMSE of the proposed algorithm is slightly down. The NMSE curves of the sparse matric algorithms monotonically decrease with the number of the grids increasing. It is interesting that the number of grids only needs to be set to 80 to balance the complexity of the computation and the accuracy of estimation. For comparison, the NMSE curves of the Oracle, LS and IR algorithms remain unchanged for different values of grids since all of those work are off-grid. Figure 9 compares the NMSE curves against the number of paths. When L varies from 2 to 12, the curve of the IR scheme is basically stable since it can still find the most relevant columns. Nevertheless, the number of iterations increases greatly, and as mentioned in Sect. 4.5, with the increase in L, the calculation cost of the IR scheme is proportional to 4L 2 . When L is large, the computation costs required for a single calculation are much bigger than our algorithm to maintain the estimation accuracy. For our proposed algorithm, the NMSE curve rises slightly. This is because when the number of L is large, due to the small adjacent angular distance, it may  lead to the ambiguity of adjacent peaks, which affects the performance of the second step estimation. It is worth noting that our algorithm still has good estimation ability and is much higher than the comparison algorithm OMP. Figure 10 compares the ASE against SNR when different channel estimation schemes are used with N r = N t = 64 , N x = 32 . We introduce a perfect CSI as a reference upper limit curve. It is noted that although the performance of our proposed algorithm is slightly worse than the IR scheme at -5dB, the ASE curve of the proposed scheme almost overlaps that of CSI for high SNR values.

Conclusion
In this paper, we have proposed a high-precision channel estimation scheme for mmWave massive MIMO with hybrid precoding. Specifically, a two-stage strategy is used to estimate the channel. In the first phase of the algorithm, the DGAMP-SBL which simplifies operations is used to estimate the mmWave MIMO channel coarsely. In the second phase of the algorithm, the previously obtained channel which considers off-grid error is refined by 2DFT-interpolation algorithm. Experimental results have indicated that the proposed high-precision channel estimation strategy can perform better than state-of-art estimation precision. For future work, we focus on the sparse channel estimation 2D high-dimension matrix reduction, since the sparse recovery problem still be limited by the 2D sparse dictionary matrix which has a huge matrix dimension.