Machine learning based low-complexity channel state information estimation


In 5G communications, acquiring accurate channel state information (CSI) is essential for hybrid beamforming in millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) systems. In classical mmWave MIMO channel estimation methods, exploiting the inherent sparse or low-rank structure has been shown to improve performance. However, most highly accurate CSI estimators incur high computational complexity and require prior channel information, which presents major challenges for practical deployment. In this work, we leverage machine learning to design a low-complexity, high-performance channel estimator. Specifically, we first formulate CSI estimation, in the case of a sparse structure, as a classical least absolute shrinkage and selection operator (LASSO) problem. To reduce the time complexity of existing compressed sensing (CS) methods, we then approximate the original optimization problem by another one, imposing an additional low-rank constraint that has rarely been considered in CS. Solving this approximate problem yields a near-optimal solution of the original problem. The new method requires no prior channel information and greatly improves the estimation performance while incurring only a low time complexity. Simulation results demonstrate the superiority of the proposed method in both estimation accuracy and time complexity.

1 Introduction

Millimeter wave (mmWave) communication technology has attracted much attention in 5G cellular systems. It provides a wide range of spectrum which, together with multiple access multiplexing technology, can greatly improve channel capacity; this is undoubtedly attractive when spectrum resources are tight. Besides, mmWave communication systems are highly reliable and can provide a stable transmission channel [1,2,3,4,5,6,7]. To compensate for the severe path loss in mmWave signal propagation, mmWave communication systems are usually equipped with massive multiple-input multiple-output (MIMO) antenna arrays [8, 9]. For such mmWave massive MIMO systems, superior hybrid analog/digital beamforming performance requires reliable channel state information (CSI), which is difficult to acquire due to the large number of unknown channel parameters [10].

By exploiting the inherent sparse or rank-restricted property of the mmWave massive MIMO channel, a number of algorithms have been developed to improve CSI estimation performance [11]. Among them, the least squares (LS) algorithm and the minimum mean square error (MMSE) algorithm are widely adopted. The LS method is easy to implement but has low estimation accuracy, while MMSE algorithms perform better but require considerable computational overhead [12,13,14]. Recently, some new mmWave CSI estimation schemes have been proposed to trade off computational complexity against estimation accuracy. Specifically, Ref. [15] proposes an iterative singular value projection (SVP) method to improve CSI estimation performance by utilizing the low-rank structure of a massive MIMO channel. Moreover, Ref. [16] exploits the well-known fast iterative shrinkage-thresholding algorithm (FISTA) to reduce the complexity of CSI estimation based on channel sparsity, although its performance may deteriorate due to grid mismatch.

As an important theory in the machine learning (ML) field, compressed sensing (CS) has been widely used in mmWave CSI estimation due to the inherent sparsity of the mmWave channel. Ref. [17] proposes an effective open-loop channel estimator for mmWave large-scale MIMO systems, which achieves superior estimation performance by using the orthogonal matching pursuit (OMP) algorithm with a redundant dictionary consisting of array response vectors. However, this OMP-based approach requires the channel sparsity as prior information, which is often difficult to obtain. Furthermore, the two-stage compressive sensing (TSSR) method developed in [18] exploits the sparse and low-rank characteristics in two consecutive phases, respectively, but the error of this scheme is largely affected by the ratio of the number of conducts to the transmitted signal. The complexity of channel estimation and the overhead of channel feedback become unbearable when the pilot signal is too long. Ref. [19] develops a novel joint CSI estimation and feedback (JCEF) CSIT acquisition scheme by exploiting the random matrix approximation technique, which effectively reduces the computational complexity. Likewise, a low-rank structure of the channel covariance matrix is exploited in [20] to reduce the training overhead, which is more robust than the traditional compressive sensing method. However, this method only works with OFDM-based systems. Ref. [21] proposes a channel estimation scheme that uses the sparsity of the angular-domain structure of the channel to reduce the training overhead, which is more efficient than some previous channel estimation schemes in which only the line-of-sight (LOS) component was estimated.

In this work, by leveraging the CS technique in machine learning, we propose a novel CSI estimator based on the joint sparse and low-rank structure of the mmWave massive MIMO channel, which greatly improves the estimation performance while reducing the time complexity and pilot overhead. Specifically, the mmWave channel estimation process is first modeled as a non-convex problem, which we then approximate as a classical least absolute shrinkage and selection operator (LASSO) problem. To solve this LASSO problem, we develop a novel two-stage CSI estimation algorithm that accurately estimates the CSI matrix. In the first stage, our method exploits the CS technique to obtain a rough CSI estimate. Then, on this basis, we develop a novel low-rank matrix completion algorithm to solve the constructed LASSO problem, with which we can accurately recover the channel matrix. As validated by the numerical results, our proposed method achieves much higher CSI estimation performance than most existing algorithms, while its computational complexity and pilot overhead are low. The main contributions of this paper are summarized as follows.

  • We model the mmWave channel estimation process as a non-convex problem and approximate it as a classical LASSO problem, based on the inherent sparse and low-rank properties of mmWave massive MIMO channels, a combination that has rarely been considered.

  • We develop a novel CSI estimation scheme that solves this LASSO problem without prior channel information by leveraging the CS technique in machine learning, which incurs much lower complexity and attains higher estimation performance. We also analyze the time complexity of the new method theoretically and show that it can greatly improve the estimation accuracy at a low time complexity.

  • We provide detailed numerical simulations of the proposed CSI estimator and compare it with most existing algorithms. As illustrated by the simulation results, our CSI estimator greatly reduces the computational complexity and pilot training overhead, and attains almost the same CSI estimation accuracy as the classical OMP method. These results demonstrate the superiority of the proposed method.

Notation: Lower-case and upper-case boldface letters denote vectors and matrices, respectively; \((\cdot )^{ T }\) and \((\cdot )^{ H }\) denote the transpose and conjugate transpose of a matrix, respectively; \((\cdot )^{*}\) denotes the conjugate of a matrix, i.e., the element-wise conjugation of all matrix entries; rank(\(\textbf{H}\)) denotes the rank of \(\textbf{H}\); vec(\({\textbf {H}}\)) and unvec(\({\textbf {H}}\)) denote the vectorization and unvectorization of matrix \({\textbf {H}}\), respectively; \(vecd({\textbf {H}})\) denotes an N-dimensional vector consisting of the diagonal entries of \({\textbf {H}}\) (the n-th entry of \(vecd({\textbf {H}})\) is given by \({\textbf {H}}(n,n)\)); \(\Vert \cdot \Vert _p\) is the \(l_{p}\)-norm.

2 System model

In this work, we consider a hybrid analog-digital mmWave massive MIMO communication system equipped with \(N_t\) transmitting antennas at the base station (BS) and \(N_r\) receiving antennas at the mobile station (MS) (see Fig. 1). Without loss of generality, we adopt the well-accepted geometric channel model for mmWave massive MIMO systems, which is given by [15, 22]:

$$\begin{aligned} {{\textbf {H}}} = \sqrt{\frac{{{N_r}{N_t}}}{\beta }} \sum \limits _{k = 1}^K {{\alpha _k}{{{\textbf {a}}}_r}\left( {{\theta _k}} \right) {{\textbf {a}}}_t^H\left( {{\varphi _k}} \right) }, \end{aligned}$$

where \(\beta\) is the average path loss; K denotes the number of scattering paths; \({\alpha _k}\) is the complex path gain of the k-th path; \({{\theta _k}}, {{\varphi _k}} \in \left[ {0,2\pi } \right]\) are the direction of arrival and direction of departure (DOA/DOD) of the k-th path [22]. \({{{{\textbf {a}}}_r}\left( {{\theta _k}} \right) , {{\textbf {a}}}_t\left( {{\varphi _k}} \right) }\) are the array response vectors, given by \({{{\textbf {a}}}_r}( {{\theta _k}} ) = \frac{1}{{\sqrt{{N_r}} }}{[ {1,{e^{j\frac{{2\pi d}}{\lambda }\sin ( {{\theta _k}} )}}, \ldots ,{e^{j\frac{{2\pi d}}{\lambda }{( {{N_r} - 1})}\sin ( {{\theta _k}} )}}} ]^T}\) and \({{{\textbf {a}}}_t}( {{{\varphi _k}}} ) = \frac{1}{{\sqrt{{N_t}} }}[ 1,e^{j\frac{{2\pi d}}{\lambda }\sin ( {\varphi _k} )}, \ldots ,e^{j\frac{{2\pi d}}{\lambda }{( {{N_t} - 1})}\sin ( {\varphi _k} )} ]^T\), where d is the distance between neighboring antenna elements and \({\lambda }\) is the signal wavelength. The channel matrix can be written in a more compact form as \({{\textbf {H}}} \triangleq {{{\textbf {A}}}_r}{{{\varvec{\Lambda }} {\textbf {A}}}}_t^H\), where \({{{\textbf {A}}}_r} = [{{{\textbf {a}}}_r}\left( {{\theta _1}} \right) , \ldots ,{{{\textbf {a}}}_r}\left( {{\theta _K}} \right) ]\), \({{{\textbf {A}}}_t} = [{{{\textbf {a}}}_t}\left( {{\varphi _1}} \right) , \ldots ,{{{\textbf {a}}}_t}\left( {{\varphi _K}} \right) ]\), and \({{\varvec{{\Lambda }}}} = \sqrt{\frac{{{N_r}{N_t}}}{\beta }} \textrm{diag}({\alpha _1}, \ldots ,{\alpha _K})\).
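As an illustration of the geometric model, the following NumPy sketch draws a random channel and checks that the compact form agrees with the path-wise sum. It is only a sketch under stated assumptions: half-wavelength spacing (\(d/\lambda = 0.5\)), uniformly drawn angles, unit-variance complex Gaussian gains, and the helper names `steering` and `geometric_channel` are illustrative choices, not taken from the paper.

```python
import numpy as np

def steering(n, angle, d_over_lambda=0.5):
    """Uniform linear array response vector with n elements, scaled by 1/sqrt(n).

    d_over_lambda = 0.5 (half-wavelength spacing) is an assumption of this sketch.
    """
    k = np.arange(n)
    return np.exp(1j * 2 * np.pi * d_over_lambda * k * np.sin(angle)) / np.sqrt(n)

def geometric_channel(n_r, n_t, K, beta=1.0, seed=None):
    """Draw H = A_r @ Lam @ A_t^H with K paths (angles and gains are random)."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0, 2 * np.pi, K)          # DOAs
    phi = rng.uniform(0, 2 * np.pi, K)            # DODs
    alpha = (rng.standard_normal(K) + 1j * rng.standard_normal(K)) / np.sqrt(2)
    A_r = np.stack([steering(n_r, t) for t in theta], axis=1)   # n_r x K
    A_t = np.stack([steering(n_t, p) for p in phi], axis=1)     # n_t x K
    Lam = np.sqrt(n_r * n_t / beta) * np.diag(alpha)            # K x K
    return A_r @ Lam @ A_t.conj().T, A_r, A_t, Lam

K = 3
H, A_r, A_t, Lam = geometric_channel(16, 16, K, seed=0)

# Path-wise sum form of the same channel, for comparison with the compact form.
H_sum = sum(Lam[k, k] * np.outer(A_r[:, k], A_t[:, k].conj()) for k in range(K))
```

Because \({\textbf {H}}\) is a product with inner dimension K, its rank cannot exceed K, which is the rank restriction the estimator later exploits.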

In this hybrid analog-digital mmWave massive MIMO communication system, the BS transmits the pilot symbol matrix \({\textbf {X}} \in {{ \mathbb {C} }^{{N_s} \times {T}}}\) (\(N_s\) is the number of data streams and T denotes the pilot length), and the received signal matrix \({\textbf {Y}}\) at the MS is given as [22]:

$$\begin{aligned} {{\textbf {Y = }}}{{{\textbf {C}}}^H}{{{{{\textbf {A}}}_r}{{\varvec{\Lambda }}{} {\textbf {A}}}_t^H{\textbf {FX}} + {\textbf {N}}}} . \end{aligned}$$

Here, \({\textbf {C}} \triangleq {\textbf {C}}_{{\textbf {RF}}} {\textbf {C}}_{{\textbf {BB}}} \in {{ \mathbb {C} }^{{N_r} \times {N_s}}}\) denotes the combining matrix consisting of the analog and digital combiners; \({\textbf {F}} \triangleq {\textbf {F}}_{{\textbf {RF}}} {\textbf {F}}_{{\textbf {BB}}} \in {{ \mathbb {C} }^{{N_t} \times {N_s}}}\) is the precoding matrix; \({\textbf {N}} \in {{ \mathbb {C} }^{{N_s} \times {T}}}\) is the independent and identically distributed additive white Gaussian noise matrix, whose elements have zero mean and variance \(\sigma _n^2\). Furthermore, we vectorize the received signal matrix \({\textbf {Y}}\) in (2) as follows [17]:

$$\begin{aligned} {{\textbf {y}}} = vec\left( {{\textbf {Y}}} \right) = \left( {{{{\textbf {D}}}^T} \otimes {{{\textbf {C}}}^H}} \right) \left( {{{\textbf {A}}}_t^ * \circ {{{\textbf {A}}}_r}} \right) {{\textbf {b}}} + {\textbf {n}}, \end{aligned}$$

where \({\textbf {D}} \triangleq {\textbf {FX}} \in {{ \mathbb {C} }^{{N_t} \times {T}}}\); \({\textbf {b}} = \sqrt{\frac{{{N_r}{N_t}}}{\beta }}[{\alpha _1}, \ldots ,{\alpha _K}]^T \in {{ \mathbb {C} }^{{K} \times {1}}}\); \({{\textbf {h}}} = vec({{\textbf {H}}}) = \left( {{{\textbf {A}}}_t^ * \circ {{{\textbf {A}}}_r}} \right) {{\textbf {b}}}\); \({\textbf {n}}=vec({\textbf {N}})\); \(\otimes\) is the Kronecker product; \(\circ\) is the Khatri-Rao product. Note that the number of propagation paths K is usually much smaller than the numbers of transmitting and receiving antennas \(N_t, N_r\) in mmWave massive MIMO systems, i.e., \(K \ll \min ({N_r, N_t})\). In this case, we can see that \(\texttt {rank}({\textbf {H}}) \le K \ll \min \left( {{N_r},{N_t}} \right)\), i.e., the channel matrix is severely rank-restricted.
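The vectorized model rests on two standard identities, \(vec({\textbf {AXB}}) = ({\textbf {B}}^T \otimes {\textbf {A}})vec({\textbf {X}})\) and \(vec({\textbf {A}}_r \textrm{diag}({\textbf {b}}) {\textbf {A}}_t^H) = ({\textbf {A}}_t^* \circ {\textbf {A}}_r){\textbf {b}}\). A quick numerical check is sketched below; the small dimensions, the random matrices standing in for the steering, combining and pilot matrices, and the helper names `vec`, `khatri_rao` and `crandn` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N_r, N_t, N_s, T, K = 6, 5, 3, 4, 2   # small hypothetical sizes

def crandn(*shape):
    """Complex Gaussian array used as a stand-in for the model matrices."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def vec(M):
    """Column-stacking vectorization."""
    return M.reshape(-1, order="F")

def khatri_rao(A, B):
    """Column-wise Kronecker product: (m x K), (n x K) -> (m*n x K)."""
    return np.stack([np.kron(A[:, k], B[:, k]) for k in range(A.shape[1])], axis=1)

A_r, A_t, b = crandn(N_r, K), crandn(N_t, K), crandn(K)
C, D = crandn(N_r, N_s), crandn(N_t, T)

H = A_r @ np.diag(b) @ A_t.conj().T                 # N_r x N_t channel
h_kr = khatri_rao(A_t.conj(), A_r) @ b              # (A_t^* o A_r) b

y_direct = vec(C.conj().T @ H @ D)                  # vec(C^H H D), noise-free
y_kron = np.kron(D.T, C.conj().T) @ h_kr            # (D^T kron C^H)(A_t^* o A_r) b
```

Both routes give the same noise-free measurement vector, so the linear model in \({\textbf {b}}\) is exactly equivalent to the matrix model.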

Fig. 1 The hybrid analog-digital mmWave massive MIMO communication system

3 Proposed channel estimation scheme

In this section, we develop a low-complexity channel estimation scheme that greatly improves the CSI estimation performance of mmWave massive MIMO systems. The scheme fully utilizes the inherent rank-restricted and sparse structure of the channel, yet requires no prior knowledge of the channel (including the channel sparsity and rank).

To this end, we first approximate the received signal vector \({\textbf {y}}\) as follows [18]:

$$\begin{aligned} {{\textbf {y}}} = \left( {{{{\textbf {D}}}^T} \otimes {{{\textbf {C}}}^H}} \right) {{{\textbf {A}}}_{a}}{{\textbf {u}}} + {\textbf {n}}, \end{aligned}$$

where \({{{\textbf {A}}}_{a}} \in {{ \mathbb {C} }^{{N_rN_t} \times {M^2}}}\) denotes a dictionary matrix whose columns are of the form \({{\textbf {a}}}_{t}^{*} ({\mathop {\varphi }\limits ^{\frown }}_{i}) \otimes {{\textbf {a}}}_r({\mathop {\theta }\limits ^{\frown }}_{j})\), where \({\mathop {\varphi }\limits ^{\frown }}_{i} = 2\pi i/M, i= {0,1, \cdots ,M - 1}\) and \({\mathop {\theta }\limits ^{\frown }}_{j} = 2\pi j/M, j= {0,1, \cdots ,M - 1}\) form a uniform grid of angles; \({\textbf {u}} \in {{ \mathbb {C} }^{{M^2} \times {1}}}\) is a sparse vector containing the path parameters, such that \({{{\textbf {A}}}_{a}} {\textbf {u}} \approx {\textbf {h}}\). Note that this approximation error is low according to the classical reference [23], and it decreases as the grid size increases. In this case, by exploiting the sparse structure of the constructed vector \({\textbf {u}}\) and the low-rank property of the channel matrix \({\textbf {H}}\), the CSI estimation process of the mmWave massive MIMO system can be modeled as the following non-convex problem:
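To make the dictionary construction concrete, the sketch below builds \({\textbf {A}}_a\) on a uniform M-point grid and represents a single on-grid path with a 1-sparse \({\textbf {u}}\). The column ordering (DOD index outer, DOA index inner), the half-wavelength spacing, the toy sizes and the function names are assumptions of this sketch rather than details given in the text.

```python
import numpy as np

def steering(n, angle, d_over_lambda=0.5):
    """ULA response vector; half-wavelength spacing is assumed here."""
    k = np.arange(n)
    return np.exp(1j * 2 * np.pi * d_over_lambda * k * np.sin(angle)) / np.sqrt(n)

def build_dictionary(n_r, n_t, M):
    """Columns a_t^*(phi_i) kron a_r(theta_j) on the grid 2*pi*m/M, m = 0..M-1."""
    grid = 2 * np.pi * np.arange(M) / M
    cols = [np.kron(steering(n_t, phi).conj(), steering(n_r, theta))
            for phi in grid for theta in grid]      # column index = i * M + j
    return np.stack(cols, axis=1), grid             # (n_r * n_t) x M^2

n_r, n_t, M = 4, 4, 8
A_a, grid = build_dictionary(n_r, n_t, M)

# A single path whose DOD/DOA fall exactly on grid points i and j.
i, j, gain = 3, 5, 0.7 - 0.2j
H = gain * np.outer(steering(n_r, grid[j]), steering(n_t, grid[i]).conj())
u = np.zeros(M * M, dtype=complex)
u[i * M + j] = gain                                  # 1-sparse representation
h = H.reshape(-1, order="F")                         # vec(H)
```

For on-grid angles the representation is exact; for off-grid angles \({\textbf {A}}_a{\textbf {u}}\) only approximates \({\textbf {h}}\), which is the grid mismatch mentioned earlier.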

$$\begin{aligned} \begin{array}{l} {{\min \left\| {{{\textbf {y}}} - {{{\textbf {S}}}_{a}}{{{\textbf {A}}}_{a}}{{\textbf {u}}}} \right\| _{2}^{2}}}\\ s.t.\;\texttt {sparsity}({{\textbf {u}}}) = R, \texttt {rank}(unvec({{{\textbf {A}}}_{a}}{{\textbf {u}}})) \le K, \end{array} \end{aligned}$$

where R denotes the sparsity of the vector \({\textbf {u}}\) (see Footnote 1), i.e., \(R = {\left\| {{\textbf {u}}} \right\| _0}\); \({\textbf {S}}_{a} \triangleq {{{{\textbf {D}}}^T} \otimes {{{\textbf {C}}}^H}}\); \(\texttt {rank}({\textbf {H}}) = \texttt {rank}(unvec({{{\textbf {A}}}_{a}}{{\textbf {u}}}))\). Note that it is difficult to know the sparsity R of \({\textbf {u}}\) in advance in the mmWave massive MIMO channel estimation process. In this case, directly estimating \({\textbf {u}}\) from the non-convex problem (5) is hard to accomplish. Following [24], we relax the estimation of the sparse vector \({\textbf {u}}\) in the original problem to the classical LASSO problem (6):

$$\begin{aligned} \begin{array}{l} \min {\left\| {{{\textbf {y - }}}{{{\textbf {S}}}_a}{{{\textbf {A}}}_a}{{\textbf {u}}}} \right\| _2^2} + \lambda {\left\| {{\textbf {u}}} \right\| _1}\\ s.t.\;\;\;\texttt {rank}(unvec({{{\textbf {A}}}_{a}}{{\textbf {u}}})) \le K, \end{array} \end{aligned}$$

where \(\lambda\) denotes the regularization parameter. Therefore, a near-optimal solution of the original problem (5) can be obtained by solving the formulated problem (6).

To solve problem (6), we develop a novel CSI estimation algorithm with two separate stages that fully exploits the joint low-rank and sparse structure, as summarized in Algorithm 1. In the first stage, our CSI estimation scheme exploits the compressed sensing technique to recover a sparse vector \({\textbf {u}}_1\) from (6) without considering the non-convex constraint. We then construct the rough channel estimate \({\textbf {H}}_0\) via \({\textbf {A}}_a {\textbf {u}}_1\). In the second stage, based on the gradient descent (GD) framework and singular value hard thresholding (SVHT), we develop a new algorithm to accurately estimate the CSI matrix \(\hat{{{\textbf {H}}}}\), which fully exploits the inherent rank-restricted property of the mmWave massive MIMO channel and the rough channel estimate \({\textbf {H}}_0\). Compared with other existing methods, our method achieves much higher CSI estimation accuracy while incurring only a low time complexity.

Specifically, in the first stage, we estimate a sparse vector \({{{\textbf {u}}}_1}\) from problem (6) without considering the rank constraint. The resulting problem is denoted as \((\mathcal {P}1)\), i.e.,

$$\begin{aligned} \mathcal {P}1:\,\,\,\,\mathop {\mathrm{{arg}}\min }\limits _{{{{\textbf {u}}}_1}} {\left\| {{{\textbf {y - }}}{{{\textbf {S}}}_a}{{{\textbf {A}}}_a}{{{\textbf {u}}}_1}} \right\| _2^2} + \lambda {\left\| {{{{\textbf {u}}}_1}} \right\| _1}. \end{aligned}$$

Here, (\(\mathcal {P}1\)) can be solved by the low-complexity FISTA compressed sensing algorithm [16]. Based on the estimated sparse vector \({{{\textbf {u}}}_1}\), we can calculate a rough CSI estimate \({{{\textbf {H}}_0}} \triangleq unvec({{{\textbf {A}}}_{a}}{{\textbf {u}}}_1)\). Note that \({\textbf {H}}_0\) is usually a full-rank matrix.
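For reference, a generic FISTA iteration for a complex-valued LASSO problem of the form \((\mathcal {P}1)\) can be sketched as follows. This is a textbook-style version rather than the authors' implementation: `Phi` stands for \({\textbf {S}}_a{\textbf {A}}_a\), the constant step \(1/L\) with \(L = 2\Vert \Phi \Vert _2^2\), the iteration count, the value of `lam`, and the random test problem are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Complex soft-thresholding: shrink each magnitude by t, keep the phase."""
    mag = np.abs(z)
    return z * np.maximum(1.0 - t / np.maximum(mag, 1e-15), 0.0)

def fista_lasso(Phi, y, lam, n_iter=200):
    """Approximately minimize ||y - Phi u||_2^2 + lam * ||u||_1 with FISTA."""
    L = 2.0 * np.linalg.norm(Phi, 2) ** 2           # Lipschitz constant of the gradient
    u = np.zeros(Phi.shape[1], dtype=complex)
    w, t = u.copy(), 1.0
    for _ in range(n_iter):
        grad = 2.0 * Phi.conj().T @ (Phi @ w - y)   # gradient of the quadratic term
        u_next = soft_threshold(w - grad / L, lam / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        w = u_next + ((t - 1.0) / t_next) * (u_next - u)   # momentum step
        u, t = u_next, t_next
    return u

# Toy problem (hypothetical sizes): 3-sparse vector, 40 noise-free measurements.
rng = np.random.default_rng(0)
m, n, lam = 40, 100, 0.01
Phi = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2 * m)
u_true = np.zeros(n, dtype=complex)
u_true[[7, 42, 81]] = [1.0, -1.0j, 0.5 + 0.5j]
y = Phi @ u_true
u_hat = fista_lasso(Phi, y, lam)

def objective(u):
    """LASSO objective used to judge progress."""
    return np.linalg.norm(y - Phi @ u) ** 2 + lam * np.sum(np.abs(u))
```

Each iteration costs two matrix-vector products with `Phi`, which is where the dependence on the dictionary size \(M^2\) in the complexity analysis comes from.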

In the second stage, based on the classical gradient descent (GD) framework and the singular value hard thresholding (SVHT) technique, we develop a novel algorithm that solves problem (6) starting from the initial estimate \({\textbf {H}}_0\), which yields an accurate CSI estimate. Given the initial result \({{{\textbf {H}}_0}} \triangleq unvec({{{\textbf {A}}}_{a}}{{\textbf {u}}}_1)\) obtained from the sparse vector, problem (6) can be approximated by another problem (\(\mathcal {P}2\)), i.e.,

$$\begin{aligned} \mathcal {P}2:\,\,\,\,\mathop {{\textrm{arg}}\min }\limits _{{\hat{{\textbf {h}}}}} {\left\| {{{\textbf {y - }}}{{{\textbf {S}}}_a}{{{\hat{{\textbf {h}}}}}}} \right\| _2^2};\,\,\,s.t.\,\,\texttt {rank}{(\hat{{\textbf {H}}})} \le {K}. \end{aligned}$$

Specifically, at iteration t our method first computes \({{{{{\hat{{\textbf {h}}}}}}}_{t}^d} \triangleq {{{{{\hat{{\textbf {h}}}}}}}_{t - 1}} + {\lambda _t}\nabla f\left( {{{{{{\hat{{\textbf {h}}}}}}}_{t - 1}}} \right)\) according to the gradient descent framework, where \({\lambda _t}\) and \(\nabla f\left( {{{{{{\hat{{\textbf {h}}}}}}}_{t - 1}}} \right) \triangleq vec({{{\textbf {C}}}^*}{{{\textbf {C}}}^T}{{{{\hat{{\textbf {H}}}}}}_{t - 1}}{{\textbf {D}}}{{{\textbf {D}}}^T} - {{{\textbf {C}}}^*}{{\textbf {Y}}}{{{\textbf {D}}}^T})\) are the step length and the gradient, respectively. Then, we restrict the rank of \({\hat{{\textbf {H}}}}_t^d = unvec({\hat{{{\textbf {h}}}}}_t^d)\) by hard thresholding its singular values [25], which gives:

$$\begin{aligned} {\hat{{\textbf {H}}}}_t = \sum \nolimits _{i = 1}^{\min ({N_r},{N_t})} {{\eta _d}\left( {s_i;\tau } \right) } {{{\textbf {u}}}_i}{{\textbf {v}}}_i^H, \end{aligned}$$

where \({{\eta _d}\left( {s_i;\tau } \right) }\) denotes the hard thresholding nonlinearity, \(\eta _d \left( {{s_i};\tau } \right) = \left\{ {\begin{array}{*{20}{c}} {{s_i},{s_i} \ge \tau }\\ {0,{s_i} < \tau } \end{array}} \right.\); \({\textbf {u}}_i, {\textbf {v}}_i, s_i\) are the i-th left singular vector, right singular vector, and singular value of \({\hat{{\textbf {H}}}}_t^d\), respectively; \(\tau \triangleq 2.858 \cdot s_{{{\text{med}}}}\) denotes the threshold, and \(s_{{{\text{med}}}}\) is the median singular value of the matrix \({\hat{{\textbf {H}}}}_t^d\). As shown in Ref. [26], the coefficient 2.858 is determined by the size of the received signal matrix \({\textbf {Y}}\) and is independent of the noise level. Finally, at the last iteration \(t_{{{\text{end}}}}\), we obtain the CSI estimate \({\hat{{\textbf {H}}}} = unvec(\hat{{{\textbf {h}}}}_{t_{{{\text{end}}}}})\).
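The second stage can be sketched as a gradient step followed by SVHT. The code below is a minimal illustration under explicit assumptions, not the authors' Algorithm 1: it uses the textbook gradient \(2{\textbf {C}}({\textbf {C}}^H\hat{{\textbf {H}}}{\textbf {D}} - {\textbf {Y}}){\textbf {D}}^H\) of \(\Vert {\textbf {Y}} - {\textbf {C}}^H\hat{{\textbf {H}}}{\textbf {D}}\Vert _F^2\) with a descent (minus) step, so its sign and conjugation conventions may differ from the expression in the text; the constant step size, the unitary DFT choices for \({\textbf {C}}\) and \({\textbf {D}}\), and all sizes are hypothetical.

```python
import numpy as np

def svht(H, coef=2.858):
    """Zero every singular value below coef * (median singular value)."""
    U, s, Vh = np.linalg.svd(H, full_matrices=False)
    tau = coef * np.median(s)
    return (U * np.where(s >= tau, s, 0.0)) @ Vh

def refine_low_rank(Y, C, D, H0, n_iter=20, step=None):
    """Gradient steps on ||Y - C^H H D||_F^2, each followed by SVHT."""
    if step is None:
        # 1/L for the gradient below (assumed constant step length).
        step = 0.5 / (np.linalg.norm(C, 2) ** 2 * np.linalg.norm(D, 2) ** 2)
    H = H0.copy()
    for _ in range(n_iter):
        grad = 2.0 * C @ (C.conj().T @ H @ D - Y) @ D.conj().T
        H = svht(H - step * grad)
    return H

# Toy run: unitary DFT combiner/pilots (an idealized case), rank-2 channel.
rng = np.random.default_rng(0)
n, sigma = 32, 0.1
F = np.fft.fft(np.eye(n)) / np.sqrt(n)               # unitary DFT matrix
C, D = F, F.conj()
Q1, _ = np.linalg.qr(rng.standard_normal((n, 2)) + 1j * rng.standard_normal((n, 2)))
Q2, _ = np.linalg.qr(rng.standard_normal((n, 2)) + 1j * rng.standard_normal((n, 2)))
H_true = Q1 @ np.diag([30.0, 20.0]) @ Q2.conj().T    # singular values 30 and 20
N = sigma * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
Y = C.conj().T @ H_true @ D + N
H0 = C @ Y @ D.conj().T                              # rough, full-rank starting point
H_hat = refine_low_rank(Y, C, D, H0)
```

In this idealized unitary setting the iteration reduces to thresholding the noisy channel, which makes the denoising effect of SVHT easy to see; with general \({\textbf {C}}\) and \({\textbf {D}}\) the gradient steps do real work and convergence depends on the step length.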

Algorithm 1 (figure a)

4 Complexity analysis

In the following, we theoretically analyze the computational complexity of our proposed CSI estimator. According to Algorithm 1, we first acquire the sparse vector \({\textbf {u}}_1\) by using the FISTA algorithm to solve problem \(\mathcal {P}1\), which incurs a complexity of \(\mathcal {O}(N_sTM^2 t_{f})\) (\(t_f\) denotes the number of FISTA iterations) according to Ref. [16]. Then, computing the initial result \({\textbf {H}}_0\) requires a complexity of \(\mathcal {O}(N_sTM^2)\). Next, we use the developed algorithm to solve problem \(\mathcal {P}2\), based on the inherent rank-restricted property and the initial result \({\textbf {H}}_0\), which requires a computational complexity of \(O(N_{r}^{2} N_{s} + N_{t}^{2} T + N_{r} N_{s} T + N_{t} N_{s} T + t_{{{\text{end}}}} (N_{r}^{2} N_{t} + N_{t}^{2} N_{r} ))\), where \(t_{{{\text{end}}}}\) denotes the maximum number of iterations of our proposed method. Without loss of generality, taking \({N_s} \sim {N_t},T \sim {N_t}\), the overall computational complexity of our proposed method can be written as:

$$O\left( {N_{r}^{2} N_{t} + N_{t}^{3} + t_{{{\text{end}}}} \left( {N_{r}^{2} N_{t} + N_{t}^{2} N_{r} } \right) + t_{f} N_{t}^{2} M^{2} } \right)$$

Note that the complexity of solving problem \(\mathcal {P}1\) with the FISTA algorithm [16] in the first stage is much higher than the time complexity incurred in the second stage, because \(M \gg \max (N_r, N_t)\).
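To get a feel for the relative size of the two stages, the operation counts above can be evaluated directly. In the sketch below, \(N_r = N_t = 64\) and \(N_s = 60\) follow the simulation settings, while T, M, \(t_f\) and \(t_{{{\text{end}}}}\) are hypothetical values chosen only for illustration; constant factors hidden by the \(\mathcal {O}\) notation are ignored.

```python
def stage_costs(N_r, N_t, N_s, T, M, t_f, t_end):
    """Leading-order operation counts of the two stages (constants dropped)."""
    # Stage 1: FISTA iterations plus forming the rough estimate from A_a u_1.
    stage1 = N_s * T * M**2 * t_f + N_s * T * M**2
    # Stage 2: one-off matrix products plus t_end gradient/SVHT iterations.
    stage2 = (N_r**2 * N_s + N_t**2 * T + N_r * N_s * T + N_t * N_s * T
              + t_end * (N_r**2 * N_t + N_t**2 * N_r))
    return stage1, stage2

# T, M, t_f and t_end below are hypothetical, not values reported in the paper.
s1, s2 = stage_costs(N_r=64, N_t=64, N_s=60, T=64, M=128, t_f=100, t_end=50)
ratio = s1 / s2
```

With these assumed values the first stage is larger by more than two orders of magnitude, consistent with the remark that the dictionary size dominates the cost.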

5 Numerical performance

In this section, we numerically evaluate the normalized mean squared error (NMSE) performance of our proposed scheme in the mmWave massive MIMO system and compare it with other existing methods. Here, the NMSE between the estimated and the original CSI matrix is defined as \({\text {NMSE}}{\triangleq }\;{\mathbb {E}}\{ {{{\Vert {{{\hat{{\textbf {H}}}}} - {{\textbf {H}}}} \Vert _F^2} / {\Vert {{\textbf {H}}} \Vert _F^2}}} \}\). In our simulations, the parameters are set as follows: \(N_r = N_t = 64\); \(N_s = 60\).
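The NMSE metric can be computed empirically by averaging the per-realization ratio over Monte Carlo trials. The helper names below and the dB conversion are illustrative choices, not part of the original text.

```python
import numpy as np

def nmse(H_hat, H):
    """Single-realization ratio ||H_hat - H||_F^2 / ||H||_F^2."""
    return np.linalg.norm(H_hat - H, "fro") ** 2 / np.linalg.norm(H, "fro") ** 2

def average_nmse_db(pairs):
    """Empirical NMSE in dB over an iterable of (H_hat, H) realizations."""
    return 10.0 * np.log10(np.mean([nmse(H_hat, H) for H_hat, H in pairs]))

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H_hat = 1.1 * H                      # a 10% amplitude error on every entry
value_db = average_nmse_db([(H_hat, H)])
```

A uniform 10% amplitude error corresponds to a ratio of 0.01, i.e. about -20 dB, which gives a sense of scale when reading NMSE curves.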

Fig. 2 The channel estimation performance of different CSI estimators

Fig. 3 Time complexity of different CSI estimation algorithms

As illustrated in Fig. 2, compared with existing CSI estimation algorithms that do not need prior sparsity or rank information (e.g., TSSR [18] and FISTA [16]), our method greatly improves the estimation accuracy. Moreover, Fig. 2 shows that the CSI estimation performance of our proposed method is close to that of the classical OMP method, which requires prior sparsity information.

We then evaluate the time complexity of our proposed CSI estimation algorithm, as shown in Fig. 3. According to Fig. 3, the computational complexity of our proposed channel estimation method is much lower than that of the OMP-based method, and it is almost the same as that of the other existing algorithms, i.e., the TSSR and FISTA methods.

Fig. 4 NMSE performance comparison for different number of propagation paths

Fig. 5 NMSE comparison of different CSI estimation algorithms with the number of pilot symbols

Moreover, we evaluate the performance of our proposed algorithm under different numbers of propagation paths. As shown in Fig. 4, our method performs roughly the same for different numbers of propagation paths. In the following simulation, we consider the performance of different CSI estimators with different pilot lengths, where SNR = 5 dB. Fig. 5 shows the NMSE performance of several CSI estimation methods as the number of pilots increases. It can be seen that, as the number of pilots increases, the proposed algorithm performs better than the other algorithms and close to the classical OMP algorithm. Our algorithm can therefore effectively reduce the training overhead required for channel estimation.

6 Conclusion

In this work, based on the inherent sparse and low-rank structure of the mmWave massive MIMO channel, we develop a novel CSI estimation scheme that greatly improves performance while reducing the computational complexity by leveraging the CS technique in machine learning, and that does not require prior sparsity or rank information of the channel. As demonstrated by the numerical simulations, the CSI estimation performance of our method is much higher than that of most existing methods, and it is even close to that of the OMP method. Furthermore, compared with other methods, the computational complexity and the channel training overhead of our CSI estimator are greatly reduced, which is of practical significance for the deployment of mmWave massive MIMO systems.

Availability of data and materials

Not applicable.


  1. Note that R can be equal to the channel rank K when the DODs and DOAs of the propagation paths all lie on the grids \({\mathop {\varphi }\limits ^{\frown }}_{i} = {{2\pi i} / M}\), \({\mathop {\theta }\limits ^{\frown }}_{j} = {{2\pi j} / M}\), \(i,j= 0,1, \ldots ,M - 1\), respectively. However, such an ideal case is almost impossible in real mmWave massive MIMO communication systems.


  1. W. Hong, Z.H. Jiang, C. Yu, D. Hou, H. Wang, C. Guo, Y. Hu, L. Kuai, Y. Yu, Z. Jiang et al., The role of millimeter-wave technologies in 5G/6G wireless communications. IEEE J. Microw. 1(1), 101–122 (2021)

  2. T.S. Rappaport, S. Sun, R. Mayzus, H. Zhao, Y. Azar, K. Wang, G.N. Wong, J.K. Schulz, M. Samimi, F. Gutierrez, Millimeter wave mobile communications for 5G cellular: It will work! IEEE Access 1, 335–349 (2013)

  3. S. Mohanty, A. Agarwal, K. Agarwal, S. Mali, G. Misra, Role of millimeter wave for future 5G mobile networks: Its potential, prospects and challenges, in 2021 1st Odisha International Conference on Electrical Power Engineering, Communication and Computing Technology (ODICON). IEEE, (2021), pp. 1–4

  4. J. Kibiłda, A.B. MacKenzie, M.J. Abdel-Rahman, S.K. Yoo, L.G. Giordano, S.L. Cotton, N. Marchetti, W. Saad, W.G. Scanlon, A. Garcia-Rodriguez et al., Indoor millimeter-wave systems: design and performance evaluation. Proc. IEEE 108(6), 923–944 (2020)

  5. T.S. Rappaport, Y. Xing, G.R. MacCartney, A.F. Molisch, E. Mellios, J. Zhang, Overview of millimeter wave communications for fifth-generation (5G) wireless networks—with a focus on propagation models. IEEE Trans. Antennas Propag. 65(12), 6213–6230 (2017)

  6. R. Zhang, J. Zhou, J. Lan, B. Yang, Z. Yu, A high-precision hybrid analog and digital beamforming transceiver system for 5G millimeter-wave communication. IEEE Access 7, 83012–83023 (2019)

  7. C.-X. Wang, F. Haider, X. Gao, X.-H. You, Y. Yang, D. Yuan, H.M. Aggoune, H. Haas, S. Fletcher, E. Hepsaydir, Cellular architecture and key technologies for 5G wireless communication networks. IEEE Commun. Mag. 52(2), 122–130 (2014)

  8. B. Yang, Z. Yu, J. Lan, R. Zhang, J. Zhou, W. Hong, Digital beamforming-based massive MIMO transceiver for 5G millimeter-wave communications. IEEE Trans. Microw. Theory Tech. 66(7), 3403–3418 (2018)

  9. W. Hong, Z.H. Jiang, C. Yu, J. Zhou, P. Chen, Z. Yu, H. Zhang, B. Yang, X. Pang, M. Jiang et al., Multibeam antenna technologies for 5G wireless communications. IEEE Trans. Antennas Propag. 65(12), 6231–6249 (2017)

  10. S. Gao, X. Cheng, L. Yang, Estimating doubly-selective channels for hybrid mmWave massive MIMO systems: a doubly-sparse approach. IEEE Trans. Wirel. Commun. 19(9), 5703–5715 (2020)

  11. F. Talaei, X. Dong, Hybrid mmWave MIMO-OFDM channel estimation based on the multi-band sparse structure of channel. IEEE Trans. Commun. 67(2), 1018–1030 (2018)

  12. J. Choi, D.J. Love, P. Bidigare, Downlink training techniques for FDD massive MIMO systems: open-loop and closed-loop training with memory. IEEE J. Sel. Top. Signal Process. 8(5), 802–814 (2014)

  13. M.-F. Tang, C.-C. Chen, B. Su, Downlink path-based precoding in FDD massive MIMO systems without CSI feedback, in 2016 IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM). IEEE, (2016), pp. 1–5

  14. A. Adhikary, J. Nam, J.-Y. Ahn, G. Caire, Joint spatial division and multiplexing—the large-scale array regime. IEEE Trans. Inf. Theory 59(10), 6441–6463 (2013)

  15. W. Shen, L. Dai, B. Shim, S. Mumtaz, Z. Wang, Joint CSIT acquisition based on low-rank matrix completion for FDD massive MIMO systems. IEEE Commun. Lett. 19(12), 2178–2181 (2015)

  16. A. Beck, M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imag. Sci. 2(1), 183–202 (2009)

  17. J. Lee, G.-T. Gil, Y.H. Lee, Channel estimation via orthogonal matching pursuit for hybrid MIMO systems in millimeter wave communications. IEEE Trans. Commun. 64(6), 2370–2386 (2016)

  18. X. Li, J. Fang, H. Li, P. Wang, Millimeter wave channel estimation via exploiting joint sparse and low-rank structures. IEEE Trans. Wirel. Commun. 17(2), 1123–1133 (2017)

  19. Z. Wei, H. Liu, B. Li, C. Zhao, Joint massive MIMO CSI estimation and feedback via randomized low-rank approximation. IEEE Trans. Veh. Technol. 71(7), 7979–7984 (2022)

  20. Y. Shi, J. Zhang, K.B. Letaief, Low-rank matrix completion for topological interference management by Riemannian pursuit. IEEE Trans. Wirel. Commun. 15(7), 4703–4717 (2016)

  21. Z. Gao, C. Hu, L. Dai, Z. Wang, Channel estimation for millimeter-wave massive MIMO with hybrid precoding over frequency-selective fading channels. IEEE Commun. Lett. 20(6), 1259–1262 (2016)

  22. C. Huang, L. Liu, C. Yuen, S. Sun, Iterative channel estimation using LSE and sparse message passing for mmWave MIMO systems. IEEE Trans. Signal Process. 67(1), 245–259 (2018)

  23. Z. Yang, L. Xie, C. Zhang, Off-grid direction of arrival estimation using sparse Bayesian inference. IEEE Trans. Signal Process. 61(1), 38–43 (2012)

  24. M.J. Wainwright, Sharp thresholds for high-dimensional and noisy sparsity recovery using \(\ell _{1}\)-constrained quadratic programming (Lasso). IEEE Trans. Inf. Theory 55(5), 2183–2202 (2009)

  25. A. Alkhateeb, O. El Ayach, G. Leus, R.W. Heath, Channel estimation and hybrid precoding for millimeter wave cellular systems. IEEE J. Sel. Top. Signal Process. 8(5), 831–846 (2014)

  26. M. Gavish, D.L. Donoho, The optimal hard threshold for singular values is \(4/\sqrt{3}\). IEEE Trans. Inf. Theory 60(8), 5040–5053 (2014)



Author information



All authors read and approved the final manuscript.

Corresponding author

Correspondence to Ziping Wei.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.



According to Eq. (11) in Ref. [17], \({{\textbf {H}}} \triangleq {{{\textbf {A}}}_r}{{\varvec{\Lambda }}{} {\textbf {A}}}_t^H\) can be rewritten in vector form as:

$$\begin{aligned} {\textbf {h}} = vec({{\textbf {H}}})=\left( {{{\textbf {A}}}_{ t}^ * \circ {{{\textbf {A}}}_{ r}}} \right) \cdot vecd({{\varvec{{\Lambda }}}}) \end{aligned}$$

and, according to Eq. (9) in Ref. [22], we have

$$\begin{aligned} {{\textbf {Y = }}}{{{\textbf {C}}}^H}{{{{{\textbf {A}}}_r}{{\varvec{\Lambda }}{} {\textbf {A}}}_t^H{\textbf {FX}} + {\textbf {N}}}}={{{\textbf {C}}}^H}{{\textbf {HFX + N}}}. \end{aligned}$$

Then, vectorizing (11) yields

$$\begin{aligned} \begin{aligned} vec({{\textbf {Y}}})&= vec({{{\textbf {C}}}^H}{{\textbf {HFX}}})+vec({\textbf {N}})\\&=\left( {{{{\textbf {D}}}^T} \otimes {{{\textbf {C}}}^H}} \right) vec({\textbf {H}})+vec({\textbf {N}})\\&=\left( {{{{\textbf {D}}}^T} \otimes {{{\textbf {C}}}^H}} \right) \left( {{{\textbf {A}}}_{ t}^ * \circ {{{\textbf {A}}}_{ r}}} \right) \cdot vecd({{\varvec{{\Lambda }}}})+vec({\textbf {N}})\\&=\left( {{{{\textbf {D}}}^T} \otimes {{{\textbf {C}}}^H}} \right) \left( {{{\textbf {A}}}_t^ * \circ {{{\textbf {A}}}_r}} \right) {{\textbf {b}}} + {\textbf {n}}, \end{aligned} \end{aligned}$$

where \({\textbf {D}} \triangleq {\textbf {FX}} \in {{ \mathbb {C} }^{{N_t} \times {T}}}\); \(vecd({{\varvec{{\Lambda }}}})={\textbf {b}} = \sqrt{\frac{{{N_r}{N_t}}}{\beta }}[{\alpha _1}, \cdots ,{\alpha _K}]^T \in {{ \mathbb {C} }^{{K} \times {1}}}\); \({\textbf {n}}=vec({\textbf {N}})\). This completes the derivation of Eq. (3).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Meng, J., Wei, Z., Zhang, Y. et al. Machine learning based low-complexity channel state information estimation. EURASIP J. Adv. Signal Process. 2023, 98 (2023).
