Low-complexity signal detection networks based on Gauss-Seidel iterative method for massive MIMO systems

In massive multiple-input multiple-output (MIMO) systems with single-antenna user equipment (SAUE) or multiple-antenna user equipment (MAUE), the complexity of traditional detectors grows as the number of receive antennas at the base station increases. To reduce the high complexity of running the traditional Gauss-Seidel iterative method in parallel, this paper proposes a model-driven deep learning detection network based on the Gauss-Seidel iterative method, namely the Block Gauss-Seidel Network (BGS-Net). We reduce complexity by converting a large matrix inversion into several small matrix inversions. To improve the symbol error ratio (SER) of BGS-Net in the MAUE system, we propose Improved BGS-Net. Simulation results show that, compared with existing model-driven algorithms, BGS-Net has lower complexity and similar detection performance; it has good robustness, with performance only slightly affected by changes in the number of antennas; and Improved BGS-Net improves the detection performance of BGS-Net.

Among conventional signal detection methods, maximum likelihood (ML) detection is optimal, but its complexity grows exponentially with the number of transmit antennas, hindering its implementation in practical MIMO systems [2]. The sphere decoding (SD) detector [3] and the K-best detector [4] are two variants of the ML detector that balance computational complexity and SER by controlling the number of nodes in each search phase. Unfortunately, the QR decomposition in these nonlinear detectors leads to high computational complexity and low parallelism because it involves unfavorable matrix operations such as element elimination. In contrast, suboptimal linear detectors, such as minimum mean square error (MMSE) [5] and zero forcing (ZF) [6], provide a better trade-off between SER and computational complexity, but their complexity still grows with the cube of the number of transmit antennas, i.e., O(N_t^3).
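As a concrete reference point, the cubic cost quoted above comes from the matrix inverse inside the linear MMSE detector. The following is a minimal sketch (the function name, antenna sizes, and noiseless toy data are illustrative, not from the paper):

```python
import numpy as np

def mmse_detect(H, y, sigma2):
    """Linear MMSE estimate x_hat = (H^T H + sigma2 I)^{-1} H^T y.
    Solving this N_t x N_t system exactly costs O(N_t^3), which is the
    cost the approximate-inversion methods in the text try to avoid."""
    Nt = H.shape[1]
    A = H.T @ H + sigma2 * np.eye(Nt)
    return np.linalg.solve(A, H.T @ y)

# Toy example: 8 receive antennas, 2 transmit antennas (illustrative sizes).
rng = np.random.default_rng(0)
H = rng.standard_normal((8, 2)) / np.sqrt(8)
x = np.array([1.0, -1.0])                 # BPSK-like symbols
y = H @ x                                 # noiseless for illustration
x_hat = mmse_detect(H, y, sigma2=1e-6)
print(np.round(x_hat, 3))
```

With negligible noise regularization, the estimate recovers the transmitted symbols almost exactly; the point of the methods surveyed next is to reach similar accuracy without the explicit solve.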
To reduce the complexity of matrix inversion, Wu et al. proposed an approximate-inversion-based uplink detection algorithm in 2013 [7]. Over the next few years, a large number of MIMO detectors designed for specific massive MIMO systems appeared. Their main idea is to use iterative methods to approximate the inverse of a matrix, or to avoid computing the exact inverse altogether. Examples include the Neumann series method (NS) [8], the Newton iteration method (NI) [9], the Gauss-Seidel method (GS) [10], the successive over-relaxation method (SOR) [11], the Jacobi method (JA) [12], the Richardson method (RI) [13], the conjugate gradient method (CG) [14], the Lanczos method (LA) [15], the residual method [16], the coordinate descent method (CD) [17], and the belief propagation method (BP) [18]. These algorithms successfully reduce the complexity to O(M^2), but their SER only approaches that of MMSE.
To improve the performance of detection algorithms, [19][20][21] introduced deep learning into communication. These methods treat the functional blocks of wireless communication as black boxes and replace them with deep learning networks. The mapping between input and output data is learned from a large amount of data in an offline training phase. However, deepening the network beyond a certain number of layers does not significantly improve performance; for this reason, [22] proposed a parallel detection network (PDN), which consists of several unconnected deep learning detection networks running in parallel. By designing specific loss functions that reduce the similarity between the detection networks, the PDN obtains considerable diversity gains. These algorithms are pure black boxes: although they improve detection performance, they require a large amount of training data to learn a large number of parameters; their advantage is that they do not require incorporating communication-domain knowledge. [23, 24] proposed a modern neural network structure suited to this detection task, the detection network (DetNet), whose structure is obtained by unfolding the iterations of the projected gradient descent algorithm into a network. [25, 26] proposed the orthogonal approximate message passing network (OAMPNet), a model-driven deep learning network for multiple-input multiple-output (MIMO) detection, and [27, 28] proposed the MIMO detection network (MMNet), a deep learning MIMO detection scheme. The design of MMNet is based on the theory of iterative soft-thresholding algorithms, and it significantly outperforms existing methods on realistic channels at the same or lower computational complexity. These algorithms are purely white-box, model-driven iterative methods with better performance than convolutional neural networks (CNNs) and deep neural networks (DNNs), although they are not as widely applicable.
In addition, there are BP-Net [29] and CG-Net [30], which are networks derived from approximation methods. [31] proposed a data-driven implementation of the iterative soft interference cancellation (SIC) algorithm, called DeepSIC. This method significantly outperforms model-based methods in the presence of channel state information (CSI) uncertainty, but the network is more complex, and the method combines black-box and white-box approaches. We therefore know that deep learning methods can improve detection performance. However, when the number of antennas is large, deep learning not only places high demands on hardware but also requires training, which in practice can introduce significant delay. We therefore consider using approximate inversion methods to reduce complexity while using deep learning methods to improve SER. Moreover, most of the above-mentioned work targets single-antenna user equipment (SAUE) systems and assumes that the channel matrix is independent and identically distributed (i.i.d.) Gaussian. Unfortunately, in practice a user may be equipped with several antennas, and the antennas of the same user equipment (UE) are not sufficiently separated [32], so their transmission vectors are usually correlated. The spatial correlation between antennas is a key factor affecting the performance of the massive MIMO (M-MIMO) system. Therefore, this paper considers the multiple-antenna user equipment (MAUE) system in addition to the SAUE system.
In this paper, we propose a model-driven deep learning detection network, the Block Gauss-Seidel Network (BGS-Net), based on the Gauss-Seidel iterative method, to address the high complexity caused by running traditional Gauss-Seidel [10] in parallel. We reduce complexity by converting the large matrix inversion (D + L)^{-1} into small matrix inversions and converting matrix-matrix products into matrix-vector products. This paper considers SAUE and MAUE systems [32, 33] under Rayleigh channels. To improve the SER of BGS-Net in the MAUE system, we improve the initial solution of BGS-Net by replacing x_0 = D^{-1} H^T y with x_0 = A^{-1} H^T y. For A^{-1}, we use a block-matrix approximation to reduce its complexity. Simulation results show that, compared with existing model-driven algorithms, BGS-Net has lower complexity and similar SER; it has good robustness, with performance only slightly affected by changes in the number of antennas; its SER is better than that of traditional Gauss-Seidel; and Improved BGS-Net improves the SER of BGS-Net. This paper is organised as follows. Section 2 presents and analyses the channels considered in this paper. Section 3 analyses the existing OAMPNet and MMNet-iid algorithms. Section 4 proposes the BGS-Net algorithm and explains its motivation. Section 5 analyses the problems that BGS-Net may encounter in MAUE systems and proposes Improved BGS-Net. Section 6 analyses the complexity of BGS-Net and Improved BGS-Net. Section 7 presents experimental simulations and discussion. Section 8 concludes the paper.

Notation
In this paper, lower-case and upper-case boldface letters denote column vectors and matrices, respectively. I_n denotes the identity matrix of size n. For any matrix A, A^T, A^H, tr(A), and A^+ denote the transpose, conjugate transpose, trace, and pseudo-inverse of A. N_C(s_i; r_i, τ_t^2) denotes the univariate Gaussian distribution of a random variable s_i with mean r_i and variance τ_t^2. The operator ‖·‖ denotes the vector/matrix norm. The notation diag(x) creates a matrix with x on its diagonal, and diag(X) is the vector of the diagonal elements of X.

SAUE System
Consider an uplink massive MIMO system in which the BS uses N_r antennas to serve N_t single-antenna user terminals simultaneously, where N_r ≫ N_t. The SAUE system can be expressed as

ỹ = H_S x + ñ,    (1)

where ỹ ∈ C^{N_r×1}, H_S ∈ C^{N_r×N_t}, x ∈ C^{N_t×1}, and ñ ∈ C^{N_r×1} are the received symbol vector, channel matrix, transmitted symbol vector, and noise vector, respectively. N_r and N_t are the numbers of receive and transmit antennas, respectively, and ñ is distributed as CN(0, σ²). For signal detection, the complex-valued system model (1) is converted to the corresponding real-valued system model with M = 2N_t. H_S denotes the flat Rayleigh fading channel matrix, whose entries are assumed to be independent and identically distributed (i.i.d.) with zero mean and variance 1/N_r. Since each user has a single antenna, correlation between users is not considered.
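The complex-to-real conversion mentioned above follows the standard stacking of real and imaginary parts; a minimal sketch (function name and toy sizes are illustrative):

```python
import numpy as np

def complex_to_real(H_c, y_c):
    """Standard real-valued equivalent of y = Hx + n: stacking real and
    imaginary parts doubles the dimensions, so M = 2*N_t and N = 2*N_r."""
    H_r = np.block([[H_c.real, -H_c.imag],
                    [H_c.imag,  H_c.real]])
    y_r = np.concatenate([y_c.real, y_c.imag])
    return H_r, y_r

rng = np.random.default_rng(1)
Nr, Nt = 8, 2
H = (rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2 * Nr)
x = np.array([1 + 1j, -1 - 1j])           # QPSK-like symbols
y = H @ x                                  # noiseless for illustration
H_r, y_r = complex_to_real(H, y)
x_r = np.concatenate([x.real, x.imag])
print(np.allclose(H_r @ x_r, y_r))         # the real model reproduces y
```

The block structure encodes Re(Hx) = Re(H)Re(x) − Im(H)Im(x) and Im(Hx) = Im(H)Re(x) + Re(H)Im(x), so detection can proceed entirely in real arithmetic.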

MAUE System
Consider an uplink massive MIMO system with multiple-antenna user equipment (MAUE). A BS with N_r antennas communicates with m UEs, each equipped with m_UE antennas, as shown in Figure 1. The total number of antennas on the user side is N_t = m × m_UE. The entry x_{i,j} of the transmit vector represents the uplink signal from the jth antenna of the ith UE to the BS, and n ∈ R^{N×1} is an additive white Gaussian noise (AWGN) vector with zero mean and variance σ²/2. The Kronecker channel model [34] is H_M = R^{1/2} H_S T^{1/2}, where R ∈ R^{N×N} and T ∈ R^{M×M} are the receive-side and transmit-side spatial correlation matrices.
Here ξ_r and ξ_t are correlation coefficients, R_pq is the (p, q) entry of the receive-antenna correlation matrix R, and T_pq is the (p, q) entry of the transmit-antenna correlation matrix T for each user. It can be seen that the transmit antennas of the same terminal are usually correlated. However, most current papers do not consider the correlation between antennas of the same terminal, which is impractical and inaccurate. Therefore, following the physics of propagation, ξ_r and ξ_t are defined as the correlation factors of the receive antennas and of the transmit antennas of the same terminal, respectively. Note that the correlation between antennas of different terminals is ignored [32, 33].
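A minimal sketch of generating such a correlated MAUE channel: it assumes the common exponential correlation model C_pq = ξ^{|p−q|} for R and for each UE's block of T (the exact form of R_pq and T_pq in the paper's equations (4)–(5) may differ), with T block-diagonal because cross-terminal correlation is ignored.

```python
import numpy as np

def exp_corr(n, xi):
    """Exponential correlation model: C[p, q] = xi ** |p - q| (assumed form)."""
    idx = np.arange(n)
    return xi ** np.abs(idx[:, None] - idx[None, :])

def msqrt(C):
    """Symmetric matrix square root via eigendecomposition."""
    w, V = np.linalg.eigh(C)
    return V @ np.diag(np.sqrt(np.maximum(w, 0.0))) @ V.T

def kronecker_channel(Nr, m, m_ue, xi_r, xi_t, rng):
    """H_M = R^{1/2} H_S T^{1/2}, with a block-diagonal T since the
    correlation between antennas of different UEs is ignored."""
    Nt = m * m_ue
    H_s = rng.standard_normal((Nr, Nt)) / np.sqrt(Nr)   # i.i.d. part
    R = exp_corr(Nr, xi_r)
    T = np.zeros((Nt, Nt))
    for i in range(m):                                   # per-UE blocks
        s = i * m_ue
        T[s:s + m_ue, s:s + m_ue] = exp_corr(m_ue, xi_t)
    return msqrt(R) @ H_s @ msqrt(T)

rng = np.random.default_rng(2)
H_m = kronecker_channel(Nr=32, m=2, m_ue=2, xi_r=0.2, xi_t=0.4, rng=rng)
print(H_m.shape)
```

The per-UE blocks of T are what produce the block structure of H_M^T H_M discussed in the next section.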

Channel characteristics
In this section, the characteristics of the SAUE and MAUE systems are analysed, using N_t = 4 and N_r = 32. Figure 2 shows that when the SAUE system has a large-scale antenna array and α = N_r/N_t is large, the channel exhibits channel hardening and H_S^T H_S is diagonally dominant, so approximate inversion methods are well suited to this environment. Figure 3 shows that in the MAUE environment with ξ_r = 0, ξ_t = 0.2, m_UE = 2, although H_M^T H_M is still diagonally dominant and exhibits a block structure, the colour depth shows that the off-diagonal elements adjacent to the diagonal begin to have a non-negligible impact. Figure 4 shows that in the MAUE environment with ξ_r = 0, ξ_t = 0.4, m_UE = 4, the elements on both sides of the diagonal already have a large impact, and it is difficult to obtain a good approximation by approximate inversion. Figure 5 shows that in the MAUE environment with ξ_r = 0.2, ξ_t = 0.4, m_UE = 4, the off-diagonal elements around the diagonal have a severe impact, and the approximation quality is very poor. The Marchenko–Pastur theorem of random matrix theory states that when the entries of the channel matrix H are independent and identically distributed with zero mean and variance 1/N, and the number of rows N and columns M tend to infinity with their ratio tending to a constant (N/M → β), the diagonal elements of H^T H tend to a constant and the off-diagonal elements tend to zero. We next analyse the symmetry of H^T H in the SAUE and MAUE systems. From formulas (4) and (5), R and T are symmetric matrices.
From equation (2), and as formula (14) shows, the upper-left and lower-right corner blocks of H_M^T H_M are different, but the matrix is symmetric about the main diagonal.
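The channel-hardening effect described above can be checked numerically. The sketch below (function name and sizes are illustrative) measures, for i.i.d. SAUE-type channels, how strongly the diagonal of the Gram matrix H^T H dominates its off-diagonal as N_r grows:

```python
import numpy as np

def diag_dominance(G):
    """Per-row ratio of the |diagonal| entry to the sum of off-diagonal
    magnitudes; values well above 1 mean G is diagonally dominant."""
    off = np.sum(np.abs(G), axis=1) - np.abs(np.diag(G))
    return np.abs(np.diag(G)) / off

rng = np.random.default_rng(3)
ratios = {}
for Nr in (8, 32, 128):
    H = rng.standard_normal((Nr, 4)) / np.sqrt(Nr)   # i.i.d. SAUE-type channel
    ratios[Nr] = float(diag_dominance(H.T @ H).min())
print({k: round(v, 2) for k, v in ratios.items()})
```

As the Marchenko–Pastur argument predicts, the ratio grows with N_r: the diagonal entries concentrate around a constant while off-diagonal entries shrink, which is exactly the regime where the approximate-inversion methods of this paper work well.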

Related work
The goal of the receiver is to compute the maximum likelihood (ML) estimate x̂ of the transmitted vector x. However, its complexity is too high. Over the past few decades, researchers have studied various detectors that reduce complexity while maintaining SER.

OAMPNet
OAMPNet is a model-driven DL algorithm for MIMO detection derived from orthogonal approximate message passing (OAMP). Compared with approximate message passing (AMP), the advantage of OAMP is that it applies to unitarily invariant matrices, whereas AMP applies only to Gaussian measurement matrices. OAMPNet performs better than OAMP and can adapt to various channel environments through a small number of learnable variables. The OAMPNet algorithm is as follows.
Step 1: Design a linear detector, where v_t^2 is the variance of the nonlinear estimation error and W_t is the optimal W of OAMP in [35]; this yields the linear estimate. Step 2: Take the linear estimate as input and perform nonlinear detection, where τ_t^2 is the variance of the linear estimation error; this yields the input of layer (t+1). The detection performance of OAMPNet is very good, and there are only two training parameters (γ_t, θ_t^2) per layer, but Ŵ_t must be recomputed at every layer, and each computation requires a pseudo-inverse, which brings great complexity; OAMPNet is therefore suitable for medium-scale rather than massive MIMO [25, 26].

MMNet-iid
The main idea of MMNet-iid is to introduce an appropriate degree of flexibility in the linear and denoising components of the iterative framework while maintaining its linear plus non-linear structure [27,28].
Step 1: We need to design a linear detector to estimate z t .
Step 2: Take z_t as input and perform nonlinear detection, where σ_t^2 is the variance of the linear estimation error; this yields the input of the next layer. The complexity of MMNet-iid is much smaller than that of OAMPNet. MMNet-iid performs well when the number of antennas is large and the channel is a well-behaved linear Gaussian channel, but performs poorly on correlated channels or when the number of antennas is small.

Gauss-Seidel iterative method
GS is one of the common iterative methods for solving systems of linear equations. To solve a linear system Ax = b, the matrix is decomposed as follows.
The element-wise iterative formula of GS [36] is

x_i^{(k+1)} = (1/a_ii) (b_i − Σ_{j<i} a_ij x_j^{(k+1)} − Σ_{j>i} a_ij x_j^{(k)}),

and its matrix representation is

x^{(k+1)} = (D + L)^{-1} (b − U x^{(k)}).

When GS is used in communication, the Hermitian positive semi-definite matrix A is decomposed into a strictly lower triangular term L, a strictly upper triangular term U, and a diagonal term D, i.e., A = D + L + U.

The problem we solve is then the linear system Ax = b with A = H^T H + (σ²/2)I. A set of linear equations is solved by computing the iterative solution [37], where x^{(n)} is the estimated signal, refined in each iteration, and x_MF = H^T y replaces b in (32). Here x^{(0)} is initialised to D^{-1} H^T y. Gauss-Seidel has good convergence properties: it is guaranteed to converge when A is diagonally dominant or symmetric positive definite. This matters because in MAUE systems A is not guaranteed to be diagonally dominant, but A is always symmetric positive definite. The following proves the convergence of Gauss-Seidel for diagonally dominant and for symmetric positive definite matrices, respectively [38].
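The iteration above can be sketched as follows (function name and toy sizes are illustrative; the generic `np.linalg.solve` stands in for the forward substitution a real implementation would use):

```python
import numpy as np

def gauss_seidel(A, b, n_iter, x0=None):
    """Gauss-Seidel iteration x^{k+1} = (D + L)^{-1} (b - U x^{k}).
    (D + L) is lower triangular, so each step is a forward substitution,
    not an explicit inverse."""
    DL = np.tril(A)                     # D + L
    U = A - DL                          # strictly upper triangular part
    x = np.zeros(A.shape[0]) if x0 is None else x0.copy()
    for _ in range(n_iter):
        x = np.linalg.solve(DL, b - U @ x)
    return x

rng = np.random.default_rng(4)
H = rng.standard_normal((32, 4)) / np.sqrt(32)
A = H.T @ H + 0.05 * np.eye(4)          # symmetric positive definite
x_true = np.array([1.0, -1.0, 1.0, -1.0])
b = A @ x_true                           # plays the role of x_MF = H^T y
x_gs = gauss_seidel(A, b, n_iter=10, x0=b / np.diag(A))  # x0 = D^{-1} b
print(np.round(x_gs, 3))
```

With a hardened (diagonally dominant) A, a handful of iterations already recovers the solution to high accuracy, which is the behaviour the convergence theorems below formalise.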

Theorem 1 When A is diagonally dominant, Gauss-Seidel can guarantee convergence.
Proof of Theorem 1. For a strictly diagonally dominant matrix A, its diagonal elements satisfy a_ii ≠ 0, i = 1, 2, ..., n, so

Let λ be an eigenvalue of the iteration matrix B_G = −(D + L)^{-1} U; the characteristic equation is det(λ(D + L) + U) = 0.
The equation has a non-zero solution when this determinant is zero. We argue by contradiction: if |λ| ≥ 1, the matrix λ(D + L) + U would itself be strictly diagonally dominant and therefore non-singular, i.e., det(λ(D + L) + U) ≠ 0, a contradiction. Hence |λ| < 1 and Gauss-Seidel converges. Theorem 2 When A is symmetric positive definite, Gauss-Seidel is guaranteed to converge.
Proof of Theorem 2. Let B_G = −(D + L)^{-1} U, let λ be an eigenvalue, and let x be the corresponding eigenvector. When A is symmetric positive definite, Gauss-Seidel converges; dividing the numerator and denominator of (44) by ‖x‖² gives (45). From (45) we can see that the more diagonally dominant A is, the smaller |λ|² is and the faster the convergence.
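The conclusion of both proofs — convergence governed by the spectral radius of B_G = −(D + L)^{-1} U — can be verified numerically. This sketch (names and sizes illustrative) shows the radius shrinking as the channel hardens:

```python
import numpy as np

def gs_spectral_radius(A):
    """Spectral radius of the Gauss-Seidel iteration matrix
    B_G = -(D + L)^{-1} U; Gauss-Seidel converges iff rho < 1."""
    DL = np.tril(A)                          # D + L
    U = A - DL                               # strictly upper part
    B = -np.linalg.solve(DL, U)
    return float(np.max(np.abs(np.linalg.eigvals(B))))

rng = np.random.default_rng(5)
rhos = {}
for Nr in (8, 32, 128):
    H = rng.standard_normal((Nr, 4)) / np.sqrt(Nr)
    A = H.T @ H + 0.05 * np.eye(4)           # symmetric positive definite
    rhos[Nr] = gs_spectral_radius(A)
print({k: round(v, 3) for k, v in rhos.items()})
```

Since A is symmetric positive definite in every case, the radius stays below 1, and it drops sharply as N_r grows — the quantitative version of "the more diagonally dominant, the faster the convergence".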

BGS-Net architecture
In this section, a model-driven DL detection network, called BGS-Net, is proposed. The signal detector uses the Gauss-Seidel method together with a nonlinear activation to improve detection performance. The only training parameter per layer is γ_t; all other quantities need to be computed only once and are then reused at every layer. In contrast, W_t and A_t in OAMPNet and MMNet-iid must be recomputed at every layer because they contain training parameters. The structure of BGS-Net is shown in Figure 6: it augments the Gauss-Seidel iteration with a learnable vector variable γ_t. The network consists of L cascaded layers with identical structure, each including a nonlinear estimator, the error variance τ_t^2, and tied weights. The input of BGS-Net is x_MF together with the initial value x_0, and the output is the final signal estimate x_L. The deep learning structure is shown in more detail in Figure 7. We first compute ẑ_t and the scalar τ_t^2 through the GS detection block and feed them, together with the constellation set S, into the nonlinear estimator, where softmax(V_i) = exp(V_i) / Σ_j exp(V_j). As Algorithm 1 shows, there is only one training parameter per layer, and the vector γ_t adjusts the estimated variance τ_t^2. Because (1/M) tr(C_t C_t^T) is a constant, it is multiplied by v_t^2 each time τ_t^2 is formed, which saves a large amount of computation. Note that between steps 4 and 5 of the algorithm, the scalar τ_t^2 is expanded to a vector τ_t^2.
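The nonlinear estimator can be sketched as a per-element posterior mean over the constellation, with softmax weights driven by the variance τ_t^2 (function name, the real-valued alphabet, and the numeric values are illustrative, not from the paper):

```python
import numpy as np

def softmax_denoiser(z, tau2, constellation):
    """Per-element posterior-mean estimate E[x | z, tau^2]: each
    constellation point s is weighted by softmax(-(z - s)^2 / tau2)."""
    d = -(z[:, None] - constellation[None, :]) ** 2 / tau2   # (len(z), |S|)
    w = np.exp(d - d.max(axis=1, keepdims=True))             # stable softmax
    w /= w.sum(axis=1, keepdims=True)
    return w @ constellation

S = np.array([-1.0, 1.0])          # real-valued QPSK alphabet (illustrative)
z = np.array([0.9, -1.2, 0.1])     # noisy linear estimates
x_hat = softmax_denoiser(z, tau2=0.5, constellation=S)
print(np.round(x_hat, 3))
```

Confident inputs (0.9, −1.2) are pulled almost onto the nearest symbol, while the ambiguous input 0.1 is left near the decision boundary — the soft-decision behaviour the error analysis below relies on.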

Low-complexity algorithm for (D + L) −1
In this section, the complexity of computing (D + L)^{-1} is reduced. If the inverse is computed directly, the complexity reaches O(M^3), so a loop-nesting method is proposed to reduce it, as described below. The first approach: from Eq. (30), each row costs M − 1 multiplications and 1 division, for a total of M rows, so the complexity of one iteration is M^2. However, this method is not applicable to BGS-Net.
The second approach: invert the lower triangular matrix in parallel; the structure is shown in Figure 8. For the inversion of a lower triangular matrix we have the property

[[B, 0], [C, F]]^{-1} = [[B^{-1}, 0], [−F^{-1} C B^{-1}, F^{-1}]],

where B, C, and F have the same size. The main complexity of (37) lies in computing (D + L)^{-1}, which is a lower triangular matrix, and the above property is used for its inverse. From (8), (10) and (14), when the system is SAUE or MAUE (only T), property (50) holds; when the system is MAUE (with both T and R), property (50) does not hold. The specific steps of the loop-nesting method, shown in Figure 8, are as follows.
Step 2: Substitute the obtained B^{-1}_{i,t} and F^{-1}_{i,t}, together with the corresponding off-diagonal block, into the property above; if the full inverse has been assembled, proceed to the next step, otherwise repeat Step 2.
Step 3: In cases (a) and (b) of Section 2.3, assign B^{-1} = B^{-1}_{i+1,t} to F^{-1}; otherwise, solve F^{-1} by the same method as B^{-1}. In this way we obtain (D + L)^{-1}. Note that we do not need to form (D + L)^{-1} explicitly; we only use the following formula to solve the linear detection term. (Because P_1 ∈ R^{P×Q}, P_2 ∈ R^{Q×K}, and b ∈ R^{K×1}, computing P_1(P_2 b) as matrix-vector products is cheaper than forming the matrix-matrix product (P_1 P_2) b.)
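The block-inversion property used above can be demonstrated directly (function name and block sizes are illustrative):

```python
import numpy as np

def block_lower_inv(B_inv, C, F_inv):
    """Inverse of [[B, 0], [C, F]] assembled from the inverses of its
    diagonal blocks: [[B^{-1}, 0], [-F^{-1} C B^{-1}, F^{-1}]]."""
    n, m = B_inv.shape[0], F_inv.shape[0]
    out = np.zeros((n + m, n + m))
    out[:n, :n] = B_inv
    out[n:, n:] = F_inv
    out[n:, :n] = -F_inv @ C @ B_inv     # only matrix products, no new solve
    return out

rng = np.random.default_rng(6)
B = np.tril(rng.standard_normal((3, 3))) + 3 * np.eye(3)   # nonsingular lower tri
F = np.tril(rng.standard_normal((3, 3))) + 3 * np.eye(3)
C = rng.standard_normal((3, 3))
M = np.block([[B, np.zeros((3, 3))], [C, F]])
M_inv = block_lower_inv(np.linalg.inv(B), C, np.linalg.inv(F))
print(np.allclose(M_inv @ M, np.eye(6)))
```

This is what makes the recursion pay off: one M×M triangular inverse is replaced by two (M/2)×(M/2) inverses plus multiplications, and when property (50) holds the F^{-1} block can simply be copied from B^{-1}.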

Error analysis
In this section, we study why BGS-Net improves performance by analysing the error (x_t − x). Define the output error of the linear phase at iteration t as e^lin_t = z_t − x, and the output error of the previous iteration t − 1 as e^den_{t−1} = x_{t−1} − x. We can rewrite the update equation of Algorithm 1 in terms of these two errors as (52) and (53). From Figure 2, under channel hardening conditions, the first term of equation (52), (I − (D + L)^{-1}(H^T H + (σ²/2)I)), tends to 0. The second term splits into the effect of n and the effect of x. For the noise, n ~ (σ²/2) · N(0, 1), so when the signal-to-noise ratio (SNR) is small, H^T n is larger and (σ²/2)x is also larger; when the SNR is large, both H^T n and (σ²/2)x become smaller, and the second term is significantly attenuated. These calculations explain why BGS-Net performs well on i.i.d. Gaussian channels. Moreover, it behaves better than MMNet-iid's I − θ_t^{(1)} H^T H on correlated channels: channel hardening disappears when the channel is correlated, and I − θ_t^{(1)} H^T H has no way to converge to 0 as the number of antennas increases, whereas I − (D + L)^{-1}(H^T H + (σ²/2)I) does, since A is symmetric and D + L itself contains all the information of A; as the number of antennas increases, this term can be approximated as I − A^{-1}A, approaching 0 without exactly reaching it.
For the effect of the nonlinear activation function, the term E[x | ẑ_t, τ_t] − x in (53) reduces the difference x_{t+1} − x. The proof is as follows. Assuming the true value of x_ti is s_1, the expression equals

Σ_{s_j ∈ S} s_j × p(s_j | ẑ_ti, τ_ti) − s_1.    (55)

The softmax soft decision uses an exponential that makes larger probabilities larger and smaller probabilities smaller while the total probability remains 1. As the probability of deciding s_1 increases, the first term of equation (55) gets closer and closer to s_1, so this activation function further reduces the error.

Analysis of the problem
Under the MAUE system, A can no longer be approximated by the diagonal matrix D, which strongly affects x_0 ← D^{-1} H^T y. x_0 is an initial solution: if x_0 is chosen well, few iterations are needed. In the SAUE system, x_0 ← D^{-1} H^T y is approximately equal to (H^T H + (σ²/2)I)^{-1} H^T y, so few iterations are needed, which also explains why the iteration converges quickly under channel hardening. In the MAUE system, H^T H loses diagonal dominance: the sum of the other elements in a row is no longer much smaller than the diagonal element, so D^{-1} H^T y cannot approach the true solution x. We therefore replace x_0 ← D^{-1} H^T y with x_0 ← A^{-1} H^T y, so that a good initial solution is obtained no matter how the channel changes.
However, computing A^{-1} exactly has high complexity, so a low-complexity method is used to approximate A^{-1}.
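The gap between the two initial solutions on a correlated channel can be illustrated numerically. This sketch (correlation value 0.6, sizes, and noise level are illustrative) compares the residual errors of x_0 = D^{-1} H^T y and the exact x_0 = A^{-1} H^T y:

```python
import numpy as np

rng = np.random.default_rng(7)
Nr, Nt, sigma2 = 64, 8, 0.01
# Correlated transmit side: T[p, q] = 0.6 ** |p - q| (illustrative model).
idx = np.arange(Nt)
T = 0.6 ** np.abs(idx[:, None] - idx[None, :])
w, V = np.linalg.eigh(T)
T_half = V @ np.diag(np.sqrt(w)) @ V.T
H = (rng.standard_normal((Nr, Nt)) / np.sqrt(Nr)) @ T_half

x = rng.choice([-1.0, 1.0], size=Nt)
y = H @ x + np.sqrt(sigma2 / 2) * rng.standard_normal(Nr)
A = H.T @ H + (sigma2 / 2) * np.eye(Nt)
b = H.T @ y

x0_diag = b / np.diag(A)            # x0 = D^{-1} H^T y (ignores correlation)
x0_full = np.linalg.solve(A, b)     # x0 = A^{-1} H^T y (exact, for reference)
print(np.linalg.norm(x0_diag - x), np.linalg.norm(x0_full - x))
```

On this correlated channel the diagonal initialiser carries a large structural error, while A^{-1} H^T y starts close to x, which is why the next section approximates A^{-1} cheaply rather than abandoning it.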

Improved BGS-Net Design
In this section, BGS-Net is improved to adapt to the MAUE system by replacing the initial solution x_0 = D^{-1} H^T y with x_0 = Ã^{-1} H^T y, where Ã^{-1} is a low-complexity approximation of A^{-1}.

Approximation of A −1
From Figures 3, 4 and 5, as the correlation coefficient increases, the channel becomes block-like in character. To approximate A^{-1}, as in Figure 9, divide the diagonal of matrix A into M/m_UE small matrices of size m_UE × m_UE, denoted in order by D_(2,t) ∈ R^{m_UE×m_UE}, t ∈ [1 : T]. The matrix A is also divided into 4 blocks, with the upper-left and lower-right blocks denoted D_(1,1), D_(1,2) ∈ R^{M/2×M/2}, respectively. To ensure convergence of the Neumann series, unlike [32], the parameter α_opt = 1 + η with η = M/N is introduced [39]. We scale every small matrix as D_(2,t) = D_(2,t) × α_opt and compute the inverse D^{-1}_(2,t) of each small matrix. We do not want to compute D^{-1}_(1,1) directly, because its complexity is O(M³/4); in this regime D_(1,1) has significant off-diagonal elements, unlike the purely diagonal structure under channel hardening. In this paper, the Neumann series is used to approximate it [32, 40] as follows: use E to represent the off-diagonal block part of D_(1,1) and use N to denote the Neumann approximation of D^{-1}_(1,1), so that D^{-1}_(1,1) is approximated with k_N Neumann terms.
Similarly, we obtain D^{-1}_(1,2), and then assemble D^{-1}_(1,1) and D^{-1}_(1,2) to obtain the required D^{-1}. In the same way we obtain Ã^{-1}. Considering the high complexity of Eq. (64), we rewrite Eq. (64) without changing its principle [41], with S_0 = D^{-1}(H^T y) and ϑ = D^{-1}E. In this way we bypass the high complexity of solving Eq. (64) directly and obtain x_0 ← Ã^{-1} H^T y with low complexity.
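The Neumann-series step above can be sketched as follows. For simplicity this example uses the scalar-diagonal special case (m_UE = 1 blocks) and omits the α_opt scaling and two-level block split; the function name and sizes are illustrative:

```python
import numpy as np

def neumann_inv(M_mat, D, K):
    """K-term Neumann series approximation of M^{-1}, where M = D + E and
    D is an easily inverted (block-)diagonal part:
    M^{-1} ≈ sum_{k=0}^{K} (-D^{-1} E)^k D^{-1}."""
    D_inv = np.linalg.inv(D)
    G = -D_inv @ (M_mat - D)           # -D^{-1} E
    approx = D_inv.copy()
    term = D_inv
    for _ in range(K):
        term = G @ term
        approx = approx + term
    return approx

rng = np.random.default_rng(8)
H = rng.standard_normal((128, 8)) / np.sqrt(128)
A = H.T @ H + 0.05 * np.eye(8)
D = np.diag(np.diag(A))                # scalar-diagonal special case
err0 = np.linalg.norm(np.linalg.inv(D) @ A - np.eye(8))   # 0-term residual
A_inv2 = neumann_inv(A, D, K=2)        # k_N = 2 extra terms, as in the paper
err2 = np.linalg.norm(A_inv2 @ A - np.eye(8))
print(round(err0, 4), round(err2, 4))
```

The residual after K terms behaves like (D^{-1}E)^{K+1}, so as long as the spectral radius of D^{-1}E is below 1 (which α_opt helps guarantee), a couple of terms already shrink the error substantially.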
(a) When the system is SAUE or MAUE (only T ):

Complexity comparison
This section analyses computational complexity in terms of the number of multiplications of the different algorithms, with N_r = 256, N_t = 32, k_N = k_F = 2, and m_UE = 4. From Table 1, in environment (a) (SAUE, or MAUE with only T), the complexity of BGS-Net is about 16% of MMSE, 0.01% of OAMPNet, 30% of TL-BD-INSA, and 20% of MMNet-iid; the complexity of Improved BGS-Net is about 38% of MMSE, 0.02% of OAMPNet, 42% of TL-BD-INSA, and 65% of MMNet-iid. In environment (b) (MAUE with both T and R), the complexity of BGS-Net is approximately 19% of MMSE, 0.01% of OAMPNet, 32% of TL-BD-INSA, and 21% of MMNet-iid; the complexity of Improved BGS-Net is about 46% of MMSE, 0.02% of OAMPNet, 80% of TL-BD-INSA, and 52% of MMNet-iid. Thus the complexity of our algorithms is very low. From [28], the complexity of MMNet-iid's nonlinear detection is O(M^2); the nonlinear detection of BGS-Net is much cheaper still, and since nonlinear detection is much cheaper than linear detection, we ignore its complexity.

Numerical results and discussion
In this section we give simulation results for MIMO detection with BGS-Net and Improved BGS-Net, and evaluate performance via the symbol error rate (SER) at different signal-to-noise ratios (SNR). The SNR of the system is defined as the ratio of the expected received signal power to the expected noise power. The data x are generated as QPSK modulation symbols. We trained the network for 1000 iterations using stochastic gradient descent with the Adam optimiser and a learning rate of 0.001. We set k_N = k_F = 2 and chose the ℓ2 loss as the cost function.
The different detectors are described in detail below; to reduce the high latency of the deep learning methods, all networks and iterative algorithms are set to 4 layers.
• TL_BD_INSA [32]: an improved Neumann series approximation algorithm based on a two-level block diagonal structure. • OAMPNet: a DL-based detector that extends the OAMP detector. In our simulations, OAMPNet has 4 layers, each with 2 learnable variables. • MMNet-iid: designed specifically for i.i.d. Gaussian channels. In our simulations, MMNet-iid has 4 layers, each with 2 learnable variables.

SAUE system
The performance of BGS-Net was tested on the SAUE system using QPSK modulation; the SNR during training was 3 dB.

Convergence analysis
As shown in Figure 10, the convergence speed of BGS-Net was tested for different numbers of network layers at the same SNR of 3 dB and the same antenna configuration N_r = 32, N_t = 4. The 3-layer BGS-Net has already converged, while MMNet-iid needs at least 7 layers and OAMPNet needs 4 layers, indicating that BGS-Net converges fastest in the SAUE environment. The SER of BGS-Net is much better than that of MMNet-iid in the SAUE system, but a gap remains between BGS-Net and OAMPNet.

Impact of ratio α
This section analyses the effect of the ratio α of receive to transmit antennas on algorithm performance. We set N_t = 4 and SNR = 3 dB, and compare performance for N_r = 24, 32, 40. As shown in Figure 11, as the ratio α increases, the SER decreases and the gap between the algorithms narrows. Studying N_t = 4, N_r = 40 separately, as shown in Figure 12, BGS-Net approaches the performance of OAMPNet with much lower complexity. When N_r = 24, the gap between BGS-Net and OAMPNet is 9 × 10^{-5}; when N_r = 40, the gap is 3 × 10^{-7}.

Impact of the number of antennas
This section analyses the influence of the number of antennas on algorithm performance, with the ratio α fixed at α = 8. As shown in Figures 13, 14, and 15, as the number of antennas increases, the performance of all algorithms improves; MMNet-iid changes the most, while BGS-Net changes little. BGS-Net is only slightly affected by the number of antennas, which shows that it is very robust. At the same time, BGS-Net is consistently better than Gauss-Seidel, which shows that the nonlinear activation function improves on plain Gauss-Seidel.

Effect of modulation order
This section analyses the impact of the modulation scheme on algorithm performance. We compare the performance of MMSE, Gauss-Seidel, and BGS-Net under QPSK and 16QAM, with the test SNR ranging from 5 to 9 dB and the training SNR set to 7 dB. As shown in Figure 16, as the modulation order increases, the performance of the algorithms decreases, but BGS-Net remains better than Gauss-Seidel, and Gauss-Seidel remains close to MMSE.

MAUE system
The MAUE system uses QPSK modulation, and the SNR during training is 4dB.

Convergence analysis
To study how the algorithms behave in the MAUE system compared with the SAUE system, we tested the convergence speed of BGS-Net and Improved BGS-Net for different numbers of network layers at the same SNR of 4 dB and the same antenna configuration N_t = 4, N_r = 32, as shown in Figure 17. The 3-layer Improved BGS-Net has already converged, BGS-Net and OAMPNet require 4 layers to converge, and MMNet-iid needs a 7-layer network. The performance of MMNet-iid is much lower than that of the other algorithms, while BGS-Net and Improved BGS-Net maintain only a slight performance gap with OAMPNet, and Improved BGS-Net performs better than BGS-Net.

Impact of ratio α
This section analyses the effect of α on algorithm performance under ξ_r = 0, ξ_t = 0.2. We set N_t = 4 and SNR = 4 dB, and compare performance for N_r = 32 and 40. As shown in Figure 18, as α increases, the performance of Improved BGS-Net approaches that of OAMPNet. As shown in Figure 19, when the antenna ratio is α = 11, the performance gap between Improved BGS-Net and OAMPNet is 2.5 × 10^{-6}. This shows that as long as α is large enough, Improved BGS-Net can approach OAMPNet with low complexity.

Impact of the number of antennas
This section analyses the influence of the number of antennas on algorithm performance, with α fixed at α = 8. Performance is compared for N_t = 4, N_r = 32; N_t = 8, N_r = 64; and N_t = 16, N_r = 128, as shown in Figures 20, 21, and 22. The performance gap between Improved BGS-Net and BGS-Net decreases as the number of antennas increases in the ξ_r = 0, ξ_t = 0.2 environment, while MMNet-iid improves much faster than the others, suggesting that the impact of correlation can be mitigated by increasing the number of antennas in this environment. The fact that our proposed algorithm is consistently better than TL_BD_INSA suggests that Improved BGS-Net does improve performance over using TL_BD_INSA as the initial solution.

Effect of modulation order
This section analyzes the effect of the modulation order on the performance of the algorithms. We compare MMSE, Gauss-Seidel, BGS-Net, and Improved BGS-Net under QPSK and 16QAM. As shown in Figure 23, the larger the modulation order, the lower the performance of the algorithms, and the gap between Improved BGS-Net and BGS-Net widens slightly. Under QPSK, BGS-Net coincides with Improved BGS-Net at 5 dB; under 16QAM, Improved BGS-Net is consistently better than BGS-Net.
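The modulation-order comparison can be reproduced in outline with a Monte-Carlo SER simulation of a plain MMSE detector over an iid Rayleigh channel. This is a sketch only: the SNR convention (per receive antenna, unit-energy symbols), trial count, and all names are our assumptions, and the paper's simulation setup may differ.

```python
import numpy as np

def qam_constellation(m):
    """Unit-energy square M-QAM constellation (M = 4 -> QPSK, M = 16 -> 16QAM)."""
    k = int(np.sqrt(m))
    pam = 2 * np.arange(k) - (k - 1)          # e.g. [-1, 1] or [-3, -1, 1, 3]
    pts = (pam[:, None] + 1j * pam[None, :]).ravel()
    return pts / np.sqrt(np.mean(np.abs(pts) ** 2))

def ser_mmse(m, snr_db, n_t=4, n_r=32, n_trials=2000, seed=0):
    """Monte-Carlo SER of an MMSE detector over an iid Rayleigh channel."""
    rng = np.random.default_rng(seed)
    const = qam_constellation(m)
    sigma2 = n_t / 10 ** (snr_db / 10)        # noise variance for unit-energy symbols
    errors = 0
    for _ in range(n_trials):
        H = (rng.standard_normal((n_r, n_t)) +
             1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
        idx = rng.integers(len(const), size=n_t)
        s = const[idx]
        n = np.sqrt(sigma2 / 2) * (rng.standard_normal(n_r) +
                                   1j * rng.standard_normal(n_r))
        y = H @ s + n
        A = H.conj().T @ H + sigma2 * np.eye(n_t)
        s_hat = np.linalg.solve(A, H.conj().T @ y)
        # nearest-neighbor hard decision against the constellation
        dec = np.argmin(np.abs(s_hat[:, None] - const[None, :]), axis=1)
        errors += np.count_nonzero(dec != idx)
    return errors / (n_trials * n_t)
```

Running this with m = 4 versus m = 16 at the same SNR reproduces the qualitative trend in Figure 23: the denser constellation has a smaller minimum distance and therefore a higher SER.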

Effect of transmit correlation
To explore the effect of the correlation between a user's multiple antennas on the performance of the algorithms, we make two comparisons: one with N_t = 8, N_r = 64, ξ_r = 0, ξ_t = 0.2 or 0.4, as shown in Figure 24; the other with N_t = 16, N_r = 128, ξ_r = 0, ξ_t = 0.2 or 0.4, as shown in Figure 25. We find that the greater the correlation between the multiple antennas, the lower the performance of the algorithms and the greater the gap between Improved BGS-Net and BGS-Net, which suggests that our improvements make BGS-Net better adapted to the MAUE system environment.
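Correlated channels of this kind are commonly generated with the Kronecker model, H = R_r^(1/2) H_w R_t^(1/2), using exponential correlation matrices parameterized by ξ_r and ξ_t. The following is a sketch under that assumption (the paper's exact channel generation may differ; function names are ours):

```python
import numpy as np

def exp_corr_matrix(n, xi):
    """Exponential correlation matrix: R[i, j] = xi^|i - j|."""
    idx = np.arange(n)
    return xi ** np.abs(idx[:, None] - idx[None, :])

def kronecker_channel(n_r, n_t, xi_r, xi_t, rng):
    """H = R_r^(1/2) H_w R_t^(1/2), with H_w having iid CN(0, 1) entries."""
    H_w = (rng.standard_normal((n_r, n_t)) +
           1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
    # Cholesky factors serve as matrix square roots (R is positive definite for |xi| < 1)
    L_r = np.linalg.cholesky(exp_corr_matrix(n_r, xi_r))
    L_t = np.linalg.cholesky(exp_corr_matrix(n_t, xi_t))
    return L_r @ H_w @ L_t.conj().T
```

Setting ξ_r = 0 or ξ_t = 0 reduces the corresponding correlation matrix to the identity, recovering the uncorrelated Rayleigh case on that side of the link.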

Effect of receive correlation
This section analyzes the impact of the correlation between the multiple antennas of the BS on the performance of the algorithms. We make two comparisons: one with N_t = 16, N_r = 128, ξ_r = 0 or 0.2, ξ_t = 0.4, as shown in Figure 26; the other with N_t = 32, N_r = 256, ξ_r = 0 or 0.2, ξ_t = 0.4, as shown in Figure 27. It is found that the greater the correlation between the multiple antennas of the BS, the lower the performance of the algorithms, and the greater the gap between Improved BGS-Net and BGS-Net. In this environment Improved BGS-Net is very close to MMSE, so our algorithm is only applicable to low and medium correlation.

Table 2 When N_t = 8, N_r = 64, the complexity required to reach the performance in Figure 28 (calculated according to Table 1, normalized to the complexity of MMSE; NAN indicates that the performance cannot be achieved). Columns: Algorithms, Computing Complexity.

Table 3 When N_t = 16, N_r = 128, the complexity required to reach the performance in Figure 28 (calculated according to Table 1, normalized to the complexity of MMSE; NAN indicates that the performance cannot be achieved). Columns: Algorithms, Computing Complexity.

Comprehensive analysis of complexity and performance
From Figure 28, Table 2, and Table 3, we can see that, for the same number of antennas, the proposed algorithm requires more layers to converge as the degree of correlation increases. To achieve the same performance, although BGS-Net and Improved BGS-Net require more layers than OAMPNet, their required complexity is much lower than that of OAMPNet. When the number of antennas is increased, the performance of the algorithms should improve, but since the number of individual terminals changes from 2 to 4, more layers are required to converge.

SER performance with channel estimation error
In the presence of channel estimation errors, we investigate the performance of the proposed algorithm in uplink multi-user massive MIMO systems. The estimated channel matrix is given by

Ĥ = H + ΔH ∈ C^(N_r × N_t),    (70)

where ΔH ∈ C^(N_r × N_t) is an error matrix with iid complex Gaussian entries of zero mean and variance σ_ε². As shown in Figures 29 and 30, when there is a channel estimation error, the performance of Improved BGS-Net is very close to that of OAMPNet. As the channel estimation error increases, the performance of all algorithms decreases, but the proposed detection algorithm still has good SER performance and is more robust to channel estimation error.
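The error model Ĥ = H + ΔH is straightforward to simulate. Below is a minimal sketch (the function name and the σ_ε² parameter naming are ours):

```python
import numpy as np

def add_estimation_error(H, sigma_eps2, rng):
    """Return H_hat = H + Delta_H, where Delta_H has iid CN(0, sigma_eps2) entries."""
    dH = np.sqrt(sigma_eps2 / 2) * (rng.standard_normal(H.shape) +
                                    1j * rng.standard_normal(H.shape))
    return H + dH
```

The factor sqrt(σ_ε²/2) per real dimension gives each complex entry of ΔH a total variance of σ_ε²; σ_ε² = 0 recovers perfect channel knowledge.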

SER performance with noise uncertainty
Next, we investigate the effect of noise-variance uncertainty on the performance of the different DL detectors. It is assumed that the noise variance is unknown in both the training and testing phases, so when evaluating performance on test data, the noise variance differs from that used in training. Suppose the estimated noise variance is σ̂² = ησ². We define the noise uncertainty factor (NUF) as NUF = 10 log10 η.
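The noise-uncertainty experiment amounts to handing a detector the mismatched variance σ̂² = ησ² with NUF = 10 log10 η. A sketch using a plain MMSE detector as a stand-in (names are ours; the DL detectors in the paper consume σ̂² analogously):

```python
import numpy as np

def mismatched_mmse(H, y, sigma2, nuf_db=0.0):
    """MMSE detection with a mismatched noise-variance estimate.

    The detector uses sigma2_hat = eta * sigma2, where
    NUF = 10 * log10(eta); NUF = 0 dB means perfect knowledge.
    """
    sigma2_hat = 10 ** (nuf_db / 10) * sigma2
    A = H.conj().T @ H + sigma2_hat * np.eye(H.shape[1])
    return np.linalg.solve(A, H.conj().T @ y)
```

Sweeping nuf_db while holding the true σ² fixed reproduces the experimental setting of Figure 31: detectors whose filters depend explicitly on σ² degrade, while iterations that do not rely on it are unaffected.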
As can be seen in Figure 31, both MMNet-iid and OAMPNet incur considerable performance losses when the estimated noise variance deviates from the true variance, and the performance gap between OAMPNet and BGS-Net/Improved BGS-Net becomes more obvious as the estimate grows less accurate. In contrast, BGS-Net and Improved BGS-Net are hardly affected by inaccurate noise-variance estimates and show good robustness.

Conclusion
We propose a new model-driven deep learning network for MIMO detection, BGS-Net, and build on it with Improved BGS-Net. The network is based on the Gauss-Seidel method coupled with a non-linear activation function, and exhibits excellent performance. It has few adjustable parameters to optimise, and the training process is simple and fast. In this paper, single-antenna user equipment (SAUE) and multiple-antenna user equipment (MAUE) systems are considered under Rayleigh channels. Simulation results show that the performance of BGS-Net is significantly better than that of the Gauss-Seidel algorithm; the proposed scheme is suitable for massive MIMO with low complexity, and its performance can be improved by increasing the ratio between the receiving and transmitting antennas; BGS-Net is robust, with performance little affected by variation in the number of antennas; and under the MAUE system, Improved BGS-Net performs better than BGS-Net, with both suitable for low- and medium-correlation MAUE systems.