
Neighbor-based joint spatial division and multiplexing in massive MIMO: user scheduling and dynamic beam allocation


Two-stage precoding schemes have been developed to reduce the channel estimation overhead in FDD systems. By integrating user scheduling into these schemes, it becomes possible to meet the quality-of-service requirements of high-density wireless communication systems, despite the limitations on spatial resources and transmit power budget. In this paper, we propose a user scheduling and dynamic beam allocation method for neighbor-based joint spatial division multiplexing (N-JSDM) transmission. The user scheduling problem is formulated as a 0–1 quadratic programming problem to maximize effective spectral efficiency (ESE) using directional channel properties. To address the complexity issue, convex relaxation and linearization methods are employed to transform the problem into a 0–1 mixed integer linear programming problem, and a dimensionality reduction method is introduced. The proposed user scheduling-aided N-JSDM scheme reduces downlink training length and feedback of channel state information. Additionally, a dynamic configuration form is used for pre-beamforming matrix design based on user distribution, outperforming conventional approaches. Simulation results demonstrate that the proposed scheme achieves higher ESE than previous methods.

1 Introduction

Over the past three decades, the data rates of wireless communication have doubled every eighteen months and are projected to reach terabits per second in the near future [1]. Massive multiple-input multiple-output (MIMO) has been a crucial technology for enhancing system throughput and providing reliable communication [2]. By employing a large-scale antenna array at the base station (BS), massive MIMO achieves higher data transmission rates, with the number of BS antennas significantly surpassing the number of served user terminals. It utilizes spatial resources and capitalizes on the multipath propagation characteristics to establish a parallel transmission mechanism, multiplying system capacity without the need for additional spectrum resources or transmit power [3]. In the forthcoming communication systems, massive MIMO will continue to play a pivotal role.

Massive MIMO relies on the channel state information (CSI), which is the communication link state information from the transmitter to the receiver [4]. When the CSI is perfect, the performance of massive MIMO scales linearly with the smaller of the numbers of transmit and receive antennas [5], underscoring the critical importance of obtaining instantaneous CSI. In previous research on massive MIMO systems, time division duplex (TDD) mode has been widely adopted. TDD leverages channel reciprocity, enabling the estimation of downlink CSI through the uplink channel, thereby reducing spectral overhead [6,7,8,9]. However, the prevailing wireless standards predominantly employ frequency division duplex (FDD) systems, which offer more mature industrial products and market share [10]. Furthermore, in the extensively studied millimeter-wave frequency band, FDD systems may exhibit similarly impressive performance in cell-free massive MIMO systems [11]. Nonetheless, due to the absence of channel reciprocity, FDD massive MIMO systems necessitate substantial downlink training length (DTL) and CSI feedback during the downlink communication to acquire CSI at the transmitter [12]. Additionally, the cost of reconfiguring frequency bands to accommodate TDD in FDD systems is considerably high [13]. Consequently, for FDD massive MIMO systems, acquiring CSI presents a significant challenge, particularly for telecom operators compelled to upgrade their existing FDD systems to 5G wireless communications.

Considerable research effort has been devoted to reducing DTL and channel feedback in FDD massive MIMO systems. Similar to the CSI acquisition in TDD mode, several studies (e.g., [11] and [13]) leverage angle reciprocity by transmitting uplink pilots to obtain CSI, thereby eliminating the need for CSI feedback. The minimum number of pilots required corresponds to the number of terminals. Moreover, some works focus on the spatially correlated MIMO channels and utilize the structure of CSI to reduce DTL and CSI feedback. Specifically, compressed sensing techniques are used to exploit channel sparsity [14, 15]. Expanding on the consideration of spatial correlation, further studies have taken into account temporal correlation and leveraged the spatial and temporal common sparsity of massive MIMO channels to acquire CSI with reduced overhead [4, 16]. Additionally, a two-stage beamforming scheme called joint spatial division multiplexing (JSDM) based on statistical CSI has been proposed [12]. The JSDM beamforming scheme comprises two stages. In the first stage, the pre-beamformer uses the channel covariance matrix (CCM) to mitigate inter-group interference. In the second stage, the instantaneous CSI of each group is used to design a precoding scheme for intra-group interference suppression. Obtaining the statistical CSI is easier than obtaining the instantaneous CSI since it varies at a slower rate [17, 18].

Extensive research attention has been paid to enhancing the performance of JSDM [19,20,21,22,23]. Some works consider the pre-beamformer design to achieve a better spectral efficiency [19,20,21]. Specifically, due to the non-convexity caused by signal-to-interference-plus-noise ratio (SINR) as an optimization criterion, Kim et al. proposed to use signal-to-leakage-plus-noise ratio (SLNR) as the optimization objective and simplified the SLNR-based pre-beamformer design problem to the trace quotient problem encountered in the field of machine learning [19]. In [20], Jeon et al. used the minimum mean squared error criterion to design the pre-beamformer and multi-user precoder sequentially. However, none of the above works considered the impact of user grouping. Since the channel covariance matrices of users differ, and the goal of user grouping is to make users in each group have a common eigen-subspace, there will inevitably be overlapping signal spaces between groups. Eliminating inter-group interference by pre-beamforming will reduce the signal space and result in a loss of system performance. Recently, a scheme called neighbor-based JSDM (N-JSDM) was proposed in [21], which avoids the user grouping problem by adopting the neighbor scheme to fully utilize the signal space. N-JSDM is still a two-stage scheme. In the first stage, a pre-beamforming matrix is designed according to the CCMs to reduce the interference of non-neighbors, and the effective channel matrix becomes a band matrix. Neighbor interference is removed in the second stage. Besides, Khalilsarai et al. proposed a method to approximate the downlink CCM of users as the columns of the discrete Fourier transformation matrix, particularly when the number of antennas at the BS is large [22]. This approximation enables the BS to utilize codebook-based beam selection for designing the pre-beamforming matrix, thereby reducing the computational complexity.
There are also works to improve the performance of JSDM from the aspects of antenna structures [23] and BS selection [24]. Tang et al. provided an analysis of two-stage precoding designs under different antenna structures, offering guidelines for antenna structure selection to achieve a better balance between performance and cost [23]. Considering that the overlap of the angle-spreading-ranges (ASR) of different user clusters may seriously degrade the performance of two-stage precoding, Ma et al. proposed a solution to minimize ASR using BS selection [24].

As the number of users increases in the system, inter-user interference becomes severe, and a portion of degrees of freedom is used to mitigate inter-group interference, resulting in a degradation of desired signal energy [21]. Therefore, it is necessary to schedule users to improve the spectral efficiency. User scheduling in conventional JSDM schemes is divided into two parts: user grouping and intra-group user scheduling. Before implementing JSDM beamforming, users need to be grouped, and the users in each group share a common eigen-subspace, i.e., group eigen-space, where the group eigen-spaces of different groups are orthogonal or non-overlapping. Several user grouping methods have been proposed [25,26,27,28]. For example, the K-means clustering algorithm based on chordal distance and fixed quantization algorithm based on discrete Fourier transform are proposed in [25]. Xu et al. presented a K-means algorithm based on weighted likelihood metric in [26]. Nam et al. transformed user grouping into a subspace packing problem in Grassmann manifold [27], while a recent work [28] proposes a hierarchical clustering algorithm that considers both the number of groups and the chordal distance threshold. Besides, intra-group user scheduling has also been studied. A scheduling algorithm based on average SLNR has been proposed [28]. The iterative SLNR-based group scheduling combines the outer precoder and group scheduling to achieve better performance. Xu et al. proposed an optimized scheduling algorithm based on channel quality indicator (CQI). The algorithm assumes that the users cannot achieve the maximum value on two or more beams and assigns a specific beam to each user based on CQI, allowing the user to obtain maximum gain on that beam [26].

Considering the advantages of N-JSDM, incorporating user scheduling into the N-JSDM transmission scheme enables better integration of precoding techniques, further optimizing system performance and enhancing communication quality. In this paper, we propose a user scheduling and dynamic beam allocation method for the N-JSDM transmission scheme to maximize effective spectral efficiency (ESE) subject to limited pilot length. Specifically, considering the challenges in acquiring complete CSI, we formulate the user scheduling problem as a 0–1 quadratic programming by leveraging the channel directional features. Since the users are randomly distributed, we propose dynamically allocating the number of beams serving each user. This idea is incorporated into the optimization problem as a constraint, and the pre-beamformer is designed accordingly. Additionally, we transform the optimization problem into a 0–1 mixed integer programming problem using convex relaxation and linearization techniques. Simulation results demonstrate the validity of the theoretical analysis. The primary contributions of this paper can be summarized as follows:

  • We analyze the factors that impact ESE and formulate the user scheduling problem as a 0–1 quadratic programming problem with linear constraints, leveraging the channel directional features. These features are more stable over larger time scales compared to instantaneous CSI, which varies according to the channel coherence time.

  • To simplify the 0–1 quadratic programming problem, we employ convex relaxation and linearization techniques to transform it into a mixed integer linear programming problem. Additionally, to further reduce computational complexity, we propose a dimensionality reduction method.

  • A pre-beamformer design using a dynamic allocation scheme is proposed. Since the number of beams serving each user is determined by the interference between the user and its neighbors, the design is well suited to realistic scenarios where users are randomly distributed and/or DTL is limited.

The rest of the paper is organized as follows. Section 2 describes the system model and the N-JSDM scheme. The problem formulation of N-JSDM user scheduling is provided in Sect. 3. The beam allocation method based on overlap density, the linearization method of 0–1 quadratic programming, and the dimensionality reduction method are presented in Sect. 4. In Sect. 5, we propose a pre-beamformer design with dynamic beam configuration. Simulation results and discussion are given in Sect. 6. Finally, we conclude this paper in Sect. 7.


Bold uppercase letters indicate matrices, and bold lowercase letters represent column vectors. The i-th row and i-th column of matrix \({\textbf {A}}\) are denoted by \({\textbf {a}}^{i}\) and \({\textbf {a}}_{i}\), respectively. The factorial of \(a\) is represented by \(a !\). \({\textbf {I}}_{N}\) represents the \(N\times N\) identity matrix, while the superscripts \((\cdot )^{H}\) and \((\cdot )^{T}\) denote the conjugate transpose and transpose of a matrix, respectively. The pseudo-inverse operation is denoted by \((\cdot )^{\dag }\). The orthogonal complement space is represented by \(span ^{\perp }(\cdot )\). The Hadamard product is denoted by \(\odot\). The sets of natural numbers and complex numbers are denoted by \(\mathbb {N}\) and \(\mathbb {C}\), respectively. \(\iota\) represents the imaginary unit, i.e., \(\iota = \sqrt{-1}\).

2 Preliminary

2.1 System model

We consider a typical single-cell FDD massive MIMO system where a BS is equipped with a uniform linear array (ULA) of \(M\) elements serving \(K\) single-antenna users. The BS applies a precoder \({\textbf {V}} \in \mathbb {C}^{M \times K }\) in the downlink to transmit symbols. Then, the received signal at the users can be written as

$$\begin{aligned} {\textbf {y}} = {\textbf {H}}^{H }{} {\textbf {V}}{} {\textbf {s}} + {\textbf {n}}, \end{aligned}$$

where \({\textbf {y}} = [y _{1}, y _{2},\ldots ,y _{K}]^{T } \in \mathbb {C}^{K \times \text {1}}\) with \(y _{k }\) being the received signal of user \(k\), \({\textbf {H}}^{H } = [{\textbf {h}}_{1},{\textbf {h}}_{2},\ldots ,{\textbf {h}}_{K}]^{H }\in \mathbb {C}^{K \times M }\) is the channel matrix with \({\textbf {h}}^{H }_{k }\in \mathbb {C}^{\text {1}\times M }\) being the channel from BS to user \(k\), \({\textbf {s}} = [s _{1},s _{2},\ldots ,s _{K }]^{T }\in \mathbb {C}^{K \times \text {1}}\) is the transmitted signal satisfying a power constraint \(E ({\textbf {s}}{} {\textbf {s}}^{H }) = {\textbf {I}}_{K }\), and \({\textbf {n}} = [n _{1},n _{2},\ldots ,n _{K }]^{T }\in \mathbb {C}^{K \times \text {1}}\) denotes the additive white Gaussian noise vector with \({\textbf {n}} \sim \mathcal {C}\mathcal {N}({\textbf {0}},{\textbf {I}}_{K })\).

In this paper, we adopt a one-ring (OR) channel model [29], in which user \(k\) has an azimuth angle \(\theta _{k }\) and an angular spread (AS) \(\Delta\), and \(\theta _{k }\) is randomly distributed. Here, we sort the users as \(\theta _{\text {1}}\le \theta _{\text {2}}\le \cdots \le \theta _{K }\). The \((m ,p )\)-th element of the CCM \({\textbf {C}}_{k }\) of user \(k\) is [29]

$$\begin{aligned}{}[{\textbf {C}}_{k }]_{m ,p } = \frac{\textrm{1}}{\textrm{2}\Delta }\int _{\theta _{k }-\Delta }^{\theta _{k } +\Delta }{} e ^{\frac{-\iota \text {2}\pi D (m -p )\sin \theta }{\lambda _{C }}}{\hbox {d}}\theta , \end{aligned}$$

where \(\lambda _{C }\) is the carrier wavelength and \(D = \lambda _{C }/\text {2}\) is the spacing between two antenna elements. According to the Karhunen–Loeve representation, we can write the channel vector of user \(k\) as \({\textbf {h}}_{k } = {\textbf {C}}_{k }^{\text {1/2}}{} {\textbf {z}}_{k }\), where \({\textbf {z}}_{k }\) is small-scale fading with \({\textbf {z}}_{k }\sim \mathcal {C}\mathcal {N}({\textbf {0}},{\textbf {I}}_{M })\). Letting \({\textbf {C}}=\sum _{k=1}^{K}{} {\textbf {C}}_{k}\), \(span ({\textbf {H}})\) can be any subspace of \(span ({\textbf {C}})\).
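To make the one-ring model concrete, the following is a minimal numerical sketch, not taken from the paper. It approximates the CCM integral with a trapezoidal rule and draws one channel realization via \({\textbf {h}}_{k } = {\textbf {C}}_{k }^{\text {1/2}}{} {\textbf {z}}_{k }\). The function names `one_ring_ccm` and `sample_channel`, the grid size `n_grid`, and the use of NumPy are illustrative assumptions; angles are taken in radians and \(D/\lambda _{C } = 1/2\) as in the text.

```python
import numpy as np

def one_ring_ccm(theta_k, delta, M, n_grid=2001):
    """Approximate the one-ring CCM C_k on an M-element ULA (angles in radians)."""
    theta = np.linspace(theta_k - delta, theta_k + delta, n_grid)
    step = theta[1] - theta[0]
    # Trapezoidal weights: half weight at the two end points.
    weights = np.full(n_grid, step)
    weights[[0, -1]] = step / 2
    # (m - p) for every antenna pair; D / lambda_C = 1/2 is assumed.
    diff = np.arange(M)[:, None] - np.arange(M)[None, :]
    integrand = np.exp(-1j * np.pi * diff[:, :, None] * np.sin(theta))
    return (integrand * weights).sum(axis=-1) / (2 * delta)

def sample_channel(C, rng):
    """Draw h_k = C_k^{1/2} z_k with z_k ~ CN(0, I_M)."""
    eigval, eigvec = np.linalg.eigh(C)
    eigval = np.clip(eigval, 0.0, None)  # remove tiny negative round-off
    C_half = (eigvec * np.sqrt(eigval)) @ eigvec.conj().T
    M = C.shape[0]
    z = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)
    return C_half @ z

C = one_ring_ccm(theta_k=0.3, delta=0.1, M=8)
h = sample_channel(C, np.random.default_rng(0))
```

The matrix square root is computed from an eigendecomposition because the CCM is Hermitian and typically low rank, so a Cholesky factorization could fail numerically.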

Since statistical CSI varies much slower than the instantaneous CSI, the BS can accurately obtain the statistical CSI through long-term feedback [32, 33].

2.2 The description of N-JSDM scheme

The N-JSDM scheme uses a neighbor scheme instead of a grouping scheme to fully utilize the signal space and thus provides better performance. The following is a brief introduction to N-JSDM.

For user \(k\), if \(\mid \theta _{k } -\theta _{j }\mid >\omega\), then user \(j\) is called user \(k\)’s non-neighbor, and the index set of user \(k\)’s non-neighbors is \(\bar{\Omega }_{k } = \{j |\mid \theta _{k }-\theta _{j }\mid >\omega \}\), where \(\omega\) is called neighbor angular spread (NAS) (in [21], \(\omega\) is chosen to be \(2\Delta\)); the index set of user \(k\) and its neighbors is \(\Omega _{k }= \{j |\mid \theta _{k }-\theta _{j }\mid \le \omega \}\). Since \(\theta _{\text {1}}\le \theta _{\text {2}}\le \cdots \le \theta _{K }\), the elements in \(\Omega _{k }\) are consecutive numbers, and \(\Omega _{k }\) can be represented as \(\Omega _{k } = \{k _{l },\ldots ,k ,\ldots ,k _{u }\}\). In the following, we refer to the set \(\Omega _{k }\) as the neighbor domain of user \(k\).
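The neighbor-domain definition can be illustrated with a short sketch. The helper name `neighbor_domains` and the sample angles are hypothetical, and indices are 0-based here whereas the text numbers users from 1.

```python
def neighbor_domains(thetas, omega):
    """For each user k, list the indices j with |theta_k - theta_j| <= omega.

    `thetas` is assumed to be sorted in non-decreasing order, so every
    returned list consists of consecutive indices (k_l, ..., k, ..., k_u).
    """
    return [
        [j for j, theta_j in enumerate(thetas) if abs(theta_k - theta_j) <= omega]
        for theta_k in thetas
    ]

# Four users with angular spread 0.1 rad, so omega = 2 * 0.1 as chosen in [21].
domains = neighbor_domains([0.0, 0.1, 0.25, 0.6], omega=0.2)
# domains == [[0, 1], [0, 1, 2], [1, 2], [3]]
```

User 3 in this example has no neighbors, so its neighbor domain contains only itself.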

N-JSDM is a two-stage beamforming scheme. In the first stage, user \(k\)’s CCM \({\textbf {C}}_{k } (k =\text {1},\text {2},\ldots ,K )\) is used to design the pre-beamforming matrix \({\textbf {B}}_{k }\) to reduce non-neighbor interference, so that for each \(k\)

$$\begin{aligned} {\textbf {h}}_{k }^{H }{} {\textbf {B}}_{i }={\textbf {0}} ,i \in \bar{\Omega }_{k }. \end{aligned}$$

The effective channel matrix after the pre-beamforming stage is \({\textbf {H}}^{H }{} {\textbf {B}}\) where \({\textbf {B}}=[{\textbf {B}}_{\text {1}},{\textbf {B}}_{\text {2}},\ldots ,{\textbf {B}}_{K }]\). From Eq. (3), the \(k\)-th row of \({\textbf {H}}^{H }{} {\textbf {B}}\) can be written as

$$\begin{aligned} \begin{aligned} {\textbf {h}}_{k }^{H }{} {\textbf {B}}&= ({\textbf {h}}_{k }^{H }{} {\textbf {B}}_{\text {1}},{\textbf {h}}_{k }^{H }{} {\textbf {B}}_{\text {2}},\ldots , {\textbf {h}}_{k }^{H }{} {\textbf {B}}_{K })\\&=(\cdots ,\text {0},\text {0},{\textbf {h}}_{k }^{H }{} {\textbf {B}}_{\Omega _{k }},\text {0},\text {0},\ldots ), \end{aligned} \end{aligned}$$

where \({\textbf {B}}_{\Omega _{k }}=[{\textbf {B}}_{k _{l }},\ldots , {\textbf {B}}_{k },\ldots ,{\textbf {B}}_{k _{u}}]\) is defined as the matrix composed of the pre-beamforming matrix of user \(k\) and its neighbors. Equation (4) indicates that \({\textbf {h}}_{k}^{H }{} {\textbf {B}}\) has a continuous sequence of \(col({\textbf {B}}_{\Omega _{k}})\) nonzero values, where \(col(\cdot )\) refers to the number of columns. It should be noted that since the azimuth angle of users is sorted, when \(\theta _{k }>\theta _{j }\), there must be \(k _{l }\ge j _{l }\) and \(k _{u }\ge j _{u }\), so the effective channel matrix \({\textbf {H}}^{H }{} {\textbf {B}}\) is a band matrix.

In the second stage of N-JSDM, to eliminate the interference from neighbors, \({\textbf {W}}=(\tilde{{\textbf {H}}})^{\dag }\varvec{\Gamma }\) is designed using the zero forcing criterion [34]. Here, \(\tilde{{\textbf {H}}}\) represents the estimation of the effective channel matrix \({\textbf {H}}^{H }{} {\textbf {B}}\), and \(\varvec{\Gamma }=diag(\gamma _{\text {1}},\gamma _{\text {2}},\ldots ,\gamma _{K })\) is used to normalize each column of \(\tilde{{\textbf {H}}}\). As a result, the SINR at user \(k\) is given by [21]

$$\begin{aligned} SINR _{k }=\frac{\mid {\textbf {h}}_{k }^{H }{} {\textbf {B}} {\textbf {w}}_{k }\mid ^{\text {2}}}{\sum _{k ^{\prime }\ne k }\mid {\textbf {h}}_{k }^{H }{} {\textbf {B}}{} {\textbf {w}}_{k ^{\prime }}\mid ^{\text {2}}+\sigma ^{\text {2}}}, \end{aligned}$$

where \({\textbf {w}}_{k }\) is the \(k\)-th column of \({\textbf {W}}\) and \(\sigma ^{\text {2}}\) is the noise power. It should be noted that the precoding matrix of N-JSDM is written as \({\textbf {V}}={\textbf {BW}}\).
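The second stage and the resulting SINR can be sketched numerically as follows. This is only an illustration under stated assumptions: `H_eff` stands in for a perfectly estimated effective channel \({\textbf {H}}^{H }{} {\textbf {B}}\) (a random matrix here, without the band structure), and \(\varvec{\Gamma }\) is assumed to scale each column of the pseudo-inverse to unit norm, which is one possible reading of the normalization. The function names are hypothetical.

```python
import numpy as np

def zf_precoder(H_eff):
    """Zero-forcing precoder W = pinv(H_eff) @ Gamma for a K x b effective channel."""
    W = np.linalg.pinv(H_eff)                 # shape b x K
    gamma = 1.0 / np.linalg.norm(W, axis=0)   # assumed: unit-norm columns of W
    return W * gamma

def sinr(H_eff, W, sigma2=1.0):
    """SINR of every user; G[k, j] = |h_k^H B w_j|^2."""
    G = np.abs(H_eff @ W) ** 2
    signal = np.diag(G)
    interference = G.sum(axis=1) - signal
    return signal / (interference + sigma2)

rng = np.random.default_rng(0)
H_eff = (rng.standard_normal((3, 6)) + 1j * rng.standard_normal((3, 6))) / np.sqrt(2)
W = zf_precoder(H_eff)
print(sinr(H_eff, W))
```

With a perfect estimate and a full-row-rank effective channel, the interference term vanishes and each SINR reduces to \(\gamma _{k }^{\text {2}}/\sigma ^{\text {2}}\).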

In more practical scenarios, users are randomly distributed. The conventional JSDMs divide users into G groups, and when the users are distributed randomly, there always exists common space between the signal spaces of adjacent groups. The signal space of the g-th group is denoted by \(span ({\textbf {H}}_{g})\). To mitigate the inter-group interference, \(span ({\textbf {B}}_{g})\) is orthogonal to all the signal space \(span ({\textbf {H}}_{j}),~j\ne g\). This means that \(span ({\textbf {B}}_{g})\) is orthogonal to all the overlapped signal space, and hence \(\cup _{g=1,2,\cdots ,G}{} span ({\textbf {B}}_{g})\) (i.e., span(B)) is orthogonal to all the overlapped signal space. Consequently, \(span ({\textbf {H}})\nsubseteq span ({\textbf {B}})\), resulting in a lower-dimensional utilized signal space \(span ({\textbf {B}}^{H }{} {\textbf {H}})\) compared to the full signal space \(span ({\textbf {H}})\), thereby decreasing the performance of JSDM. Compared to conventional JSDMs, N-JSDM offers the following advantages: Firstly, it achieves higher spectral efficiency. N-JSDM employs a neighbor grouping approach to further divide users into subgroups, eliminating the requirement for users in the same neighbor domain to share the same common subspace. This allows for the use of more refined precoding techniques to reduce interference, thereby improving the system’s spectral efficiency. Secondly, N-JSDM exhibits better interference mitigation capabilities. When designing the pre-beamforming scheme, N-JSDM takes into account the mutual interference between subgroups. By optimizing the pre-beamforming matrix, interference between subgroups can be more effectively reduced, enhancing the system’s interference mitigation performance. Therefore, N-JSDM is considered a more promising and feasible beamforming scheme.

3 User scheduling in N-JSDM

The N-JSDM with user scheduling involves three stages. The first stage is user scheduling, which is used to determine the azimuth angle of the scheduled users and the number of beams serving each user, denoted by \(\theta _{u }\) and \(g _{u }\), respectively. The second and third stages are pre-beamforming and multi-user precoding, which are used to obtain matrix \({\textbf {B}}\) and matrix \({\textbf {W}}\), respectively.

In a more realistic scenario where only a limited number of users are randomly distributed in the cell, it is more reasonable to dynamically allocate the number of beams serving each user. Therefore, in the user scheduling stage, we propose to allocate beams according to the interference between users and their neighbors. The idea of dynamic allocation is also extended to the pre-beamforming stage. In the pre-beamforming stage, the obtained \(\theta _{u }\) and \(g _{u }\) of the scheduled users are used to design the pre-beamformer. Further details regarding the design of the pre-beamformer can be found in Sect. 5. The transformation process of addressing the user scheduling problem is outlined below.

Because there is no concept of user group, the user scheduling approach in N-JSDM is fundamentally different from that of conventional JSDM. To address this, we propose a user scheduling algorithm that solely relies on user directional features. Specifically, we use two channel directional features [35]: azimuth angle and AS. Compared to the instantaneous CSI, these features are more stable [32, 33], and easier to obtain.

3.1 Problem formulation

The objective of scheduling is to maximize the ESE of the system. Assuming a coherence block with \(T _{C }\) symbols and pilot length \(P\), the ESE of user \(k\) can be expressed as

$$\begin{aligned} R _{k }=\left( \text {1}-\frac{P }{T _{C }}\right) \log _{\textrm{2}}(\textrm{1}+SINR _{k }). \end{aligned}$$
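As a quick numerical illustration of the pre-log factor, the sketch below evaluates the ESE expression for a fixed SINR and two pilot lengths. The function name and the sample values are hypothetical and the SINR is given in linear scale.

```python
import math

def effective_spectral_efficiency(sinr, pilot_len, coherence_len):
    """ESE of one user: (1 - P / T_C) * log2(1 + SINR), SINR in linear scale."""
    return (1 - pilot_len / coherence_len) * math.log2(1 + sinr)

# Same SINR, longer pilot: the pre-log factor drops from 0.9 to 0.8.
short_pilot = effective_spectral_efficiency(sinr=3.0, pilot_len=10, coherence_len=100)
long_pilot = effective_spectral_efficiency(sinr=3.0, pilot_len=20, coherence_len=100)
```

The comparison shows why a pilot-length constraint enters the scheduling problem: extra pilots only pay off if they raise the SINR enough to compensate for the smaller pre-log factor.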

Note that there is overlapping signal space between some users in the system, and such overlapping signal space represents the inter-user interference (IUI). In the OR channel model, the angle region of user \(k\) is defined as

$$\begin{aligned} \Phi _{k } = (\theta _{k }-\Delta ,\theta _{k }+\Delta ). \end{aligned}$$

Based on the angle region, we introduce the overlap angle (OA) to represent the degree of overlap between users. If there is an intersection between the angle regions of user \(k\) and user \(j\), i.e., \(\Phi _{k }\cap \Phi _{j }\ne \emptyset\), the intersection is called an OA. The OA of user \(k\) and user \(j\) is defined as (\(j ,k =\text {1},\text {2},\ldots ,K\))

$$\begin{aligned} A _{kj } = {\left\{ \begin{array}{ll} -\mid \theta _{k }-\theta _{j }\mid +\textrm{2}\Delta , &{}\mid \theta _{k }-\theta _{j }\mid \le \textrm{2}\Delta , j \ne k ; \\ ~~~~~~~~~~0, &{} else. \end{array}\right. } \end{aligned}$$

These angles depict the interference among users. Since \(\omega =\text {2}\Delta\), the OA between user \(k\) and user \(j\) is nonzero if they are neighbors, and zero otherwise. By using the OA, we can construct an angle matrix \({\textbf {A}}\), where \(A _{kj }\) is the (\(k ,j\))-th element of matrix \({\textbf {A}}\). The \(k\)-th row of the matrix \({\textbf {A}}\) can be written as

$$\begin{aligned} \hspace{-0.1mm}{} {\textbf {A}}^{k } = (A _{k \text {1}},A _{k \text {2}},\ldots ,A _{kK }) =(\cdots ,\text {0},\text {0},{\textbf {A}}_{k \Omega _{k }},\text {0},\text {0},\ldots ), \end{aligned}$$

where \({\textbf {A}}_{k \Omega _{k }}=[A _{k k _{l }},\ldots ,A _{k {} k _{u }}]\) is composed of the OA of user \(k\) and its neighbors. From (9), it can be observed that \({\textbf {A}}_{k \Omega _{k }}\) has \(| \Omega _{k }|-\text {1}\) nonzero elements, where \(|\Omega _{k }|\) refers to the number of elements in the index set \(\Omega _{k }\). To distinguish the neighbors and non-neighbors, we introduce an unweighted matrix \(\hat{{\textbf {A}}}\), whose \(k\)-th row can be written as

$$\begin{aligned} \hat{\textbf{A}}^{k } =(\textrm{0},\ldots ,\textrm{0},\underbrace{\textrm{1},\textrm{1}, \ldots ,\textrm{1},\overbrace{\textrm{0}}^{user k },\textrm{1},\ldots ,\textrm{1}}_{\mid \Omega _{k }\mid -\textrm{1}},\textrm{0},\ldots ,\textrm{0}). \end{aligned}$$
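The angle matrix \({\textbf {A}}\) and the unweighted matrix \(\hat{{\textbf {A}}}\) can be built directly from the sorted azimuth angles and the AS. The following sketch uses hypothetical helper names, 0-based indices, and plain nested lists; it assumes \(\omega =\text {2}\Delta\) as in the text.

```python
def overlap_angle_matrix(thetas, delta):
    """A[k][j] = 2*delta - |theta_k - theta_j| for neighbors j != k, else 0."""
    K = len(thetas)
    A = [[0.0] * K for _ in range(K)]
    for k in range(K):
        for j in range(K):
            gap = abs(thetas[k] - thetas[j])
            if j != k and gap <= 2 * delta:
                A[k][j] = 2 * delta - gap
    return A

def unweighted_matrix(thetas, delta):
    """A_hat[k][j] = 1 if j is a neighbor of k (j != k), else 0."""
    K = len(thetas)
    return [
        [1 if j != k and abs(thetas[k] - thetas[j]) <= 2 * delta else 0
         for j in range(K)]
        for k in range(K)
    ]

thetas = [0.0, 0.1, 0.25, 0.6]
A = overlap_angle_matrix(thetas, delta=0.1)
A_hat = unweighted_matrix(thetas, delta=0.1)
```

The unweighted matrix is derived from the neighbor criterion rather than from the nonzero pattern of \({\textbf {A}}\), since a neighbor at exactly \(\text {2}\Delta\) has zero OA but still belongs to the neighbor domain.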

Since the denominator of SINR contains the interference term, there is a strong correlation between IUI and SINR. By reducing interference among users, SINR increases. Furthermore, in practical systems, the length of pilot sequences is often limited. As a result, the problem of maximizing the ESE is transformed into minimizing interference while adhering to the constraint of maximum pilot length.

To describe the problem, we use \(x _{i }\) to denote whether user \(i\) is selected, i.e.,

$$\begin{aligned} x _{i } = {\left\{ \begin{array}{ll} \text {1}, &{} selected;\\ \text {0}, &{} not~selected. \end{array}\right. } \end{aligned}$$

Then, the problem of minimizing the sum of OAs with pilot constraints is formulated as

$$\begin{aligned}&\mathcal {P}_{\textrm{1}}: \min _{{\textbf {x}}}\sum _{i =\text {1}}^{K }\sum _{j =\text {1}}^{K }{} A _{ij }{} x _{i }{} x _{j } \end{aligned}$$
$$\begin{aligned}&s.t. ~~ \sum _{i =\text {1}}^{K }{} x _{i }=U \end{aligned}$$
$$\begin{aligned}&~~~~ \sum _{j \in \Omega _{i }}\beta _{i }{} g _{s } x _{j }\le P _{C } \end{aligned}$$
$$\begin{aligned}&~~~~~~~~ {\textbf {x}}\in \{\text {0},\text {1}\}^{K }, \end{aligned}$$

where \(U\) represents the number of scheduled users, \(\beta _{i }\) denotes the weighted factor used to adjust the number of beams allocated to each user, \(g _{s }\) refers to the average number of beams assigned to each user in the system after scheduling, \(P _{C }\) represents the maximum pilot constraint, and \({\textbf {x}}=[x _{\text {1}},x _{\text {2}},\ldots ,x _{K }]\in \{\text {0},\text {1}\}^{K }\) with \(x _{i }\in \{\textrm{0},\textrm{1}\}\). For a more efficient system, we aim to allocate a total number of beams close to M when the number of scheduled users is high. Conversely, when the number of scheduled users is low, increasing the number of beams allocated to each user beyond a certain point will not improve system performance. Therefore, an upper bound value \(\xi\) is set for the number of beams assigned to each scheduled user. Based on these considerations, the value of \(g _{s }\) is set to \(\min (\xi ,M /U )\). The specific design details of \(\beta _{i }\) can be found in Sect. 4.
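For very small \(K\), problem \(\mathcal {P}_{\textrm{1}}\) can be checked by exhaustive enumeration. The sketch below is only such a sanity check, not the solution method of this paper, and its cost grows combinatorially with \(K\). It assumes that the pilot constraint must hold for every neighbor domain \(\Omega _{i }\), that \(\beta _{i }\) is supplied by the caller, and it uses 0-based indices; the function name and the sample data are hypothetical.

```python
from itertools import combinations

def solve_p1_bruteforce(A, neighbor_sets, beta, U, M, xi, P_C):
    """Enumerate all U-subsets and return (objective, selected users) or None."""
    K = len(A)
    g_s = min(xi, M / U)  # average number of beams per scheduled user
    best = None
    for chosen in combinations(range(K), U):
        selected = set(chosen)
        # Pilot constraint, assumed to be required for every neighbor domain.
        violated = any(
            beta[i] * g_s * sum(j in selected for j in neighbor_sets[i]) > P_C
            for i in range(K)
        )
        if violated:
            continue
        cost = sum(A[i][j] for i in chosen for j in chosen)
        if best is None or cost < best[0]:
            best = (cost, chosen)
    return best

A = [[0.0, 0.1, 0.0, 0.0],
     [0.1, 0.0, 0.05, 0.0],
     [0.0, 0.05, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0]]
neighbor_sets = [[0, 1], [0, 1, 2], [1, 2], [3]]
beta = [1.0, 1.0, 1.0, 1.0]  # hypothetical weights
loose = solve_p1_bruteforce(A, neighbor_sets, beta, U=2, M=8, xi=4, P_C=8)
tight = solve_p1_bruteforce(A, neighbor_sets, beta, U=2, M=8, xi=4, P_C=4)
```

In this toy instance the looser pilot budget admits users 0 and 2, while the tighter budget excludes that pair because both lie in the neighbor domain of user 1, so users 0 and 3 are selected instead.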

It is evident that the angle matrix \({\textbf {A}}\) serves as the coefficient matrix in the objective function of \(\mathcal {P}_{\textrm{1}}\). A quadratic program is convex when the Hessian matrix of its objective function is positive semidefinite, and strictly convex when the Hessian matrix is positive definite. In the case of \(\mathcal {P}_{\textrm{1}}\), the Hessian matrix of the objective function is represented by \({\textbf {L}}=\text {2}{} {\textbf {A}}\). If matrix \({\textbf {A}}\) is positive definite, then according to the properties of eigenvalues and the necessary and sufficient conditions for a positive definite matrix, matrix \({\textbf {L}}\) is also positive definite.

However, since all diagonal elements of matrix \({\textbf {A}}\) are \(\text {0}\), its first leading principal minor is \(\text {0}\) and some of the higher-order leading principal minors may be negative. Therefore, matrix \({\textbf {A}}\) cannot be a positive definite matrix. To address this, we add a scalar matrix to matrix \({\textbf {A}}\), transforming it into a positive definite matrix \({\textbf {A}}_{P }\), which can be expressed as

$$\begin{aligned} {\textbf {A}}_{P } = {\textbf {A}} + \alpha \cdot {\textbf {I}}_{K }. \end{aligned}$$

If \(\alpha\) surpasses the absolute value of the minimum eigenvalue of matrix \({\textbf {A}}\), \({\textbf {A}}_{P}\) is deemed positive definite [36]. For simplicity, we set \(\alpha\) as the smallest positive integer that ensures the positive definiteness of matrix \({\textbf {A}}_{P}\). By replacing the coefficient matrix in the objective function of \(\mathcal {P}_{\textrm{1}}\) with matrix \({\textbf {A}}_{P }\), the optimization problem can be transformed into the following matrix form
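The choice of \(\alpha\) can be sketched numerically as follows, assuming NumPy is available. The function name is hypothetical, and the sketch picks the smallest positive integer strictly larger than the magnitude of the most negative eigenvalue of \({\textbf {A}}\).

```python
import math
import numpy as np

def positive_definite_shift(A):
    """Return (alpha, A_P) with A_P = A + alpha * I positive definite."""
    A = np.asarray(A, dtype=float)
    lam_min = np.linalg.eigvalsh(A).min()  # valid because A is real symmetric
    # Smallest positive integer strictly larger than -lam_min.
    # Caveat: if -lam_min is (numerically) an integer, round-off may shift the
    # result by one, so the definiteness of A_P should be re-checked.
    alpha = max(1, math.floor(-lam_min) + 1)
    return alpha, A + alpha * np.eye(A.shape[0])

A = [[0.0, 0.1, 0.0],
     [0.1, 0.0, 0.05],
     [0.0, 0.05, 0.0]]
alpha, A_P = positive_definite_shift(A)
```

Because OAs are bounded by \(\text {2}\Delta\), the eigenvalues of \({\textbf {A}}\) in this small example are well below one in magnitude, so \(\alpha = 1\) already suffices.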

$$\begin{aligned}&\mathcal {P}_{\textrm{2}}: \min _{{\textbf {x}}}{} {\textbf {x}}^{T }{} {\textbf {A}}_{P }{} {\textbf {x}} \end{aligned}$$
$$\begin{aligned}&s.t. ~~ {\textbf {e}}^{T }{} {\textbf {x}}=U \end{aligned}$$
$$\begin{aligned}&~~~~~~{\textbf {n}}_{B }\odot (\hat{{\textbf {A}}}_{f }{} {\textbf {x}})\le P _{C }\cdot {\textbf {e}} \end{aligned}$$
$$\begin{aligned}&~~~~~~ {\textbf {x}}\in \{\text {0},\text {1}\}^{K }, \end{aligned}$$

where \({\textbf {e}}=[\text {1},\text {1},\ldots ,\text {1}]^{T }\in \mathbb {N}^{K \times \textrm{1}}\), \(\hat{{\textbf {A}}}_{f }\) is a matrix formed by setting the diagonal elements of matrix \(\hat{\textbf{A}}\) to \(\text {1}\), \({\textbf {n}}_{B }=[\beta _{\textrm{1}}{} g _{s },\beta _{\textrm{2}}{} g _{s },\ldots ,\beta _{K }{} g _{s }]^{T }\), and \(\beta _{k }{} g _{s }\) is the number of beams serving users in the neighbor domain of user \(k\). It is worth noting that \(\mathcal {P}_{\textrm{1}}\) and \(\mathcal {P}_{\textrm{2}}\) are equivalent, as they share the same optimal solution and their optimal values differ by a constant \(\alpha \cdot U\).
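The constant offset can be verified in a few lines. Using only the constraints of the two problems, namely \(x _{i }\in \{\text {0},\text {1}\}\) (so that \(x _{i }^{\text {2}}=x _{i }\)) and \(\sum _{i }x _{i }=U\), the objective of \(\mathcal {P}_{\textrm{2}}\) expands as

$$\begin{aligned} {\textbf {x}}^{T }{} {\textbf {A}}_{P }{} {\textbf {x}}&= {\textbf {x}}^{T }{} {\textbf {A}}{} {\textbf {x}} + \alpha \sum _{i =\text {1}}^{K }x _{i }^{\text {2}} \\&= {\textbf {x}}^{T }{} {\textbf {A}}{} {\textbf {x}} + \alpha \sum _{i =\text {1}}^{K }x _{i } \\&= \sum _{i =\text {1}}^{K }\sum _{j =\text {1}}^{K }{} A _{ij }{} x _{i }{} x _{j } + \alpha U . \end{aligned}$$

Since \(\alpha U\) does not depend on which users are selected, every feasible \({\textbf {x}}\) is ranked identically by both objectives.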


Matrix \({\textbf {A}}\) (or matrix \({\textbf {A}}_{P }\)) is derived from two directional features of the users, namely the azimuth angle \(\theta\) and angular spread \(\Delta\). As a result, the proposed user scheduling algorithm only requires these two directional features to perform the scheduling task.

4 Linearization of 0–1 quadratic programming

In this section, we propose methods to solve \(\beta _{i }\) and the scheduling problem \(\mathcal {P}_{\textrm{2}}\).

4.1 Beam allocation based on overlap density

Before scheduling, only azimuth angle \(\theta\), AS \(\Delta\) and NAS \(\omega\) can be determined. It is crucial to note that the pre-beamforming matrix \({\textbf {B}}\) of N-JSDM is solved iteratively, and \(span ({\textbf {B}}_{k })=span ^{\perp }(\bar{{\textbf {C}}}_{k })\cap span ({\textbf {C}}_{k })\cap span ^{\perp }({\textbf {B}}_{\Psi _{k }})\), where \({\textbf {B}}_{\Psi _{k }}=[{\textbf {B}}_{k _{l }},{\textbf {B}}_{k _{l }+\text {1}},\ldots ,{\textbf {B}}_{k -\text {1}}]\). This implies that the azimuth angles of users must be known during the process of solving \({\textbf {B}}\), making it challenging to obtain matrix \({\textbf {B}}\) during the user scheduling process. Therefore, we propose a beam allocation method based on the overlap density of neighbor domains. The method is aimed at determining the number of columns of the pre-beamforming matrix \({\textbf {B}}_{k }\).

Note that when the local distribution of users is dense, it is advisable to use a small number of beams to serve these users and use more beams to serve other users. This approach is based on the fact that the number of beams in a particular angle area is limited, and it can not only reduce the pilot overhead but also enable the system to serve more users.

The overlap density of the neighbor domain \(\Omega _{k }\) is used to describe the average degree of overlap between any two users in set \(\Omega _{k }\) and can be calculated as

$$\begin{aligned} \rho _{k }=\frac{\frac{\textrm{1}}{\textrm{2}}\sum _{i ,j \in \Omega _{k }}{} A _{ij }}{C _{|\Omega _{k }|}^{\text {2}}\cdot \text {2}\Delta }, \end{aligned}$$

where the numerator represents the sum of OAs between users in \(\Omega _{k }\), the coefficient \(\frac{\textrm{1}}{\textrm{2}}\) accounts for the symmetry of the real angle matrix \({\textbf {A}}\) (each pair is counted twice), and \(C _{|\Omega _{k }|}^{\text {2}}=\frac{|\Omega _{k }|!}{\textrm{2}!(|\Omega _{k }|-\text {2})!}\) in the denominator is the number of user pairs in \(\Omega _{k }\). Since the OA between any two users in set \(\Omega _{k }\) lies in \((\text {0},\text {2}\Delta ]\), the denominator of Eq. (15) is an upper bound on the sum of OAs between users in \(\Omega _{k }\), attained when every pair of users in \(\Omega _{k }\) has the maximum OA. Consequently, the value range of \(\rho _{k }\) is \((\text {0},\text {1}]\).

Next, \(\rho _{k}\) is used to determine the average number of beams allocated to each user within the set \(\Omega _{k}\). As explained in Sect. 3, the average number of beams serving each user in \(\Omega _{k }\) is \(\beta _{k }{} g _{s }\). Assuming that the value range of \(\beta _{k }{} g _{s }\) is \([g _{s }-\tau ,g _{s }+\tau ]\), then the expression of \(\beta _{k }{} g _{s }\) is as follows

$$\begin{aligned} \beta _{k }{} g _{s } = {\left\{ \begin{array}{ll} g _{s }+\tau (\text {1}-\text {2}\rho _{k })&{}, |\Omega _{k }|\ne \text {1};\\ ~~~~~~g _{s }&{}, |\Omega _{k }|=\text {1}. \end{array}\right. } \end{aligned}$$

As seen in (16), when some users in the system are densely distributed, i.e., the overlap density of their neighbor domains is high, the number of beams serving these users decreases, and vice versa. Algorithm 1 details how \({\textbf {n}}_{B }\) is obtained.


The value of \(\tau\) should not be too large, because when the overlap density of the neighbor domain \(\Omega _{k }\) is small, the average number of beams serving these users will be close to \(g _{s }+\tau\). This implies that the total number of beams serving users in this neighbor domain will increase by \(\tau |\Omega _{k }|\). Additionally, it is essential to emphasize that while solving problem \(\mathcal {P}_{\textrm{2}}\) (which will later be transformed into problem \(\mathcal {P}_{\textrm{5}}\)), we do not have knowledge of the exact number of beams serving each user, but only the average value in the neighbor domain.
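Eqs. (15) and (16) can be illustrated with a short numerical sketch. This is not the authors' Algorithm 1. It assumes that the angle matrix `A` has a zero diagonal, that each neighbor domain is passed in as a list of user indices (including user \(k\) itself), and that angles are given in degrees. The function names and toy values are hypothetical.

```python
from math import comb

import numpy as np


def overlap_density(A, omega_k, delta):
    """Eq. (15): average pairwise overlap in the neighbor domain omega_k."""
    idx = np.asarray(omega_k, dtype=int)
    n = idx.size
    if n < 2:
        return None  # no user pairs; Eq. (16) treats this case separately
    sub = np.asarray(A, dtype=float)[np.ix_(idx, idx)]
    # Sum over ordered pairs i != j, halved because A is symmetric.
    pair_sum = 0.5 * (sub.sum() - np.trace(sub))
    return pair_sum / (comb(n, 2) * 2.0 * delta)


def average_beams(rho_k, n_members, g_s, tau):
    """Eq. (16): average number of beams per user in the neighbor domain."""
    if n_members == 1:
        return float(g_s)
    return g_s + tau * (1.0 - 2.0 * rho_k)


# Toy example: three mutually neighboring users, Delta = 5 degrees.
A = np.array([[0.0, 10.0, 4.0],
              [10.0, 0.0, 6.0],
              [4.0, 6.0, 0.0]])
rho = overlap_density(A, [0, 1, 2], delta=5.0)
beams = average_beams(rho, 3, g_s=4, tau=1)
```

In this toy case the overlap density exceeds one half, so the average beam count falls below \(g _{s }\), which matches the behavior described for densely distributed users.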

Algorithm 1 Beam Allocation Based on Overlap Density

4.2 Linearization

The user scheduling problem \(\mathcal {P}_{\textrm{2}}\) is a 0–1 quadratic programming problem whose computational complexity increases exponentially with the problem size. To solve \(\mathcal {P}_{\textrm{2}}\) with low computational complexity, we further transform it into a 0–1 mixed integer linear program as follows. Consider the following optimization problem

$$\begin{aligned}&\mathcal {P}_{\textrm{3}}: \min _{{\textbf {x}},{\textbf {z}},{\textbf {s}}}{} {\textbf {e}}^{T }{} {\textbf {s}} \end{aligned}$$
$$\begin{aligned}&s.t. ~~ {\textbf {A}}_{P }{} {\textbf {x}}-{\textbf {z}}-{\textbf {s}}={\textbf {0}} \end{aligned}$$
$$\begin{aligned}&~~~~~~{\textbf {z}}^{T }{} {\textbf {x}}=\text {0} \end{aligned}$$
$$\begin{aligned}&~~~~~~ {\textbf {e}}^{T }{} {\textbf {x}}=U \end{aligned}$$
$$\begin{aligned}&~~~~~~{\textbf {n}}_{B }\odot (\hat{{\textbf {A}}}_{f }{} {\textbf {x}})\le P _{C } \cdot {\textbf {e}} \end{aligned}$$
$$\begin{aligned}&~~~~~~{\textbf {z}}\ge {\textbf {0}},{\textbf {s}}\ge {\textbf {0}},{\textbf {x}}\in \{\text {0},\text {1}\}^ {K }, \end{aligned}$$

where \({\textbf {z}}\in \mathbb {R}^{K \times \text {1}}\) and \({\textbf {s}}\in \mathbb {R}^{K \times \text {1}}\).

Theorem 1

\(\mathcal {P}_{\textrm{2}}\) has an optimal solution \({\textbf {x}}^{*}\) if and only if there are \({\textbf {z}}^{*}\) and \({\textbf {s}}^{*}\) such that \(({\textbf {x}}^{*},{\textbf {z}}^{*},{\textbf {s}}^{*})\) is an optimal solution to \(\mathcal {P}_{\textrm{3}}\), and \(\mathcal {P}_{\textrm{2}}\) and \(\mathcal {P}_{\textrm{3}}\) have the same optimal solution.


Proof: See Appendix A.

The constraint (17b) in \(\mathcal {P}_{\textrm{3}}\) is quadratic, so \(\mathcal {P}_{\textrm{3}}\) is not a linear program. To linearize it, we proceed as follows. From (17b) and \({\textbf {z}}\ge {\textbf {0}}\), we can deduce that if \(x _{i }=\text {1}\), then \(z _{i }\) must be \(\text {0}\), but if \(x _{i }\ne \text {1}\), then \(z _{i }\) is not necessarily \(\text {0}\). Moreover, from (17a) and \({\textbf {s}}\ge {\textbf {0}}\), we have \({\textbf {z}}\le {\textbf {A}}_{P }{} {\textbf {x}}\), implying an upper bound on \({\textbf {z}}\). Thus, we have \({\textbf {z}}\le {\textbf {A}}_{P }{} {\textbf {x}}\le \Vert {\textbf {A}}_{P }\Vert _{\infty }\cdot {\textbf {e}}\), where \(\Vert {\textbf {A}}_{P }\Vert _{\infty }=\max _{i }\sum _{j =\text {1}}^{K}|a _{ij }|\) is the infinity norm of the matrix \({\textbf {A}}_{P }\). By letting \(M _{T }=\Vert {\textbf {A}}_{P }\Vert _{\infty }\) and using \({\textbf {z}}\le M _{T }({\textbf {e}}-{\textbf {x}})\) to replace \({\textbf {z}}^{T }{} {\textbf {x}}=\text {0}\), we can transform \(\mathcal {P}_{\textrm{3}}\) into the following form

$$\begin{aligned}&\mathcal {P}_{\textrm{4}}: \min _{{\textbf {x}},{\textbf {z}},{\textbf {s}}}{} {\textbf {e}}^{T }\textbf{s} \end{aligned}$$
$$\begin{aligned}&s.t. ~~ {\textbf {A}}_{P }{} {\textbf {x}}-{\textbf {z}}-{\textbf {s}}={\textbf {0}} \end{aligned}$$
$$\begin{aligned}&~~~~~~{\textbf {z}}\le M _{T }({\textbf {e}}-{\textbf {x}}) \end{aligned}$$
$$\begin{aligned}&~~~~~~ {\textbf {e}}^{T }{} {\textbf {x}}=U \end{aligned}$$
$$\begin{aligned}&~~~~~~{\textbf {n}}_{B }\odot (\hat{{\textbf {A}}}_{f }{} {\textbf {x}})\le P _{C } \cdot {\textbf {e}} \end{aligned}$$
$$\begin{aligned}&~~~~~~{\textbf {z}}\ge {\textbf {0}},{\textbf {s}}\ge {\textbf {0}},{\textbf {x}}\in \{\text {0},\text {1}\}^ {K }. \end{aligned}$$


Theorem 2

\(({\textbf {x}},{\textbf {z}},{\textbf {s}})\) is a feasible solution of \(\mathcal {P}_{\textrm{3}}\) if and only if \(({\textbf {x}},{\textbf {z}},{\textbf {s}})\) is a feasible solution of \(\mathcal {P}_{\textrm{4}}\); \(({\textbf {x}},{\textbf {z}},{\textbf {s}})\) is an optimal solution of \(\mathcal {P}_{\textrm{3}}\) if and only if \(({\textbf {x}},{\textbf {z}},{\textbf {s}})\) is an optimal solution of \(\mathcal {P}_{\textrm{4}}\).


Proof: When \(({\textbf {x}},{\textbf {z}},{\textbf {s}})\) is a feasible solution of \(\mathcal {P}_{\textrm{3}}\), constraint (17b) together with \({\textbf {z}}\ge {\textbf {0}}\) gives \(z _{i }=\text {0}\) whenever \(x _{i }=\text {1}\), and \(z _{i }\le M _{T }\) holds whenever \(x _{i }=\text {0}\) because \({\textbf {z}}\le {\textbf {A}}_{P }{} {\textbf {x}}\le M _{T }\cdot {\textbf {e}}\). Hence \({\textbf {z}}\le M _{T }({\textbf {e}}-{\textbf {x}})\), and \(({\textbf {x}},{\textbf {z}},{\textbf {s}})\) is a feasible solution of \(\mathcal {P}_{\textrm{4}}\). Conversely, assume that \(\mathcal {P}_{\textrm{4}}\) has a feasible solution \(({\textbf {x}},{\textbf {z}},{\textbf {s}})\). Because \({\textbf {0}}\le {\textbf {z}}\le M _{T }({\textbf {e}}-{\textbf {x}})\), \(x _{i }=\text {1}\) implies \(z _{i }=\text {0}\), while \(x _{i }\ne \text {1}\) only implies \(z _{i }\le M _{T }\). Therefore, we obtain \({\textbf {z}}^{T }{} {\textbf {x}}=\text {0}\), indicating that \(({\textbf {x}},{\textbf {z}},{\textbf {s}})\) is also a feasible solution of \(\mathcal {P}_{\textrm{3}}\). Since the two problems share the same objective function and the same feasible set, \(({\textbf {x}},{\textbf {z}},{\textbf {s}})\) is an optimal solution of \(\mathcal {P}_{\textrm{3}}\) if and only if \(({\textbf {x}},{\textbf {z}},{\textbf {s}})\) is an optimal solution of \(\mathcal {P}_{\textrm{4}}\).
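The effect of the big-M constraint can be checked numerically for a fixed binary \({\textbf {x}}\). The sketch below assumes that \({\textbf {A}}_{P }\) has non-negative entries, which is an assumption made here for illustration. It picks the largest \({\textbf {z}}\) allowed by \({\textbf {0}}\le {\textbf {z}}\le M _{T }({\textbf {e}}-{\textbf {x}})\) and \({\textbf {s}}\ge {\textbf {0}}\), which minimizes \({\textbf {e}}^{T }{} {\textbf {s}}\) for that \({\textbf {x}}\). The function name and the toy matrix are hypothetical.

```python
import numpy as np


def best_z_and_s(A_P, x):
    """For a fixed binary x, return the (z, s) minimizing e^T s in P4.

    Assumes A_P is entrywise non-negative (illustrative assumption).
    """
    A_P = np.asarray(A_P, dtype=float)
    x = np.asarray(x, dtype=float)
    M_T = np.abs(A_P).sum(axis=1).max()    # infinity norm of A_P
    load = A_P @ x                          # (A_P x)_i for every user i
    z = np.minimum(load, M_T * (1.0 - x))   # largest z with s >= 0 and z <= M_T (e - x)
    s = load - z                            # equality constraint A_P x - z - s = 0
    return z, s, M_T


A_P = np.array([[0.0, 2.0, 1.0],
                [2.0, 0.0, 3.0],
                [1.0, 3.0, 0.0]])
x = np.array([1.0, 1.0, 0.0])
z, s, M_T = best_z_and_s(A_P, x)
```

For this example \({\textbf {z}}^{T }{} {\textbf {x}}=\text {0}\) holds automatically, and the resulting \({\textbf {e}}^{T }{} {\textbf {s}}\) coincides with the quadratic form \({\textbf {x}}^{T }{} {\textbf {A}}_{P }{} {\textbf {x}}\), which suggests how the linear objective can stand in for a quadratic one.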

4.3 The algorithm to obtain scheduled users and beams

It is worth noting that the solution space dimension of \(\mathcal {P}_{\textrm{4}}\) is \(\text {3}{} K\). This implies that if the number of original users in the system is large, the solution space dimension of \(\mathcal {P}_{\textrm{4}}\) will also be large. As the computational complexity grows with the size of the problem, \(\mathcal {P}_{\textrm{4}}\) still has a high complexity when the user scale is large. Thus, we simplify \(\mathcal {P}_{\textrm{4}}\) as follows: Since \({\textbf {A}}_{P }{} {\textbf {x}}-{\textbf {z}}-{\textbf {s}}={\textbf {0}} \Leftrightarrow {\textbf {A}}_{P }{} {\textbf {x}}-{\textbf {s}}={\textbf {z}}\) and \({\textbf {0}}\le {\textbf {z}}\le M _{T }({\textbf {e}}-{\textbf {x}})\), the constraints (18a) and (18b) can be transformed into \({\textbf {0}}\le {\textbf {A}}_{P }{} {\textbf {x}}-{\textbf {s}} \le M _{T }({\textbf {e}}-{\textbf {x}})\). Hence, \(\mathcal {P}_{\textrm{4}}\) can be transformed into

$$\begin{aligned}&\mathcal {P}_{\textrm{5}}: \min _{{\textbf {x}},{\textbf {s}}}{} {\textbf {e}}^{T }{} {\textbf {s}} \end{aligned}$$
$$\begin{aligned}&s.t. ~~ {\textbf {A}}_{P }{} {\textbf {x}}-{\textbf {s}}\ge {\textbf {0}} \end{aligned}$$
$$\begin{aligned}&~~~~~~{\textbf {A}}_{P }{} {\textbf {x}}-{\textbf {s}}\le M _{T }({\textbf {e}}-{\textbf {x}}) \end{aligned}$$
$$\begin{aligned}&~~~~~~ {\textbf {e}}^{T }{} {\textbf {x}}=U \end{aligned}$$
$$\begin{aligned}&~~~~~~{\textbf {n}}_{B }\odot (\hat{{\textbf {A}}}_{f }{} {\textbf {x}})\le P _{C }\cdot {\textbf {e}} \end{aligned}$$
$$\begin{aligned}&~~~~~~{\textbf {s}}\ge {\textbf {0}},{\textbf {x}}\in \{\text {0},\text {1}\}^{K }. \end{aligned}$$

\(\mathcal {P}_{\textrm{5}}\) is a mixed integer linear program that can be solved using the branch and bound algorithm. Here, we implemented the branch and bound algorithm using the MOSEK optimization solver [37] in the CVX toolbox.


In practical scenarios, local users may be densely distributed, and/or the pilot requirement \(P _{C }\) may be too strict, so that problem \(\mathcal {P}_{\textrm{5}}\) has no feasible solution. In such cases, we gradually reduce the number of scheduled users \(U\) until they can be effectively served. To achieve this, we decrease \(U\) by one at a time, update \(g _{s }\), and then solve \(\mathcal {P}_{\textrm{5}}\) again under the updated conditions.
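The fallback on \(U\) can be sketched as a loop around any solver for \(\mathcal {P}_{\textrm{5}}\). The sketch below replaces branch and bound with exhaustive enumeration, which is only practical for very small \(K\), and evaluates the objective as \({\textbf {x}}^{T }{} {\textbf {A}}_{P }{} {\textbf {x}}\), assuming \({\textbf {A}}_{P }\) is entrywise non-negative. The callback `beams_for`, which returns \({\textbf {n}}_{B }\) for a given \(U\), is a hypothetical stand-in for the update of \(g _{s }\), whose rule is defined elsewhere in the paper.

```python
from itertools import combinations

import numpy as np


def solve_p5_exhaustive(A_P, A_f_hat, n_B, U, P_C):
    """Exhaustive reference solver for tiny K (not branch and bound)."""
    K = A_P.shape[0]
    best_x, best_val = None, np.inf
    for subset in combinations(range(K), U):
        x = np.zeros(K)
        x[list(subset)] = 1.0
        # Pilot constraint: n_B * (A_f_hat x) <= P_C, element-wise.
        if np.any(n_B * (A_f_hat @ x) > P_C):
            continue
        val = float(x @ A_P @ x)  # equals min e^T s for this x if A_P >= 0
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val


def schedule_with_fallback(A_P, A_f_hat, beams_for, U, P_C):
    """Decrease U by one until the problem becomes feasible."""
    while U >= 1:
        x, val = solve_p5_exhaustive(A_P, A_f_hat, beams_for(U), U, P_C)
        if x is not None:
            return x, U, val
        U -= 1
    return None, 0, np.inf


# Toy example: three mutually neighboring users, two beams each on average.
A_P = np.array([[0.0, 2.0, 1.0],
                [2.0, 0.0, 3.0],
                [1.0, 3.0, 0.0]])
A_f_hat = np.ones((3, 3))
x_opt, U_served, val = schedule_with_fallback(
    A_P, A_f_hat, lambda U: np.array([2.0, 2.0, 2.0]), U=3, P_C=4)
```

With these toy values, scheduling all three users violates the pilot constraint, so the loop settles on two users and picks the pair with the smallest mutual overlap.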

Once \(\mathcal {P}_{\textrm{5}}\) has been solved, we can determine the average number of beams serving each user in each neighbor domain and the scheduled users. However, the exact number of beams serving each user remains unknown. To address this issue, we utilize a linear system of equations to calculate \(g _{k }\). Firstly, we sort the users in ascending order based on their azimuth angle and obtain the angle matrix \({\textbf {A}}_{S }\) of the scheduled users. We set its diagonal elements to \(\text {1}\) and convert it into an unweighted matrix \(\hat{{\textbf {A}}}_{S }\). Then, we sum the rows of the matrix and convert it into a diagonal matrix \({\textbf {D}}_{S }\). The form of matrix \({\textbf {D}}_{S }\) is as follows

$$\begin{aligned} {\textbf {D}}_{S } = diag(|\Omega _{\text {1}}|,|\Omega _{\text {2}}|,\ldots ,|\Omega _{U }|). \end{aligned}$$

We can also get the \(\beta _{k }{} g _{s }(k =\text {1},\text {2},\ldots ,K )\) corresponding to the remaining users and sort them in ascending order, i.e., \(\check{{\textbf {n}}}_{B }=(\check{\beta }_{\text {1}}{} g _{s }, \check{\beta }_{\text {2}}{} g _{s },\ldots , \check{\beta }_{U }{} g _{s })\). Considering that some users are neighbors with each other but non-neighbors with other users, we take the average of \(\check{\beta }_{u }\) for these neighbor users. The system of equations for solving \(g _{u }(u =\text {1},\text {2},\ldots ,U )\) is as follows

$$\begin{aligned} \hat{{\textbf {A}}}_{S }{} {\textbf {g}}={\textbf {D}}_{S }\check{{\textbf {n}}}_{B }. \end{aligned}$$

The solution of \({\textbf {g}}\) is \({\textbf {g}}=[g _{\text {1}},g _{\text {2}},\ldots ,g _{U }]^{T }=(\hat{{\textbf {A}}}_{S })^{\dag }{} {\textbf {D}}_{S }\check{{\textbf {n}}}_{B }\). As \(g _{u }\) may be non-integer, we round it down, i.e., \(g _{u } = \lfloor g _{u }\rfloor\), and set \(g _{u }\) to \(\text {1}\) if it is less than \(\text {1}\). Please refer to Algorithm 2 for the details of solving \(\mathcal {P}_{\textrm{5}}\) and determining the number of beams.
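The last step can be sketched with a pseudo-inverse. The sketch assumes that \(\hat{{\textbf {A}}}_{S }\) is a 0/1 matrix with unit diagonal, so that its row sums give \(|\Omega _{u }|\). The small tolerance added before rounding down is a numerical guard introduced here and is not part of the paper's description. The function name and toy values are hypothetical.

```python
import numpy as np


def beams_per_user(A_S_hat, n_B_check, tol=1e-9):
    """Solve A_S_hat g = D_S n_B_check, round down, and enforce g_u >= 1."""
    A_S_hat = np.asarray(A_S_hat, dtype=float)
    n_B_check = np.asarray(n_B_check, dtype=float)
    D_S = np.diag(A_S_hat.sum(axis=1))           # diag(|Omega_1|, ..., |Omega_U|)
    g = np.linalg.pinv(A_S_hat) @ (D_S @ n_B_check)
    g = np.floor(g + tol)                         # tol guards against 1.999999... -> 1
    return np.maximum(g, 1).astype(int)


# Toy example: user 2 neighbors users 1 and 3; users 1 and 3 are not neighbors.
A_S_hat = np.array([[1, 1, 0],
                    [1, 1, 1],
                    [0, 1, 1]])
g = beams_per_user(A_S_hat, [2.0, 2.0, 2.0])
g_small = beams_per_user(A_S_hat, [0.4, 0.4, 0.4])
```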

Algorithm 2 Acquisition of scheduled users and the number of beams

Three benchmark algorithms are considered in this paper. User scheduling and pre-beamforming in the active channel sparsification method [22] are coupled, requiring the solution of a mixed integer linear program for joint beam and user selection. However, since the optimization toolkit used is not specified, its computational complexity cannot be determined. Therefore, we briefly analyze and compare the computational complexity of the proposed algorithm and the other two benchmark algorithms. The user scheduling in conventional JSDM schemes consists of two stages: user grouping and intra-group user scheduling. In the user grouping stage, the computational complexity of the K-means user grouping method with chordal distance in [25] is \(\mathcal {O}(N_{it}KG(2M^{3}+M^{2}))\), where \(N_{it}\) is the default number of iterations. The computational complexity of the agglomerative user clustering method with chordal distance in [28] is \(\mathcal {O}(\frac{K(K-1)}{2}(2M^{3}+M^{2}))\). Since intra-group user scheduling is often coupled with beamforming design, it would be unfair to compare its computational complexity with our proposed algorithm. The computational complexity of the greedy intra-group user scheduling algorithm in [26] is \(\mathcal {O}(UK)\) when the termination condition is modified to scheduling \(U\) users and the complexity of beamforming design is ignored. It can be observed that the user grouping stage is the main contributor to the complexity.

In contrast, the computational complexity of our proposed algorithm (Algorithm 2) depends on the complexity of two sub-processes: Algorithm 1 and optimization problem \(\mathcal {P}_{\textrm{5}}\). The computational complexity of Algorithm 1 is \(\mathcal {O}(K(K-1))\), where \(K-1\) is the number of times to determine the edge neighbors of each user. Optimization problem \(\mathcal {P}_{\textrm{5}}\) is solved using the branch and bound method, with a computational complexity of \(\mathcal {O}(2^{2K})\), where 2K represents the problem scale. Therefore, the overall computational complexity of our proposed algorithm is \(\mathcal {O}(U(K(K-1)+2^{2K}))\).

4.4 Discussion on proposed user scheduling algorithm

This scheme has three advantages. First, it includes a beam allocation method that accounts for the overlap density of the neighbor domain, which guarantees that all scheduled users can be served. This addresses the problem that the pre-beamforming design method with constrained DTL in N-JSDM cannot be implemented when the local user distribution is dense. Second, the scheme takes the influence of the pilot length into account. Third, the scheme is adaptive: if the optimization problem has no solution, the number of scheduled users is gradually reduced until they can be served. Reducing the number of scheduled users in this way brings benefits such as reduced system load, improved user experience, and decreased interference levels [25].

Figures 1 and 2 illustrate two examples of user scheduling results in a macro-cell scenario. Figures 1a and 2a are drawn from the same initial distribution of users, as are Figs. 1b and 2b. In Fig. 1, the hollow diamond at the center represents the massive MIMO base station, and the large circle indicates the coverage area with a radius of 50 km. Other markers represent users, where hollow circles denote unscheduled users, and solid circles represent scheduled users. In certain scenarios, the actual number of scheduled users, denoted as \(U'\), may fall short of our expectations due to unfavorable initial user distributions and stringent pilot conditions (for example, Fig. 1b).

In Fig. 2, we employ bar graphs to illustrate the scheduling status of users. The vertical axis represents the average number of beams serving each user in their respective neighborhoods, while the horizontal axis represents the user indices. Unfilled bars indicate unscheduled users, while filled bars indicate scheduled users. It can be observed that when the system imposes a limited pilot length, the desired number of scheduled users cannot be achieved, resulting in \(U'<U\). In such cases, the relationship between the average number of beams serving users in their neighborhoods and their scheduling status is not evident. A high average number of beams serving each user in the neighborhood can be attributed to two factors: low overlap density in user neighborhoods and users having fewer neighbors. According to the expression of the pilot length (22), the pilot length is related not only to the number of neighbors but also to the total number of beams serving users in their neighborhoods. Hence, even if the average number of beams per user is relatively small, a subset of users will still be scheduled to ensure the scheduling of \(U'\) users within the limited pilot length. User \(\text {16}\) and user \(\text {36}\) in Fig. 2a and user \(\text {20}\) and user \(\text {21}\) in Fig. 2b serve as examples of this scenario.

Figure 3 displays the ESEs under different values of \(P _{C }\). As \(P _{C }\) increases, the ESE initially increases and then levels off. This indicates that a small \(P _{C }\) often leads to a failure in scheduling the expected number of users, resulting in performance degradation. Furthermore, it can be observed that a small value of parameter \(\xi\) primarily helps to maintain a better level of ESE when \(P _{C }\) is relatively small. Additionally, as shown in Fig. 3b, when the number of scheduled users is small, the value of \(P _{C }\) at which the ESE levels off also decreases proportionally. From Fig. 3b, we can also observe that when the number of scheduled users is small and \(P_{C}\) is large, the ESE for \(\xi =2\) is significantly lower than for other values. This occurs because a larger \(P_{C}\) generally makes it more likely that the expected number of users is scheduled. Given the condition \(M /U > 2\), fewer beams are allocated to each user with \(\xi =2\) than with other values. This restricts the number of columns of \({\textbf {B}}\) to a value significantly smaller than the number of antennas \(M\), resulting in a larger discrepancy between the column space of \({\textbf {B}}\) and the column space of \({\textbf {C}}\) compared to other values. Consequently, the ESE is lower for \(\xi =2\). Therefore, the parameter \(\xi\) should be set based on both \(P _{C }\) and the number of scheduled users \(U\) to optimize system performance.

Fig. 1 User scheduling scenarios

Fig. 2 Average number of beams under different pilot constraints

Fig. 3 Comparison of the ESEs under different \(P_{C}\)s. \(K=36\), \(\Delta =5^{\circ }\), \(\text {NAS}=10^{\circ }\), \(\text {SNR}=20~\text {dB}\)

5 Pre-beamformer design with dynamic beam configuration

We now present the dynamic pre-beamformer design for scheduled users, which differs from previous pre-beamformer designs in N-JSDM by considering the specific user distribution to dynamically configure the beams. The previous designs include the optimal design and the design method with constrained DTL [21]. While the optimal design achieves good performance with a large DTL, it is not suitable for pilot-constrained scenarios. To reduce the DTL, the constrained DTL design limits the number of columns of the pre-beamforming matrix for each user [21]. Specifically, the number of columns of the pre-beamforming matrix \({\textbf {B}}\) is set to \(\lfloor g {} K \rfloor\), where \(g = M /K\), and \(\lfloor \cdot \rfloor\) is the round down operation. The number of columns in \({\textbf {B}}_{k }\) is \(\lfloor g {} k \rfloor - \lfloor g (k - \text {1})\rfloor\).
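The fixed column allocation of the constrained DTL design can be reproduced with a few lines. The sketch below evaluates \(\lfloor g {} k \rfloor - \lfloor g (k - \text {1})\rfloor\) with \(g = M /K\) using integer arithmetic, which is a choice made here to avoid floating-point rounding. The function name is hypothetical, and the values of \(M\) and \(K\) are only an example.

```python
def constrained_dtl_columns(M, K):
    """Number of columns of B_k, k = 1..K, under the constrained DTL design."""
    # floor(g * k) with g = M / K is computed exactly as (M * k) // K.
    return [(M * k) // K - (M * (k - 1)) // K for k in range(1, K + 1)]


# Example with M = 64 antennas and K = 36 users.
cols = constrained_dtl_columns(64, 36)
total_columns = sum(cols)
```

The per-user counts depend only on the user index and not on where the users are located, which is the limitation the dynamic configuration is meant to remove.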

However, the constrained DTL design method has a fixed number of columns for \({\textbf {B}}_{k }\), which makes it unsuitable for scenarios with randomly distributed users. Therefore, we propose dynamically configuring the number of columns in \({\textbf {B}}_{k }\). Notably, in scenarios with harsh pilot conditions, the optimal design may not meet the transmission requirement, while our proposed method can satisfy it. In the following, we describe how to implement this method using the obtained \(\theta _{u }\) and \(g _{u }\).

Assume that \(\theta _{u }\) and \(g _{u }\) of the scheduled users are given. For notational consistency, we still use the subscript \(k\) to denote the parameters related to user \(k\) in this section. From Eq. (4), user \(k\) only needs to feed back \({\textbf {h}}_{k }^{H }{} {\textbf {B}}_{\Omega _{k }}\) to the BS. The feedback length \(d _{k }\) equals the number of elements of \({\textbf {h}}_{k }^{H }{} {\textbf {B}}_{\Omega _{k }}\), i.e., the number of columns of \({\textbf {B}}_{\Omega _{k }}\). The minimum DTL is \(L =\max _{k }{} d _{k }\). In this work, we consider the case where the pilot length equals the minimum DTL.

In this paper, the pilot length is limited. Since the number of columns of matrix \({\textbf {B}}_{k }\) is \(g _{k }\), the index set of user \(k\) and its neighbors has a linear relationship with the number of columns of \({\textbf {B}}_{\Omega _{k }}\), i.e., the number of neighbors of the user has a linear relationship with the number of pilots. The pilot length \(P\) is given by

$$\begin{aligned} P = \max _{\Omega _{k }}\sum _{i \in \Omega _{k }}{} g _{i }. \end{aligned}$$
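As a small illustration of this expression, the sketch below computes the pilot length from per-user beam counts. It assumes each neighbor domain is given as a list of user indices that includes the user itself. The function name and toy values are hypothetical.

```python
import numpy as np


def pilot_length(neighbor_sets, g):
    """P = max over neighbor domains of the sum of g_i inside the domain."""
    g = np.asarray(g)
    return int(max(g[list(omega)].sum() for omega in neighbor_sets))


# Toy example: user 1 neighbors users 0 and 2; users 0 and 2 are not neighbors.
neighbor_sets = [[0, 1], [0, 1, 2], [1, 2]]
P = pilot_length(neighbor_sets, [2, 3, 1])
```

The largest neighbor domain in terms of total beams, here that of the middle user, determines the pilot length.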

To mitigate the interference from the non-neighbors of user k, the pre-beamforming matrix \({\textbf {B}}\) needs to be designed to satisfy Eq. (3). Since the neighbor relation is symmetric, i.e., if user i is a neighbor of user k, then user k is also a neighbor of user i, we can regard Eq. (3) as the problem of designing matrix \({\textbf {B}}_{k}\) to satisfy

$$\begin{aligned} {\textbf {h}}_{i}^{H }{} {\textbf {B}}_{k}={\textbf {0}} ,i\in \bar{\Omega }_{k}, \end{aligned}$$

for each k. According to Karhunen–Loeve representation, we can express the channel vector of user k as \({\textbf {h}}_{k} = {\textbf {C}}_{k}^{\text {1/2}}{} {\textbf {z}}_{k}\), where \({\textbf {C}}_{k}^{\text {1/2}}\) is a Hermitian matrix. Substituting this into Eq. (23), we obtain the equivalent form

$$\begin{aligned} {\textbf {z}}_{i}^{H }{} {\textbf {C}}_{i}^{\text {1/2}}{} {\textbf {B}}_{k }={\textbf {0}} ,i\in \bar{\Omega }_{k}. \end{aligned}$$

During the pre-beamforming stage, only the CCMs \({\textbf {C}}_{k}\) are available at the BS. Without the knowledge of \({\textbf {z}}_{i}\), Eq. (24) can be reformulated as

$$\begin{aligned} {\textbf {C}}_{i}^{\text {1/2}}{} {\textbf {B}}_{k}={\textbf {0}} ,i\in \bar{\Omega }_{k}. \end{aligned}$$

This implies \(span ({\textbf {B}}_{k})\subseteq span ^{\bot }({\textbf {C}}_{i}^{\text {1/2}})\) for each \(i\in \bar{\Omega }_{k}\). Based on \(Lemma~1\) in [21], we have \(span ({\textbf {B}}_{k})\subseteq span ^{\bot }(\bar{{\textbf {C}}}_{k})\), where \(\bar{{\textbf {C}}}_{k}=\sum _{i\in \bar{\Omega }_{k}}{} {\textbf {C}}_{i}\).

To fully utilize the signal space and achieve a large spectral efficiency, we design \(span ({\textbf {B}})\) to be close to \(span ({\textbf {C}})\). This is because the spectral efficiency of the system is maximized when \({\textbf {B}}\) is designed to satisfy \(\mathcal {S}_{{\textbf {C}}}\cap span ({\textbf {H}})\subseteq span ({\textbf {B}})\), where \(\mathcal {S}_{{\textbf {C}}}=\bigcup span ^{\bot }(\bar{{\textbf {C}}}_{k })\), and \(span ({\textbf {H}})\subseteq \bigcup span ({\textbf {C}}_{k })\subseteq \mathcal {S}_{{\textbf {C}}}\) [21]. The difference between two subspaces is measured by the chordal distance [25], and the chordal distance between \(span ({\textbf {C}})\) and \(span ({\textbf {B}})\) is

$$\begin{aligned} D _{C }(span ({\textbf {B}}),span ({\textbf {C}})) =\parallel {\textbf {U}}_{{\textbf {B}}}{} {\textbf {U}}_{{\textbf {B}}}^{H }- {\textbf {U}}_{{\textbf {C}}}{} {\textbf {U}}_{{\textbf {C}}}^{H }\parallel _{F }^{\text {2}}, \end{aligned}$$

where \(\Vert \cdot \Vert _{F }\) is the Frobenius norm, and \({\textbf {U}}_{{\textbf {C}}}\) and \({\textbf {U}}_{{\textbf {B}}}\) are orthonormal bases of the spaces \(span ({\textbf {C}})\) and \(span ({\textbf {B}})\), respectively. To make \(span ({\textbf {B}})\) approach \(span ({\textbf {C}})\), the chordal distance between \(span ({\textbf {C}})\) and \(span ({\textbf {B}})\) should be minimized, and the problem of designing \({\textbf {B}}\) is formulated as

$$\begin{aligned}&\mathcal {P}_{\textrm{6}}: \min _{{\textbf {B}}}{} D _{C }(span ({\textbf {B}}),span ({\textbf {C}})) \end{aligned}$$
$$\begin{aligned}&s.t. ~~ span ({\textbf {B}}_{k })\subseteq span ^{\bot }(\bar{{\textbf {C}}}_{k }), k =\text {1},\text {2},\ldots ,K \end{aligned}$$
$$\begin{aligned}&~~~~~~ col({\textbf {B}}_{\Omega _{k }})\le P _{C } \end{aligned}$$
$$\begin{aligned}&~~~~~~ {\textbf {B}}^{H }{} {\textbf {B}}={\textbf {I}}, \end{aligned}$$

where constraint (27a) ensures that the effective channel matrix is a band matrix, with \(\bar{{\textbf {C}}}_{k }=\sum _{i \in \bar{\Omega }_{k }}{} {\textbf {C}}_{i }\), and constraint (27b) ensures that the design of matrix \({\textbf {B}}\) meets the pilot requirements. \(\mathcal {P}_{\textrm{6}}\) is solved iteratively using a greedy algorithm. First, the space \(span ({\textbf {C}})\) is divided into \(K\) subspaces, i.e., \(\mathcal {S}_{k } = span ({\textbf {C}}_{k }),k =\text {1},\text {2},\ldots ,K\). Then, the pre-beamforming matrices \({\textbf {B}}_{k }\) are solved iteratively such that the chordal distance between \(\mathcal {S}_{k }\) and \(\bigcup _{j =\text {1}}^{k }{} {\textbf {B}}_{j }\) is minimized. When the iteration is complete, \(D _{C }(span ({\textbf {B}}),span ({\textbf {C}}))\) will be small. In the \(k\)-th iteration, the problem of designing \({\textbf {B}}_{k }\) is as follows

$$\begin{aligned}&\mathcal {P}_{\textrm{7}}: \min _{{\textbf {B}}_{k }}{} D _{C }(span ({\textbf {G}}_{k }),\mathcal {S}_{k }) \end{aligned}$$
$$\begin{aligned}&s.t. ~~ span ({\textbf {B}}_{k })\subseteq span ^{\bot }(\bar{{\textbf {C}}}_{k }) \end{aligned}$$
$$\begin{aligned}&~~~~~~ col({\textbf {B}}_{k })=g _{k } \end{aligned}$$
$$\begin{aligned}&~~~~~~ {\textbf {G}}_{k }^{H }{} {\textbf {G}}_{k }={\textbf {I}}, \end{aligned}$$

where \({\textbf {G}}_{k }=[{\textbf {B}}_{\text {1}},{\textbf {B}}_{\text {2}},\ldots ,{\textbf {B}}_{k }]\). Setting the number of columns of the matrix \({\textbf {B}}_{k }\) to the obtained value \(g _{k }\) during the user scheduling stage ensures that the actual pilots of the system are less than or equal to \(P _{C }\). This is because of the constraint (12b) of the user scheduling problem \(\mathcal {P}_{\textrm{1}}\).

Let \({\textbf {U}}_{\mathcal {S}_{k}}\) be the orthogonal basis of \(\mathcal {S}_{k}\). Since \({\textbf {G}}_{k}\) is the orthogonal basis of \(span ({\textbf {G}}_{k})\), we have

$$\begin{aligned} \begin{aligned} D _{C }(span ({\textbf {G}}_{k }),\mathcal {S}_{k })&=\parallel {\textbf {G}}_{k}{} {\textbf {G}}_{k}^{H }- {\textbf {U}}_{\mathcal {S}_{k}}{} {\textbf {U}}_{\mathcal {S}_{k}}^{H }\parallel _{F }^{\text {2}}\\&=\parallel {\textbf {G}}_{k-1}{} {\textbf {G}}_{k-1}^{H }+{\textbf {B}}_{k}{} {\textbf {B}}_{k}^{H }- {\textbf {U}}_{\mathcal {S}_{k}}{} {\textbf {U}}_{\mathcal {S}_{k}}^{H }\parallel _{F }^{\text {2}}. \end{aligned} \end{aligned}$$

Since the Frobenius norm is non-negative, minimizing the chordal distance is equivalent to minimizing \(\parallel {\textbf {G}}_{k-1}{} {\textbf {G}}_{k-1}^{H }+{\textbf {B}}_{k}{} {\textbf {B}}_{k}^{H }- {\textbf {U}}_{\mathcal {S}_{k}}{} {\textbf {U}}_{\mathcal {S}_{k}}^{H }\parallel _{F }\). Denoting \({\textbf {T}}={\textbf {G}}_{k-1}{} {\textbf {G}}_{k-1}^{H }-{\textbf {U}}_{\mathcal {S}_{k}}{} {\textbf {U}}_{\mathcal {S}_{k}}^{H }\), which is Hermitian, and using \({\textbf {B}}_{k}^{H }{} {\textbf {B}}_{k}={\textbf {I}}\), we have

$$\begin{aligned} \begin{aligned} \parallel {\textbf {B}}_{k}{} {\textbf {B}}_{k}^{H }+{\textbf {T}}\parallel _{F }^{\text {2}}&=Tr \left( ({\textbf {B}}_{k}{} {\textbf {B}}_{k}^{H}+{\textbf {T}})({\textbf {B}}_{k} {\textbf {B}}_{k}^{H }+{\textbf {T}})^{H }\right) \\&=Tr \left( {\textbf {B}}_{k}{} {\textbf {B}}_{k}^{H }\right) +2Tr \left( {\textbf {B}}_{k}{} {\textbf {B}}_{k}^{H }{} {\textbf {T}}\right) +Tr \left( {\textbf {T}}{} {\textbf {T}}^{H }\right) . \end{aligned} \end{aligned}$$
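One way to see why an eigenvector construction arises is to simplify the trace expansion above under the constraints of \(\mathcal {P}_{\textrm{7}}\). The following is a sketch rather than the derivation of [21]. It assumes \({\textbf {B}}_{k}^{H }{} {\textbf {B}}_{k}={\textbf {I}}\) and \({\textbf {G}}_{k-1}^{H }{} {\textbf {B}}_{k}={\textbf {0}}\), both of which follow from \({\textbf {G}}_{k }^{H }{} {\textbf {G}}_{k }={\textbf {I}}\):

$$\begin{aligned} \begin{aligned} Tr \left( {\textbf {B}}_{k}{} {\textbf {B}}_{k}^{H }\right)&=Tr \left( {\textbf {B}}_{k}^{H }{} {\textbf {B}}_{k}\right) =g _{k },\\ Tr \left( {\textbf {B}}_{k}{} {\textbf {B}}_{k}^{H }{} {\textbf {T}}\right)&=Tr \left( {\textbf {B}}_{k}^{H }{} {\textbf {G}}_{k-1}{} {\textbf {G}}_{k-1}^{H }{} {\textbf {B}}_{k}\right) -Tr \left( {\textbf {B}}_{k}^{H }{} {\textbf {U}}_{\mathcal {S}_{k}}{} {\textbf {U}}_{\mathcal {S}_{k}}^{H }{} {\textbf {B}}_{k}\right) =-\parallel {\textbf {U}}_{\mathcal {S}_{k}}^{H }{} {\textbf {B}}_{k}\parallel _{F }^{\text {2}}. \end{aligned} \end{aligned}$$

Under these assumptions, the first term is fixed by the column count \(g _{k }\) and \(Tr ({\textbf {T}}{} {\textbf {T}}^{H })\) does not depend on \({\textbf {B}}_{k}\), so minimizing the distance amounts to maximizing \(Tr ({\textbf {B}}_{k}^{H }{} {\textbf {U}}_{\mathcal {S}_{k}}{} {\textbf {U}}_{\mathcal {S}_{k}}^{H }{} {\textbf {B}}_{k})\) over the feasible \({\textbf {B}}_{k}\), which is an eigenvalue problem on the feasible subspace.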

Based on the properties of the trace, we can derive \({\textbf {B}}_{k}=\bar{{\textbf {B}}}_{\Psi _{k }}{} {\textbf {U}}_{\varepsilon }{} {\textbf {N}}\), where \(\bar{{\textbf {B}}}_{\Psi _{k }}\) is the orthogonal basis of the space \(span ^{\perp }({\textbf {B}}_{\Psi _{k }})\), and \({\textbf {U}}_{\varepsilon }\) is the matrix composed of the eigenvectors of the matrix \(\bar{{\textbf {B}}}_{\Psi _{k }}\bar{{\textbf {R}}}_{k }\) whose eigenvalues are less than \(\varepsilon\). For detailed derivations, please refer to [21]. Unlike in the design method with constrained DTL, \({\textbf {N}}\) is a unitary matrix composed of the eigenvectors of \({\textbf {U}}_{\varepsilon }^{H }\bar{{\textbf {B}}}_{\Psi _{k }}^{H }({\textbf {U}}_{\mathcal {S}_{k }}{} {\textbf {U}}_{\mathcal {S}_{k }}^{H })\bar{{\textbf {B}}}_{\Psi _{k }}{} {\textbf {U}}_{\varepsilon }\) corresponding to the \(g _{k}\) largest eigenvalues. Once \({\textbf {N}}\) is obtained, \({\textbf {B}}_{k}\) follows from \({\textbf {B}}_{k}=\bar{{\textbf {B}}}_{\Psi _{k }}{} {\textbf {U}}_{\varepsilon }{} {\textbf {N}}\).

Figure 4 illustrates the chordal distance of different iterations. It should be noted that, given the dimensions of the two spaces (e.g., \(N_{1}\) and \(N_{2}\)), a chordal distance of 0 indicates that the spaces are the same. When the chordal distance reaches its maximum value of \(N_{1}+N_{2}\), the spaces are orthogonal to each other. From Fig. 4, we can observe that in each iteration, the chordal distance between \({\textbf {B}}_{k}\) and \({\textbf {C}}_{k}\) remains small but nonzero. This is because \({\textbf {B}}_{k}\) is designed not only to approximate \({\textbf {C}}_{k}\), but also to lie in the null space of the CCM \(\bar{{\textbf {C}}}_{k}\). The chordal distance between \(span ({\textbf {B}})\) and \(span ({\textbf {C}})\) gradually increases with the number of iterations. This can be attributed to the increasing dimension of \(span ({\textbf {B}})\) over the course of iterations. Therefore, even though \(span ({\textbf {B}})\) is designed to approach \(span ({\textbf {C}})\) incrementally, their chordal distance still increases.
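The two extreme values of the chordal distance mentioned above can be checked numerically. The sketch below builds orthonormal bases with a singular value decomposition, which is one possible choice and not necessarily the one used in the paper, and evaluates the squared Frobenius norm of the difference of the two projectors. The function names and test matrices are hypothetical.

```python
import numpy as np


def orthonormal_basis(X, tol=1e-10):
    """Orthonormal basis of the column space of a nonzero matrix X."""
    U, sv, _ = np.linalg.svd(np.asarray(X, dtype=complex), full_matrices=False)
    return U[:, sv > tol * sv.max()]


def chordal_distance(B, C):
    """|| U_B U_B^H - U_C U_C^H ||_F^2 between span(B) and span(C)."""
    U_B = orthonormal_basis(B)
    U_C = orthonormal_basis(C)
    diff = U_B @ U_B.conj().T - U_C @ U_C.conj().T
    return float(np.linalg.norm(diff, "fro") ** 2)


I4 = np.eye(4)
# Same subspace (columns only rescaled): the distance is 0.
d_same = chordal_distance(I4[:, :2], 3.0 * I4[:, :2])
# Orthogonal subspaces of dimensions 1 and 2: the distance is 1 + 2 = 3.
d_orth = chordal_distance(I4[:, :1], I4[:, 1:3])
```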

Fig. 4 The chordal distance of different iterations. \(K=20\), \(U=10\), \(\Delta =5^{\circ }\), \(\text {NAS}=10^{\circ }\), \(\text {SNR}=20~\text {dB}\), \(P_{C}=20\), \(\xi =4\)

6 Simulation results

In this section, we provide the simulation results of the proposed algorithm. A ULA with \(M =\text {64}\) antennas at the BS is considered, and \(K =\text {36}\) single-antenna users are served. The azimuth center angle of each user is uniformly distributed in \([-\frac{\pi }{\textrm{3}},\frac{\pi }{\textrm{3}}]\) and the angular spread \(\Delta\) is \(\text {5}^{\circ }\). For JSDM, the users are partitioned into \(G\) groups, and the number of \(G\) is proportional to the number of users, i.e., \(G =\lfloor K /\text {6} \rfloor\). The value \(\tau\) in Sect. 4 is set to \(\text {1}\). The parameters in the design of the pre-beamforming matrix \({\textbf {B}}\) are consistent with those in [21].

The user scheduling in conventional JSDM consists of two parts: user grouping and intra-group user scheduling. The user grouping stage utilizes the K-means algorithm with chordal distance [25] and the agglomerative user clustering algorithm [28] for grouping users (where the effective channel dimension in the \(g\)-th group is \(\lfloor M /G \rfloor\)). The user scheduling stage employs the algorithm from [26] for scheduling. Given the user grouping, the ESE for scheduled user \(k\) in group \(g\) is \(\eta _{g,k } = (\text {1}-\frac{DTL _{g,k }}{T _{C }}) \log _{\text {2}}(\text {1}+SINR _{g,k })\), where \(SINR _{g,k }\) denotes the SINR of user \(k\) \((k =\text {1},\text {2},\ldots ,K _{g })\) in group \(g\). The overall ESE is then \(R _{con } = \sum _{g =\text {1}}^{G }\sum _{k \in \kappa _{g }}\eta _{g,k }\).
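The ESE expressions above can be evaluated directly. The sketch below is only an illustration: it assumes that the SINR is supplied as a linear ratio (not in dB) and that each group is given as a list of hypothetical `(dtl, sinr)` pairs for its scheduled users. The function names are not taken from the paper.

```python
import math

def user_ese(dtl, t_c, sinr):
    """ESE of one scheduled user: (1 - DTL / T_C) * log2(1 + SINR)."""
    return (1 - dtl / t_c) * math.log2(1 + sinr)

def overall_ese(groups, t_c):
    """Sum of the per-user ESEs over all groups.

    groups: list of groups, each a list of (dtl, sinr) pairs
            for the scheduled users of that group.
    """
    return sum(user_ese(dtl, t_c, sinr) for group in groups for dtl, sinr in group)
```

For instance, with \(T _{C }=100\), a user with a DTL of 10 and a linear SINR of 3 contributes \(0.9 \times \log _{2}(4) = 1.8\).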

Fig. 5 Comparison of the ESEs under different \(\text {SNR}\)s. \(K=36\), \(\Delta =5^{\circ }\), \(\text {NAS}=10^{\circ }\), \(P_{C}=20\), \(\xi =4\)

The expected number of active users to be scheduled each time is \(U\). Figure 5 illustrates the ESEs of all algorithms under different SNRs. Our proposed algorithm (denoted by N-JSDM Mixed integer) is compared with two benchmark algorithms of conventional JSDM user scheduling (denoted by JSDM Agglomerative & Greedy and JSDM K-means & Greedy, respectively), the active channel sparsification method (denoted by ACS), and N-JSDM with random user scheduling (denoted by N-JSDM Random). The number of scheduled users is \(\text {30}\) in Fig. 5a, b and \(\text {24}\) in Fig. 5c; \(T _{C }\) is \(\text {100}\) in Fig. 5a, c and \(\text {50}\) in Fig. 5b. All algorithms exhibit increasing ESE with higher SNR values. It can be seen that our proposed algorithm achieves higher ESE than the other algorithms. The performance difference between JSDM Agglomerative & Greedy and JSDM K-means & Greedy stems from their user grouping schemes: the agglomerative user clustering method does not depend on the initial choices of the cluster centers [28]. The ACS consistently exhibits lower ESE than the other algorithms. This is because, unlike the other algorithms, the ACS method approximates the downlink CCM of the users using the columns of the discrete Fourier transform matrix. This approximation enlarges the energy of both the received signal and the interference, and the inter-user interference is directly proportional to the transmission power. When the transmission power is low, noise dominates over inter-user interference, and due to the large received signal power, the ACS method achieves a large SINR, resulting in a high ESE. However, as the transmission power increases, inter-user interference also increases, leading to no further improvement in ESE with increasing SNR.

All algorithms except for ACS exhibit similar performance at low SNR. This similarity arises from the fact that, with a small NAS, the impact of DTL on ESE is not significant, and ESE is primarily determined by spectral efficiency. As the SNR increases, our algorithm achieves higher ESE by minimizing interference and considering spectral overhead. From Fig. 5, it can be observed that the performance gap between N-JSDM and JSDM widens as the number of users increases. This widening gap is attributed to the larger loss of signal space caused by JSDM grouping when the number of users is high, whereas N-JSDM, which uses the neighbor strategy, can fully exploit the signal space, resulting in more significant advantages. Since the ACS method is primarily designed for scenarios where the number of antennas tends to infinity, a detailed analysis of ACS performance is not included in the following simulations.

Fig. 6 Comparison of the ESEs under different \(T_{C}\)s. \(K=36\), \(\Delta =5^{\circ }\), \(\text {NAS}=10^{\circ }\), \(\text {SNR}=20~\text {dB}\), \(P_{C}=20\), \(\xi =4\)

Figure 6 shows the ESEs of all algorithms under different \(T _{C }\)s. The weight of spectral overhead in ESE varies with \(T _{C }\). When \(T _{C }\) is small, the influence of spectral overhead is significant, because the overhead enters the ESE through the ratio \(DTL /T _{C }\), with the DTL in the numerator. As \(T _{C }\) gradually increases, the influence of spectral overhead diminishes, and the significance of spectral efficiency becomes more pronounced. Consequently, the ESE exhibits a gradual upward trend.
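The trend can be made explicit from the pre-log factor of the ESE expression in this section. Treating the DTL as fixed with respect to \(T _{C }\) (an assumption made here only for illustration), the factor is increasing in \(T _{C }\) and saturates at one:

$$\begin{aligned} \frac{\partial }{\partial T _{C }}\left( \text {1}-\frac{DTL _{g,k }}{T _{C }}\right) =\frac{DTL _{g,k }}{T _{C }^{\text {2}}}>\text {0},\qquad \lim _{T _{C }\rightarrow \infty }\left( \text {1}-\frac{DTL _{g,k }}{T _{C }}\right) =\text {1}. \end{aligned}$$

As a hypothetical numerical example, a DTL of 10 gives a pre-log factor of 0.8 at \(T _{C }=50\) and 0.9 at \(T _{C }=100\), so the marginal gain shrinks as \(T _{C }\) grows.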

Fig. 7 Comparison of the ESEs under different number of scheduled users. \(K=36\), \(\Delta =5^{\circ }\), \(\text {NAS}=10^{\circ }\), \(\text {SNR}=20~\text {dB}\), \(T_{C}=100\), \(P_{C}=20\), \(\xi =4\)

Figure 7 depicts the ESEs of all algorithms under different numbers of scheduled users. Several noteworthy observations can be made. First, the ESEs of the N-JSDM algorithms increase as the number of scheduled users grows, albeit at a gradually slowing rate. This is because increasing the number of users can enhance spectral efficiency, but it also increases the interference between users. Second, when the number of scheduled users reaches a certain threshold, the performance of conventional JSDM schemes with user scheduling begins to decline. This indicates that as the number of users in the system becomes larger, the performance degradation caused by JSDM grouping becomes more pronounced. Third, due to the approximation used for the CCM, the performance of the ACS method is consistently lower than that of the other algorithms.

We have also conducted additional simulations to evaluate the performance of our proposed algorithm under extreme conditions. These simulations aim to assess the algorithm’s robustness and its behavior in challenging scenarios, including extremely low SNR and non-uniform user distributions. The performance of N-JSDM Random is not shown in the following, as random user scheduling is expected to perform worse than our proposed algorithm. Our focus is on the performance differences between the proposed algorithm and the other benchmark algorithms.

Fig. 8 Comparison of the ESEs under various extremely low \(\text {SNR}\)s. \(K=36\), \(U=30\), \(\Delta =5^{\circ }\), \(\text {NAS}=10^{\circ }\), \(T_{C}=100\), \(P_{C}=20\), \(\xi =4\)

Figure 8 presents the ESEs of all algorithms under different extremely low SNR conditions. It is evident from Fig. 8 that our proposed algorithm consistently achieves higher ESE compared to other algorithms. This superiority stems from our algorithm’s scheduling objective of minimizing system interference, which enables better reduction of inter-user interference in low SNR scenarios. The performance of JSDM Agglomerative & Greedy and JSDM K-means & Greedy is similar, as they both utilize the same scheduling criterion, namely maximizing SINR. The slight performance differences arise from their distinct user grouping methods. On the other hand, the ACS method exhibits the poorest performance due to the approximation employed for the CCM.

Fig. 9 Comparison of the ESEs under various user distributions. \(K=36\), \(U=30\), \(\Delta =5^{\circ }\), \(\text {NAS}=10^{\circ }\), \(T_{C}=100\), \(P_{C}=20\), \(\xi =4\)

Figure 9 illustrates the ESEs of all algorithms under different user distributions, with a standard deviation of 20 for the normal distribution. Compared with Fig. 5a, the performance of all algorithms declines significantly. This decrease can be attributed to the extreme user distribution, which leads to densely populated local user clusters and makes it challenging to achieve the desired number of scheduled users. Furthermore, the interference among the scheduled users is substantial, which further degrades the performance. To enhance visual clarity, we have omitted the curve for JSDM K-means & Greedy, which exhibits marginally lower performance than JSDM Agglomerative & Greedy.

7 Conclusion

We proposed a user scheduling method for massive MIMO systems that uses channel directional characteristics, together with a dynamic beam allocation method matched to the proposed user scheduling. Compared with complete CSI-based schemes, the two directional features used in this paper, i.e., the azimuth angle and the AS, are generally stable over large time scales. The proposed method schedules users using mixed integer programming, aiming to improve system performance. Simulations validated the superiority of the proposed method. In our future work, we will extend our method to more channel models, such as the Saleh-Valenzuela geometric model and the multiple scatterer clusters model.

Availability of data and materials

The datasets simulated and/or analyzed during the current study are available from the corresponding author on reasonable request.


  1. We assume that the information about the CCM is known and accurate. This assumption is reasonable because there have been research studies on CCM estimation, as detailed in [30, 31]. These works leverage the angular reciprocity between the uplink and downlink channels in FDD systems to improve channel estimation.

Abbreviations

MIMO: Multiple-input multiple-output
BS: Base station
CSI: Channel state information
TDD: Time division duplex
FDD: Frequency division duplex
DTL: Downlink training length
JSDM: Joint spatial division multiplexing
CCM: Channel covariance matrix
SINR: Signal-to-interference-plus-noise ratio
SLNR: Signal-to-leakage-plus-noise ratio
N-JSDM: Neighbor-based JSDM
CQI: Channel quality indicator
ESE: Effective spectral efficiency
ULA: Uniform linear array
AS: Angular spread
NAS: Neighbor angular spread
IUI: Inter-user interference
OA: Overlap angle


References

  1. I.F. Akyildiz, J.M. Jornet, Realizing ultra-massive MIMO (1024 \(\times\) 1024) communication in the (0.06–10) Terahertz band. Nano Commun. Netw. 8, 46–54 (2016).

  2. R. Hussain, M.S. Sharawi, 5G MIMO antenna designs for base station and user equipment: some recent developments and trends. IEEE Antennas Propag. Mag. 64(3), 95–107 (2022).

  3. T.L. Marzetta, Noncooperative cellular wireless with unlimited numbers of base station antennas. IEEE Trans. Wirel. Commun. 9(11), 3590–3600 (2010).

  4. Z. Gao, L. Dai, W. Dai, Z. Wang, Block compressive channel estimation and feedback for FDD massive MIMO. in 2015 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), 49–50 (2015).

  5. B. Lee et al., Antenna grouping based feedback compression for FDD-based massive MIMO systems. IEEE Trans. Commun. 63(9), 3261–3274 (2015).

  6. D. Fan, F. Gao, G. Wang, Z. Zhong, A. Nallanathan, Angle domain signal processing-aided channel estimation for indoor 60-GHz TDD/FDD massive MIMO systems. IEEE J. Sel. Areas Commun. 35(9), 1948–1961 (2017).

  7. X. Yang et al., Design and implementation of a TDD-based 128-antenna massive MIMO prototype system. China Commun. 14(12), 162–187 (2017).

  8. X. Jiang, F. Kaltenberger, L. Deneire, How accurately should we calibrate a Massive MIMO TDD system? in 2016 IEEE International Conference on Communications Workshops (ICC), 706–711 (2016).

  9. D. Liu, W. Ma, S. Shao, Y. Shen, Y. Tang, Performance analysis of TDD reciprocity calibration for massive MU-MIMO systems with ZF beamforming. IEEE Commun. Lett. 20(1), 113–116 (2016).

  10. Global Market Monitor: China TDD and FDD Spectrum Industry Market Research Report 2023–2029 (2023).

  11. A. Abdallah, M.M. Mansour, Efficient angle-domain processing for FDD-based cell-free massive MIMO systems. IEEE Trans. Commun. 68(4), 2188–2203 (2020).

  12. A. Adhikary, J. Nam, J.-Y. Ahn, G. Caire, Joint spatial division and multiplexing-the large-scale array regime. IEEE Trans. Inf. Theory 59(10), 6441–6463 (2013).

  13. H.-W. Liang, W.-H. Chung, S.-Y. Kuo, FDD-RT: a simple CSI acquisition technique via channel reciprocity for FDD massive MIMO downlink. IEEE Syst. J. 12(1), 714–724 (2018).

  14. Z. Gao, L. Dai, Z. Wang, S. Chen, Spatially common sparsity based adaptive channel estimation and feedback for FDD massive MIMO. IEEE Trans. Signal Process. 63(23), 6169–6183 (2015).

  15. Y. Ding, B.D. Rao, Dictionary learning-based sparse channel representation and estimation for FDD massive MIMO systems. IEEE Trans. Wirel. Commun. 17(8), 5437–5451 (2018).

  16. S. Noh, M.D. Zoltowski, Y. Sung, D.J. Love, Pilot beam pattern design for channel estimation in massive MIMO systems. IEEE J. Sel. Top. Signal Process. 8(5), 787–801 (2014).

  17. H. Ren et al., Long-term CSI-based design for RIS-aided multiuser MISO systems exploiting deep reinforcement learning. IEEE Commun. Lett. 26(3), 567–571 (2022).

  18. Z. Peng et al., Analysis and optimization for RIS-aided multi-pair communications relying on statistical CSI. IEEE Trans. Veh. Technol. 70(4), 3897–3901 (2021).

  19. D. Kim, G. Lee, Y. Sung, Two-stage beamformer design for massive MIMO downlink by trace quotient formulation. IEEE Trans. Commun. 63(6), 2200–2211 (2015).

  20. Y. Jeon et al., New beamforming designs for joint spatial division and multiplexing in large-scale MISO multi-user systems. IEEE Trans. Wirel. Commun. 16(5), 3029–3041 (2017).

  21. Y. Song et al., Joint spatial division and multiplexing in massive MIMO: a neighbor-based approach. IEEE Trans. Wirel. Commun. 19(11), 7392–7406 (2020).

  22. M.B. Khalilsarai, S. Haghighatshoar, X. Yi, G. Caire, FDD massive MIMO via UL/DL channel covariance extrapolation and active channel sparsification. IEEE Trans. Wirel. Commun. 18(1), 121–135 (2019).

  23. W. Tang, Y. Teng, Y. Man, M. Song, Analysis of two-stage precoding schemes for massive multi-user MIMO downlink systems. in 2016 IEEE International Conference on Communication Systems (ICCS), 1–6 (2016).

  24. J. Ma, S. Zhang, H. Li, N. Zhao, V.C.M. Leung, Base station selection for massive MIMO networks with two-stage precoding. IEEE Wirel. Commun. Lett. 6(5), 598–601 (2017).

  25. J. Nam, A. Adhikary, J.-Y. Ahn, G. Caire, Joint spatial division and multiplexing: opportunistic beamforming, user grouping and simplified downlink scheduling. IEEE J. Sel. Top. Signal Process. 8(5), 876–890 (2014).

  26. Y. Xu, G. Yue, N. Prasad, S. Rangarajan, S. Mao, User grouping and scheduling for large scale MIMO systems with two-stage precoding. in 2014 IEEE International Conference on Communications (ICC), 5197–5202 (2014).

  27. J. Nam, Y.-J. Ko, J. Ha, User grouping of two-stage MU-MIMO precoding for clustered user geometry. IEEE Commun. Lett. 19(8), 1458–1461 (2015).

  28. X. Sun, X. Gao, G.Y. Li, W. Han, Agglomerative user clustering and downlink group scheduling for FDD massive MIMO systems. in 2017 IEEE International Conference on Communications (ICC), 1–6 (2017).

  29. Z. Jiang, A.F. Molisch, G. Caire, Z. Niu, Achievable rates of FDD massive MIMO systems with spatial channel correlation. IEEE Trans. Wirel. Commun. 14(5), 2868–2882 (2015).

  30. H. Xie, F. Gao, S. Zhang, S. Jin, A unified transmission strategy for TDD/FDD massive MIMO systems with spatial basis expansion model. IEEE Trans. Veh. Technol. 66(4), 3170–3184 (2017).

  31. L. Miretti, R.L.G. Cavalcante, S. Stanczak, FDD massive MIMO channel spatial covariance conversion using projection methods. in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 3609–3613 (2018).

  32. X. Li, S. Jin, H.A. Suraweera, J. Hou, X. Gao, Statistical 3-D beamforming for large-scale MIMO downlink systems over Rician fading channels. IEEE Trans. Commun. 64(4), 1529–1543 (2016).

  33. Y. Huang, L. Yang, M. Bengtsson, B. Ottersten, Exploiting long-term channel correlation in limited feedback SDMA through channel phase codebook. IEEE Trans. Signal Process. 59(3), 1217–1228 (2011).

  34. Y. Song et al., Domain selective precoding in 3-D massive MIMO systems. IEEE J. Sel. Top. Signal Process. 13(5), 1103–1118 (2019).

  35. N. Omaki, K. Kitao, K. Saito, T. Imai, Y. Okumura, Experimental study on elevation directional channel properties to evaluate performance of 3D-MIMO at base station in microcell outdoor to indoor environment. in 2014 IEEE International Workshop on Electromagnetics (iWEM), 219–220 (2014).

  36. S.P. Boyd, L. Vandenberghe, Convex Optimization (Cambridge University Press, UK, 2004).

  37. E.D. Andersen, K.D. Andersen, The MOSEK Interior Point Optimizer for Linear Programming: An Implementation of the Homogeneous Algorithm. Springer, Boston, 197–232 (2000).


The authors would like to thank the anonymous reviewers for their valuable comments and suggestions that helped improve the quality of this manuscript.


This work was supported in part by the Natural Science Foundation of China under Grants 61771257, 62101282, and 62371249.

Author information

HL conducted research, conceptualized, simulated and modified experiments, as well as wrote the manuscript; CL and YS helped to conceive the idea and revise the manuscript; TG suggested improvements and revised the manuscript; YZ assisted with data handling and manuscript revisions. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Chen Liu.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.



A.1 Proof of Theorem 1


\(necessity:\) Let \({\textbf {x}}^{*}\) be the optimal solution of \(\mathcal {P}_{\textrm{2}}\). Since the elements of matrix \({\textbf {A}}_{P }\) are non-negative and \(\forall x _{i }\in \{\text {0},\text {1}\}\), we have \({\textbf {A}}_{P }{} {\textbf {x}}\ge {\textbf {0}}\). Thus for \({\textbf {x}}\) satisfying \({\textbf {e}}^{T }{} {\textbf {x}}=U\) and \({\textbf {n}}_{B }\odot (\hat{{\textbf {A}}}_{f }{} {\textbf {x}})\le P _{C }\cdot {\textbf {e}}\), there must exist \({\textbf {z}}\ge {\textbf {0}},{\textbf {s}}\ge {\textbf {0}},{\textbf {z}},{\textbf {s}}\in \mathbb {R}^{K \times \text {1}}\) such that

$$\begin{aligned}&{\textbf {A}}_{P }{} {\textbf {x}}-{\textbf {z}}-{\textbf {s}}={\textbf {0}} \end{aligned}$$
$$\begin{aligned}&~~~~~{\textbf {z}}^{T }{} {\textbf {x}}=\text {0}. \end{aligned}$$

For \({\textbf {x}}={\textbf {x}}^{*}\), let \({\textbf {z}}^{*}\) and \({\textbf {s}}^{*}\) be the vectors satisfying (17a) and (17b) for which \({\textbf {e}}^{T }{} {\textbf {s}}^{*}\) is the smallest among all \({\textbf {e}}^{T }{} {\textbf {s}}\).

In the following, we prove that \(({\textbf {x}}^{*},{\textbf {z}}^{*},{\textbf {s}}^{*})\) is the optimal solution of \(\mathcal {P}_{\textrm{3}}\). Substituting \({\textbf {x}}^{*}\) for \({\textbf {x}}\) in (17a) and left-multiplying both sides by \({\textbf {x}}^{*T }\), we have

$$\begin{aligned} {\textbf {x}}^{*T }{} {\textbf {A}}_{P }{} {\textbf {x}}^{*}-{\textbf {x}}^ {*T }{} {\textbf {z}}^{*}-{\textbf {x}}^{*T }{} {\textbf {s}}^{*}=\text {0}. \end{aligned}\tag{31}$$

Due to the constraint \({\textbf {z}}^{*T }{} {\textbf {x}}^{*}= \text {0}\), Eq. (31) is equivalent to

$$\begin{aligned} {\textbf {x}}^{*T }{} {\textbf {A}}_ {P }{} {\textbf {x}}^{*}={\textbf {x}}^{*T }{} {\textbf {s}}^{*}. \end{aligned}\tag{32}$$

Since \({\textbf {x}}^{*T }{} {\textbf {A}}_{P }{} {\textbf {x}}^{*}\) is the smallest among all \({\textbf {x}}^{T }{} {\textbf {A}}_{P }{} {\textbf {x}}\), \({\textbf {x}}^{*T }{} {\textbf {s}}^{*}\) is also the smallest among all \({\textbf {x}}^{T }{} {\textbf {s}}\). If it can be proved that

$$\begin{aligned} {\textbf {x}}^{*T }{} {\textbf {s}}^{*} = {\textbf {e}}^{T }{} {\textbf {s}}^{*}, \end{aligned}\tag{33}$$

then, since \(({\textbf {x}}^{*},{\textbf {z}}^{*},{\textbf {s}}^{*})\) satisfies the constraints of \(\mathcal {P}_{\textrm{3}}\), \(({\textbf {x}}^{*},{\textbf {z}}^{*},{\textbf {s}}^{*})\) is the optimal solution of \(\mathcal {P}_{\textrm{3}}\), and \(\mathcal {P}_{\textrm{2}}\) and \(\mathcal {P}_{\textrm{3}}\) have the same optimal value.

Equation (33) is proved in the following. First, we show that \(s _{i }^{*}=\text {0}\) must hold for \(\forall i\) satisfying \(x _{i }^{*}=\text {0}\). Suppose this does not hold. Then there exists some \(i _{s }\) such that \(x _{i _{s }}^{*}=\text {0}\) and \(s _{i _{s }}^{*}>\text {0}\), while \({\textbf {e}}^{T }{} {\textbf {s}}^{*}\) is the smallest of all \({\textbf {e}}^{T }{} {\textbf {s}}\). Define new \(\tilde{{\textbf {z}}}\) and \(\tilde{{\textbf {s}}}\) as follows. For \(j =\text {1},\text {2},\ldots ,K\), when \(j =i _{s }\), let \(\tilde{z }_{j }=z _{i _{s }}^{*}+s _ {i _{s }}^{*},\tilde{s }_{j }=\text {0}\); when \(j \ne i _{s }\), let \(\tilde{z }_{j }=z _{j }^{*},\tilde{s }_{j }=s _{j }^{*}\). Since \(\tilde{{\textbf {z}}} + \tilde{{\textbf {s}}}={\textbf {z}}^{*} + {\textbf {s}}^{*}\) and \(x _{i _{s }}^{*}=\text {0}\), \(({\textbf {x}}^{*},\tilde{{\textbf {z}}},\tilde{{\textbf {s}}})\) also satisfies (17a) and (17b), but \({\textbf {e}}^{T }\tilde{{\textbf {s}}}<{\textbf {e}}^{T }{} {\textbf {s}}^{*}\), which contradicts the choice of \({\textbf {s}}^{*}\). Thus, \(s _{i }^{*}=\text {0}\) must hold for \(\forall i\) satisfying \(x _{i }^{*}=\text {0}\). Because every \(x _{i }^{*}\) is either \(\text {0}\) or \(\text {1}\), the entries of \({\textbf {s}}^{*}\) that are dropped in \({\textbf {x}}^{*T }{} {\textbf {s}}^{*}\) are all zero, and Eq. (33) holds. As a result, \(({\textbf {x}}^{*},{\textbf {z}}^{*},{\textbf {s}}^{*})\) is the optimal solution of \(\mathcal {P}_{\textrm{3}}\), and \(\mathcal {P}_{\textrm{2}}\) and \(\mathcal {P}_{\textrm{3}}\) have the same optimal value.

\(sufficiency:\) In the following, we prove that if \(({\textbf {x}}^{*},{\textbf {z}}^{*},{\textbf {s}}^{*})\) is the optimal solution of \(\mathcal {P}_{\textrm{3}}\), then \({\textbf {x}}^{*}\) is the optimal solution of \(\mathcal {P}_{\textrm{2}}\). We prove this by contradiction.

Assuming that \({\textbf {x}}^{*}\) is not the optimal solution of \(\mathcal {P}_{\textrm{2}}\), and \(\bar{{\textbf {x}}}\) is the optimal solution of \(\mathcal {P}_{\textrm{2}}\), then \(\bar{{\textbf {x}}}^{T }{} {\textbf {A}}_{P }\bar{{\textbf {x}}}<{\textbf {x}}^{*T }{} {\textbf {A}}_{P }{} {\textbf {x}}^{*}\). Since \(\bar{{\textbf {x}}}\) is the optimal solution of \(\mathcal {P}_{\textrm{2}}\), according to the method of finding the optimal solution of \(\mathcal {P}_{\textrm{3}}\) in necessity, \(\bar{{\textbf {z}}}\) and \(\bar{{\textbf {s}}}\) satisfying (17a) and (17b) can be obtained, and \({\textbf {e}}^{T }\bar{{\textbf {s}}}\) is minimized. From the proof of necessity, it can be known that \((\bar{{\textbf {x}}},\bar{{\textbf {z}}},\bar{{\textbf {s}}})\) is the optimal solution of \(\mathcal {P}_{\textrm{3}}\) and satisfies

$$\begin{aligned} \bar{{\textbf {x}}}^{T }{} {\textbf {A}}_{P }\bar{{\textbf {x}}}=\bar{{\textbf {x}}}^{T }\bar{{\textbf {s}}} = {\textbf {e}}^{T }\bar{{\textbf {s}}}. \end{aligned}$$

However, since \(({\textbf {x}}^{*},{\textbf {z}}^{*},{\textbf {s}}^{*})\) is the optimal solution of \(\mathcal {P}_{\textrm{3}}\), it follows from the proof of necessity that

$$\begin{aligned} {\textbf {x}}^{*T }{} {\textbf {A}}_{P }{} {\textbf {x}}^{*}={\textbf {x}}^{*T }{} {\textbf {s}}^{*} = {\textbf {e}}^{T }{} {\textbf {s}}^{*}. \end{aligned}$$

Since \(\bar{{\textbf {x}}}^{T }{} {\textbf {A}}_{P }\bar{{\textbf {x}}}<{\textbf {x}}^{*T }{} {\textbf {A}}_{P }{} {\textbf {x}}^{*}\), \({\textbf {e}}^{T }{} {\textbf {s}}^{*}>{\textbf {e}}^{T }\bar{{\textbf {s}}}\) can be obtained, which contradicts that \(({\textbf {x}}^{*},{\textbf {z}}^{*},{\textbf {s}}^{*})\) is the optimal solution of \(\mathcal {P}_{\textrm{3}}\). Hence, \(Theorem~1\) is proved.
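The equivalence can also be checked numerically on a small instance. The brute-force sketch below is only an illustration under stated assumptions: it takes the objective of \(\mathcal {P}_{\textrm{2}}\) to be \({\textbf {x}}^{T }{} {\textbf {A}}_{P }{} {\textbf {x}}\) and that of \(\mathcal {P}_{\textrm{3}}\) to be \({\textbf {e}}^{T }{} {\textbf {s}}\), as used in the proof, keeps only the constraint \({\textbf {e}}^{T }{} {\textbf {x}}=U\), and omits the constraint involving \(P _{C }\) because it is identical in both problems. All function names are hypothetical.

```python
from itertools import combinations
import numpy as np

def quadratic_cost(A_P, x):
    """Objective used for P2 in the proof: x^T A_P x."""
    return float(x @ A_P @ x)

def min_slack_cost(A_P, x):
    """Smallest e^T s over z, s >= 0 with A_P x - z - s = 0 and z^T x = 0."""
    y = A_P @ x
    s = y * x        # x_i = 1 forces z_i = 0, so s_i = y_i; x_i = 0 allows s_i = 0
    z = y - s        # the remainder is absorbed by z where x_i = 0
    assert np.all(z >= 0) and np.isclose(z @ x, 0.0)
    return float(s.sum())

def best_schedule(A_P, U, cost):
    """Exhaustive search over all 0-1 vectors x with exactly U ones."""
    K = A_P.shape[0]
    best_value, best_users = np.inf, None
    for users in combinations(range(K), U):
        x = np.zeros(K)
        x[list(users)] = 1.0
        value = cost(A_P, x)
        if value < best_value:
            best_value, best_users = value, users
    return best_value, best_users
```

For any non-negative `A_P`, calling `best_schedule` with `quadratic_cost` and with `min_slack_cost` should return the same optimal value, which is the statement of Theorem 1 restricted to this simplified setting. Exhaustive search is only practical for small \(K\).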

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Liang, H., Liu, C., Song, Y. et al. Neighbor-based joint spatial division and multiplexing in massive MIMO: user scheduling and dynamic beam allocation. EURASIP J. Adv. Signal Process. 2024, 1 (2024).
