Regression-based beam training for UAV mmWave communications

For unmanned aerial vehicle (UAV) millimeter-wave (mmWave) communication systems, efficient and accurate beam training is urgently required to overcome beam misalignment. By taking into account the mmWave propagation environment, this paper proposes a three-dimensional (3D) intelligent beam training strategy that leverages a polynomial regression (PR) model and optimized beam patterns. We treat mmWave beam selection as a PR problem and determine the regression function by machine learning (ML). The training dataset used by the ML method consists of measured powers and estimated angles and is obtained with carefully designed beam patterns. Furthermore, a noise suppression method based on a denoising autoencoder (DAE) is developed to overcome the noise sensitivity of the proposed regression model. The numerical simulation results demonstrate that the proposed beam training strategy achieves the same precision as an exhaustive search in less time.

For example, in [16], the authors proposed a hierarchical multiresolution codebook to lower the training time. However, this method introduces a probability of estimation error at some special angles. In [17], the authors developed two fast search procedures based upon the Luus-Jaakola and Tabu methods. The authors in [18] designed an iterative search scheme over sectional subarrays. This method resembles a digital beamforming search scheme and reduces the discovery time. The authors in [19] proposed an efficient beam search scheme supported by the joint judgment technique. The authors in [20] presented an energy-efficient beam-alignment protocol to reduce power consumption. The authors in [21] gave a three-dimensional (3D) hierarchical codebook to estimate both the vertical and horizontal angles at the same time.
Considering the beam search in mmWave UAV communication scenarios, it is difficult to obtain a 3D aligned beam pair via a power measurement without any additional information. Due to the increasing complexity of beam searches in 3D dynamic environments, the use of ML or deep learning (DL) to address the problem has become promising.
In recent years, the application of ML/DL to beam search has received great attention. In [22], the authors used reinforcement learning to realize beam selection based on a ray tracing simulator (RTS), which generated mmWave channels with transceiver mobility. In [23], training data were generated by using an RTS, and aligned beam pairs were obtained on the basis of vehicle positions. These methods can only be applied to particular ground scenarios, such as the Internet of vehicles (IoV). Once the scenario changes, it is necessary to rebuild the training model and recollect the training data. In [24,25], the authors investigated k-nearest neighbours (KNN) and support vector classifiers (SVC) to select the optimal configuration for the analog beamforming (ABF) network based on the estimated angle of arrival (AOA) and received powers. In [26,27], the authors presented a Gaussian process-based ML scheme to achieve fast and accurate UAV position prediction to help complete beam selection. These methods require considerable prior information to calculate the probability distribution function of the target variables. To address these drawbacks, this paper aims to find an efficient method that can realize high-precision beam training in UAV scenarios. The major contributions and novelties of our work are summarized as follows:
(1) A fast 3D beam training strategy is proposed in this paper. This strategy utilizes special frames with a two-phase structure consisting of beam training and data transmission, and the linear regression (LR) model and novel beam patterns are presented in the corresponding phase.
(2) A special LR model is derived to replace the exhaustive beam search process, and the ML algorithm is applied to complete the fitting process. In addition, a novel beam pattern is designed based on the Fourier series method (FSM) to promote the formation of the LR model.
(3) Based on the new training model, a denoising autoencoder (DAE) is proposed to increase the signal-to-noise ratio (SNR). A neural network is applied to establish a mapping between the original data and the noisy data. The training data are employed as labels, which helps to obtain a denoising learning model.
The remainder of this paper is organized as follows. Section 2 describes the system model of UAV communications. Section 3 introduces a fast 3D beam training model. Section 4 presents the DAE method. The simulation and test results are provided in Sect. 5. The conclusions are drawn in Sect. 6.
For notation, matrices and vectors are denoted by $A$ and $a$, respectively. $\|a\|_2$ is the Euclidean norm of $a$, and $A^T$ and $A^H$ are the transpose and conjugate transpose of $A$, respectively.

System model
Beam training is indispensable because of the narrow beams used in mmWave communications. UAVs need to obtain aligned beams through beam training before establishing communication. When communication is interrupted by drastic changes in the UAV's attitude and position, the beams must be aligned again [28,29]. In this scenario, both the base station (BS) and the mobile station (MS) are equipped with a uniform planar array (UPA) of size $M \times M$. The channel model between the BS and MS [30][31][32], denoted by $H \in \mathbb{C}^{M^2 \times M^2}$, can be expressed as

$$H = q\,\alpha_{MS}(\theta'_h, \theta'_v)\,\alpha_{BS}^{H}(\theta_h, \theta_v), \tag{1}$$

where $q$ denotes the complex channel gain; $\theta_h$ and $\theta_v$ represent the horizontal and vertical beam directions of the BS, respectively; $\theta'_h$ and $\theta'_v$ represent the horizontal and vertical beam directions of the MS, respectively; and $\alpha_{BS}(\theta_h, \theta_v)$ and $\alpha_{MS}(\theta'_h, \theta'_v)$ are the array responses of the BS and MS, respectively. Furthermore, the array response of the BS can be defined as

$$\alpha_{BS}(\theta_h, \theta_v) = \alpha_{BSh}(\theta_h) \otimes \alpha_{BSv}(\theta_v), \tag{2}$$

where

$$\alpha_{BSa}(\theta_a) = \left[1, e^{j\theta_a}, \ldots, e^{j(M-1)\theta_a}\right]^T. \tag{3}$$

Here, $a \in \{h, v\}$ covers both the horizontal and vertical domains, and $\theta_a$ is determined by the AOA $\varphi_a$, the signal wavelength $\lambda$, and the distances $d_h$ and $d_v$ between adjacent antenna elements in the horizontal and vertical directions, respectively. $\alpha_{MS}(\theta'_h, \theta'_v)$ can be formed in the same way.
The received signal can be modelled as

$$y = \sqrt{P}\, w^H H c\, r + w^H n,$$

where $P$ is the total transmit power; $w \in \mathbb{C}^{M^2 \times 1}$ and $c \in \mathbb{C}^{M^2 \times 1}$ are the combining and beamforming vectors, respectively; $r$ is the transmitted signal; and $n \in \mathbb{C}^{M^2 \times 1}$ is the complex white Gaussian noise with zero mean and variance $\sigma^2$.
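The system model can be illustrated with a small numerical sketch. It assumes a rank-one channel of the form $H = q\,\alpha_{MS}\alpha_{BS}^H$ and a planar response built as the Kronecker product of a horizontal and a vertical steering vector; both forms, the helper names, and the angle values are assumptions made for illustration, not the authors' implementation. Noise is omitted.

```python
import cmath
import math

def steering_vector(theta, M):
    """1-D array response [1, e^{j*theta}, ..., e^{j(M-1)*theta}] (not normalised)."""
    return [cmath.exp(1j * m * theta) for m in range(M)]

def kron(a, b):
    """Kronecker product of two vectors given as Python lists."""
    return [x * y for x in a for y in b]

def upa_response(theta_h, theta_v, M):
    """Assumed M x M planar response: horizontal (x) vertical Kronecker product."""
    return kron(steering_vector(theta_h, M), steering_vector(theta_v, M))

def inner(w, v):
    """Hermitian inner product w^H v."""
    return sum(wi.conjugate() * vi for wi, vi in zip(w, v))

def received_power(w, c, a_ms, a_bs, q=1.0, P=1.0):
    """Noise-free |sqrt(P) * w^H H c|^2 for the assumed H = q * a_ms * a_bs^H."""
    y = math.sqrt(P) * q * inner(w, a_ms) * inner(a_bs, c)
    return abs(y) ** 2

M = 4                                   # small array, for illustration only
n = M * M
a_bs = upa_response(0.3, -0.5, M)       # hypothetical BS beam directions
a_ms = upa_response(-0.2, 0.7, M)       # hypothetical MS beam directions

# Unit-norm beams matched to the channel collect the full array gain.
c_matched = [x / math.sqrt(n) for x in a_bs]
w_matched = [x / math.sqrt(n) for x in a_ms]
# A BS beam steered to a null direction of the horizontal array loses the link.
c_off = [x / math.sqrt(n) for x in upa_response(0.3 + math.pi / 2, -0.5, M)]

p_matched = received_power(w_matched, c_matched, a_ms, a_bs)
p_off = received_power(w_matched, c_off, a_ms, a_bs)
```

With these assumptions the matched pair yields a power gain of $M^4$, while the misaligned beam falls into a null, which is why narrow beams make training indispensable.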
3 Methods section: beam training

Training strategy
The UAV beam training process can be regarded as a problem of angle selection. However, the main challenge is how to obtain the angles without knowing the positions and attitudes of the BS and MS. In this paper, we design a fast beam training strategy to overcome this problem. The strategy can be implemented by using the framework proposed in Fig. 1. It consists of a beam training phase and a data transmission phase. In the beam training phase, the BS transmits training sequences to the MS at each time slot. At the same time, the MS performs power measurements for beam configuration and feeds the results back to the BS. For simplicity, this paper assumes that the channel between the BS and MS is reciprocal. The beam patterns are shaped, and the power measurements $y_k$ are collected. The corresponding BF vectors for shaping the beam pattern are obtained by the method proposed in Sect. 3.2, and the beam directions are obtained from $\theta = f(y)$.
In the data transmission phase, the BS and MS utilize their beam pairs obtained in the training phase to transmit data.
Figure 2 shows the traditional training strategy, which finds the optimal beam pair by maximizing the received power. In contrast, our method uses $\theta = f(y)$ to simplify the search process. The special beam pattern described in Sect. 3.2 extracts additional information from the power measurements, which benefits the training efficiency.
The regression model proposed in Sect. 3.3 is the key point of the proposed training strategy and is used to fit the function $\theta = f(y)$. The LR model is often adopted for its simplicity; therefore, we shape $\theta = f(y)$ into an LR function, and the fitting process of the function is completed by using the ML algorithm in this paper.
For the traditional searching method, all beam patterns are consistent, but the beam directions are different. The optimal beam pairs can be obtained by maximizing the received power. However, our strategy utilizes novel training beams with additional angle information, which can improve the efficiency and accuracy of beam training.

Specially designed beam patterns
In this section, we design the beam patterns to make $\theta = f(y)$ an LR function and fit it by the ML method with a large amount of training data. To complete the fitting process, the input parameter $x$ of (7) is designed as a ratio formed from two power measurements, $|y_k|^2$ and $|y_{k+1}|^2$, where $k$ represents the time slot. Since each time slot is very short, the channels of two adjacent measurements can be considered almost the same. Thus, the input parameter is independent of the signal attenuation $K$ and other factors.
To make $x$ and $\theta$ present a linear relationship, the denominator is designed as a constant $C$, and the numerator is designed as a linear function of $\theta$, namely $Z\theta + b$, where $Z$ and $b$ are both constants. The power measurements $|y_k|^2$ and $|y_{k+1}|^2$ are modelled accordingly in (9). It is difficult to design a BF vector that exactly satisfies the beam patterns in (9). Therefore, we quantize the spatial domain into multiple regions. The beam gain of each quantized region is determined by the sample value of (9). For convenience of explanation, an example is shown in Fig. 3, which describes the relationship between the horizontal direction and the beam gain. A similar approach can be followed to estimate the vertical beam direction.

Fig. 2 The traditional strategy versus the proposed strategy
Note that the numbers 3, 4, ..., 10 in Fig. 3 only show a simplified distribution of the beam gain. Using (7) to calculate $x$ in all beam regions, we find that the $\theta_h$ of each region corresponds to a specific $x$. Furthermore, $x$ and $\theta_h$ obey a monotonic relationship, which makes the function $f(x)$ mentioned in this section well suited to fitting with ML.
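The role of the ratio feature can be checked with a short sketch. The exact expression of $x$ in (7) is not reproduced here, so the code assumes one form that satisfies the stated design goals: a difference-over-sum ratio of two measurements taken with complementary beams whose ideal gains are $(C \pm (Z\theta + b))/2$. The constants and function names are hypothetical.

```python
def beam_gains(theta, C=13.0, Z=7.0, b=0.0):
    """Hypothetical ideal gains of two complementary training beams."""
    g1 = 0.5 * (C + (Z * theta + b))
    g2 = 0.5 * (C - (Z * theta + b))
    return g1, g2

def measure(theta, K):
    """Noise-free power measurements |y_k|^2 and |y_{k+1}|^2 under attenuation K."""
    g1, g2 = beam_gains(theta)
    return K * g1, K * g2

def feature(p1, p2):
    """Assumed ratio feature: the denominator is K*C and the numerator K*(Z*theta + b)."""
    return (p1 - p2) / (p1 + p2)

thetas = [-0.5 + 0.1 * i for i in range(11)]
x_near = [feature(*measure(t, K=1.0)) for t in thetas]    # strong link
x_far = [feature(*measure(t, K=1e-3)) for t in thetas]    # heavily attenuated link
```

Under this assumed form the attenuation $K$ cancels, so `x_near` and `x_far` coincide, and $x = (Z\theta + b)/C$ grows monotonically with $\theta$, which is the property the regression relies on.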
The whole beam region in Fig. 3 is divided into several parts, and the BF vector $c_i$ for the $i$th region can be obtained by using the FSM [33], where $(m_h, m_v)$ is the serial number of the antennas, $X(m_h)$ is the ratio of the antenna abscissa to the horizontal separation $d_h$, $Y(m_v)$ is the ratio of the antenna ordinate to the vertical separation $d_v$, $(\omega_{h0}, \omega_{v0})$ is the centre of the beam region in the horizontal and vertical domains, and $\omega_{hb}$ and $\omega_{vb}$ are the widths from the centre to the horizontal and vertical boundaries, respectively. The BF vector $c_k$ is the sum of all $c_i$ values and is defined as

$$c_k = \sum_i c_i. \tag{12}$$

Since the actual beam patterns generated by (12) cannot be exactly the same as those in Fig. 3, it is only necessary to ensure that the distribution of the beam gain meets the expectation. The relationship between the input and output is not perfectly linear; therefore, ML is adopted to obtain an accurate regression model.
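A one-dimensional sketch may help to picture the Fourier series idea. It is not the paper's 2-D expression: it assumes a linear array with an odd number of elements indexed symmetrically about the centre, takes the weights as the Fourier coefficients of a flat gain on $[\omega_0 - \omega_b, \omega_0 + \omega_b]$, and sums the per-region vectors as in the definition of $c_k$. All names and the staircase gain values are illustrative.

```python
import cmath
import math

def fsm_weights(M, w0, wb, gain=1.0):
    """Fourier-series weights for a flat gain on [w0 - wb, w0 + wb] (M odd)."""
    half = (M - 1) // 2
    weights = []
    for m in range(-half, half + 1):
        # Fourier coefficient of a rectangular window of half-width wb.
        coeff = wb / math.pi if m == 0 else math.sin(m * wb) / (math.pi * m)
        weights.append(gain * coeff * cmath.exp(-1j * m * w0))
    return weights

def array_factor(weights, w):
    """Beam response sum_m c_m e^{j*m*w} with symmetric element indices."""
    half = (len(weights) - 1) // 2
    return sum(c * cmath.exp(1j * m * w)
               for m, c in zip(range(-half, half + 1), weights))

def sum_weights(regions, M):
    """c_k as the element-wise sum of the per-region vectors c_i."""
    total = [0j] * M
    for w0, wb, gain in regions:
        total = [t + c for t, c in zip(total, fsm_weights(M, w0, wb, gain))]
    return total

# Hypothetical staircase: three adjacent regions with increasing gain.
regions = [(-0.6, 0.3, 1.0), (0.0, 0.3, 2.0), (0.6, 0.3, 3.0)]
c_k = sum_weights(regions, M=21)
```

Because the series is truncated to the number of antennas, the synthesized pattern shows ripples around the desired levels, which matches the remark that the actual beams only approximate the ideal ones.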

Polynomial regression model
ML provides a variety of regression algorithms. Polynomial regression is a kind of LR model, since it is linear in its coefficients, and has a wide range of applications because any continuous function on a bounded interval can be approximated by a polynomial. Compared with basic linear regression, it is also suitable for nonlinear functions. In this paper, we utilize polynomial regression to fit $f(x)$, which can be expressed as

$$f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_n x^n,$$

where $x$ and $\beta_n$ are the feature and coefficient, respectively. The loss function of this model is the squared error between the label $\theta$ and the prediction $X\beta^T$, where $X = [1, x, x^2, \ldots, x^n]$ and $\beta = [\beta_0, \beta_1, \ldots, \beta_n]$. In this paper, the power measurements are saved as $x$ in the dataset. The beam direction $\theta$ is used as the training label for the regression. By minimizing the loss function, the coefficients $\beta_n$ can be obtained.
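The fitting step can be sketched as an ordinary least-squares problem. The code below assumes a squared-error loss and solves the normal equations directly with a small Gaussian elimination; the synthetic data (a nearly linear $\theta(x)$ with a small cubic term) and all function names are illustrative and not taken from the paper.

```python
def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z

def fit_polynomial(xs, thetas, degree):
    """Least-squares coefficients [beta_0, ..., beta_n] via the normal equations."""
    rows = [[x ** p for p in range(degree + 1)] for x in xs]
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(degree + 1)]
            for i in range(degree + 1)]
    rhs = [sum(r[i] * t for r, t in zip(rows, thetas)) for i in range(degree + 1)]
    return solve_linear(gram, rhs)

def predict(beta, x):
    """Evaluate f(x) = beta_0 + beta_1 x + ... + beta_n x^n."""
    return sum(b * x ** p for p, b in enumerate(beta))

# Synthetic training set: feature x, label theta (hypothetical relationship).
xs = [-0.5 + 0.05 * i for i in range(21)]
thetas = [0.1 + 1.8 * x + 0.05 * x ** 3 for x in xs]
beta = fit_polynomial(xs, thetas, degree=3)
theta_hat = predict(beta, 0.2)
```

For higher degrees or badly scaled features, a QR-based or library solver would be numerically safer than the normal equations; the direct form is used here only to keep the sketch self-contained.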
The training result of the ML method depends on both the learning model and the dataset. When the handcrafted feature of (7) is applied, the search strategy is shown in Algorithm 1. In Algorithm 1, $A_1$ and $A_2$ are the two BS beams shown in Fig. 3. $A_3$ and $A_4$ are obtained by exchanging the parameters of $\theta_h$ and $\theta_v$. The beam $B$ of the MS is obtained in the same way. According to the description in Sect. 3.2, two measurements can determine one beam angle, $\theta_h$ or $\theta_v$. For 3D beams, the four angles of the beam pair can be obtained with eight measurements. As described in Algorithm 1, the optimized searching strategy can obtain all beam angles with only six measurements, while the traditional strategy can only complete the searching process of the first layer over the same time. Since the feature used by Algorithm 1 is a one-dimensional variable, a univariate polynomial regression model is sufficient. In the initial stage, the function $f(x)$ is designed as an ideal linear function; therefore, the estimated function curve is close to a straight line.
The features used in Algorithm 1 are artificially designed. The error mainly comes from the difference between the actual and ideal beams. Taking the estimation of the horizontal angle $\theta_h$ as an example, the ripples of the non-ideal beam in the horizontal domain result in different horizontal angles with the same received power. In addition, the ripples in the vertical domain result in different power values with the same horizontal angle. Note that the error affects the direction of the aligned beam, but the main lobe can still cover the actual angle region.
Algorithm 1 can be used to verify the rationality of the proposed beam design method. On this basis, the features can be replaced by the original power measurements. As shown in Algorithm 2, four measurements are taken as characteristic variables, and the horizontal and vertical angles are taken as labels. High-dimensional features can fit more complex data relationships. When there is an error between the actual beam and the ideal beam, our method can better learn the gain distribution of the actual beams.

Noise reduction
Since inaccurate measurements caused by a noisy environment lead to incorrect estimations, a recurrent neural network (RNN)-based DAE is proposed in this paper. Figure 4 shows the framework of the neural network.
The DAE is composed of an encoder and a decoder. The encoder consists of one gated recurrent unit (GRU) layer with 512 units, one GRU layer with 256 units, one dense layer with 256 units, and one dense layer with 32 units. The structure of the decoder is designed in a similar way. Note that the simplest DAE comprises only a number of dense layers. To better process the sequence data, we add GRU layers to the encoder and decoder.
In Fig. 5, the neural network establishes a mapping between the original data and the noisy data. The original data are the beam direction values while the UAV is working. The noisy data are assumed to be obtained by adding Gaussian noise to the original data. We first acquire the low-dimensional feature by using the encoder to encode the noisy data, and then we restore the feature into the corresponding output by utilizing the decoder. By employing the training data as labels, we can obtain a denoising learning model.

Fig. 4 Framework of the DAE
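The construction of the DAE training set can be sketched as follows. Only the data preparation is shown, not the GRU network: each clean sequence is paired with a noisy copy obtained by adding zero-mean Gaussian noise at a chosen SNR, and the clean sequence serves as the label. The function names, the sinusoidal test sequence, and the way the SNR is defined (average signal power over noise variance) are assumptions for illustration.

```python
import math
import random

def add_gaussian_noise(clean, snr_db, rng):
    """Return a noisy copy of `clean` at the requested SNR (in dB)."""
    signal_power = sum(v * v for v in clean) / len(clean)
    noise_std = math.sqrt(signal_power / 10 ** (snr_db / 10))
    return [v + rng.gauss(0.0, noise_std) for v in clean]

def make_pairs(sequences, snr_db, seed=0):
    """Build (noisy input, clean label) pairs for denoising training."""
    rng = random.Random(seed)
    return [(add_gaussian_noise(seq, snr_db, rng), seq) for seq in sequences]

def measured_snr_db(noisy, clean):
    """Empirical SNR of a noisy sequence relative to its clean version."""
    signal = sum(c * c for c in clean)
    noise = sum((n - c) ** 2 for n, c in zip(noisy, clean))
    return 10 * math.log10(signal / noise)

# Hypothetical slowly varying sequence standing in for the original data.
clean = [math.sin(0.01 * i) for i in range(5000)]
pairs = make_pairs([clean], snr_db=10.0)
noisy, label = pairs[0]
```

In practice the set would contain many sequences at several SNR values, so that the network sees a range of amplitudes and noise powers.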

Discussion and results section
In this section, we provide numerical simulations to verify the effectiveness of the proposed strategy. The beam pattern, regression model and DAE are the main factors that affect the performance of beam training. This paper mainly analyses the efficiency of these main factors. The simulation parameters are set as follows:

Parameters Value
Array elements M 21

Beam pattern
The beam pattern is very important for the formation of the LR model, as it adds additional information to the power measurement. Therefore, the actual beam needs to be designed to approximate the ideal beam with the patterns proposed in Sect. 3.2. Figure 6 shows the actual beam patterns formed by the proposed BF method in Sect. 3.2. From Fig. 6a, the designed beam pattern can satisfy the requirement of Fig. 3. Figure 6b shows the coverage area of the beam through the top view. The variations in the beam gain with angles $\theta_h$ and $\theta_v$ are illustrated in Fig. 6c and d, respectively. The simulation result verifies that the beam gain is not affected by $\theta_v$ but changes with $\theta_h$. However, due to the limited number of array elements, there are ripples in the beam pattern. Generally, the ripples can be reduced by adding windows, but windowing increases the beam width and reduces the beam gain, which makes the simulation result more susceptible to noise interference. The ripples can also be reduced by increasing the number of array elements. The beam patterns with different numbers of array elements are shown in Fig. 7. The variance between the actual beam and the ideal beam is displayed at the top right of the picture. The simulation results show that the more array elements there are, the closer the beam pattern is to the ideal one.

Polynomial regression
In this section, the attitude of the UAV is simulated by changing the AOA and the angle of departure (AOD). The power measurements with different angles are collected as the input of the training model. The actual angle is set as the label. To ensure the reliability of the proposed model, the sampling angle values uniformly cover the entire beam width.
To verify the reliability of polynomial regression, the beam proposed in Fig. 6 is applied to Algorithm 1. In this paper, we use the normalized mean squared error (MSE) to evaluate the accuracy of the proposed model and analyse the influence of parameters. In Fig. 8, the degree value represents the highest power of the polynomial in the polynomial regression (PR) model. The simulation results show that when the highest power of the polynomial is from 1 to 5, the MSE is relatively low. The simulation results verify that the function $f(x)$ can be well approximated as a linear model, which is consistent with the design value of Sect. 3.2. It is worth mentioning that noise has a great impact on the performance. Figure 9 shows the fitting performance of different regression models. The dotted line is the actual curve, while the solid line is obtained by the PR model. Notably, the designed beam pattern is adopted in the simulation. The simulation result shows that the designed beam is beneficial for promoting the accurate fitting of the regression model. However, the difference between the actual pattern and the ideal pattern may lead to different outputs with the same input, which would increase the training error.

Fig. 6 Beam patterns of the proposed method
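The evaluation metric can be written down in a few lines. The paper does not spell out its normalization, so this sketch assumes the common choice of dividing the summed squared error by the summed squared true values; the function name and the sample numbers are illustrative.

```python
def normalized_mse(estimates, truths):
    """Summed squared error divided by the energy of the true values (assumed form)."""
    error = sum((e - t) ** 2 for e, t in zip(estimates, truths))
    reference = sum(t ** 2 for t in truths)
    return error / reference

# Two hypothetical angle estimates against their true values.
nmse = normalized_mse([1.1, 1.9], [1.0, 2.0])
```

With this definition the metric is dimensionless, so results for different angle ranges can be compared on one axis.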
A comparison of the three regression models shows that the PR model has the lowest MSE. This means that the PR model can obtain reliable results when the training dataset is nonideal. The KNN algorithm can deal with both classification and regression problems; for regression, it uses the mean value of several neighbour points as the predicted value of the model. The predicted value of the decision tree depends on the mean value of the sample points. The regression tree divides the feature space into several units, and each unit has a specific output. Test data are assigned to a unit according to their characteristics, and the corresponding output value is then looked up. Both of these regression models can handle low-dimensional data, but they are not as effective as polynomial regression on the dataset used in this paper.
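The KNN regression rule described above, averaging the labels of the nearest neighbours, can be sketched in a few lines. The one-dimensional feature, the linear toy data, and the function name are assumptions for illustration only.

```python
def knn_regress(train_x, train_theta, x, k=3):
    """Predict theta at x as the mean label of the k nearest training features."""
    ranked = sorted(zip(train_x, train_theta), key=lambda pair: abs(pair[0] - x))
    nearest = ranked[:k]
    return sum(theta for _, theta in nearest) / len(nearest)

# Toy dataset with an exactly linear relationship theta = 2 x.
train_x = [i / 10 for i in range(11)]
train_theta = [2 * x for x in train_x]

pred_a = knn_regress(train_x, train_theta, 0.50)
pred_b = knn_regress(train_x, train_theta, 0.52)
```

Both queries share the same three neighbours, so the prediction stays flat between training points even though the underlying relationship is linear. This piecewise-constant behaviour is one plausible reason a polynomial fit can do better on nearly linear data.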
To verify the influence of the original features on the regression model, we compare the MSE of Algorithm 1 and Algorithm 2 in Fig. 10. As shown in Fig. 10, the regression method employing original features can effectively conduct angle estimation, and the estimation error is much smaller than that of the regression method using handcrafted features. In addition, the number of array elements has almost no effect on the estimation error because the error of beam gain caused by the number of array elements does not affect the distribution characteristics of the training data.
Due to the introduction of the regression function and designed beam, only a small number of time slots are needed to complete the beam configurations. In our proposed training strategy, the reduction of training slots means an increase in the transmission time and data rate.

Fig. 9 MSE of different regression models
Figure 11 shows the data rates of our method and hierarchical search when the beam configuration is finished. As shown in Fig. 11, in the conventional method, if the hierarchical beam search only performs one layer of training (S = 1), then the configuration time is short. However, the data rate is reduced because the beam gain of the first layer is too low. With an increase in the number of training layers (S = 3), the beam gain increases. However, the time slots for data transmission are reduced with increasing training time. In contrast, the training strategy proposed in this paper not only has a short training time but also uses a narrow beam with high gain for data transmission. Therefore, the proposed method is more effective and more suitable for UAV scenarios than conventional approaches.
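The trade-off between training time and beam gain can be made concrete with a toy calculation. The rate model (the fraction of the frame left for data times $\log_2(1 + \text{SNR} \cdot \text{gain})$), the frame length, the slot counts for the hierarchical search, and the gain values are all hypothetical; only the six training measurements of the proposed strategy come from the text. The numbers are not simulation results from the paper.

```python
import math

def effective_rate(train_slots, frame_slots, snr, beam_gain):
    """Spectral efficiency scaled by the fraction of the frame left for data."""
    data_fraction = 1.0 - train_slots / frame_slots
    return data_fraction * math.log2(1.0 + snr * beam_gain)

FRAME_SLOTS = 100   # hypothetical frame length in slots
SNR = 0.1           # hypothetical pre-beamforming SNR (linear scale)

# Hierarchical search: one layer is quick but wide; three layers are slow but narrow.
rate_s1 = effective_rate(8, FRAME_SLOTS, SNR, beam_gain=4.0)
rate_s3 = effective_rate(24, FRAME_SLOTS, SNR, beam_gain=64.0)
# Proposed strategy: six measurements and a narrow, high-gain beam.
rate_proposed = effective_rate(6, FRAME_SLOTS, SNR, beam_gain=64.0)
```

Under these assumed values the proposed configuration keeps both the short training time of a shallow search and the gain of a deep one, which is the qualitative behaviour described for Fig. 11.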

DAE
To evaluate the effectiveness of the DAE model for noise reduction, we compared the normalized MSE of different algorithms with the same SNR. We take the noisy signal waveform as the input of the DAE and the actual waveform as the training label. To prevent the neural network from overfitting, the input of training data should include a large number of waveforms with different amplitudes and noise powers.
Figure 12 shows that the DAE could greatly reduce the error caused by noise. Therefore, our proposed beam search strategy could obtain aligned beams with less error. The MSE of the exhaustive search represents the error between the maximum radiation direction of the aligned beam and the actual direction. To obtain the minimum MSE of exhaustive search, we assume that there is no mismatch. The simulation result shows that the proposed strategy can achieve nearly the same performance as the exhaustive search but consumes much less training time.

Conclusion
In this paper, a regression-based beam training strategy for UAV mmWave communication systems is proposed. Specifically, we formulated the training problem as an LR function and fitted it with the ML method. A special beam pattern has been proposed to promote the fitting of the function by providing additional information. Moreover, an RNN-based DAE has been introduced to reduce the impact of noise on the proposed model. The simulation results showed that the proposed strategy can effectively reduce the beam search overhead while guaranteeing the matching accuracy of the beam alignment requirements. In the future, the performance of the proposed beam search strategy will be evaluated on an actual UAV platform.

Fig. 3 Ideal beam patterns for the training data

Fig. 5 Process of the DAE

Fig. 10 The influence of the original and hand-crafted features