### 3.1 Training strategy

The UAV beam training process can be regarded as an angle selection problem. The main challenge is how to obtain the angles without knowing the positions and attitudes of the BS and MS. In this paper, we design a fast beam training strategy to overcome this problem. The strategy can be implemented with the framework shown in Fig. 1, which consists of a beam training phase and a data transmission phase. In the beam training phase, the BS transmits training sequences to the MS in each time slot. At the same time, the MS performs power measurements for beam configuration and feeds the results back to the BS. For simplicity, this paper assumes that the channel between the BS and MS is reciprocal. The beam patterns are shaped, and the power measurements \(y_{k}\) are collected. The corresponding BF vectors for shaping the beam patterns are obtained by the method proposed in Sect. 3.2, and the beam directions are obtained from \(\theta = f(y)\). In the data transmission phase, the BS and MS use the beam pairs obtained in the training phase to transmit data.

Figure 2 shows the traditional training strategy, which searches for the optimal beam pair by maximizing the received power. In contrast, our method uses \(\theta = f(y)\) to simplify the search process. The specially designed beam pattern, described in Sect. 3.2, extracts additional angle information from the power measurements, which improves the training efficiency.

The regression model proposed in Sect. 3.3, which fits the function \(\theta = f(y)\), is the key component of the proposed training strategy. The LR model is often adopted for its simplicity; therefore, we design \(\theta = f(y)\) to be an LR function, and the fitting process is completed by using the ML algorithm in this paper.

For the traditional searching method, all beam patterns are consistent, but the beam directions are different. The optimal beam pairs can be obtained by maximizing the received power. However, our strategy utilizes novel training beams with additional angle information, which can improve the efficiency and accuracy of beam training.

### 3.2 Specially designed beam patterns

In this section, we design the beam patterns to make \(\theta = f(y)\) an LR function and fit it by the ML method with a large amount of training data. To complete the fitting process, the input parameter is designed as

$$x = \frac{{\left| {y_{k} } \right|^{2} - \left| {y_{k + 1} } \right|^{2} }}{{\left| {y_{k} } \right|^{2} + \left| {y_{k + 1} } \right|^{2} }}.$$

(7)

It contains two power measurements \(\left| {y_{k} } \right|^{2}\) and \(\left| {y_{k + 1} } \right|^{2}\), where \(k\) denotes the time slot. Since each time slot is very short, the channels of two adjacent measurements can be considered almost the same. The common channel factors therefore cancel in the ratio, and the input parameter is independent of the signal attenuation \(K\) and other common factors.
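
The cancellation can be checked numerically. The following is a minimal sketch, not the authors' implementation: the function name `power_ratio_feature` and the two beam responses are hypothetical values chosen only for illustration.

```python
import numpy as np

def power_ratio_feature(y_k, y_k1):
    """Input parameter x of (7) from two adjacent measurements."""
    p_k = np.abs(y_k) ** 2
    p_k1 = np.abs(y_k1) ** 2
    return (p_k - p_k1) / (p_k + p_k1)

# Hypothetical beam responses in two adjacent time slots (illustrative only).
g_k, g_k1 = 0.8 + 0.3j, 0.5 - 0.2j

x_ref = power_ratio_feature(g_k, g_k1)
# Same responses seen through a common attenuation K = 0.01 (assumed value).
x_scaled = power_ratio_feature(0.01 * g_k, 0.01 * g_k1)

print(x_ref, x_scaled)  # the two values agree up to rounding
```

Because \(K\) multiplies both measurements, it scales the numerator and denominator of (7) by the same factor \(\left| K \right|^{2}\) and drops out.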

To make \(x\) and \(\theta\) present a linear relationship, the denominator is designed as a constant C, and the numerator is designed as a linear function of \(\theta\) as

$$\left\{ \begin{gathered} \left| {y_{k} } \right|^{2} - \left| {y_{k + 1} } \right|^{2} = z\theta + b \hfill \\ \left| {y_{k} } \right|^{2} + \left| {y_{k + 1} } \right|^{2} = C \hfill \\ \end{gathered} \right.$$

(8)

where \(z\) and \(b\) are both constants. Furthermore, \(\left| {y_{k} } \right|^{2}\) and \(\left| {y_{k + 1} } \right|^{2}\) are modelled as

$$\left\{ \begin{gathered} \left| {y_{k} } \right|^{2} = \frac{1}{2}z\theta + b_{1} \hfill \\ \left| {y_{k + 1} } \right|^{2} = - \frac{1}{2}z\theta + b_{2} \hfill \\ \end{gathered} \right.$$

(9)

where

$$\left\{ \begin{gathered} b_{1} + b_{2} = C \hfill \\ b_{1} - b_{2} = b \hfill \\ \end{gathered} \right..$$

(10)
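
As a short worked step (an added derivation, assuming \(z \ne 0\) and \(C \ne 0\)), solving (10) for the offsets and substituting (8) into (7) makes the intended linear relationship explicit:

$$b_{1} = \frac{C + b}{2},\quad b_{2} = \frac{C - b}{2},\qquad x = \frac{z\theta + b}{C}\quad \Rightarrow \quad \theta = \frac{Cx - b}{z}.$$

Under the ideal patterns, the beam direction is therefore an affine function of the input parameter \(x\).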

It is difficult to design a BF vector that satisfies the beam patterns in (9) exactly. Therefore, we quantize the spatial domain into multiple regions. The beam gain of each quantized region is determined by the sample value of (9). For convenience of explanation, an example is shown in Fig. 3. It describes the relationship between the horizontal direction and the beam gain. A similar approach can be followed for the vertical direction.

Note that the numbers 3, 4, …, 10 in Fig. 3 only show a simplified distribution of the beam gain. Using (7) to calculate \(x\) in all beam regions, we find that the angle \(\theta_{h}\) of each region corresponds to a specific \(x\). Furthermore, \(x\) and \(\theta_{h}\) obey a monotonic relationship, which makes the function \(f(x)\) well suited to fitting with ML.
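
The monotonic mapping can be illustrated with a small sketch. The constants below are assumptions, not values from the paper. They are chosen only so that the sampled gains run 3, 4, …, 10, echoing the simplified numbers quoted above, and the region centres are in arbitrary units.

```python
import numpy as np

# Illustrative constants only (not taken from the paper).
z, b1, b2 = 2.0, 6.5, 6.5
theta_h = np.linspace(-3.5, 3.5, 8)      # centres of 8 quantized regions

gain_k = 0.5 * z * theta_h + b1          # first line of (9), sampled per region
gain_k1 = -0.5 * z * theta_h + b2        # second line of (9), sampled per region
x = (gain_k - gain_k1) / (gain_k + gain_k1)   # input parameter of (7)

# Each region centre maps to a distinct x, and x increases with theta_h.
is_monotonic = bool(np.all(np.diff(x) > 0))
for t, v in zip(theta_h, x):
    print(f"theta_h = {t:+.1f}  ->  x = {v:+.4f}")
```

With these constants the sum of the two gains is 13 in every region, so \(x = 2\theta_{h}/13\) and each region is identified by its own value of \(x\).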

The whole beam region in Fig. 3 is divided into several parts, and the BF vector \({\varvec{c}}_{i}\) for the \(i\)th region can be obtained by using the FSM [33] as

$$\begin{aligned} c_{i} (M \times (m_{h} - 1) + m_{v} ) & = e^{{ - j(X(m_{h} )\omega_{h0} (i) + Y(m_{v} )\omega_{v0} (i))}} \\ & \quad \cdot \frac{{\sin (\omega_{hb} (i)X(m_{h} ))}}{{\pi X(m_{h} )}} \cdot \frac{{\sin (\omega_{vb} (i)Y(m_{v} ))}}{{\pi Y(m_{v} )}}, \\ \end{aligned}$$

(11)

where \((m_{h} ,m_{v} )\) is the serial number of antennas, \(X(m_{h} )\) is the ratio of the antenna abscissa to horizontal separation \(d_{h}\), \(Y(m_{v} )\) is the ratio of the antenna ordinate to vertical separation \(d_{v}\), \((\omega_{h0} ,\omega_{v0} )\) is the centre of the beam region in the horizontal and vertical domains, and \(\omega_{hb}\) and \(\omega_{vb}\) are the widths from the centre to the horizontal and vertical boundaries, respectively. The BF vector \({\varvec{c}}_{k}\) is the sum of all \({\varvec{c}}_{i}\) values and is defined as

$${\varvec{c}}_{k} = \sum {{\varvec{c}}_{i} } .$$

(12)
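
The construction in (11) and (12) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the normalised antenna coordinates are assumed to be centred on the array, \(M\) is assumed to be the number of antennas in the vertical dimension, and the tapers use the half-widths \(\omega_{hb}\) and \(\omega_{vb}\) as defined in the text.

```python
import numpy as np

def fsm_region_vector(n_h, n_v, w_h0, w_v0, w_hb, w_vb):
    """BF vector of one rectangular beam region, following the form of (11)."""
    # Assumption: coordinates normalised by the element spacing, centred on
    # the array, e.g. -1.5, -0.5, 0.5, 1.5 for four elements.
    X = np.arange(n_h) - (n_h - 1) / 2
    Y = np.arange(n_v) - (n_v - 1) / 2
    # sin(w t) / (pi t), with the limit w / pi at t = 0.
    # np.sinc(u) is sin(pi u) / (pi u), hence the rescaled argument.
    taper_h = (w_hb / np.pi) * np.sinc(w_hb * X / np.pi)
    taper_v = (w_vb / np.pi) * np.sinc(w_vb * Y / np.pi)
    steer_h = np.exp(-1j * X * w_h0)
    steer_v = np.exp(-1j * Y * w_v0)
    # Flatten with the vertical index running fastest, which matches the
    # index M (m_h - 1) + m_v when M = n_v (assumption).
    return np.outer(steer_h * taper_h, steer_v * taper_v).ravel()

def combine_regions(regions, n_h, n_v):
    """Sum of the per-region vectors, as in (12)."""
    return sum(fsm_region_vector(n_h, n_v, *r) for r in regions)

# Hypothetical regions given as (w_h0, w_v0, w_hb, w_vb).
regions = [(-0.6, 0.0, 0.2, 0.5), (-0.2, 0.0, 0.2, 0.5)]
c_k = combine_regions(regions, n_h=8, n_v=8)
print(c_k.shape)
```

The sketch sums the region vectors with equal weight, exactly as (12) is written. How the per-region gain sampled from (9) enters the sum is not specified here and is left out.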

Since the actual beam patterns generated by (12) cannot be exactly the same as those in Fig. 3, it is only necessary to ensure that the distribution of the beam gain meets the expectation. As a result, the relationship between the input and output is not perfectly linear; therefore, ML is adopted to obtain an accurate regression model.

### 3.3 Polynomial regression model

ML provides a variety of regression algorithms. Polynomial regression is a kind of LR model, since it is linear in its coefficients, and has a wide range of applications because any continuous function on a closed interval can be approximated by a polynomial. Compared with basic linear regression, it is also suitable for nonlinear functions. In this paper, we utilize polynomial regression to fit \(f(x)\), which can be expressed as

$$f({\varvec{x}}) = \beta_{0} + \beta_{1} {\varvec{x}} + \beta_{2} {\varvec{x}}^{2} + \cdots + \beta_{n} {\varvec{x}}^{n} ,$$

(13)

where \({\varvec{x}}\) and \(\beta_{n}\) are the feature and coefficient, respectively. The loss function of this model is

$${\text{J}}({\varvec{\beta}}) = \frac{1}{2}({\mathbf{X}}{\varvec{\beta}} - {\mathbf{Y}})^{{\text{T}}} ({\mathbf{X}}{\varvec{\beta}} - {\mathbf{Y}}),$$

(14)

where \({\mathbf{X}} = [1,{\varvec{x}},{\varvec{x}}^{2} , \ldots ,{\varvec{x}}^{n} ]\) and \({\varvec{\beta}} = [\beta_{0} ,\beta_{1} , \ldots ,\beta_{n} ]^{{\text{T}}}\). In this paper, power measurements are saved as \({\varvec{x}}\) in the dataset. The beam direction \(\theta\) is used as the training label for the regression. By minimizing the loss function, the coefficients \(\beta_{n}\) can be obtained.
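
Minimizing (14) is an ordinary least-squares problem, which the following sketch solves with NumPy. The helper names, the polynomial degree, and the synthetic relation between \(x\) and \(\theta\) are assumptions made for illustration; they are not the dataset or settings used in this paper.

```python
import numpy as np

def fit_polynomial(x, theta, degree):
    """Coefficients beta that minimise the loss J(beta) of (14)."""
    X = np.vander(x, degree + 1, increasing=True)   # columns 1, x, ..., x^n
    beta, *_ = np.linalg.lstsq(X, theta, rcond=None)
    return beta

def predict(beta, x):
    """Evaluate the fitted polynomial (13) at the given feature values."""
    return np.vander(np.atleast_1d(x), len(beta), increasing=True) @ beta

# Synthetic, mildly nonlinear relation (purely illustrative).
rng = np.random.default_rng(0)
x = rng.uniform(-0.6, 0.6, 200)
theta_true = 30.0 * x + 5.0 * x ** 3
theta = theta_true + rng.normal(0.0, 0.1, x.size)   # noisy training labels

beta = fit_polynomial(x, theta, degree=3)
max_err = float(np.max(np.abs(predict(beta, x) - theta_true)))
print(beta, max_err)
```

A least-squares solver is used here instead of forming the normal equations explicitly, since it behaves better when the powers of \(x\) are strongly correlated.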

The training result of the ML method depends on both the learning model and the dataset. When the handcrafted feature of (7) is applied, the search strategy is shown in Algorithm 1.

In Algorithm 1, \({\text{A}}_{1}\) and \({\text{A}}_{2}\) are the two BS beams shown in Fig. 3. \({\text{A}}_{3}\) and \({\text{A}}_{4}\) are obtained by exchanging the parameters of \(\theta_{h}\) and \(\theta_{v}\). The beam \({\text{B}}\) of the MS is obtained in the same way. According to the description in Sect. 3.2, two measurements can determine one beam angle, \(\theta_{h}\) or \(\theta_{v}\). For 3D beams, the four angles of the beam pair can be obtained with eight measurements. As described in Algorithm 1, the optimized searching strategy can obtain all beam angles with only six measurements, while the traditional strategy can only complete the searching process of the first layer over the same time. Since the feature used by Algorithm 1 is a one-dimensional variable, a univariate polynomial regression model is sufficient. In the initial stage, the function \(f\left( x \right)\) is designed as an ideal linear function; therefore, the estimated function curve is close to a straight line.

The features used in Algorithm 1 are artificially designed. The error mainly comes from the difference between the actual and ideal beams. Taking the estimation of the horizontal angle \(\theta_{h}\) as an example, the ripples of the non-ideal beam in the horizontal domain result in different horizontal angles with the same received power. In addition, the ripples in the vertical domain result in different power values with the same horizontal angle. Note that the error affects the direction of the aligned beam, but the main lobe can still cover the actual angle region.

Algorithm 1 can be used to verify the rationality of the proposed beam design method. On this basis, the features can be replaced by the original power measurements. As shown in Algorithm 2, four measurements are taken as characteristic variables, and the horizontal and vertical angles are taken as labels. High-dimensional features can fit more complex data relationships. When there is an error between the actual beam and the ideal beam, our method can better learn the gain distribution of the actual beams.
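
The idea of using the raw measurements as features can be sketched as a multivariate polynomial regression. Algorithm 2 itself is not reproduced here, so everything below is an assumption made for illustration: the helper names, the polynomial degree, and the synthetic beam model with small ripples standing in for non-ideal beams.

```python
import itertools
import numpy as np

def poly_features(P, degree):
    """All monomials of the columns of P up to the given total degree."""
    cols = [np.ones(P.shape[0])]
    for d in range(1, degree + 1):
        for idx in itertools.combinations_with_replacement(range(P.shape[1]), d):
            cols.append(np.prod(P[:, list(idx)], axis=1))
    return np.column_stack(cols)

def fit_angles(P, angles, degree=2):
    """Least-squares fit from power measurements to (theta_h, theta_v)."""
    W, *_ = np.linalg.lstsq(poly_features(P, degree), angles, rcond=None)
    return W

def predict_angles(W, P, degree=2):
    return poly_features(P, degree) @ W

# Synthetic data: ideal patterns of (9) plus small ripples (assumed model).
rng = np.random.default_rng(0)
angles = rng.uniform(-1.0, 1.0, size=(500, 2))      # labels (theta_h, theta_v)
th_h, th_v = angles[:, 0], angles[:, 1]
z, b0 = 2.0, 6.5
P = np.column_stack([
    0.5 * z * th_h + b0 + 0.1 * np.sin(4 * th_v),
    -0.5 * z * th_h + b0 + 0.1 * np.cos(3 * th_v),
    0.5 * z * th_v + b0 + 0.1 * np.sin(4 * th_h),
    -0.5 * z * th_v + b0 + 0.1 * np.cos(3 * th_h),
])

W = fit_angles(P, angles)
rmse = float(np.sqrt(np.mean((predict_angles(W, P) - angles) ** 2)))
print(W.shape, rmse)
```

In this toy model the simple difference of two measurements already estimates each angle up to the ripple amplitude, so the fitted model can do no worse than that on the training data. Whether the extra cross terms help on real beams depends on the actual gain distribution.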