RF fingerprint extraction and device recognition algorithm based on multiscale fractal features and APWOA-LSSVM
EURASIP Journal on Advances in Signal Processing volume 2023, Article number: 131 (2023)
Abstract
RF fingerprints can be used for device identification and network access authentication. An RF fingerprint extraction and device identification algorithm based on multiscale fractal features and APWOA-LSSVM is proposed. First, a Hilbert transform is performed on the original RF signal, and differential operations are performed on the I/Q signal. Then, variational mode decomposition is performed on the differentially processed I and Q signals separately to obtain a number of band-limited Intrinsic Mode Function (IMF) components. The fractal box dimension of each IMF component is calculated separately as the first-dimensional RF fingerprint. The multifractal spectrum of the original RF signal is further computed by Multifractal Detrended Fluctuation Analysis (MFDFA) as the second-dimensional RF fingerprint. The RF fingerprint feature vector combines the first- and second-dimensional RF fingerprints and is used to train the LSSVM model. Since the penalty factor and the kernel width directly affect the recognition accuracy and generalization ability of the LSSVM model, APWOA is used to optimize the hyperparameters of the LSSVM model. Finally, the LSSVM model is constructed using the optimal hyperparameters and the performance of the model is verified on a test dataset. The experimental results show that the proposed model achieves an average recognition accuracy of 99.13% for 16 Bluetooth devices. Multiscale fractal features consisting of fractal box dimensions and multiple fractal dimensions are more beneficial for recognizing wireless devices than single features. In addition, the proposed model shows superior recognition capability compared with classical machine learning algorithms.
1 Introduction
The new generation of wireless communications represented by 5G can support ultra-dense device connectivity for a wide range of wireless devices as well as Internet of Things applications such as smart homes and industrial automation [1]. However, due to the openness of wireless transmission, massive device access faces serious security threats. Malicious attackers can use a variety of wireless devices to interfere with normal communications and illegally steal private and sensitive data from legitimate devices [2]. To guard against such network security problems, reliable and efficient communication security techniques are urgently needed. In recent years, radio frequency fingerprint (RFF) extraction and recognition techniques for wireless devices have received much attention. RF fingerprints are embedded in the RF signals emitted by a device and arise from hardware differences such as manufacturing tolerances and drift tolerances of the electronic components inside the device. They are intrinsic characteristics of the physical device, reflecting its hardware, and are unique and invariant over the short term. RF fingerprint extraction and identification analyzes the RF signals of wireless devices, extracts features that characterize the hardware differences between devices as RF fingerprints, and then uses them to identify and classify the devices.
Feature extraction methods have attracted considerable attention among researchers because of their substantial impact on the performance of radio frequency signal classification. Reference [3] introduces three feature extraction approaches based on signal energy and describes alternative ways of using them. Reference [4] introduces three feature extraction methods based on zero-crossing rates (ZCRs) for pattern recognition. Reference [5] introduces two entropy-based feature extraction methods for digital signal processing and pattern recognition. Furthermore, researchers have investigated feature extraction based on the Enhanced Teager Operator [6] and the wavelet transform [7].
Wavelet transforms [8, 9] have had a strong impact on signal feature extraction. Reference [10] applies wavelet transforms to image information content processing. Reference [11] introduces a new wavelet transform-based smooth ordering (WTSO) method, which effectively improves the classification accuracy of spectral images. Reference [12] proposes a new framework of adaptive multiscale graph wavelet decomposition for signals defined on undirected graphs. Entropy-based feature extraction methods [13, 14] are also widely used in digital signal processing and pattern recognition.
Reference [15] first proposed that the fractal dimension of a transient RF signal can be extracted as an RF fingerprint to identify a device. Reference [16] extracts the amplitude and phase of transient RF signals as the RFF. Reference [17] investigates RF fingerprinting of Bluetooth and IEEE 802.11b devices and, for the first time, combines features of different dimensions such as transient signal amplitude, phase, in-phase component, quadrature component, power, and discrete wavelet transform coefficients. In addition, because environmental factors may change RF fingerprints, it proposes that the RF fingerprint libraries of wireless devices be updated at regular intervals. Reference [18] used the statistically filtered spectrum of transient RF signals as RF fingerprints and measured the similarity of the RF fingerprints of different devices with the Mahalanobis distance to identify CC2420 wireless sensor nodes. Reference [19] proposed an RF fingerprinting method that first uses the discrete Fourier transform to extract the power spectrum of the logarithmic envelope, then uses the cosine transform to compress the frequency-domain features, and finally obtains a 7-dimensional feature vector. Reference [20] proposed an RF fingerprint extraction method based on complementary ensemble empirical mode decomposition with adaptive noise and multi-domain joint entropy. After the RF fingerprints are extracted, classifiers need to be constructed to recognize and classify them. Probability-based classifiers, such as Bayesian classifiers and K-Nearest Neighbor (KNN) classifiers, identify the samples to be tested according to the probability magnitude.
Reference [21] extracted 15 statistical features from the energy transients of 15 different types of UAVs, applied neighborhood component analysis to select the three most significant features as the RFF, and reached a classification accuracy of 98.13% with a KNN classifier at an SNR of 25 dB. Reference [22] used an Artificial Neural Network (ANN) classifier and reference [23] used a Probabilistic Neural Network (PNN) classifier. Since a PNN needs to store all the training vectors, its memory requirement and training time grow linearly with the number of training vectors. Kernel function-based classifiers, such as the Support Vector Machine (SVM) and the Fisher classifier, replace the inner product of the samples with a kernel function so that the input data are mapped to a higher-dimensional space in which more effective classification boundaries can be found. Reference [24] used Gaussian Mixture Models (GMM) to extract the RFF of wireless devices and recognized the RFF at different SNRs with an SVM. The results showed that the classification accuracy remained above 75% at an SNR of 0 dB. Reference [25] proposed a weighted ensemble Convolutional Neural Network (CNN) transfer learning algorithm for the small-sample problem to identify radar active deception jamming. The algorithm first generates the time-frequency distribution map of the jamming signal by the Short-Time Fourier Transform (STFT), then constructs multiple datasets by combining the real part, imaginary part, modulus, and phase of the signal, and finally performs jamming recognition with an ensemble CNN model based on weighted voting and transfer learning.
Reference [26] proposed a lightweight convolutional neural network-based RF fingerprint identification model, in which the raw data reflecting the characteristics of the I-channel, Q-channel, and combined I/Q-channel signals are used in turn as inputs to the CNN, and the final identification result is determined by voting over the results of the three submodels. The results show that at an SNR of 5 dB, the recognition accuracy for four transmitting devices can reach 97.25%. In recent years, Deep Learning (DL) technology has been successfully applied to image recognition, speech recognition, and target detection [27,28,29]. Many scholars have also tried to apply it to RF fingerprint-based device identification. Reference [30] proposed a Deep Neural Network (DNN) with multi-stage training, which achieved 100% recognition accuracy for 12 transmitters. Some scholars have also directly used the raw data as the input to the deep learning model so that the model performs adaptive feature extraction and accomplishes the recognition task. Reference [31] first uses a complex-valued residual network to mine the correlation between the in-phase and quadrature components of the RF signal and then extracts one-dimensional time-domain signal features by one-dimensional convolution. The features are weighted according to their importance and combined with an attention mechanism to realize device recognition. The results show that the method achieves high accuracy for communication emitter identification. Reference [32] proposed an RF fingerprinting scheme for IoT devices based on a Long Short-Term Memory (LSTM) network and a convolutional neural network. First, the RF data are used as the input to the LSTM network to extract long-term dependent features containing time information, and then a CNN is used for secondary feature extraction and device identification, which can reach more than 99% recognition accuracy.
In this paper, an RF fingerprint extraction and device identification algorithm based on multiscale fractal features and an APWOA-LSSVM model is proposed with the following main contributions:

A framework for RF fingerprint extraction and device identification based on the APWOA-LSSVM model is proposed. The Least Squares Support Vector Machine (LSSVM) model transforms the inequality constraints of the SVM into equality constraints, so that solving a quadratic programming problem is replaced by solving a set of linear equations. Using the two-norm of the error as the loss function reduces the computational complexity of the classifier. The Whale Optimization Algorithm based on Adaptive Parameters (APWOA) improves the convergence speed of the traditional WOA and avoids falling into local optima through adaptive position weights and probability thresholds.

An RF fingerprint feature vector construction method based on multiscale fractal feature combination is proposed. First, the Hilbert transform is executed on the original RF signals and the differential operation is performed on the I/Q signals. Then, variational mode decomposition is executed on the differentially processed I and Q signals separately to obtain several band-limited Intrinsic Mode Function (IMF) components. The fractal box dimension of each IMF component is calculated separately as the first-dimensional RF fingerprint. The multifractal spectrum of the original RF signal is further calculated by Multifractal Detrended Fluctuation Analysis (MFDFA) as the second-dimensional RF fingerprint. The first- and second-dimensional RF fingerprints are combined to form the RF fingerprint feature vector used to train the LSSVM model.

Since the penalty factor and the kernel width affect the recognition accuracy and generalization ability of the LSSVM model, the hyperparameters of the LSSVM model are first optimized using APWOA. The optimal hyperparameters are then used to construct the LSSVM model, whose performance is finally verified on the test dataset.

The RFF dataset of Bluetooth devices disclosed in reference [33] was expanded, and experimental tests were conducted on the expanded dataset. The results show that the proposed model achieves an average recognition accuracy of 99.13% for 16 Bluetooth devices. Compared with single features such as the fractal box dimension or multiple fractal dimensions, multiscale fractal features are more conducive to the correct recognition of wireless devices. Compared with classical machine learning algorithms, the proposed model has superior recognition ability.
The rest of this paper is organized as follows: Section 2 introduces the relevant techniques used in the proposed model. The scheme design is presented in Sect. 3 and the simulation results in Sect. 4. Finally, Sect. 5 concludes the paper.
2 Fractal dimension theory and variational modal decomposition
This section first describes the techniques used in the proposed system model. Building on them, Sect. 3 then presents the RF fingerprint extraction and device recognition algorithm based on multiscale fractal features and APWOA-LSSVM.
2.1 Fractal box dimension
Fractal Dimension (FD) is rooted in fractal geometry and uses fractional dimensions to measure fractal shapes found in nature. It mainly describes the complexity and self-similarity of irregular objects and is widely used in signal complexity analysis [34]. The box dimension, also known as the box-counting dimension or Minkowski dimension, is simple to compute and accurately characterizes the geometric complexity of shapes. It is mainly used to measure the geometric characteristics of a shape's contour or internal structure. Extending the definition of dimension in Euclidean geometry to fractal shapes gives the following relationship:
Equation (1) states that a fractal shape of dimension D is divided into m equal parts, each of side length \(\varepsilon\). An expression for the fractal box dimension D is obtained by taking logarithms of both sides of the equation:
If such a D exists, then as \(\varepsilon \rightarrow 0\), m and \(\varepsilon\) satisfy the following relation:
Further, let \(N({\varvec{S}},\varepsilon )\) denote the minimum number of boxes of side length \(\varepsilon\) needed to cover a two-dimensional curve \({\varvec{S}}\). If a fractal box dimension \({D_b}\) exists, then as \(\varepsilon \rightarrow 0\):
If the limit exists as \(\varepsilon \rightarrow 0\), the fractal box dimension of the curve \({\varvec{S}}\) is obtained as:
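For reference, the relations described in this subsection take the following form in the standard box-counting literature. This is a reconstruction under that assumption: the numbering (1) to (5) is inferred from the surrounding text and the exact notation of the original displays may differ.

```latex
\begin{align}
m &= \left(\frac{1}{\varepsilon}\right)^{D} \tag{1}\\
D &= \frac{\ln m}{\ln (1/\varepsilon)} \tag{2}\\
m &\propto \varepsilon^{-D}, \qquad \varepsilon \rightarrow 0 \tag{3}\\
N(\boldsymbol{S},\varepsilon) &\propto \varepsilon^{-D_b}, \qquad \varepsilon \rightarrow 0 \tag{4}\\
D_b &= \lim_{\varepsilon \rightarrow 0} \frac{\ln N(\boldsymbol{S},\varepsilon)}{\ln (1/\varepsilon)} \tag{5}
\end{align}
```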
As shown in Fig. 1, in order to extract the fractal box dimension of the acquired RF signal waveform, the first step is to determine the metric scale \(\varepsilon\) of \({\varvec{S}}\) and cover the unit square with small square boxes of side length \(\varepsilon\). The metric scale determines how finely the fractal box dimension describes the features: the smaller \(\varepsilon\) is, the finer the characterization, and the larger \(\varepsilon\) is, the coarser the characterization. During the covering process, not every box overlaps the curve \({\varvec{S}}\), so some boxes are empty. The number of non-empty boxes is counted and recorded as \(N({\varvec{S}},\varepsilon )\). The points \((\log (1/\varepsilon ),\log (N({\varvec{S}},\varepsilon )))\) obtained for all scales are then plotted in a double logarithmic coordinate system, and a straight line is fitted to them by the method of least squares. The absolute value of the slope of this line is the fractal box dimension \({D_b}\) of the curve \({\varvec{S}}\).
In practice, the RF signal obtained using the acquisition device is discrete, and the fractal box dimension is calculated as follows:
Step 1 Intercept an RF signal time series \({\varvec{S}} = \{ s(i) \mid i = 1,2, \cdots ,N\}\) of length N;
Step 2 Select M box side lengths \({\varepsilon _m},m = 1,2, \ldots , M\), and for each of them cover the region occupied by the RF signal time series \({\varvec{S}}\) with square boxes of that size. Record the number of non-empty boxes:
Step 3 The least squares algorithm is used to fit the scatter coordinates \((\ln (1/{\varepsilon _m}),\ln (N({\varvec{S}},{\varepsilon _m}))),m = 1,2, \cdots , M\) into a straight line, and the absolute value of the slope of the line is the fractal box dimension \({D_b}\) of \({\varvec{S}}\).
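Steps 1 to 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the waveform is rescaled to the unit square, box sizes are given as fractions of that square, and within each time column the boxes between the minimum and maximum amplitude are counted as non-empty.

```python
import math
import random


def least_squares_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def box_counting_dimension(signal, box_sizes):
    """Estimate the box dimension of a sampled waveform in the unit square."""
    n = len(signal)
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0                    # avoid division by zero
    amp = [(s - lo) / span for s in signal]    # amplitudes mapped to [0, 1]
    log_inv_eps, log_count = [], []
    for eps in box_sizes:
        columns = max(1, round(1.0 / eps))     # boxes per side of the square
        count = 0
        for k in range(columns):
            # Samples in time column k, plus the first sample of the next
            # column so that the polyline stays connected.
            start = k * (n - 1) // columns
            stop = (k + 1) * (n - 1) // columns
            seg = amp[start:stop + 1]
            top = min(int(max(seg) / eps), columns - 1)
            bottom = min(int(min(seg) / eps), columns - 1)
            count += top - bottom + 1          # non-empty boxes in column k
        log_inv_eps.append(math.log(1.0 / eps))
        log_count.append(math.log(count))
    # Step 3: slope of ln N(S, eps) against ln(1/eps).
    return least_squares_slope(log_inv_eps, log_count)


# Example: uniform noise fills the square far more densely than a ramp.
random.seed(0)
noise = [random.random() for _ in range(4097)]
sizes = [2.0 ** -j for j in range(1, 7)]       # 1/2, 1/4, ..., 1/64
print(box_counting_dimension(noise, sizes))
```

A smooth curve should give an estimate close to 1, while a very irregular waveform approaches 2, which is what makes the quantity usable as a compact complexity feature.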
2.2 Multifractal detrended fluctuation analysis
MFDFA aims to eliminate disturbing trend terms [35]. It can not only effectively measure the fluctuation of nonlinear time series, but also accurately estimate the multifractal spectrum. Compared with the fractal box dimension, the multifractal description reflects the irregularity and self-similarity of the signal as a whole and also highlights the local singularity of the signal. For the RF signal time series \({\varvec{S}} = \{ s(i) \mid i = 1,2, \cdots ,N\}\), the MFDFA is computed as follows:
Step 1 Calculate the cumulative deviation of \({\varvec{S}}\) to obtain the new time series \({\varvec{y}} = \{ y(i) \mid i = 1, \cdots ,N\}\):
Step 2 Divide the new time series \({\varvec{y}} = \{ y(i) \mid i = 1, \cdots , N\}\) into \(\left\lfloor {N/s} \right\rfloor\) non-overlapping subintervals of equal length s in the forward direction from the starting point. Since the length of the sequence N is not necessarily an integer multiple of s, the same division is repeated from the end point of \({\varvec{y}}\) in the reverse direction so that no valid information of the original RF signal is discarded. This yields another \(\left\lfloor {N/s} \right\rfloor\) non-overlapping subintervals of length s, for a total of \(2 \times \left\lfloor {N/s} \right\rfloor\) equal-length subintervals. For each subinterval v, \(v = 1, \ldots ,2 \times \left\lfloor {N/s} \right\rfloor\), an \(m\)th-order polynomial \(y_v^{(m)}(t),t = (v - 1)s + i\), is fitted by least squares. \(y_v^{(m)}(t)\) is a local trend function that characterizes the general trend of the time series in subinterval v. Calculate the residuals \({{\tilde{y}}_v}(i)\):
Step 3 In order to eliminate the local trend in subinterval v, calculate its root-mean-square error:
Step 4 Calculate the mean of the root-mean-square error according to equation (9) and equation (10) to get the \(q\)th-order fluctuation function \({F_q}(s)\):
Different values of the subinterval length s affect the fluctuation function \({F_q}(s)\). If s is too small, \({F_q}(s)\) is mainly affected by the fast-varying fluctuations in \({\varvec{S}}\). If s is too large, \({F_q}(s)\) is mainly affected by the slow-varying fluctuations in \({\varvec{S}}\). The fluctuation function \({F_q}(s)\) is also affected by the value of the order q. When \(q < 0\) and \(|q| \ge 1\), the larger root-mean-square errors \({F_\nu }(v,s)\) raised to the power q tend to 0, so \({F_q}(s)\) depends mainly on the smaller root-mean-square errors \({F_\nu }(v,s)\); when \(q > 0\) and \(|q| \ge 1\), \({F_q}(s)\) depends mainly on the larger root-mean-square errors \({F_\nu }(v,s)\). Therefore, different subinterval lengths s and orders q need to be considered to reflect the degree of fluctuation of the time series \({\varvec{S}}\).
Step 5 Test whether \({F_q}(s)\) and s satisfy the following power-law relationship:
where h(q) is called the generalized Hurst exponent, which is used to measure whether the time series has fractal characteristics. If the above equation holds, the time series \({\varvec{S}}\) has fractal scaling characteristics. A least squares algorithm is used to fit a straight line to \({F_q}(s)\) against s in double logarithmic coordinates, whose slope is h(q):
where C is the fitting constant.
If the time series \({\varvec{S}}\) is monofractal, its root-mean-square error \({F_\nu }(v,s)\) scales in the same way in all subintervals and h(q) is constant; if the time series \({\varvec{S}}\) is uncorrelated or short-range correlated, \(h(q) = 0.5\); if the time series \({\varvec{S}}\) is multifractal, h(q) varies nonlinearly with the order q.
Step 6 Define the relationship between the generalized Hurst exponent and the mass exponent \(\tau (q)\) as:
The singularity index \(\alpha\) and multifractal spectrum \(f(\alpha )\) are computed using the Legendre transform:
The singularity index \(\alpha\) characterizes the strength of the singularity in different subintervals, and the multifractal spectrum \(f(\alpha )\) gives the fractal dimension of the part of the series that has singularity index \(\alpha\). If the time series \({\varvec{S}}\) is multifractal, then \(f(\alpha )\) has a single-peaked, bell-shaped profile over \(\alpha\). Otherwise, \(f(\alpha )\) is constant.
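The six steps above can be sketched as follows. This is an illustrative sketch under explicit assumptions, not the paper's code: the function names are hypothetical, the local trend is a straight line (order m = 1), each direction uses floor(N/s) segments, the q = 0 case uses the usual logarithmic average, \(\tau (q) = qh(q) - 1\) is taken as the mass exponent, and the Legendre transform is approximated by finite differences. Every segment is assumed to have a non-zero detrended variance.

```python
import math
import random


def _slope(xs, ys):
    """Slope of the least-squares line through (xs, ys)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))


def _detrended_variance(segment):
    """Mean squared residual after removing a least-squares line (m = 1)."""
    n = len(segment)
    a = _slope(list(range(n)), segment)
    b = sum(segment) / n - a * (n - 1) / 2
    return sum((y - (a * i + b)) ** 2 for i, y in enumerate(segment)) / n


def _gradient(ys, xs):
    """Central differences inside the grid, one-sided at the two ends."""
    last = len(xs) - 1
    return [(ys[min(i + 1, last)] - ys[max(i - 1, 0)])
            / (xs[min(i + 1, last)] - xs[max(i - 1, 0)])
            for i in range(len(xs))]


def mfdfa(signal, scales, qs):
    """Return (h, alpha, f_alpha), each a list aligned with qs."""
    n = len(signal)
    mean = sum(signal) / n
    profile, acc = [], 0.0
    for x in signal:                      # Step 1: cumulative deviation
        acc += x - mean
        profile.append(acc)
    log_s = [math.log(s) for s in scales]
    log_f = {q: [] for q in qs}
    for s in scales:
        ns = n // s                       # Step 2: forward and backward cuts
        segs = [profile[v * s:(v + 1) * s] for v in range(ns)]
        segs += [profile[n - (v + 1) * s:n - v * s] for v in range(ns)]
        f2 = [_detrended_variance(seg) for seg in segs]   # Step 3
        for q in qs:                      # Step 4: q-th order fluctuation
            if q == 0:
                fq = math.exp(sum(math.log(v) for v in f2) / (2 * len(f2)))
            else:
                fq = (sum(v ** (q / 2) for v in f2) / len(f2)) ** (1 / q)
            log_f[q].append(math.log(fq))
    h = [_slope(log_s, log_f[q]) for q in qs]             # Step 5
    tau = [q * hq - 1 for q, hq in zip(qs, h)]            # Step 6
    alpha = _gradient(tau, qs)
    f_alpha = [q * a - t for q, a, t in zip(qs, alpha, tau)]
    return h, alpha, f_alpha


# Example on white Gaussian noise, for which h(2) is expected near 0.5.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
qs = [-5 + 0.5 * i for i in range(21)]
h, alpha, f_alpha = mfdfa(noise, [16, 32, 64, 128, 256], qs)
print(h[qs.index(2.0)])
```

The pairs \((\alpha ,f(\alpha ))\) returned here are the kind of values that would populate the second-dimensional fingerprint; how many of them are kept is a design choice not fixed by this sketch.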
2.3 Variational mode decomposition
Variational Mode Decomposition (VMD) is a decomposition method for nonlinear and non-stationary time series signals. By controlling the convergence conditions of the decomposition, the signal is decomposed into a number of band-limited IMF components, and a combination of these IMF components is then selected to recover the original signal [36], which realizes a compromise between accuracy and resolution. Compared with Empirical Mode Decomposition (EMD) and Local Mean Decomposition (LMD), VMD solves for the modes non-recursively in the frequency domain and adaptively decomposes nonlinear and non-stationary signals, which avoids mode aliasing and endpoint effects and gives better decomposition results. The core idea of VMD is to iteratively update the center frequency and finite bandwidth of the different IMF components. The update stops when the sum of the finite bandwidths of the IMF components is minimized. Essentially, the decomposition process of VMD constructs and solves a variational problem, which is implemented in the following steps:
Step 1 Constructing the variational model
Step 1.1 Assume that the original RF signal is s(t) and that each IMF component \({u_k}(t)\), \(k = 1, \cdots , K\), where K is the number of decomposition layers, is a band-limited function with center frequency \({f_k}\). The analytic signal of \({u_k}(t)\) is obtained by the Hilbert transform and its spectrum is shifted to baseband:
where \({r_k}(t)\) is the new IMF component after performing the Hilbert transform, \(\delta (t)\) is the ideal impulse function, \({\omega _k} = 2\pi {f_k}\), and \(*\) denotes convolution.
Step 1.2 Estimate the bandwidth of each IMF component by calculating the squared \({l_2}\) norm of the partial derivative of \({r_k}(t)\) with respect to time t:
where \({\partial _t}\) denotes the partial derivative with respect to time t.
Step 1.3 In order to minimize the bandwidth sum of all IMF components, the following variational model is constructed with the constraint that the sum of the IMF components is equal to the original RF signal s(t):
where \(\{ {u_k}\}\) denotes the set of IMF components of s(t) and \(\{ {\omega _k}\}\) denotes the set of center frequencies corresponding to the different IMF components.
Step 2 Solving the variational problem
Step 2.1 By introducing a quadratic penalty factor \(\alpha\) and a Lagrange multiplier \(\lambda\), P1 is transformed into an unconstrained variational problem, giving an augmented Lagrangian function of the following form:
where \(\alpha\) is used to attenuate the effect of Gaussian noise and \(\lambda\) is used to ensure the rigor of the constrained problem.
Step 2.2 The IMF components, center frequencies, and Lagrange multiplier are iteratively updated using the Alternating Direction Method of Multipliers (ADMM). At the \(n\)th iteration they are denoted \(u_k^n(t)\), \(\omega _k^n\), and \({\lambda ^n}(t)\), respectively. The alternating iteration that finds the optimal \({u_k}\), \({\omega _k}\), and \(\lambda\) proceeds as follows:
where \(i = 1,2, \ldots ,k\), and \(\rho\) is a preset noise tolerance, which should be set to meet the fidelity requirements of the modal decomposition of the original RF signal. \({\hat{u}}_i^n(\omega )\), \({\hat{s}}(\omega )\) and \({{\hat{\lambda }} ^n}(\omega )\) are the Fourier transforms of \(u_i^n(t)\), s(t) and \({\lambda ^n}(t)\).
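For orientation, the quantities described in Steps 1 and 2 have the following form in the standard VMD formulation. This is a reconstruction from that literature using the symbols defined above, so the original displays may differ in detail. The symbol \(B_k\) for the bandwidth estimate is introduced here only for illustration, and the sum over \(i \ne k\) is written with iteration index n for brevity, whereas components already updated in the current sweep would normally use their new values.

```latex
\begin{align}
r_k(t) &= \left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \\
B_k &= \left\| \partial_t r_k(t) \right\|_2^2 \\
\text{P1:}\quad & \min_{\{u_k\},\{\omega_k\}} \sum_{k=1}^{K} \left\| \partial_t r_k(t) \right\|_2^2
  \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = s(t) \\
L(\{u_k\},\{\omega_k\},\lambda) &= \alpha \sum_{k=1}^{K} \left\| \partial_t r_k(t) \right\|_2^2
  + \left\| s(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2
  + \left\langle \lambda(t),\, s(t) - \sum_{k=1}^{K} u_k(t) \right\rangle \\
\hat{u}_k^{n+1}(\omega) &= \frac{\hat{s}(\omega) - \sum_{i \ne k} \hat{u}_i^{n}(\omega) + \hat{\lambda}^{n}(\omega)/2}
  {1 + 2\alpha\,(\omega - \omega_k^{n})^2} \\
\omega_k^{n+1} &= \frac{\int_0^{\infty} \omega\, |\hat{u}_k^{n+1}(\omega)|^2 \, d\omega}
  {\int_0^{\infty} |\hat{u}_k^{n+1}(\omega)|^2 \, d\omega} \\
\hat{\lambda}^{n+1}(\omega) &= \hat{\lambda}^{n}(\omega)
  + \rho \left( \hat{s}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right)
\end{align}
```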
3 RF fingerprint extraction and device identification scheme design
VMD yields several IMF components that reflect different frequency information, each with the same amount of data as the original signal. If the individual IMF components are fed directly into the classifier, the high-dimensional feature matrix reduces classification efficiency. To improve efficiency, the dimension of the feature vectors needs to be reduced. In the field of communication, certain features of a signal can be characterized by its complexity, and dimensionality reduction can be achieved by extracting the complexity features of the different IMF components and representing each of them by a single value. A metric is therefore needed that can intuitively represent the features of the decomposed detail components. As mentioned earlier, the fractal dimension reflects the irregularity and self-similarity of objects and can be applied to physical layer authentication [37]. RF signals are decomposed into different IMF components by VMD, and the FD characterizes the geometry of the waveform and reflects the nature of the signal as a whole. Therefore, FD can be used to quantify the complexity of the different IMF components and thus characterize their effective features. However, a single FD cannot highlight the local characteristics of the signal, and the FDs of different IMF components may be similar, which is not conducive to RFF identification. In contrast, the multiple fractal dimension takes both the overall and local characteristics into account and describes the fractal structure of the signal more comprehensively. In this paper, the multiple fractal dimensions of different signals are extracted as the second-dimensional features to avoid classification errors caused by different signals having similar fractal box dimensions.
The multiscale fractal feature, formed by combining the fractal box dimensions of the IMF components with the multiple fractal dimension, is used as the RFF and aims to fully reflect the subtle differences between wireless devices. As mentioned above, the LSSVM model balances classification accuracy and generalization ability with low computational complexity. Therefore, the LSSVM model is selected for the recognition stage, and the recognition task is performed after training on the multiscale fractal features. Considering that the penalty factor \(\gamma\) and kernel width \(\sigma\) of the LSSVM model have a large impact on recognition performance, APWOA's strong global search capability and fast convergence are used to optimize the hyperparameters of the LSSVM model during training.
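To make the roles of \(\gamma\) and \(\sigma\) concrete, the following minimal sketch trains a binary LSSVM with an RBF kernel by solving the linear system used in the LSSVM literature. It is a sketch under stated assumptions, not the paper's implementation: all function names are hypothetical, the kernel is taken as \(\exp ( - \Vert x - z\Vert ^2/(2\sigma ^2))\), the classifier is binary (recognizing 16 devices would need a multi-class scheme such as one-vs-rest), and the hyperparameters are fixed by hand instead of being searched by APWOA.

```python
import math


def rbf(x, z, sigma):
    """Gaussian kernel with width sigma (assumed parametrization)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-d2 / (2 * sigma ** 2))


def solve(a, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def lssvm_train(xs, ys, gamma, sigma):
    """Solve [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1]."""
    n = len(xs)
    a = [[0.0] + [float(y) for y in ys]]
    for i in range(n):
        row = [float(ys[i])]
        for j in range(n):
            omega = ys[i] * ys[j] * rbf(xs[i], xs[j], sigma)
            row.append(omega + (1.0 / gamma if i == j else 0.0))
        a.append(row)
    sol = solve(a, [0.0] + [1.0] * n)
    return sol[0], sol[1:]                 # bias b, multipliers alpha


def lssvm_predict(x, xs, ys, b, alpha, sigma):
    """Sign of the decision function for a single sample x (labels +1/-1)."""
    score = b + sum(al * y * rbf(x, xi, sigma)
                    for al, y, xi in zip(alpha, ys, xs))
    return 1 if score >= 0 else -1


# Toy example with two well-separated one-dimensional clusters.
train_x = [[0.0], [0.2], [0.4], [1.6], [1.8], [2.0]]
train_y = [-1, -1, -1, 1, 1, 1]
bias, alphas = lssvm_train(train_x, train_y, gamma=10.0, sigma=0.5)
print(lssvm_predict([0.1], train_x, train_y, bias, alphas, 0.5))
```

Because training reduces to one linear solve, each candidate pair \((\gamma ,\sigma )\) proposed by an optimizer can be evaluated by a single call to `lssvm_train` followed by validation.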
The framework of the RF fingerprint extraction and device identification algorithm based on multiscale fractal features and APWOA-LSSVM proposed in this paper is shown in Fig. 2. First, the RF signal r(t) emitted by the wireless device is collected using the acquisition device, and the analytic signal \(Y(t) = {y_I}(t) + j{y_Q}(t)\) is constructed by the Hilbert transform. In order to make full use of the feature information of the I and Q signals and obtain fractal box dimension features with more significant differences, a differential operation is applied to the I and Q signals to obtain:
where \(\delta\) is the difference interval, \(\eta\) is the I/Q phase mismatch distortion, and \({( \cdot )^ * }\) denotes the conjugate operation.
For the sake of brevity:
where \({d_I}(t)\) and \({d_Q}(t)\) are the differentially processed I and Q channel signals.
With \(\delta = 10\) and \(\eta = 1\), the differential operation is performed on the I and Q channel signals. VMD is then executed on the differentially processed I and Q channel signals to decompose them into IMF components with different frequency content. The fractal box dimension of each IMF component is calculated separately as the first-dimensional feature, while the multiple fractal dimension of the original RF signal is calculated as the second-dimensional feature using MFDFA. The multiscale fractal features formed by combining the first-dimensional and second-dimensional features are used as RF fingerprint feature vectors. The constructed RFF feature vector is then used as input to the LSSVM classifier for training and classification. Model hyperparameter optimization is performed through APWOA to further improve the classification accuracy and generalization ability of the model. Finally, the trained LSSVM classifier is used to classify unknown devices, and its output is the recognition result.
4 Simulation experiment
4.1 Data preprocessing and realization process
4.1.1 Data preprocessing
In order to evaluate the performance of the RF fingerprint extraction and device recognition framework based on multiscale fractal features and APWOA-LSSVM, the RFF dataset of Bluetooth devices disclosed by Uzundurukan in 2020 [33] is used for experiments. The dataset collects RF signals from several smartphones with built-in Bluetooth devices at four different sampling rates, with 150 samples recorded for each device. Recordings sampled at 5 GSPS from Bluetooth devices covering 4 brands and 8 models are selected for RFF extraction and identification, and the brands and models of the selected phones are shown in Table 1.
The selected raw RF data contain 1200 samples from 16 Bluetooth devices. Due to the spurious signals generated by the digital oscilloscope used in the RF acquisition device, some irrelevant frequency components exist in the raw data, as shown in Fig. 3. For this reason, in the data preprocessing, a digital bandpass filter is used to filter the original data, and the filtered data are normalized to \([-1,1]\). The original signal waveform before and after preprocessing is shown in Fig. 4.
Because the original dataset is small, the model is prone to overfitting. In this paper, the Dynamic Time Warping (DTW) Barycenter Averaging (DBA) method [38] is used to expand the dataset. As a data augmentation method, DBA can generate an arbitrary number of new time series from a given set of time series as follows:
Step 1 Randomly select an initial time series and assign it a weight of 0.5; it is used as the initialized time series for DBA;
Step 2 Calculate the DTW distance between the remaining time series and the initial time series;
Step 3 Randomly select two time series from the five time series with the smallest DTW distances and assign a weight of 0.15 to each;
Step 4 Distribute the remaining 0.2 weights equally among the remaining time series;
Step 5 Finally a new time series is obtained.
To save computational time, the number of remaining time series is set to 10. The DTW distances of these 10 time series from the initial time series are calculated. Two time series with the closest DTW distances are selected and each is assigned a weight of 0.15, and the weight of 0.2 is distributed equally among the remaining 8 time series to generate the new time series. The number of samples per Bluetooth device is expanded to 1200 using this method, and the expanded dataset has 19200 samples.
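The weighting scheme of Steps 1 to 4 can be sketched as below. This is only an illustration under assumptions: the function names are hypothetical, the DTW cost is the accumulated squared difference, the two weighted neighbours are drawn at random from the five nearest series as in Step 3, and the final weighted barycenter averaging of Step 5 is not implemented.

```python
import random


def dtw_distance(a, b):
    """Accumulated squared-difference DTW cost between two sequences."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)          # row for the empty prefix of a
    for x in a:
        cur = [inf]
        for j, y in enumerate(b, start=1):
            cost = (x - y) ** 2
            cur.append(cost + min(prev[j], prev[j - 1], cur[j - 1]))
        prev = cur
    return prev[-1]


def dba_weights(series, rng, n_neighbours=5, n_pick=2):
    """One weight per series: 0.5 centre, 0.15 per pick, 0.2 over the rest."""
    idx = list(range(len(series)))
    centre = rng.choice(idx)                               # Step 1
    others = [i for i in idx if i != centre]
    others.sort(key=lambda i: dtw_distance(series[centre], series[i]))  # Step 2
    picked = rng.sample(others[:n_neighbours], n_pick)     # Step 3
    rest = [i for i in others if i not in picked]
    weights = [0.0] * len(series)
    weights[centre] = 0.5
    for i in picked:
        weights[i] = 0.15
    for i in rest:                                         # Step 4
        weights[i] = 0.2 / len(rest)
    return weights


# Example: one initial series plus 10 remaining series, as in the text.
rng = random.Random(0)
series = [[float((t * (p + 1)) % 7) for t in range(20)] for p in range(11)]
weights = dba_weights(series, rng)
print(weights)
```

With 10 remaining series, the 8 series that are not picked each receive 0.2 / 8 = 0.025, so the weights sum to 1 before the averaging step.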
4.1.2 Experimental procedure
The implementation flow of the RF fingerprint extraction and device recognition framework based on multiscale fractal features and APWOA-LSSVM is shown in Fig. 5, which includes a training phase and a testing phase.
Training phase: 70\(\%\) of the expanded dataset is selected as the training set and preprocessed. The data samples in the training set are Hilbert transformed into I and Q signals, differentiated, and decomposed by VMD into IMF components with different spectral characteristics. The complexity is quantified by the fractal box dimension to characterize the subtle features of the different IMF components, while the multifractal dimension of each sample is obtained using MFDFA. The box edge length is set to \({\varepsilon _m} = {2^{m - 1}},\; m = 1,2, \ldots ,17\), the subinterval length to \(s = {2^\alpha },\; \alpha = 4,5, \ldots ,10\), and the order q takes values in the range \([-5,5]\) in steps of 0.5. The fractal box dimensions and multifractal dimensions of the I and Q signals are combined to construct multiscale fractal features and generate the RFF feature set for the training set. Finally, the RFF feature vectors are used as inputs to train the LSSVM model. The classification performance of the model is evaluated by 10-fold cross-validation, while its hyperparameters are optimized by APWOA. The resulting optimal hyperparameters \(({\gamma ^ * },{\sigma ^ * })\) are used to build the APWOA-LSSVM model that is subsequently applied to the test set.
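The parameter settings above can be written out explicitly. The sketch below only enumerates the three grids and shows the generic log-log regression by which a box dimension is usually read off from box counts; the helper `loglog_slope` and the synthetic counts are illustrative assumptions and do not reproduce the box-dimension or MFDFA definitions given earlier in the paper.

```python
import math

# Box edge lengths eps_m = 2^(m-1), m = 1..17
box_edges = [2 ** (m - 1) for m in range(1, 18)]
# MFDFA subinterval lengths s = 2^alpha, alpha = 4..10
scales = [2 ** alpha for alpha in range(4, 11)]
# Orders q from -5 to 5 in steps of 0.5
q_orders = [k / 2 for k in range(-10, 11)]

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

# Synthetic check (not paper data): counts that scale as 1/eps behave
# like a one-dimensional curve, so the estimated dimension is -slope = 1.
toy_counts = [box_edges[-1] / eps for eps in box_edges]
toy_dimension = -loglog_slope(box_edges, toy_counts)
```

The grids contain 17 box edge lengths (1 to 65536), 7 subinterval lengths (16 to 1024) and 21 values of q.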
Test phase: as in the training phase, the remaining 30\(\%\) of the expanded dataset is selected as the test set and the same data preprocessing is performed. The data samples in the test set are Hilbert transformed into I and Q signals, differentiated, and decomposed by VMD into IMF components with different spectral characteristics. The complexity is quantified by the fractal box dimension to characterize the subtle features of the different IMF components, and the multifractal dimensions of the samples are obtained using MFDFA with the same parameters as above. The fractal box dimensions and multifractal dimensions of the I and Q signals are combined to construct multiscale fractal features and generate the RFF feature set for the test set. Finally, the RFF feature vectors are fed to the trained APWOA-LSSVM model for classification, and the classes of the Bluetooth devices are determined from the output of the model.
4.2 Experimental results and analysis
4.2.1 RF fingerprint features
Taking Apple iPhone5 A, LG G4 A, Samsung Note3 A and Sony Xperia M5 A as examples, the multifractal spectra of these four devices are computed separately using MFDFA, as shown in Fig. 6. The horizontal axis represents the singularity index \(\alpha\), and the vertical axis represents the multifractal spectrum \(f(\alpha )\). The slope at the left endpoint of \(f(\alpha )\) tends to positive infinity and the slope at the right endpoint tends to negative infinity, corresponding to the singularity exponents \(\alpha\) of the maximum and minimum fluctuations, respectively. The larger the spectral width of \(f(\alpha )\), the more inhomogeneous and multifractal the sequence is. The larger the singularity index corresponding to the extreme point of \(f(\alpha )\), the more irregular and random the sequence is. The left and right endpoint positions, spectral widths and singularity positions of the multifractal spectra differ to varying degrees between devices, indicating that the multifractal spectrum is feasible as an RFF.
The device recognition accuracy is largely determined by the extracted RFF. To verify the effectiveness of the different RFF feature vectors, the Apple iPhone5 A, LG G4 A, Samsung Note3 A and Sony Xperia M5 A are again taken as examples. The fractal box dimensions and multifractal dimensions of the dataset are calculated, as shown in Figs. 7 and 8, respectively. Figure 7 shows a collection of 100 sets of fractal box dimension feature vectors for the different devices, each containing 10 feature values; Fig. 8 shows a collection of 100 sets of multifractal dimension feature vectors for the different devices, each containing 20 feature values. The figures show that both feature vectors take stable values and that the feature vectors of different devices are clearly separated. This indicates that the fractal box dimension and the multifractal dimension reflect the differences between devices and can be used as RFFs for device identification.
4.2.2 Model performance analysis
The experiments were carried out in MATLAB R2020b according to the implementation flow in Fig. 5. The parameters of APWOA were set as follows: the maximum number of iterations \(T = 50\) and the whale population size \(N = 30\). The penalty factor \(\gamma\) and the kernel parameter \(\sigma\) take values in the range [0.0001, 500]. The classification error is chosen as the fitness function, and the optimized LSSVM parameter combination \(({\gamma ^ * },{\sigma ^ * }) = (1.0181,28.7971)\) is obtained by iterative optimization of \(\gamma\) and \(\sigma\) via APWOA.
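To make the search setup concrete, the sketch below runs a plain whale optimization algorithm (WOA) over the same two-dimensional box with \(T = 50\) and \(N = 30\). It is not the paper's APWOA variant, whose update rules are defined earlier in the paper. The fitness function `toy_error` is a hypothetical stand-in for the cross-validated LSSVM classification error, and the spiral constant is assumed to be 1.

```python
import math
import random

LOW, HIGH = 1e-4, 500.0  # search range for both gamma and sigma

def clip(v):
    return min(max(v, LOW), HIGH)

def toy_error(x):
    """Hypothetical smooth surrogate for the classification error."""
    gamma, sigma = x
    return ((gamma - 10.0) / 500.0) ** 2 + ((sigma - 50.0) / 500.0) ** 2

def woa(fitness, n_whales=30, n_iter=50, dim=2, seed=0):
    rng = random.Random(seed)
    whales = [[rng.uniform(LOW, HIGH) for _ in range(dim)] for _ in range(n_whales)]
    best = min(whales, key=fitness)[:]
    best_fit = initial_fit = fitness(best)
    for t in range(n_iter):
        a = 2.0 * (1 - t / n_iter)  # decreases linearly from 2 towards 0
        for i, x in enumerate(whales):
            r1, r2, p = rng.random(), rng.random(), rng.random()
            A, C = 2 * a * r1 - a, 2 * r2
            if p < 0.5:
                # encircle the best whale if |A| < 1, else explore around a random whale
                ref = best if abs(A) < 1 else whales[rng.randrange(n_whales)]
                new = [clip(ref[d] - A * abs(C * ref[d] - x[d])) for d in range(dim)]
            else:
                # logarithmic spiral towards the best whale
                l = rng.uniform(-1.0, 1.0)
                new = [clip(abs(best[d] - x[d]) * math.exp(l) * math.cos(2 * math.pi * l) + best[d])
                       for d in range(dim)]
            whales[i] = new
            f = fitness(new)
            if f < best_fit:  # keep the best solution found so far
                best, best_fit = new[:], f
    return best, best_fit, initial_fit

best, best_fit, initial_fit = woa(toy_error)
```

In the setting described above, `fitness` would instead train an LSSVM with the candidate \((\gamma ,\sigma )\) and return its classification error.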
After the optimal parameters have been selected, the trained APWOA-LSSVM model is used to classify the test set, and the results are shown in Fig. 9. The horizontal axis represents the test set samples and the vertical axis represents the device categories. Precision, recall and F1 score are used to evaluate the recognition performance of the model, and the results are shown in Table 2. From Fig. 9 and Table 2, the APWOA-LSSVM model achieves an average recognition accuracy of 99.13\(\%\) for 16 Bluetooth devices. Specifically, the eight Apple devices and two LG devices in categories 1–10 score highly on all three performance indicators and are classified well. This indicates that the APWOA-LSSVM model can effectively recognize devices of these two brands. The Apple iPhone5 B and Apple iPhone5s B have recall rates of 98.33\(\%\) and 97.78\(\%\), respectively, and are misidentified as models of four other categories, the largest number of misclassified device model types among all devices. The three performance indicators for the devices in categories 13 and 14, i.e., devices A and B of the model Samsung S5, are low, and more confusion arises between them. This is because the hardware manufacturing process is highly consistent between these devices, so the differences in their RF signals are small and the extracted RFFs do not fully reflect the differences between the devices, which results in misclassification. In addition, the precision of category 11, i.e., the Samsung Note3 A device, is only 96.73\(\%\), the lowest precision among all devices. A total of 12 samples from five other devices are misclassified as this category, indicating that the extracted RFFs do not adequately reflect the differences between the Samsung devices, so the model cannot identify them correctly.
The precision, recall and F1 scores of both Sony devices reach 100\(\%\), indicating that the RFF feature vectors reflect the differences between these devices well enough to distinguish them.
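For reference, the three per-class indicators can be computed from a confusion matrix as in the minimal sketch below. The function name `per_class_metrics` and the 3-class toy matrix are illustrative assumptions and are not the data of Table 2.

```python
def per_class_metrics(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n = len(confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                           # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results

# hypothetical toy confusion matrix with 10 samples per class
toy = [[9, 1, 0],
       [0, 10, 0],
       [1, 0, 9]]
metrics = per_class_metrics(toy)

# Consistency check under an assumed balanced 30% split:
# 0.3 * 19200 / 16 = 360 test samples per device.
recall_iphone5_b = round(354 / 360 * 100, 2)
recall_iphone5s_b = round(352 / 360 * 100, 2)
# 12 false positives with 355 true positives (an inferred count, not stated in the text)
precision_note3_a = round(355 / (355 + 12) * 100, 2)
```

If the test split is balanced, the reported recalls of 98.33\(\%\) and 97.78\(\%\) correspond to 6 and 8 missed samples out of 360, and the reported precision of 96.73\(\%\) with 12 false positives is consistent with 355 correctly recognized samples. These counts are inferred and are not given in the text.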
4.2.3 Multifeature ablation experiment
To verify that the multiscale fractal features formed by combining the fractal box dimension and the multifractal dimension perform better, they are used as inputs to train the LSSVM model alongside four alternatives: the single features I-channel fractal box dimension, Q-channel fractal box dimension and multifractal dimension, and the combination of the I- and Q-channel fractal box dimensions. In addition, to compare the noise sensitivity of the different RFF features, random noise is added to the original data at signal-to-noise ratios from 0 to 30 dB in steps of 5 dB, and the accuracy of the different features is compared as a function of SNR. The results are shown in Fig. 10.
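The noise injection can be sketched as follows. The text only says "random noise", so the white Gaussian noise model, the function names and the sinusoidal test signal are assumptions made for illustration.

```python
import math
import random

def add_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise scaled to a target SNR in dB."""
    power = sum(v * v for v in signal) / len(signal)
    noise_power = power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]

def measured_snr_db(clean, noisy):
    """Empirical SNR of a noisy signal relative to its clean version."""
    p_signal = sum(v * v for v in clean) / len(clean)
    p_noise = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(p_signal / p_noise)

rng = random.Random(0)
# hypothetical clean test signal
clean = [math.sin(2 * math.pi * 0.01 * n) for n in range(20000)]
snr_levels = list(range(0, 31, 5))  # 0, 5, ..., 30 dB
measured = {snr: measured_snr_db(clean, add_noise(clean, snr, rng)) for snr in snr_levels}
```

Each entry of `measured` should lie close to its target level, which gives a quick sanity check before features are extracted from the noisy data.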
As can be seen from the figure, as the SNR increases, the fluctuation of the features weakens, the RFF carried in the signal becomes more pronounced, the recognition accuracy of every feature increases, and the accuracy of the multiscale fractal feature is always higher than that of the other four features. At an SNR of 0 dB, the accuracy of the multiscale fractal feature is 78.56\(\%\), which is 18.26\(\%\), 30.07\(\%\), 74.38\(\%\) and 4.77\(\%\) higher than the I-channel fractal box dimension, the Q-channel fractal box dimension, the multifractal dimension and the combination of the I- and Q-channel fractal box dimensions, respectively. This indicates that the multiscale fractal feature provides more effective information. Since VMD can effectively separate the useful frequency components from the noise components, the fractal box dimension features of the I and Q signals are less affected by noise, while the multifractal dimension features extracted using MFDFA are sensitive to noise. As the SNR improves, the classification accuracy of the multifractal dimension features rises sharply and keeps narrowing the gap to the other features, which confirms that they are susceptible to noise. Based on the above analysis, combining the fractal box dimension and the multifractal dimension into multiscale fractal features captures richer feature information about wireless devices and improves their recognition accuracy.
4.2.4 Comparison with other models
LDA, decision tree and SVM are selected as comparison models for LSSVM to further compare their classification performance, and the results are shown in Fig. 11.
As can be seen from the figure, the recognition accuracy of all four models increases gradually with SNR, and the recognition accuracy of the LSSVM model is always the highest. The recognition accuracy of the LSSVM model is 78.56\(\%\) at an SNR of 0 dB, which is higher than that of the three comparison models LDA, decision tree and SVM by 10.11\(\%\), 21.31\(\%\) and 64.66\(\%\), respectively. Its accuracy exceeds 90\(\%\) at an SNR of 15 dB, while LDA and decision tree reach 90\(\%\) accuracy only at an SNR of 25 dB. This indicates that the APWOA-LSSVM model retains good recognition accuracy at low SNR. LDA and decision tree also show good recognition performance, while the recognition results of the SVM model differ greatly from those of the other three models, with an accuracy always below 82\(\%\), which indicates that it learns the multiscale fractal features poorly. The comparison with the three models demonstrates the superiority of the APWOA-LSSVM model.
5 Conclusion
In this paper, an RF fingerprint extraction and device identification algorithm based on multiscale fractal features and APWOA-LSSVM is proposed, in which the RF fingerprint feature vectors are constructed manually. To make comprehensive use of the feature information in the I and Q signals, the collected original RF signals are first split into I and Q signals by the Hilbert transform and differentiated. Each is then decomposed by VMD into a number of band-limited IMF components, and the fractal box dimension of each IMF component is calculated separately. Because a single fractal dimension can hardly reflect the local characteristics of the signal, multifractal analysis is further introduced to make up for this defect. The multifractal spectrum of the original RF signal is calculated by MFDFA and used as the second-dimensional feature. The combination of the fractal box dimensions of the I and Q signals and the multifractal dimensions of the original RF signal constitutes the RFF feature vector, which is input into the LSSVM model for training, and the trained model is then used for recognition and classification. Since the penalty factor \(\gamma\) and the kernel parameter width \(\sigma\) directly affect the classification accuracy and generalization ability of the LSSVM model, the hyperparameters \((\gamma ,\sigma )\) of the LSSVM model are optimized using APWOA. Finally, the optimal hyperparameters \(({\gamma ^ * },{\sigma ^ * })\) are used to train the LSSVM model, and the test set is used to verify the classification performance of the model. The experiments demonstrate the effectiveness of the model, which achieves an average recognition accuracy of 99.13\(\%\) for 16 Bluetooth devices. Compared with single features such as the fractal box dimension or the multifractal dimension, the multiscale fractal feature formed by combining the two is more conducive to the correct identification of devices. Compared with classical machine learning algorithms, the proposed model shows better classification performance.
This article considers multiscale fractal features and multifractal dimension features in feature selection. In future research, we will therefore also consider using or combining other features, such as discrete-time wavelet transform-based features, traditional features such as energy, entropy and zero-crossing rates, and the Enhanced Teager Operator, to enhance our model. Combining features extracted in different ways can better capture signal variations, and we believe it will lead to improved classification results.
Availability of data and materials
The data that support the findings of the study are available on request from the corresponding author Y.L.
References
A. Jagannath, J. Jagannath, T. Melodia, Redefining wireless communication for 6G: signal processing meets deep learning with deep unfolding. IEEE Trans. Artif. Intell. 2(6), 528–536 (2021)
C. Iwendi, J.H. Anajemba, T. Yue, P. Chatterjee, W.S. Alnumay, A secure multiuser privacy technique for wireless IoT networks using stochastic privacy optimization. IEEE Internet Things J. 9(4), 2566–2577 (2022)
R.C. Guido, A tutorial on signal energy and its applications. Neurocomputing 179, 264–282 (2016)
R.C. Guido, ZCR-aided neurocomputing: a study with applications. Knowl.-Based Syst. 105, 248–269 (2016)
R.C. Guido, A tutorial review on entropy-based handcrafted feature extraction for information fusion. Inf. Fusion 41, 161–175 (2018)
R.C. Guido, Enhancing teager energy operator based on a novel and appealing concept: signal mass. J. Frankl. Inst. 356(4), 2346–2352 (2019)
R.C. Guido, Wavelets behind the scenes: practical aspects, insights, and perspectives. Phys. Rep. 985, 1–23 (2022)
E. Guariglia, R.C. Guido, Chebyshev wavelet analysis. J. Funct. Spaces 2022(1), 1–17 (2022)
E. Guariglia, S. Silvestrov, Fractional-wavelet analysis of positive definite distributions and wavelets on D'(C), in Engineering Mathematics II, ed. by S. Silvestrov, M. Rancic (Springer, 2016) vol. 1, pp. 337–353
S.G. Mallat, A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans. Pattern Anal. Mach. Intell. 11(7), 674–693 (1989)
E. Guariglia, R.C. Guido, Hyperspectral image classification using wavelet transform-based smooth ordering. Int. J. Wavel. Multiresolut. Inf. Process 17, 1950050 (2019)
X. Zheng, Y.Y. Tang, J. Zhou, A framework of adaptive multiscale wavelet decomposition for signals on undirected graphs. IEEE Trans. Signal Process. 67(7), 1696–1711 (2019)
E. Guariglia, Harmonic sierpinski gasket and applications. Entropy 20(9), 714 (2018)
E. Guariglia, Primality, fractality and image analysis. Entropy 21(3), 304 (2019)
N. Serinken, O. Ureten, Generalised dimension characterisation of radio transmitter turn-on transients. Electron. Lett. 36(12), 1064–1066 (2000)
O. Tekbas, N. Serinken, O. Ureten, An experimental performance evaluation of a novel radio-transmitter identification system under diverse environmental conditions. Can. J. Electr. Comput. Eng. 29(3), 203–209 (2004)
J. Hall, M. Barbeau, E. Kranakis, Enhancing intrusion detection in wireless networks using radio frequency fingerprinting. Commun. Internet Inf. Technol. 1, 201–206 (2004)
B. Danev, S. Capkun, Proceedings of 2009 international conference on information processing in sensor networks, San Francisco. Commun. Internet Inf. Technol. 1, 25–36 (2009)
J. Zhang, Q. Wang, X. Guo, X. Zheng, D. Liu, Radio frequency fingerprint identification based on logarithmic power cosine spectrum. IEEE Access 10, 79165–79179 (2022)
J. Wei, L. Yu, L. Zhu, X. Zhou, RF fingerprint extraction method based on CEEMDAN and multidomain joint entropy. Wirel. Commun. Mob. Comput. 5326892 (2022)
M. Ezuma, F. Erden, C.K. Anjinappa, O. Ozdemir, I. Guvenc, Detection and classification of UAVs using RF fingerprints in the presence of WiFi and Bluetooth interference. IEEE Open J. Commun. Soc. 1, 60–76 (2020)
H.C. Choe, C.E. Poole, A.M. Yu, H.H. Szu, H.H. Szu, Novel identification of intercepted signals from unknown radio transmitters, in Proceedings of Wavelet Applications (1995), pp. 504–517
D. Shaw, W. Kinsner, Multifractal modelling of radio transmitter transients for classification, in IEEE WESCANEX 97 Communications, Power and Computing. Conference Proceedings (1997), pp. 306–312
Y. Ma, Y. Hao, Antenna classification using gaussian mixture models (GMM) and machine learning. IEEE Open J. Antennas Propag. 1, 320–328 (2020)
G. Bahle, V.F. Rey, S. Bian, H. Bello, P. Lukowicz, Using privacy respecting sound analysis to improve Bluetooth based proximity detection for COVID-19 exposure tracing and social distancing. Sensors 21(16) (2021)
T. Yang, S. Hu, W. Wu, L. Niu, D. Lin, J. Song, Conventional neural network-based radio frequency fingerprint identification using raw I/Q data. Wirel. Commun. Mob. Comput. 8681599 (2022)
S. Zhang, X. Zhao, Q. Tian, Spontaneous speech emotion recognition using multiscale deep convolutional LSTM. IEEE Trans. Affect. Comput. 13(2), 680–688 (2022)
G. Qi, Y. Zhang, K. Wang, N. Mazur, Y. Liu, D. Malaviya, Small object detection method based on adaptive spatial parallel convolution and fast multiscale fusion. Remote Sens. 14(2), 420 (2022)
G. Gao, Y. Yu, J. Yang, G.J. Qi, M. Yang, Hierarchical deep CNN feature set-based representation learning for robust cross-resolution face recognition. IEEE Trans. Circuits Syst. Video Technol. 32(5), 2550–2560 (2022)
K. Youssef, L. Bouchard, K. Haigh, J. Silovsky, B. Thapa, C.V. Valk, Machine learning approach to RF transmitter identification. IEEE J. Radio Freq. Identif. 2(4), 197–205 (2018)
Q. Lingzhi, J.A. Yang, K. Huang, H. Liu, Specific emitter identification based on one-dimensional complex-valued residual networks with an attention mechanism. Bull. Polish Acad. Sci. Tech. Sci. 69(5), 138814 (2021)
K. Huang, X. Li, S. Wang, Z. Geng, G. Niu, RFID scheme for IoT devices based on LSTM-CNN. J. Sens. 8122815 (2022)
E. Uzundurukan, Y. Dalveren, A. Kara, A database for the radio frequency fingerprinting of Bluetooth devices. Data 5(2), 55 (2020)
P. Maragos, F.K. Sun, Measuring the fractal dimension of signals: morphological covers and iterative optimization. IEEE Trans. Signal Process. 41(1), 108–121 (1993)
J.W. Kantelhardt, S.A. Zschiegner, E. Koscielny-Bunde, S. Havlin, A. Bunde, H. Stanley, Multifractal detrended fluctuation analysis of nonstationary time series. Phys. A Stat. Mech. 316(1–4) (2002)
K. Dragomiretskiy, D. Zosso, Variational mode decomposition. IEEE Trans. Signal Process. 62(3), 531–544 (2014)
X. Li, S. Zeng, W. Tong, Enhancing carrier frequency offset authentication via fractal dimension, in International Conference on Networking and Network Applications (NaNA), vol. 2018, 137–142 (2018)
H.I. Fawaz, G. Forestier, J. Weber, L. Idoumghar, P.A. Muller, Data augmentation using synthetic data for time series classification with deep residual networks (2018)
Acknowledgements
Not applicable.
Funding
The work was supported by the National Natural Science Foundation of China under Grant 62001067.
Author information
Contributions
The concept of this paper was put forward by Wenjiang Feng, Yuan Li and Chongchong Wu. Wenjiang Feng built and optimized the model, made the performance simulations, wrote the paper and revised the paper. Other authors assisted in related work. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Feng, W., Li, Y., Wu, C. et al. RF fingerprint extraction and device recognition algorithm based on multiscale fractal features and APWOALSSVM. EURASIP J. Adv. Signal Process. 2023, 131 (2023). https://doi.org/10.1186/s13634023010989
DOI: https://doi.org/10.1186/s13634023010989