Hierarchical coordinated anti-jamming channel access in clustering networks: a multi-leader multi-follower Stackelberg game approach

This paper investigates the multi-user coordinated anti-jamming problem in clustering communication networks. In such networks, there are multiple clusters, and multiple users communicate with their receivers simultaneously. In addition, a malicious jammer persistently attacks channels with wide-band, dynamically changing jamming signals. To cope with the challenges brought by the large-scale clustering network and the dynamic wide-band jamming, a hierarchical coordinated anti-jamming approach is proposed, and a multi-leader multi-follower Stackelberg game is introduced to model the anti-jamming problem. Specifically, cluster heads act as leaders and select available frequency bands to avoid jamming attacks, while users in each cluster act as followers and select corresponding channels in a distributed and independent manner. Moreover, it is proved that there exist multiple Stackelberg equilibriums (SEs) in the proposed game. To obtain the SEs, a hierarchical coordinated anti-jamming channel access (HCACA) algorithm is designed. Simulation results show that the proposed approach copes effectively with dynamic wide-band jamming attacks and that it converges faster than the distributed anti-jamming approach used for comparison.


Introduction
With the development of wireless communication technologies, the number of wireless communication devices is increasing explosively, causing a severe shortage of spectrum resources. Thus, how to coordinate internal interference is a vital problem that remains to be solved. In addition, owing to advances in software-defined radio, it is convenient for malicious users (e.g., a malicious jammer) to carry out jamming attacks with low-cost, wide-band noise signals; hence, communication devices face a more serious security threat. In short, it is important to "coordinate the internal interference" and "resist external malicious attacks" simultaneously in wireless communication networks [1][2][3][4].
Focusing on the internal coordination and external confrontation problem for effective spectrum resource utilization, game theory [5][6][7][8][9][10][11][12][13][14] is a feasible tool to model and analyze the competitive or cooperative relationships, both internal and external, among different decision-makers (i.e., users or the malicious jammer). For example, regarding internal coordination, the authors in [6] reviewed decision-theoretic solutions for channel access strategies in the opportunistic spectrum access system and presented some effective game models for internal interference avoidance. Besides, the authors in [7] reviewed the applications of repeated games in wireless networks, while the authors in [8] formulated users' competition for channel access as a non-cooperative game and proved that the proposed game is an ordinal potential game, which has at least one pure strategy Nash equilibrium (NE).
When malicious jamming attacks are considered, the Stackelberg game [15][16][17][18][19][20][21][22][23][24][25][26][27] is a suitable model for the competitive interactions between the user side and the jammer side, as it captures the sequential decision-making relationship between the two opposing sides. For example, Yang et al. formulated the anti-jamming power control problem using the Stackelberg game [15,16]. In detail, the user was the leader, and the smart jammer acted as the follower of the game. Besides, the utility functions of the user and the jammer were designed respectively. The authors in [17] considered the anti-jamming power control problem with observation error and derived the Stackelberg equilibrium (SE) of the proposed Stackelberg game. In addition, the authors in [18] formulated the competition between one user and one jammer as a Bayesian Stackelberg game and took incomplete information into consideration. In [19], the authors investigated the discrete anti-jamming power control problem and solved it using the Stackelberg game and a Q-learning algorithm. Considering the multi-user anti-jamming scenario, a one-leader multi-follower Bayesian Stackelberg game was formulated in [20], and the influence of observation error as well as the mobility of unmanned aerial vehicles (UAVs) was investigated. Moreover, the authors in [21] investigated the communication confrontation scenario and modeled the interactions as a Stackelberg game, with each player seeking to maximize its own utility. However, there are two main challenges in extending the above game approaches to the multi-user coordinated channel access scenario in clustering networks with dynamic wide-band jamming attacks. (i) The increase in network scale causes more severe internal conflicts; thus, it is important to coordinate the co-channel interference and to reduce the convergence time and complexity of the channel access algorithm. (ii) Given the dynamic wide-band attacker, it is harder for users to adapt to the dynamic change of the jamming attack and to avoid the wide-band jamming signals, which occupy more spectrum resources than single-tone jamming signals.
To address these challenges and to take full advantage of the clustering network, a multi-leader multi-follower Stackelberg game is formulated to model the competitive interactions between cluster heads and cluster users. Moreover, to accommodate the dynamic changes of the jamming attacks, the idea of jamming utilization is also introduced. To sum up, the main contributions of this paper are as follows:
• The malicious wide-band jamming signals are "utilized" by cluster heads as coordination signals [28] to guide the channel access of users; that is, users adopt different anti-jamming channel access strategies for different sensed jamming signals. The use of coordination signals helps users adapt to the dynamic changes of jamming attacks.
• A multi-leader multi-follower Stackelberg game is formulated, in which cluster heads act as leaders and select available frequency bands to avoid jamming attacks, while users in each cluster act as followers and select corresponding channels in a distributed and independent manner.
• The proposed Stackelberg game is decomposed into a multi-leader sub-game at the coarse granularity and a multi-follower sub-game at the fine granularity. It is proved that the proposed game has at least one SE for each jamming attack.
• To obtain the SEs, a hierarchical coordinated anti-jamming channel access (HCACA) algorithm is designed. In addition, simulation results show that the proposed approach copes effectively with dynamic wide-band jamming attacks and accelerates the convergence of the clustering network compared with the distributed learning scheme.
Note that some existing works have investigated the anti-jamming problem using a game-theoretic framework [20,29]. The main differences from this work are summarized as follows. As mentioned above, the authors in [20] proposed a one-leader multi-follower Bayesian Stackelberg game and derived the closed-form SE. However, the approach in [20] is not suitable for large-scale multi-user scenarios, and it cannot adapt to the dynamically changing characteristics of jamming attacks. Besides, the authors in [29] investigated the distributed coordinated anti-jamming scenario. However, that approach may not be well adapted to wide-band jamming in the clustering network. Thus, different from these two works, we take full advantage of the clustering network and propose a hierarchical coordinated anti-jamming channel access approach that is effective and converges quickly.
The rest of this paper is organized as follows. In Section 2, the methods are introduced briefly. In Section 3, the system model and problem formulation are presented. In Section 4, the multi-leader multi-follower Stackelberg game for hierarchical coordinated anti-jamming channel access is formulated. In Section 5, the hierarchical coordinated anti-jamming channel access algorithm is proposed. In Section 6, simulation results and discussions are presented. Finally, the conclusion is given in Section 7.

Methods
The aim of this study is to solve the multi-user coordinated anti-jamming channel access problem in clustering communication networks. In detail, we propose a hierarchical coordinated anti-jamming approach and formulate the anti-jamming problem as a multi-leader multi-follower Stackelberg game. Besides, we prove that there exist multiple Stackelberg equilibriums (SEs) in the proposed game, and we then design a hierarchical coordinated anti-jamming channel access (HCACA) algorithm to obtain the SEs of the game.
The paper follows a "game & learning" structure, in which the proposed game models the interactions among different decision-makers, and the proposed learning algorithm is used to find the optimal or nearly optimal solutions of the game model. The theoretical analysis is supported by an experimental evaluation on a randomly generated clustering network under different dynamic wide-band jamming patterns. The simulations have been carried out with MATLAB R2019a.

System model
In this paper, we consider a multi-user clustering network that is under attack by a malicious jammer. As shown in Fig. 1, the network is divided into $H$ clusters according to the geographical location of communication devices [30], and the cluster set is $\mathcal{H} = \{1, \ldots, h, \ldots, H\}$. The neighbor set of cluster $h$ is denoted as $\mathcal{J}_h$. In each cluster, there are multiple transmitter-receiver pairs, and each transmitter-receiver pair is regarded as one user. Denote $M_h$ as the number of users in cluster $h$ and $\mathcal{M}_h$ as the user set. In cluster $h$, the $k$th user is denoted as $m_{h,k}$; thus, we have $m_{h,k} \in \mathcal{M}_h$. There are $A$ orthogonal channels for users to access, and the channel set is $\mathcal{A}$. To disturb the legitimate transmissions of users, the malicious jammer incessantly sends high-power noise signals on multiple channels.
In each cluster, there is a cluster head that can exchange information with the cluster users. For example, each cluster head can send available frequency band information to the users it governs. After obtaining this information, cluster users select their transmission channels in a distributed manner. Denote the transmission channel of user $k$ in cluster $h$ as $a_{h,k}$. The transmission throughput $R_{h,k}$ of this user is given by Eq. 1, where $B_{a_{h,k}}$ is the channel bandwidth, $P_{h,k}$ is the transmission power of the $k$th user in cluster $h$, and $G_{h,k}$ is the channel gain of the transmission link. Assume that the wireless communication channels undergo block fading, and the Rayleigh fading channel model is considered [31]. Thus, $G_{h,k} = d_{h,k}^{-\alpha} \varepsilon_{h,k}$, where $d_{h,k}$ denotes the transmission distance, $\alpha$ is the path-loss factor, and $\varepsilon_{h,k}$ is the instantaneous random component. Besides, $N_{a_{h,k}}$ is the noise power of the channel. In this paper, users adopt a multiple access technique based on multi-channel slotted ALOHA, and only one transmission is allowed in a slot.
Fig. 1 The multi-user clustering network under malicious jamming attacks
To show the influence of the intra-cluster interference, the inter-cluster interference and the external malicious jamming, a cumulative jamming-interference indicator function of $(a_{h,k}, a_{-(h,k)}, A_j)$ is designed [32], where $a_{-(h,k)}$ denotes the channel access strategy combination of all users except the $k$th user in cluster $h$, and $A_j$ is the jamming channel combination of the malicious jammer. In the definition of this indicator, $\mathcal{M}_h \setminus k$ denotes all users in cluster $h$ except user $k$, $a_{h,i}$ is the channel access strategy of user $i$ in the same cluster, and $a_{g,i}$ is the channel access strategy of user $i$ in the neighboring cluster $g$. The definition combines three indicator functions, $\delta(a_{h,k}, a_{h,i})$, $\delta(a_{h,k}, a_{g,i})$ and $\delta(a_{h,k}, A_j)$, which capture the intra-cluster interference, the inter-cluster interference and the external malicious jamming, respectively. In these indicator functions, the condition $P_y d_{y \to h,k}^{-\alpha} > \tau_{\text{threshold}}$ represents that the received interference/jamming signal power from user/jammer $y$ is higher than the threshold $\tau_{\text{threshold}}$. The indicator functions illustrate that if the channel access strategies of users conflict, the users suffer from co-channel interference when the interference threshold is reached. Likewise, if the channel access strategy of a user overlaps with the attack strategy of the malicious jammer, this user suffers from malicious jamming, as the jammer sends high-power noise signals on the attacked channels. Note that, since the jammer is able to carry out wide-band jamming attacks, the above relationship becomes $a_{h,k} \in A_j$ in the wide-band jamming case.
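The quantities defined above can be illustrated with a minimal Python sketch. It assumes that the throughput takes the usual Shannon form $B \log_2(1 + PG/N)$ and that the cumulative indicator is the plain sum of its three component indicators; both are assumptions made for illustration. The function names and the argument layout are hypothetical.

```python
import math

def link_gain(distance_m, alpha, epsilon=1.0):
    """Channel gain d^(-alpha) * epsilon, where epsilon is the fading component."""
    return distance_m ** (-alpha) * epsilon

def shannon_rate(bandwidth_hz, power_w, gain, noise_w):
    """Interference-free rate B * log2(1 + P*G/N) in bit/s (assumed form)."""
    return bandwidth_hz * math.log2(1.0 + power_w * gain / noise_w)

def delta(own_channel, other_channel, rx_power_w, threshold_w):
    """1 if both parties use the same channel and the received power exceeds the threshold."""
    same = own_channel != 0 and own_channel == other_channel
    return int(same and rx_power_w > threshold_w)

def cumulative_indicator(own_channel, intra, inter, jammed, jam_rx_power_w, threshold_w):
    """Sum of the intra-cluster, inter-cluster and jamming indicators (assumed sum form).

    intra, inter: lists of (channel, received_power_w) for the other users.
    jammed: set of channels attacked by the wide-band jammer.
    """
    total = sum(delta(own_channel, ch, p, threshold_w) for ch, p in intra)
    total += sum(delta(own_channel, ch, p, threshold_w) for ch, p in inter)
    total += int(own_channel in jammed and jam_rx_power_w > threshold_w)
    return total

# Hypothetical numbers: one strong co-channel neighbor, one weak one, and a jammed channel.
example_indicator = cumulative_indicator(
    own_channel=3,
    intra=[(3, 1e-9), (4, 1e-9)],
    inter=[(3, 1e-15)],
    jammed={1, 2, 3},
    jam_rx_power_w=1e-6,
    threshold_w=1e-12,
)
example_rate = shannon_rate(2e6, 0.1, link_gain(1000.0, 3), 1e-13)
print(example_indicator, example_rate)
```

A nonzero indicator value means that the user's transmission on that channel is disturbed by at least one interferer or by the jammer.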

Problem formulation
Denote the channel set obtained by cluster $h$ as $\mathcal{A}_h$; thus, we have $a_{h,k} \in \mathcal{A}_h$. The optimization objective of the clustering network, denoted P1, is to maximize the sum throughput of the network over all users' channel access strategies in each cluster $h \in \mathcal{H}$, subject to two constraints. This objective indicates that each user seeks an optimal channel access strategy that maximizes the sum throughput of the clustering network. The first constraint limits the range of jamming channels, while the second means that each user either selects one channel or keeps silent in each time slot.
However, solving the above optimization problem poses several challenges. (i) The problem is non-convex, which makes it difficult to solve directly in distributed clustering networks. (ii) Due to the dynamic and wide-band characteristics of malicious jamming, cluster users must coordinate internal interference and avoid external jamming attacks at the same time. (iii) As the network scale increases, the computational complexity increases correspondingly; thus, a low-complexity channel access approach is preferred to improve the convergence speed of the network. To cope with these three challenges, the idea of "game and learning" is introduced, in which a multi-leader multi-follower Stackelberg game is formulated to analyze the relationships among users, and a hierarchical coordinated anti-jamming channel access algorithm is designed to obtain the game equilibriums.

Stackelberg game formulation
To solve the hierarchical coordinated anti-jamming channel access problem in the clustering network, a multi-leader multi-follower Stackelberg game is first formulated, where cluster heads act as the multiple leaders of the game, while cluster members are the followers.
Besides, the proposed game model can be decomposed into a multi-leader sub-game and a multi-follower sub-game, as shown in Fig. 2.
Considering the influence of the malicious, wide-band and dynamic jamming attacks, each cluster competes with the jammer to obtain available spectrum bands for its cluster members.Due to the dynamic characteristic of the jammer, the band selection strategies of each cluster need to be adjusted to adapt to the changing of the attacking strategies.
After avoiding the malicious jamming bands, each cluster sends the available band information to its cluster members. Then, users in the cluster access these available channels in a distributed manner to coordinate the intra-cluster and inter-cluster conflicts.
Fig. 2 The structure of multi-leader multi-follower Stackelberg game
Mathematically, the game model consists of the following elements: $\mathcal{H}$ is the cluster set, $\{\mathcal{M}_h\}_{h \in \mathcal{H}}$ is the user set of each cluster, and $J$ represents the jammer. $\mathcal{A}$ is the channel set of the clustering network, $\{\mathcal{A}_h\}_{h \in \mathcal{H}}$ is the channel set of each cluster $h \in \mathcal{H}$, and $U_h$ is the utility function of cluster head $h$. $u_{h,k}$ is the utility function of user $k$ in cluster $h$, which is the same as the throughput $R_{h,k}$ shown in Eq. 1. In addition, $\mathcal{A}_j$ is the attack strategy set of the wide-band jammer.
After the Stackelberg game formulation, the definition of the Stackelberg equilibrium (SE) is given as follows.
Definition 1 In the proposed multi-leader multi-follower Stackelberg game, the SE delivers the optimal strategy combination of cluster heads and users for each jamming strategy $A_j \in \mathcal{A}_j$. For the multiple leaders, there exists a strategy combination $(A_1^*, \cdots, A_H^*)$ that maximizes the cumulative utility of the cluster heads. For the multiple followers, the users' strategy combination $a^*$ is the best-response strategy combination with respect to the leaders' strategies [33]. The joint strategy combination of leaders and followers is the SE of the game if it satisfies the conditions in Eq. 6, where $A_{-h}^*$ denotes the optimal strategy combination of all cluster heads except cluster head $h$, and $a_{-(h,k)}^*$ is the best-response strategy combination of all users except user $k$ in cluster $h$.
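A standard way to write the two equilibrium conditions of Definition 1 is sketched below. It is stated as an assumption that is consistent with the unilateral-deviation inequalities used in the later proofs; the exact argument lists of $U_h$ and $u_{h,k}$ are illustrative.

```latex
% Assumed form of the SE conditions; the argument lists are illustrative.
\begin{align}
U_h\left(A_h^*, A_{-h}^*, A_j\right) &\ge U_h\left(A_h, A_{-h}^*, A_j\right),
  \quad \forall h \in \mathcal{H}, \\
u_{h,k}\left(a_{h,k}^*, a_{-(h,k)}^*, A_j\right) &\ge u_{h,k}\left(a_{h,k}, a_{-(h,k)}^*, A_j\right),
  \quad \forall h \in \mathcal{H},\ \forall k \in \mathcal{M}_h .
\end{align}
```

The first line says that no cluster head gains by changing its band alone, and the second that no user gains by changing its channel alone, given the leaders' bands.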
According to the hierarchical characteristic of the multi-leader multi-follower Stackelberg game, the optimization objective P1 is rewritten as a leader-level objective together with the followers' best-response solution. Note that there are multiple coupling relationships in the clustering network, and there also exist competitive relationships between different clusters and the malicious jammer; thus, the above game formulation is hard to solve directly using traditional optimization approaches. In this paper, the game is decomposed into two sub-games at two different granularities: the multi-leader (cluster head) sub-game for anti-jamming spectrum sharing at the coarse granularity, and the multi-follower (cluster member) sub-game for distributed channel access at the fine granularity.

Multi-leader sub-game from the coarse granularity
The objective of the multi-leader sub-game is to analyze and model the competitive interactions between different cluster heads and the jammer. In the formulation of this sub-game, $CH_h$ is the cluster head of cluster $h$, $A_h \subseteq \mathcal{A}_h$ is one of the band selection strategies of cluster $h$, $A_j \subseteq \mathcal{A}_j$ is one of the jamming strategies, and $U_h$ is the utility function of the corresponding cluster head, which is expressed as follows:
$$U_h(A_h, A_j) = |\mathcal{A}| - \sum_{e \in A_h} \delta(e, A_j), \quad (10)$$
where $e \in A_h$ is a specific channel in spectrum band $A_h$, and $|\mathcal{A}|$ is the cardinality of channel set $\mathcal{A}$. Moreover, the jamming indicator function $\delta(e, A_j)$ equals 1 if channel $e$ is jammed under $A_j$ and 0 otherwise. Equation 10 means that the utility of cluster head $h$ is the difference between the cardinality of channel set $\mathcal{A}$ and the jamming degree of the current spectrum band strategy $A_h$. Hence, in the multi-leader sub-game, each cluster head tries to find the optimal spectrum band that minimizes the cumulative jamming degree. Next, we give the definition of the NE and then prove that there exists at least one pure strategy NE in the multi-leader sub-game.

Definition 2 The strategy combination $(A_1^*, \cdots, A_H^*)$ is a pure strategy NE of the game if and only if no cluster head can improve its utility by unilaterally changing its spectrum band selection strategy [34], that is, $U_h(A_h^*, A_j) \ge U_h(A_h, A_j)$ for every cluster head $h$ and every alternative band strategy $A_h$. This condition shows that, for a specific jamming strategy $A_j$, the pure strategy NE improves the total network utility by unilaterally optimizing each cluster head's utility.

Theorem 1 For a specific jamming strategy $A_j$, the multi-leader sub-game has at least one pure strategy NE, which satisfies the conditions in Eq. 13.
Equation 13 depicts that the NE strategy of each cluster head changes with the variation of the jamming strategies.
Proof Proof by contradiction is applied to the multi-leader sub-game [29]. Assume that a strategy combination $(A_1^*, \cdots, A_H^*)$ satisfying these conditions is not an NE, so that some cluster head $h$ can obtain $U_h(A_h^*, A_j) < U_h(A_h, A_j)$ by unilaterally changing from $A_h^*$ to $A_h$. However, such a unilateral change means that cluster head $h$ selects some channels that are jammed. Then, $\sum_{e \in A_h} \delta(e, A_j)$ increases, and $U_h$ decreases correspondingly. Hence, the unilateral change does not satisfy $U_h(A_h^*, A_j) < U_h(A_h, A_j)$, which contradicts the assumption. This completes the proof of the theorem.
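The leader utility and the unilateral-deviation test can be made concrete with a short Python sketch. It assumes that the utility is the number of channels minus the number of jammed channels inside the chosen band, as described for Eq. 10, and that each cluster head picks from a finite list of candidate bands. The candidate list, the channel count and the function names are hypothetical.

```python
def jam_indicator(channel, jammed):
    """delta(e, A_j): 1 if the channel is jammed, 0 otherwise."""
    return 1 if channel in jammed else 0

def head_utility(band, jammed, num_channels):
    """U_h: |A| minus the number of jammed channels inside the chosen band."""
    return num_channels - sum(jam_indicator(e, jammed) for e in band)

def is_leader_ne(chosen_bands, candidate_bands, jammed, num_channels):
    """True if no cluster head gains by unilaterally switching to another candidate band."""
    best = max(head_utility(c, jammed, num_channels) for c in candidate_bands)
    return all(head_utility(b, jammed, num_channels) >= best for b in chosen_bands)

# Hypothetical example: 10 channels, the jammer attacks 5 contiguous channels.
NUM_CHANNELS = 10
JAMMED = {3, 4, 5, 6, 7}
CANDIDATES = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]

for band in CANDIDATES:
    print(band, head_utility(band, JAMMED, NUM_CHANNELS))
print(is_leader_ne([(1, 2), (9, 10)], CANDIDATES, JAMMED, NUM_CHANNELS))
print(is_leader_ne([(1, 2), (7, 8)], CANDIDATES, JAMMED, NUM_CHANNELS))
```

Because the utility in this sketch depends only on a head's own band and the jamming strategy, the equilibrium check reduces to verifying that every head already holds a band with the maximum utility.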

Multi-follower sub-game from the fine granularity
The objective of the multi-follower sub-game is to analyze and model the intra-cluster and inter-cluster competitive interactions among different cluster users. Note that the multi-leader sub-game guides the coarse-grained spectrum band selection of cluster heads, whereas the multi-follower sub-game guides the fine-grained channel access of users. In the formulation of the multi-follower sub-game, $u_{h,k} = R_{h,k}$ denotes the throughput of user $k$ in cluster $h$. In this sub-game, only interference coordination needs to be considered once the multi-leader sub-game has reached the NE for jamming strategy $A_j$. Thus, in the fine-grained multi-follower sub-game, each user tries to find the optimal channel access strategy that maximizes the cumulative throughput of the clustering network.

Theorem 2 For a specific NE strategy combination of the multi-leader sub-game, the multi-follower sub-game has at least one NE, which satisfies the conditions in Eq. 16.
Here, $a_{h,k} = 0$ means that the user keeps silent when strategy 0 is selected. The first line of Eq. 16 means that each user selects one channel from the optimal spectrum band of the cluster it belongs to, or keeps silent. The second line depicts that neighboring users choose different channels to access when reaching the NE. The third line illustrates that users avoid the jammed channels.
Proof Similarly, proof by contradiction is applied to the multi-follower sub-game. Assume that a strategy combination satisfying Eq. 16 is not an NE. Under this strategy combination, the throughput of user $k$ in cluster $h$ equals the rate $R_{h,k}$ of Eq. 1 if only user $k$ in cluster $h$ selects $a_{h,k}^*$, and 0 otherwise. However, a unilateral change from channel selection $a_{h,k}^*$ to $a_{h,k}$ drives the throughput of the current user down to zero, as the conflict degree increases when the equilibrium state is broken. Thus, $u_{h,k}(a_{h,k}^*, a_{-(h,k)}^*, A_j) < u_{h,k}(a_{h,k}, a_{-(h,k)}^*, A_j)$ cannot be satisfied, and the hypothesis is not valid. Therefore, Theorem 2 holds.
Theorem 3 For a specific jamming strategy $A_j$, there exists at least one SE in the proposed multi-leader multi-follower Stackelberg game.
Proof In the above subsections, we proved that there exists at least one NE strategy combination $(A_1^*, \cdots, A_H^*)$ for the leaders' sub-game, which maximizes the cumulative utility of all cluster heads and selects optimal spectrum bands that are not jammed by the malicious jammer. In response to the leaders' strategies, the follower sub-game shows that users can reach at least one NE strategy combination, which maximizes the network throughput of cluster users. Combining Theorem 1, Theorem 2 and the definition of the SE, we conclude that the proposed game has at least one SE.
Figure 3 shows the operation process for obtaining the SE of the multi-leader multi-follower Stackelberg game. After observing the jamming strategy $A_j$, cluster heads (leaders) first adjust their spectrum band strategies to avoid the jamming attacks. Then, cluster users (followers) access channels in a distributed manner after obtaining the spectrum band information sent by their cluster heads. After that, cluster heads continue adjusting their band strategies until the NE of the leader sub-game is reached. When cluster users obtain the NE band strategy of their cluster heads, they adjust their channel access strategies until reaching the best-response strategy (which is also the NE of the follower sub-game). Through the continuous adjustment of cluster heads and cluster users, the system gradually reaches the SE for the current jamming strategy $A_j$. As the jamming attacks change dynamically, different SEs can be achieved via the Stackelberg game iterations.
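The three follower-side conditions described for the NE (a channel from the cluster's band or silence, no channel shared with a neighboring user, and no jammed channel) can be checked with a small Python sketch. The data layout is hypothetical: users are keyed by a (cluster, index) tuple, and "neighboring users" is interpreted here as the set of users within interference range, which is an assumption.

```python
def follower_conditions_hold(access, band_of, interferers, jammed):
    """Check the three follower conditions for a joint channel access profile.

    access: dict user -> chosen channel (0 means the user keeps silent).
    band_of: dict user -> set of channels offered by the user's cluster head.
    interferers: dict user -> set of users within interference range (assumed meaning
                 of "neighboring users").
    jammed: set of channels attacked by the jammer.
    """
    for user, channel in access.items():
        if channel == 0:
            continue  # silent users cannot violate any condition
        if channel not in band_of[user]:
            return False  # channel lies outside the cluster's band
        if channel in jammed:
            return False  # channel is jammed
        if any(access[other] == channel for other in interferers[user]):
            return False  # co-channel conflict with a neighboring user
    return True

# Hypothetical two-cluster example.
BAND_OF = {(1, 1): {1, 2}, (1, 2): {1, 2}, (2, 1): {9, 10}}
INTERFERERS = {(1, 1): {(1, 2)}, (1, 2): {(1, 1)}, (2, 1): set()}
JAMMED = {3, 4, 5, 6, 7}

good = {(1, 1): 1, (1, 2): 2, (2, 1): 9}
conflict = {(1, 1): 1, (1, 2): 1, (2, 1): 9}
print(follower_conditions_hold(good, BAND_OF, INTERFERERS, JAMMED))
print(follower_conditions_hold(conflict, BAND_OF, INTERFERERS, JAMMED))
```

The check covers only the conditions spelled out in the text; it does not test whether a silent user could improve its throughput by accessing a free channel.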

Hierarchical coordinated anti-jamming channel access algorithm
In this section, a hierarchical coordinated anti-jamming channel access (HCACA) algorithm is designed to obtain SEs of the multi-leader multi-follower Stackelberg game.

Slot structure
Before introducing the algorithm, we first design a slot structure for the operation of the hierarchical coordinated anti-jamming channel access, as shown in Fig. 4. In each time slot, cluster heads and cluster users take their actions in turn. First, each cluster head observes the jamming signal and maintains a corresponding strategy table for every jamming strategy. Second, each cluster head selects a spectrum band according to its strategy table. Then, all cluster heads update their band strategies to avoid the current jamming signal. After the cluster heads have acted, the spectrum band strategies are sent to their cluster users, which means that the first action of cluster users is the available channel acquisition. Each user also maintains a specific strategy table for every jamming strategy, consistent with its cluster head, and chooses one channel from its access table, which is randomly generated according to the available channel information. Once it has accessed the channel, each user transmits its data on the selected channel. When the transmission has been completed, users update their access strategies to avoid inter-cluster and intra-cluster conflicts among users.

Algorithm description
With the proposed slot structure, cluster heads and users update their spectrum band strategies and channel access strategies, respectively, using the hierarchical coordinated anti-jamming channel access (HCACA) algorithm. The details of the HCACA algorithm are given in Algorithm 1.
Fig. 4 The slot structure for the operation of the hierarchical coordinated anti-jamming channel access

Algorithm 1 HCACA
Loop
1. Jamming observation: Each cluster head observes the current jamming strategy $A_j$.
2. Spectrum band selection: Each cluster selects one certain band strategy according to the recommended strategy $A_h^{C}(A_j)$.
3. Strategy updating of cluster heads: For each cluster $h$, the cluster head updates the band strategy according to the rule in Eq. 18.
4. Available channel acquisition of users: Each user acquires the available channel set $A_h^{C}(A_j)$ from its cluster head $h$.
5. Channel access: Each user chooses one strategy $a_{h,k}^{C}(A_j)$ from its own strategy table.
6. Data transmission: Each user transmits on the selected channel and judges whether the transmission is successful.
7. Strategy updating of users: Each user $k$ in cluster $h$ updates its channel access strategy according to the following rule:
$$a_{n,k}(A_j) = \begin{cases} a', & \text{channel } a' \text{ is idle (sensing)}; \\ 0, & \text{channel } a' \text{ is busy (sensing)}; \\ a_{n,k}(A_j), & \text{successful transmission}; \\ a_{n,k}(A_j), & \text{conflicting, with probability } 1 - P_{back}; \\ 0, & \text{conflicting, with probability } P_{back}. \end{cases} \quad (19)$$
8. If $t > T_{max}$, the algorithm terminates.
End loop
The key steps of the proposed algorithm are the strategy updating processes of cluster heads and users, as shown in Eqs. 18 and 19. In detail, when cluster head $h$ updates its strategy, it checks whether $A_h^{C}(A_j) = 0$. If $A_h^{C}(A_j) = 0$, the cluster head chooses one spectrum band $A'$ to sense [35]. If the sensed band is available (not jammed), it sets $A_h^{C}(A_j) = A'$; otherwise, it keeps $A_h^{C}(A_j) = 0$. Similarly, when user $k$ in cluster $h$ updates its strategy, it checks whether $a_{n,k}(A_j) = 0$, and the sensing process of users is similar to that of cluster heads. If $a_{n,k}(A_j) \neq 0$, the user transmits on the selected channel and keeps strategy $a_{n,k}(A_j)$ unchanged after a successful transmission. However, if a conflict occurs during transmission, the user keeps its strategy with probability $1 - P_{back}$, or sets $a_{n,k}(A_j) = 0$ with probability $P_{back}$.
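The two update rules can be sketched in Python for a single cluster under one fixed jamming strategy. This is a simplified illustration rather than the full algorithm: inter-cluster interference, fading and the per-jamming-strategy tables are left out, `None` stands in for the band value 0, and a channel counts as busy when another user transmitted on it in the current slot. All names and the example scenario are hypothetical.

```python
import random

def update_head(band, candidate_bands, jammed, rng):
    """Band update as described for Eq. 18; None plays the role of the value 0."""
    if band is not None:
        return band                       # a recorded band is kept
    trial = rng.choice(candidate_bands)   # sense one candidate band
    return trial if jammed.isdisjoint(trial) else None

def update_user(channel, available, occupied, collided, p_back, rng):
    """Channel update following the five cases of Eq. 19."""
    if channel == 0:
        trial = rng.choice(available)     # sense one channel of the band
        return trial if trial not in occupied else 0
    if not collided:
        return channel                    # successful transmission: keep the channel
    return 0 if rng.random() < p_back else channel  # back off with probability p_back

def run_cluster(num_users, candidate_bands, jammed, p_back, max_slots, seed=0):
    """Iterate head and user updates for one cluster; returns (band, access list)."""
    rng = random.Random(seed)
    band = None
    access = [0] * num_users
    for _ in range(max_slots):
        band = update_head(band, candidate_bands, jammed, rng)
        if band is None:
            continue                      # no usable band yet, users stay silent
        active = [c for c in access if c != 0]
        occupied = set(active)
        collided = {c for c in occupied if active.count(c) > 1}
        access = [
            update_user(c, list(band), occupied, c in collided, p_back, rng)
            for c in access
        ]
    return band, access

# Hypothetical scenario: 16 channels, the jammer attacks channels 4 to 8.
BANDS = [(1, 2, 3, 4, 5, 6), (5, 6, 7, 8, 9, 10), (11, 12, 13, 14, 15, 16)]
JAMMED = {4, 5, 6, 7, 8}
band, access = run_cluster(num_users=4, candidate_bands=BANDS, jammed=JAMMED,
                           p_back=0.5, max_slots=500)
print(band, access)
```

In this sketch a user that is alone on its channel never moves again, so the number of settled users can only grow, which mirrors the gradual convergence described in the text.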

Complexity analysis
Motivated by [29], we analyze the complexity of the proposed HCACA algorithm, which consists of the computational complexity and the storage size.
First, we give the computational complexity under fixed jamming. Each cluster head observes the jamming signal $A_j$, and the corresponding complexity is $O(Jo_1)$, where $Jo_1$ is a small constant related to the observation process. Hence, the computational complexity of jamming observation is $|\mathcal{H}|\, O(Jo_1)$.
After the spectrum band selection, each cluster head updates its band strategy table $ATH_h$ according to Eq. 18, with computational complexity $O(B_{up1})$, where $B_{up1}$ is a small constant related to the band updating process. Hence, the total computational complexity of band updating is $|\mathcal{H}|\, O(B_{up1})$.
Following the updating process, each cluster head sends the band selection strategies to its cluster users, and each user then updates its channel strategy table according to Eq. 19, with computational complexity $O(B_{up2})$. Thus, the total computational complexity of channel updating is $O\left(\sum_{h \in \mathcal{H}} |\mathcal{M}_h| B_{up2}\right)$. Assume that the number of iterations to convergence is $T_1$ under fixed jamming; the total computational complexity is then obtained by accumulating the three terms above over $T_1$ iterations. Similarly, the computational complexity for the sweep jamming case is expressed in terms of $\mathcal{A}_{sweep}$, the strategy set of the sweep jammer, and that for the random jamming case in terms of $\mathcal{A}_{random}$, the strategy set of the random jammer. Note that the random jammer randomly selects one strategy from its jamming strategy set at each time; thus, on average $O\left(|\mathcal{A}_{random}|^2 T_1\right)$ iterations are necessary for the convergence of the proposed HCACA algorithm (according to Lemma 3 of [28]). Moreover, cluster heads and cluster users store their band strategy tables and channel strategy tables, respectively. The storage size therefore depends on $\mathcal{A}_j$, which denotes the jamming strategy set of the given jamming pattern (1, $\mathcal{A}_{sweep}$ or $\mathcal{A}_{random}$), and on $St_1$, a small constant related to the size of the band strategy table.
In summary, the computational complexity in each slot is a small constant, and the storage size is modest. Thus, the proposed HCACA algorithm has low complexity, and the approach can be implemented using typical technologies.
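As a rough sketch of how the per-slot terms above combine, and assuming that the total cost is simply the per-slot cost accumulated over the number of iterations to convergence, the fixed and random jamming cases could be written as follows. The symbols $C_{\text{slot}}$, $C_{\text{fixed}}$ and $C_{\text{random}}$ are names introduced here for illustration only.

```latex
% Assumed accumulation of the per-slot terms; C_slot, C_fixed and C_random are illustrative names.
\begin{align}
C_{\text{slot}} &= |\mathcal{H}|\, O(Jo_1) + |\mathcal{H}|\, O(B_{up1})
  + O\Big(\sum_{h \in \mathcal{H}} |\mathcal{M}_h|\, B_{up2}\Big), \\
C_{\text{fixed}} &= T_1 \, C_{\text{slot}}, \\
C_{\text{random}} &= O\big(|\mathcal{A}_{random}|^2\, T_1\big)\, C_{\text{slot}} .
\end{align}
```

The random jamming line uses the average iteration count quoted in the text; the per-slot cost itself does not depend on the jamming pattern in this reading.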

Simulation results and discussions
In this section, the simulations are conducted, and then some discussions with respect to simulation results are made.

Setting of parameters
First, the details of the parameters are given in Table 1. Users are randomly located in an area of 12 km × 12 km, and different users may belong to different clusters. A typical scenario of the clustering network is shown in Fig. 5, where there are 4 clusters and 24 users. In each cluster, there is a cluster head and 6 users (transmitter-receiver pairs). Users are affected by intra-cluster and inter-cluster interference if they choose the same channel to transmit data. The bandwidth of each channel is 2 MHz, the transmission power of each user is 0.1 W, the path-loss factor is 3, and the background noise power is −100 dBm. The interference distance is set to 2000 m, which is positively related to the interference threshold $\tau_{\text{threshold}}$. Time is slotted, and the duration of one time slot is 10 ms. In detail, the durations of jamming observation, spectrum band selection, strategy updating of the cluster head, available channel acquisition, channel access and strategy updating of users are each set to 0.5 ms, and the duration of data transmission is 7 ms. Furthermore, the clustering network is under the attack of a malicious jammer, and three different wide-band jamming patterns are considered, i.e., fixed jamming, sweep jamming and random jamming, as shown in Fig. 6.
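The quoted settings can be cross-checked with a few lines of Python. The sketch verifies that the six 0.5 ms control phases plus the 7 ms data phase fill the 10 ms slot, converts the noise power to watts, and computes the mean received power at the interference distance with fading ignored. Treating that power as the level associated with the interference threshold is an assumption made here for illustration. The constant and function names are hypothetical.

```python
import math

# Values quoted in the parameter description above.
TX_POWER_W = 0.1
PATH_LOSS_EXPONENT = 3
NOISE_DBM = -100.0
INTERFERENCE_DISTANCE_M = 2000.0
CONTROL_PHASES = 6          # observation, band selection, head update,
                            # channel acquisition, channel access, user update
CONTROL_PHASE_MS = 0.5
DATA_PHASE_MS = 7.0

def dbm_to_watt(dbm):
    """Convert a power level from dBm to watts."""
    return 10 ** (dbm / 10.0) / 1000.0

def watt_to_dbm(watt):
    """Convert a power level from watts to dBm."""
    return 10.0 * math.log10(watt * 1000.0)

def received_power_w(tx_power_w, distance_m, alpha):
    """Mean received power P * d^(-alpha), with fading ignored (assumption)."""
    return tx_power_w * distance_m ** (-alpha)

slot_ms = CONTROL_PHASES * CONTROL_PHASE_MS + DATA_PHASE_MS
noise_w = dbm_to_watt(NOISE_DBM)
power_at_range_w = received_power_w(TX_POWER_W, INTERFERENCE_DISTANCE_M, PATH_LOSS_EXPONENT)

print(f"slot duration: {slot_ms} ms")
print(f"noise power: {noise_w:.1e} W")
print(f"mean power at interference distance: {power_at_range_w:.3e} W "
      f"({watt_to_dbm(power_at_range_w):.1f} dBm)")
```

With these settings, the data phase occupies 70% of each slot, and the mean power received from a user at the interference distance is about 21 dB above the background noise.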

Convergence analysis
Second, the convergence performance of the proposed HCACA algorithm is examined in this subsection. Simulations are conducted under the three jamming patterns mentioned above. In detail, Fig. 7 shows the convergence process of the network throughput under fixed, sweep and random wide-band jamming (in each time slot, the jammer chooses 5 contiguous channels to attack). It can be concluded that the proposed HCACA algorithm converges fastest under fixed jamming, as a jamming pattern that does not change is the easiest to adapt to. Besides, the convergence time increases as the jammer becomes more dynamic: the proposed algorithm converges more slowly under random jamming than under sweep jamming. Note that in the initial stage, the network throughput fluctuates sharply under sweep jamming and random jamming, as the jamming state varies over time. With the updating of the strategy tables, the network throughput converges gradually. In short, the results show that the proposed algorithm is well adapted to the dynamic and wide-band jamming environment. Moreover, channel fading also leads to slight fluctuations in network throughput; however, this does not affect the convergence process of the HCACA algorithm.

Performance comparison
To further investigate the performance of the proposed approach, four different algorithms are compared in this subsection: the hierarchical coordinated anti-jamming channel access (HCACA) algorithm, the distributed anti-jamming learning (DAL) algorithm, the coordinated learning algorithm proposed in [36], and the random selection algorithm [37].
In the DAL algorithm, users first utilize the jamming signal as the coordination signal, and then update their channel access strategies in a distributed manner according to Eq. 19. In the coordinated learning algorithm, users adopt random integers as coordination signals and avoid collisions via distributed learning. In the random selection algorithm, users randomly select a channel from their channel set to access. Figures 8 and 9 show the performance comparison under sweep wide-band jamming and random wide-band jamming, respectively. As shown in Fig. 8, the proposed HCACA algorithm outperforms the other algorithms, achieving the highest throughput with the fastest convergence speed under sweep wide-band jamming. Note that the proposed DAL algorithm also performs well compared with the coordinated learning algorithm and the random selection algorithm, which shows the advantage of utilizing the jamming signal. Besides, because the coordinated learning algorithm adopts random integers as coordination signals, it cannot adapt well to dynamic wide-band jamming. Thus, its throughput fluctuates and does not converge to a stable state. Similarly, Fig. 9 shows that the proposed HCACA algorithm adapts well under random jamming. In detail, it adapts to the stochastic behavior of the random jamming and converges gradually. The proposed DAL algorithm also performs well thanks to the observation of jamming signals. In contrast, the coordinated learning algorithm and the random selection algorithm struggle to accommodate the random jamming and obtain lower network throughput than the proposed algorithms.
To compare the convergence speed of the proposed HCACA algorithm and the DAL algorithm, we record the convergence time over 100 Monte Carlo experiments and take the average value once the equilibrium strategy combination of the network is reached (for the HCACA algorithm, different SEs are achieved, while for the DAL algorithm, NEs are achieved for different jamming attacks). As shown in Fig. 10, the proposed hierarchical coordinated anti-jamming approach with the Stackelberg game formulation converges faster than the distributed anti-jamming learning approach under all three wide-band jamming attacks. The reason is that in the proposed HCACA approach the cluster heads handle jamming avoidance, which greatly reduces the strategy space. In addition, as the jamming attacks become more dynamic and random, the average number of iterations increases accordingly; however, the proposed hierarchical anti-jamming approach still outperforms the distributed anti-jamming approach in terms of convergence speed.
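The averaging procedure can be sketched in Python as follows. This is only an illustration of Monte Carlo averaging: `toy_convergence_time` is a hypothetical stand-in for one full simulation run (it simply draws a geometric random number of slots), and its `stop_probability` parameter is arbitrary and does not model either HCACA or DAL.

```python
import random
from statistics import mean


def toy_convergence_time(rng, stop_probability):
    """Hypothetical stand-in for one simulation run.

    Returns the number of slots until the run 'converges', drawn from a
    geometric distribution with per-slot success probability `stop_probability`.
    """
    slots = 1
    while rng.random() >= stop_probability:
        slots += 1
    return slots


def average_convergence_time(run_once, n_runs=100, seed=0):
    """Average the convergence time returned by `run_once` over `n_runs` runs."""
    rng = random.Random(seed)  # fixed seed so the experiment is reproducible
    return mean(run_once(rng) for _ in range(n_runs))


if __name__ == "__main__":
    # 100 independent runs, as in the averaging described above.
    avg = average_convergence_time(lambda rng: toy_convergence_time(rng, 0.05))
    print(f"average convergence time over 100 runs: {avg:.1f} slots")
```

In an actual experiment, `run_once` would be replaced by a function that runs the algorithm under one jamming pattern and returns the slot at which the equilibrium strategy combination is reached.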

Conclusion
This paper investigated the multi-user coordinated anti-jamming problem in clustering communication networks. A hierarchical coordinated anti-jamming approach was proposed, and a multi-leader multi-follower Stackelberg game was introduced to model the anti-jamming problem. In detail, cluster heads acted as leaders and selected available frequency bands to avoid jamming attacks, while users in each cluster acted as followers and selected corresponding channels distributedly and independently. Moreover, it was proved that there exist multiple Stackelberg equilibriums (SEs) in the proposed game. To obtain SEs, a hierarchical coordinated anti-jamming channel access (HCACA) algorithm was designed. Simulation results illustrated that the proposed approach copes effectively with dynamic wide-band jamming attacks. Furthermore, the proposed approach was shown to outperform the distributed anti-jamming comparative approach in terms of convergence speed.

Fig. 3 The process of obtaining SE of the multi-leader multi-follower Stackelberg game

Fig. 8 The performance comparison among four different algorithms under sweep jamming

Fig. 10 The comparison of average convergence time between two different algorithms

Set the maximal time slot $T_{\max}$, the slot length, the channel set, the cluster set, the spectrum band strategy table $ATH_h = \{A_h(C_1), \ldots, A_h(C_{|A_j|})\}$ for cluster head $h$, $\forall h \in H$, and the channel access table $AT_{h,k} = \{a_{h,k}(C_1), \ldots, a_{h,k}(C_{|A_j|})\}$ for user $k$ in cluster $h$, $\forall h \in H$, $k \in M_h$. Each cluster head observes the current jamming signal $A_j$ and confirms the corresponding coordination signal $C_{A_j}$.
Loop: $t = 1, \ldots, T_{\max}$
1. Jamming observation:
table for one jamming strategy, and $St_2$ is a small constant with respect to the size of channel strategy table for one jamming strategy.

Table 1 Parameter settings in simulations

Table 2 Summary of main notations
Noise power of channel $a_{h,k}$
$\delta(a_n(t))$: Jamming-interference conflict indicator function
$(a_{h,k}, a_{-(h,k)}, A_j)$: Cumulative jamming-interference indicator function
$a_{-(h,k)}$: Channel access strategy combination of all users except the $k$th user in cluster $h$