### 4.1 Stackelberg game formulation

To solve the hierarchical coordinated anti-jamming channel access problem in the clustering network, a multi-leader multi-follower Stackelberg game is first formulated, in which cluster heads act as the leaders and cluster members act as the followers. The proposed game model can be decomposed into a multi-leader sub-game and a multi-follower sub-game, as shown in Fig. 2.

Under malicious, wide-band and dynamic jamming attacks, each cluster competes with the jammer to obtain available spectrum bands for its cluster members. Because the jammer is dynamic, the band selection strategy of each cluster must be adjusted to adapt to changes in the attack strategy. After avoiding the jammed bands, each cluster head sends the available band information to its cluster members. Then, users in the cluster access these available channels in a distributed manner to coordinate the intra-cluster and inter-cluster conflicts. Mathematically, the game model is expressed as follows:

$$ {\mathcal{G}} = \left\{ {{\mathcal{H}},{{\left\{ {{{\mathcal{M}}_{h}}} \right\}}_{h \in {\mathcal{H}}}},{\mathcal{A}},{\mathcal{J}},{{\left\{ {{{\mathcal{A}}_{h}}} \right\}}_{h \in {\mathcal{H}}}},{{\left\{ {{U_{h}}} \right\}}_{h \in {\mathcal{H}}}},{{\left\{ {{u_{h,k}}} \right\}}_{h \in {\mathcal{H}},k \in {{\mathcal{M}}_{h}}}},{{\mathcal{A}}_{j}}} \right\}, $$

(5)

where \({\mathcal {H}}\) is the cluster set, \({\left \{ {{{\mathcal {M}}_{h}}} \right \}_{h \in {\mathcal {H}}}}\) is the user set of each cluster, and \({\mathcal {J}}\) represents the jammer. \({\mathcal {A}}\) is the channel set of the clustering network, \({\left \{ {{{\mathcal {A}}_{h}}} \right \}_{h \in {\mathcal {H}}}}\) is the channel set of each cluster \(h \in {\mathcal {H}}\), and *U*_{h} is the utility function of cluster head *h*. *u*_{h,k} is the utility function of user *k* in cluster *h*, which is the same as the throughput *R*_{h,k} shown in Eq. 1. In addition, \({{\mathcal {A}}_{j}}\) is the attack strategy set of the wide-band jammer.

Based on this formulation, the Stackelberg equilibrium (SE) is defined as follows:

###
**Definition 1**

In the proposed multi-leader multi-follower Stackelberg game, the SE gives the optimal strategy combination of cluster heads and users for each jamming strategy \(A_{j} \in {{\mathcal {A}}_{j}}\). For the leaders, there exists a strategy combination \(\left ({A_{1}^{*}, \cdots, A_{H}^{*}} \right)\) that maximizes the cumulative utility of the cluster heads. For the followers, the strategy combination \(\left \{ {\left ({a_{1,1}^{*}, \cdots, a_{1,{{\mathcal {M}}_{1}}}^{*}} \right), \ldots, \left ({a_{H,1}^{*}, \cdots, a_{H,{{\mathcal {M}}_{H}}}^{*}} \right)} \right \}\) is the best-response strategy combination with respect to the leaders’ strategies [33]. The strategy combination \(\left \{ {\left ({A_{1}^{*}, \cdots, A_{H}^{*}} \right),\left ({a_{1,1}^{*}, \cdots, a_{1,{{\mathcal {M}}_{1}}}^{*}} \right), \ldots, \left ({a_{H,1}^{*}, \cdots, a_{H,{{\mathcal {M}}_{H}}}^{*}} \right)} \right \}\) is the SE of the game if it satisfies the following conditions:

$$ \left\{ \begin{array}{l} {U_{h}}\left({A_{h}^ *,A_{- h}^ *,a_{1,1}^{*},...,a_{H,{{\mathcal{M}}_{H}}}^{*}} \right) \ge {U_{h}}\left({{A_{h}},A_{- h}^ *,a_{1,1}^{*},...,a_{H,{{\mathcal{M}}_{H}}}^{*}} \right),\forall h \in {\mathcal{H}},\\ {u_{h,k}}\left({a_{h,k}^ *,a_{- \left({h,k} \right)}^ *,A_{h}^ *,A_{- h}^ *} \right) \ge {u_{h,k}}\left({{a_{h,k}},a_{- \left({h,k} \right)}^ *,A_{h}^ *,A_{- h}^ *} \right),\forall h \in {\mathcal{H}},k \in {{\mathcal{M}}_{h}}. \end{array} \right. $$

(6)

In Eq. 6, \(A_{-h}^{*}\) denotes the optimal strategy combination of all cluster heads except cluster head *h*, and \(a_{- \left ({h,k} \right)}^{*}\) is the best-response strategy combination of all users except user *k* in cluster *h*.

According to the hierarchical characteristic of the multi-leader multi-follower Stackelberg game, the optimization objective **P**_{1} is rewritten as:

$$ \left\{ \begin{array}{l} \text{For } h = 1, \ldots, H\\ \mathop {\max }\limits_{{A_{h}}} {U_{h}}\left({{A_{h}},{A_{- h}},{A_{j}}} \right), \quad h \in {\mathcal{H}},\ {A_{h}} \in {{\mathcal{A}}_{h}}\\ \text{The best-response solution: } a_{1,1}^{*}, \ldots, a_{H,{{\mathcal{M}}_{H}}}^{*}\\ \left\{ \begin{array}{l} \mathop {\max }\limits_{{a_{h,k}}} {u_{h,k}}\left({{a_{h,k}},{a_{- \left({h,k} \right)}}} \right),\\ {a_{h,k}} \in {A_{h}} \end{array} \right. \end{array} \right. $$

(7)

Note that there are multiple coupling relationships in the clustering network, as well as competitive relationships between the clusters and the malicious jammer. The above game formulation is therefore hard to solve directly with traditional optimization approaches. In this paper, the game is decomposed into two sub-games at two granularities: the multi-leader (cluster head) anti-jamming spectrum sharing sub-game at the coarse granularity, and the multi-follower (cluster member) distributed channel access sub-game at the fine granularity.

### 4.2 Multi-leader sub-game from the coarse granularity

The objective of the multi-leader sub-game is to analyze and model the competitive interactions between different cluster heads and the jammer. Generally, this sub-game is formulated as:

$$ {{\mathcal{G}}_{\mathrm{{ML}}}} = \left\{ {{\mathcal{H}},{{\left\{ {{\mathrm{{CH}}_{h}}} \right\}}_{h \in {\mathcal{H}}}},{\mathcal{A}},{\mathcal{J}},{{\left\{ {{{\mathcal{A}}_{h}}} \right\}}_{h \in {\mathcal{H}}}},{{\left\{ {{U_{h}}} \right\}}_{h \in {\mathcal{H}}}},{{\mathcal{A}}_{j}}} \right\}, $$

(8)

where \(\mathrm{CH}_{h}\) is the cluster head of cluster *h*, \(A_{h} \subseteq {{\mathcal {A}}_{h}}\) is one of the band selection strategies of cluster *h*, \(A_{j} \subseteq {{\mathcal {A}}_{j}}\) is one of the jamming strategies, and *U*_{h} is the utility function of the corresponding cluster head, which is expressed as follows:

$$ {U_{h}} = \left| {\mathcal{A}} \right| - \sum\limits_{e \in {{{A}}_{h}}} {\delta \left({e,{{ A}_{j}}} \right)}, $$

(9)

where \(e \in A_{h}\) is a specific channel in spectrum band \(A_{h}\), and \(\left | {\mathcal {A}} \right |\) is the cardinality of channel set \(\mathcal {A}\). Moreover, the jamming indicator function \(\delta \left ({e,A_{j}} \right)\) is expressed as follows:

$$ \delta \left({e,{{ A}_{j}}} \right) = \left\{ \begin{array}{l} 1,{\mathrm{ }}e \in {{ A}_{j}},A_{j} \subseteq {{\mathcal{A}}_{j}},\\ 0,{\mathrm{ }}\text{otherwise}. \end{array} \right. $$

(10)

Equation 9 means that the utility of cluster head *h* is the difference between the cardinality of channel set \(\mathcal {A}\) and the jamming degree of the current spectrum band strategy *A*_{h}, that is, the number of its channels that are jammed according to Eq. 10. Hence, the optimization objective of the multi-leader sub-game is:

$$ {\mathbf{P}_{2}}:{\mathrm{ }}\left({{{A}}_{1}^ *, \cdots {{A}}_{H}^ *} \right) = \arg \max \sum\limits_{h \in {\mathcal{H}}} {{U_{h}}}. $$

(11)

The above optimization objective means that the cluster heads try to find the spectrum bands that minimize the cumulative jamming degree. Next, we define the Nash equilibrium (NE) and prove that at least one pure strategy NE exists in the multi-leader sub-game.
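As an illustration of Eqs. 9 to 11, the following minimal Python sketch evaluates the jamming indicator, the cluster-head utility and the summed objective of **P**_{2}. The 8-channel network, the jammed band and the function names are hypothetical values chosen for the example and do not come from the paper.

```python
from typing import FrozenSet, Iterable


def delta(e: int, jam_band: FrozenSet[int]) -> int:
    """Jamming indicator of Eq. 10: 1 if channel e is jammed, otherwise 0."""
    return 1 if e in jam_band else 0


def cluster_utility(band: FrozenSet[int], jam_band: FrozenSet[int],
                    all_channels: FrozenSet[int]) -> int:
    """Utility of Eq. 9: |A| minus the number of jammed channels in the band."""
    return len(all_channels) - sum(delta(e, jam_band) for e in band)


def total_utility(bands: Iterable[FrozenSet[int]], jam_band: FrozenSet[int],
                  all_channels: FrozenSet[int]) -> int:
    """Objective of P2 (Eq. 11): sum of the cluster-head utilities."""
    return sum(cluster_utility(b, jam_band, all_channels) for b in bands)


# Hypothetical example values (assumptions, not taken from the paper).
all_channels = frozenset(range(1, 9))   # channel set A with 8 channels
jam_band = frozenset({3, 4, 5})         # one jamming strategy A_j
clean_band = all_channels - jam_band    # band that avoids every jammed channel
mixed_band = frozenset({1, 2, 3, 4})    # band that overlaps two jammed channels

print(cluster_utility(clean_band, jam_band, all_channels))  # 8
print(cluster_utility(mixed_band, jam_band, all_channels))  # 6
```

The band that avoids all jammed channels reaches the upper bound \(\left | {\mathcal {A}} \right |\) of the utility, while each jammed channel in a band lowers the utility by one.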

###
**Definition 2**

The strategy combination \(\left ({A_{1}^{*}, \cdots, A_{H}^{*}} \right)\) is a pure strategy NE of the game if and only if no cluster head can improve its utility by unilaterally changing its spectrum band selection strategy [34]:

$$ {U_{h}}\left({A_{h}^{*},{A_{j}}} \right) \ge {U_{h}}\left({{A_{h}},{A_{j}}} \right),\forall h \in {\mathcal{H}},\forall {A_{h}} \in {{\mathcal{A}}_{h}}\backslash A_{h}^{*}. $$

(12)

The above equation shows that, for a specific jamming strategy *A*_{j}, each cluster head has unilaterally optimized its own utility at the pure strategy NE, which in turn improves the total network utility.

###
**Theorem 1**

For a specific jamming strategy *A*_{j}, there exists at least one pure strategy NE \(\left ({{{A}}_{1}^ *, \cdots {{A}}_{H}^ *} \right)\) with the following conditions:

$$ A_{h}^{*} = {\mathcal{A}}\backslash {A_{j}},\forall h \in {\mathcal{H}},{A_{j}} \in {{\mathcal{A}}_{j}},\exists A_{h}^{*} \in {{\mathcal{A}}_{h}}. $$

(13)

Equation 13 shows that the NE strategy of each cluster head changes with the jamming strategy.

###
*Proof*

The theorem is proved by contradiction [29]. Assume that the strategy combination \(\left ({A_{1}^{*}, \cdots, A_{H}^{*}} \right)\) given by Eq. 13 is not an NE, which means that there exist a cluster head \(h \in {\mathcal {H}}\) and a strategy \(A_{h} \ne A_{h}^{*}\) that satisfy \({U_{h}}\left ({A_{h}^{*},{A_{j}}} \right) < {U_{h}}\left ({{A_{h}},{A_{j}}} \right)\). Since \(A_{h}^{*} = {\mathcal {A}}\backslash {A_{j}}\) contains no jammed channel, \(\sum \limits _{e \in A_{h}^{*}} {\delta \left ({e,A_{j}} \right)} = 0\) and \({U_{h}}\left ({A_{h}^{*},{A_{j}}} \right) = \left | {\mathcal {A}} \right |\), which is the largest value that Eq. 9 can take. A unilateral change from \(A_{h}^{*}\) to *A*_{h} that adds channels can only add jammed ones, in which case \(\sum \limits _{e \in A_{h}} {\delta \left ({e,A_{j}} \right)}\) increases and *U*_{h} decreases correspondingly. Hence, the unilateral change cannot satisfy \({U_{h}}\left ({A_{h}^{*},{A_{j}}} \right) < {U_{h}}\left ({{A_{h}},{A_{j}}} \right)\), which contradicts the assumption. This completes the proof. □
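Theorem 1 can also be checked numerically on a small instance. The sketch below is a brute-force illustration only, under the assumption that a cluster head may choose any non-empty subset of \(\mathcal {A}\) as its band; the 6-channel network and the helper names are hypothetical. Because the utility in Eq. 9 depends only on the cluster head's own band and on *A*_{j}, checking the best response of a single cluster head is sufficient.

```python
from itertools import combinations


def utility(band, jam_band, n_channels):
    """Eq. 9: total number of channels minus the jammed channels in the band."""
    return n_channels - len(set(band) & set(jam_band))


def all_bands(channels):
    """Enumerate every non-empty subset of the channel set (assumed strategy space)."""
    chans = sorted(channels)
    for r in range(1, len(chans) + 1):
        for combo in combinations(chans, r):
            yield frozenset(combo)


def is_best_response(candidate, jam_band, channels):
    """True if no alternative band gives a strictly higher utility (Eq. 12)."""
    best = utility(candidate, jam_band, len(channels))
    return all(utility(b, jam_band, len(channels)) <= best
               for b in all_bands(channels))


# Hypothetical example values (assumptions, not taken from the paper).
channels = frozenset(range(1, 7))
jam_band = frozenset({2, 3})
ne_band = channels - jam_band            # A_h^* = A \ A_j, as in Eq. 13

print(is_best_response(ne_band, jam_band, channels))             # True
print(is_best_response(frozenset({1, 2}), jam_band, channels))   # False
```

The band from Eq. 13 passes the check, whereas a band that contains a jammed channel fails it, since dropping the jammed channel strictly increases the utility.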

### 4.3 Multi-follower sub-game from the fine granularity

The objective of the multi-follower sub-game is to analyze and model the intra-cluster and inter-cluster competitive interactions among cluster users. Note that the multi-leader sub-game guides the coarse-grained spectrum band selection of the cluster heads, whereas the multi-follower sub-game guides the fine-grained channel access of the users. Mathematically, the multi-follower sub-game is expressed as follows:

$$ {{\mathcal{G}}_{\mathrm{{MF}}}} = \left\{ {{\mathcal{H}},{{\left\{ {{{\mathcal{M}}_{h}}} \right\}}_{h \in {\mathcal{H}}}},{{\left\{ {{{\mathcal{A}}_{h}}} \right\}}_{h \in {\mathcal{H}}}},{{\left\{ {{u_{h,k}}} \right\}}_{h \in {\mathcal{H}},k \in {{\mathcal{M}}_{h}}}}} \right\}, $$

(14)

where *u*_{h,k}=*R*_{h,k} denotes the throughput of user *k* in cluster *h*. Once the multi-leader sub-game has reached the NE for jamming strategy *A*_{j}, the jammed channels are already excluded, so only interference coordination needs to be considered in the multi-follower sub-game.

Thus, the objective of the fine-grained multi-follower sub-game is:

$$ {\mathbf{P}_{3}}:{\mathrm{ }}\left\{ {\left({a_{1,1}^ *, \cdots a_{1,{{\mathcal{M}}_{1}}}^ *} \right)...\left({a_{H,1}^ *, \cdots a_{H,{{\mathcal{M}}_{H}}}^ *} \right)} \right\} = \arg \max \sum\limits_{h \in {\mathcal{H}}} {\sum\limits_{k \in {{\mathcal{M}}_{h}}} {{u_{h,k}}} }. $$

(15)

The above optimization objective means that each user tries to find the optimal channel access strategy that maximizes the cumulative throughput of the clustering network.

###
**Theorem 2**

For a specific NE strategy \(\left ({A_{1}^{*}, \cdots, A_{H}^{*}} \right)\) of the leaders when the jammer attacks band *A*_{j}, there exists at least one pure strategy NE \(\left \{ {\left ({a_{1,1}^{*}, \cdots, a_{1,{{\mathcal {M}}_{1}}}^{*}} \right), \ldots, \left ({a_{H,1}^{*}, \cdots, a_{H,{{\mathcal {M}}_{H}}}^{*}} \right)} \right \}\) that satisfies the following conditions:

$$ \left\{ \begin{array}{l} \left\{ {\left({a_{1,1}^{*}, \cdots a_{1,{M_{1}}}^{*}} \right)...\left({a_{H,1}^{*}, \cdots a_{H,{M_{H}}}^{*}} \right)} \right\} = \mathop \cup \limits_{h \in {\mathcal{H}}} A_{h}^ * \cup \left\{ 0 \right\};\\ {\mathcal{X}} = \left\{ {\left. x \right|\forall x \in {{\mathcal{M}}_{h}},m \in {{\left\{ {{{\mathcal{M}}_{h}}} \right\}}_{h \in {\mathcal{H}}}}\backslash x,a_{x}^ * \ne a_{m}^ * {\mathrm{ }}~if~{\mathrm{ }}{{\mathrm{P}}_{m}}{\mathrm{d}}_{m \to x}^{- \alpha} > {\tau_{{\text{thres}}}}} \right\};\\ \forall x \in {\mathcal{X}},a_{x}^ * \notin {A_{j}}. \end{array} \right. $$

(16)

Here, *a*_{h,k}=0 means that the user keeps silent, which corresponds to strategy 0. The first line of Eq. 16 means that each user either selects one channel from the optimal spectrum band of the cluster it belongs to or keeps silent. The second line states that neighboring users choose different channels when the NE is reached. The third line states that users avoid the jammed channels.

###
*Proof*

Similarly, the multi-follower sub-game is treated by contradiction. Assume that the strategy combination \(\left \{ {\left ({a_{1,1}^{*}, \cdots, a_{1,{{\mathcal {M}}_{1}}}^{*}} \right), \ldots, \left ({a_{H,1}^{*}, \cdots, a_{H,{{\mathcal {M}}_{H}}}^{*}} \right)}\right \}\) is not an NE of the multi-follower sub-game, which means that \(\exists h \in {\mathcal {H}},\exists k \in {{\mathcal {M}}_{h}}\) that satisfies \({u_{h,k}}\left ({a_{h,k}^{*},a_{- \left ({h,k} \right)}^{*},{A_{j}}} \right) < {u_{h,k}}\left ({{a_{h,k}},{a_{- \left ({h,k} \right)}},{A_{j}}} \right)\). The throughput of user *k* in cluster *h* can be rewritten as:

$$ {u_{h,k}}\left({a_{h,k}^{*},a_{- \left({h,k} \right)}^{*},{A_{j}}} \right) = \left\{ \begin{array}{ll} {B_{a_{h,k}^{*}}}{\log_{2}}\left({1 + \frac{{{P_{h,k}}d_{h,k}^{- \alpha }}}{{{N_{a_{h,k}^{*}}}}}} \right), & \text{if only user } k \text{ in cluster } h \text{ selects } a_{h,k}^{*},\\ 0, & \text{otherwise}. \end{array} \right. $$

(17)

However, a unilateral change from channel selection \(a_{h,k}^{*}\) to *a*_{h,k} drives the throughput of the current user down to zero, because the conflict degree increases when the equilibrium state is broken. Thus, \({u_{h,k}}\left ({a_{h,k}^{*},a_{- \left ({h,k} \right)}^{*},{A_{j}}} \right) < {u_{h,k}}\left ({{a_{h,k}},{a_{- \left ({h,k} \right)}},{A_{j}}} \right)\) cannot be satisfied, and the above hypothesis is not valid. To sum up, the conclusion of Theorem 2 holds. □
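The argument of this proof can be illustrated with a small numerical sketch of Eq. 17. It relies on simplifying assumptions that are not part of the paper: every pair of users counts as neighbors (so the distance threshold of Eq. 16 is ignored), all channels share the same bandwidth and noise, and the link parameters and user labels are hypothetical.

```python
import math


def throughput(user, profile, bandwidth, power, dist, noise, alpha=2.0):
    """Eq. 17: Shannon rate if the user is alone on its channel, otherwise 0."""
    ch = profile[user]
    if ch == 0:                      # strategy 0: the user keeps silent
        return 0.0
    if any(v != user and c == ch for v, c in profile.items()):
        return 0.0                   # collision with another user on the channel
    snr = power * dist ** (-alpha) / noise
    return bandwidth * math.log2(1.0 + snr)


def has_profitable_deviation(profile, band, **link):
    """True if some user can strictly raise its throughput by deviating alone."""
    for user in profile:
        base = throughput(user, profile, **link)
        for ch in set(band) | {0}:
            trial = dict(profile)
            trial[user] = ch
            if throughput(user, trial, **link) > base:
                return True
    return False


# Hypothetical link parameters and band (assumptions, not taken from the paper).
link = dict(bandwidth=1e6, power=0.1, dist=100.0, noise=1e-9)
band = {1, 2, 6}                                       # unjammed channels
ne_profile = {("h1", 1): 1, ("h1", 2): 2, ("h2", 1): 6}
bad_profile = {("h1", 1): 1, ("h1", 2): 1, ("h2", 1): 6}

print(has_profitable_deviation(ne_profile, band, **link))   # False
print(has_profitable_deviation(bad_profile, band, **link))  # True
```

In the collision-free profile every deviation either causes a collision or silences the user, so no user gains by deviating; in the colliding profile a user can move to the free channel and obtain a positive rate.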

###
**Theorem 3**

For a specific jamming strategy *A*_{j}, there exists at least one SE in the proposed multi-leader multi-follower Stackelberg game.

###
*Proof*

In the above subsections, we proved that there exists at least one NE strategy combination \(\left ({A_{1}^{*}, \cdots, A_{H}^{*}} \right)\) for the leaders’ sub-game, which maximizes the cumulative utility of all cluster heads by choosing spectrum bands that are not jammed by the malicious jammer. In response to the leaders’ strategies, the analysis of the follower sub-game shows that users can reach at least one NE strategy combination \(\left \{ {\left ({a_{1,1}^{*}, \cdots, a_{1,{{\mathcal {M}}_{1}}}^{*}} \right), \ldots, \left ({a_{H,1}^{*}, \cdots, a_{H,{{\mathcal {M}}_{H}}}^{*}} \right)}\right \}\), which maximizes the network throughput of the cluster users. Combining Theorem 1, Theorem 2 and the definition of SE, we conclude that the proposed game has at least one SE. □

Figure 3 shows the process for obtaining the SE of the multi-leader multi-follower Stackelberg game. After observing the jamming strategy *A*_{j}, the cluster heads (leaders) first adjust their spectrum band strategies to avoid the jamming attacks. Then, cluster users (followers) access channels in a distributed manner after obtaining the spectrum band information sent by their cluster heads. After that, cluster heads continue adjusting their band strategies until they reach the NE of the leader sub-game. When cluster users obtain the NE band strategy of their cluster heads, they adjust their channel access strategies until they reach the best-response strategy (which is also the NE of the follower sub-game). Through this continuous adjustment of cluster heads and cluster users, the system gradually reaches the SE for the current jamming strategy *A*_{j}. As the jamming attacks change dynamically, different SEs can be achieved via the Stackelberg game iterations.
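The leader-then-follower iteration described above can be sketched in a few lines of Python. This is a simplified illustration and not the algorithm of Fig. 3: the leader step directly applies the band of Eq. 13, the follower step is a plain sequential best-response loop in which all users are assumed to be neighbors, and the channel numbering, user labels and jamming sequence are hypothetical.

```python
def leader_step(channels, jam_band):
    """Leader NE band from Eq. 13: every channel that is not jammed."""
    return set(channels) - set(jam_band)


def follower_step(users, band, profile=None, max_rounds=20):
    """Sequential best response: each user keeps a clean, collision-free
    channel, otherwise moves to a free channel in the band, or stays silent (0)."""
    profile = dict(profile) if profile else {u: 0 for u in users}
    for _ in range(max_rounds):
        changed = False
        for u in users:
            taken = {c for v, c in profile.items() if v != u and c != 0}
            if profile[u] in band and profile[u] not in taken:
                continue                      # current choice is already fine
            free = sorted(band - taken)
            new_choice = free[0] if free else 0
            if new_choice != profile[u]:
                profile[u] = new_choice
                changed = True
        if not changed:                       # no user wants to move: NE reached
            break
    return profile


# Hypothetical scenario (assumptions, not taken from the paper).
channels = range(1, 7)                        # channels 1..6, 0 means silent
users = ["a", "b", "c", "d"]
jam_sequence = [{1, 2}, {4, 5, 6}]            # jammer changes its strategy A_j

profile = None
history = []
for jam_band in jam_sequence:
    band = leader_step(channels, jam_band)            # leaders avoid jamming
    profile = follower_step(users, band, profile)     # followers coordinate
    history.append(dict(profile))
    print(sorted(band), profile)
```

In the second round only three clean channels remain for four users, so one user stays silent, which corresponds to strategy 0 in Eq. 16.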