Secure relay selection based on learning with negative externality in wireless networks


In this paper, we formulate relay selection as a Chinese restaurant game. A secure relay selection strategy is proposed for a wireless network in which multiple source nodes send messages to their destination nodes via several relay nodes that differ in processing and transmission capabilities as well as in security properties. The relay selection uses a learning-based algorithm that lets the source nodes reach their best responses in the Chinese restaurant game. In particular, the relay selection takes into account the negative externality of relay sharing among the source nodes, which learn the capabilities and security properties of the relay nodes from the current signals and the signal history. Simulation results show that this strategy improves the user utility and the overall security performance in wireless networks. In addition, the relay strategy is robust against signal errors and against deviations of some users from the desired actions.

1. Introduction

Relay selection has been recognized as a critical issue for both cooperative communications [13] and multi-hop wireless networks. Efficient and secure relay selection in wireless networks has to overcome various technical challenges at different levels, such as estimating the channel state of the relay nodes and detecting attacks [4, 5]. For example, source nodes have to avoid relay nodes that launch packet-dropping attacks, i.e., that deliberately drop some messages and never forward them to the destination [4]. In the presence of multiple potential relay nodes in the coverage area, a user should choose a relay node that provides a high secure data rate, with good radio propagation conditions and high transmit power. On the other hand, because the transmission and processing capability of a relay node is limited, each user served by a relay achieves less utility as that relay simultaneously serves more users.

To this end, game theory is a powerful mathematical tool that provides a formal analytical framework for studying the complex interactions among source nodes and relay nodes with different serving properties in wireless networks. In particular, the Chinese restaurant game (CRG), initially inspired by the Chinese restaurant process, is a promising tool to address the negative externality in relay selection: each player makes a decision sequentially, based on received signals that reflect the state of the tables in a Chinese restaurant, and avoids choosing a crowded table [6]. The Chinese restaurant game model has been used to address emerging problems in wireless communications, especially cooperative spectrum access [7] and spectrum sharing in cognitive radio networks [8].

In this paper, we consider a wireless network with multiple source nodes, or users, which aim to send their messages to the destination nodes. There are multiple potential relay nodes with different transmission capabilities, owing to their radio channel conditions, transmit power, and processing speed, as well as different security properties. For instance, some nodes might deliberately drop some relay messages or leak them, resulting in a privacy loss for the source nodes. By formulating the secure relay selection process as a sequential Chinese restaurant game, we propose a learning-based relay selection strategy to improve the secure end-to-end data rate in wireless networks. This scheme captures the characteristics of relay nodes at different levels, including their security properties, buffer sizes, transmission capacities, and processing speeds, as well as the number of users they currently serve. Users estimate the relay state by learning from the history and the current signals that reflect the relay properties. Each user then chooses the relay node that maximizes its own expected secure data rate.

Our contributions can be summarized as follows:

  1. The CRG-based relay selection strategy takes into account the relay security and avoids choosing a crowded relay and thus can improve the user utility.

  2. By exploiting the previous signals received by the neighboring nodes on the relay properties, the CRG-based strategy can provide some degree of robustness against the signal error. Moreover, this strategy is also robust against possible irrational decisions or deviations from the proposed schemes. In other words, even when some users deviate from this strategy, the other users can still benefit from following the scheme in the long term.

1.1 Related work

Many interesting works have investigated how a single source node selects a relay in cooperative wireless communications according to the radio channel information, such as the channel state information (CSI) [1, 3], the parameters in the Nakagami channel model [2], and the finite-state Markov channel model [9]. In [10], a cooperative relay transmission strategy was proposed over multiple potential relays. The relay selection can be formulated using an optimization model based on the constrained Markov decision process [11]. In [12], a cooperative relay diversity protocol was designed to increase the coverage area in wireless networks. In addition, it is shown in [13, 14] that node cooperation with known CSI in wireless networks can improve the user secrecy capacity.

In wireless networks with multiple users that simultaneously transmit messages, the work [4] provides a distributed relay selection strategy that applies the Stackelberg game to reduce the overall power consumption. Yu and Ray Liu proposed reputation-based, cheat-proof, and attack-resistant cooperation stimulation strategies to improve the security performance in autonomous mobile ad hoc networks [5]. In [15], an indirect reciprocity principle was applied to improve the performance of a large-scale mobile network, and the stability condition was investigated in [16]. In order to improve the communication efficiency, a min-max coalition-proof channel allocation scheme was proposed in [17] for multi-hop wireless networks.

The remainder of the paper is organized as follows: We describe the network model in Section 2 and formulate it into a Chinese restaurant game in Section 3. We propose the secure relay selection scheme based on the Chinese restaurant game in Section 4. Next, in Section 5, we present the simulation results to evaluate its performance. Finally, a short conclusion is drawn in Section 6.

2. Network model

We consider a typical wireless network as shown in Figure 1, which consists of C source nodes or users, K relay nodes, and a common destination node. Each source node has to deliver a message to the destination with the help of a relay node. For simplicity, we assume a two-hop wireless network, where the destination node is out of the coverage area of the source nodes but can be reached by the relay nodes.

Figure 1. System model of relay selection in a wireless network with C users and K relay nodes.

The message transmission process consists of two stages: (1) the C source nodes send messages to the relay nodes in sequence, and (2) the relay nodes amplify and forward the messages to the destination node. This work can be extended straightforwardly to the cooperative communication scenarios in single-hop networks, where both the source and relay nodes transmit cooperatively during the second stage.

Without loss of generality, we assume that the relay nodes have different buffer sizes, security properties, and transmission capabilities due to various transmit power and radio channel states and thus provide different service qualities to the users. For instance, a relay node performing the packet dropping attack deliberately drops some messages and thus reduces the user's end-to-end data rate. In addition, a relay node with serious propagation fading or low transmit power provides lower transmission rates to the users.

Therefore, we classify the service quality of a relay node into $Q$ levels, where 1 is the worst and $Q$ is assigned to the most powerful relay. Let $R(k,w) \in \{1, 2, \ldots, Q\}$ denote the total secure throughput of relay $k$, where the second parameter $w \in \{1, 2, \ldots, W\}$ is the relay (or network) state and is unknown to the users. As some relay nodes have the same transmission capability, the number of relay states, denoted by $W$, is usually much smaller than $Q^K$.

In the first stage of the transmission, the $C$ users choose sequentially from the $K$ relay nodes, based on the relay state learnt from the history and the current signal. The latter users cannot change the relay selection decisions of the former users. In general, User $i$ has a better understanding of the relay state than the former users because it observes more signals. Once the messages from the source nodes are received, the relay nodes forward them to the destination. The source nodes choosing the same relay node use time rotation to share the transmission and processing capability of the relay. Thus, a crowded relay degrades the end-to-end data rate of each user it serves.

3. Game formulation

The Chinese restaurant game is a dynamic game in which players have knowledge of both the decisions of the former players and the table state in a Chinese restaurant [6, 7]. We study relay selection in a two-hop wireless network with a CRG model, where the players are the $C$ source nodes and the tables are the $K$ relay nodes. The action set in this model is $A = \{1, 2, \ldots, K\}$, where an action is the relay node that a player selects to deliver its message to the destination; players act in sequence. The players are assumed to be rational and choose actions to maximize their own utilities, which correspond to their secure data rates to the destination node. Throughout this paper, we use the terms users, source nodes, and players interchangeably.

Each player is assumed to receive a signal on the qualities of the $K$ relay nodes, together with the signal history of the previous users. Without loss of generality, we take User $i$ as an example, with $1 \le i \le C$. In such a game, User $i$ obtains from a control channel a signal on the relay state, denoted by $s_i \in \{1, 2, \ldots, W\}$, and the signal history $h_i = \{s_1, s_2, \ldots, s_{i-1}\}$, which contains the signals revealed to the previous $i-1$ users. Note that the signals mentioned in this paper inform users about the relay states, instead of being the messages sent to the destination.

The signals are in general imperfect. Let $\Pr(s_i \mid w)$ represent the probability that the signal to User $i$ is $s_i$, given the true relay state $w$. For simplicity, we model the signals with the following Bernoulli distribution:

$$
\Pr(s_i \mid w) =
\begin{cases}
p, & \text{if } s_i = w \\
1 - p, & \text{o.w.,}
\end{cases}
\tag{1}
$$

where $p$ indicates the signal accuracy. Note that this work is not limited to the Bernoulli model in Equation 1 and can easily be extended to other signal models. The prior distribution of the relay state is given by $g_0 = \{g(0,1), g(0,2), \ldots, g(0,W)\}$, where $g(0,q) = \Pr(w = q)$ is the prior probability that the relay state is $q$, which is known by all the users.
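As a rough illustration of the signal model in Equation 1, the following Python sketch draws imperfect signals for a two-state network. The function names, the fixed seed, and the value p = 0.9 are illustrative assumptions rather than anything specified in the paper; the sketch also assumes W = 2, the case in which p and 1 − p sum to one.

```python
import random


def signal_likelihood(s, w, p):
    """Pr(s | w) under Equation 1: p if the signal matches the state, else 1 - p."""
    return p if s == w else 1.0 - p


def draw_signal(w, p, rng):
    """Draw one signal for true state w in {1, 2} (assumes W = 2)."""
    other = 2 if w == 1 else 1
    return w if rng.random() < p else other


# Illustrative run: true state 1, assumed accuracy p = 0.9.
rng = random.Random(0)
signals = [draw_signal(1, 0.9, rng) for _ in range(10000)]
accuracy = signals.count(1) / len(signals)  # empirical fraction of correct signals
```

With enough samples, the empirical fraction of correct signals should be close to p.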

In this system, the users choosing the same relay node apply time rotation to share the processing and transmission resources. The goal of each user is to maximize its own secure data rate. Thus, we define the utility of User $i$ taking action $k$, denoted by $U_{i,k}$, as follows:

$$
U_{i,k} = U_i\big(R(k,w), N_k\big) = \frac{R(k,w)}{N_k},
\tag{2}
$$

where $N_k$ is the number of users selecting Relay $k$ at the end of the game. User $i$ takes the action in a deterministic manner, and his best response, denoted by $r_i$, is given by $r_i = \arg\max_k U_{i,k} = \arg\max_k R(k,w)/N_k$.
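The negative externality in Equation 2 can be made concrete with a small Python sketch that computes each user's utility from a given set of final relay choices. The helper name `utilities` and the particular choice vector are hypothetical; the throughput values 35 and 14 are borrowed from the simulation setup in Section 5.

```python
from collections import Counter


def utilities(choices, R, w):
    """Return each user's utility R(k, w) / N_k given the final relay choices."""
    load = Counter(choices)  # N_k: number of users on each relay
    return [R[k][w] / load[k] for k in choices]


# R[k][w]: total secure throughput of relay k in state w (values from Section 5).
R = {1: {1: 35.0, 2: 14.0}, 2: {1: 14.0, 2: 35.0}}
choices = [1, 1, 2, 1, 1, 2, 1]  # hypothetical final choices of seven users
u = utilities(choices, R, w=1)
```

In this hypothetical split, five users share the stronger relay and two share the weaker one, so every user obtains the same utility of 7: crowding erases the advantage of the better relay.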

We will present a learning algorithm in the next section to obtain the solution to such an optimization problem. In this way, User $i$ broadcasts his choice and transmits his message to Relay $r_i$. Next, User $i+1$ chooses a relay in a similar way, and the game ends when all the $C$ users have taken actions. For ease of reference, we summarize the commonly used notations in Table 1.

Table 1 Summary of symbols and notations

4. Relay selection algorithm based on CRG

In this section, we present a secure relay selection algorithm for users to choose relay nodes in sequence in wireless networks. This is essentially a learning algorithm that enables users to reach a desirable outcome of the CRG as described in Section 3. Each user makes a decision in three steps: (1) learns the relay state based on the current signal, the signal history, and the actions of the previous users, if there are any, (2) estimates the expected utility, and (3) chooses the relay node that maximizes its own utility.

To give a concrete example of the learning process, we consider User $i$ with $1 \le i \le C$ and present how the message is delivered to the destination node via a relay. In this process, User $i$ exploits its signal $s_i$ and the signal history $h_i = \{s_j\}_{1 \le j < i}$ to estimate the relay state $g(i) = \{g(i,w)\}_{1 \le w \le W}$, i.e., the service qualities of the $K$ relays, where $g(i,w) = \Pr(w \mid h_i, s_i, g_0)$ is the probability that User $i$ believes that the relay state is $w$, and $g_0$ is the prior distribution of the relay state known by the users.

Rational users can apply the Bayesian rule to update their beliefs on the relay state, and the belief of User i is given by

$$
\begin{aligned}
g(i,w) &= \Pr(w \mid h_i, s_i, g_0) = \Pr(w \mid h_{i-1}, s_{i-1}, s_i, g_0) \\
&= \frac{\Pr(w, s_i \mid h_{i-1}, s_{i-1}, g_0)}{\Pr(s_i \mid h_{i-1}, s_{i-1}, g_0)} \\
&= \frac{\Pr(w \mid h_{i-1}, s_{i-1}, g_0)\,\Pr(s_i \mid w)}{\sum_{q'=1}^{W} \Pr(q' \mid h_{i-1}, s_{i-1}, g_0)\,\Pr(s_i \mid q')} \\
&= \frac{g(i-1,w)\,\Pr(s_i \mid w)}{\sum_{q'=1}^{W} g(i-1,q')\,\Pr(s_i \mid q')},
\end{aligned}
\tag{3}
$$

where $\Pr(s_i \mid w)$ is given by Equation 1. Note that $g(i,w)$ provides the service profile of all the $K$ relay nodes in relay state $w$, and the secure throughput of Relay $k$ is $R(k,w)$ in this case.
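The recursive Bayesian update above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `update_belief`, the dictionary representation of the belief, and the values p = 0.8 and the signal sequence are all assumed, and the likelihood is the two-state Bernoulli model of Equation 1.

```python
def update_belief(prior, s, p):
    """One Bayesian step: posterior over relay states after observing signal s.

    prior maps each state w to g(i-1, w); the result maps w to g(i, w).
    """
    # Numerator of the update: g(i-1, w) * Pr(s | w), with Pr from Equation 1.
    unnorm = {w: g * (p if s == w else 1.0 - p) for w, g in prior.items()}
    total = sum(unnorm.values())  # denominator: sum over all candidate states
    return {w: v / total for w, v in unnorm.items()}


# Start from an equiprobable prior and fold in three signals, one per user.
belief = {1: 0.5, 2: 0.5}
for s in [1, 1, 2]:
    belief = update_belief(belief, s, p=0.8)
```

Because the model is symmetric, one signal for state 2 cancels one signal for state 1, so after the sequence 1, 1, 2 the belief in state 1 equals what a single signal for state 1 would give, namely 0.8.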

Users have to avoid the crowded relays because of the negative externality of relay sharing as indicated by Equation 2. Let $M_i = \{M_{i,k}\}_{1 \le k \le K}$ denote the current relay grouping state, where $M_{i,k}$ is the number of users before User $i$ that have chosen Relay $k$. Since each rational user aims at maximizing his expected utility, the action of User $i$ is given by

$$
r_i = \arg\max_{k} \mathrm{E}\big[U_{i,k} \mid M_i, h_i, s_i\big],
\tag{4}
$$

where the expectation is taken over both the relay state and the number of users choosing Relay k after User i.

By Equation 2, $U_{i,k}$ is the utility of User $i$ if the relay state is $w$ and $N_k$ users, including User $i$, choose Relay $k$. By definition, the expected utility that User $i$ can obtain in this case is given by

$$
\begin{aligned}
\mathrm{E}\big[U_{i,k} \mid M_i, h_i, s_i\big]
&= \sum_{q=1}^{W} g(i,q)\,\mathrm{E}\big[U_i\big(R(k,q), N_k\big) \mid M_i, h_i, s_i\big] \\
&= \sum_{q=1}^{W} g(i,q)\,R(k,q)\,\mathrm{E}\left[\frac{1}{N_k} \,\Big|\, M_i, h_i, s_i\right],
\end{aligned}
\tag{5}
$$

where $q$ is the relay state and the expectation in the second line is taken over the number of users choosing Relay $k$ after User $i$. In order to calculate Equation 5, we introduce $n_{i,k}$ to denote the number of users choosing Relay $k$ from User $i$ onward, including User $i$ himself. It is clear that the total number of users on Relay $k$ is $N_k = M_{i,k} + n_{i,k}$, where $M_{i,k}$ and $n_{i,k}$ can be obtained by definition:

$$
M_{i+1,k} =
\begin{cases}
M_{i,k} + 1, & \text{if } r_i = k \\
M_{i,k}, & \text{o.w.,}
\end{cases}
\qquad
n_{i,k} =
\begin{cases}
n_{i+1,k} + 1, & \text{if } r_i = k \\
n_{i+1,k}, & \text{o.w.}
\end{cases}
\tag{6}
$$

According to Equations 4 to 6, we can rewrite Equation 4 as a double summation over the relay state $q$ and the number $y$ of users choosing Relay $k$ from User $i$ onward:

$$
r_i = \arg\max_{k} \sum_{q=1}^{W} \sum_{y=0}^{C-i+1} g(i,q)\,\Pr\big(n_{i,k} = y \mid M_i, h_i, s_i, r_i = k, q\big)\,\frac{R(k,q)}{M_{i,k} + y}.
\tag{7}
$$

The solution to Equation 7 depends on the distribution of $n_{i,k}$, which can be derived by the following recursive method:

$$
\begin{aligned}
&\Pr\big(n_{i,k} = y \mid M_i, h_i, s_i, r_i, w = q\big) \\
&= \begin{cases}
\Pr\big(n_{i+1,k} = y-1 \mid M_i, h_i, s_i, r_i, w = q\big), & \text{if } r_i = k \\
\Pr\big(n_{i+1,k} = y \mid M_i, h_i, s_i, r_i, w = q\big), & \text{o.w.}
\end{cases} \\
&= \begin{cases}
\sum_{z=1}^{K} \sum_{l=1}^{W} \Pr\big(n_{i+1,k} = y-1 \mid M_{i+1}, h_{i+1}, s_{i+1} = l, r_{i+1} = z, w = q\big)\,\Pr\big(s_{i+1} = l \mid w = q\big), & \text{if } r_i = k \\
\sum_{z=1}^{K} \sum_{l=1}^{W} \Pr\big(n_{i+1,k} = y \mid M_{i+1}, h_{i+1}, s_{i+1} = l, r_{i+1} = z, w = q\big)\,\Pr\big(s_{i+1} = l \mid w = q\big), & \text{o.w.,}
\end{cases}
\end{aligned}
\tag{8}
$$

where the history for User $i+1$ is $h_{i+1} = \{h_i, s_i\}$ and $M_{i+1}$ is the grouping result before User $i+1$ given by Equation 6. The second line in Equation 8 considers both the signal at time $i+1$, $s_{i+1}$, and the corresponding relay selection, $r_{i+1}$.

As the last user knows the decisions of all the other users, User $C$ can easily calculate the distribution of $n_{C,k}$ based on his own choice $r_C$, i.e.,

$$
\Pr\big(n_{C,k} = 1 \mid M_C, h_C, s_C, r_C, w\big) =
\begin{cases}
1, & \text{if } r_C = k \\
0, & \text{o.w.}
\end{cases}
\tag{9}
$$

The total number of iterations to address Equation 8 depends on C. For example, if there are C = 7 users in the area, it takes seven iterations to calculate the conditional probabilities in Equation 8.

In this algorithm, User $i$ chooses a relay node based on Equations 3 to 9, transmits the message to Relay $r_i$, and broadcasts his decision to the other users in the neighborhood. Users learn the relay state according to the current signals and the signal history, and predict the behaviors of the following users based on the current relay selection results. This algorithm can be used for radio users to choose relay nodes in wireless networks such as cognitive radio networks and sensor networks.
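The backward recursion behind Equations 3 to 9 can be sketched as follows. This is a simplified illustration under several assumptions, not the authors' implementation: instead of tracking the per-relay count n(i,k), it tracks the joint distribution of the final loads on all relays, which yields the same expectation of 1/N_k; relays and states are indexed from 0; ties are broken toward the lowest relay index; the accuracy P = 0.9 is an assumed value; and the throughput table and the sizes C = 7, K = 2, W = 2 follow the setup in Section 5. All function names are hypothetical.

```python
from functools import lru_cache

C, K, W = 7, 2, 2            # users, relays, relay states (Section 5 setup)
P = 0.9                      # signal accuracy (assumed value)
R = ((35.0, 14.0),           # R[k][q]: throughput of relay k in state q
     (14.0, 35.0))
PRIOR = (0.5, 0.5)           # equiprobable relay states


def lik(s, q):
    """Pr(s | w = q) from Equation 1 (a proper distribution only for W = 2)."""
    return P if s == q else 1.0 - P


def belief(signals):
    """Posterior over relay states given all signals seen so far (Equation 3)."""
    g = list(PRIOR)
    for s in signals:
        g = [g[q] * lik(s, q) for q in range(W)]
    total = sum(g)
    return [x / total for x in g]


def add(M, k):
    """Return the load vector M with one more user on relay k."""
    return M[:k] + (M[k] + 1,) + M[k + 1:]


@lru_cache(maxsize=None)
def final_loads(i, M, hist, q):
    """Distribution of the final loads N when users i..C still have to act.

    M: loads before user i; hist: signals s_1..s_{i-1}; q: true relay state.
    Returns a tuple of (N, probability) pairs.
    """
    if i > C:
        return ((M, 1.0),)   # nobody left to act: the loads are final
    out = {}
    for s in range(W):       # average over the signal that user i may receive
        r = best_response(i, M, hist, s)
        for N, pr in final_loads(i + 1, add(M, r), hist + (s,), q):
            out[N] = out.get(N, 0.0) + lik(s, q) * pr
    return tuple(out.items())


@lru_cache(maxsize=None)
def best_response(i, M, hist, s):
    """Relay that maximizes user i's expected utility (Equations 4 and 7)."""
    g = belief(hist + (s,))

    def expected_utility(k):
        return sum(g[q] * pr * R[k][q] / N[k]
                   for q in range(W)
                   for N, pr in final_loads(i + 1, add(M, k), hist + (s,), q))

    return max(range(K), key=expected_utility)  # ties go to the lowest index


first_choice = best_response(1, (0,) * K, (), 0)   # User 1 after seeing signal 0
outcome = dict(final_loads(1, (0,) * K, (), 0))    # final loads if the state is 0
```

Memoization keeps the recursion tractable for small networks; the number of cached states grows exponentially with the number of users because each distinct signal history is a separate state, which is consistent with the observation that the number of iterations depends on C.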

5. Simulation results

We performed simulations to evaluate the performance of the CRG-based relay selection scheme for a wireless network, which consisted of seven users and two relays that forwarded the users' messages to the destination node. The average signal-to-noise ratio of each relay node's signal at the destination node was 15 dB, and the overall bandwidth of a relay node was 10 MHz.

We considered two situations regarding relay performance and set $W = 2$. Each situation took place with the same probability. In each situation, there was one selfish relay node that dropped 60% of the relay messages to save power. Following straightforward calculation, we can see that in the first situation, $w = 1$, the total utilities that Relays 1 and 2 provided were 35 and 14, respectively. Otherwise, if $w = 2$, the total utilities for Relays 1 and 2 were 14 and 35, respectively. The signals to each user were generated independently according to the signal accuracy denoted by $P_a$.

For comparison, we also evaluated two other relay selection strategies. The simplest is the random relay strategy, where users choose relay nodes randomly and independently, regardless of the signals. The second is the myopic strategy, which is also a signal-based strategy. In this strategy, users choose relays sequentially. Each user aims at maximizing his current utility and ignores the impact of the latter users in the network. More specifically, User $i$ chooses the relay node given by the following:

$$
r_i^{\mathrm{myopic}} = \arg\max_{m} \sum_{w=1}^{W} g(i,w)\,U_i\big(R(m,w), M_{i,m} + 1\big).
\tag{10}
$$

Unlike the CRG-based strategy, the decisions of the latter users are ignored, and users make decisions according to their own signals, the signal history, and the decisions of the former users.
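For comparison with the CRG-based rule, the myopic choice can be sketched in Python as below. The function name `myopic_choice`, the 0-based indexing of relays and states, the tie-breaking toward the lowest index, and the belief values are illustrative assumptions; the throughput table follows the two-relay setup described above.

```python
def myopic_choice(g, M, R):
    """Myopic rule: maximize sum_w g[w] * R[m][w] / (M[m] + 1) over relays m.

    g: belief over relay states; M: current loads; R[m][w]: relay throughput.
    """
    def score(m):
        # Expected throughput of relay m, shared with the users already on it.
        return sum(g[w] * R[m][w] for w in range(len(g))) / (M[m] + 1)

    return max(range(len(M)), key=score)  # ties go to the lowest index


R = [[35.0, 14.0], [14.0, 35.0]]
g = [0.9, 0.1]                         # assumed belief: state 0 is likely
empty = myopic_choice(g, [0, 0], R)    # no one has chosen yet
crowded = myopic_choice(g, [2, 0], R)  # two users already on relay 0
```

With this belief, the myopic user picks relay 0 while it is empty or holds one user, and switches to relay 1 once relay 0 already serves two users, since 32.9/3 is smaller than 16.1.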

Simulation results in Figure 2 show that, compared with the random and the myopic strategies, the CRG-based scheme provides a higher utility because users anticipate the other users' decisions and decide accordingly. The user utility under this strategy changes with the signal accuracy $P_a$. However, the performance is mostly stable if $P_a$ is greater than 0.9, which means that the scheme has some degree of robustness against signal errors. In addition, as shown in Figure 2C, users in the middle of the decision-making queue, such as User 3, usually have lower utilities than the other users. The reason is that User 1 has the freedom to choose any relay, while the last user has the best knowledge of the relay performance and the choices of the other users.

Figure 2. Simulation results for the relay selection in a network with seven users, two relay nodes, and a destination node. One of the relay nodes deliberately drops some relay messages and forwards the remaining 40% of the messages. (A) Average utility of User 1. (B) Average utility of User 7. (C) Average utility of the seven users for the CRG-based relay strategy.

In the second simulation, we consider the CRG-based relay strategy in a scenario similar to the first experiment, except that User 3 deviates from the CRG strategy with a probability denoted by $P_{\mathrm{miss}}$. As shown in Figure 3, the utility of User 4 slightly decreases with the probability $P_{\mathrm{miss}}$. On the other hand, the performance loss is small if $P_{\mathrm{miss}}$ is less than 0.2, indicating that this strategy can provide robustness against the user deviation to some degree.

Figure 3. Average utility of User 4 in the CRG-based relay selection in a network with seven users, two relay nodes, and a destination node, where User 3 deviates from the CRG-based strategy with probability $P_{\mathrm{miss}}$.

6. Conclusion

In this paper, we have investigated the secure relay selection in wireless networks and formulated it with a sequential Chinese restaurant game model that can take into account the security properties, buffer size, transmission strength, and processing ability of relay nodes. We propose a secure relay selection strategy to improve the user utility by avoiding crowded relay nodes. Simulation results show that the proposed scheme can achieve a higher average utility than the other two relay strategies. In addition, this scheme has some degree of robustness against both the signal inaccuracy and user deviation from the given strategy.


References

  1. Vicario JL, Bel A, Lopez-Salcedo JA, Seco G: Opportunistic relay selection with outdated CSI: outage probability and diversity analysis. IEEE T. Wirel. Commun. 2009, 8(6):2872-2876.

  2. Michalopoulos DS, Chatzidiamantis ND, Schober R, Karagiannidis GK: Relay selection with outdated channel estimates in Nakagami-m fading. In Proceedings of the IEEE International Conference on Communications (ICC). Kyoto; 2011.

  3. Soysa M, Suraweera HA, Tellambura C, Garg HK: Partial and opportunistic relay selection with outdated channel estimates. IEEE T. Commun. 2012, 60(3):840-850.

  4. Wang BB, Han Z, Liu KJR: Distributed relay selection and power control for multiuser cooperative communication networks using Stackelberg game. IEEE T. Mobile. Comput. 2009, 8(7):975-990.

  5. Yu W, Liu KJR: Game theoretic analysis of cooperation stimulation and security in autonomous mobile ad hoc networks. IEEE T. Mobile. Comput. 2007, 6(5):459-473.

  6. Wang CY, Chen Y, Liu KJR: Chinese restaurant game. IEEE Signal. Proc. Let. 2012, 19(12):898-901.

  7. Wang CY, Chen Y, Liu KJR: Sequential Chinese restaurant game. IEEE Transactions on Signal Processing 2013, 61(3):571-584.

  8. Zhang BL, Chen Y, Wang CY, Liu KJR: Learning and decision making with negative externality for opportunistic spectrum access. In Proceedings of the IEEE Globecom. Anaheim; 2012.

  9. Wei YF, Yu FR, Song M, Leung VCM: Energy efficient distributed relay selection in wireless cooperative networks with finite state Markov channels. In Proceedings of the IEEE Globecom. Honolulu; 2009.

  10. Beres E, Adve R: Selection cooperation in multi-source cooperative networks. IEEE T. Wirel. Commun. 2008, 7(1):118-127.

  11. Li YF, Wang P, Niyato D, Zhuang WH: A dynamic relay selection scheme for mobile users in wireless relay networks. In Proceedings of the IEEE INFOCOM. Shanghai; 10–15 Apr 2011:256-260.

  12. Sadek AK, Han Z, Liu KJR: Distributed relay-assignment protocols for coverage expansion in wireless cooperative networks. IEEE T. Mobile. Comput. 2010, 9(4):505-515.

  13. Dong L, Han Z, Petropulu AP, Poor HV: Improving wireless physical layer security via cooperating relays. IEEE Trans. Signal. Proc. 2010, 58(3):1875-1888.

  14. Huang J, Swindlehurst AL: Cooperative jamming for secure communications in MIMO relay networks. IEEE Trans. Signal. Proc. 2011, 59(10):4871-4884.

  15. Xiao L, Lin WS, Chen Y, Liu KJR: Indirect reciprocity game modeling for secure wireless networks. In Proceedings of the IEEE International Conference on Communications (ICC). Ottawa; 2012.

  16. Xiao L, Lin WS, Chen Y, Liu KJR: Indirect reciprocity security game for large-scale mobile wireless networks. IEEE Trans Information Forensics & Security 2012, 7: 1368-1380.

  17. Gao L, Wang X, Xu Y: Multi-radio channel allocation in multi-hop wireless networks. IEEE Trans. Mobile Computing 2009, 8(11):1454-1468.


The work is partly supported by NSFC (61271242, 61001072, 61172097), the Natural Science Foundation of Fujian Province of China (no. 2010J01347), NCETFJ, SRF for ROCS, SEM, and the Fundamental Research Funds for the Central Universities (2012121028).

Author information


Corresponding author

Correspondence to Liang Xiao.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access: This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Zhao, C., Xiao, L., Kang, S. et al. Secure relay selection based on learning with negative externality in wireless networks. EURASIP J. Adv. Signal Process. 2013, 89 (2013).
