Saddle-Point Properties and Nash Equilibria for Channel Games

In this paper, transmission over a wireless channel is interpreted as a two-person zero-sum game in which the transmitter gambles against an unpredictable channel controlled by nature. Mutual information is used as the payoff function. Both discrete and continuous output channels are investigated. We use the fact that mutual information is a convex function of the channel matrix or of the noise densities, respectively, and a concave function of the input distribution to deduce the existence of equilibrium points for certain channel strategies. The case in which nature renders the channel useless, with zero capacity, is discussed in detail. For each of the discrete, continuous, and mixed discrete-continuous output channels, the capacity-achieving distribution is characterized with the help of the Karush-Kuhn-Tucker conditions. The results cover a number of interesting examples such as the binary asymmetric channel, the Z-channel, the binary asymmetric erasure channel, and the n-ary symmetric channel. In each case, explicit forms of the optimum input distribution and the worst-case channel behavior are obtained. In the mixed discrete-continuous case, all convex combinations of certain noise-free and maximum-noise distributions are considered as channel strategies. Equilibrium strategies are determined by extending the concepts of entropy and mutual information to general absolutely continuous measures.


Introduction
Transmission over a band-limited wireless channel is often considered as a game in which players compete for a scarce medium, the channel capacity. Nash bargaining solutions have been determined for interference games with additive Gaussian noise. In the works [1, 2], different fairness and allocation criteria arise from this paradigm, leading to useful access control policies for wireless networks.
The engineering problem of transmitting messages over a channel with varying states may also be gainfully considered from a game-theoretic point of view, particularly if the channel state is unpredictable. Here, two players enter the scene: the transmitter and the channel state selector. The transmitter gambles against the channel state, chosen by a malicious nature, for example. Mutual information I(X; Y) serves as the payoff function; the transmitter aims at maximizing, nature at minimizing I(X; Y). A simple motivating example is the scalar channel with input X and additive Gaussian noise Z subject to the average power constraints E(X^2) ≤ P and E(Z^2) ≤ σ^2. By standard arguments from information theory, it follows that

max_{X: E(X^2) ≤ P} min_{Z: E(Z^2) ≤ σ^2} I(X; X + Z) = min_{Z: E(Z^2) ≤ σ^2} max_{X: E(X^2) ≤ P} I(X; X + Z)   (1)

is the capacity of the channel. Hence an equilibrium point exists, and capacity is the value of the two-person zero-sum game. The corresponding equilibrium strategies are to increase power and noise, respectively, to their maximum values.
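The saddle-point property of the scalar Gaussian game can be illustrated on a grid; a minimal sketch assuming Python with NumPy, using the closed-form payoff (1/2) log(1 + P/σ^2) in nats (the grid sizes and variable names are ours):

```python
import numpy as np

# Transmitter chooses power p <= P_max, nature chooses noise power s2 <= N_max;
# payoff is the Gaussian mutual information 0.5*ln(1 + p/s2) in nats.
P_max, N_max = 2.0, 1.0
payoff = lambda p, s2: 0.5 * np.log1p(p / s2)

powers = np.linspace(0.01, P_max, 200)
noises = np.linspace(0.01, N_max, 200)

# max over the transmitter of the min over nature, and vice versa
maxmin = max(min(payoff(p, s2) for s2 in noises) for p in powers)
minmax = min(max(payoff(p, s2) for p in powers) for s2 in noises)

print(np.isclose(maxmin, minmax))                          # saddle point exists
print(np.isclose(maxmin, 0.5 * np.log1p(P_max / N_max)))   # value = capacity
```

Both orders of optimization meet at full power and full noise, reproducing the equilibrium strategies described above.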
A similar game is considered in [3], where the coder controls the input and the jammer the noise, both from allowable sets. Saddle points, hence equilibria, and ε-optimal strategies are determined for binary input and output quantization under power constraints for both the coder and the jammer. An extension of the mutual information game (1) to vector channels with convex covariance constraints is considered in [4]. Jorswieck and Boche [5] investigate a similar minimax setup for a single link in a MIMO system with different types of interference. Further extensions to vector channels and different kinds of games are considered (e.g., [6,7]). In this paper, we choose the approach that nature gambles against the transmitter, which aims at conveying information across the channel in an optimal way. "Nature" and "channel" are used synonymously to characterize the antagonist of the transmitter. We consider two models of the channel which yield comparable results. First, transmission is considered purely on a symbol basis. Symbols from a finite set are transmitted and decoded with certain error probabilities. The model is completely discrete, and strategies of nature are described by certain channel matrices chosen from the set of stochastic matrices. The binary asymmetric erasure channel as shown in Figure 4 may serve as a typical example.
On the other hand, continuous channel models are considered. The strategies of the channel are then given by a set of densities, each describing the conditional distribution of received values given a transmitted symbol. The finite input additive white Gaussian noise channel is a standard example hereof, and also 4-QAM with correlated noise (e.g., as shown in Figure 1) is covered by this model.
For both models, equilibrium points are sought, where the strategy of the transmitter consists of selecting the optimum input distribution against the worst-case behavior of the channel, and vice versa, with both orders of play yielding the same game value. The contributions of this paper are as follows. In Section 2, we demonstrate that mutual information is a convex function of the channel matrix or of the noise densities, respectively. For discrete channels, transmission is considered as a game in Section 3. Some typical binary and n-ary channels are covered by this theory, as shown in Section 5. It is demonstrated that equilibrium points exist, and the corresponding optimum strategies for both players are determined. The entropy of mixture distributions is considered in Section 6, which finally, in Section 7, leads to equilibrium points for mixed discrete-continuous channel strategies.

Figure 3: The Z-channel.

Channel Models and Mathematical Foundations
Denote the set of stochastic vectors of dimension m by

D_m = { p = (p_1, . . . , p_m) : p_i ≥ 0, i = 1, . . . , m, Σ_{i=1}^m p_i = 1 }.   (2)

Each p ∈ D_m represents a discrete distribution with m support points. The entropy H of p is defined as

H(p) = − Σ_{i=1}^m p_i log p_i.   (3)

If p characterizes the distribution of some discrete random variable X, we synonymously write H(X) = H(p). It is well known that the entropy H is a concave function of p and, furthermore, even Schur-concave over the set of distributions D_m, since it is symmetric (see [8]). Let random variable X denote the discrete channel input with symbol set {x_1, . . . , x_m} and distribution p. Accordingly, random variable Y denotes the output of the channel.
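Concavity and Schur-concavity of H can be illustrated numerically; a minimal sketch assuming Python with NumPy (the example distributions are ours):

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(p) = -sum p_i log p_i in nats, with 0 log 0 := 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

# Concavity spot check: H(0.5 p + 0.5 q) >= 0.5 H(p) + 0.5 H(q)
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.1, 0.3, 0.6])
print(entropy(0.5 * p + 0.5 * q) >= 0.5 * entropy(p) + 0.5 * entropy(q))  # True

# Schur-concavity: if p is majorized by q (p is "less spread out"), H(p) >= H(q)
p_less_spread = np.array([0.4, 0.35, 0.25])   # majorized by q_more_spread
q_more_spread = np.array([0.7, 0.2, 0.1])
print(entropy(p_less_spread) >= entropy(q_more_spread))  # True
```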

Discrete Output Channels.
We first deal with discrete channels. If the output set consists of n symbols {y_1, . . . , y_n}, then the behavior of the channel is completely characterized by the (m × n) channel matrix

W = (w_ij)_{1≤i≤m, 1≤j≤n},   (4)

consisting of the conditional probabilities w_ij = P(Y = y_j | X = x_i). Matrix W is an element of the set of stochastic (m × n) matrices, denoted by S_{m×n}. Its rows are stochastic vectors, denoted by w_1, . . . , w_m ∈ D_n. The distribution of Y is then given by the stochastic vector q = pW.
Mutual information for this channel model reads as

I(X; Y) = H(Y) − H(Y | X)
        = H(pW) − Σ_{i=1}^m p_i H(w_i)
        = Σ_{i=1}^m p_i D(w_i ‖ pW),   (5)

where D(p ‖ q) = Σ_i p_i log(p_i/q_i) denotes the Kullback-Leibler divergence of stochastic vectors p and q of equal dimension. Obviously, mutual information depends on the input distribution p, controlled by the transmitter, and on the channel matrix W, controlled by nature. To emphasize this dependence, we also write I(X; Y) = I(p; W). The following result is quoted from [9, Lemma 3.5].
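The equivalent representations of mutual information — the entropy form H(pW) − Σ_i p_i H(w_i) and the divergence form Σ_i p_i D(w_i ‖ pW) — can be checked numerically; a sketch assuming NumPy (function names and the example channel are ours):

```python
import numpy as np

def entropy(v):
    v = np.asarray(v, float); nz = v > 0
    return float(-np.sum(v[nz] * np.log(v[nz])))

def mutual_information(p, W):
    """I(p; W) in nats as sum_i p_i D(w_i || pW)."""
    p = np.asarray(p, float); W = np.asarray(W, float)
    q = p @ W                                  # output distribution q = pW
    mask = W > 0
    terms = np.where(mask, W * np.log(np.where(mask, W, 1.0) / q), 0.0)
    return float(np.sum(p[:, None] * terms))

p = np.array([0.3, 0.7])
W = np.array([[0.80, 0.10, 0.10],
              [0.05, 0.15, 0.80]])

lhs = mutual_information(p, W)                 # divergence form
rhs = entropy(p @ W) - sum(pi * entropy(wi) for pi, wi in zip(p, W))
print(np.isclose(lhs, rhs))  # True: both lines of the representation agree
```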

Proposition 1. Mutual information I(p; W) is a concave function of p ∈ D_m and a convex function of W ∈ S_{m×n}.
The proof relies on the representation in the third line of (5), the convexity of the Kullback-Leibler divergence D(p ‖ q) as a function of the pair (p, q), and the concavity of the entropy H.
The problems of maximizing I(p; W) over p or minimizing I(p; W) over W subject to convex constraints hence fall into the class of convex optimization problems.

General Output Channels.
Entropy definition (3) generalizes to densities f of absolutely continuous distributions with respect to a σ-finite measure μ as

H(f) = − ∫ f log f dμ   (7)

(see [10]). Practically relevant cases are the discrete case (3), where μ is taken as the counting measure; densities f with respect to the Lebesgue measure λ_n on the σ-field of Borel sets over R^n; and mixtures hereof. These cases correspond to discrete, continuous, and mixed discrete-continuous random variables. The approach in Section 2.1 carries over to densities of absolutely continuous distributions with respect to μ, as used in (7). The channel output Y is randomly distorted by noise, governed for symbol i by the μ-density f_i. Hence, the distribution of Y given input X = x_i has μ-density f_i. The AWGN channel Y = X + N is a special case hereof with f_i(y) = ϕ(y − x_i), where ϕ denotes the Lebesgue density of a Gaussian distribution N_n(0, Σ).
Mutual information between channel input and output as a function of p = (p_1, . . . , p_m) and (f_1, . . . , f_m) may be written as

I(p; (f_1, . . . , f_m)) = H(Y) − H(Y | X)
 = ∫ Σ_{i=1}^m p_i f_i log( f_i / Σ_{j=1}^m p_j f_j ) dμ
 = H( Σ_{i=1}^m p_i f_i ) − Σ_{i=1}^m p_i H(f_i)
 = Σ_{i=1}^m p_i D( f_i ‖ Σ_{j=1}^m p_j f_j ),   (9)

where D(f ‖ g) = ∫ f log(f/g) dμ denotes the Kullback-Leibler divergence between μ-densities f and g.
Let F denote the set of all μ-densities. From the convexity of t log t, t ≥ 0, it is easily concluded that

H( Σ_{i=1}^m α_i f_i ) ≥ Σ_{i=1}^m α_i H(f_i)   (10)

for all weights (α_1, . . . , α_m) ∈ D_m and densities f_1, . . . , f_m ∈ F. By applying the log-sum inequality (cf. [9]), we also obtain pointwise for any pairs of densities (f_i, g_i) ∈ F × F, i = 1, . . . , m,

( Σ_{i=1}^m α_i f_i ) log( Σ_{i=1}^m α_i f_i / Σ_{i=1}^m α_i g_i ) ≤ Σ_{i=1}^m α_i f_i log( f_i / g_i ).   (11)

Integrating both sides of the aforementioned inequality shows that

D( Σ_{i=1}^m α_i f_i ‖ Σ_{i=1}^m α_i g_i ) ≤ Σ_{i=1}^m α_i D( f_i ‖ g_i ).   (12)

Applying (10) and (12) to the third and fourth lines of representation (9), respectively, gives the following proposition.

Proposition 2. Mutual information I(p; (f_1, . . . , f_m)) is a concave function of p ∈ D_m and a convex function of (f_1, . . . , f_m) ∈ F^m.
Proposition 2 generalizes its discrete counterpart, Proposition 1. The latter is obtained from the former by identifying the rows of W as densities with respect to the counting measure with support given by the output symbol set.
In summary, determining the capacity of the channel for fixed channel noise densities f_1, . . . , f_m leads to a concave optimization problem, namely,

C = max_{p ∈ D_m} I(p; (f_1, . . . , f_m)).   (13)

Further, minimizing I(p; (f_1, . . . , f_m)) over a convex set of densities f_1, . . . , f_m for some fixed input distribution p ∈ D_m yields a convex optimization problem.
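For discrete output channels, the concave capacity problem can be solved numerically by the standard Blahut-Arimoto iteration, which is not discussed in the text; a sketch assuming NumPy (the iteration count and function names are ours):

```python
import numpy as np

def blahut_arimoto(W, iters=500):
    """Capacity (nats) of a discrete channel W via the Blahut-Arimoto iteration."""
    W = np.asarray(W, float)
    m, n = W.shape
    p = np.full(m, 1.0 / m)                 # start from the uniform input

    def divergences(p):
        q = p @ W                            # current output distribution
        mask = W > 0
        # D(w_i || q) for each row of W
        return np.sum(np.where(mask, W * np.log(np.where(mask, W, 1.0) / q), 0.0), axis=1)

    for _ in range(iters):
        d = divergences(p)
        p = p * np.exp(d)                    # multiplicative update
        p /= p.sum()
    return float(np.sum(p * divergences(p))), p

# Binary symmetric channel with crossover 0.1: capacity = log 2 - H(0.1) nats
eps = 0.1
W = np.array([[1 - eps, eps], [eps, 1 - eps]])
C, p_star = blahut_arimoto(W)
h = lambda a: -a * np.log(a) - (1 - a) * np.log(1 - a)
print(np.isclose(C, np.log(2) - h(eps), atol=1e-6))   # True
print(np.allclose(p_star, [0.5, 0.5], atol=1e-6))     # True
```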

Discrete Output Channel Games
In what follows, we regard transmission over a channel as a two-person zero-sum game. A malicious nature is gambling against the transmitter. If nature is controlling the channel, the transmitter wants to protect itself against a worst-case behavior of nature in the sense of maximizing the capacity of the channel by an appropriate choice of the input distribution. The question arises whether this type of channel game has an equilibrium. If the transmitter moves first and maximizes capacity under the present channel conditions, is the same game value achieved if nature deteriorates the channel against the chosen strategy of the transmitter? Hence, I(X; Y) plays the role of the payoff function. We will show that for different classes of channels equilibria exist. The basis is formed by the following minimax or saddle point theorem.
Proposition 3. Let T ⊆ S_{m×n} be a closed and convex set of channel matrices. Then

max_{p ∈ D_m} min_{W ∈ T} I(p; W) = min_{W ∈ T} max_{p ∈ D_m} I(p; W).   (14)

The proof is an immediate consequence of von Neumann's minimax theorem (cf. [11, page 131]). Since D_m and T are closed and convex, the main premises are concavity in p and convexity in W, both properties ensured by Proposition 1.
If T = S_{m×n}, the value of the game is zero. Nature will make the channel useless by selecting a channel matrix

W with rows w_1 = · · · = w_m = w for some w ∈ D_n,   (15)

yielding I(p; W) = 0 independently of the input distribution. Obviously, (15) holds if and only if input X and output Y are stochastically independent. We first consider the case that nature plays a singleton strategy, hence T = {W}, a set consisting of only one strategy. Then (14) reduces to determining max_{p ∈ D_m} I(p; W), the capacity C of the channel for the fixed channel matrix W. In order to characterize channels with nonzero capacity, we use the variational distance between the ith and jth rows of W, defined as

d(w_i, w_j) = Σ_{k=1}^n |w_ik − w_jk|.   (16)

The condition

γ(W) = max_{1≤i,j≤m} d(w_i, w_j) > 0   (17)

on the channel matrix W ensures that the corresponding channel has nonzero capacity, as demonstrated in the following proposition.

Proposition 4. If W satisfies (17) for some γ(W) > 0, then

C = max_{p ∈ D_m} I(p; W) ≥ γ^2(W)/8,

where information is measured in nats.

Proof. Let the maximum in (17) be attained at the indices i_0 and j_0. Further, set p = (1/2)(e_{i_0} + e_{j_0}), where e_i denotes the ith unit row vector in R^m. The third line in (5) then gives

I(p; W) = (1/2) D(w_{i_0} ‖ q) + (1/2) D(w_{j_0} ‖ q), where q = (1/2)(w_{i_0} + w_{j_0}).

Since D(f ‖ g) ≥ (1/2) d^2(f, g) (see [9, page 58]), and d(w_{i_0}, q) = d(w_{j_0}, q) = (1/2) d(w_{i_0}, w_{j_0}) = γ(W)/2, it follows that

I(p; W) ≥ (1/2) · (1/2)(γ(W)/2)^2 + (1/2) · (1/2)(γ(W)/2)^2 = γ^2(W)/8.

In summary, a channel with transition probabilities W has nonzero capacity if and only if γ(W) > 0. The same condition turns out to be important when determining the capacity of arbitrary discrete channels.

Proposition 5. An input distribution p* = (p*_1, . . . , p*_m) ∈ D_m achieves the capacity C = max_{p ∈ D_m} I(p; W) if and only if there is some constant c such that

D(w_i ‖ p*W) = c for all i with p*_i > 0,
D(w_i ‖ p*W) ≤ c for all i with p*_i = 0.

In this case, c = C.

Proof. Mutual information I(p; W) is a concave function of p. Hence the KKT conditions (cf., e.g., [12]) are necessary and sufficient for optimality of some input distribution p. Using (5), some elementary algebra shows that

∂I(p; W)/∂p_i = D(w_i ‖ pW) − 1, i = 1, . . . , m,

in nats. The full set of KKT conditions now reads as

D(w_i ‖ p*W) − 1 = ν for all i with p*_i > 0,
D(w_i ‖ p*W) − 1 ≤ ν for all i with p*_i = 0,

for some Lagrange multiplier ν, which shows the assertion.
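The Pinsker-type lower bound of Proposition 4 can be spot-checked numerically on random channels; a sketch assuming NumPy and a constant of γ^2(W)/8 in nats (our reading of the bound; identifiers are ours):

```python
import numpy as np

def kl(f, g):
    """Kullback-Leibler divergence in nats for strictly positive g."""
    nz = f > 0
    return float(np.sum(f[nz] * np.log(f[nz] / g[nz])))

rng = np.random.default_rng(0)
for _ in range(100):
    W = rng.random((3, 4))
    W /= W.sum(axis=1, keepdims=True)               # random stochastic matrix
    # gamma(W): maximal variational (L1) distance between rows
    pairs = [(i, j) for i in range(3) for j in range(3)]
    i0, j0 = max(pairs, key=lambda ij: np.abs(W[ij[0]] - W[ij[1]]).sum())
    gamma = np.abs(W[i0] - W[j0]).sum()
    # mutual information at p = (e_i0 + e_j0)/2, via the divergence form
    q = 0.5 * (W[i0] + W[j0])
    I = 0.5 * kl(W[i0], q) + 0.5 * kl(W[j0], q)
    assert I >= gamma**2 / 8 - 1e-12                 # capacity >= I >= gamma^2/8
print("bound verified on 100 random channels")
```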
Proposition 5 has an interesting interpretation. For an input distribution p* = (p*_1, . . . , p*_m) to be capacity-achieving, the Kullback-Leibler divergence between the rows of W and the weighted average p*W with weights p*_i has to be the same for all i with positive p*_i. Hence, the capacity-achieving distribution p* places the mixture distribution p*W in the "middle" of all rows w_i.
EURASIP Journal on Advances in Signal Processing 5

Elementary Channel Models
Discrete binary-input channels are considered in this section. From the corresponding channel games, capacity-achieving distributions against worst-case channels are obtained.

The Binary Asymmetric Erasure Channel.
The binary asymmetric erasure channel (BEC) with error probabilities ε, δ ∈ [0, 1] and channel matrix

W = ( 1 − ε    ε      0
        0      δ    1 − δ )

is depicted in Figure 4. According to Proposition 4, this channel has zero capacity if and only if ε = δ = 1. Excluding this case, by Proposition 5, the capacity-achieving distribution p* = (p*_0, p*_1), p*_0 + p*_1 = 1, is given by the solution of

D(w_0 ‖ p_0 w_0 + p_1 w_1) = D(w_1 ‖ p_0 w_0 + p_1 w_1),   (38)

where w_0 = (1 − ε, ε, 0) and w_1 = (0, δ, 1 − δ) denote the rows of W. Substituting x = p_0/p_1, (38) reads equivalently as

ε log ε − δ log δ = (1 − ε) log x + (ε − δ) log(εx + δ).   (39)

By differentiating with respect to x, it is easy to see that the right-hand side is monotonically increasing, such that exactly one solution p* = (p*_0, p*_1) exists, which can be computed numerically.
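The equal-divergence condition of Proposition 5 for the two rows of the erasure channel can be solved numerically by bisection; a sketch assuming NumPy (function names and the tolerance are ours):

```python
import numpy as np

def kl(f, g):
    nz = f > 0
    return float(np.sum(f[nz] * np.log(f[nz] / g[nz])))

def bec_optimal_input(eps, delta, tol=1e-12):
    """Solve D(w0 || pW) = D(w1 || pW) for p = (p0, 1 - p0) by bisection."""
    w0 = np.array([1 - eps, eps, 0.0])
    w1 = np.array([0.0, delta, 1 - delta])
    # gap(p0) is strictly decreasing in p0: positive near 0, negative near 1
    gap = lambda p0: (kl(w0, p0 * w0 + (1 - p0) * w1)
                      - kl(w1, p0 * w0 + (1 - p0) * w1))
    lo, hi = 1e-9, 1 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Sanity check: for eps = delta the channel is symmetric, so p* = (1/2, 1/2)
print(np.isclose(bec_optimal_input(0.2, 0.2), 0.5, atol=1e-6))  # True
p0 = bec_optimal_input(0.1, 0.3)
print(0 < p0 < 1)  # True: a unique interior solution exists
```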
Resembling the arguments used for the binary asymmetric channel and adopting the notation, we see that min_{W ∈ T_{ε̄,δ̄}} I(p; W) = I(p; W(ε̄, δ̄)) for any p ∈ D_2. Further, max_{p ∈ D_2} I(p; W(ε̄, δ̄)) is attained at p* = (p*_0, p*_1), the solution of (38) with ε substituted by ε̄ and δ by δ̄. Finally, the game value amounts to

max_{p ∈ D_2} min_{W ∈ T_{ε̄,δ̄}} I(p; W) = min_{W ∈ T_{ε̄,δ̄}} max_{p ∈ D_2} I(p; W) = I(p*; W(ε̄, δ̄)).

The n-Ary Symmetric Channel
Consider the n-ary symmetric channel with symbol set {0, 1, . . . , n − 1} and channel matrix W(ε), obtained by cyclically shifting some error vector ε = (ε_0, ε_1, . . . , ε_{n−1}) ∈ D_n:

W(ε) = ( ε_0      ε_1    · · ·   ε_{n−1}
         ε_{n−1}  ε_0    · · ·   ε_{n−2}
         · · ·
         ε_1      ε_2    · · ·   ε_0 ).

Let E ⊆ D_n denote the set of strategies from which nature can choose the channel state by selecting some ε ∈ E. If E = D_n, the value of the game is zero. As mentioned earlier, nature will cripple the channel by selecting

ε_u = (1/n, . . . , 1/n),

yielding I(X; Y) = 0 independently of the input distribution. Note that ε_u is the unique minimum element with respect to majorization, that is, ε_u ≺ ε for all ε ∈ D_n. We briefly recall the corresponding definitions (see [8]). Let p_[i] and q_[i] denote the components of p and q in decreasing order, respectively. Distribution p ∈ D_n is said to be majorized by q ∈ D_n, written p ≺ q, if

Σ_{i=1}^k p_[i] ≤ Σ_{i=1}^k q_[i] for all k = 1, . . . , n − 1.

Hence, to avoid trivial cases, the set of strategies for nature has to be separated from this worst case.

Separation by Schur Ordering.
We first investigate the set

E_ε = { ε′ ∈ D_n : ε ≺ ε′, ε′_{π(1)} ≤ ε′_{π(2)} ≤ · · · ≤ ε′_{π(n)} }

for some fixed ε ≠ ε_u and permutation π. This means that the error probabilities are at least as spread out, that is, as far separated from uniformity, as ε, with error probabilities increasing in the fixed order determined by π.
Since E ε is convex and closed, the set of corresponding matrices is convex and closed as well.
To determine the value v of the game, we first consider max_{p ∈ D_n} I(p; W(ε′)) for some fixed ε′ ∈ E_ε. From (5), it follows that the maximum is attained at the input distribution p = (1/n, . . . , 1/n), with value max_{p ∈ D_n} I(p; W(ε′)) = log n − H(ε′).
As the entropy is Schur-concave, min_{ε′ ∈ E_ε} (log n − H(ε′)) is attained at the rearrangement of ε according to π, such that the value of the game is obtained as

v = log n − H(ε),

with corresponding equilibrium strategies p = (1/n, . . . , 1/n) for the transmitter and, for nature, the vector with components equal to those of ε rearranged according to π.
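The game value log n − H(ε) and the optimality of the uniform input can be illustrated numerically; a sketch assuming NumPy, with the circulant channel matrix built by cyclic shifts as above (the example error vector is ours):

```python
import numpy as np

def entropy(v):
    v = np.asarray(v, float); nz = v > 0
    return float(-np.sum(v[nz] * np.log(v[nz])))

def I(p, W):
    """Mutual information in nats via the entropy form of (5)."""
    q = p @ W
    return entropy(q) - sum(pi * entropy(wi) for pi, wi in zip(p, W))

# n-ary symmetric channel: rows are cyclic shifts of the error vector eps
eps = np.array([0.7, 0.2, 0.1])
n = len(eps)
W = np.array([np.roll(eps, k) for k in range(n)])

uniform = np.full(n, 1.0 / n)
print(np.isclose(I(uniform, W), np.log(n) - entropy(eps)))  # True: value log n - H(eps)

# uniform input is optimal: random inputs never do better
rng = np.random.default_rng(1)
best = max(I(p / p.sum(), W) for p in rng.random((200, n)))
print(best <= I(uniform, W) + 1e-9)  # True
```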

Directional Separation.
In what follows, we consider channel states separated from the worst case ε_u in the direction of some prespecified ε̄ ∈ D_n, ε̄ ≠ ε_u. This set of strategies is formally described as

E_{ᾱ,ε̄} = { ε_α = (1 − α)ε_u + α ε̄ : α ∈ [ᾱ, 1] }

for some given ᾱ > 0. It is obviously convex and closed. The set T_{ᾱ,ε̄} of corresponding channel matrices is also closed and convex, such that an equilibrium exists by Proposition 3. It remains to determine the game value. Since I(p; W) is a convex nonnegative function of W that vanishes at W(ε_u), it is increasing in α ∈ [ᾱ, 1]; hence min_{W ∈ T_{ᾱ,ε̄}} I(p; W) is attained at W(ε_ᾱ) with ε_ᾱ = (1 − ᾱ)ε_u + ᾱ ε̄. From representation (5), it can be easily seen that max_{p ∈ D_n} I(p; W(ε_α)) is attained at p = (1/n, . . . , 1/n) for any ε_α ∈ E_{ᾱ,ε̄}. By monotonicity in α ∈ [ᾱ, 1], it holds that

min_{W ∈ T_{ᾱ,ε̄}} max_{p ∈ D_n} I(p; W) = log n − H(ε_ᾱ),

which determines the game value. The equilibrium strategies are the uniform distribution for the transmitter and the extreme error vector ε_ᾱ for nature.
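The monotonicity of the payoff along the segment from ε_u toward ε̄ can be checked numerically; a sketch assuming NumPy (the example direction ε̄ is ours):

```python
import numpy as np

def entropy(v):
    v = np.asarray(v, float); nz = v > 0
    return float(-np.sum(v[nz] * np.log(v[nz])))

# error vectors along the segment from the uniform (useless) vector eps_u
# toward a prespecified eps_bar; I(uniform; W(eps_alpha)) = log n - H(eps_alpha)
eps_bar = np.array([0.6, 0.3, 0.1])
n = len(eps_bar)
eps_u = np.full(n, 1.0 / n)

alphas = np.linspace(0.0, 1.0, 50)
values = [np.log(n) - entropy((1 - a) * eps_u + a * eps_bar) for a in alphas]

# mutual information grows with alpha, so nature's optimum is alpha = alpha_bar
print(all(v2 >= v1 - 1e-12 for v1, v2 in zip(values, values[1:])))  # True
```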

Entropy of Mixture Distributions
Let U be an absolutely continuous random variable with density g(y) with respect to the Lebesgue measure λ_n, and let random variable V have a discrete distribution with discrete density h(y) = p_i if y = x_i, i = 1, . . . , m, and h(y) = 0 otherwise, p_i ≥ 0, Σ_{i=1}^m p_i = 1. Furthermore, assume that B is Bernoulli distributed with parameter α, 0 ≤ α ≤ 1, hence P(B = 1) = α, P(B = 0) = 1 − α. Further, let U, V, B be stochastically independent. Then

W = BU + (1 − B)V

has density

f(y) = αg(y) + (1 − α)h(y)

with respect to the measure μ = λ_n + χ, where χ denotes the counting measure with support {x_1, . . . , x_m} and the version of g vanishing on {x_1, . . . , x_m} is chosen. According to [10], the entropy of W is defined as

H(W) = − ∫ f log f dμ.

It easily follows (see [16]) that

H(W) = αH(U) + (1 − α)H(V) + h_2(α),

where h_2(α) = −α log α − (1 − α) log(1 − α) denotes the binary entropy function. The following proposition will be useful when investigating equilibria of channel games with continuous noise densities.
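The entropy decomposition of the mixture can be verified directly from the μ-density; a sketch assuming NumPy, with U uniform on [0, 1] so that the Lebesgue part of the integral is available in closed form (the example weights and support points are ours):

```python
import numpy as np

# Assumed setup: U uniform on [0, 1] (differential entropy 0), V discrete on
# {0.25, 0.75} with probabilities (0.3, 0.7), mixing weight alpha.
# Entropy w.r.t. mu = Lebesgue + counting measure on the support of V:
#   H(W) = -int alpha*g log(alpha*g) dlambda - sum (1-alpha)p_i log((1-alpha)p_i)
alpha = 0.4
p = np.array([0.3, 0.7])
g = 1.0  # uniform density on [0, 1]

# direct mu-integral (g is constant, so the Lebesgue part is exact)
H_direct = -alpha * g * np.log(alpha * g) * 1.0
H_direct -= np.sum((1 - alpha) * p * np.log((1 - alpha) * p))

# decomposition: alpha*H(U) + (1 - alpha)*H(V) + h2(alpha)
H_U = 0.0
H_V = float(-np.sum(p * np.log(p)))
h2 = -alpha * np.log(alpha) - (1 - alpha) * np.log(1 - alpha)
H_decomp = alpha * H_U + (1 - alpha) * H_V + h2

print(abs(H_direct - H_decomp) < 1e-12)  # True: the two computations agree
```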
In summary, the channel game has an equilibrium point. The equilibrium strategy for the channel is given by α = 1. The optimum strategy p* for the transmitter is characterized by (74). For certain error distributions g_j, this condition can be explicitly evaluated (see [17]).

Conclusions
We have investigated Nash equilibria for a two-person zero-sum game in which the channel gambles against the transmitter. The transmitter's strategy set consists of all input distributions over a finite symbol set, while the channel's strategy sets are formed by certain convex subsets of channel matrices or noise distributions, respectively. Mutual information is used as the payoff function. Basically, it is assumed that a malicious nature controls the channel, such that equilibria are achieved when the transmitter plays the capacity-achieving distribution against the worst-case behavior of the channel. In practice, however, a wireless channel is only partially controlled by nature, for example, through shadowing and attenuation effects, as well as diffraction and reflection. A major contribution to the channel properties, however, is made by interference from other users. It will be a subject of future research to investigate how these effects may be combined into a single strategy set of the channel. The question arises whether equilibria for the game "one transmitter against a group of others plus random effects from nature" still exist.