Performance Characterization and Transmission Schemes for Instantly Decodable Network Coding in Wireless Broadcast

We consider broadcasting a block of packets to multiple wireless receivers under random packet erasures using instantly decodable network coding (IDNC). The sender first broadcasts each packet uncoded once, then generates coded packets according to receivers' feedback about their missing packets. We focus on strict IDNC (S-IDNC), where each coded packet includes at most one missing packet of every receiver, but we also compare it with general IDNC (G-IDNC), where this condition is relaxed. We characterize two fundamental performance limits of S-IDNC: 1) the number of transmissions to complete the broadcast, and 2) the average delay for a receiver to decode a packet. We derive a closed-form expression for the expected minimum number of transmissions in terms of the number of packets and receivers and the erasure probability. We prove that it is NP-hard to minimize the decoding delay of S-IDNC. We also derive achievable upper bounds on the above two performance limits. We show that G-IDNC can outperform S-IDNC in terms of the number of transmissions without packet erasures, but not necessarily with packet erasures. Next, we design optimal and heuristic S-IDNC transmission schemes and coding algorithms with full/intermittent receiver feedback. We present simulation results to corroborate the developed theory and compare with existing schemes.

This approach, though simple, is inefficient in terms of throughput, as the transmitted packets are non-innovative to the receivers who have already received them.
The advent of network coding [2] started a new era of high-throughput network-coded wireless communications [3]-[17]. By linearly adding all data packets together with randomly chosen coefficients from a sufficiently large finite field, random linear network coding (RLNC) can almost surely achieve the optimal block completion time in block-based wireless broadcast [9]-[11], [18]. The block completion time is defined as the number of transmissions it takes to complete the broadcast, and is a fundamental measure of throughput because the two are inversely related under a fixed block size. Compared to other optimal codes such as Fountain codes [19], [20], RLNC is preferred due to its ease of implementation at the sender and its extension to more complex networks and traffic settings.
However, with RLNC, data packets are block-decoded by solving a set of linear equations, which only takes place after a sufficient number of coded packets have been received. RLNC thus may suffer from heavy computational load [11] and packet decoding delay [17], which is measured by the average time it takes for a receiver to decode a data packet. The first issue can, for example, hinder the application of RLNC for mobile receivers with limited computational capability [11]. Meanwhile, a large packet decoding delay can be unacceptable for delay-sensitive applications such as video streaming [21], [22].
To mitigate these issues, instantly decodable network coding (IDNC) techniques [3], [12]-[15] have been introduced. With IDNC, the sender first broadcasts the data packets uncoded once. It then makes online coding decisions based on receivers' feedback about their packet reception state, under the restriction that coding/decoding is over the binary field. A simple packet reception state is demonstrated in Table I. There are two data packets, $p_1$ and $p_2$, and three receivers, $R_1$ to $R_3$, where each has a subset of $\{p_1, p_2\}$ and wants the rest. Consider a coded packet $X = p_1 \oplus p_2$, where $\oplus$ denotes binary XOR. It has three different effects on different receivers: 1) it is instantly decodable to $R_1$, because $R_1$ can decode $p_2$ by performing $X \oplus p_1$; 2) it is non-instantly decodable to $R_2$, because $R_2$ has neither $p_1$ nor $p_2$; and 3) it is non-innovative to $R_3$, because $R_3$ already has both $p_1$ and $p_2$.
There are two main types of IDNC techniques. The first one, called strict IDNC (S-IDNC) [13]- [15], prohibits the transmissions of non-instantly decodable packets to any receiver. Effectively, each coded packet can include at most one wanted data packet of every receiver. The second one, called general IDNC (G-IDNC), removes this restriction to generate more coding opportunities.
There is a large body of research on G-IDNC, focusing on its throughput and decoding delay performance, coding algorithms, and transmission schemes. Early models and heuristics related to G-IDNC were proposed for index coding [23]. Then G-IDNC was introduced for wireless broadcast and was graphically modeled in [12]. Although its best performance remains unidentified, powerful heuristic algorithms have been developed to improve its throughput and/or decoding delay [12], [24]- [26], or to strike a balance between the two [27]. G-IDNC transmission schemes under full/intermittent receiver feedback have been developed [28], [29]. G-IDNC has also been adopted in wireless broadcast applications with hard decoding deadlines [21] or with receiver cooperation [30]- [32].
Studies on theoretical performance characterization and implementations of S-IDNC are more limited in both range and depth. S-IDNC was graphically modeled in [13], which then proved that the minimum clique partition solution [13] of the associated graph can be an S-IDNC solution that minimizes the block completion time. However, this solution does not take into account the issues of decoding delay and the robustness of coded transmissions to erasures. S-IDNC has been shown to be asymptotically throughput optimal when there are up to three receivers or when the number of data packets approaches infinity [21], but the general relation between the throughput of S-IDNC and system parameters has not been characterized before. In addition, to the best of our knowledge, the minimum packet decoding delay of S-IDNC is still unknown. Moreover, there have not been S-IDNC transmission schemes that can work with intermittent receiver feedback. Another unaddressed problem is a systematic performance comparison between S-IDNC and G-IDNC.
In this paper, we study the above problems and provide the following contributions: 1) We characterize the throughput performance limits of S-IDNC. Specifically, we first derive an achievable upper bound on the minimum block completion time for any given packet reception state. We then derive a closed-form expression for the expected minimum block completion time in terms of the number of packets and receivers and the erasure probability; 2) We prove that it is NP-hard to minimize the packet decoding delay of S-IDNC. We derive an upper bound on the minimum packet decoding delay in terms of the minimum block completion time; 3) We show that in the presence of erasures, the minimum clique partition solution of the S-IDNC graph as identified by [13] may not result in the minimum block completion time, because it does not allow the same data packet to be repeated in different coded packets, a desired property which we refer to as packet diversity. Motivated by this fact, we develop the optimal S-IDNC coding algorithm, as well as heuristics that aim to improve packet diversity. We design S-IDNC transmission schemes under full and intermittent receiver feedback; 4) We also provide new results on how S-IDNC and G-IDNC compare. We study the relation between S-IDNC and G-IDNC graphs, and then demonstrate some scenarios under which G-IDNC can/cannot outperform S-IDNC.

A. Transmission Setup
We consider a block-based wireless broadcast scenario, in which the sender needs to deliver a block of $K$ data packets, denoted by $\mathcal{P}_K = \{p_1, \dots, p_K\}$, to $N$ receivers, denoted by $R_1, \dots, R_N$, through wireless channels that are subject to independent random packet erasures.
Initially, the $K$ data packets are transmitted uncoded once using $K$ time slots, constituting a systematic transmission phase [11]. Then, each receiver provides feedback (footnote 1) to the sender about the packets it has received or missed due to packet erasures. The complete packet reception state is represented by an $N \times K$ state feedback matrix (SFM) $A$, where $a_{n,k} = 1$ if $R_n$ has missed (and thus still wants) $p_k$, and $a_{n,k} = 0$ if $R_n$ has already received $p_k$. The set of data packets wanted by $R_n$ is called the Wants set of $R_n$ and is denoted by $\mathcal{W}_n$. The set of receivers who want $p_k$ is called the Target set of $p_k$ and is denoted by $\mathcal{T}_k$. The size of $\mathcal{T}_k$ is denoted by $T_k$. Packets with larger $T_k$ are more desired by receivers.
Footnote 1: We assume that there exists an error-free feedback link from each receiver to the sender that can be used with appropriate frequency.
Fig. 1. An example of SFM and its S- and G-IDNC graphs (panel (c): G-IDNC graph $G_g$).
Example 1. Consider the SFM in Fig. 1(a) with K = 6 data packets and N = 5 receivers. The Wants

B. Coded Transmission Phase: Two Types of IDNC
According to $A$, the sender generates IDNC coded packets over the binary field $\mathbb{F}_2$. Explicitly, IDNC coded packets are of the form $X = \bigoplus_{p_k \in \mathcal{M}} p_k$, where $\mathcal{M}$ is a selected subset of $\mathcal{P}_K$ and is called an IDNC coding set. $X$ has three possible decoding effects at each receiver $R_n$: it is instantly decodable if $\mathcal{M}$ contains exactly one data packet from $\mathcal{W}_n$; non-instantly decodable if $\mathcal{M}$ contains two or more data packets from $\mathcal{W}_n$; and non-innovative if $\mathcal{M}$ contains no data packet from $\mathcal{W}_n$. Depending on which of the above three types of effects are allowed, there are two variations of IDNC.
The first one is called strict IDNC (S-IDNC), which is the main subject of our study. It prohibits the transmission of any non-instantly decodable coded packets to any receivers. This restriction implies that any two data packets wanted by the same receiver cannot be coded together. We thus have the concept of conflicting and non-conflicting data packets:
Definition 2. Two data packets $p_i$ and $p_j$ conflict if at least one receiver wants both of them, i.e., if $\exists n : \{p_i, p_j\} \subseteq \mathcal{W}_n$. Otherwise they do not conflict.
An S-IDNC coding set is thus a set of pairwise non-conflicting data packets. The conflicting states among data packets can be represented by an undirected graph G s (V, E). Each vertex v i ∈ V represents a data packet p i . Two vertices v i and v j are connected by an edge e i,j ∈ E if p i and p j do not conflict.
Thus, every complete subgraph of G s , a.k.a., a clique, represents an S-IDNC coding set. In the rest of the paper, we will use the terms "coded packet", "coding set", and "clique" interchangeably, and denote the last two by M.
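The construction of the S-IDNC graph from an SFM can be illustrated with a minimal Python sketch. The 3-receiver, 4-packet matrix `A_EXAMPLE` below is hypothetical (it is not the SFM of Fig. 1), packets are indexed from 0, and the function names are our own.

```python
from itertools import combinations

def conflicts(A, i, j):
    """True if at least one receiver wants both packet i and packet j."""
    return any(row[i] == 1 and row[j] == 1 for row in A)

def s_idnc_edges(A):
    """Edges of the S-IDNC graph: pairs of packets that do NOT conflict."""
    K = len(A[0])
    return {(i, j) for i, j in combinations(range(K), 2)
            if not conflicts(A, i, j)}

def is_s_idnc_coding_set(A, M):
    """A coding set is S-IDNC iff its packets are pairwise non-conflicting,
    i.e., iff it is a clique of the S-IDNC graph."""
    return all(not conflicts(A, i, j) for i, j in combinations(sorted(M), 2))

# Hypothetical SFM: rows are receivers, columns are packets,
# and an entry of 1 means "missed and still wanted".
A_EXAMPLE = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

print(sorted(s_idnc_edges(A_EXAMPLE)))
print(is_s_idnc_coding_set(A_EXAMPLE, {0, 1, 2}))
print(is_s_idnc_coding_set(A_EXAMPLE, {0, 3}))
```

In this toy instance the first receiver wants both packet 0 and packet 3, so these two packets conflict and cannot share a coded packet, whereas packets 0, 1, and 2 form a clique.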
The main limitation of S-IDNC is that a coded packet which is instantly decodable for a large subset of receivers may be prohibited merely because it is non-instantly decodable for a small subset of receivers.
In the second type of IDNC, called general IDNC (G-IDNC), the restriction on non-instantly decodable packets is removed to generate more coding opportunities.
G-IDNC can also be graphically modeled [24]. The difference is that, in the G-IDNC graph $G_g(V, E)$, a data packet $p_k$ wanted by different receivers is represented by a separate vertex $v_{n,k}$ for each such receiver, i.e., for all $a_{n,k} = 1$. Consequently, the number of vertices in $G_g$ is equal to the number of "1"s in $A$. Two vertices $v_{m,i}$ and $v_{n,j}$ are connected by an edge if: 1) $i = j$, or 2) $p_i \notin \mathcal{W}_n$ and $p_j \notin \mathcal{W}_m$. In the first case, $p_i = p_j$, and thus by sending $p_i$ both $R_m$ and $R_n$ can decode. In the second case, by sending $p_i \oplus p_j$, $R_m$ and $R_n$ can decode $p_i$ and $p_j$, respectively, because they already have $p_j$ and $p_i$, respectively. Similar to S-IDNC, every clique of $G_g$ represents a G-IDNC coding set.
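The two edge conditions of the G-IDNC graph can be sketched in Python as follows. The SFM `A_EXAMPLE` is a hypothetical 3-receiver, 4-packet instance (not the one in Fig. 1), indices are 0-based, and `g_idnc_graph` is an illustrative name rather than an algorithm from the literature.

```python
from itertools import combinations

def g_idnc_graph(A):
    """Return (vertices, edges) of the G-IDNC graph of SFM A.

    A vertex (n, k) exists for every entry a[n][k] == 1.
    Vertices (m, i) and (n, j) are adjacent if
      1) i == j (same packet serves both receivers), or
      2) receiver n already has packet i and receiver m already has packet j.
    """
    vertices = [(n, k) for n, row in enumerate(A)
                for k, a in enumerate(row) if a == 1]
    edges = set()
    for (m, i), (n, j) in combinations(vertices, 2):
        same_packet = (i == j)
        cross_decodable = (A[n][i] == 0 and A[m][j] == 0)
        if same_packet or cross_decodable:
            edges.add(((m, i), (n, j)))
    return vertices, edges

# Hypothetical SFM: rows are receivers, columns are packets.
A_EXAMPLE = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

vertices, edges = g_idnc_graph(A_EXAMPLE)
print(len(vertices), len(edges))
```

Two vertices of the same receiver are never adjacent under these rules, because that receiver still wants both packets.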
We note that an S-IDNC coded packet is always a G-IDNC coded packet, but the reverse is not necessarily true. Below is an example of S-and G-IDNC coded packets.
Example 2. Consider the SFM and its S- and G-IDNC graphs in Fig. 1. The G-IDNC graph indicates that $(v_{1,1}, v_{5,3}, v_{4,4})$ is a clique. The corresponding G-IDNC coding set is $(p_1, p_3, p_4)$, and thus $X_g = p_1 \oplus p_3 \oplus p_4$ is a G-IDNC coded packet. $X_g$ is instantly decodable for $R_1$, $R_4$, and $R_5$ because each of them wants only one data packet from $X_g$. $X_g$ is non-instantly decodable for $R_3$ because $R_3$ wants both $p_3$ and $p_4$. Due to the existence of $R_3$, $X_g$ is not an S-IDNC coded packet. In contrast, the S-IDNC graph indicates that $(v_1, v_2, v_3)$ is a clique. The corresponding coding set is $(p_1, p_2, p_3)$, and thus $X_s = p_1 \oplus p_2 \oplus p_3$ is an S-IDNC coded packet, which can be verified to also correspond to cliques of the G-IDNC graph such as $(v_{1,1}, v_{2,2}, v_{3,3})$.
We then introduce the notion of IDNC solution. A set of IDNC coding sets is called an IDNC solution if, upon the reception of the coded packets of all these coding sets, every receiver can decode all its wanted data packets. An S-IDNC solution is denoted by $S_s$. The set of all S-IDNC solutions of a given SFM is denoted by $\mathcal{S}_s$. Similarly, we can also define $S_g$ and $\mathcal{S}_g$ for G-IDNC.
To assess the performance of IDNC solutions, we now introduce our measures of throughput and decoding delay.

C. Throughput and Decoding Delay Measures
An S-IDNC solution $S_s$ requires a minimum of $|S_s|$ coded transmissions. We call $U_{S_s} \triangleq |S_s|$ the minimum block completion time of $S_s$. It measures the best throughput of $S_s$, with a value of $\frac{K}{K + U_{S_s}}$ packets per transmission. We further denote by $U_s$ the absolute minimum block completion time over all the S-IDNC solutions of $A$, i.e., $U_s \triangleq \min\{U_{S_s} : S_s \in \mathcal{S}_s\}$. The definition of $U_g$ for G-IDNC follows similarly.
We measure decoding delay by the average packet decoding delay (APDD) $D$, which is the average time it takes for a receiver to decode a data packet:
$$D = \frac{1}{T} \sum_{\forall a_{n,k} = 1} u_{n,k}, \tag{1}$$
where $u_{n,k}$ is the time index when $R_n$ decodes $p_k$, and $T = \sum_{k=1}^{K} T_k$, which is also the number of "1"s in $A$. Given an IDNC solution $S$, by letting $u_{n,k}$ be the first time index when $S$ allows $R_n$ to decode $p_k$, (1) produces the minimum APDD of $S$. Then, similar to $U_s$ (resp. $U_g$), we denote by $D_s$ (resp. $D_g$) the absolute minimum APDD over all S- (resp. G-) IDNC solutions of $A$. We also note that in the specific case of an S-IDNC solution $S_s$, $u_{n,k}$ is the index of the first coding set in $S_s$ that contains $p_k$, as every receiver who wants $p_k$ can decode it from this coding set.
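As an illustration of the APDD of an ordered S-IDNC solution, the following Python sketch takes the decoding time of a wanted packet to be the index of the first coding set that contains it, and averages over all "1"s in the SFM. It assumes erasure-free coded transmissions; the SFM and the function name `apdd` are hypothetical.

```python
def apdd(A, solution):
    """Minimum APDD of an ordered S-IDNC solution.

    A        : SFM as a list of rows (receivers) of 0/1 entries.
    solution : ordered list of coding sets (sets of 0-based packet indices).
    The u-th coding set (u = 1, 2, ...) is sent in time slot u.
    """
    first_slot = {}
    for u, M in enumerate(solution, start=1):
        for k in M:
            first_slot.setdefault(k, u)  # keep the earliest slot only
    total, T = 0, 0
    for row in A:
        for k, wanted in enumerate(row):
            if wanted == 1:
                total += first_slot[k]  # KeyError if the solution is incomplete
                T += 1
    return total / T

# Hypothetical SFM: rows are receivers, columns are packets.
A_EXAMPLE = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

print(apdd(A_EXAMPLE, [{0, 1, 2}, {3}]))
print(apdd(A_EXAMPLE, [{3}, {0, 1, 2}]))
```

The two calls use the same coding sets in different orders, which shows that the transmission order alone changes the APDD even though the block completion time is the same.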
Example 3. Consider the SFM in Fig. 1(a). Suppose that an S-IDNC solution with four coded packets, the last of which is $X_4 = p_5$, is transmitted in this order. The receivers' decoding times $\{u_{n,k}\}$ are summarized in Table II, and the minimum APDD $D_{S_s}$ of this solution follows from (1).
In the presence of packet erasures, the sender needs to adopt a coded transmission phase. In each coded transmission, it selects and broadcasts a coding set through erasure-prone wireless channels. We denote by $U_T$ the block completion time of this phase, and by $D_T$ the APDD of this phase, calculated as in (1). $U_T$ and $D_T$ measure the throughput and decoding delay performance of this phase, respectively.
They vary according to the IDNC solutions, transmission schemes, and erasure patterns. But it always holds that $U_T \geq U_s$ and $D_T \geq D_s$ if S-IDNC is applied. Therefore, $U_s$ and $D_s$ reflect the performance limits of S-IDNC. Hence, we will first study these limits in the next section, and then design S-IDNC transmission schemes and coding algorithms in Sections IV and V, respectively.

III. PERFORMANCE LIMITS AND PROPERTIES OF IDNC
In this section, we study performance limits and properties of S-IDNC and compare it with G-IDNC.
A. Absolute minimum block completion time $U_s$
We first study the throughput limit of S-IDNC, measured by the absolute minimum block completion time $U_s$. It has been proved that $U_s$ is equal to the size of the minimum clique partition solution (footnote 2) of $G_s$ [13], denoted by $S_c$. This equivalence holds because of the following property:
Property 1. Removing any vertex from the S-IDNC graph does not change the connectivity of the remaining vertices.
This property holds because vertices in $G_s$ represent different data packets. Thus, to remove all vertices from $G_s$ (i.e., to complete the broadcast), at least $|S_c|$ cliques must be removed, which yields $U_s = |S_c|$.
According to graph theory, $|S_c|$ is equal to the chromatic number (footnote 3) $\chi(\bar{G}_s)$ of the complementary graph $\bar{G}_s$, which has the same vertex set as $G_s$ but opposite vertex connectivity. We thus have $U_s = \chi(\bar{G}_s)$.
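The equivalence between the minimum clique partition of the S-IDNC graph and a proper coloring of its complement can be checked on small instances by exhaustive search. The Python sketch below is only an illustration for small $K$ (the problem is NP-hard in general); the SFM and function names are hypothetical.

```python
from itertools import combinations, product

def conflict_pairs(A):
    """Edges of the complementary graph: pairs of conflicting packets."""
    K = len(A[0])
    return [(i, j) for i, j in combinations(range(K), 2)
            if any(row[i] and row[j] for row in A)]

def min_block_completion_time(A):
    """Exhaustively color the complementary graph with as few colors as
    possible. Each color class is a set of pairwise non-conflicting
    packets, i.e., a clique of the S-IDNC graph, so the classes form a
    minimum clique partition and their number is U_s."""
    K = len(A[0])
    bar_edges = conflict_pairs(A)
    for c in range(1, K + 1):
        for coloring in product(range(c), repeat=K):
            if all(coloring[i] != coloring[j] for i, j in bar_edges):
                classes = [{k for k in range(K) if coloring[k] == col}
                           for col in range(c)]
                return c, [s for s in classes if s]
    # Not reached for K >= 1: K distinct colors always give a proper coloring.

# Hypothetical SFM: rows are receivers, columns are packets.
A_EXAMPLE = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

U_s, partition = min_block_completion_time(A_EXAMPLE)
print(U_s, partition)
```

The search tries every assignment of $c$ colors to $K$ packets, so its cost grows as $c^K$ and it is meant only to make the definition concrete.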
This equality enables us to answer two important questions about U s : 1) how to find the U s of a given S-IDNC graph? 2) what are the statistical characteristics of U s under random packet erasures?
1) The $U_s$ of an SFM: The chromatic number of a graph (and thus $U_s$) has been proven to be NP-hard to find and APX-hard to approximate [33]. But there are heuristic algorithms and bounds for it. We will develop algorithms dedicated to S-IDNC in Section V, and focus on the bounds in this subsection.
Tight bounds on $\chi(\bar{G}_s)$ exist, but are also NP-hard to find. One such example is a tight lower bound $\omega(\bar{G}_s)$ [34], the size of the largest clique of $\bar{G}_s$. There are also loose bounds. For example:
Property 2. All S-IDNC graphs with $K$ vertices and $M_0$ edges have:
$$U_s \geq \left\lceil \frac{K^2}{K + 2M_0} \right\rceil, \tag{2}$$
where $\lceil x \rceil$ outputs the smallest integer not less than $x$.
This bound is due to Geller [35] and its proof is omitted here. It is useful because it bounds from below the $U_s$ of all the S-IDNC graphs with $K$ vertices and $M_0$ edges.
Footnote 2: The minimum clique partition solution of a graph $G$ is the minimum set of disjoint cliques of $G$ that together cover all the vertices.
Footnote 3: The chromatic number of a graph $G$ is the minimum number of colors needed to color the vertices so that any two connected vertices have different colors.
An existing upper bound on $\chi(\bar{G}_s)$ is $\Delta(\bar{G}_s) + 1$ [33], where $\Delta(\bar{G}_s)$ is the largest number of edges incident to any single vertex in $\bar{G}_s$. However, with given $K$ and $M_0$, this bound is not always achievable.
To see this, assume $\bar{G}_s$ has $K - 1$ edges. Connecting these edges to the same vertex yields an upper bound of $\Delta(\bar{G}_s) + 1 = K$. But no matter how we allocate these edges, there are always unconnected vertices in $\bar{G}_s$, which indicates that $\chi(\bar{G}_s) < K$. Hence, the upper bound is not achievable here.
We are thus motivated to derive an achievable upper bound on $U_s$ as a function of $K$ and $M_0$. We first note that a set of pairwise unconnected vertices (a.k.a. an independent set) of $G_s$, denoted by $V_I$, must be transmitted separately because their corresponding packets all conflict with each other. The size $|V_I|$ is thus a lower bound on $U_s$. Then, whenever a new edge is added to the graph, we can maximize $U_s$ by maximizing $|V_I|$, which means that the edge should not connect two vertices in $V_I$ whenever possible. Explicitly, our upper bound on $U_s$ is derived iteratively:
• When there is no edge in the graph, i.e., when $M_0 = 0$, we have $U_s = K$;
• When there are up to $K - 1$ edges, i.e., when $M_0 \in [1, K - 1]$, we can use these edges to connect $v_1$ with $v_2, \dots, v_K$. Since $v_2, \dots, v_K$ are independent, we have $U_s = K - 1$;
• When there are up to $K - 2$ additional edges, i.e., when $M_0 \in [K, 2K - 3]$, we can use these additional edges to connect $v_2$ with $v_3, \dots, v_K$. Since $v_3, \dots, v_K$ are independent, we have $U_s = K - 2$;
• The iterations terminate when the graph is complete, i.e., when $M_0 = K(K - 1)/2$ and $U_s = 1$.
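The iterative construction above can be sketched and sanity-checked in Python. The function `upper_bound_us` follows the star-by-star edge allocation, and `extreme_us` enumerates every S-IDNC graph with $K$ vertices and $M_0$ edges to find the smallest and largest $U_s$ by brute force (feasible only for very small $K$). The function `geller_lower_bound` assumes Geller's bound in its standard form, $\chi \geq n^2/(n^2 - 2m)$ for a graph with $n$ vertices and $m$ edges, applied to the complementary graph; this form is our assumption and all names are illustrative.

```python
from itertools import combinations, product

def upper_bound_us(K, M0):
    """Stair-case upper bound on U_s obtained by adding stars one by one."""
    if M0 == 0:
        return K
    used = 0
    for i in range(1, K):
        used += K - i          # the i-th star absorbs up to K - i edges
        if M0 <= used:
            return K - i
    raise ValueError("M0 exceeds K(K-1)/2")

def geller_lower_bound(K, M0):
    """Assumed form: ceil(K^2 / (K + 2*M0)), from chi >= n^2 / (n^2 - 2m)
    with m = K(K-1)/2 - M0 edges in the complementary graph."""
    return -(-K * K // (K + 2 * M0))

def chromatic_number(K, edges):
    """Exhaustive chromatic number; small K only."""
    for c in range(1, K + 1):
        for col in product(range(c), repeat=K):
            if all(col[i] != col[j] for i, j in edges):
                return c

def extreme_us(K, M0):
    """(min, max) of U_s over all S-IDNC graphs with K vertices, M0 edges."""
    pairs = list(combinations(range(K), 2))
    values = []
    for gs_edges in combinations(pairs, M0):
        bar_edges = [e for e in pairs if e not in gs_edges]
        values.append(chromatic_number(K, bar_edges))
    return min(values), max(values)

for M0 in range(0, 7):
    print(M0, geller_lower_bound(4, M0), extreme_us(4, M0), upper_bound_us(4, M0))
```

For $K = 4$, the enumeration covers at most 20 graphs per value of $M_0$, so the check runs instantly; larger $K$ quickly becomes impractical.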
It can be easily proved that no reallocation of edges can increase the $U_s$ derived above. Our upper bound on $U_s$ has the following stair-case profile:
Property 3. All S-IDNC graphs with $K$ vertices and $M_0$ edges have:
$$U_s \leq K - i, \quad \text{if } M_0 \in \left[\frac{(2K - i)(i - 1)}{2} + 1,\ \frac{(2K - i - 1)\,i}{2}\right],\ i \in [1, K - 1], \tag{3}$$
and $U_s \leq K$ if $M_0 = 0$.
2) $U_s$ as a function of system parameters: In addition to finding $U_s$ for a given S-IDNC graph, we are also interested in the statistical characteristics of $U_s$, for which we assume that $A$ (and thus $G_s$) is obtained as a consequence of random packet erasures in the systematic transmission phase.
For wireless broadcast, a common assumption on random packet erasures is that they are i.i.d. random variables with a probability of P e . Under this assumption, a similar question has already been introduced and answered for the RLNC technique. It has been shown in [6], [36], [37] that the block completion time of RLNC scales as O(ln(N )) when K is a constant. Consequently, the throughput of RLNC vanishes with increasing number of receivers N . To prevent zero throughput, it has been proved in [38] that K should scale faster than ln(N ).
Since the throughput of RLNC is optimal, it cannot be exceeded by the throughput of S-IDNC. Hence, we can infer that the throughput of S-IDNC should also vanish with increasing $N$. However, its rate and specific dependence on system parameters have not been fully characterized in the literature. In this subsection, we answer this question through the following theorem:
Theorem 1. The mean of the absolute minimum block completion time $U_s$ is a function of the block size $K$, the number of receivers $N$, and the packet erasure probability $P_e$ as follows:
$$E[U_s] = \left(\frac{1}{2} + o(1)\right) N \log\frac{1}{1 - P_e^2} \cdot \frac{K}{\log K}, \tag{4}$$
where $o(1)$ is a small term that approaches zero with increasing $K$.
Proof: Our approach is to model the complementary S-IDNC graph $\bar{G}_s$ as a random graph with i.i.d. edge generating probability. Recall that two vertices in $\bar{G}_s$ are connected if the two data packets conflict, i.e., if at least one receiver has missed both packets. Therefore, the edge generating probability, denoted by $P_c$, is calculated as:
$$P_c = 1 - (1 - P_e^2)^N. \tag{6}$$
Then, the key is to prove that different edges are generated independently. We first consider the independence between two adjacent edges, say $e_{1,2}$ and $e_{1,3}$, which share $v_1$. The information carried by $e_{1,2}$ (resp. $e_{1,3}$) is that there is at least one receiver who wants both $p_1$ and $p_2$ (resp. $p_1$ and $p_3$). Hence, the mutual information between $e_{1,2}$ and $e_{1,3}$ is that $p_1$ is wanted by at least one receiver, which happens with a probability of $1 - (1 - P_e)^N$. Thus:
$$I(e_{1,2}; e_{1,3}) \leq H(p_1), \tag{7}$$
where $H(p_1)$ denotes the entropy of the event that $p_1$ is wanted by at least one receiver. The inequality holds because other edges incident to $p_1$ also contribute to $H(p_1)$. Since the probability of this event approaches one with increasing $N$, $I(e_{1,2}; e_{1,3})$ quickly converges to zero, indicating that $e_{1,2}$ and $e_{1,3}$ are asymptotically independent of each other. We note that two disjoint edges in $\bar{G}_s$ share no mutual information, and thus are mutually independent. Therefore, we can assume that edges in $\bar{G}_s$ are independently generated.
Consequently, $\bar{G}_s$ can be modeled as an Erdős-Rényi random graph [39], which has $K$ vertices and an i.i.d. edge generating probability of $P_c$. Fig. 2 compares the mean number of edges (with a value of $\frac{K(K-1)}{2} P_c$) of our proposed random graph model and the simulated average number of edges in $\bar{G}_s$. Our model shows virtually no deviation under all considered values of $N$ and $K$. From graph theory, given $K$ and $P_c$, almost every random graph $\bar{G}_s$ has a chromatic number of:
$$\chi(\bar{G}_s) = \left(\frac{1}{2} + o(1)\right) \log\frac{1}{1 - P_c} \cdot \frac{K}{\log K}. \tag{8}$$
Since $U_s = \chi(\bar{G}_s)$, the above value is the mean of $U_s$. By substituting (6) into (8) we obtain (4).
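The random-graph model can be explored numerically with a short Python sketch. It draws random SFMs with i.i.d. erasures, counts conflicting packet pairs (the edges of the complementary graph), and compares the average with $\frac{K(K-1)}{2} P_c$, taking $P_c = 1 - (1 - P_e^2)^N$ as the probability that at least one of $N$ receivers misses both packets. The function `mean_us_leading_term` uses the standard leading term of the chromatic number of a dense random graph with the $o(1)$ term dropped, so for finite $K$ it is only indicative; this form, the parameter values, and all names are assumptions of the sketch.

```python
import math
import random

def conflict_probability(N, Pe):
    """Probability that two given packets conflict: at least one of the
    N receivers has missed both of them."""
    return 1.0 - (1.0 - Pe ** 2) ** N

def mean_us_leading_term(K, N, Pe):
    """Leading term (1/2) * log(1/(1-Pc)) * K / log(K), o(1) dropped.
    Requires K >= 2 and Pc < 1."""
    Pc = conflict_probability(N, Pe)
    return 0.5 * math.log(1.0 / (1.0 - Pc)) * K / math.log(K)

def simulated_mean_conflicts(K, N, Pe, trials=2000, seed=0):
    """Average number of conflicting packet pairs over random SFMs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        A = [[1 if rng.random() < Pe else 0 for _ in range(K)]
             for _ in range(N)]
        total += sum(
            1
            for i in range(K) for j in range(i + 1, K)
            if any(row[i] and row[j] for row in A)
        )
    return total / trials

K, N, Pe = 10, 5, 0.3
model = K * (K - 1) / 2 * conflict_probability(N, Pe)
print(model, simulated_mean_conflicts(K, N, Pe))
print(mean_us_leading_term(20, 5, 0.2), mean_us_leading_term(20, 10, 0.2))
```

Because $\log\frac{1}{1 - P_c} = N \log\frac{1}{1 - P_e^2}$, doubling $N$ doubles the leading term, which is the linear dependence on the number of receivers that the model predicts.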
Theorem 1 has an important corollary regarding the dependence of $E[U_s]$ on the number of receivers. By noting that the mean block completion time of the coded transmission phase is lower bounded by $E[U_s]$, we conclude that the throughput of S-IDNC is significantly affected by the number of receivers. S-IDNC may not be a good choice when the number of receivers is large. We note that the above theorem and corollary are not directly applicable to G-IDNC, because the edge generating probability in G-IDNC is quite different. Interested readers are referred to [40] for more information.
It has been proved in [41] that it is NP-hard to determine the achievability of $D$. Hence, it is NP-hard to determine the existence of $S_p$. Then by contradiction, if it is not NP-hard to find $D_s$, we can easily determine the existence of $S_p$ by comparing $D_s$ with $D_{S_p}$. Therefore, it is NP-hard to find $D_s$.
Besides the NP-hardness, $D_s$ has the following property:
$$D_s \leq \frac{U_s + 1}{2}. \tag{9}$$
Proof: Given an S-IDNC solution $S_s = \{\mathcal{M}_u\}_{u=1}^{U}$, let $T(u) = \sum_{p_k \in \mathcal{M}_u} T_k$ be the number of receivers who can decode a data packet from $\mathcal{M}_u$. The minimum APDD of $S_s$ is thus:
$$D_{S_s} = \frac{1}{T} \sum_{u=1}^{U} u\, T(u), \tag{10}$$
which, for coding sets sent in descending order of $T(u)$, is maximized when $T(u) = T/U$ for all $u$. In this case, $D_{S_s} = \frac{U + 1}{2}$. Applying this result to an S-IDNC solution with absolute minimum block completion time $U = U_s$, we obtain the result.
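As a worked check of the worst case in the proof, write the APDD of a solution as the $T(u)$-weighted average of the slot indices, consistent with (1), and assume that every coding set serves the same number of receivers, $T(u) = T/U$:

```latex
\begin{align*}
D_{S_s} &= \frac{1}{T}\sum_{u=1}^{U} u\,T(u)
         = \frac{1}{T}\cdot\frac{T}{U}\sum_{u=1}^{U} u
         = \frac{1}{U}\cdot\frac{U(U+1)}{2}
         = \frac{U+1}{2}. \\
\intertext{For instance, with $U = 4$ coding sets and $T = 8$ wanted packets, $T(u) = 2$ and}
D_{S_s} &= \frac{1\cdot 2 + 2\cdot 2 + 3\cdot 2 + 4\cdot 2}{8} = \frac{20}{8} = 2.5 = \frac{4+1}{2}.
\end{align*}
```

Any non-uniform profile sorted in descending order of $T(u)$ shifts weight toward earlier slots and therefore gives a smaller value.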
Our proof indicates that, although it is NP-hard to achieve D s , we can still effectively reduce APDD by reducing the S-IDNC solution size. Before we further explore this result to implement S-IDNC, we would like to compare the performance limits of S-IDNC that we have just derived with G-IDNC.

C. S-IDNC vs. G-IDNC
In this subsection, we address the question of how S-IDNC compares with G-IDNC.
We first note that the NP-hardness of finding $D_s$ also holds for $D_g$. This is because the perfect S-IDNC solution $S_p$ is also the best possible G-IDNC solution. For the throughput, we first present a relation between S- and G-IDNC graphs (proved in the appendix):
Theorem 3. The minimum clique partition solutions of S-IDNC and G-IDNC graphs have the same size. In other words, $\chi(\bar{G}_s) = \chi(\bar{G}_g)$.
However, the above theorem does not imply $U_s = U_g$. This is because G-IDNC does not have Property 1. Explicitly, by removing a vertex from $G_g$, more edges and larger cliques may be generated, and thus the absolute minimum block completion time $U_g$ can be smaller than $\chi(\bar{G}_g)$ of the original G-IDNC graph $G_g$ [25]. We thus have $U_g \leq U_s$. We note, however, that a systematic way of finding $U_g$ other than brute-force search remains an open problem.
Therefore, when there are no packet erasures, the throughput of G-IDNC is at least as good as S-IDNC. But is this still true in more realistic erasure-prone scenarios? In the next section, we will design S-IDNC transmission schemes under packet erasures and compare them with G-IDNC. We will apply the above theorem to show that G-IDNC cannot outperform S-IDNC under certain circumstances.

IV. S-IDNC TRANSMISSION SCHEMES
In this section, we design S-IDNC transmission schemes to compensate for packet erasures in the coded transmission phase, which are i.i.d. with a probability of $P_e$. To this end, the sender has to regularly collect feedback from the receivers about their packet reception state to make online coding decisions. We consider two common types of feedback frequency, namely:
1) fully-online feedback: feedback is collected after every coded transmission. However, this could be costly in wireless communications. We thus also consider a reduced feedback frequency;
2) semi-online feedback: feedback is only collected after several (to be quantified later) coded transmissions.
To be able to design S-IDNC transmission schemes, two questions need to be answered first:
1) What is the optimization objective for throughput and decoding delay improvement?
2) What does the sender need to send to achieve it?
Before addressing these questions, we first highlight some challenges:
Remark 1. It is intractable to minimize the mean block completion time $E[U_T]$ of the coded transmission phase.
To see this, let us consider the stochastic shortest path (SSP) method [24]. In the SSP method, the state space comprises the current SFM and its successors, and thus has a prohibitively large size of $2^T$, where $T$ is the number of "1"s in $A$. The action space for each state comprises all cliques/coding sets, which are NP-hard to find [42]. Then, $E[U_T]$ is recursively minimized by examining all the states and the associated actions. Such an examination is necessary, because the packet erasures can take any pattern and are not predictable. But it makes $E[U_T]$ intractable to minimize. To overcome this difficulty, we will turn to optimization objectives that are heuristic, but still based on SSP optimization principles.

Remark 2. It is intractable to minimize the APDD $D_T$ of the coded transmission phase due to the NP-hardness of finding $D_s$, because otherwise by setting $P_e = 0$, the minimum $D_T$ is equal to $D_s$. To overcome this difficulty, we will give higher priority to the minimization of block completion time. In other words, we first minimize the block completion time. Then among the resultant coding decisions, we choose the one that minimizes the decoding delay. Our prioritization reflects the motivation of using network coding, that is, to achieve better throughput performance. It also provides bounded decoding delay performance as we have shown in (9). This will also be confirmed by our simulations, which show that $D_T$ generally decreases with decreasing $U_T$.

A. Fully-online Transmission Scheme
With fully-online feedback frequency, the sender only transmits one coded packet before collecting feedback. Under the SSP method, the current state is the current SFM A, the absorbing state is the all-zero SFM and is denoted by A 0 . The action space comprises all the S-IDNC coding sets of A. The cost of each action is one, for it consumes one transmission. The block completion time U T is thus equal to the number of transitions (a.k.a. path length or distance) between A and A 0 .
According to Remark 1, it is intractable to choose an action/coded packet that minimizes the expected path length (and thus E[U T ]). As a heuristic alternative, we propose to choose an action/coded packet that belongs to the shortest path from A to A 0 , which has a length of U s . This choice guarantees that, upon the reception of the coded packet at all interested receivers, the shortest distance between the updated state A and A 0 is minimized to U s − 1. To this end, the coded packet must belong to a minimum clique partition solution S c . Otherwise, the shortest distance between A and A 0 is still U s .
We then reduce APDD by forcing the coded packet to be maximal (and thus serving the maximal number of receivers). However, cliques in a minimum clique partition solution are not necessarily maximal. Hence, we further require the coded packet to belong to a set of U s maximal cliques that together cover all the data packets. This set is also an S-IDNC solution and is denoted by S m .
In conclusion, we propose the following coded packet M f for fully-online transmission scheme: Given an SFM instance, the preferred coded packet M f is the most wanted coded packet in S m , where S m is an S-IDNC solution that contains U s maximal cliques.

B. Semi-online Transmission Scheme
The fully-online transmission scheme is costly, not only in collecting feedback, but also in computational load, as it has to find S m in every time slot. These problems can be mitigated by partitioning the coded transmission phase into rounds. In each round, the sender transmits a complete S-IDNC solution and only collects feedback at the end of each round. We call this scheme the semi-online scheme.
Under the SSP method, the action space is the set of all S-IDNC solutions S s , and the cost of each action is the solution size |S s |, which is equal to the length of a semi-online transmission round. The total cost is thus equal to the block completion time.
According to Remark 1, it is intractable to minimize the expected total cost (and thus $E[U_T]$). As a heuristic alternative, we propose to minimize the expected cost of the shortest path between $A$ and $A_0$. The shortest path has a length of one, representing the event that every coded packet of the chosen solution $S_s$ is received by all the interested receivers after only one semi-online round. Denote the probability of this event by $P_s$. Then the expected cost is $|S_s|/P_s$. Since each of the $T_k$ receivers that want $p_k$ misses all $d_k$ coded packets containing $p_k$ with a probability of $P_e^{d_k}$, $P_s$ is calculated as:
$$P_s = \prod_{k=1}^{K} \left(1 - P_e^{d_k}\right)^{T_k}. \tag{11}$$
Here $d_k$ is called the packet diversity and is defined below.
Definition 4. The diversity d k of data packet p k is the number of coding sets in S s that comprise p k .
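The effect of packet diversity on the success probability of a semi-online round can be illustrated in Python. The sketch assumes that a round succeeds when every receiver gets at least one of the $d_k$ coded packets containing each packet $p_k$ it wants, so that $P_s = \prod_k (1 - P_e^{d_k})^{T_k}$; this closed form, the SFM, and the function names are assumptions made for illustration.

```python
def packet_diversity(solution, K):
    """d_k: number of coding sets in the solution that contain packet k."""
    return [sum(1 for M in solution if k in M) for k in range(K)]

def round_success_probability(A, solution, Pe):
    """Probability that one semi-online round serves every receiver,
    under i.i.d. erasures with probability Pe (assumed closed form)."""
    K = len(A[0])
    d = packet_diversity(solution, K)
    T = [sum(row[k] for row in A) for k in range(K)]
    Ps = 1.0
    for k in range(K):
        if T[k] > 0:
            Ps *= (1.0 - Pe ** d[k]) ** T[k]
    return Ps

def expected_cost(A, solution, Pe):
    """Heuristic cost |S_s| / P_s of the one-round shortest path."""
    return len(solution) / round_success_probability(A, solution, Pe)

# Hypothetical SFM: rows are receivers, columns are packets.
A_EXAMPLE = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
disjoint = [{0, 1, 2}, {3}]      # a clique partition: every d_k = 1
maximal = [{0, 1, 2}, {2, 3}]    # overlapping maximal cliques: d_2 = 2

for name, sol in (("disjoint", disjoint), ("maximal", maximal)):
    print(name, packet_diversity(sol, 4),
          round_success_probability(A_EXAMPLE, sol, 0.2),
          expected_cost(A_EXAMPLE, sol, 0.2))
```

Both solutions have two coding sets, but the overlapping one repeats packet 2 and therefore attains a higher success probability and a lower expected cost in this toy instance.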
We note that the minimum clique partition solution S c is not a preferred semi-online S-IDNC solution.
Although S c offers the smallest solution size (|S c | = U s ), it does not maximize P s because every data packet has a diversity of only one due to disjoint cliques in S c . Instead, the S m we have proposed for the fully-online case can offer a higher P s than S c due to possibly overlapping maximal cliques, while also offering the smallest solution size.
We still wish to answer the following question before choosing S m as our preferred semi-online S-IDNC solution: Is there a solution that, though large in its size, provides higher packet diversities, so that P s is maximized?
Fig. 3. The fully- and semi-online transmission schemes.
An explicit answer to this question is difficult to obtain, because it requires the examination of all the solutions of size greater than $U_s$. Such a search is costly and does not provide any insight into this question. Moreover, a solution with a larger block completion time is unlikely to provide higher packet diversities due to the following property of S-IDNC solutions: every coding set in an S-IDNC solution contains at least one data packet with a diversity of one.
This property holds because if every data packet in a coding set has a diversity greater than one, then this coding set can be removed from the solution without affecting the completeness of the solution.
Due to the above property, an S-IDNC solution $S_s$ has at least $|S_s|$ data packets with a diversity of only one. According to (11), these unit-diversity data packets reduce $P_s$ the most.
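The removal argument above can be checked numerically. The following sketch, written under the assumption that a solution is simply a list of packet sets, drops coding sets whose packets all have a diversity greater than one and then counts the unit-diversity packets that remain. The function names and the toy solution are hypothetical.

```python
from collections import Counter

def diversities(solution):
    """Count, for each packet, how many coding sets contain it."""
    counts = Counter()
    for coding_set in solution:
        counts.update(coding_set)
    return counts

def prune_redundant(solution):
    """Remove, one at a time, coding sets whose packets all have diversity > 1."""
    kept = [frozenset(s) for s in solution]
    changed = True
    while changed:
        changed = False
        d = diversities(kept)
        for s in kept:
            if all(d[p] > 1 for p in s):
                kept.remove(s)   # every packet of s is still covered by another set
                changed = True
                break            # recompute diversities before testing further sets
    return kept

def unit_diversity_count(solution):
    """Number of packets that appear in exactly one coding set."""
    return sum(1 for v in diversities(solution).values() if v == 1)

# Hypothetical solution over packets 1..4; the first set is redundant.
solution = [{1, 2}, {2, 3}, {1, 3}, {3, 4}]
pruned = prune_redundant(solution)
print(len(pruned), unit_diversity_count(pruned))
```

After pruning, all four packets are still covered and the number of unit-diversity packets is no smaller than the number of coding sets, as the property predicts.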
Therefore, we choose $S_m$ for throughput improvement. Then, taking into account our secondary optimization objective, i.e., the APDD, we define our preferred semi-online S-IDNC solution as follows: given an SFM instance, the proposed semi-online S-IDNC solution is $S_m$, which comprises a set of $U_s$ maximal cliques. The cliques are sorted for transmission in descending order of their numbers of targeted receivers to minimize the APDD.
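The transmission ordering described above can be sketched as follows. The sketch assumes the SFM is stored as an N x K list of lists in which entry 1 means that the receiver still wants the packet, and that a receiver is "targeted" by a coding set if it wants at least one packet of that set. Both the representation and the toy instance are assumptions for illustration.

```python
def targeted_receivers(sfm, coding_set):
    """Receivers that still want at least one packet of `coding_set`."""
    return [n for n, row in enumerate(sfm) if any(row[k] for k in coding_set)]

def transmission_order(sfm, solution):
    """Sort coding sets by descending number of targeted receivers."""
    return sorted(solution,
                  key=lambda s: len(targeted_receivers(sfm, s)),
                  reverse=True)

# Hypothetical SFM: 4 receivers (rows), 4 packets (columns), 1 = wanted.
sfm = [
    [1, 0, 0, 1],   # receiver 0 wants packets 0 and 3
    [0, 1, 0, 0],   # receiver 1 wants packet 1
    [0, 0, 1, 0],   # receiver 2 wants packet 2
    [0, 1, 0, 0],   # receiver 3 wants packet 1
]
solution = [(3,), (0, 1, 2)]   # each set holds at most one wanted packet per receiver
order = transmission_order(sfm, solution)
print(order)
```

Sending the coding set that serves the most receivers first lets more receivers decode early, which is the intuition behind using this order to reduce the APDD.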
Flow charts of the two proposed transmission schemes are presented in Fig. 3. Both the fully- and semi-online IDNC schemes require finding $S_m$. Since packet diversity is not a concern in graph theory, there are no algorithms for finding $S_m$ in the graph theory literature. Hence, we will design algorithms dedicated to S-IDNC in the next section. Before doing so, however, we briefly compare S-IDNC and G-IDNC under the above two transmission schemes.

C. S-IDNC vs. G-IDNC
With fully-online feedback, the sender can update the G-IDNC graph and add new edges representing coding opportunities after every transmission. The throughput of G-IDNC is thus better than that of S-IDNC. But the price is a high computational load, because the G-IDNC graph is much larger than the S-IDNC graph ($O(NK)$ vs. $K$ vertices). However, during a semi-online transmission round, the sender cannot update the SFM due to the absence of receiver feedback. Consequently, it does not update the G-IDNC graph $G_g$ [29], and only sends the minimum clique partition solution of $G_g$, which, according to Theorem 3, has the same size as the minimum clique partition solution of S-IDNC. We thus have the following corollary:
Corollary 2. G-IDNC cannot reduce the length of a semi-online transmission round compared to S-IDNC.

V. S-IDNC CODING ALGORITHMS
The two transmission schemes proposed in the last section require finding $S_m$, an S-IDNC solution that contains $U_s$ maximal coding sets. In this section, we develop optimal and heuristic algorithms for finding it.

A. Optimal S-IDNC Coding Algorithm
Our optimal S-IDNC coding algorithm finds $S_m$ in two steps:
Step-1 Find all the maximal coding sets (maximal cliques): This problem is NP-hard in graph theory. We apply an exponential-time algorithm, the Bron-Kerbosch (B-K) algorithm [42]. The group of all maximal cliques is denoted by $\mathcal{A}$.
Step-2 Find $S_m$ from $\mathcal{A}$: We propose a branching algorithm in Algorithm 1. The intuition behind this algorithm is that, if a data packet $p_k$ belongs to $d_k$ maximal coding sets in $\mathcal{A}$, then one of these $d_k$ maximal coding sets must be included in $S_m$ for the completeness of $S_m$. In the extreme case where $d_k = 1$, the sole maximal coding set that contains $p_k$ must be included in $S_m$. Below is an example of Algorithm 1.
[Figure: example of Algorithm 1, showing the IDNC graph, its maximal cliques, Step-1, and Step-2.]
2) The set of data packets not included in $S$ is $P = \{p_4, p_6\}$, and the remaining maximal coding sets are $\bar{S} = \mathcal{A} \setminus S = \{(p_3, p_4), (p_4, p_6), (p_5, p_6)\}$. Since $p_4$ has a diversity of 2 under $\bar{S}$ due to ...
The B-K algorithm and Algorithm 1 constitute our optimal S-IDNC coding algorithm. It produces all the valid $S_m$. Among these solutions, we can choose the one that optimizes a secondary criterion, such as the one offering the smallest $D_S$, or the largest $P_S$, calculated using (11).
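The two steps can be sketched in Python as below. The first function is the basic Bron-Kerbosch recursion without pivoting. The second is only one possible reading of the branching intuition (pick an uncovered packet, then branch on every maximal coding set that contains it); it is not a reproduction of Algorithm 1, whose listing is not shown here. The six-packet graph is a hypothetical example.

```python
def bron_kerbosch(adj):
    """Enumerate all maximal cliques of a graph given as {vertex: set(neighbours)}."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))   # r cannot be extended: maximal clique
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques

def minimum_covers(cliques, packets):
    """Return the smallest cover size and all covers of that size (branching search)."""
    best = {"size": len(cliques) + 1, "covers": set()}

    def branch(chosen, uncovered):
        if not uncovered:
            if len(chosen) < best["size"]:
                best["size"], best["covers"] = len(chosen), set()
            if len(chosen) == best["size"]:
                best["covers"].add(chosen)
            return
        if len(chosen) >= best["size"]:
            return                          # cannot match the best size any more
        # Branch on the uncovered packet that lies in the fewest maximal cliques.
        k = min(uncovered, key=lambda q: sum(q in c for c in cliques))
        for c in cliques:
            if k in c:
                branch(chosen | {c}, uncovered - c)

    branch(frozenset(), frozenset(packets))
    return best["size"], best["covers"]

# Hypothetical S-IDNC graph on packets 1..6 (an edge means "no conflict").
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 6), (5, 6)]
adj = {v: set() for v in range(1, 7)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

cliques = bron_kerbosch(adj)
size, covers = minimum_covers(cliques, adj.keys())
print(size, len(covers))
```

In this toy graph, packets 1 and 5 each lie in a single maximal clique, so those cliques are forced into every solution, and the search only branches on how to cover packet 4.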

B. Hybrid S-IDNC Coding Algorithm
Algorithm 1 is memory-demanding, because the number of candidate solutions grows exponentially with branching. Thus, we propose a heuristic alternative to it. The idea is to iteratively maximize the number of data packets included in $S_m$. The algorithm is given in Algorithm 2.
The B-K algorithm and Algorithm 2 constitute our hybrid S-IDNC coding algorithm. It produces only one S-IDNC solution, with no guarantee on the solution size. It is still computationally expensive due to the B-K algorithm. Thus, we develop a polynomial-time heuristic S-IDNC coding algorithm next.
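A minimal sketch of the greedy idea behind Algorithm 2 follows: at each iteration, pick the maximal coding set that contains the most not-yet-covered packets. The input list of maximal coding sets is hypothetical (it corresponds to a six-packet graph chosen so that the greedy choice is not optimal), and the function name is an assumption.

```python
def greedy_cover(maximal_cliques, packets):
    """Repeatedly add the coding set covering the most uncovered packets."""
    uncovered = set(packets)
    solution = []
    while uncovered:
        best = max(maximal_cliques, key=lambda c: len(uncovered & c))
        if not uncovered & best:
            raise ValueError("some packet belongs to no coding set")
        solution.append(best)
        uncovered -= best
    return solution

# Hypothetical maximal coding sets of a graph on packets 1..6.
maximal_cliques = [frozenset({1, 2, 3, 4}), frozenset({1, 2, 5}), frozenset({3, 4, 6})]
solution = greedy_cover(maximal_cliques, range(1, 7))
print(len(solution))
```

Here the greedy rule first takes {1, 2, 3, 4} and then needs two more coding sets, although {1, 2, 5} and {3, 4, 6} alone would cover all six packets. This illustrates why the hybrid algorithm carries no guarantee on the solution size.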

C. Heuristic S-IDNC Coding Algorithm
Algorithm 3 is a simple algorithm that heuristically finds the maximum (the largest maximal) clique of a graph. The intuition behind this algorithm is that a vertex is very likely to be in the maximum clique if it is incident to the largest number of edges. Variations of this algorithm have been developed in the literature [12], [13], [24]. But this algorithm has not been applied to finding a complete S-IDNC solution, and its computational complexity has not been identified yet.
The computational complexity of Algorithm 3 is polynomial in the number of data packets $K$. The highest computational cost occurs when the input graph is complete, i.e., when all vertices are connected to each other. In this case, only one vertex will be removed in each iteration. Thus, the number of remaining vertices in iteration $i$ will be $K - i$, $\forall i \in [0, K - 1]$. Then, to find the vertex with the largest number of incident edges, we need $K - i$ comparisons. The total computational cost is thus on the order of
$$\sum_{i=0}^{K-1} (K - i) = \frac{K(K+1)}{2}.$$
Hence, the computational complexity of Algorithm 3 is at most $O(K^2)$.
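One way to realize the greedy clique search analysed above is sketched below: take the remaining vertex with the most remaining neighbours, add it to the clique, and keep only its neighbours as candidates. This is an assumed reading of Algorithm 3, whose listing is not reproduced here. The sketch recomputes degrees in every iteration for brevity, so it does not attain the comparison count used in the analysis.

```python
def heuristic_max_clique(adj):
    """Greedy clique search on a graph given as {vertex: set(neighbours)}."""
    candidates = set(adj)
    clique = set()
    while candidates:
        # Vertex with the largest number of edges to the remaining candidates.
        v = max(candidates, key=lambda u: len(adj[u] & candidates))
        clique.add(v)
        candidates &= adj[v]   # keep only neighbours of v; v itself is dropped
    return clique

# Hypothetical graph on packets 1..6.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 6), (5, 6)]
adj = {v: set() for v in range(1, 7)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

# Complete graph on 5 vertices: the worst case, one vertex removed per iteration.
complete = {v: set(range(5)) - {v} for v in range(5)}

print(heuristic_max_clique(adj))
print(heuristic_max_clique(complete))
```

Every vertex added is adjacent to all previously added ones, so the output is always a clique, and it is maximal because the loop stops only when no candidate remains.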
We apply Algorithm 3 to iteratively find $S_m$ in Algorithm 4. In each iteration, we find a clique using Algorithm 3, maximize it by adding more vertices to it whenever possible, and then remove it from the S-IDNC graph. This will increase the diversities of the added vertices/packets. Below is an example:
Example 5. Consider the graph $G_s$ in Fig. 1(b). In the first two iterations, the algorithm will choose ...

Algorithm 2 (excerpt):
Find the coding set $M$ in $\mathcal{A}$ that contains the largest number of data packets in $P$;
Add $M$ to $S$ and remove the data packets in $M$ from $P$;
$u = u + 1$;

Algorithm 4 (excerpt):
Find the maximum clique in $G_w$ using Algorithm 3. Denote it by $M_i$;
Find the vertices in $V_{\text{covered}}$ that are connected to $M_i$. Denote their set by $V_i$ (they are the candidate vertices that could be added to $M_i$);
Generate a subgraph of $G$ whose vertex set is $V_i$. Denote this subgraph by $G_i(V_i, E_i)$;

We also compare S-IDNC with RLNC and G-IDNC. For RLNC, we assume a sufficiently large finite field, so that its throughput is almost surely optimal and serves as a benchmark. For G-IDNC, although its best performance is at least as good as that of S-IDNC (as we have explained in Section III-C), this advantage will not necessarily be reflected in our simulation results. This is because no optimal G-IDNC algorithm has been developed. Instead, we apply a heuristic algorithm (abbreviated as Heur. G- in the figures) proposed in [24], which aims at minimizing the block completion time. This aim coincides with our optimization priorities for S-IDNC in Remark 2, namely, to minimize the block completion time first.
We conduct three sets of simulations. The first set compares the performance limits of the three techniques. The results are presented in Fig. 5. For RLNC, the absolute minimum block completion time is equal to the size of the largest Wants set among the receivers. This number cannot be further reduced by any means, because otherwise the most demanding receiver cannot decode all its wanted data packets. The second (resp. third) set of simulations compares the throughput and decoding delay performance under the fully-online (resp. semi-online) transmission scheme. The results are presented in Fig. 6 (resp. Fig. 7). We note that the performance of RLNC is the same under both schemes, because RLNC is feedback-free.
Our observations on S-IDNC are as follows:
• The absolute minimum block completion time of S-IDNC increases almost linearly with $N$. This result matches Corollary 1;
• The fully-online transmission scheme always provides better throughput and decoding delay performance than the semi-online one;
• The optimal coding algorithm always provides better throughput performance than its hybrid and heuristic alternatives. This result verifies our choice of $S_m$ for throughput improvement, because only the optimal coding algorithm can always produce $S_m$, which has $|S_m| = U_s$;
• However, the optimal coding algorithm does not necessarily minimize the APDD. For example, in Fig. 6(b), the hybrid algorithm provides a smaller APDD than the optimal one under the fully-online transmission scheme;
• The performance gap between the optimal and hybrid algorithms is always marginal, and is much smaller than their gap with the heuristic one. Hence, the hybrid algorithm strikes a good balance between performance and computational load.
A cross comparison of RLNC, S-IDNC, and G-IDNC shows that:
• The throughput of RLNC is always the best. The throughput of S-IDNC is very close to that of RLNC when the number of receivers is small. Their gap increases with $N$;
• In general, the APDD of both S- and G-IDNC is better than that of RLNC. This advantage only vanishes when the block completion time of S- and G-IDNC becomes much larger than that of RLNC, which takes place when $N$ is much larger than $K$;
• There is no clear winner between the performance of heuristic G-IDNC and optimal S-IDNC. We can expect that G-IDNC will outperform S-IDNC if its optimal coding algorithm is developed.
[Figure caption fragment: It is compared with the performance of heuristic fully-online G-IDNC and RLNC.]
In summary, our simulations verified our theorems, propositions, and algorithms. They also demonstrated that, if we are concerned with both throughput and decoding delay performance, S-IDNC is a good alternative to RLNC when the number of receivers is not too large.
[Figure caption fragment: It is compared with the performance of heuristic semi-online G-IDNC and RLNC.]

VII. CONCLUSION
In this paper, we studied the throughput and decoding delay performance of S-IDNC in broadcasting a block of data packets to wireless receivers under packet erasures. By using a random graph model, we showed that the throughput of S-IDNC decreases linearly with an increasing number of receivers. By introducing the concept of a perfect S-IDNC solution, we proved the NP-hardness of minimizing the average packet decoding delay. We also proposed two upper bounds on the throughput and decoding delay limits of S-IDNC. We considered two transmission schemes that require fully- and semi-online feedback frequencies, respectively. By applying the stochastic shortest path method, we showed that it is intractable to make optimal coding decisions in the presence of random packet erasures. We then used heuristic objective functions to determine the preferred coded packet(s) to send, and developed the optimal S-IDNC coding algorithm and its complexity-reduced heuristics. We also compared S-IDNC with G-IDNC by proving the equivalence between the chromatic numbers of the complementary S-IDNC and G-IDNC graphs. We used this equivalence to show that G-IDNC can outperform S-IDNC when there are no packet erasures, but this is not always true when there are packet erasures.
Our work provides a new understanding of S-IDNC. It will facilitate the extension of S-IDNC to applications in other network settings, such as cooperative data exchange and distributed data storage. We are also interested in designing approximation and heuristic algorithms for decoding delay minimization.

APPENDIX A
PROOF OF THEOREM 3
Theorem 3 requires the proof of $\chi(\overline{G}_s) = \chi(\overline{G}_g)$. Since every S-IDNC solution is also a G-IDNC solution, but a G-IDNC solution is not necessarily an S-IDNC solution, we have $U_s \geq U_g$, and thus $\chi(\overline{G}_s) \geq \chi(\overline{G}_g)$. Hence, here we only need to prove that $\chi(\overline{G}_s) \leq \chi(\overline{G}_g)$.
We first introduce the concept of the affiliated S-IDNC graph $G_{as}$ of a G-IDNC graph $G_g$, which is constructed as follows. Given $G_g$ that involves $K$ data packets and $N$ receivers, we generate a graph $G_{as}$ with $K$ vertices, each representing a data packet. We then connect $v_i$ and $v_j$ in $G_{as}$ if, for every pair $m, n \in [1, N]$, $v_{i,m}$ and $v_{j,n}$ are connected whenever both exist in $G_g$. In other words, we claim that $p_i$ and $p_j$ do not conflict if every vertex that represents $p_i$ in $G_g$ is connected to every vertex that represents $p_j$ in $G_g$.
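The construction above can be sketched on a toy instance. The sketch assumes an SFM stored as an N x K list of lists (1 = wanted, 0 = already received) and assumes the usual G-IDNC adjacency rule, which is not restated in this appendix: two vertices are connected if they represent the same packet, or if each receiver already has the packet wanted by the other. All function names and the SFM are hypothetical. On this instance, the affiliated graph coincides with the S-IDNC graph built directly from the SFM.

```python
from itertools import combinations

def s_idnc_graph(sfm):
    """Edge {i, j} when no receiver wants both packets i and j."""
    num_packets = len(sfm[0])
    return {frozenset((i, j)) for i, j in combinations(range(num_packets), 2)
            if not any(row[i] and row[j] for row in sfm)}

def g_idnc_graph(sfm):
    """Vertices (k, n): receiver n wants packet k. Adjacency rule is an assumption:
    same packet, or each receiver already has the other's wanted packet."""
    vertices = [(k, n) for n, row in enumerate(sfm)
                for k, wanted in enumerate(row) if wanted]
    edges = set()
    for (i, m), (j, n) in combinations(vertices, 2):
        if i == j or (not sfm[m][j] and not sfm[n][i]):
            edges.add(frozenset(((i, m), (j, n))))
    return vertices, edges

def affiliated_graph(num_packets, vertices, edges):
    """Connect packets i and j when all their copies in the G-IDNC graph are connected."""
    copies = {k: [v for v in vertices if v[0] == k] for k in range(num_packets)}
    return {frozenset((i, j)) for i, j in combinations(range(num_packets), 2)
            if all(frozenset((u, v)) in edges
                   for u in copies[i] for v in copies[j])}

# Hypothetical SFM: 3 receivers, 4 packets; every packet is wanted by someone.
sfm = [
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 1],
]
vertices, edges = g_idnc_graph(sfm)
g_s = s_idnc_graph(sfm)
g_as = affiliated_graph(len(sfm[0]), vertices, edges)
print(g_as == g_s)
```

The check relies on every packet being either wanted or already held by each receiver, which is the situation after the initial uncoded broadcast phase.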
Given an SFM $A$, we can easily show that its S-IDNC graph $G_s$ is the same as the affiliated S-IDNC graph $G_{as}$ of its G-IDNC graph $G_g$. Hence, our task becomes proving that $\chi(\overline{G}_{as}) \leq \chi(\overline{G}_g)$, where $\chi(\overline{G}_{as}) = U_s$. This statement is true if the following property is true:
Property 6. After removing any clique $M_g$ from $G_g$, the chromatic number $\chi(\overline{G}_{as})$ of the complementary affiliated S-IDNC graph is reduced by at most one.
Since $G_g$ is nonempty as long as $G_{as}$ is nonempty, this property indicates that any clique partition solution of $G_g$ must have a size of at least $\chi(\overline{G}_{as})$, which proves that $\chi(\overline{G}_{as}) \leq \chi(\overline{G}_g)$. Property 6 can be proved through induction:
1) If $M_g$ does not contain any conflicting data packets in $G_{as}$, then $\chi(\overline{G}_{as})$ is reduced by at most one;
2) If $M_g$ contains one pair of conflicting data packets in $G_{as}$, then $\chi(\overline{G}_{as})$ is reduced by at most one;
3) If $M_g$ already contains $m$ pairs of conflicting data packets in $G_{as}$, then modifying $M_g$ to contain one more pair of conflicting data packets in $G_{as}$ cannot further reduce $\chi(\overline{G}_{as})$.
The first statement is self-evident, because the set of data packets included in such $M_g$ is a clique of $G_{as}$. By removing it, $\chi(\overline{G}_{as})$ can be reduced by at most one.
To prove the second statement, without loss of generality we assume that the pair of conflicting data packets is $(p_1, p_2)$. Then the set of data packets included in $M_g$ takes the form $\{M_s, p_1, p_2\}$, where $M_s$ is a set of pairwise non-conflicting data packets, and thus is a clique of $G_{as}$. Since $p_1$ conflicts with $p_2$, there exists at least one pair of unconnected vertices in $G_g$ that represent $p_1$ and $p_2$. This pair is not included in $M_g$, and thus is kept after removing $M_g$ from $G_g$. Hence, in the updated affiliated S-IDNC graph $G_{as}$, $v_1$ and $v_2$ exist, and are unconnected. Let the chromatic number of the updated $\overline{G}_{as}$ be $U$. Then the minimum clique partition of the updated $G_{as}$ takes the form $\{M_1, \cdots, M_U\}$, which keeps $p_1$ and $p_2$ in different coding sets. Then, since $M_s$ is a clique of $G_{as}$, $\{M_s, M_1, \cdots, M_U\}$ is a clique partition of the original $G_{as}$ with a size of $U + 1$. Thus, $U \geq \chi(\overline{G}_{as}) - 1$, implying that $\chi(\overline{G}_{as})$ is reduced by at most one after removing $M_g$.
The proof of the third statement is similar to the second one, and thus is omitted here. According to the above three statements, no matter how many conflicting data packets are included in $M_g$, after removing $M_g$ from $G_g$, the chromatic number $\chi(\overline{G}_{as})$ of the complementary affiliated S-IDNC graph is reduced by at most one. Therefore, $\chi(\overline{G}_g) \geq \chi(\overline{G}_{as})$. Since $G_{as}$ is the same as $G_s$, we have $\chi(\overline{G}_g) \geq \chi(\overline{G}_s)$ and Theorem 3 is proved.