An Integrated Source and Channel Rate Allocation Scheme for Robust Video Coding and Transmission over Wireless Channels

A new integrated framework for source and channel rate allocation is presented for video coding and transmission over wireless channels without feedback channels available. For a fixed total channel bit rate and a finite number of channel coding rates, the proposed scheme can obtain the near-optimal source and channel coding pair and corresponding robust video coding scheme such that the expected end-to-end distortion of video signals can be minimized. With the assumption that the encoder has the stochastic information such as average SNR and Doppler frequency of the wireless channel, the proposed scheme takes into account robust video coding, channel coding, packetization, and error concealment techniques altogether. An improved method is proposed to recursively estimate the end-to-end distortion of video coding for transmission over error-prone channels. The proposed estimation is about 1–3 dB more accurate compared to the existing integer-pel-based method. Rate-distortion-optimized video coding is employed for the trade-off between coding efficiency and robustness to transmission errors.


INTRODUCTION
Multimedia applications such as video phone and video streaming will soon be available in the third generation (3G) wireless systems and beyond. For these applications, delay constraint makes the conventional automatic repeat request (ARQ) and the deep interleaver not suitable. Feedback channels can be used to deal with the error effects incurred in image and video transmission over error-prone channels [1], but in applications such as broadcasting services, there is no feedback channel available. In such cases, the optimal trade-off between source and channel coding rate allocations for video transmission over error-prone channels becomes very important. According to Shannon's separation theory, these components can be designed independently without loss in performance [2]. However, this is based on the assumption that the system has an unlimited computational complexity and infinite delay. These assumptions are not satisfied in delay-sensitive real-time multimedia communications. Therefore, it is expected that joint considerations of source and channel coding can provide performance improvement [3,4].
Most of the joint source and channel coding (JSCC) schemes have focused on images and on sources with ideal signal models [4,5]. For video coding and transmission, many works still keep the source coding and channel coding separate instead of optimizing their parameters jointly from an overall end-to-end transmission point of view [6,7]. Some excellent reviews of robust video coding and transmission over wireless channels can be found in [8,9]. In [10], a JSCC approach is proposed for layered video coding and transport over error-prone packet networks. That work presented a framework that trades video source coding efficiency for increased bitstream error resilience, optimizing the video coding mode selection with consideration of the channel conditions as well as the error recovery and concealment capabilities of the channel codec and source decoder, respectively. However, the optimal source and channel rate allocation and the corresponding video macroblock (MB) mode selection have to be selected through simulations over packet-loss channel models. In [11], a parameterized model is used for the analysis of the overall mean square error (MSE) in hybrid video coding for error-prone transmission. Models for the video encoder, a bursty transmission channel, and error propagation at the video decoder are combined into a complete model of the entire video transmission system. However, the encoder model involves several parameters and is not theoretically optimal because of the use of random MB intra-mode updating, which does not consider the different motion activities within a video frame when dealing with error propagation. Furthermore, the models depend on distortion-parameter functions obtained through ad hoc numerical models and simulations over specific video sequences, which also involves considerable simulation effort and approximation.
The authors of [12] proposed an operational rate-distortion (RD) model for DCT-based video coding incorporating the MB intra-refreshing rate and an analytic model for video error propagation which has relatively low computational complexity and is suitable for realtime wireless video applications. Both methods in [11,12] focus on the statistical model optimization for general video sequence, which is not necessarily optimal for a specific video sequence because of the nonstationary behavior across different video sequences.
In this paper, we propose an integrated framework to obtain the near-optimal source and channel rate allocation and the corresponding robust video coding scheme for a given total channel bit rate, with knowledge of the stochastic characteristics of the wireless fading channel. We consider the video coding error (quantization and mode selection of MBs), error propagation, and concealment effects at the receiver due to transmission errors, packetization, and channel coding in an integrated manner. The contributions of this paper are the following. First, we present an integrated system design method for wireless video communications in realistic scenarios. The proposed method takes into account the interactions of the fading channel, channel coding and packetization, and robust video coding in an integrated, yet simple way, which is an important system design issue for wireless video applications. Second, we propose an improved video distortion estimation which is about 1–3 dB peak signal-to-noise ratio (PSNR) more accurate than the original integer-pel-based method (IP) in [13] for half-pel-based video coding (HP), and the computational complexity of the proposed method is less than that in [13].
The rest of the paper is organized as follows. Section 2 first describes the system to be studied, then the packetization and channel coding schemes used. We also derive the integrated relation between the MB error probability and the channel coding error probability, given general wireless fading channel information such as the average signal-to-noise ratio (SNR) and the Doppler frequency. Section 3 presents the improved end-to-end distortion estimation method for HP-based video coding. Simulations are performed to compare the proposed method to the IP-based method in [13]. Then we employ an RD-optimized video coding scheme to optimize the end-to-end performance for each pair of source and channel rate allocations. Simulation results are shown in Section 4 to demonstrate the accuracy of the proposed end-to-end distortion estimation algorithm under different channel characteristics. Conclusions are stated in Section 5.

PROBLEM DEFINITION AND INTEGRATED SYSTEM STRUCTURE
The problem to be studied is illustrated in Figure 1 and can be specified by five parameters (r, r_c, ρ, f_d, F): r is the total channel bit rate, r_c is the channel coding rate, ρ is the average SNR at the receiver, f_d is the Doppler frequency of the targeted fading channel, and F is the video frame rate. H.263 [14] is used for video coding. A video sequence, denoted as {f_l^s}, where s = (x, y) is the pixel spatial location and l = 1, . . . , L is the frame index, is encoded at the bit rate r_s = r × r_c b/s and the frame rate F f/s with the MB error probability P_MB = f(ρ, f_d, r_c) that will be detailed next. The resulting H.263 bitstream is packetized and protected by forward error correction (FEC) channel coding with the coding rate r_c. The resulting bitstream with rate r b/s is transmitted through wireless channels characterized by ρ and f_d. The receiver receives the bitstream corrupted by the channel impairments, then reconstructs the video sequence f̃_l^s after channel decoding, H.263 video decoding, and possible error concealment if residual errors occur. The end-to-end MSE between the input video sequence at the encoder and the reconstructed video sequence at the decoder is defined as

D_E = (1 / (L · X · Y)) Σ_{l=1}^{L} Σ_s E{(f_l^s − f̃_l^s)^2}, (1)

where the inner sum is over all pixel locations s in a frame of X × Y pixels. For the video system in Figure 1, there are two tasks to be performed with the five given system parameters (r, r_c, ρ, f_d, F). First, we need to decide how to allocate the total fixed bit rate r to the source rate r_s = r × r_c to minimize the end-to-end MSE of the video sequence. Second, for a source/channel rate allocation (r_s, r_c) with residual channel decoding failure rate denoted as p_w(r_c), the video encoder should be able to select the coding mode and quantizer for each MB to minimize the end-to-end MSE of the video sequence. The goal is to obtain the source/channel rate pair (r_s*, r_c*) and the corresponding robust video coding scheme that minimize (1).
In practical applications, there is only a finite number of source/channel pairs available. We can find the robust video encoding scheme for each rate pair (r_s, r_c) that minimizes (1), and denote the minimal end-to-end MSE obtained as D_E*(r_s, r_c); then the optimal source/channel rate pair (r_s*, r_c*) and the corresponding video coding scheme can be obtained as

(r_s*, r_c*) = arg min_{(r_s, r_c)} D_E*(r_s, r_c). (2)

For each pair (r_s, r_c), we use an RD-optimized video coding scheme to trade off between the source coding efficiency and robustness to error propagation. An improved recursive method which takes into account the interframe prediction, error propagation, and concealment effects is used to estimate the end-to-end MSE frame by frame. In this paper, the wireless fading channel is modeled as a finite-state Markov chain (FSMC) model [15,16,17], and the Reed-Solomon (RS) code is employed for forward error coding.

Modeling fading channels using finite-state Markov chain
Gilbert and Elliott [15,16] studied a two-state Markov channel model, where each state corresponds to a specific channel quality. This model provides a close approximation for the error rate performance of block codes on some noisy channels. On the other hand, when the channel quality varies rapidly, such as under a large Doppler spread, the two-state Gilbert-Elliott model becomes inadequate. Wang and Moayeri extended the two-state model to an FSMC model for characterizing Rayleigh fading channels [17]. In [17], the received SNR is partitioned into a finite number of intervals. Denote by 0 = A_0 < A_1 < A_2 < · · · < A_K = ∞ the SNR thresholds of the different intervals; then if the received SNR is in the interval [A_k, A_{k+1}), k ∈ {0, 1, 2, . . . , K − 1}, the fading channel is said to be in state S_k. It turns out that if the channel changes slowly and is properly partitioned, each state can be considered as a steady state, and a state transition can only happen between neighboring states. As a result, a fading channel can be represented using a Markov model given the average SNR ρ and the Doppler frequency f_d.
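The FSMC construction above can be sketched as follows. This is only an illustrative sketch of the Wang-Moayeri style derivation, assuming an exponential SNR distribution (Rayleigh fading) and the standard level-crossing-rate approximation for the transition probabilities; the function name and argument layout are ours, not from [17].

```python
import math

def fsmc_rayleigh(avg_snr, f_d, t_sym, thresholds):
    """Sketch of an FSMC for a Rayleigh fading channel.
    avg_snr: average SNR rho (linear scale), f_d: Doppler frequency (Hz),
    t_sym: symbol duration (s), thresholds: [A_0 = 0, A_1, ..., A_K = inf].
    Returns (steady-state probabilities, transition matrix)."""
    K = len(thresholds) - 1
    # Steady-state probability of state S_k: integral of the exponential
    # SNR pdf over [A_k, A_{k+1}).  math.exp(-inf) evaluates to 0.0.
    p = [math.exp(-thresholds[k] / avg_snr)
         - math.exp(-thresholds[k + 1] / avg_snr) for k in range(K)]

    def lcr(a):
        # Level-crossing rate of the fading envelope at SNR level a.
        return math.sqrt(2 * math.pi * a / avg_snr) * f_d * math.exp(-a / avg_snr)

    trans = [[0.0] * K for _ in range(K)]
    for k in range(K):
        # Slow fading: transitions occur only between neighboring states.
        if k < K - 1:
            trans[k][k + 1] = lcr(thresholds[k + 1]) * t_sym / p[k]
        if k > 0:
            trans[k][k - 1] = lcr(thresholds[k]) * t_sym / p[k]
        trans[k][k] = 1.0 - sum(trans[k])  # probability of staying
    return p, trans
```

Choosing the thresholds so that each state is equally likely (A_k = −ρ ln(1 − k/K)) gives steady-state probabilities of 1/K each; other partitioning rules fit into the same skeleton.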

Performance analysis of RS code over finite-state Markov channel model
RS codes possess the maximum distance separable property, which makes them powerful in correcting errors with arbitrary distributions. For RS symbols composed of m bits, the encoder for an RS(n, k) code groups the incoming bitstream into blocks of k information symbols and appends n − k redundancy symbols to each block, so the channel coding rate is r_c = k/n. For an RS(n, k) code, the maximal number of symbol errors that can be corrected is t = ⌊(n − k)/2⌋. When the number of symbol errors is more than t, the RS decoder raises a flag to notify that the errors are uncorrectable. The probability that a block cannot be corrected by RS(n, k), denoted as the decoding failure probability p_w(n, k), can be calculated as

p_w(n, k) = Σ_{m=t+1}^{n} P(n, m), (3)

where P(n, m) denotes the probability of m symbol errors within a block of n successive symbols. The computation of P(n, m) for the FSMC channel model has been studied before (see [16,18]).
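As a concrete special case of (3), for a memoryless channel with symbol error probability e the distribution P(n, m) is binomial, and the decoding failure probability can be computed as below. The paper derives P(n, m) from the FSMC model instead; this memoryless sketch (function name is ours) only illustrates the summation in (3).

```python
import math

def rs_failure_prob(n, k, sym_err):
    """Decoding failure probability of RS(n, k): more than
    t = floor((n - k) / 2) symbol errors in a block of n symbols.
    Memoryless-channel special case: P(n, m) is binomial with
    per-symbol error probability sym_err."""
    t = (n - k) // 2
    return sum(math.comb(n, m) * sym_err**m * (1 - sym_err) ** (n - m)
               for m in range(t + 1, n + 1))
```

For bursty channels the binomial P(n, m) is replaced by the FSMC-based block error distribution, but the outer summation over m > t is unchanged.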

Packetization and macroblock error probability computation
We use the baseline H.263 video coding standard for illustration. The H.263 GOB/slice structure is used, where each GOB/slice is encoded independently with a header to improve resynchronization. Denoting by N_s the number of GOBs/slices in each frame, the RS(n, k) code block size n (bytes) is set to

n = ⌈ r / (8 · N_s · F) ⌉, (4)

such that each GOB/slice is protected by one RS codeword on average, where ⌈x⌉ is the smallest integer not less than x. No further alignment is used. In case of a decoding failure of an RS codeword, the GOBs (groups of blocks) covered by the RS codeword are simply discarded, followed by error concealment. If a GOB is corrupted, the decoder simply drops the GOB and performs a simple error concealment as follows: the motion vector (MV) of a corrupted MB is replaced by the MV of the MB in the GOB above. If the GOB above is also lost, the MV is set to zero, and the MB is replaced by the MB at the same location in the previous frame. To facilitate error concealment at the decoder when errors occur, the GOBs with even indices are concatenated together, followed by the concatenated GOBs with odd indices. With this alternating GOB organization, neighboring GOBs are normally not protected within the same RS codeword. Thus, when a decoding failure occurs in one RS codeword, the neighboring GOBs will not be corrupted simultaneously, which helps the decoder perform error concealment using the neighboring correctly received GOBs. In order to estimate the end-to-end distortion, we need to model the relation between the video MB error probability P_MB(n, k) and the RS(n, k) decoding failure probability p_w(n, k), that is,

P_MB(n, k) = α · p_w(n, k). (5)

Since no special packetization or alignment is used, one RS codeword may contain part of one GOB/slice or overlap more than one GOB/slice. It is difficult to find the exact relation between P_MB(n, k) and p_w(n, k) because the length of a GOB varies from frame to frame. Intuitively, α should be between 1 and 2.
Experiments are performed to find a suitable α. Figure 2 shows the experimental results of the RS codeword failure probability and the GOB error probability over Rayleigh fading channels. It turns out that α ≈ 1.5 is a good approximation on average. For a source and channel code pair (r_s, r_c), or RS(n, k), the channel decoding failure probability p_w(n, k) can be derived from ρ and f_d as described in Sections 2.1 and 2.2; we then have the corresponding video MB error probability P_MB(n, k) from (5). Based on the derived MB error rate P_MB(n, k), a recursive estimation method and an RD-optimized scheme are employed to estimate the minimal end-to-end MSE of the video sequence and to obtain the corresponding optimized video coding scheme, which is described in detail in the next section.
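The alternating GOB ordering and the empirical mapping from RS decoding failures to MB errors can be sketched as follows; the helper names are ours, and the value α = 1.5 is the experimentally fitted constant from the paragraph above.

```python
def interleaved_gob_order(num_gobs):
    """Even-indexed GOBs first, then odd-indexed ones, so that
    neighboring GOBs rarely share an RS codeword.  When one codeword
    fails, the GOBs above and below the lost one usually survive,
    which helps the decoder's motion-vector-based concealment."""
    evens = [g for g in range(num_gobs) if g % 2 == 0]
    odds = [g for g in range(num_gobs) if g % 2 == 1]
    return evens + odds

def mb_error_prob(p_w, alpha=1.5):
    """Empirical mapping P_MB = alpha * p_w of (5), clipped to 1.
    alpha lies between 1 and 2 because a codeword may span parts of
    more than one GOB/slice; alpha = 1.5 fits the measurements."""
    return min(1.0, alpha * p_w)
```

For a QCIF frame with 9 GOBs, the transmission order is 0, 2, 4, 6, 8, 1, 3, 5, 7, so GOBs that are vertical neighbors in the frame are far apart in the bitstream.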

OPTIMAL DISTORTION ESTIMATION AND MINIMIZATION
We first describe the proposed distortion estimation method for both HP- and IP-based video coding over error-prone channels. Simulations are performed to demonstrate the improved performance of the proposed method. Then an RD framework is used to select the coding mode and quantizer for each MB to minimize the estimated distortion, given the source rate r_s, the MB error probability P_MB derived as in Section 2, and the frame rate F.

Optimal distortion estimation
Recently, modeling of error propagation effects has been considered in order to optimally select the mode for each MB to trade off compression efficiency and error robustness [11,13,19]. In particular, a recursive optimal per-pixel estimate (ROPE) of decoder distortion was proposed in [13], which can model the error propagation and quantization distortion more accurately than other methods. But the method in [13] is only optimal for IP-based video coding. For the HP case, the computation of the spatial cross-correlation between pixels in the same and different MBs is needed to obtain the first and second moments of the bilinearly interpolated HPs; this process is computationally prohibitive. Most current video codecs use HP-based motion compensation to improve compression performance. We propose a modified recursive estimate of the end-to-end distortion that handles both IP- and HP-based video coding. Denoting by f̂_l^s the encoder reconstruction and by f̃_l^s the decoder reconstruction, the expected end-to-end distortion for the pixel f_l^s at s = (x, y) in frame l is

d_l^s = E{(f_l^s − f̃_l^s)^2} = (f_l^s − f̂_l^s)^2 + 2(f_l^s − f̂_l^s)E{ẽ_l^s} + E{(ẽ_l^s)^2}, (6)

where ẽ_l^s = f̂_l^s − f̃_l^s is the transmission and propagation error at the decoder. Assuming that ẽ_l^s is an uncorrelated random variable with zero mean, which is a reasonable assumption when P_MB is relatively low, as will be shown in the simulations later, we have

d_l^s = (f_l^s − f̂_l^s)^2 + E{(ẽ_l^s)^2}. (7)

We derive a recursive estimate of E{(ẽ_l^s)^2} for intra-MBs and inter-MBs as follows.

Intramode MB
The following three cases are considered.
(1) With probability 1 − P_MB, the intra-MB is received correctly, and then f̃_l^s = f̂_l^s. As a result, ẽ_l^s = 0.
(2) With probability (1 − P_MB)P_MB, the intra-MB is lost but the MB above is received correctly. Denoting by v_c = (x_c, y_c) the MV of the MB above, two cases of error concealment are considered, depending on whether v_c is at an HP location or not. (i) If v_c is at an IP location, the lost pixel is concealed by the motion-compensated pixel of the previous frame, f̃_l^s = f̃_{l−1}^{s+v_c}, so that ẽ_l^s = f̂_l^s − f̃_{l−1}^{s+v_c}; the clipping effect is ignored in the computation. (ii) If v_c is at an HP location, without loss of generality assume that v_c = (x_c, y_c) is at an HP location in both dimensions, so that the concealment pixel is bilinearly interpolated from its four neighbouring IP locations.
(3) With probability P_MB^2, both the current MB and the MB above are lost. The MB at the same location in the previous video frame is repeated, that is, v_c = 0 = (0, 0), so that ẽ_l^s = f̂_l^s − f̃_{l−1}^s.
Combining all of the cases, we have the following results.
(1) If v_c is at an IP location, then, writing ẽ_{l−1} = f̂_{l−1} − f̃_{l−1} and using the zero-mean, uncorrelated-error assumption,

E{(ẽ_l^s)^2} = (1 − P_MB)P_MB [(f̂_l^s − f̂_{l−1}^{s+v_c})^2 + E{(ẽ_{l−1}^{s+v_c})^2}] + P_MB^2 [(f̂_l^s − f̂_{l−1}^s)^2 + E{(ẽ_{l−1}^s)^2}].

(2) If v_c is at an HP location in both x and y dimensions, the term E{(ẽ_{l−1}^{s+v_c})^2} above is replaced by the HP error energy interpolated from the four neighbouring IP values. The cases when only x_c or y_c is at an HP location can be obtained similarly.
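Under the zero-mean, uncorrelated-error assumption of (7), the three intra-MB cases combine into a simple per-pixel recursion. The sketch below covers the IP-located v_c case; the argument names are ours, and each scalar argument is the corresponding pixel value or stored error energy at the positions named in the cases above.

```python
def intra_pixel_e2(p_mb, fhat_cur, fhat_prev_conc, e2_prev_conc,
                   fhat_prev_same, e2_prev_same):
    """Recursive estimate of E{(e~_l^s)^2} for a pixel of an intra-MB.
    p_mb: MB error probability P_MB.
    fhat_cur: encoder reconstruction f^_l^s of the current pixel.
    fhat_prev_conc, e2_prev_conc: f^_{l-1}^{s+v_c} and stored error
        energy at the concealment position (MV of the MB above).
    fhat_prev_same, e2_prev_same: same quantities at the co-located
        pixel, used when both the MB and the MB above are lost."""
    # Case 1 (prob 1 - p_mb): MB received correctly, error is 0.
    # Case 2 (prob (1 - p_mb) * p_mb): conceal with the MV of the MB above.
    conceal = (fhat_cur - fhat_prev_conc) ** 2 + e2_prev_conc
    # Case 3 (prob p_mb^2): copy the co-located pixel of the previous frame.
    copy = (fhat_cur - fhat_prev_same) ** 2 + e2_prev_same
    return (1 - p_mb) * p_mb * conceal + p_mb ** 2 * copy
```

When v_c falls on an HP position, e2_prev_conc would be read from the interpolated half-pel error-energy image instead of the integer-pel one.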

Intermode MB
For an inter-MB that is received correctly with MV v = (x, y), the prediction error propagates from the previous frame. (1) If v is at an IP location, then ẽ_l^s = ẽ_{l−1}^{s+v}. (2) If v is at an HP location in both x and y dimensions, the prediction is interpolated from four IP pixels, and the propagated error is the corresponding interpolated error value. The results for the other MB loss cases are the same as those of the intra-MB. We have the following two results.
(1) If v and v_c are at IP locations, then

E{(ẽ_l^s)^2} = (1 − P_MB) E{(ẽ_{l−1}^{s+v})^2} + (1 − P_MB)P_MB [(f̂_l^s − f̂_{l−1}^{s+v_c})^2 + E{(ẽ_{l−1}^{s+v_c})^2}] + P_MB^2 [(f̂_l^s − f̂_{l−1}^s)^2 + E{(ẽ_{l−1}^s)^2}].

(2) If v and v_c are at HP locations in both x and y dimensions, the propagated term E{(ẽ_{l−1}^{s+v})^2} and the concealment term E{(ẽ_{l−1}^{s+v_c})^2} are both taken from the interpolated HP error-energy values. The encoder can use the above procedures to recursively estimate the expected distortion d_l^s in (7), based on the accumulated coding and error propagation effects from the previous video frames and on the current MB coding modes and quantizers.
To implement the HP-based estimation, the encoder needs to store an image for E{(ẽ_l^s)^2}; for the locations in which either x or y is at HP precision, the value is obtained by scaling the sum of the two neighboring values by 1/4, and for locations in which both x and y are at HP precision, it is obtained by scaling the sum of the four neighboring values by 1/16. It should be noted that the scaling by 1/4 or 1/16 can be done by a simple bit shift. Both the IP- and HP-based estimations need the same memory size to store either the two IP images E{f̃_l} and E{f̃_l^2} or the one HP image E{(ẽ_l^s)^2}, but E{(ẽ_l^s)^2} requires a smaller bit width per pel since it is an error signal instead of a pixel value. The HP-based computational complexity is less than that of the IP-based method, since it only needs to compute E{(ẽ_l^s)^2} instead of both E{f̃_l} and E{f̃_l^2} in the IP-based estimate.
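The half-pel expansion of the stored error-energy image can be sketched as below. The scaling follows directly from the zero-mean, uncorrelated-error assumption: the energy of a bilinear average of k values is the sum of the individual energies divided by k², i.e., divided by 4 for two neighbors and by 16 for four (bit shifts >> 2 and >> 4). The function name and the nested-list image layout are ours.

```python
def hp_error_energy(e2):
    """Expand an integer-pel error-energy image E{(e~)^2} (list of
    lists of non-negative ints, H x W) to the half-pel grid
    ((2H - 1) x (2W - 1)).  Integer-pel samples are copied; half-pel
    samples use the bit-shift scalings described in the text."""
    H, W = len(e2), len(e2[0])
    out = [[0] * (2 * W - 1) for _ in range(2 * H - 1)]
    for y in range(H):                 # copy integer-pel positions
        for x in range(W):
            out[2 * y][2 * x] = e2[y][x]
    for y in range(H):                 # horizontal half-pels: (a + b) >> 2
        for x in range(W - 1):
            out[2 * y][2 * x + 1] = (e2[y][x] + e2[y][x + 1]) >> 2
    for y in range(H - 1):             # vertical half-pels: (a + b) >> 2
        for x in range(W):
            out[2 * y + 1][2 * x] = (e2[y][x] + e2[y + 1][x]) >> 2
    for y in range(H - 1):             # diagonal half-pels: sum of 4 >> 4
        for x in range(W - 1):
            out[2 * y + 1][2 * x + 1] = (e2[y][x] + e2[y][x + 1]
                                         + e2[y + 1][x] + e2[y + 1][x + 1]) >> 4
    return out
```

The inter- and intra-MB recursions then read their propagated and concealment error energies directly from this image, for both IP- and HP-valued motion vectors.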
We now compare the accuracy of the proposed HP-based estimation to the original IP-based method (ROPE) in [13]. In the simulation, each GOB is carried by one packet, so the packet loss rate is equivalent to the MB error probability P_MB. A memoryless packet loss generator is used to drop packets at a specified loss probability. The QCIF sequences Foreman and Salesman are encoded by the Telenor H.263 encoder with the intra-MB refresh interval set to 4; that is, each MB is forced to be intra-coded if it has not been intra-coded for four consecutive frames. The HP- and IP-based estimates are compared to the actual decoder distortion averaged over 50 different channel realizations.
In Figure 3a, the sequence Foreman of 150 frames is encoded with HP motion compensation at a bit rate of  300 Kbps, frame rate of 30 f/s, and MB loss rate of 10%. In Figure 3b, the sequence Salesman is encoded in the same way.
It can be noted that the HP-based estimation tracks the actual distortion at the decoder more accurately than the IP-based estimation. Figure 4 also shows the average PSNR of the 150 coded frames for MB loss rates from 5% to 20%. When the MB loss rate is as small as 5%, the HP-based estimate is almost the same as the actual distortion, while the IP-based method shows about a 3 dB difference.
This result is as expected, since there is about a 2–4 dB PSNR difference between HP- and IP-based video coding efficiency at the same bit rate. As the MB loss rate increases to 20%, the HP-based estimate is about 1 dB away from the actual distortion, while the IP-based estimate is about 2 dB away, so the HP-based method is still 1 dB more accurate than the IP-based method. The reason is that the error propagation effects play a more significant role as the MB loss rate grows, so the coding gain of the HP-based motion compensation is reduced. Also, the assumption in the HP-based method that the transmission and propagation errors are uncorrelated and zero mean may become less accurate. For practical scenarios, it is demonstrated that the HP-based estimation outperforms the original IP-based method by about 1–3 dB.

Rate-distortion-optimized video coding
The quantizer step size and coding mode for each MB in a frame are optimized within an RD framework. Denote by b_{i,j,l} the MB at position (i, j) of frame l, and by c_{i,j,l} = [q_{i,j,l}, m_{i,j,l}] ∈ C the encoding vector for b_{i,j,l}, where C = Q × M is the set of all admissible encoding vectors (quantizers and coding modes). For each source/channel pair (r_s, r_c), we have the corresponding P_MB(n, k) from (5). The encoder needs to determine the coding mode and quantizer for each MB in all L frames to minimize the end-to-end MSE D_E(r_s, P_MB) of the video sequence, which is defined as

D_E(r_s, P_MB) = min Σ_{l=1}^{L} D_l(R_l, P_MB) subject to R_l ≤ R_l^max, (17)

where R_l is the number of bits used to encode frame l, and its maximal value R_l^max = r_s/F + Δ_l is the maximal number of bits available to encode frame l, provided by a frame-level rate control algorithm with average r_s/F and a buffer-related variable Δ_l. Moreover, D_l(R_l, P_MB) is the estimated end-to-end MSE of frame l, l = 1, 2, . . . , L, which can be obtained as

D_l(R_l, P_MB) = Σ_{i,j} D(c_{i,j,l}, P_MB), (18)

where D(c_{i,j,l}, P_MB) is the end-to-end MSE of MB b_{i,j,l} using the encoding vector c_{i,j,l}, and it can be computed from d_l^s as

D(c_{i,j,l}, P_MB) = Σ_{s ∈ b_{i,j,l}} d_l^s. (19)

Since there are dependencies between neighboring interframes because of motion compensation, the optimal solution of (17) has to be searched over C^{H×V×L}, which is computationally prohibitive. We use a greedy optimization algorithm, which is also implicitly used in most JSCC video coding methods such as [10,11,13]: find the coding modes and quantizers for the MBs in frame l that minimize D_l(R_l, P_MB), then find the coding modes and quantizers for the MBs in frame l + 1 that minimize D_{l+1}(R_{l+1}, P_MB) based on the previously optimized frame l, and so on. The optimal pair (r_s*, r_c*) and the corresponding optimal video coding scheme can then be found such that

(r_s*, r_c*) = arg min_{(r_s, r_c)} D_E*(r_s, r_c).

The goal now is to optimally select the quantizers and encoding modes at the MB level, for a specific MB error rate P_MB and frame budget R_l^max, to trade off the source coding efficiency and robustness to errors. The notation P_MB and (r_s, r_c) is dropped from now on unless needed.
The optimal coding problem for frame l can be stated as

min_{c_{i,j,l} ∈ C} D_l(R_l, P_MB) (21)

subject to

R_l ≤ R_l^max. (22)

Such RD-optimized video coding schemes have been studied for noiseless and noisy channels recently [19,20,21,22,23,24]. Using a Lagrangian multiplier, we can solve the problem by minimizing

J_l = D_l(R_l, P_MB) + λ R_l, (23)

where λ ≥ 0. For video coding over error-prone channels, the GOB coding structure is used for H.263 video coding over noisy channels, with each GOB encoded independently. Therefore, if transmission errors occur in one GOB, the errors will not propagate into other GOBs in the same video frame.
For video coding over noiseless channels, the independent GOB structure means that the optimization of (23) can be performed for each GOB separately. However, when considering RD-optimized video coding for noisy channels, the MB distortion D_{i,j}(c_{i,j}, P_MB) depends not only on the mode and quantizer of the current MB but also on the mode of the MB above, in order to take the error concealment distortion into account. Therefore, there is a dependency between neighboring GOBs in this optimization problem. We again use a greedy optimization algorithm to find the solution, searching for the optimal modes and quantizers from the first GOB to the last GOB in each frame.
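The greedy per-GOB Lagrangian selection of (23) can be sketched as follows. This is only an illustrative skeleton with a hypothetical data layout of our own: each MB comes with a precomputed list of candidate (distortion, rate, mode) triples, where the distortion already includes the estimated error-propagation and concealment term for the given P_MB. The outer λ search that enforces R_l ≤ R_l^max (e.g., by bisection on λ) is omitted.

```python
def rd_select_frame(gobs, lam):
    """Greedy Lagrangian mode/quantizer selection for one frame.
    gobs[g][i]: list of candidate (distortion, rate, mode) triples for
    MB i of GOB g.  Minimizes D + lam * R independently per MB, scanning
    GOBs first to last so concealment dependencies on the GOB above are
    resolved by the time a GOB is optimized.
    Returns (chosen modes, total distortion, total rate)."""
    choices, d_tot, r_tot = [], 0.0, 0.0
    for gob in gobs:                      # first GOB to last (greedy order)
        row = []
        for cands in gob:                 # each macroblock
            d, r, mode = min(cands, key=lambda c: c[0] + lam * c[1])
            row.append(mode)
            d_tot += d
            r_tot += r
        choices.append(row)
    return choices, d_tot, r_tot
```

Small λ favors low-distortion (often intra, error-resilient) choices; large λ favors cheap inter coding, which is exactly the efficiency-versus-robustness trade-off the text describes.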

SIMULATION RESULTS
We first use a simple two-state Markov chain model, for which the given channel stochastic knowledge is accurate, to show the performance of the integrated source and channel rate allocation and robust video coding scheme. Then simulations over a Rayleigh fading channel are performed to verify the effectiveness of the proposed scheme for practical wireless channels.

Two-state Markov chain channel
Simulations have been performed using baseline H.263 to verify the accuracy of the proposed integrated scheme. In the simulations, the total channel signaling rate r equals 144 kbps, a typical rate provided in 3G wireless systems. The video frame rate is F = 10 f/s. The video sequence used for simulation is Foreman in QCIF format. An RS code over GF(2^8) is used for FEC. The channel coding rates used are {0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8}. The source and channel coding rates r_s, r_c and the corresponding RS codes (n, k) are listed in Table 1. A two-state Markov channel model [16] is used, where state transitions occur at the RS symbol level. The two states of the model are denoted by G (good) and B (bad). In state G, the symbols are received correctly (e_g = 0), whereas in state B, the symbols are erroneous (e_b = 1). The model is fully described by the transition probabilities p from state G to state B and q from state B to state G. We use the probability of state B,

P_B = p / (p + q),

and the average burst length,

L_B = 1 / q,

which is the average number of consecutive symbol errors, to parameterize the two-state Markov model [11,16]. The simulations are performed through the following steps.
(i) For each channel coding rate r_c (or RS(n, k)) in each column of Table 1, the RS decoding failure rate p_w(n, k) is computed using (3) for the given two-state Markov channel model. The results for different r_c and channel models are shown in Table 2. The average estimated PSNR, PSNR_E, of the video signals is used to measure the performance:

PSNR_E(r_s, r_c) = (1/L) Σ_{l=1}^{L} PSNR_E^l(r_s, r_c),

where PSNR_E^l(r_s, r_c) = 10 log10(255^2 / D_E^{*,l}(r_s, r_c)) is the estimated average PSNR between the original frame l and the corresponding reconstruction at the decoder using the pair (r_s, r_c), and D_E*(r_s, r_c) is the minimal estimated end-to-end MSE from (17). The average simulated PSNR, PSNR_S, is defined analogously, where PSNR_S^{(n,l)}(r_s, r_c) is the PSNR between the original frame l and the corresponding reconstruction at the decoder in the nth simulation run using the source/channel rate pair (r_s, r_c). Figure 5a shows the average estimated PSNR_E of the optimal rate allocation and robust video coding for different channel code rates when the symbol error rate is P_B = 0.01 and the burst length is L_B = 16 symbols, together with the corresponding average simulated PSNR_S over 50 video transmissions. Figure 5b shows the same comparison when the symbol error rate is P_B = 0.05. It can be noted that the estimated PSNR_E, which is obtained at the encoder during RD-optimized video encoding, matches the simulated PSNR_S very well. The optimal source and channel rate pair can also be found from Figures 5a and 5b for different channel characteristics. The corresponding channel decoding failure rates of the optimal channel coding rates in Figures 5a and 5b are 0.018 and 0.034, respectively.
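The PSNR values above follow directly from the estimated (or simulated) MSE of 8-bit video; the conversion can be written as:

```python
import math

def psnr_from_mse(mse):
    """PSNR (dB) of a frame from its end-to-end MSE, for 8-bit video
    with peak value 255 (PSNR = 10 * log10(255^2 / MSE))."""
    return 10.0 * math.log10(255.0 ** 2 / mse)
```

For example, an end-to-end MSE of about 65 corresponds to roughly 30 dB PSNR, the regime of the curves in Figure 5.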
We also compare the performance when the knowledge of the channel model used at the video encoder does not match the real channel used in the simulations. Figure 6 shows two cases of channel mismatch. In Figure 6a, the video stream, which is encoded based on a two-state Markov channel with P_B = 0.01 and L_B = 16, is simulated using a two-state Markov channel with P_B = 0.01 and L_B = 8. The simulated average PSNR is better than the average PSNR estimated at the encoder during encoding, because the channel model used in the estimation is worse than the model used in the simulation. On the other hand, when the video stream, which is encoded based on a two-state Markov channel with P_B = 0.01 and L_B = 8, is simulated using a two-state Markov channel with P_B = 0.01 and L_B = 16, the simulated average PSNR is much worse than the average PSNR estimated at the encoder, as shown in Figure 6b. Furthermore, the optimal source and channel coder pair obtained at the encoder is no longer optimal when the channel condition used in the simulation is worse than the channel information used at the encoder. This simulation result suggests that, for broadcasting services, the optimal rate allocation and video coding should be designed for the worse channel conditions.
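The two-state model parameters used throughout this section can be recovered from (P_B, L_B), and symbol-error patterns can be drawn from the model, as sketched below; the function names are ours.

```python
import random

def gilbert_params(p_b, l_b):
    """Recover the transition probabilities (p: G->B, q: B->G) from the
    bad-state probability P_B = p / (p + q) and the mean burst length
    L_B = 1 / q of the two-state Markov model."""
    q = 1.0 / l_b
    p = q * p_b / (1.0 - p_b)
    return p, q

def simulate_symbol_errors(p, q, n, seed=0):
    """Generate n symbol-error flags (1 = erroneous RS symbol) by
    running the two-state chain, starting in the good state."""
    rng = random.Random(seed)
    bad, errs = False, []
    for _ in range(n):
        # transition, then emit: bad symbols occur only in state B
        bad = (rng.random() < p) if not bad else (rng.random() >= q)
        errs.append(1 if bad else 0)
    return errs
```

For P_B = 0.01 and L_B = 16 (the Figure 5a case), this gives q = 0.0625 and p ≈ 0.00063, i.e., rare but long error bursts, which is exactly the regime where bursty-channel analysis of P(n, m) matters.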

Rayleigh fading channel
The simulation over the Rayleigh fading channel is also performed to verify the effectiveness of the proposed scheme over realistic wireless channels. In the simulation, QPSK with coherent demodulation is used for simplicity. The channel is a frequency-nonselective Rayleigh fading channel. An FSMC with K = 6 states is used to model the Rayleigh fading channel. The SNR thresholds for the K states are selected in such a way that the probability that the channel gain is in state S_k, k = 0, 1, . . . , K − 1, is the same for each state. The FSMC state transition is described at the RS codeword symbol level (8-bit RS symbols), with the assumption that the four QPSK modulation symbols within an RS codeword symbol stay in the same FSMC state. Given the average SNR ρ and the Doppler frequency f_d, we can obtain parameters such as the steady-state probability p_k, the RS symbol error probability e_k, and the state transition rates [17]. Then, following the procedure described in Section 2.1, we are able to analyze the RS code performance over Rayleigh fading channels. Table 3 shows the RS decoding failure probability estimated using the FSMC model together with the simulated values when the SNR is 18 dB and the Doppler frequency is 10 Hz and 100 Hz, respectively. The RS codeword error rate obtained by the FSMC matches the simulation results very well when f_d is 10 Hz. When f_d is 100 Hz, the FSMC-based estimate is not as accurate as when f_d is 10 Hz, but it is still within an acceptable range of the simulated values. Figure 7a shows the average estimated PSNR_E and simulated PSNR_S of the video coding after optimal rate allocation and robust video coding for different channel code rates when the SNR is 18 dB and f_d is 10 Hz. Figure 7b shows the comparison when f_d is 100 Hz. Although there is about a 1 dB difference between the estimated PSNR_E and the simulated PSNR_S, the near-optimal source and channel rate allocation (or the channel code rate r_c) obtained from the estimation (0.8 and 0.5, as shown in Figure 7) still achieves the maximal simulated end-to-end PSNR over Rayleigh fading channels. The simulation results verify the effectiveness of the proposed scheme in obtaining the optimal source and channel coding pair for a fixed total bit rate over wireless fading channels.

Figure 6: Average PSNR obtained in channel mismatch cases. (a) Error burst is shorter than that used in estimation. (b) Error burst is longer than that used in estimation.
Experiments are also performed when the knowledge of the channel Doppler frequency used at the video encoder does not match the actual Doppler frequency used in the simulations. Figure 8 shows two cases of channel mismatch. In Figure 8a, the video bitstream, which is encoded based on f_d = 10 Hz, is simulated over fading channels with Doppler frequencies of 10 Hz and 100 Hz, separately. In Figure 8b, the video bitstream, which is encoded based on f_d = 100 Hz, is simulated over fading channels with Doppler frequencies of 100 Hz and 10 Hz, separately. In both scenarios, the video quality is better when the actual condition, in terms of MB loss rate, is milder than the knowledge used at the encoder, and worse otherwise. Furthermore, the optimal source and channel coder pair obtained at the encoder is no longer optimal when the channel condition used in the simulation is worse than the channel information used at the encoder. This simulation result again suggests that, for broadcasting services, the optimal rate allocation and video coding should be designed for the worse channel conditions.

CONCLUSION
We have proposed an integrated framework to find the near-optimal source and channel rate allocation and the corresponding robust video coding scheme for video coding and transmission over wireless channels when no feedback channel is available. Assuming that the encoder has the stochastic channel information, with the wireless fading channel modeled as an FSMC, the proposed scheme takes into account robust video coding, packetization, channel coding, error concealment, and error propagation effects altogether. The scheme can select the best source and channel coding pair to encode and transmit the video signals. Simulation results demonstrated the near-optimality of the rate allocation scheme and the accuracy of the end-to-end MSE estimation obtained at the encoder during the process of robust video encoding.