- Open Access
Bitrate control using a heuristic spatial resolution adjustment for a real-time H.264/AVC encoder
© Rhee et al; licensee Springer. 2012
- Received: 19 July 2011
- Accepted: 23 April 2012
- Published: 23 April 2012
Conventional bitrate control algorithms that change only the quantization parameter (QP) often suffer from quality degradation when the target bitrate is very low. Therefore, rate control algorithms that adjust spatial resolution in addition to QP control have recently been proposed, but their computations are too complex to be processed in real time. This research proposes a very simple, but effective, rate control algorithm that employs spatial resolution control as well as the existing QP-based bitrate control. The spatial resolution ratio for the best peak signal-to-noise ratio (PSNR) is calculated using a simple estimation model which defines the relationship between the PSNR and the spatial resolution at very low bitrate compression. In the proposed bitrate control algorithm, two scalability tools for adjusting the QP and the spatial resolution ratio are used sequentially to reach the target PSNR and the control decision is made for a group of pictures. Experimental results show that the proposed bitrate control algorithm approximates an optimal solution and yields a better subjective quality as well as objective quality at various bitrates compared to the conventional QP-based bitrate control algorithm. The decision of the control parameters requires very small computational complexity and is made in a completely automatic manner so that the proposed algorithm is well suited for real-time applications.
- bitrate control
- spatial resolution control
The H.264/AVC standard is widely used in video streaming, video communication, and various mobile video applications because its many advanced coding tools give it high compression efficiency. In recent video applications, video resolution tends to be larger and, thus, the bandwidth required for video transmission also increases. As a result, bitrate control is growing in importance, since the bitrate of a video stream must be regulated to meet the target bitrate. In bitrate control algorithms, the target bits are allocated at the frame level or the macroblock level by considering the fullness of the output buffer and the encoding complexity. While meeting the allocated bit budget, rate control tries to make quality compromises that minimize the degradation of perceived quality by controlling the quantization parameter (QP) values. When the QP is small, much of the information is preserved; when the QP is large, much of the information is discarded to reduce the bitrate at the cost of increased distortion.
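To make the effect of the QP concrete, the following Python sketch maps a QP value to the corresponding quantizer step size. In H.264/AVC the step size doubles for every increase of 6 in QP. The function and constant names are illustrative and are not taken from the reference software.

```python
# Quantizer step sizes for QP 0..5 in H.264/AVC; larger QPs reuse these
# six values, doubled once for every additional 6 QP units.
BASE_QSTEP = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def qstep(qp: int) -> float:
    """Return the quantizer step size for an 8-bit H.264/AVC QP (0..51)."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return BASE_QSTEP[qp % 6] * (2 ** (qp // 6))


# A larger step size discards more of the residual signal.
small_qp_step = qstep(4)
large_qp_step = qstep(40)
```

Because the mapping is exponential, a change of a few QP units near the top of the range removes far more information than the same change near the bottom.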
To achieve a target bitrate under practical constraints, such as a specific channel bandwidth or defined encoder and decoder buffer sizes, QP values must vary dynamically according to the complexity of the input video and the current bitrate. QP values are determined using a rate-quantization (R-Q) model in which the generated bitrate is modeled as a function of the QP and the complexity of the residual signal, such as the mean absolute difference (MAD). In H.264/AVC, QPs are used not only in rate control but also in rate-distortion optimization (RDO). However, the well-known "chicken-and-egg" dilemma complicates the selection of the QP value: the RDO cost function needs a pre-determined QP for its lambda factor, whereas the QP is determined from the MAD, which is only available after the RDO has been performed. To resolve that dilemma, the MAD is predicted from the previous frame using a linear function. The MAD estimated from the previous frame often differs from the actual MAD of the current frame, so an inadequate QP may be selected. For improved coding efficiency, enhanced distortion models have been developed [2–6], and optimized rate-distortion (R-D) models and content-aware bit allocation have been proposed [5–10]. A number of recent approaches incorporate characteristics of the human visual system (HVS) into bitrate control. Moreover, there are several reports on region-of-interest (ROI)-based bit allocation [11–13]; such approaches can potentially improve the perceived visual quality of images. In addition to ROI-based methods, new R-D models, along with frame-skipping and bit-allocation schemes, that use various perceptual metrics based on the characteristics of the HVS have been proposed [14–19]. However, conventional bitrate control algorithms based on QP control often suffer from perceptible image quality degradations, such as blocking, ringing, or texture-deviation artifacts, when the target bitrate is very low. 
Furthermore, when the target bitrate is not satisfied, even with the maximum QP value, the conventional rate control cannot avoid a sudden frame drop which results in video quality degradation.
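The QP selection described above can be sketched in Python as follows. The text only states that the MAD is predicted linearly from the previous frame and that the bitrate is modeled as a function of the QP and the MAD. The quadratic form R = c1*MAD/Qstep + c2*MAD/Qstep^2 used here is a commonly used choice and is an assumption of this sketch, as are all function and parameter names.

```python
import math


def predict_mad(prev_mad: float, a1: float = 1.0, a2: float = 0.0) -> float:
    """Linear prediction of the current MAD from the previous frame's MAD."""
    return a1 * prev_mad + a2


def qstep_for_target(target_bits: float, mad: float, c1: float, c2: float) -> float:
    """Solve target_bits = c1*mad/Q + c2*mad/Q**2 for the step size Q > 0."""
    if target_bits <= 0 or mad <= 0:
        raise ValueError("target_bits and mad must be positive")
    b = c1 * mad
    disc = b * b + 4.0 * target_bits * c2 * mad
    if c2 == 0.0 or disc < 0.0:
        # Fall back to the first-order model when the quadratic has no real root.
        return b / target_bits
    return (b + math.sqrt(disc)) / (2.0 * target_bits)


# Hypothetical numbers: predict the MAD, then pick the step size for 150 bits.
mad_hat = predict_mad(4.0)
q = qstep_for_target(150.0, mad_hat, c1=100.0, c2=200.0)
```

If the predicted MAD is lower than the actual MAD of the frame, the computed step size is too small and the frame overshoots its budget, which is the mismatch the paragraph above refers to.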
This article proposes a simple but effective bitrate control algorithm that adds spatial resolution control to the conventional QP-based bitrate control. The algorithm introduces a new model that represents the relationship between the spatial resolution and the peak signal-to-noise ratio (PSNR) for low-bitrate coding. Using this model, the spatial resolution that gives the highest PSNR at a given bitrate is estimated. The computational complexity of the model is very low because it requires only a small number of parameters, whose values are obtained heuristically. The two scalability tools, the QP and the spatial resolution, are applied sequentially to achieve the target PSNR. The proposed spatial resolution adjustment, which is applied at the group-of-pictures (GOP) level, serves as a coarse-grain bitrate control. Inside a GOP, the QP is changed by a conventional bitrate control to meet the allocated bits in a fine-grain manner. Thus, the target bitrate is met by a combination of the two control methods. To estimate the perceptual quality of encoded video sequences with reduced spatial resolution, the Video Quality Metric (VQM) software is used to measure the subjective quality, in addition to the PSNR, which measures the objective quality. The VQM computes the perceptual effects of video impairments, including blurring, jerky/unnatural motion, global noise, block distortion, and color distortion, and combines them into a single metric. Experimental results show that the proposed bitrate control scheme outperforms the conventional QP-based bitrate control algorithm at a variety of bitrates. At a low bitrate, the PSNR and VQM values with the proposed spatial resolution control scheme are improved by up to 1.85 and 5.15 dB, respectively, compared with the conventional QP-only control. 
There is only a small difference between the real optimal spatial resolution and the spatial resolution obtained using the proposed scheme.
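Since the PSNR is the objective quality measure used throughout, a minimal Python sketch of its standard definition for 8-bit samples is given here. It assumes that a frame coded at reduced resolution has already been up-sampled back to the original size before the comparison. The function name and the flat-list frame representation are illustrative.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], decoded: Sequence[int], peak: float = 255.0) -> float:
    """PSNR in dB between two equally sized sequences of 8-bit samples."""
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


# Every sample off by one gives MSE = 1, i.e. 10*log10(255^2) dB.
value = psnr([100, 100, 100, 100], [101, 101, 101, 101])
```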
This article is organized as follows. In Section 2, background is presented and Section 3 explains the proposed bitrate control algorithm. Experimental results are presented in Section 4 and conclusions are given in Section 5.
2.1. Previous study on spatial and temporal resolution controls
To improve flexibility in bit allocation, some rate control algorithms adjust the frame rate or the spatial resolution. When the frame rate decreases, additional bits can be allocated to each frame and the quality of each frame image can be improved. However, frame skipping should be done very carefully because motion artifacts such as flickering or motion jerkiness may degrade subjective video quality. In [21, 22], the frame-skip decision is based on buffer fullness and the spatial and temporal quality of the video. In another approach, the similarity between successive frames, measured by the PSNR, is used to skip frames adaptively. To optimize coding performance, the frames to be skipped are determined based on an R-D model [24, 25], but these methods cannot be used in real-time applications. In [26–28], motion artifacts are reduced by adjusting the frame rate gradually according to the motion activity of the previous sub-GOP, which is expressed by the histogram of the difference image, thereby preserving motion smoothness. Even though a number of previous works have contributed to frame rate control, the effect of bitrate control through frame rate adjustment is somewhat limited because, to avoid motion artifacts, the frame rate cannot be reduced below a certain value. In addition, temporal scalability may not be very effective at increasing subjective quality by trading temporal resolution for spatial quality. This is because the quality degradation caused by dropping a frame is easily perceived, especially in low-frame-rate communication such as two-way multimedia communication on mobile devices.
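As a rough illustration of why lowering the frame rate frees bits for each frame, the following Python sketch computes the average bit budget per frame at a fixed target bitrate. The function name and the example numbers are illustrative assumptions and are not taken from the experiments.

```python
def bits_per_frame(bitrate_kbps: float, frame_rate: float) -> float:
    """Average number of bits available per frame at a constant bitrate."""
    if bitrate_kbps <= 0 or frame_rate <= 0:
        raise ValueError("bitrate and frame rate must be positive")
    return bitrate_kbps * 1000.0 / frame_rate


# Halving the frame rate doubles the average budget of each coded frame.
full_rate_budget = bits_per_frame(300, 30)
half_rate_budget = bits_per_frame(300, 15)
```

The gain is bounded in practice, because the frame rate cannot be lowered indefinitely without visible motion artifacts.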
Spatial resolution control is another approach used for bit allocation when the target bitrate is very low. The dynamic resolution conversion (DRC) mode, which is supported in the advanced simple profile of MPEG-4, enables the video object plane to be encoded with reduced spatial resolution. Similarly, a reduced resolution update (RRU) coding tool is adopted in Annex Q of the H.263 standard. The RRU reduces the bitrate by coding the prediction error residuals at a reduced spatial resolution. However, the DRC and RRU techniques are not included in the H.264/AVC standard. Meanwhile, a number of previous studies have reported a relationship between down-sampling and video quality at a low bitrate. In one study, the optimum down-sampling ratio is determined according to the bitrate. Another study reports that a video that is down-sampled prior to compression and later up-sampled visually outperforms the same video compressed directly at high resolution with the same number of bits when the target bitrate is very low. Based on this observation, a method to find the optimal down-sampling rate is suggested. These schemes have been exploited for the JPEG image compression standard, not for video compression standards. In [34, 35], discrete cosine transform (DCT) coefficients are decimated prior to quantization to reduce spatial resolution, but modifying the coding loop breaks syntax compatibility with the video coding standard. In video transcoders, spatial resolution control has been an important factor in meeting a different target bitrate. However, many previous works have focused on simplifying the computations in transcoding. In another work, a linear R-Q model is proposed to select the proper frame size, but it is not applicable to practical applications. More recently, the overall distortion has been analyzed and the optimal spatial resolution derived for a given bitrate. 
In a further study, the spatial resolution ratio is selected according to picture quality, bitrate, and power consumption. As the above-mentioned works show, selecting the optimal spatial resolution is essential when bitrate control is accomplished through spatial resolution control. When the selected spatial resolution is greater than the optimal one, the image quality is degraded by an excessively high QP value; when it is less than the optimal one, the quality is degraded by aliasing artifacts. Nonetheless, it is difficult to find the optimal resolution because those previous works rely on complex estimation models that define the relationships among picture quality, bitrate, spatial resolution, and power consumption. Moreover, the parameters used in the previous methods depend on the characteristics of the video content and on the specific coding methods; thus, they cannot be calculated in real time. Therefore, the previous works cannot be applied to on-the-fly, real-time rate control.
2.2. Comparison between spatial and temporal resolution controls
[Table: Comparison of VQM values between spatial and temporal drops at low target bitrates. Columns: target bitrate (kbps); t_drop - s_drop.]
3.1. Architecture of the target system
3.2. Spatial resolution control
where q1, q2, and q3 are constants which depend on the video content and R is the bitrate of the encoded stream. The term sa, referred to here as the spatial resolution ratio, represents the ratio of the down-sampled frame area to the original frame area. When sa is smaller than 1, the frame is down-sampled. Equation (1) describes the relationship between sa and the PSNR and thus is used for calculating the optimal spatial resolution. However, parameters such as q1, q2, and q3 in Equation (1) depend on the video content and cannot be known prior to encoding. Thus, this optimal solution cannot be applied to real-time systems on the fly.
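Because sa is a ratio of frame areas, each frame dimension scales by the square root of sa when the aspect ratio is kept. The Python sketch below converts sa into down-sampled frame dimensions. Rounding each dimension down to a multiple of 16, the macroblock size, is an assumption of this sketch and is not stated in the text; the function name is likewise illustrative.

```python
import math
from typing import Tuple


def scaled_size(width: int, height: int, sa: float, align: int = 16) -> Tuple[int, int]:
    """Frame size for area ratio sa, keeping the aspect ratio (approximately)."""
    if not 0.0 < sa <= 1.0:
        raise ValueError("sa must be in the range (0, 1]")
    scale = math.sqrt(sa)  # the same linear factor is applied to both axes

    def snap(length: int) -> int:
        # Round down to a multiple of `align`, but never below one block.
        return max(align, int(length * scale) // align * align)

    return snap(width), snap(height)


# A quarter of the area means half the width and half the height.
w, h = scaled_size(1280, 720, 0.25)
achieved_sa = (w * h) / (1280 * 720)
```

Because of the alignment step, the area ratio actually achieved can be slightly smaller than the requested sa.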
where α and PSNRpeak are obtained from (5) and (6), respectively.
3.3. The proposed bitrate control algorithm
The proposed bitrate control scheme is implemented and integrated into the JM 13.2 reference software, which adopts QP-based bitrate control. To resize the spatial resolution, the up/down-sampling algorithms recommended in the SVC are used. For down-sampling, the algorithm based on the Sine-windowed Sinc-function is applied, in which a set of seven filters supports the extended range of the spatial scaling ratio. For up-sampling, the SVC normative up-sampling algorithm is applied, which is based on a set of 6-tap filters derived from the Lanczos-3 filter. In this experiment, five HD video sequences (Pedestrian Area, Tractor, Station2, Sunflower, and Blue Sky), two full HD video sequences (Speed Bag and Life), and two 4CIF video sequences (Harbor and Soccer) are used. The GOP length is 30 frames, and 150 frames are encoded. The GOP structure is IPPP.
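To illustrate how a 6-tap interpolation filter can be derived from the Lanczos-3 kernel, the following Python sketch evaluates the kernel at six input positions around a fractional sample phase and normalizes the weights. It is only a floating-point illustration under that assumption. It does not reproduce the fixed coefficient tables of the SVC normative up-sampler, and the function names are hypothetical.

```python
import math
from typing import List


def lanczos3(x: float) -> float:
    """Lanczos kernel with a = 3: sinc(x) * sinc(x / 3) for |x| < 3, else 0."""
    if x == 0.0:
        return 1.0
    if abs(x) >= 3.0:
        return 0.0
    px = math.pi * x
    return 3.0 * math.sin(px) * math.sin(px / 3.0) / (px * px)


def six_tap_weights(phase: float) -> List[float]:
    """Normalized weights for input samples at offsets -2..3 from the base sample.

    `phase` in [0, 1) is the position of the output sample between the
    base sample (offset 0) and its right neighbor (offset 1).
    """
    if not 0.0 <= phase < 1.0:
        raise ValueError("phase must be in the range [0, 1)")
    raw = [lanczos3(phase - k) for k in range(-2, 4)]
    total = sum(raw)
    return [w / total for w in raw]


# Half-sample position: the six weights are symmetric around the center.
half_pel = six_tap_weights(0.5)
```

At phase 0 the weights reduce to the base sample itself (up to floating-point rounding), and the outer taps at the half-sample position are small, which is why six taps are enough to capture most of the kernel.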
[Table: Comparison of the PSNR and VQM among the conventional control, the proposed spatial resolution, and the optimal spatial resolution at various target bitrates. Rows: rate control methods. Column groups: resolutions 1280 × 720, 1920 × 1072, and 704 × 576, each by target bitrate (kbps).]
[Table: Comparison of the SA between the proposed spatial resolution and the optimal spatial resolution at various target bitrates. Rows: rate control methods and the difference (optimal, proposed). Column groups: resolutions 1280 × 720, 1920 × 1072, and 704 × 576, each by target bitrate (kbps).]
The main contribution of this article is a real-time bitrate control algorithm that uses spatial down-sampling for low-bitrate encoding. Previous resolution control schemes are too complex to be processed at run time. In this article, a simple estimation model that defines the relationship between the PSNR and the spatial resolution ratio is presented for low-bitrate compression. This estimation model is used to find, on the fly, a resolution ratio that gives acceptable quality in real-time systems. The two scalability tools, the QP and the spatial resolution ratio, are determined sequentially to reach the target PSNR. Experimental results show that the proposed bitrate control algorithm is close to the optimal solution and yields better PSNR and VQM quality at various bitrates than the conventional QP-based bitrate control algorithm.
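The sequential, GOP-level decision summarized above can be pictured with the following Python sketch. It is a hypothetical illustration only: the decision rule, the candidate ratios, and the `estimate_psnr` callback, which stands in for the PSNR estimation model, are assumptions of this sketch and not the algorithm as specified in the article.

```python
from typing import Callable, Sequence


def choose_spatial_ratio(
    target_psnr: float,
    estimate_psnr: Callable[[float], float],
    candidate_ratios: Sequence[float] = (1.0, 0.75, 0.5, 0.25),
) -> float:
    """Pick the spatial resolution ratio for the next GOP (coarse-grain step)."""
    # Keep full resolution when QP control alone is expected to reach the target.
    if estimate_psnr(1.0) >= target_psnr:
        return 1.0
    # Otherwise take the candidate ratio with the highest estimated PSNR.
    return max(candidate_ratios, key=estimate_psnr)


def toy_model(sa: float) -> float:
    """Toy stand-in for the estimation model, peaking at sa = 0.5."""
    return 30.0 - 20.0 * (sa - 0.5) ** 2


ratio_low_target = choose_spatial_ratio(24.0, toy_model)
ratio_high_target = choose_spatial_ratio(28.0, toy_model)
```

Within the GOP, the frame-level QP would then be adjusted by the conventional rate control, which provides the fine-grain step.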
This study was supported by the Korea Science and Engineering Foundation (KOSEF) grant funded by the Korea government (MEST) (2011-0027502).
- Li ZG, Pan F, Lim KP, Rahardja S: Adaptive rate control for H.264. Proc IEEE Int Conf Image Processing Singapore 2004, 2: 745-748.
- Kamaci N, Altunbasak Y, Mersereau RM: Frame bit allocation for the H.264/AVC video coder via Cauchy-density-based rate and distortion models. IEEE Trans Circ Syst Video Technol 2005, 15(8):994-1006.
- Ma S, Gao W, Lu Y: Rate-distortion analysis for H.264/AVC video coding and its application to rate control. IEEE Trans Circ Syst Video Technol 2005, 15(12):1533-1544.
- He Z, Kim Y-K, Mitra SK: Low-delay rate control for DCT video coding via ρ-domain source modeling. IEEE Trans Circ Syst Video Technol 2001, 11(8):928-940. doi:10.1109/76.937431
- Yuan W, Lin S, Zhang Y, Yuan W, Luo H: Optimum bit allocation and rate control for H.264/AVC. IEEE Trans Circ Syst Video Technol 2006, 16(6):705-715.
- Kwon D-K, Shen M-Y, Jay Kuo C-C: Rate control for H.264 video with enhanced rate and distortion models. IEEE Trans Circ Syst Video Technol 2007, 17(5):517-529.
- An C, Nguyen TQ: Iterative rate-distortion optimization of H.264 with constant bit rate constraint. IEEE Trans Image Process 2008, 17(9):1605-1615.
- Sullivan G, Wiegand T, Lim K-P: Joint model reference encoding methods and decoding concealment methods. Section 2.6 rate control JVT-I049 2003.
- Ma S, Gao W, Wu F, Lu Y: Rate control for JVT video coding scheme with HRD considerations. Proc IEEE ICIP Spain 2003, 3: 793-796.
- Yu HT, Pan F, Lin ZP: A new bit estimation scheme for H.264 rate control. Proc IEEE Int Symp Consumer Electronics, UK 2004, 396-399.
- Yang X, Lin W, Lu Z, Lin X, Rahardja S, Ong E, Yao S: Rate control for videophone using local perceptual cues. IEEE Trans Circ Syst Video Technol 2005, 15(4):496-507.
- Liu Y, Li ZG, Soh YC: Region-of-interest based resource allocation for conversational video communication of H.264/AVC. IEEE Trans Circ Syst Video Technol 2008, 18(1):134-139.
- Li H, Wang Z, Cui H, Tang K: An improved ROI-based rate control algorithm for H.264/AVC. Proc Int Conf Signal Processing China 2006, 2: 16-20.
- Hrarti M, Saadane H, Larabi M, Tamtaoui A, Aboutajdine D: A macroblock-based perceptually adaptive bit allocation for H264 rate control. Proc Int Symposium on I/V Communications and Mobile Network, Morocco 2010, 1-4.
- Huang C-M, Lin C-W: A novel 4-D perceptual quantization modeling for H.264 bit-rate control. IEEE Trans Multimed 2007, 9(6):1113-1124.
- Meng Q, Meng Q: Improved macroblock-level rate control algorithm with visual properties. Proc Int Workshop Intelligent Systems and Applications, China 2010, 1-5.
- Cui Z, Zhu X: SSIM-based content adaptive frame skipping for low bit rate H.264 video coding. Proc Int Conf Communication Technology, China 2010, 484-487.
- Ou T-S, Huang Y-H, Chen HH: SSIM-based perceptual rate control for video coding. IEEE Trans Circ Syst Video Technol 2011, 21(5):682-691.
- Jin R, Chen J: The coding rate control of consistent perceptual video quality in H.264 ROI. Proc Int Symposium Computer Network and Multimedia Technology, China 2009, 1-4.
- Wolf S, Pinson M: VQM software and measurement techniques. National Telecommunications and Information Administration Report 2002.
- Pan F, Lin X, Rahardja S, Lim KP, Li ZG, Wu DJ, Wu S: Proactive frame-skipping decision scheme for variable frame rate video coding. Proc Int Conf Multimedia and Expo Taiwan 2004, 3: 1903-1906.
- Pan F, Lin ZP, Lin X, Rahardja S, Juwono W, Slamet F: Adaptive frame skipping based on spatio-temporal complexity for low bit-rate video coding. J Vis Commun Image R 2006, 17(3):554-563. doi:10.1016/j.jvcir.2005.07.006
- Jun J, Lee S, He Z, Lee M, Jang ES: Adaptive key frame selection for efficient video coding. LNCS 2007, 4872: 853-866.
- Liu S, Kuo CJ: Joint temporal-spatial bit allocation for video coding with dependency. IEEE Trans Circ Syst Video Technol 2005, 15(1):15-26.
- Vetro A, Wang Y, Sun HF: Rate-distortion optimized video coding considering frameskip. Proc Int Conf Image Processing Greece 2001, 3: 534-537.
- Kim J, Kim Y-G, Song H, Kuo T-Y, Chung YJ, Kuo C-CJ: TCP-friendly Internet video streaming employing variable frame-rate encoding and interpolation. IEEE Trans Circ Syst Video Technol 2000, 10(7):1164-1177. doi:10.1109/76.875520
- Song H, Kuo C-CJ: Rate control for low-bit-rate video via variable-encoding frame rates. IEEE Trans Circ Syst Video Technol 2001, 11(4):512-521. doi:10.1109/76.915357
- Thaipanich T, Wu P-H, Kuo C-CJ: Low complexity algorithm for robust video frame rate up-conversion (FRUC) technique. IEEE Trans Consum Electron 2009, 55(1):220-228.
- Jackson AHAM, McEwan R, Mullin J: Impact of video frame rate on communicative behavior in two and four party groups. Proc ACM Conf Comput Supported Cooperative Work, Philadelphia, PA 2000, 11-20.
- Information technology - Coding of Audio-visual Objects - Part 2: Visual. International Organization for Standardization 2000. ISO/IEC 14496-2:1999/Amd.1:2000(E)
- Cote G, Erol B, Gallant M, Kossentini F: H.263+: video coding at low bit rates. IEEE Trans Circ Syst Video Technol 1998, 8(7):849-866. doi:10.1109/76.735381
- Segall CA, Elad M, Milanfar P, Webb R, Fogg C: Improved high-definition video by encoding at an intermediate resolution. Proc Conf Visual Communications and Image Processing USA 2004, 5308: 1007-1018.
- Bruckstein AM, Elad M, Kimmel R: Down-scaling for better transform compression. IEEE Trans Image Process 2003, 12(9):1132-1145. doi:10.1109/TIP.2003.816023
- Ilgin HA, Chaparro LF: Low bit rate video coding using DCT based fast decimation/interpolation and embedded zero tree coding. IEEE Trans Circ Syst Video Technol 2007, 17(7):833-844.
- Nguyen VA, Tan YP, Lin WS: Adaptive downsampling/upsampling for better video compression at low bit rate. Proc of Int Symposium on Circuits and Systems, USA 2008, 1624-1627.
- Shu HY, Chau LP: The realization of arbitrary downsizing video transcoding. IEEE Trans Circ Syst Video Technol 2006, 16(4):540-546.
- Tan Y-P, Liang Y, Sun H: On the methods and performances of rational downsizing video transcoding. Signal Process Image Commun 2004, 19: 47-65. doi:10.1016/j.image.2003.08.017
- Wang R-J, Chien M-C, Chang P-C: Adaptive down-sampling video coding. Proc SPIE Multimedia on Mobile Devices 2010, 7542: 1-8.
- Lee H, Lee Y, Lee J, Lee D, Shin H: Design of a mobile video streaming system using adaptive spatial resolution control. IEEE Trans Consum Electron 2009, 55(3):1682-1689.
- Sirhindi R, Murtaza S, Afzal M: Improved data hiding technique for shares in extended visual secret sharing schemes. LNCS Inf Commun Secur 2008, 5308: 376-386. doi:10.1007/978-3-540-88625-9_25
- Samuel S, Penzhorn WT: Digital watermarking for copyright protection. Proc Conf AFRICON 2004, 2: 953-957.
- Kasmani SA, Naghsh-Nilchi A: A new robust digital image watermarking technique based on joint DWT-DCT transformation. Proc Int Conf on Convergence and Hybrid Information Technology, Korea 2008, 539-544.
- Liu R, Wang G, Wang P, Huang W: An image authentication scheme based on sliding window. Proc Conf Control and Decision, China 2008, 2937-2940.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.