Content-Aware Scalability-Type Selection for Rate Adaptation of Scalable Video
EURASIP Journal on Advances in Signal Processing volume 2007, Article number: 010236 (2007)
Scalable video coders provide different scaling options, such as temporal, spatial, and SNR scalability. Reducing the rate by discarding enhancement layers of different scalability types results in different kinds and/or levels of visual distortion, depending on the content and bitrate. This dependency between scalability type, video content, and bitrate has not been well investigated in the literature. To this effect, we first propose an objective function that quantifies the flatness, blockiness, blurriness, and temporal jerkiness artifacts caused by rate reduction through spatial size, frame rate, and quantization parameter scaling. Next, the weights of this objective function are determined for different content (shot) types and different bitrates using a training procedure with subjective evaluation. Finally, given the content type of each temporal segment, a method is proposed for choosing the scaling type that results in minimum visual distortion for that segment according to this objective function. Two subjective tests have been performed on soccer videos to validate the proposed procedure for content-aware selection of the best scalability type. Soccer videos scaled from 600 kbps to 100 kbps by the proposed content-aware selection of scalability type have been found visually superior to those scaled using a single scalability option over the whole sequence.
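The selection step described above can be illustrated with a minimal sketch. It assumes the objective function is a weighted linear sum of the four artifact measures, which the abstract does not state explicitly. The function names, the normalized measure values, and the weights are all hypothetical and are not taken from the paper.

```python
from typing import Mapping

# The four artifact types named in the abstract.
ARTIFACTS = ("flatness", "blockiness", "blurriness", "jerkiness")


def objective(measures: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum of artifact measures (assumed linear form; lower is better)."""
    return sum(weights[a] * measures[a] for a in ARTIFACTS)


def select_scaling(candidates: Mapping[str, Mapping[str, float]],
                   weights: Mapping[str, float]) -> str:
    """Return the scaling option with the smallest objective value.

    `candidates` maps a scaling option name (e.g. "snr", "spatial",
    "temporal") to the artifact measures obtained for one temporal
    segment when that option is used to reach the target bitrate.
    `weights` are the trained weights for the segment's content type
    and the target bitrate.
    """
    return min(candidates, key=lambda name: objective(candidates[name], weights))


# Hypothetical measures for one temporal segment, one entry per scaling option.
segment_candidates = {
    "snr":      {"flatness": 0.2, "blockiness": 0.8, "blurriness": 0.1, "jerkiness": 0.0},
    "spatial":  {"flatness": 0.3, "blockiness": 0.2, "blurriness": 0.7, "jerkiness": 0.0},
    "temporal": {"flatness": 0.0, "blockiness": 0.1, "blurriness": 0.1, "jerkiness": 0.9},
}

# Hypothetical weights for one (content type, bitrate) pair.
example_weights = {"flatness": 0.1, "blockiness": 0.4, "blurriness": 0.3, "jerkiness": 0.2}

best = select_scaling(segment_candidates, example_weights)
```

In this sketch the choice is made independently for each temporal segment, so a sequence would be scaled by calling `select_scaling` once per segment with the weights that match that segment's shot type.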