Mode decision acceleration for H.264/AVC to SVC temporal video transcoding

Yeh, Chia-Hung; Tseng, Wen-Yu; Wu, Shih-Tse

doi:10.1186/1687-6180-2012-204

Research
Open access
Published: 24 September 2012

EURASIP Journal on Advances in Signal Processing volume 2012, Article number: 204 (2012)

Abstract

This study presents a fast video transcoding architecture that overcomes the complexity caused by the different coding structures of H.264/AVC and SVC. Because the mode decision process in SVC is computationally heavy, the proposed algorithms simplify it. Two scenarios are considered: transcoding with the same quantization parameter, and transcoding with bitrate reduction. In the first scenario, SVC’s modes are determined by probability models, including conditional probability, Bayesian theorem, and a Markov chain. In the second scenario, macroblock (MB) activity is measured to determine SVC’s modes. Experimental results indicate that our algorithm saves significant coding time with negligible PSNR loss compared with a cascaded pixel-domain transcoder.

Introduction

Multimedia services that offer universal multimedia access on heterogeneous networks include videoconferencing, distance learning, and video on demand [1–4]. Such applications must serve a variety of devices, access links, and resources. In particular, video transcoding adapts a pre-coded video to the constraints of transmission networks or specific applications [5–15], as shown in Figure 1.

The Joint Video Team, which consists of ITU-T VCEG and ISO/IEC MPEG, has standardized scalable video coding (SVC), an extension of H.264/AVC. SVC provides scalable functionality by parsing and extracting a partial bitstream to satisfy various terminal requirements and network conditions. However, most conventional video content is in a non-scalable format such as H.264/AVC. Transcoding from non-scalable H.264/AVC to SVC is therefore needed, and it should reduce computations without sacrificing R-D performance. Because of codec incompatibilities between H.264/AVC and SVC, format conversion must decode the original video into an intermediate format and re-encode it as SVC. While the decoding overhead is negligible, the high complexity of the encoding process still slows down transcoding, even on a modern multicore processor. This delay limits its applications.

The cascaded pixel-domain transcoder (CPDT) is a straightforward method for transcoding from one format to another [5]. The visual quality of CPDT is optimal because it fully decodes the incoming bitstream and re-encodes it as a new one, but this results in high computational complexity. Reusing as much information from the incoming bitstream as possible can significantly reduce the computations of transcoding. However, H.264/AVC and SVC differ in coding structures, as illustrated in Figure 2, which makes it infeasible to reuse the H.264/AVC modes directly as SVC’s modes.

This study presents a fast algorithm for transcoding from H.264/AVC to SVC. First, H.264/AVC to SVC transcoding with the same quantization parameter (QP) is addressed. The proposed algorithm develops a mode probability model based on conditional probability, Bayesian theorem, and a Markov chain. Experimental results show that it saves an average of 76.65% of coding time with 0.1 dB PSNR loss compared with a CPDT. Second, we discuss video transcoding from H.264/AVC to SVC with bitrate reduction. The residual DCT-domain MB energy obtained from the H.264/AVC decoding process is used to measure MB activity for the mode decision in the SVC encoder [16]. In this scenario, the proposed algorithm saves an average of 59.4% of coding time with a 6.24% bitrate increase compared with a CPDT.

The rest of this article is organized as follows. The “Related study” section reviews previous work on video transcoding. The “Proposed video transcoding from H.264/AVC to SVC” section then introduces the proposed video transcoding algorithm. Next, the “Experimental results” section evaluates the performance of the proposed method. Conclusions are finally drawn in the “Conclusions” section.

Related study

Visual quality and coding time are the primary concerns in the design of a video transcoder. Previous work can be categorized into transcoding in the frequency domain and transcoding in the pixel domain. De Cock et al. [17] proposed a frequency-domain scheme for transcoding from H.264/AVC to SVC in order to reduce coding time and complexity. Although it avoids the inverse transform process and thus saves computations, frequency-domain transcoding degrades video quality owing to the drift problem. The study of [18] developed a scheme to transcode a single-layer H.264/AVC bitstream into SNR-scalable SVC bitstreams in the CGS layer. To avoid the drift problem, that study uses a re-quantization error compensation method to prevent error propagation. However, the visual quality of this method shows an obvious gap compared with that of CPDT.

In pixel-domain transcoding, Garrido-Cantos et al. [19] developed a method for transcoding from H.264/AVC to SVC in temporal scalability in order to reduce computational complexity. The decoded motion vectors of H.264/AVC construct a reduced search area to accelerate the motion estimation process in SVC. However, their scheme does not discuss the mode decision, which is computationally intensive.

Al-Muscati and Labeau [20] also developed a video transcoding approach from H.264/AVC to SVC in temporal scalability. The motion vectors extracted from the H.264/AVC bitstream are mapped to either the hierarchical B frame or the zero-delay referencing structure in SVC; in addition, H.264/AVC’s modes are directly reused. Because the coding structures of H.264/AVC and SVC differ, reusing coding modes directly is inefficient and degrades coding performance significantly.

To avoid the drift problem and achieve the optimal rate distortion (RD) performance, the proposed video transcoding scheme operates in the pixel domain and focuses on the mode decision process. The proposed architecture therefore has low computational complexity and satisfactory coding performance.

Proposed video transcoding from H.264/AVC to SVC

This section introduces the proposed H.264/AVC to SVC video transcoding scheme, which maintains the visual quality of transcoded videos while reducing the coding time of transcoding.

Transcoding with the same QP

First, candidate modes are selected from the modes of the incoming H.264/AVC bitstream by using conditional probability. The conditional probability is a statistical mode distribution: the distribution of SVC’s modes for a given H.264/AVC mode. Next, Bayesian theorem is used to determine whether the current mode is the best one. Finally, a Markov chain uses transition probabilities to predict the likelihood of another candidate mode. The training set for the conditional probability, Bayesian theorem detection, and Markov chain consists of Football, Flower, Foreman, Carphone, and Mobile in CIF format, and each sequence contains 200 frames. The quantization parameter is set to 25 and 35 to cover both high and low bitrates.

Candidate mode selection through conditional probability

Reducing computational complexity depends on the ability to efficiently reuse the information of the incoming bitstream. Although the modes of H.264/AVC cannot be applied to SVC directly, they provide hints for predicting SVC’s modes. Therefore, based on an analysis of the mode distribution of these two standards, this study develops a conditional probability model to select candidate modes. The candidate modes are those with the highest conditional probability of the SVC mode given the H.264/AVC mode. Here, Mode0 to Mode6 represent Skip mode, Inter16 × 16, Inter16 × 8, Inter8 × 16, Inter8 × 8, Intra16 × 16, and Intra4 × 4, respectively. Table 1 summarizes the statistical results of the mode distribution between H.264/AVC and SVC.

Table 1 Conditional probability of the SVC’s mode distribution
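
The candidate selection step can be illustrated with a minimal sketch. It assumes that training produces pairs of (H.264/AVC mode, SVC mode) for co-located macroblocks, estimates P(SVC mode | H.264/AVC mode) from their co-occurrence counts, and keeps the k most probable SVC modes as candidates. The function names, the value of k, and the toy training pairs are hypothetical and are not taken from the article or from Table 1; only the Mode0 to Mode6 indexing follows the text.

```python
from collections import Counter, defaultdict

# Mode indices follow the text: Mode0 .. Mode6.
MODES = ["Skip", "Inter16x16", "Inter16x8", "Inter8x16",
         "Inter8x8", "Intra16x16", "Intra4x4"]


def conditional_table(pairs):
    """Estimate P(SVC mode = j | H.264/AVC mode = i) from (avc, svc) pairs."""
    counts = defaultdict(Counter)
    for avc_mode, svc_mode in pairs:
        counts[avc_mode][svc_mode] += 1
    table = {}
    for avc_mode, row in counts.items():
        total = sum(row.values())
        table[avc_mode] = {svc: n / total for svc, n in row.items()}
    return table


def candidate_modes(table, avc_mode, k=2):
    """Return the k SVC modes with the highest conditional probability.

    k is an illustrative parameter, not a value from the article.
    Ties are broken by the smaller mode index.
    """
    row = table.get(avc_mode, {})
    ranked = sorted(row.items(), key=lambda item: (-item[1], item[0]))
    return [svc for svc, _ in ranked[:k]]


# Toy training pairs, invented for illustration only.
pairs = ([(1, 1)] * 6 + [(1, 0)] * 3 + [(1, 2)] * 1
         + [(0, 0)] * 8 + [(0, 1)] * 2)
table = conditional_table(pairs)
top = candidate_modes(table, avc_mode=1, k=2)
print([MODES[m] for m in top])
```

With these toy counts, an incoming Inter16 × 16 macroblock yields Inter16 × 16 and Skip as the SVC candidates, so only those modes would need to be evaluated by the later stages instead of all seven.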
