 Research Article
 Open Access
Block-Matching Translational and Rotational Motion Compensated Prediction Using Interpolated Reference Frame
Ka-Ho Ng^{1},
Lai-Man Po^{1},
Kwok-Wai Cheung^{2} and
Ka-Man Wong^{1}
https://doi.org/10.1155/2010/385631
© Ka-Ho Ng et al. 2010
 Received: 26 July 2010
 Accepted: 18 November 2010
 Published: 30 November 2010
Abstract
Motion compensated prediction (MCP) as implemented in most video coding schemes is based on the translational motion model. However, non-translational motions, for example rotational motions, are common in videos. Research on higher-order motion models tries to enhance the prediction accuracy of MCP by modeling these non-translational motions. However, such models require affine parameter estimation, and most of them have very high computational complexity. In this paper, a translational and rotational MCP method using special subsampling in the interpolated frame is proposed. This method is simple to implement and has low computational complexity. Experimental results show that many blocks can be better predicted by the proposed method, so a higher prediction quality can be achieved with acceptable overheads. We believe this approach opens a new direction in MCP research.
Keywords
 Rotation Angle
 Motion Vector
 Prediction Quality
 Video Coding Standard
 Angle Interval
1. Introduction
Modern video coding schemes achieve high compression efficiency by exploiting the temporal redundancy between frames via motion compensated prediction (MCP). In MCP, a block of pixels in a reference frame is chosen as the prediction candidate for a block in the current frame. Conventional MCP assumes that objects move along the imaging plane with translational motion, and most video coding standards implement MCP based on this classical translational motion model. Much research has been done to increase the efficiency of translational MCP; for example, in H.264/AVC [1], an advanced block-matching motion estimation algorithm is adopted [2]. Multiple reference frames (MRF) [3] are also adopted to provide additional prediction candidates over a longer period of time. Another MCP technique is variable block size (VBS) [4] motion compensation. Together, these techniques push the performance of translational motion-based MCP close to its limit.
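As a concrete illustration of the classical translational model, a minimal full-search (exhaustive) block-matching loop minimising the sum of absolute differences (SAD) might look like the following sketch; the function name and NumPy formulation are ours, not from the paper:

```python
import numpy as np

def full_search_sad(cur_block, ref_frame, top, left, search_range=16):
    """Exhaustive (full-search) translational block matching.

    Returns the integer motion vector (dy, dx) minimising the SAD
    between cur_block and a co-sized block in ref_frame, searched
    over a +/-search_range window around (top, left).
    """
    bh, bw = cur_block.shape
    best_sad, best_mv = None, (0, 0)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if (y < 0 or x < 0 or
                    y + bh > ref_frame.shape[0] or
                    x + bw > ref_frame.shape[1]):
                continue  # candidate block falls outside the frame
            cand = ref_frame[y:y + bh, x:x + bw]
            sad = int(np.abs(cur_block.astype(np.int32)
                             - cand.astype(np.int32)).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv, best_sad
```

Every higher-order scheme discussed below can be seen as enlarging the candidate set that this inner loop searches over.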
However, the projection of real-world moving objects onto a 2D imaging plane does not always result in purely translational object motion. Rotation, zoom [5], and other non-rigid motions are also pervasive in videos. Research has therefore been conducted on higher-order motion models such as affine [6, 7], bilinear [8], quadratic [9], perspective [10], and elastic [11] models. These higher-order motion models aim to capture non-translational motions so that MCP prediction accuracy can be increased, at the expense of additional motion vectors or parameters. However, these methods require motion parameter estimation. A commonly used method is the Gauss-Newton minimization algorithm, in which the motion parameters are iteratively updated until a minimum of the cost function is found [12]. Motion parameter estimation is in general of very high computational complexity. Moreover, sub-pixel reconstruction is required for these higher-order motion models, because the transformed positions may not coincide with sampling points of the image, and interpolation is required to obtain the intensity values at these positions. This further increases the computational complexity. As a result, higher-order motion models are seldom used in practical coding applications.
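To make the iterative parameter estimation concrete, the sketch below applies a Gauss-Newton update to a toy one-parameter problem: fitting the rotation angle relating two 2-D point sets. It only illustrates the linearise-and-solve update scheme surveyed in [12]; the names and the toy setting are ours:

```python
import numpy as np

def gauss_newton_rotation(src, dst, theta0=0.0, iters=20):
    """Fit the angle theta minimising ||R(theta) @ src - dst||^2 by
    Gauss-Newton: linearise the residual around the current theta and
    solve the resulting one-dimensional least-squares problem."""
    theta = theta0
    for _ in range(iters):
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, -s], [s, c]])        # rotation matrix
        dR = np.array([[-s, -c], [c, -s]])     # derivative dR/dtheta
        r = (R @ src - dst).ravel()            # residual vector
        J = (dR @ src).ravel()                 # Jacobian (single column)
        step = -(J @ r) / (J @ J)              # normal-equation solve
        theta += step
        if abs(step) < 1e-12:                  # converged
            break
    return theta
```

A full affine model repeats this with a six-parameter Jacobian and image resampling inside every iteration, which is where the high computational cost arises.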
In this paper, a new translational and rotational MCP method is proposed, in which special subsampling in the interpolated reference frame effectively predicts rotational block motions. In the next section, this method is discussed in detail. In Section 3, experimental results are provided and discussed. In the last section, a conclusion is given.
2. Translational and Rotational Motion Compensated Prediction
2.1. Rotated Subsampling in Interpolated Reference Frame
2.2. Computational Complexity Determined by Search Angle Range and Interval
Interpolated reference frames exist in all codecs that implement fractional-pixel accuracy MCP, and the rotation coordinates can be precomputed. Therefore, the only increase in computational complexity in the proposed method comes from the additional block matchings between the rotated blocks and the current block. Another overhead is the number of bits required to code the best matched rotation angle. However, if we control the number of rotation angles used, we can represent the angles with only a few bits and at the same time bound the computational complexity. Experiments show that up to 37% of the blocks can be better predicted with rotational MCP. The angle of rotation is usually small, especially for static video sequences like Akiyo; this is reasonable, as objects usually do not rotate much between frames. We define a search angle interval and a search angle range so that the number of block matchings between the rotated blocks and the current block is fixed. For a search angle interval Δθ, block matching is performed for block rotations of ±Δθ, ±2Δθ, and so on, within the search angle range. For example, if the search angle interval is 0.1° and the search range is ±5°, a total of 10°/0.1° = 100 rotated block matchings will be performed. For practical implementation, we choose a smaller number of rotational searches; for example, the performance of using 16 searches in each of the clockwise and anticlockwise directions, that is, 32 in total, is investigated in detail. Only 5 bits are required to represent the 32 rotation angles. For applications requiring lower computational complexity, even fewer rotational searches can be used. With a fixed number of searches, the search angle interval and range can still be varied: for example, a search angle interval of 0.1° with a search angle range of ±1.6° and an interval of 0.2° with a range of ±3.2° both give 32 rotational searches.
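The relation between the search angle interval, the search angle range, and the signalling cost can be sketched as follows (a simple illustration under our own naming, not the authors' code):

```python
import math

def rotation_angles(interval_deg, range_deg):
    """Candidate rotation angles (in degrees) for a search angle interval
    and a symmetric search angle range. 0 degrees is excluded because it
    is the ordinary non-rotated (translational) match."""
    n = int(round(range_deg / interval_deg))   # searches per direction
    angles = [k * interval_deg for k in range(-n, n + 1) if k != 0]
    bits = math.ceil(math.log2(len(angles)))   # bits to signal the angle
    return angles, bits
```

For instance, an interval of 0.1° with range ±1.6° and an interval of 0.2° with range ±3.2° both yield 32 angles signalled with 5 bits, while four angles (interval 2.0°, range ±4.0°) need only 2 bits.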
By comparing the prediction accuracy of different search angle intervals with the same total number of searches, we found that a larger angle interval performs better for complex-motion sequences such as Foreman, whereas for small-motion sequences such as Akiyo a smaller angle interval gives better prediction quality. This is logical: complex-motion sequences contain blocks with rotations of larger extent, and with the same number of searches a larger angle interval covers larger rotations. Static sequences, on the other hand, contain blocks with very small rotations, for which a smaller angle interval is more suitable. At this stage we have not yet found a search angle interval and range that are robust across all sequences. In the next section we show experimental results for some typical values we tested.
The proposed MCP method can be summarized below:
Step 1.
Find the best translational motion vector (MV) in integer-pixel accuracy using traditional integer motion estimation.
Step 2.
At the position pointed to by the best translational MV and at each of its surrounding fractional-pixel accuracy positions, perform original (non-rotated) block matching and rotated block matching using the special subsampling method.
Step 3.
Return the position and the rotation angle that give the lowest distortion. This is the translational and rotational MV that will be encoded and transmitted to the decoder side.
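The rotated matching of Step 2 can be sketched as follows, assuming the reference frame has already been up-sampled by an integer factor `up` (the same interpolated frame used for fractional-pixel MCP). The rotated sampling coordinates are precomputed once per angle and merely rounded to the nearest interpolated sample, so no extra interpolation is needed at search time. This is a sketch of the subsampling idea, not the authors' exact implementation:

```python
import numpy as np

def rotated_offsets(block_size, up, angle_deg):
    """Precompute, for one rotation angle, the integer sampling offsets
    into an up-sampled reference frame: rotate every pixel position of
    the block about its centre, then round to the nearest sample."""
    th = np.deg2rad(angle_deg)
    c = (block_size - 1) / 2.0
    ii, jj = np.meshgrid(np.arange(block_size), np.arange(block_size),
                         indexing="ij")
    y, x = ii - c, jj - c                        # coordinates about the centre
    yr = y * np.cos(th) - x * np.sin(th)         # rotated positions
    xr = y * np.sin(th) + x * np.cos(th)
    return (np.rint((yr + c) * up).astype(int),  # nearest interpolated sample
            np.rint((xr + c) * up).astype(int))

def rotated_sad(cur_block, interp_ref, top_up, left_up, offsets):
    """SAD between the current block and a rotated candidate sampled
    from the interpolated reference at position (top_up, left_up)."""
    oy, ox = offsets
    cand = interp_ref[top_up + oy, left_up + ox]
    return int(np.abs(cur_block.astype(np.int32) - cand).sum())
```

At angle 0° this reduces to the ordinary non-rotated match; the encoder evaluates `rotated_sad` for every precomputed angle at every candidate fractional-pixel position and keeps the minimum.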
3. Experimental Results
Experiments using the CIF sequences Soccer, Stefan, Crew, Foreman, Mobile, and Akiyo were performed to analyze the performance of the proposed translational and rotational MCP. The block size is pixels. The search window size is ±16 pixels. Integer-pixel motion estimation is performed using the exhaustive search (full search) algorithm, which searches every integer position in the search window.
Because the coordinates of the subsampled pixels are eventually rounded to integers, some rotation angles will be so small that the rounded coordinates are identical to those of the previous angle. To avoid repeated calculations, we skip those angles in our experiments.
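This skipping can be sketched as follows: two angles are redundant exactly when their rounded coordinate tables coincide, which is cheap to detect once because the tables are precomputed (a sketch with our own naming, not the authors' code):

```python
import numpy as np

def coord_table(block_size, up, angle_deg):
    """Rounded sampling coordinates (in an up-sampled frame) for one angle."""
    th = np.deg2rad(angle_deg)
    c = (block_size - 1) / 2.0
    ii, jj = np.meshgrid(np.arange(block_size), np.arange(block_size),
                         indexing="ij")
    y, x = ii - c, jj - c
    yr = np.rint(((y * np.cos(th) - x * np.sin(th)) + c) * up).astype(int)
    xr = np.rint(((y * np.sin(th) + x * np.cos(th)) + c) * up).astype(int)
    return yr, xr

def distinct_angles(block_size, up, angles_deg):
    """Keep only angles whose rounded coordinate table differs from all
    previously kept ones, skipping redundant SAD computations."""
    kept, seen = [], set()
    for a in angles_deg:
        yr, xr = coord_table(block_size, up, a)
        key = (yr.tobytes(), xr.tobytes())
        if key not in seen:
            seen.add(key)
            kept.append(a)
    return kept
```

With a 16×16 block and a 4× interpolated frame, angles a few hundredths of a degree apart collapse to the same table and are searched only once.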
3.1. Prevalence of Rotational Motion
Rotation angle distribution in sequence Foreman.
Foreman

| Angle range (degree) | No. of blocks selected RMCP | Angle range (degree) | No. of blocks selected RMCP |
|---|---|---|---|
| | 3450 | | 3357 |
| | 872 | | 889 |
| | 468 | | 462 |
| | 334 | | 349 |
| | 170 | | 148 |
| | 85 | | 111 |
| | 93 | | 126 |
| | 87 | | 93 |
| | 29 | | 32 |
| | 25 | | 27 |
| | 21 | | 21 |
| | 16 | | 16 |
| | 10 | | 11 |
| | 11 | | 8 |
| | 6 | | 11 |
| | 7 | | 7 |
| | 5 | | 3 |
| | 5 | | 1 |
| | 2 | | 3 |
| | 5 | | 2 |
| | 4 | | 1 |
| | 2 | | 6 |
| | 1 | | 6 |
| | 1 | | 1 |
| | 0 | | 2 |
| | 1 | | 0 |
| | 4 | | 2 |
| | 0 | | 0 |
| | 1 | | 2 |
| | 7 | | 6 |
| | 9 | | 8 |

Sum: 11442 (29.19% of total no. of blocks)
Rotation angle distribution in sequence Stefan.
Stefan

| Angle range (degree) | No. of blocks selected RMCP | Angle range (degree) | No. of blocks selected RMCP |
|---|---|---|---|
| | 5785 | | 5648 |
| | 605 | | 625 |
| | 257 | | 282 |
| | 239 | | 233 |
| | 81 | | 82 |
| | 84 | | 71 |
| | 69 | | 63 |
| | 52 | | 58 |
| | 25 | | 28 |
| | 22 | | 29 |
| | 13 | | 25 |
| | 6 | | 18 |
| | 4 | | 6 |
| | 2 | | 13 |
| | 4 | | 5 |
| | 10 | | 4 |
| | 0 | | 1 |
| | 2 | | 4 |
| | 2 | | 2 |
| | 1 | | 3 |
| | 0 | | 3 |
| | 1 | | 2 |
| | 0 | | 1 |
| | 3 | | 4 |
| | 0 | | 1 |
| | 1 | | 1 |
| | 1 | | 1 |
| | 0 | | 5 |
| | 0 | | 0 |
| | 0 | | 3 |
| | 6 | | 6 |

Sum: 14502 (36.99% of total no. of blocks)
Rotation angle distribution in sequence Akiyo.
Akiyo

| Angle range (degree) | No. of blocks selected RMCP | Angle range (degree) | No. of blocks selected RMCP |
|---|---|---|---|
| | 849 | | 918 |
| | 97 | | 114 |
| | 26 | | 26 |
| | 15 | | 16 |
| | 4 | | 4 |
| | 2 | | 1 |
| | 1 | | 0 |
| | 2 | | 3 |
| | 2 | | 0 |
| | 2 | | 2 |
| | 0 | | 0 |
| | 1 | | 0 |
| | 1 | | 0 |
| | 0 | | 0 |

Sum: 2086 (5.32% of total no. of blocks)
3.2. Optimum Search Angle Range and Interval
For the static-motion sequence Akiyo, the prediction quality cannot be improved with a larger rotation angle interval; on the contrary, the quality drops slightly, because blocks in static sequences rotate only very slightly. In the sequence Mobile, using a smaller rotation angle interval also gives slightly better performance.
3.3. Computational Complexity of Proposed MCP Method
To estimate the computational complexity of the proposed MCP method in a practical system, we measured the peak signal-to-noise ratio (PSNR) achieved with four rotated searches and an angle interval of 2.0°. The computational complexity of four rotated searches at a given fractional-pixel accuracy is similar to that of translational-only MCP at the next higher fractional-pixel accuracy. For example, the number of SAD calculations of the proposed MCP method with four rotated searches at 1/4-pixel accuracy is 244, because at each 1/4-pixel position four rotated matchings and one original matching are performed. The number of SAD calculations of translational-only MCP at 1/8-pixel accuracy is 224, because there are 224 candidate search positions. These two numbers are comparable, which means their computational complexities are similar. If the proposed method at a given fractional-pixel accuracy can achieve better prediction accuracy than traditional MCP at a higher fractional-pixel accuracy, it has great potential to replace the higher-accuracy traditional MCP.
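The SAD counts used here are consistent with a refinement grid of (2^(k+1) − 1)² candidate positions around the best integer position at 1/2^k-pixel accuracy; each position needs one original matching (plus four rotated matchings for the proposed TRMCP), and the integer position's original matching is already covered by integer ME. This counting is our reconstruction, not a formula stated explicitly in the paper:

```python
def sad_counts(k, rotated_searches=4):
    """SAD calculations after integer ME at 1/2**k-pixel accuracy.

    The fractional refinement around the best integer MV covers a
    (2**(k+1) - 1) x (2**(k+1) - 1) grid of candidate positions; the
    integer position's original (non-rotated) matching is not recounted.
    """
    grid = (2 ** (k + 1) - 1) ** 2
    translational = grid - 1
    trmcp = grid * (1 + rotated_searches) - 1
    return translational, trmcp
```

Evaluating k = 1..4 reproduces the counts 8/44, 48/244, 224/1124, and 960/4804 reported in the comparison table.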
Comparison of computational complexity and prediction accuracy.
| Accuracy | No. of SAD calculations after integer ME (translational-only / TRMCP) | Soccer | Stefan | Crew | Foreman | Mobile | Akiyo |
|---|---|---|---|---|---|---|---|
| 1/2-pixel | 8 / 44 | 30.16 / 30.38 | 26.90 / 26.99 | 32.93 / 33.06 | 34.25 / 34.36 | 26.16 / 26.25 | 43.92 / 43.98 |
| 1/4-pixel | 48 / 244 | 30.51 / 30.78 | 27.46 / 27.58 | 33.49 / 33.66 | 34.75 / 34.91 | 27.32 / 27.49 | 44.91 / 44.98 |
| 1/8-pixel | 224 / 1124 | 30.58 / 30.87 | 27.60 / 27.72 | 33.61 / 33.79 | 34.85 / 35.03 | 27.58 / 27.78 | 45.35 / 45.43 |
| 1/16-pixel | 960 / 4804 | 30.59 / 30.88 | 27.63 / 27.75 | 33.65 / 33.82 | 34.87 / 35.05 | 27.63 / 27.84 | 45.47 / 45.56 |

Each sequence cell gives the average PSNR per frame (dB) as translational-only MCP / TRMCP (4 rotated searches).
4. Conclusion
In this paper, translational and rotational MCP implemented by special subsampling in the interpolated frame has been proposed. It is found that up to 37% of the blocks can be better predicted with rotational MCP. The proposed method has the merits of easy implementation and low overhead: the interpolated frame used by rotational MCP is the same as that used by fractional-pixel accuracy MCP, which exists in most recent video coding standards. Experimental results show that higher fractional-pixel accuracies, for example 1/16-pixel, cannot improve the prediction accuracy of translational MCP much further, while requiring the additional computation overhead of extra interpolation. With regard to the side-information overhead, MCP with higher fractional-pixel accuracy needs more bits to transmit the higher-accuracy MV; for example, the number of candidate search positions of 1/16-pixel accuracy MCP is around four times that of 1/8-pixel accuracy MCP. Our proposed method only needs to transmit one rotation angle parameter; for example, four rotation angles can be represented by 2 bits. The increase in side-information overhead is negligible.
In view of the decreasing effectiveness of MCP at higher fractional-pixel accuracies, the proposed method points to a new research direction for further improving MCP performance. Further work in this direction includes the determination of an optimized search angle interval and range that are robust across video sequences of different motion content. The correlation between the translational MV and the rotation angle is also under investigation.
Declarations
Acknowledgment
The work described in this paper was substantially supported by a grant from the Hong Kong SAR Government under GRF Project no. 9041501 (CityU 119909).
References
[1] Wiegand T, Sullivan GJ, Bjøntegaard G, Luthra A: Overview of the H.264/AVC video coding standard. IEEE Transactions on Circuits and Systems for Video Technology 2003, 13(7):560-576.
[2] Tourapis AM: Enhanced predictive zonal search for single and multiple frame motion estimation. Visual Communications and Image Processing, 2002, San Jose, Calif, USA, Proceedings of SPIE 4671:1069-1079.
[3] Wiegand T, Zhang X, Girod B: Long-term memory motion-compensated prediction. IEEE Transactions on Circuits and Systems for Video Technology 1999, 9(1):70-84. doi:10.1109/76.744276
[4] Sullivan GJ, Baker RL: Rate-distortion optimized motion compensation for video compression using fixed or variable size blocks. Proceedings of IEEE Global Telecommunications Conference (GLOBECOM '91), December 1991, Phoenix, Ariz, USA, 85-90.
[5] Wong KM, Po LM, Cheung KW, Ng KH: Block-matching translation and zoom motion-compensated prediction by sub-sampling. Proceedings of IEEE International Conference on Image Processing (ICIP '09), November 2009, Cairo, Egypt, 1597-1600.
[6] Wiegand T, Steinbach E, Girod B: Affine multipicture motion-compensated prediction. IEEE Transactions on Circuits and Systems for Video Technology 2005, 15(2):197-209.
[7] Kordasiewicz RC, Gallant MD, Shirani S: Affine motion prediction based on translational motion vectors. IEEE Transactions on Circuits and Systems for Video Technology 2007, 17(10):1388-1394.
[8] Sullivan GJ, Baker RL: Motion compensation for video compression using control grid interpolation. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '91), April 1991, Toronto, Canada, 4:2713-2716.
[9] Karczewicz M, Nieweglowski J, Haavisto P: Video coding using motion compensation with polynomial motion vector fields. Signal Processing: Image Communication 1997, 10(1-3):63-91. doi:10.1016/S0923-5965(97)00019-2
[10] Nakaya Y, Harashima H: Motion compensation based on spatial transformations. IEEE Transactions on Circuits and Systems for Video Technology 1994, 4(3):339-356. doi:10.1109/76.305878
[11] Pickering MR, Frater MR, Arnold JF: Enhanced motion compensation using elastic image registration. Proceedings of IEEE International Conference on Image Processing, October 2006, 1061-1064.
[12] Zitová B, Flusser J: Image registration methods: a survey. Image and Vision Computing 2003, 21(11):977-1000. doi:10.1016/S0262-8856(03)00137-9
[13] Girod B: Motion-compensating prediction with fractional-pel accuracy. IEEE Transactions on Communications 1993, 41(4):604-612. doi:10.1109/26.223785
[14] Po LM, Wong KM, Ng KH, et al.: Motion compensated prediction by subsampled block matching for zoom motion contents. ISO/IEC JTC1/SC29/WG11 MPEG2010/M17163, 91st MPEG Meeting, 2010, Kyoto, Japan.
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.