Enhanced JPEG2000 Quality Scalability through Block-Wise Layer Truncation
© Francesc Auli-Llinas et al. 2010
- Received: 1 December 2009
- Accepted: 26 April 2010
- Published: 31 May 2010
Quality scalability is an important feature of image and video coding systems. In JPEG2000, quality scalability is achieved through quality layers that the encoder forms using rate-distortion optimization techniques. Quality layers provide optimal rate-distortion representations of the image when the codestream is transmitted and/or decoded at layer boundaries. Nonetheless, applications such as interactive image transmission, video streaming, or transcoding demand layer fragmentation. The common approach to truncating a layer is to keep its initial prefix, which may greatly penalize the quality of decoded images, especially when the layer allocation is inadequate. So far, only one method providing enhanced quality scalability for compressed JPEG2000 imagery has been proposed in the literature. However, that method incurs high computational costs, which prevents its use in the aforementioned applications. This paper introduces a Block-Wise Layer Truncation (BWLT) strategy that enhances the quality scalability of compressed JPEG2000 images at negligible computational cost. The main insight behind BWLT is to dismantle and reassemble the to-be-fragmented layer by selecting the most relevant codestream segments of codeblocks within that layer. The selection process is derived from a rate-distortion model that finely estimates the rate-distortion contributions of codeblocks. Experimental results suggest that BWLT achieves near-optimal performance even when the codestream contains a single quality layer.
- Discrete Wavelet Transform
- Video Streaming
- Quality Scalability
- Packet Header
- Quality Layer
Quality scalability is an important feature provided by modern image and video coding systems to allow the transmission and/or decoding of compressed codestreams at several bitrates without sacrificing coding performance. Quality scalability is key in applications like interactive image transmission, video streaming, or transcoding, among others. Commonly, it is achieved by means of the formation of successive layers of quality that, progressively decoded, provide optimal rate-distortion representations of the image.
JPEG2000 [1] is a prominent image coding standard that provides advanced features such as lossy and lossy-to-lossless compression, random codestream access, and five progression orders that offer scalability by quality, spatial location, resolution, and component. To suit the quality scalability requirements of applications, JPEG2000 permits the user to specify the layer allocation of the codestream. The density and bitrate distribution of layers are selected at encoding time, determining the rate-distortion optimality of the codestream [2]. It is important to construct codestreams with a layer allocation that works well for most applications. Nonetheless, the practical use of quality layers must consider that, once the codestream is constructed, the layer allocation cannot be modified without fully reencoding the image or using computationally intensive techniques such as [3] or [4]. If the codestream has an inadequate layer allocation, the quality of decoded images can be penalized by more than 10 dB, especially when insufficient quality layers are available (see Section 4).
The high degree of flexibility provided by JPEG2000 is adjusted through several coding parameters that are all set at encoding time. For simplicity, applications and libraries commonly use the default parameters of the underlying JPEG2000 implementation. Most implementations, including Kakadu, JJ2000, Jasper, OpenJPEG, ECW, and LeadTools, construct by default codestreams containing a single quality layer. Though single quality layer codestreams may be adequate for the basic requirements of applications, this default layer allocation may become inadequate when extended functionalities are required. Such a case is found, for example, in the medical environment. Before the advent of interactive image transmission, the main concern of medical institutions was to compress images losslessly, so imagery compressed using single quality layer codestreams was sufficient. If these images need to be interactively transmitted in, for instance, current telemedicine environments, the lack of quality scalability becomes an unacceptable shortcoming. Another example is the compression and distribution of digital cinema, in which the frames of a movie are compressed using single quality layer codestreams [5]. The lack of quality scalability hinders the real-time rendering of such movies on personal computers and their streaming over the Internet. Functional changes in such environments may cause serious problems for the adequate manipulation of already compressed images. Reencoding is not a viable solution because of the large amount of data or, in some cases, because of limited computational resources. Hence, enhanced quality scalability for already compressed images is a pressing need in environments that deal with JPEG2000 imagery.
The purpose of this paper is to introduce a simple yet efficient strategy of quality layer truncation for JPEG2000 that, requiring negligible computational cost, can be employed in most applications to enhance the quality scalability of already compressed codestreams constructed through an inadequate layer allocation. A preliminary version of this work was presented at the Workshop on Scalable Coded Media Beyond Compression [6]. This paper extends that prior work with a better description, justification, and comparison between the proposed layer truncation and state-of-the-art methods. In addition, the research has been extended by implementing the proposed strategy of layer truncation in interactive image and video transmission scenarios, which are niche applications of our method. This paper also contributes extensive experimental results considering computational costs, visual performance, and several configurations of quality layer allocation.
The paper is structured as follows. Section 2 overviews the JPEG2000 core coding system and reviews the state-of-the-art of layer formation and rate-distortion optimization. Section 3 introduces the proposed strategy of layer truncation, and Section 4 evaluates its performance in three applications that require enhanced quality scalability. The last section provides concluding remarks.
2.1. JPEG2000 Core Coding System
Tier-1 successively refines the coefficients' distortion by means of a fractional bitplane coder that encodes all coefficients of codeblock $B_i$ from the highest bitplane $p = K_i - 1$ to the lowest bitplane $p = 0$, with $K_i$ denoting the minimum number of magnitude bitplanes needed to represent all coefficients of $B_i$. In each bitplane, Tier-1 carries out three subbitplane coding passes that are called Significance Propagation Pass (SPP), Magnitude Refinement Pass (MRP), and Cleanup Pass (CP). Each bit produced by the bitplane coder is encoded by the binary arithmetic coder MQ, which employs contextual information to adaptively encode input data. This encoding process produces a quality embedded bitstream for each codeblock that contains first the coding passes with the greatest distortion reductions, and that contains a large collection of potential truncation points (one at the end of each coding pass).
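The sequence of coding passes produced for one codeblock can be sketched as follows. This is only an illustration: the function name `coding_passes` is hypothetical, and the sketch relies on the fact that, in JPEG2000, the most significant bitplane of a codeblock is coded with a cleanup pass only, so a codeblock with K bitplanes yields 3K - 2 coding passes and as many potential truncation points.

```python
from typing import List, Tuple


def coding_passes(num_bitplanes: int) -> List[Tuple[int, str]]:
    """List (bitplane, pass_type) pairs in coding order for one codeblock."""
    passes: List[Tuple[int, str]] = []
    for p in range(num_bitplanes - 1, -1, -1):  # highest to lowest bitplane
        if p == num_bitplanes - 1:
            # The most significant bitplane only has a cleanup pass.
            passes.append((p, "CP"))
        else:
            passes.extend([(p, "SPP"), (p, "MRP"), (p, "CP")])
    return passes


# Each entry marks the end of a coding pass, i.e., a potential truncation point.
example = coding_passes(3)
```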
Once the bitstream segments included in each quality layer are selected, the codestream reorganization stage encodes, through Tier-2 coding, the auxiliary data needed to properly identify the content of quality layers. The final codestream is organized in several containers that encapsulate and sort the bitstream segments of codeblocks. As Figure 3 depicts, containers within the codestream are closely related to the partitioning system defined by JPEG2000. First, the dyadic decomposition carried out by the wavelet transform produces a multiresolution representation of the image. Second, each resolution level is composed of three subbands, denoted as HL, LH, and HH, that contain low and high frequencies in the horizontal and vertical directions. The lowest resolution level (also referred to as the LL subband) is the only resolution level that contains a single subband, which holds the low frequencies. As depicted in Figure 3, the third image partition consists of precincts, which are defined as the same spatial region of all subbands within one resolution level. Finally, each precinct is subdivided into codeblocks. Data belonging to codeblocks within one precinct are coded in the smallest accessible container of the codestream, the packet. This sophisticated partitioning system is aimed at the rapid editing and management of the image in the compressed domain.
The progression order defines how packets are sorted in the final codestream. JPEG2000 defines five progression orders, denoted as LRCP, RLCP, RPCL, PCRL, and CPRL. Characters L, R, C, and P stand for quality Layer, Resolution, Component, and spatial Position, respectively. The most common progression order is LRCP, in which the primary sorting directive is by quality. This means that all packets belonging to the first quality layer are situated at the very beginning of the codestream. After them, packets belonging to the second quality layer are included, and so on. Within each quality layer, packets are sorted by resolution level (second sorting directive), that is, first all packets belonging to precincts of the lowest resolution level are included, then those belonging to the second resolution level, and so on. Within each resolution level, packets are sorted by component (third sorting directive), and the last sorting directive is by position, which sorts packets depending on their spatial location. An illustrative example of the LRCP progression is depicted in Figure 3. The remaining progression orders employ the same principles as LRCP, but with the directives in a different order.
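The sorting directives can be illustrated with a small sketch that enumerates packet indices for any of the five progression orders. The function `packet_order` and its `sizes` argument are hypothetical names, and the sketch assumes, for simplicity, that every resolution level and component has the same number of precincts, which does not hold in general.

```python
from itertools import product
from typing import Dict, Iterator, Tuple

PROGRESSIONS = {"LRCP", "RLCP", "RPCL", "PCRL", "CPRL"}


def packet_order(progression: str, sizes: Dict[str, int]) -> Iterator[Tuple[int, int, int, int]]:
    """Yield packets as (layer, resolution, component, position) in codestream order."""
    if progression not in PROGRESSIONS:
        raise ValueError(f"unknown progression order: {progression}")
    # product() varies the last range fastest, so the first letter of the
    # progression acts as the primary sorting directive.
    ranges = [range(sizes[d]) for d in progression]
    for idx in product(*ranges):
        by_dim = dict(zip(progression, idx))
        yield (by_dim["L"], by_dim["R"], by_dim["C"], by_dim["P"])


# Two layers, two resolution levels, one component, two precincts per level.
sizes = {"L": 2, "R": 2, "C": 1, "P": 2}
lrcp = list(packet_order("LRCP", sizes))
rlcp = list(packet_order("RLCP", sizes))
```

With LRCP, all packets of layer 0 precede those of layer 1; with RLCP, all layers of resolution level 0 precede resolution level 1.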
2.2. Review of Layer Formation and Rate-Distortion Optimization
Even though PCRD achieves optimal results in terms of rate-distortion performance, in some scenarios it cannot be applied as originally formulated because of restrictions inherent in applications, such as limited computational resources and scan-based acquisition. Several alternatives to PCRD have been proposed in the literature, most of them focused on reducing the computational load of Tier-1, which can be achieved when only those coding passes included in the final codestream are encoded. These methods might be roughly classified into four classes, as characterized in [8]: (i) the sample data coding and rate-distortion optimization are carried out simultaneously [9–11]; (ii) statistics from the already encoded codeblocks are collected to decide which coding passes need to be encoded in the remaining codeblocks [12, 13]; (iii) rate-distortion contributions of codeblocks are estimated before the encoding process [14]; and (iv) suitable step sizes for the wavelet subbands are determined before encoding [15, 16]. The complementary problem of optimizing the bitrate for a target quality is addressed in [17, 18], which also reduces the computational load of Tier-1. Specific rate-distortion optimization techniques have also been developed for scenarios such as scan-based applications, the coding of hyperspectral data [19, 20], implementations of motion JPEG2000 [21], and images containing tiles [22].
Most of the proposed methods of rate-distortion optimization can be employed to allocate successive layers of quality at increasing bitrates. The most common strategy for layer allocation is to distribute layers in terms of bitrate through a uniform or a logarithmic function [1, Chapter 8.4.1], employing PCRD or derived approaches afterward. Other approaches are to let the bitplane coder self-determine the layer allocation [23], or to optimally allocate layers considering the expected multirate-distortion measure introduced in [2].
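As a rough illustration of the two common allocation strategies, the following sketch computes target bitrates for the layers. The function names and the bits-per-sample units are assumptions made for the example, and the logarithmic allocation is interpreted here as equal spacing on a logarithmic scale (a constant ratio between consecutive layers); an encoder would then run PCRD, or a derived method, once per target bitrate.

```python
from typing import List


def uniform_layers(max_bps: float, num_layers: int) -> List[float]:
    """Layer bitrates spaced evenly between max_bps / num_layers and max_bps."""
    return [max_bps * (l + 1) / num_layers for l in range(num_layers)]


def logarithmic_layers(min_bps: float, max_bps: float, num_layers: int) -> List[float]:
    """Layer bitrates with a constant ratio between consecutive layers."""
    if num_layers == 1:
        return [max_bps]
    ratio = (max_bps / min_bps) ** (1.0 / (num_layers - 1))
    return [min_bps * ratio ** l for l in range(num_layers)]


uniform = uniform_layers(4.0, 4)
logarithmic = logarithmic_layers(0.125, 4.0, 6)
```

The logarithmic allocation concentrates layers at low bitrates, where the rate-distortion curve changes fastest.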
Despite the use of different techniques to tackle the optimization problem, none of the aforementioned approaches can be directly applied to avoid loss when decoding, transmitting, or transcoding codestreams containing an inadequate layer allocation. The main difficulty in applying rate-distortion optimization once the codestream is constructed is that the rate-distortion statistics collected during the encoding process are no longer available. More precisely, neither distortion-rate slopes nor truncation points of codeblocks are maintained in the final codestream because they are not needed for the decoding process. Thus, this information is discarded to minimize the size of the codestream. Only some coders, such as Kakadu, may record the distortion-rate slope thresholds achieved at each quality layer in a COM marker of the codestream. Even so, when the layer is truncated, only the distortion-rate slope threshold for that layer is available, which reveals nothing about the individual distortion-rate slopes of coding passes and truncation points. The lack of rate-distortion statistics once the image is compressed prevents the use of classic rate-distortion optimization techniques. Though techniques to estimate distortion [24, 25] may aid transcoding procedures, their practical use requires partial decoding of the codestream. To the best of our knowledge, only our previous work [4, 26, 27] addresses the lack of quality scalability of codestreams containing an inadequate layer allocation through models that characterize the rate-distortion contributions of codeblocks. The main idea behind that approach is to estimate the distortion-rate slopes of coding passes without using distortion measures based on the original image or related to the encoding process. Again, the applicability of such an approach is limited since the lengths of individual coding passes are required; hence, its use may compel partial decoding of the codestream.
We recall that the lengths of coding passes are not commonly maintained in the codestream. They are only available when the restart coding variation [1, Chapter 12.4] is active. The restart coding variation is devised to allow intracodeblock parallelization of the bitplane coding and to provide error resilience. Our objective is to truncate quality layers in the compressed domain, without needing to decode any part of the codestream.
Let a codestream primarily progressive by quality contain $L$ quality layers allocated at bitrates $R_l$, with $1 \le l \le L$. When the codestream is truncated at a quality layer boundary $R_l$, the decoded image is optimal in the rate-distortion sense. However, when the codestream needs to be truncated at a bitrate $\hat{R}$, with $R_{l-1} < \hat{R} < R_l$ (taking $R_0 = 0$), the common approach of simply truncating layer $l$ by keeping its first portion does not guarantee optimal decoding. None of the rate-distortion optimization methods and techniques employed at encoding time can be used once the codestream is already constructed, since neither the rate-distortion statistics collected during the encoding process nor the lengths of individual coding passes are available.
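The common prefix truncation can be sketched as follows: the layers that fit entirely within the target are kept whole, and the leftover budget is spent on the initial bytes of the next layer. The function name `locate_truncation` is hypothetical, and sizes are expressed in bytes rather than bitrates to keep the arithmetic exact.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Tuple


def locate_truncation(layer_bytes: List[int], target_bytes: int) -> Tuple[int, int]:
    """Return (number of layers kept in full, bytes left for the next layer)."""
    cumulative = list(accumulate(layer_bytes))  # layer boundaries in bytes
    full = bisect_right(cumulative, target_bytes)
    if full == len(layer_bytes):
        return full, 0  # the whole codestream fits within the target
    used = cumulative[full - 1] if full > 0 else 0
    return full, target_bytes - used


# Three layers of 100, 200, and 400 bytes; a 250-byte target keeps the first
# layer in full and leaves 150 bytes to be taken from the second layer.
full_layers, leftover = locate_truncation([100, 200, 400], 250)
```

How those leftover bytes are distributed among the codeblocks of the truncated layer is precisely what distinguishes prefix truncation from a block-wise strategy.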
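For each codeblock, the packet headers of a JPEG2000 codestream signal:
- whether or not the codeblock contributes to the quality layer,
- the number of included coding passes,
- the length of the encoded data, and
- the number of missing most significant bitplanes, signaled the first time the codeblock is included.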
Instead of truncating a quality layer by keeping its initial prefix, we propose a Block-Wise Layer Truncation (BWLT) that modifies the number of bytes included for each codeblock within the truncated layer. The key point of this method is to determine an adequate number of bytes to keep for each codeblock. To do so, BWLT uses the rate-distortion model employed by the Coding Passes Interleaving (CPI) method [8, 26], together with an estimate of the lengths of individual coding passes.
The main assumption behind CPI is that coding passes belonging to the highest bitplanes have greater rate-distortion contributions than coding passes belonging to the lowest bitplanes. More precisely, let us define a coding level $c$ as the coding pass of all codeblocks of the image at the same level, defined as $c = 3p + cp$, with $p$ standing for the bitplane and $cp$ standing for the coding pass type, with $cp = 2$ for SPP, $cp = 1$ for MRP, and $cp = 0$ for CP. Through this rate-distortion model, coding passes are included from the highest coding level to the lowest one until the target bitrate is achieved.
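The ordering induced by coding levels can be sketched in a few lines. The sketch assumes that the coding level of a pass is three times its bitplane plus a pass-type index of 2 for SPP, 1 for MRP, and 0 for CP; this numbering is an assumption of the example, chosen so that a higher level always corresponds to a pass coded earlier.

```python
from typing import List, Tuple

# Assumed pass-type indices: SPP is coded first within a bitplane, CP last.
PASS_TYPE = {"SPP": 2, "MRP": 1, "CP": 0}


def coding_level(bitplane: int, pass_type: str) -> int:
    """Coding level of a pass, under the assumed numbering 3 * bitplane + type."""
    return 3 * bitplane + PASS_TYPE[pass_type]


def by_coding_level(passes: List[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Sort coding passes from the highest coding level to the lowest one."""
    return sorted(passes, key=lambda bp: coding_level(*bp), reverse=True)


unordered = [(0, "CP"), (1, "MRP"), (0, "SPP"), (1, "CP"), (1, "SPP"), (0, "MRP")]
ordered = by_coding_level(unordered)
```

Sorting by descending coding level reproduces the order in which the bitplane coder generates the passes, which is the order in which the model includes them.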
Algorithm 1 first includes in the final codestream the full content of the layers that precede the truncated layer $l$. For layer $l$, BWLT selects coding passes from the highest coding level to the lowest coding level within that layer (line 3). In each coding level, coding passes are selected from the lowest resolution level (i.e., subband LL) to the highest resolution level. If codeblock $B_i$ has a coding pass corresponding to the currently included coding level (line 7), the increment in the length of the bitstream of $B_i$ for that coding pass is estimated according to expression (4) (line 8). If the bitrate of the final codestream does not exceed the target bitrate $\hat{R}$, that coding pass is included in the final codestream. The algorithm finishes when the target bitrate is attained; then, the packet headers of the truncated layer are regenerated to adjust the bitstream lengths of codeblocks.
Algorithm 1: Procedure carried out by the Block-Wise Layer Truncation strategy.
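The selection loop described for the truncated layer can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `BlockContribution` record and function names are hypothetical, the coding passes of a codeblock within the layer are assumed to occupy consecutive coding levels, and `estimated_pass_length` is a stand-in that splits the bytes of a codeblock evenly among its coding passes instead of using the estimator of expression (4). The sketch also lets the last selected pass be included partially so that the budget is met exactly.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BlockContribution:
    """Contribution of one codeblock to the truncated layer (hypothetical model)."""
    name: str
    resolution: int  # 0 is the lowest resolution level (LL subband)
    top_level: int   # coding level of the first coding pass in this layer
    num_passes: int  # number of coding passes included in this layer
    num_bytes: int   # bytes contributed to this layer


def estimated_pass_length(block: BlockContribution, index: int) -> int:
    """Stand-in estimator: split the bytes evenly among the coding passes."""
    base, extra = divmod(block.num_bytes, block.num_passes)
    return base + (1 if index < extra else 0)


def bwlt_truncate(blocks: List[BlockContribution], budget: int) -> Dict[str, int]:
    """Return the number of bytes kept for each codeblock in the truncated layer."""
    kept = {b.name: 0 for b in blocks}
    active = sorted((b for b in blocks if b.num_passes > 0), key=lambda b: b.resolution)
    if not active:
        return kept
    highest = max(b.top_level for b in active)
    lowest = min(b.top_level - b.num_passes + 1 for b in active)
    remaining = budget
    for level in range(highest, lowest - 1, -1):  # highest to lowest coding level
        for b in active:                          # lowest to highest resolution
            index = b.top_level - level           # position of the pass in the layer
            if not 0 <= index < b.num_passes:
                continue                          # no coding pass at this level
            length = estimated_pass_length(b, index)
            if length > remaining:
                kept[b.name] += remaining         # partial pass fills the budget
                return kept
            kept[b.name] += length
            remaining -= length
    return kept


blocks = [
    BlockContribution("A", resolution=0, top_level=5, num_passes=2, num_bytes=100),
    BlockContribution("B", resolution=1, top_level=5, num_passes=1, num_bytes=60),
    BlockContribution("C", resolution=1, top_level=4, num_passes=2, num_bytes=40),
]
result = bwlt_truncate(blocks, budget=150)
```

In this example, a prefix truncation with codeblocks stored in the order A, B, C would keep all 100 bytes of A and only 50 bytes of B, whereas the sketch keeps the coding passes at the highest coding level of both A and B before spending the rest of the budget on A.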
Algorithm 1 uses a fixed scanning order to select codestream segments of codeblocks (loops of lines 3, 4, and 5). Our experience suggests that modifying this scanning order does not change the results significantly since, in practice, the truncated layer includes segments of most codeblocks. This is due to the outer loop in line 3, which selects coding passes from the highest to the lowest coding level within the layer, so small increments for all codeblocks are added progressively until the target bitrate is achieved. Figure 3 depicts an illustrative example of the layer truncation carried out by BWLT compared to the common approach of truncation by keeping the initial prefix.
For the sake of simplicity, throughout this section we have assumed that the base quantization step sizes, corresponding to bitplane $p = 0$, are chosen according to the $L_2$-norm of the DWT synthesis basis vectors of the subband. This orthonormalizes wavelet coefficients, which is a common practice in JPEG2000; hence, coding levels are weighted according to the coding gain of the wavelet subband. When nonorthonormal filter-banks are employed, the number of coding passes of codeblocks must be multiplied by the energy gain factor of the subband to which they belong. This can also be employed to weight colour-transformed components, or for the JPEG2000 lossless mode.
The application of the BWLT strategy raises the issue of how the decoder deals with bitstreams truncated at an arbitrary point rather than at the end of a coding pass. The simplest and most effective strategy is to stop decoding the bitstream when the MQ decoder starts synthesizing 0xFF bytes, which occurs when the bitstream terminates. Although this may cause the loss of the last coded coefficients, whether the bitstream is correctly truncated or not, experimental evidence suggests that these losses are negligible in practice.
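The stopping rule can be illustrated with a sketch of the byte-input stage that feeds an arithmetic decoder. The class `TruncatedSegmentReader` is a hypothetical simplification and not part of any real decoder: it returns the bytes of the truncated segment and, once they are exhausted, synthesizes 0xFF bytes and raises a flag that the caller can check to stop decoding the codeblock.

```python
class TruncatedSegmentReader:
    """Byte source for a (possibly truncated) codeblock bitstream segment."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0
        self.exhausted = False  # set once a synthesized byte has been returned

    def next_byte(self) -> int:
        """Return the next byte, or a synthesized 0xFF past the end of the data."""
        if self._pos < len(self._data):
            value = self._data[self._pos]
            self._pos += 1
            return value
        self.exhausted = True
        return 0xFF


reader = TruncatedSegmentReader(b"\x12\x34")
consumed = [reader.next_byte() for _ in range(3)]
```

A decoding loop built on this reader would check `exhausted` after each request for input and discard the symbols decoded from synthesized bytes.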
4.1. Transcoding and Decoding
It is worth noting that the selected layer allocation is the one that most benefits the performance of the common truncation in this context. Other layer allocations, such as layers uniformly distributed in terms of bitrate, penalize the coding performance of truncated codestreams more heavily. The common approach of layer truncation, for instance, requires 12 or more uniformly distributed quality layers to reach the performance that BWLT achieves when truncating codestreams containing a single quality layer. In general, the more inadequate the layer distribution, the more layers the common truncation requires to achieve competitive performance. BWLT achieves virtually the same performance regardless of the number of quality layers.
To assess the performance of BWLT when nonorthonormal filter-banks are employed, Figure 4(b) depicts the results obtained when transcoding the "Portrait" and "Cafeteria" images encoded using the JPEG2000 lossless mode. The JPEG2000 lossless mode employs the 5/3 Integer Wavelet Transform (IWT), and no quantization is performed on wavelet data; therefore, the coding pass order used by BWLT is multiplied by the energy gain factor of the subband. Results are similar to the ones achieved with the JPEG2000 lossy mode, suggesting that BWLT can be employed with nonorthonormal transforms.
Results of Figures 4(a) and 4(b) also hold when BWLT is employed to decode a portion of a codestream, rather than generating a new one. This may be useful, for example, in real-time video rendering applications with limited computational resources, in which the full decoding of the codestream might force subsampling.
4.2. Interactive Image Transmission
The second application in which BWLT is applied is interactive transmission of JPEG2000 imagery. In this case, BWLT is implemented in the Kakadu server and client. The implementation of BWLT maintains compliance with the JPIP protocol (ISO/IEC 15444-9), which is supported in Kakadu. JPIP specifies a rich syntax to interchange JPEG2000 imagery over the network, and it is employed in several environments like remote sensing or telemedicine. In the server, the implemented BWLT strategy aids the procedure that extracts the window-of-interest (WOI) requested by the client from the codestream. The WOI is delivered progressively so that it can be rendered at increasing quality at the client side.
4.3. Video Streaming
4.4. Comparison with State-of-the-Art Methods
4.5. Computational Cost
Evaluation of the computational costs of four methods of layer truncation. Results are reported in seconds.
Quality scalability is achieved in JPEG2000 through the use of quality layers. Even though the definition of quality layers is a sound mechanism of JPEG2000, their practical use must take into account that codestreams containing an inadequate layer allocation may greatly penalize the coding performance.
This paper introduces a Block-Wise Layer Truncation (BWLT) strategy based on a well-known rate-distortion model and on an accurate estimation of the bitstream lengths generated for coding passes. The proposed BWLT strategy only uses the auxiliary information included in packet headers, without requiring decoding of the image. Experimental results suggest that BWLT may significantly improve the quality of decoded images, especially when few quality layers are available. To the best of our knowledge, the proposed BWLT is the only strategy of layer truncation that, with negligible computational cost, provides enhanced quality scalability to JPEG2000 codestreams while achieving near-optimal performance. Applications that can benefit from the proposed strategy are interactive image and video transmission, video streaming, transcoding of codestreams, and real-time video rendering.
The authors thank the associate editor supervising the review process of this paper, and the anonymous referees for their remarks. The authors thank D. Taubman for his help with Kakadu, and for hosting the first author during his stay in Australia. This paper has been partially supported by the Spanish Government (MICINN), by FEDER, and by the Catalan Government, under Grants 2008-BPB-0010, TIN2009-14426-C02-01, TIN2009-05737-E/TIN, and 2009-SGR-1224.
- Taubman DS, Marcellin MW: JPEG2000 Image Compression Fundamentals, Standards and Practice. Kluwer Academic Publishers, Norwell, Mass, USA; 2002.
- Wu X, Dumitrescu S, Zhang N: On multirate optimality of JPEG2000 code stream. IEEE Transactions on Image Processing 2005, 14(12):2012-2023.
- Apostolopoulos JG: Secure media streaming & secure adaptation for non-scalable video. Proceedings of the IEEE International Conference on Image Processing (ICIP '04), October 2004 3: 1763-1766.
- Auli-Llinas F, Serra-Sagrista J: JPEG2000 quality scalability without quality layers. IEEE Transactions on Circuits and Systems for Video Technology 2008, 18(7):923-936.
- Digital cinema system specification, version 1.2 2008. http://www.dcimovies.com
- Auli-Llinas F, Taubman D: JPEG2000 block-wise truncation of quality layers. Proceedings of the IEEE/IET International Conference on Visual Information Engineering, August 2008 711-716.
- Taubman D: High performance scalable image compression with EBCOT. IEEE Transactions on Image Processing 2000, 9(7):1158-1170. doi:10.1109/83.847830
- Auli-Llinas F: Model-based JPEG2000 rate control methods, Ph.D. dissertation. Universitat Autònoma de Barcelona, Barcelona, Spain; December 2006.
- Kim T, Kim HM, Tsai P-S, Acharya T: Memory efficient progressive rate-distortion algorithm for JPEG 2000. IEEE Transactions on Circuits and Systems for Video Technology 2005, 15(1):181-187.
- Yeung YM, Au OC: Efficient rate control for JPEG2000 image coding. IEEE Transactions on Circuits and Systems for Video Technology 2005, 15(3):335-344.
- Yu W, Sun F, Fritts JE: Efficient rate control for JPEG-2000. IEEE Transactions on Circuits and Systems for Video Technology 2006, 16(5):577-589.
- Taubman D: Software architectures for JPEG2000. Proceedings of the IEEE International Conference on Digital Signal Processing, July 2002 1: 197-200.
- Du W, Sun J, Ni Q: Fast and efficient rate control approach for JPEG2000. IEEE Transactions on Consumer Electronics 2004, 50(4):1218-1221. doi:10.1109/TCE.2004.1362522
- Vikram KN, Vasudevan V, Srinivasan S: Rate-distortion estimation for fast JPEG2000 compression at low bit-rates. Electronics Letters 2005, 41(1):16-18. doi:10.1049/el:20057147
- Parisot C, Antonini M, Barlaud M: High performance coding using a model-based bit allocation with EBCOT. Proceedings of the EURASIP European Signal Processing Conference, September 2002 2: 510-513.
- Gaubatz MD, Hemami SS: Robust rate-control for wavelet-based image coding via conditional probability models. IEEE Transactions on Image Processing 2007, 16(3):649-663.
- Liu Z, Karam LJ, Watson AB: JPEG2000 encoding with perceptual distortion control. IEEE Transactions on Image Processing 2006, 15(7):1763-1778.
- Chang Y-W, Fang H-C, Cheng C-C, Chen C-C, Chen L-G: Precompression quality-control algorithm for JPEG 2000. IEEE Transactions on Image Processing 2006, 15(11):3279-3293.
- Kulkarni P, Bilgin A, Marcellin MW, et al.: Compression of earth science data with JPEG2000. In Hyperspectral Data Compression. Springer, New York, NY, USA; 2006:347-378.
- Kosheleva OM, Usevitch BE, Cabrera SD, Vidal E Jr.: Rate distortion optimal bit allocation methods for volumetric data using JPEG 2000. IEEE Transactions on Image Processing 2006, 15(8):2106-2112.
- Dagher JC, Bilgin AH, Marcellin MW: Resource-constrained rate control for motion JPEG2000. IEEE Transactions on Image Processing 2003, 12(12):1522-1529. doi:10.1109/TIP.2003.819228
- Wu Z, Zheng N: Efficient rate-control system with three stages for JPEG2000 image coding. IEEE Transactions on Circuits and Systems for Video Technology 2006, 16(9):1063-1073.
- Auli-Llinas F, Bartrina-Rapesta J, Serra-Sagrista J: Self-conducted allocation strategy of quality layers for JPEG2000. EURASIP Journal on Advances in Signal Processing 2008, 2008:-7.
- Taubman DS: Localized distortion estimation from already compressed JPEG2000 images. Proceedings of the IEEE International Conference on Image Processing, October 2006 3089-3092.
- Auli-Llinas F, Marcellin MW: Distortion estimators for bitplane image coding. IEEE Transactions on Image Processing 2009, 18(8):1772-1781.
- Auli-Llinas F, Serra-Sagrista J, Monteagudo-Pereira JL, Bartrina-Rapesta J: Efficient rate control for JPEG2000 coder and decoder. Proceedings of the IEEE Data Compression Conference, March 2006 282-291.
- Auli-Llinas F, Serra-Sagrista J: Low complexity JPEG2000 rate control through reverse subband scanning order and coding passes concatenation. IEEE Signal Processing Letters 2007, 14(4):251-254.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.