Self-Conducted Allocation Strategy of Quality Layers for JPEG2000

The rate-distortion optimality of a JPEG2000 codestream is determined by the density and distribution of the quality layers it contains. The allocation of quality layers is, therefore, a fundamental issue for JPEG2000 encoders, which commonly distribute layers logarithmically or uniformly spaced in terms of bitrate, and use a rate-distortion optimization method to optimally form them. This work introduces an allocation strategy based on the hypothesis that the fractional bitplane coder of JPEG2000 already generates optimal truncation points for the overall optimization of the image. Through these overall optimal truncation points, the proposed strategy is able to allocate quality layers without employing rate-distortion optimization techniques, to self-determine the density and distribution of quality layers, and to reduce the computational load of the encoder. Experimental results suggest that the proposed method constructs near-optimal codestreams in the rate-distortion sense, achieving a similar coding performance as compared with the common PCRD-based approach.


INTRODUCTION
JPEG2000 is a powerful standard structured in 12 parts that addresses the coding, transmission, security, and manipulation of still images and video. To maximize interoperability among vendors, JPEG2000 Part 1 [1] defines the core coding system as the specification of the codestream syntax that any decoder must support to produce the output signal. As long as the codestream syntax is respected, encoders are free to implement their own coding strategies, commonly devised to fulfill the specific requirements of applications.
The core coding system of JPEG2000 is wavelet based, with a two-tiered coding strategy built on embedded block coding with optimized truncation (EBCOT) [2]. After the wavelet transform and quantization, the image is divided into small blocks of wavelet coefficients (called codeblocks) that are independently encoded by the Tier-1 stage, generating one quality-embedded bitstream for each one. The final codestream is then formed through rate-distortion optimization techniques that optimally truncate these bitstreams, and through the Tier-2 stage, which encodes the auxiliary information needed to properly decode the image. In this coding process, rate-distortion optimization is necessary for two main reasons [3]: (1) to attain a target bitrate for the final codestream while minimizing the overall image distortion; (2) to form increasing layers of quality that avoid penalizing the quality of the decoded image when the codestream is truncated, or when the image is interactively transmitted.
The first rate-distortion optimization method proposed for JPEG2000 was the Post-Compression Rate-Distortion (PCRD) optimization, introduced in EBCOT. Although PCRD achieves optimal results in terms of rate distortion, as originally formulated it is inefficient because it compels Tier-1 to fully encode all codeblocks even when only a small portion of the generated bitstreams is included in the final codestream. Tier-1 is the most computationally intensive stage of the JPEG2000 encoder [4], hence several rate-distortion optimization methods have been proposed that focus on reducing the computational load of Tier-1. In spite of the efficiency achieved by some of these methods, most of them still need to collect rate-distortion statistics during the encoding process. In some applications, this requires developing specific strategies as, for example, in the coding of hyper-spectral images [5], in motion JPEG2000 encoders [6], or in hardware-based implementations [7,8]; however, some specific strategies may complicate the architecture of the encoder in terms of memory and speed. On the other hand, the allocation of quality layers is commonly conducted using a uniform or a logarithmic function [9] that determines adequate bitrates for the layers. Although the determination of these bitrates takes negligible computational resources, a rate-distortion optimization process is still necessary to correctly select the bitstream segments included in each layer. The accurate allocation of quality layers is fundamental, since they must provide optimal rate-distortion representations of the image to properly supply quality scalability and quality progression [3]. However, the attainment of a target bitrate, or the distortion minimization, for the final codestream may allow some flexibility. This is the case, for example, of digital cameras or devices that do not require accurate rate or quality control, commonly letting the user choose among a few degrees of freedom.
The purpose of this research is to introduce a simple yet accurate allocation strategy of quality layers that avoids rate-distortion optimization while supplying rough rate control for the final codestream when distortion is minimized, or precise rate control at the expense of a slight loss in coding performance. The introduced strategy also reduces the computational load of Tier-1, achieving competitive results compared to the state-of-the-art methods, and simplifies the architecture of the JPEG2000 encoder since it does not require the collection of rate-distortion statistics during the encoding process. The key idea of the proposed strategy is to allocate quality layers through overall optimal truncation points that, as will be seen, are already produced by the fractional bitplane coder of JPEG2000.
This paper is structured as follows: Section 2 briefly overviews the JPEG2000 core coding system, and reviews the state of the art of rate-distortion optimization and allocation strategies; Section 3 introduces the proposed method; and Section 4 assesses the performance of the introduced strategy through extensive experimental results. Section 5 concludes this work with some final remarks.

JPEG2000 core coding system
The core coding system of JPEG2000 is constituted by four main stages (see Figure 1): sample data transformations, sample data coding, codestream reorganization, and rate-distortion optimization. The first stage, sample data transformations, compacts the energy of the image through the wavelet transform, and sets the range of the sample values. Then, the image is logically partitioned into codeblocks that are independently coded by the sample data coding stage, also called Tier-1.
The purpose of Tier-1 is to produce a bitstream containing first the data that has the greatest distortion reductions. This is achieved through a fractional bitplane coder and the arithmetic coder MQ, encoding each coefficient of codeblock B_i from the highest bitplane P = K_i − 1 to the lowest bitplane P = 0, K_i denoting the minimum number of bitplanes needed to represent all coefficients of B_i. In each bitplane, Tier-1 scans each coefficient in one of its three sub-bitplane coding passes, which are called Significance Propagation Pass (SPP), Magnitude Refinement Pass (MRP), and Cleanup Pass (CP). The purpose of the SPP and CP coding passes is to encode whether insignificant coefficients become significant in the current bitplane. The main difference between SPP and CP is that the former scans those coefficients that are more likely to become significant. The MRP coding pass refines the magnitude of those coefficients that have become significant in previous bitplanes. A valuable advantage of this sub-bitplane coding is that it produces an embedded bitstream with a large collection of potential truncation points (one at the end of each coding pass) that can be used by the rate-distortion optimization techniques.
The last stage of the coding pipeline is the codestream reorganization, which encodes the auxiliary data needed to properly identify the content of quality layers through the Tier-2, and organizes the final codestream in containers that encapsulate and sort the bitstream segments using one or several progression orders.

Rate-distortion optimization methods and allocation strategies
The first three stages of the JPEG2000 core coding system are considered the coding pipeline, whereas rate-distortion optimization may entail different techniques in different operations of the coding system. The main purpose of this stage is to optimally truncate and select those bitstream segments included in each layer (and, by extension, in the final codestream) while attaining the target bitrates determined by the allocation strategy.
The PCRD method achieves this purpose by means of a generalized Lagrange multiplier for a discrete set of points [10]. In brief, PCRD first identifies the convex hull for each codeblock bitstream, and then selects, among all codeblocks, those segments with the highest distortion-length slopes.
As stated in the previous section, PCRD requires all codeblocks to be fully encoded even when few coding passes are included in the final codestream. The methods proposed in the literature to address this shortcoming can be roughly classified into four classes: (1) carrying out the sample data coding and rate-distortion optimization simultaneously [11][12][13][14]; (2) collecting statistics from the already encoded codeblocks to decide which coding passes need to be encoded in the remaining codeblocks [4][15][16][17]; (3) estimating the rate-distortion contributions of codeblocks before the encoding process [18][19][20]; (4) determining suitable step sizes for the wavelet subbands [21,22]. Other approaches based on variations of the Lagrange multiplier have been proposed in [23][24][25], and the complementary problem of optimizing the bitrate for a target quality is addressed in [26,27], also reducing the computational load of Tier-1. On the other hand, rate-distortion optimization applied to enhance the quality scalability of already encoded codestreams is addressed in [28]. An extensive review and comparison of these methods can be found in [29].
Most of the proposed methods of rate-distortion optimization can also be employed to allocate successive layers of quality at increasing bitrates. If the bitrates at which the codestream is going to be decoded were known at encoding time, the codestream could be optimally constructed. However, this is not usually the case, and allocation strategies must construct codestreams that work reasonably well for most applications and scenarios. The most common strategy of quality layers allocation is to distribute layers in terms of bitrate through a uniform or a logarithmic function.
Once the target bitrates are determined, the rate-distortion optimization method can straightforwardly truncate and allocate the optimal bitstream segments to each layer. With this strategy, however, the number and distribution of quality layers can only be determined by experience [3, Chapter 8.4.1].
More recently, the rate-distortion optimality of the JPEG2000 codestream has been evaluated under an expected multirate distortion measure that weights the distortion of the image recovered at some bitrates by the probability of recovering the image at those bitrates [9]. Under this measure, and considering different distribution functions, an algorithm able to optimally construct codestreams is proposed. Although that research is the first to propose an optimal allocation for the JPEG2000 codestream, the reported experimental results suggest that the improvement achieved by the proposed method is usually small when compared to the common approach.
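The two steps described above (convex hull per codeblock, then selection by slope across codeblocks) can be illustrated with a minimal sketch. It is not the formulation of [2] or [10]: the function names, the representation of a codeblock as lists of cumulative lengths and remaining distortions, and the greedy selection by decreasing slope (used here in place of an explicit Lagrange multiplier search) are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def convex_hull(lengths: Sequence[float], distortions: Sequence[float]) -> List[int]:
    """Indices of the truncation points of one codeblock that lie on the convex hull.

    Index 0 is the empty bitstream. Lengths are cumulative and assumed strictly
    increasing; distortions[n] is the distortion left after truncating at point n.
    """
    def slope(a: int, b: int) -> float:
        return (distortions[a] - distortions[b]) / (lengths[b] - lengths[a])

    hull = [0]
    for n in range(1, len(lengths)):
        if distortions[n] >= distortions[hull[-1]]:
            continue  # no distortion reduction, never a useful truncation point
        # Remove earlier points that would break the strictly decreasing slopes.
        while len(hull) >= 2 and slope(hull[-2], hull[-1]) <= slope(hull[-1], n):
            hull.pop()
        hull.append(n)
    return hull


def pcrd_truncate(blocks: Sequence[Tuple[Sequence[float], Sequence[float]]],
                  budget: float) -> List[int]:
    """Truncation index per codeblock, taking hull segments by decreasing slope."""
    segments = []
    for b, (lengths, distortions) in enumerate(blocks):
        hull = convex_hull(lengths, distortions)
        for prev, cur in zip(hull, hull[1:]):
            seg_len = lengths[cur] - lengths[prev]
            seg_slope = (distortions[prev] - distortions[cur]) / seg_len
            segments.append((seg_slope, b, cur, seg_len))
    # Slopes decrease within a block, so this order never skips a block's segment.
    segments.sort(key=lambda s: -s[0])
    truncation = [0] * len(blocks)
    used = 0.0
    for _, b, cur, seg_len in segments:
        if used + seg_len > budget:
            break  # stop at the first segment that does not fit the budget
        used += seg_len
        truncation[b] = cur
    return truncation


# Two hypothetical codeblocks: (cumulative lengths, remaining distortions).
block_a = ([0, 10, 20, 30], [100, 40, 30, 5])
block_b = ([0, 5, 15], [50, 30, 20])
print(pcrd_truncate([block_a, block_b], budget=40))
```

Note that every codeblock must supply its complete list of truncation points before the selection can start, which is the reason PCRD requires the full Tier-1 encoding.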
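As an illustration of the two common distribution functions, the following sketch computes target bitrates for a given number of layers. The function names and the choice of a geometric progression for the logarithmic case are assumptions; actual encoders may parameterize these distributions differently.

```python
import math
from typing import List


def uniform_layer_rates(r_min: float, r_max: float, n_layers: int) -> List[float]:
    """Target bitrates equally spaced between r_min and r_max (inclusive)."""
    if n_layers == 1:
        return [r_max]
    step = (r_max - r_min) / (n_layers - 1)
    return [r_min + i * step for i in range(n_layers)]


def logarithmic_layer_rates(r_min: float, r_max: float, n_layers: int) -> List[float]:
    """Target bitrates equally spaced in the logarithmic domain (constant ratio)."""
    if n_layers == 1:
        return [r_max]
    ratio = (r_max / r_min) ** (1.0 / (n_layers - 1))
    return [r_min * ratio ** i for i in range(n_layers)]


# Hypothetical example: 5 layers between 0.25 and 4 bps.
print(uniform_layer_rates(0.25, 4.0, 5))
print(logarithmic_layer_rates(0.25, 4.0, 5))
```

Both functions only fix the bitrates; the bitstream segments assigned to each layer must still be selected by a rate-distortion optimization method, and the number of layers remains a free parameter.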

Main insight
To explain the low degree of improvement achieved by the method proposed in [9], the authors state in a concluding remark that the fractional bitplane coder of JPEG2000 is already a near-optimal scalable bitplane coder, able to generate an almost convex operational rate-distortion curve. The principal consequence of this well-known efficiency is that most truncation points of the bitstream generated for one codeblock have strictly decreasing distortion-length slopes or, otherwise stated, that most coding passes can be considered by the Lagrange multiplier. This is also claimed by other authors [13], and is supported experimentally in [30]. However, to the best of our knowledge, there is no work addressing the optimality of the JPEG2000 fractional bitplane coder beyond the convex hull of individual codeblocks; addressing this question is the main insight of this research. If the bitplane coder were also optimal in terms of the overall image optimization, rate-distortion optimization could be avoided, and thus the architecture of JPEG2000 encoders might be simplified.
To study the bitplane coder from the point of view of the overall image optimization, instead of studying it independently for codeblocks, we use a coding strategy that completely avoids rate-distortion optimization by implicitly assuming that the bitplane coder is optimal in the overall optimization sense. Comparing this coding strategy against the optimal PCRD method helps to disclose the degree of optimality of the JPEG2000 coder: the closer the results achieved by both coders are, the closer to optimal the JPEG2000 bitplane coder implicitly is.
The coding strategy avoiding rate-distortion optimization is based on the Coding Passes Interleaving (CPI) method introduced in [29,31]. CPI defines a coding level c as the coding pass of all codeblocks of the image at the same height, given by c = (P • 3) + t, where P stands for the bitplane number, and t stands for the coding pass type, with t = 2 for SPP, t = 1 for MRP, and t = 0 for CP. Coding passes are encoded from the highest coding level of the image to the lowest one until the target bitrate is achieved. In each coding level, coding passes are selected from the lowest resolution level to the highest one, and in each resolution level, subbands are scanned in order [HL, LH, HH]. CPI was originally conceived to provide quality scalability to already encoded codestreams, and to aid in transcoding procedures or in interactive image transmissions. More recently, it has been further improved in [28]. Here we apply CPI's coding strategy in the encoder, since encoding the image consecutively through levels of coding passes amounts to assuming that the bitplane coder is optimal in the overall rate-distortion sense.
To evaluate the bitplane coder in terms of rate-distortion optimality, we compare the coding performance achieved by CPI and PCRD when encoding at the same target bitrates. In this evaluation, both CPI and PCRD construct a codestream containing a single quality layer for each target bitrate. This avoids penalizing the coding performance when more than one quality layer is formed, and gives us the optimal coding performance that can be achieved by both strategies. All images of the ISO 12640-1 corpus have been encoded using both methods at 2000 target bitrates equally distributed in terms of bitrate from 0.001 to 5 bps. Figure 2 depicts the PSNR difference (in dB) achieved between both methods when encoding the cafeteria image. Although PCRD achieves better results than CPI at almost all bitrates, it is worth noting that, at some bitrates, the coding performance achieved by CPI and PCRD is exactly the same.
We have carried out an in-depth evaluation of CPI's coding strategy, focusing our attention on the points where both methods achieve the same results. This evaluation has disclosed that CPI always achieves optimal results at the same stages of the encoding process: when finishing the scanning of a coding level containing coding passes of type SPP, and when finishing the scanning of a coding level containing coding passes of type CP. This is depicted in Figure 2 through the labels on the top. The same results hold for all images of the corpus. Although this experimental evidence suggests that the JPEG2000 bitplane coder is generally not optimal for the overall image optimization, it discloses that the coder is able to produce several overall optimal truncation points. The main advantage of these points is that they can be determined a priori at no computational cost, so the collection of rate-distortion statistics can be completely avoided. In addition, since these overall optimal truncation points are as accurate as those obtained with the optimal PCRD method, they can be directly employed by the JPEG2000 encoder for rate-distortion optimization purposes, for example, to allocate quality layers, or to supply rough rate control.
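The definition of the coding level and the scanning order of CPI can be sketched as follows. The function names are hypothetical. Two points are assumptions not stated in the text: the highest coding level is taken to be the CP of the highest bitplane (which only contains a cleanup pass), and the LL subband is taken to be the only subband of the lowest resolution level.

```python
from typing import List, Tuple

# Coding pass type t, as defined in the text: 2 for SPP, 1 for MRP, 0 for CP.
PASS_TYPE = {"SPP": 2, "MRP": 1, "CP": 0}
PASS_NAME = {t: name for name, t in PASS_TYPE.items()}


def coding_level(bitplane: int, pass_name: str) -> int:
    """Coding level c = (P * 3) + t."""
    return 3 * bitplane + PASS_TYPE[pass_name]


def cpi_levels(num_bitplanes: int) -> List[Tuple[int, int, str]]:
    """(c, bitplane, pass type) from the highest coding level to the lowest."""
    top = coding_level(num_bitplanes - 1, "CP")  # highest bitplane has only a CP
    return [(c, c // 3, PASS_NAME[c % 3]) for c in range(top, -1, -1)]


def subband_order(dwt_levels: int) -> List[Tuple[int, str]]:
    """(resolution level, subband) from the lowest resolution to the highest."""
    order = [(0, "LL")]  # assumption: resolution level 0 holds only the LL subband
    for r in range(1, dwt_levels + 1):
        order += [(r, sb) for sb in ("HL", "LH", "HH")]
    return order


# Hypothetical image with 3 bitplanes and 2 DWT levels.
for c, bitplane, name in cpi_levels(3):
    print(c, bitplane, name, subband_order(2))
```

An encoder following this order would, for each printed coding level, visit the codeblocks of the listed subbands and encode the corresponding coding pass, stopping once the target bitrate is reached.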

Allocation strategy
The key idea of the proposed strategy is to allocate quality layers at the overall optimal truncation points generated by the bitplane coder. Formally stated, the proposed strategy allocates to one quality layer all coding passes belonging to one coding level of type SPP, and to another quality layer all coding passes belonging to the two consecutive coding levels of type MRP and CP. Notice that each bitplane yields two quality layers, except for the highest one, which only contains a CP coding pass. The assignment of coding levels to quality layers is rather simple. Let P_c denote all coding passes of all codeblocks of the image belonging to coding level c, and let T_l denote the quality layers, with l ∈ [0, L), L denoting the total number of quality layers, which can be computed through L = K • 2 − 1, K being the number of bitplanes needed to represent all image coefficients. Coding passes P_c are included in quality layer T_l according to l = (K − 1 − P) • 2 − 1 if c mod 3 = 2 (SPP), and l = (K − 1 − P) • 2 otherwise (MRP or CP), with P standing for the bitplane number of coding level c. We name the proposed method Self-Conducted Allocation strategy of quality LayErs (SCALE), since the JPEG2000 fractional bitplane coder implicitly determines the number and the rate distribution of quality layers, and thus conducts their allocation.
Some remarks about SCALE are worth stressing: first, even though SCALE does not use rate-distortion optimization techniques, it allocates layers as accurately as the PCRD method; second, the distribution of coding passes to quality layers can be carried out during Tier-1 coding, so encoders neither need to maintain codeblock data in memory, nor need any type of postprocessing after codeblock encoding, which may reduce the memory requirements of the block coder engine by more than 30% [3, Chapter 17.2.4]; and third, the number and distribution of quality layers is self-determined, achieving an adequate distribution for most applications. In addition, SCALE reduces the computational load of Tier-1 by driving the encoding process through the incremental encoding of coding levels until the target bitrate is reached. As a result, only those coding passes included in the final codestream are encoded, which reduces the Tier-1 computational load and achieves competitive results when compared to the state-of-the-art rate-distortion optimization methods. On the other hand, when a target bitrate R_max has to be attained for the final codestream and no loss in coding performance is desired, this encoding strategy cannot attain the rate strictly, since it can only truncate the codestream at the overall optimal truncation points. When strict rate control is necessary, SCALE can truncate the codestream at the target bitrate at the expense of a slight penalization in coding performance.
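The assignment of coding levels to quality layers can be sketched in a few lines. The mapping below is derived from the description above (one layer per SPP coding level, one layer per pair of MRP and CP coding levels, and a single CP layer for the highest bitplane, giving L = K • 2 − 1); the function names are hypothetical and the closed form should be read as a reconstruction rather than as the authors' exact expression.

```python
from typing import Dict, List


def num_layers(num_bitplanes: int) -> int:
    """Total number of quality layers, L = K * 2 - 1."""
    return 2 * num_bitplanes - 1


def scale_layer(c: int, num_bitplanes: int) -> int:
    """Quality layer index l in [0, L) for coding level c = 3 * P + t."""
    bitplane, pass_type = divmod(c, 3)
    if c > 3 * (num_bitplanes - 1):
        raise ValueError("the highest bitplane only contains a CP coding pass")
    base = 2 * (num_bitplanes - 1 - bitplane)
    # SPP (t = 2) closes its own layer; MRP (t = 1) and CP (t = 0) share the next.
    return base - 1 if pass_type == 2 else base


def layers_by_level(num_bitplanes: int) -> Dict[int, List[int]]:
    """Map each quality layer to the coding levels it contains."""
    layers: Dict[int, List[int]] = {}
    for c in range(3 * (num_bitplanes - 1), -1, -1):
        layers.setdefault(scale_layer(c, num_bitplanes), []).append(c)
    return layers


# Hypothetical image needing K = 3 bitplanes: 5 layers over 7 coding levels.
print(layers_by_level(3))
```

Because the mapping depends only on c and K, it can be evaluated before any coefficient is coded, which is consistent with the claim that the truncation points are known a priori.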
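The two rate control modes can be illustrated with a small sketch. It assumes, hypothetically, that the number of bytes produced by each coding level is available as a dictionary indexed by c, and it ignores Tier-2 overhead; the function names are not taken from the paper.

```python
from typing import Dict


def rough_rate_control(level_bytes: Dict[int, int], num_bitplanes: int,
                       target: int) -> int:
    """Largest length not exceeding the target that ends on a layer boundary.

    Layer boundaries are the ends of coding levels of type SPP (c mod 3 = 2)
    and CP (c mod 3 = 0), so coding performance is not penalized.
    """
    total = 0
    best = 0
    for c in range(3 * (num_bitplanes - 1), -1, -1):
        total += level_bytes[c]
        if total > target:
            break  # lower coding levels do not need to be encoded
        if c % 3 != 1:  # an MRP level alone does not close a quality layer
            best = total
    return best


def strict_rate_control(level_bytes: Dict[int, int], target: int) -> int:
    """Truncate exactly at the target, possibly in the middle of a coding level."""
    return min(target, sum(level_bytes.values()))


# Hypothetical byte counts for K = 2 bitplanes (coding levels 3, 2, 1, 0).
sizes = {3: 100, 2: 50, 1: 30, 0: 70}
print(rough_rate_control(sizes, 2, target=190))
print(strict_rate_control(sizes, target=190))
```

In the rough mode the final length falls short of the target by at most the size of one quality layer, whereas the strict mode spends the whole budget but may end at a point that is not overall optimal.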

EXPERIMENTAL RESULTS
To assess the performance of SCALE, we first evaluate the rate-distortion optimality of codestreams constructed with SCALE, comparing them to the best results achieved when codestreams are constructed through two common strategies that allocate quality layers using either a logarithmic or a uniform function, and apply PCRD afterward to select the bitstream segments included in each layer. Coding options for all experiments are lossy mode of JPEG2000, derived quantization, 5 DWT levels, no precincts, restart coding variation, progression order LRCP, and codeblock size of 64 × 64. The construction of codestreams through these common allocation strategies may use any rate-distortion optimization method other than PCRD. However, the intention of this test is to evaluate the rate-distortion optimality of codestreams constructed by SCALE against the most accurate method existing in the literature, hence the use of PCRD. Tests have been carried out for the eight natural images of the ISO 12640-1 corpus. Each image has been encoded using SCALE, which has self-determined the number and rate distribution of quality layers; for the logarithmic and uniform distributions, codestreams containing 10, 20, 40, 80, and 120 quality layers have been constructed. In order to enhance the optimality of codestreams constructed through the uniform distribution, finer quality layers, in terms of bitrate, have been distributed from 0.001 to 0.5 bps, and coarser quality layers from 0.5 bps onwards. Codestreams have been truncated and decoded at 600 equally distributed bitrates, computing the PSNR difference against the optimal performance that can be achieved by JPEG2000 at that particular bitrate when PCRD is used to construct single quality layer codestreams. This optimal performance, which is depicted as the straight line in the figures, is valid only from a theoretical point of view, but gives us a reference to compare allocation methods. Figure 3 depicts the luminance results obtained for the portrait image, and Figure 4 depicts the results obtained for the average PSNR of the RGB components for the candle image. In order to ease the visual interpretation, figures only depict the best results achieved by the two rate distribution functions. To assess the performance achieved for all images of the corpus, Table 1 reports the average coding performance of all images, in four bitrate ranges. The rate-distortion optimality of JPEG2000 codestreams was first analyzed in our previous study [32] presented in KES 2007; however, that preliminary work integrated neither the rate control nor the computational load reduction for the JPEG2000 encoder.
Results suggest that SCALE self-determines the density and distribution of quality layers adequately, achieving competitive results when compared to the best allocation strategies. Compared to a logarithmic rate distribution, SCALE allocates quality layers similarly at low bitrates, and achieves better results at medium and high bitrates. Compared to a uniform rate distribution, SCALE is, on average, only 0.05 dB worse. When other state-of-the-art methods of rate-distortion optimization are applied instead of PCRD to form logarithmically or uniformly spaced layers, results do not vary significantly. On the other hand, the fact that SCALE distributes fewer quality layers than the best results obtained for the logarithmic and uniform rate distributions (for most images SCALE includes 23 quality layers) suggests that the LRCP progression is also an adequate progression order for the intrafragmentation of layers, particularly at low bitrates.
To assess the Tier-1 computational load reduction achieved by SCALE, we have encoded all images of the corpus to the target bitrates reported in Table 2, computing the time spent by SCALE and the PCRD method when encoding at those bitrates. Results are reported as the speed-up achieved by SCALE in comparison to PCRD, on average for all images of the corpus. Compared to the results reported in the literature, there are only two rate-distortion optimization methods [13,20] able to achieve speed-ups similar to the reported ones, suggesting that SCALE is highly competitive in terms of the computational load reduction of the Tier-1 stage.
Table 3 reports the rate control accuracy, and the penalization in the coding performance, achieved by SCALE and state-of-the-art methods. Since SCALE can be applied either maximizing the coding performance (at the expense of rate precision), or attaining the precise target bitrate (at the expense of a slight loss in coding performance), the first and the second columns of this table, respectively, report these two cases. Compared to the two methods with similar speed-ups [13,20], SCALE achieves competitive rate control and coding performance. Compared to the methods with lower speed-ups, SCALE achieves regular coding performance when the target bitrate is perfectly attained, and rough rate control when distortion is minimized. Among all analyzed methods, SCALE is the only one that self-determines the number and allocation of quality layers.
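The measure used in this comparison, the PSNR difference against the single-layer PCRD reference at each decoding bitrate, can be sketched as follows. The function names are hypothetical, and the peak value of 255 assumes 8-bit samples.

```python
import math


def psnr(mse: float, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(peak * peak / mse)


def psnr_gap(mse_reference: float, mse_tested: float, peak: float = 255.0) -> float:
    """PSNR of the reference minus PSNR of the tested codestream, in dB.

    A positive value means the tested codestream is worse than the reference.
    """
    return psnr(mse_reference, peak) - psnr(mse_tested, peak)


# Hypothetical MSE values measured after decoding both codestreams at one bitrate.
print(round(psnr_gap(mse_reference=10.0, mse_tested=10.2), 3))
```

Evaluating this gap at each of the 600 decoding bitrates yields curves of the kind plotted against the reference line in the figures.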

CONCLUSIONS
The allocation of quality layers is a fundamental issue of JPEG2000 encoders, needed to construct adequate codestreams in the rate-distortion sense. Quality layers allocation is commonly addressed by means of a logarithmic or a uniform function that determines adequate bitrates for layers, afterwards applying a rate-distortion optimization method to optimally select the bitstream segments included in each layer.
This work proposes a Self-Conducted Allocation strategy of quality LayErs (SCALE) that, without employing rate-distortion optimization techniques, is able to allocate quality layers with a precision comparable to the optimal one. Since SCALE neither needs to collect statistics during the encoding process, nor allocates layers employing a postprocessing stage, it can be used by JPEG2000 encoders to simplify the coding architecture, to reduce their complexity in terms of speed and memory, and to minimize the computational load of the Tier-1 coding stage. Compared to the state-of-the-art methods of rate-distortion optimization and quality layers allocation, experimental results suggest that SCALE provides the simplest allocation strategy for encoders without sacrificing performance significantly.

Figure 1: Stages and operations of the JPEG2000 core coding system.

Figure 2: Coding performance evaluation between PCRD and CPI for the cafeteria image. The straight red line depicts the optimal coding performance achieved by PCRD; the CPI line depicts the difference between PCRD and CPI at 2000 equally distributed target bitrates from 0.001 to 5.1 bps.

Figure 3: Coding performance evaluation between SCALE and two common allocation strategies distributing quality layers logarithmically and uniformly spaced in terms of bitrate, for the portrait image (gray scaled, 8 bps).

Table 1: Average coding performance achieved with SCALE and the two common strategies of quality layers allocation. Average results, in different bitrate ranges, for all images of the corpus ISO/IEC 12640-1.

Table 2: Evaluation of the Tier-1 computational load reduction achieved by SCALE (corpus ISO 12640-1) and state-of-the-art methods (as claimed in the literature). Results are reported as the speed-up achieved by the evaluated method when compared to PCRD.