Open Access

Adaptive lifting scheme with sparse criteria for image coding

  • Mounir Kaaniche1,
  • Béatrice Pesquet-Popescu1,
  • Amel Benazza-Benyahia2 and
  • Jean-Christophe Pesquet3
EURASIP Journal on Advances in Signal Processing 2012, 2012:10

https://doi.org/10.1186/1687-6180-2012-10

Received: 30 June 2011

Accepted: 13 January 2012

Published: 13 January 2012

Abstract

Lifting schemes (LS) were found to be efficient tools for image coding purposes. Since LS-based decompositions depend on the choice of the prediction/update operators, many research efforts have been devoted to the design of adaptive structures. The most commonly used approaches optimize the prediction filters by minimizing the variance of the detail coefficients. In this article, we investigate techniques for optimizing sparsity criteria by focusing on the use of an $\ell_1$ criterion instead of an $\ell_2$ one. Since the output of a prediction filter may be used as an input for the other prediction filters, we then propose to optimize such a filter by minimizing a weighted $\ell_1$ criterion related to the global rate-distortion performance. More specifically, it will be shown that the optimization of the diagonal prediction filter depends on the optimization of the other prediction filters and vice-versa. Related to this fact, we propose to jointly optimize the prediction filters by using an algorithm that alternates between the optimization of the filters and the computation of the weights. Experimental results show the benefits which can be drawn from the proposed optimization of the lifting operators.

1 Introduction

The discrete wavelet transform has been recognized to be an efficient tool in many image processing fields, including denoising [1] and compression [2]. Such a success of wavelets is due to their intrinsic features: multiresolution representation, good energy compaction, and decorrelation properties [3, 4]. In this respect, the second generation of wavelets provides very efficient transforms, based on the concept of lifting scheme (LS) developed by Sweldens [5]. It was shown that interesting properties are offered by such structures. In particular, LS guarantee a lossy-to-lossless reconstruction required in some specific applications such as remote sensing imaging for which any distortion in the decoded image may lead to an erroneous interpretation of the image [6]. Besides, they are suitable tools for scalable reconstruction, which is a key issue for telebrowsing applications [7, 8].

Generally, LS are developed for the 1D case and then they are extended in a separable way to the 2D case by cascading vertical and horizontal 1D filtering operators. It is worth noting that a separable LS is not always efficient at coping with the two-dimensional characteristics of edges which are neither horizontal nor vertical [9]. In this respect, several research studies have been devoted to the design of non separable lifting schemes (NSLS) in order to better capture the actual two-dimensional contents of the image. Indeed, instead of using samples from the same rows (resp. columns) while processing the image along the lines (resp. columns), 2D NSLS provide smarter choices in the selection of the samples by using horizontal, vertical and oblique directions at the prediction step [9]. For example, quincunx lifting schemes were found to be suitable for coding satellite images acquired on a quincunx sampling grid [10, 11]. In [12], a 2D wavelet decomposition comprising an adaptive update lifting step and three consecutive fixed prediction lifting steps was proposed. Another structure, which is composed of three prediction lifting steps followed by an update lifting step, has also been considered in the nonadaptive case [13, 14].

In parallel with these studies, other efforts have been devoted to the design of adaptive lifting schemes. Indeed, in a coding framework, the compactness of a LS-based multiresolution representation depends on the choice of its prediction and update operators. To the best of our knowledge, most existing studies have mainly focused on the optimization of the prediction stage. In general, the goal of these studies is to introduce spatial adaptivity by varying the direction of the prediction step [15-17], the length of the prediction filters [18, 19] and the coefficient values of the corresponding filters [9, 11, 15, 20, 21]. For instance, Gerek and Çetin [16] proposed a 2D edge-adaptive lifting scheme by considering three direction angles of prediction (0°, 45°, and 135°) and by selecting the orientation which leads to the smallest gradient. Recently, Ding et al. [17] have built an adaptive directional lifting structure with perfect reconstruction: the prediction is performed in local windows in the direction of high pixel correlation. A good directional resolution is achieved by employing fractional pixel precision. A similar approach was also adopted in [22]. In [18], three separable prediction filters with different numbers of vanishing moments are employed, and then the best prediction is chosen according to the local features. In [19], a set of linear predictors of different lengths are defined based on a nonlinear function related to an edge detector. An alternative strategy to achieve adaptivity aims at designing lifting filters by optimizing a given criterion. In this context, the prediction filters are often optimized by minimizing the detail signal variance through mean square criteria [15, 20]. In [9], the prediction filter coefficients are optimized with a least mean squares (LMS) type algorithm based on the prediction error.
In addition to these adaptation techniques, the minimization of the detail signal entropy has also been investigated in [11, 21]. In [11], the approach is limited to a quincunx structure and the optimization is performed in an empirical manner using the Nelder-Mead simplex algorithm due to the fact that the entropy is an implicit function of the prediction filter. However, such heuristic algorithms present the drawback that they may converge to a local minimum of the entropy. In [21], a generalized prediction step, viewed as a mapping function, is optimized by minimizing the detail signal energy given the pixel value probability conditioned on its neighbor pixel values. The authors show that the resulting mapping function also minimizes the output entropy. By assuming that the signal probability density function (pdf) is known, the benefit of this method was first demonstrated for lossless image coding in [21]. Then, an extension of this study to sparse image representation and lossy coding contexts has been presented in [23]. Consequently, an estimation of the pdf must be available at both the coder and the decoder side. Note that the main drawback of this method, as well as of those based on directional wavelet transforms [15, 17, 22, 24, 25], is that they require side information to be transmitted losslessly to the decoder, which may affect the overall compression performance, especially at low bitrates. Furthermore, such adaptive methods lead to an increase of the computational load required for the selection of the best direction of prediction.

It is worth pointing out that, in practical implementations of compression systems, the sparsity of a signal, where a portion of the signal samples are set to zero, has a great impact on the ultimate rate-distortion performance. For example, embedded wavelet-based image coders can spend the major part of their bit budget to encode the significance map needed to locate non-zero coefficients within the wavelet domain. To this end, sparsity-promoting techniques have already been investigated in the literature. Indeed, geometric wavelet transforms such as curvelets [26] and contourlets [27] have been proposed to provide sparse representations of the images. One difficulty of such transforms is their redundancy: they usually produce a number of coefficients that is larger than the number of pixels in the original image. This can be a major obstacle to achieving efficient coding schemes. To control this redundancy, a mixed contourlet and wavelet transform was proposed in [28] where a contourlet transform was used at fine scales and the wavelet transform was employed at coarse scales. Later, bandelet transforms that aim at developing sparse geometric representations of the images have been introduced and studied in the context of image coding and image denoising [29]. Unlike contourlets and curvelets which are fixed transforms, bandelet transforms require an edge detection stage, followed by an adaptive decomposition. Furthermore, the directional selectivity of the 2D complex dual-tree discrete wavelet transforms [30] has been exploited in the context of image [31] and video coding [32]. Since such a transform is redundant, Fowler et al. applied a noise-shaping process [33] to increase the sparsity of the wavelet coefficients.

With the ultimate goal of promoting sparsity in a transform domain, we investigate in this article techniques for optimizing sparsity criteria, which can be used for the design of all the filters defined in a non separable lifting structure. We should note that the sparsest wavelet coefficients could be obtained by minimizing an $\ell_0$ criterion. However, such a problem is inherently non-convex and NP-hard [34]. Thus, unlike previous studies where prediction has been separately optimized by minimizing an $\ell_2$ criterion (i.e., the detail signal variance), we focus on the minimization of an $\ell_1$ criterion. Since the output of a prediction filter may be used as an input for other prediction filters, we then propose to optimize such a filter by minimizing a weighted $\ell_1$ criterion related to the global prediction error. We also propose to jointly optimize the prediction filters by using an algorithm that alternates between filter optimization and weight computation. While the minimization of an $\ell_1$ criterion is often considered in the signal processing literature such as in the compressed sensing field [35], it is worth pointing out that, to the best of our knowledge, the use of such a criterion for lifting operator design has not been previously investigated.

The rest of this article is organized as follows. In Section 2, we recall our recent study for the design of all the operators involved in a 2D non separable lifting structure [36, 37]. In Section 3, the motivation for using an $\ell_1$ criterion in the design of optimal lifting structures is first discussed. Then, the iterative algorithm for minimizing this criterion is described. In Section 4, we present a weighted $\ell_1$ criterion which aims at minimizing the global prediction error. In Section 5, we propose to jointly optimize the prediction filters by using an algorithm that alternates between optimizing all the filters and redefining the weights. Finally, in Section 6, experimental results are given and then some conclusions are drawn in Section 7.

2 2D lifting structure and optimization methods

2.1 Principle of the considered 2D NSLS structure

In this article, we consider a 2D NSLS composed of three prediction lifting steps followed by an update lifting step. The interest of this structure is two-fold. First, it allows us to reduce the number of lifting steps and rounding operations. A theoretical analysis has been conducted in [13] showing that NSLS improves the coding performance due to the reduction of rounding effects. Furthermore, any separable prediction-update LS structure has its equivalent in this form [13, 14]. The corresponding analysis structure is depicted in Figure 1.
Figure 1

NSLS decomposition structure.

Let $x$ denote the digital image to be coded. At each resolution level $j$ and each pixel location $(m,n)$, its approximation coefficient is denoted by $x_j(m,n)$ and the associated four polyphase components by $x_{0,j}(m,n) = x_j(2m,2n)$, $x_{1,j}(m,n) = x_j(2m,2n+1)$, $x_{2,j}(m,n) = x_j(2m+1,2n)$, and $x_{3,j}(m,n) = x_j(2m+1,2n+1)$. Furthermore, we denote by $P_j^{(HH)}$, $P_j^{(LH)}$, $P_j^{(HL)}$, and $U_j$ the three prediction and update filters employed to generate the detail coefficients $x_{j+1}^{(HH)}$ oriented diagonally, $x_{j+1}^{(LH)}$ oriented vertically, $x_{j+1}^{(HL)}$ oriented horizontally, and the approximation coefficients $x_{j+1}$. In accordance with Figure 1, let us introduce the following notation:

  • For the first prediction step, the prediction multiple input, single output (MISO) filter $P_j^{(HH)}$ can be seen as a sum of three single input, single output (SISO) filters $P_{0,j}^{(HH)}$, $P_{1,j}^{(HH)}$, and $P_{2,j}^{(HH)}$ whose respective inputs are the components $x_{0,j}$, $x_{1,j}$, and $x_{2,j}$.

  • For the second (resp. third) prediction step, the prediction MISO filter $P_j^{(LH)}$ (resp. $P_j^{(HL)}$) can be seen as a sum of two SISO filters $P_{0,j}^{(LH)}$ and $P_{1,j}^{(LH)}$ (resp. $P_{0,j}^{(HL)}$ and $P_{1,j}^{(HL)}$) whose respective inputs are, in accordance with Equations (2) and (3), the components $x_{0,j}$ and $x_{j+1}^{(HH)}$; the component being predicted is $x_{2,j}$ (resp. $x_{1,j}$).

  • For the update step, the update MISO filter $U_j$ can be seen as a sum of three SISO filters $U_j^{(HL)}$, $U_j^{(LH)}$, and $U_j^{(HH)}$ whose respective inputs are the detail coefficients $x_{j+1}^{(HL)}$, $x_{j+1}^{(LH)}$, and $x_{j+1}^{(HH)}$.

Now, it is easy to derive the expressions of the resulting coefficients in the 2D z-transform domain (endnote a). Indeed, the z-transforms of the output coefficients can be expressed as follows:
$$X_{j+1}^{(HH)}(z_1,z_2) = X_{3,j}(z_1,z_2) - \left\lfloor P_{0,j}^{(HH)}(z_1,z_2)\,X_{0,j}(z_1,z_2) + P_{1,j}^{(HH)}(z_1,z_2)\,X_{1,j}(z_1,z_2) + P_{2,j}^{(HH)}(z_1,z_2)\,X_{2,j}(z_1,z_2) \right\rfloor,$$
(1)
$$X_{j+1}^{(LH)}(z_1,z_2) = X_{2,j}(z_1,z_2) - \left\lfloor P_{0,j}^{(LH)}(z_1,z_2)\,X_{0,j}(z_1,z_2) + P_{1,j}^{(LH)}(z_1,z_2)\,X_{j+1}^{(HH)}(z_1,z_2) \right\rfloor,$$
(2)
$$X_{j+1}^{(HL)}(z_1,z_2) = X_{1,j}(z_1,z_2) - \left\lfloor P_{0,j}^{(HL)}(z_1,z_2)\,X_{0,j}(z_1,z_2) + P_{1,j}^{(HL)}(z_1,z_2)\,X_{j+1}^{(HH)}(z_1,z_2) \right\rfloor,$$
(3)
$$X_{j+1}(z_1,z_2) = X_{0,j}(z_1,z_2) + \left\lfloor U_j^{(HL)}(z_1,z_2)\,X_{j+1}^{(HL)}(z_1,z_2) + U_j^{(LH)}(z_1,z_2)\,X_{j+1}^{(LH)}(z_1,z_2) + U_j^{(HH)}(z_1,z_2)\,X_{j+1}^{(HH)}(z_1,z_2) \right\rfloor$$
(4)
where, for every polyphase index $i \in \{0,1,2\}$ and orientation $o \in \{HH, HL, LH\}$,
$$P_{i,j}^{(o)}(z_1,z_2) = \sum_{(k,l) \in \mathcal{P}_{i,j}^{(o)}} p_{i,j}^{(o)}(k,l)\, z_1^{-k} z_2^{-l}, \quad \text{and} \quad U_j^{(o)}(z_1,z_2) = \sum_{(k,l) \in \mathcal{U}_j^{(o)}} u_j^{(o)}(k,l)\, z_1^{-k} z_2^{-l}.$$

The set $\mathcal{P}_{i,j}^{(o)}$ (resp. $\mathcal{U}_j^{(o)}$) and the coefficients $p_{i,j}^{(o)}(k,l)$ (resp. $u_j^{(o)}(k,l)$) denote the support and the weights of the three prediction filters (resp. of the update filter). Note that in Equations (1)-(4), we have introduced the rounding operations $\lfloor \cdot \rfloor$ in order to allow lossy-to-lossless encoding of the coefficients [7]. Now that the considered NSLS structure has been defined, we focus on the optimization of its lifting operators.
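As an illustration only, the decomposition of Equations (1)-(4) and its inverse can be sketched in Python as follows. This is a minimal sketch rather than the implementation used in this article: the filter taps below are hypothetical values chosen for the example, the circular boundary extension and the even image dimensions are simplifying assumptions, and all function names are ours. Each SISO filter is stored as a dictionary mapping a shift $(k,l)$ to its weight.

```python
import numpy as np

def apply_filter(taps, x):
    """SISO filter {(k, l): weight}: y(m, n) = sum over (k, l) of p(k, l) x(m-k, n-l).
    Circular boundary extension is an assumption made for brevity."""
    y = np.zeros(x.shape, dtype=float)
    for (k, l), weight in taps.items():
        y += weight * np.roll(x, shift=(k, l), axis=(0, 1))
    return y

def rounded(v):
    """Rounding operator (floor) returning integers."""
    return np.floor(v).astype(np.int64)

def nsls_analysis(x, P_hh, P_lh, P_hl, U):
    """One resolution level: three prediction steps, then one update step."""
    x = np.asarray(x, dtype=np.int64)          # even dimensions assumed
    x0, x1 = x[0::2, 0::2], x[0::2, 1::2]      # polyphase components
    x2, x3 = x[1::2, 0::2], x[1::2, 1::2]
    hh = x3 - rounded(apply_filter(P_hh[0], x0) + apply_filter(P_hh[1], x1)
                      + apply_filter(P_hh[2], x2))
    lh = x2 - rounded(apply_filter(P_lh[0], x0) + apply_filter(P_lh[1], hh))
    hl = x1 - rounded(apply_filter(P_hl[0], x0) + apply_filter(P_hl[1], hh))
    approx = x0 + rounded(apply_filter(U["HL"], hl) + apply_filter(U["LH"], lh)
                          + apply_filter(U["HH"], hh))
    return approx, hl, lh, hh

def nsls_synthesis(approx, hl, lh, hh, P_hh, P_lh, P_hl, U):
    """Inverse transform: undo the lifting steps in reverse order."""
    x0 = approx - rounded(apply_filter(U["HL"], hl) + apply_filter(U["LH"], lh)
                          + apply_filter(U["HH"], hh))
    x1 = hl + rounded(apply_filter(P_hl[0], x0) + apply_filter(P_hl[1], hh))
    x2 = lh + rounded(apply_filter(P_lh[0], x0) + apply_filter(P_lh[1], hh))
    x3 = hh + rounded(apply_filter(P_hh[0], x0) + apply_filter(P_hh[1], x1)
                      + apply_filter(P_hh[2], x2))
    x = np.empty((2 * x0.shape[0], 2 * x0.shape[1]), dtype=np.int64)
    x[0::2, 0::2], x[0::2, 1::2] = x0, x1
    x[1::2, 0::2], x[1::2, 1::2] = x2, x3
    return x

# Hypothetical filter taps, for illustration only.
P_HH = [{(0, 0): -0.25, (-1, 0): -0.25, (0, -1): -0.25, (-1, -1): -0.25},
        {(0, 0): 0.5, (-1, 0): 0.5},
        {(0, 0): 0.5, (0, -1): 0.5}]
P_LH = [{(0, 0): 0.5, (-1, 0): 0.5}, {(0, 0): -0.25, (0, 1): -0.25}]
P_HL = [{(0, 0): 0.5, (0, -1): 0.5}, {(0, 0): -0.25, (1, 0): -0.25}]
U_ALL = {"HL": {(0, 0): 0.25, (0, 1): 0.25},
         "LH": {(0, 0): 0.25, (1, 0): 0.25},
         "HH": {(0, 0): 0.0625, (1, 0): 0.0625, (0, 1): 0.0625, (1, 1): 0.0625}}
```

Because every lifting step adds or subtracts a rounded quantity that the decoder can recompute from already decoded signals, the integer image is recovered exactly whatever the filter weights are, which is the lossy-to-lossless property mentioned above.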

2.2 Optimization methods

Since the detail coefficients are defined as prediction errors, the prediction operators are often optimized by minimizing the variance of the coefficients (i.e., their $\ell_2$-norm) at each resolution level. The rounding operators being omitted, it is readily shown that the minimum variance predictors must satisfy the well-known Yule-Walker equations. For example, for the prediction vector $p_j^{(HH)}$, the normal equations read
$$E\left[\tilde{x}_j^{(HH)}(m,n)\,\tilde{x}_j^{(HH)}(m,n)^\top\right] p_j^{(HH)} = E\left[x_{3,j}(m,n)\,\tilde{x}_j^{(HH)}(m,n)\right]$$
(5)

where

  • $p_j^{(HH)} = \left(p_{0,j}^{(HH)}, p_{1,j}^{(HH)}, p_{2,j}^{(HH)}\right)^\top$ is the prediction vector, and, for every $i \in \{0,1,2\}$,
    $p_{i,j}^{(HH)} = \left(p_{i,j}^{(HH)}(k,l)\right)_{(k,l) \in \mathcal{P}_{i,j}^{(HH)}}$,
  • $\tilde{x}_j^{(HH)}(m,n) = \left(x_{0,j}^{(HH)}(m,n), x_{1,j}^{(HH)}(m,n), x_{2,j}^{(HH)}(m,n)\right)^\top$ is the reference vector with
    $x_{i,j}^{(HH)}(m,n) = \left(x_{i,j}(m-k,n-l)\right)_{(k,l) \in \mathcal{P}_{i,j}^{(HH)}}$.

The other optimal prediction filters $p_j^{(HL)}$ and $p_j^{(LH)}$ are obtained in a similar way.
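To make the normal equations (5) concrete, the following Python sketch replaces the expectations by sample averages over one subband and solves the resulting linear system. It is only a sketch under stated assumptions: the circular boundary handling, the function names, and the way the supports are passed (one list of shifts $(k,l)$ per polyphase component) are our own choices for the example.

```python
import numpy as np

def reference_matrix(components, supports):
    """One row per pixel (m, n), one column per pair (component i, shift (k, l)):
    column value = x_i(m - k, n - l). Circular boundaries are an assumption."""
    cols = []
    for x_i, support in zip(components, supports):
        for (k, l) in support:
            cols.append(np.roll(x_i, shift=(k, l), axis=(0, 1)).ravel())
    return np.stack(cols, axis=1).astype(float)

def l2_predictor(target, components, supports):
    """Minimum-variance predictor from the sample version of Equation (5)."""
    A = reference_matrix(components, supports)
    b = np.asarray(target, dtype=float).ravel()
    R = A.T @ A / b.size   # sample estimate of E[x~ x~^T]
    r = A.T @ b / b.size   # sample estimate of E[x_3 x~]
    return np.linalg.solve(R, r)
```

For the diagonal filter, `target` would be $x_{3,j}$ and `components` the triple $(x_{0,j}, x_{1,j}, x_{2,j})$; the returned vector is ordered component by component, following the order of the shifts in each support list.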

Concerning the update filter, the conventional approach consists of optimizing its coefficients by minimizing the reconstruction error when the detail signal is canceled [20, 38]. Recently, we have proposed a new optimization technique which aims at reducing the aliasing effects [36, 37]. To this end, the update operator is optimized by minimizing the quadratic error between the approximation signal and the decimated version of the output of an ideal low-pass filter:
$$\tilde{J}(u_j) = E\left[\left(x_{j+1}(m,n) - y_{j+1}(m,n)\right)^2\right] = E\left[\left(x_{0,j}(m,n) + \sum_{o \in \{HL,LH,HH\}} \sum_{(k,l) \in \mathcal{U}_j^{(o)}} u_j^{(o)}(k,l)\, x_{j+1}^{(o)}(m-k,n-l) - y_{j+1}(m,n)\right)^2\right]$$
(6)
where $y_{j+1}(m,n) = \tilde{y}_j(2m,2n) = (h * x_j)(2m,2n)$, with $\tilde{y}_j = h * x_j$ the filtered image. Recall that the impulse response of the 2D ideal low-pass filter is defined in the spatial domain by:
$$\forall (m,n) \in \mathbb{Z}^2, \quad h(m,n) = \frac{1}{4}\,\mathrm{sinc}\left(\frac{m\pi}{2}\right) \mathrm{sinc}\left(\frac{n\pi}{2}\right).$$
(7)
Thus, the optimal update coefficients $u_j$ minimizing the criterion $\tilde{J}$ are solutions of the following linear system of equations:
$$E\left[x_{j+1}(m,n)\,x_{j+1}(m,n)^\top\right] u_j = E\left[y_{j+1}(m,n)\,x_{j+1}(m,n)\right] - E\left[x_{0,j}(m,n)\,x_{j+1}(m,n)\right]$$

where

  • $u_j = \left(u_j^{(o)}(k,l)\right)^\top_{(k,l) \in \mathcal{U}_j^{(o)},\, o \in \{HL,LH,HH\}}$ is the update weight vector,

  • $x_{j+1}(m,n) = \left(x_{j+1}^{(o)}(m-k,n-l)\right)^\top_{(k,l) \in \mathcal{U}_j^{(o)},\, o \in \{HL,LH,HH\}}$ is the reference vector containing the detail signals previously computed at the $j$th resolution level.
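The update design can be illustrated by the following Python sketch, which builds a truncated version of the separable ideal low-pass filter (7), forms the decimated target $y_{j+1}$, and solves the sample version of the linear system above by least squares. The truncation radius, the circular boundary handling, and all function names are assumptions made for this example only. Note that NumPy's `np.sinc(t)` is the normalized function $\sin(\pi t)/(\pi t)$, so that $\mathrm{sinc}(m\pi/2)$ in (7) corresponds to `np.sinc(m / 2)`.

```python
import numpy as np

def half_band_kernel(radius):
    """Truncated 1D factor g(m) = (1/2) sinc(m pi / 2), so that h(m, n) = g(m) g(n)."""
    m = np.arange(-radius, radius + 1)
    return 0.5 * np.sinc(m / 2.0)

def filter_axis(x, g, axis):
    """Circular 1D filtering along one axis: y(m) = sum_k g(k) x(m - k)."""
    radius = (len(g) - 1) // 2
    out = np.zeros(x.shape, dtype=float)
    for k, gk in zip(range(-radius, radius + 1), g):
        out += gk * np.roll(x, k, axis=axis)
    return out

def lowpass_target(x, radius=16):
    """y_{j+1}(m, n) = (h * x_j)(2m, 2n) with a truncated (hence non-ideal) h."""
    g = half_band_kernel(radius)
    filtered = filter_axis(filter_axis(np.asarray(x, dtype=float), g, 0), g, 1)
    return filtered[0::2, 0::2]

def optimal_update(x0, details, supports, y_next):
    """Least-squares update weights: minimize the sample mean of
    (x0 + sum_o sum_(k,l) u(k,l) d_o(m-k, n-l) - y_next)^2."""
    cols = [np.roll(details[o], shift=(k, l), axis=(0, 1)).ravel()
            for o in ("HL", "LH", "HH") for (k, l) in supports[o]]
    D = np.stack(cols, axis=1).astype(float)
    target = (np.asarray(y_next, dtype=float) - x0).ravel()
    u, *_ = np.linalg.lstsq(D, target, rcond=None)
    return u
```

The least-squares solution returned by `optimal_update` satisfies the same normal equations as above, with expectations replaced by averages over the subband; the weights are ordered HL, LH, HH, following the order of the shifts in each support list.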

Now, we will introduce a novel twist in the optimization of the different filters: the use of an $\ell_1$-based criterion in place of the usual $\ell_2$-based measure.

3 From $\ell_2$ to $\ell_1$ minimization

3.1 Motivation

Wavelet coefficient statistics are often exploited in order to increase image compression efficiency [39]. More precisely, detail wavelet coefficients are often viewed as realizations of a zero-mean continuous random variable whose probability density function $f$ is given by a generalized Gaussian distribution (GGD) [40, 41]:
$$\forall x \in \mathbb{R}, \quad f(x;\alpha,\beta) = \frac{\beta}{2\alpha\,\Gamma(1/\beta)}\, e^{-\left(|x|/\alpha\right)^{\beta}}$$
(8)

where $\Gamma(z) = \int_0^{+\infty} t^{z-1} e^{-t}\,dt$ is the Gamma function, $\alpha > 0$ is the scale parameter, and $\beta > 0$ is the shape parameter. We should note that in the particular case when $\beta = 2$ (resp. $\beta = 1$), the GGD corresponds to the Gaussian distribution (resp. the Laplace one). The parameters $\alpha$ and $\beta$ can be easily estimated by using the maximum likelihood technique [42].

Let us now adopt this probabilistic GGD model for the detail coefficients generated by a lifting structure. More precisely, at each resolution level $j$ and orientation $o$ ($o \in \{HL,LH,HH\}$), the wavelet coefficients $x_{j+1}^{(o)}(m,n)$ are viewed as realizations of a random variable $X_{j+1}^{(o)}$ with probability distribution given by a GGD with parameters $\alpha_{j+1}^{(o)}$ and $\beta_{j+1}^{(o)}$. Thus, this class of distributions leads us to the following sample estimate of the differential entropy $h$ of the variable $X_{j+1}^{(o)}$ [11, 43]:
$$h(X_{j+1}^{(o)}) \approx \frac{1}{M_j N_j \left(\alpha_{j+1}^{(o)}\right)^{\beta_{j+1}^{(o)}} \ln(2)} \sum_{m=1}^{M_j} \sum_{n=1}^{N_j} \left|x_{j+1}^{(o)}(m,n)\right|^{\beta_{j+1}^{(o)}} - \log_2\left(\frac{\beta_{j+1}^{(o)}}{2\alpha_{j+1}^{(o)}\,\Gamma\left(1/\beta_{j+1}^{(o)}\right)}\right)$$
(9)

where $(M_j, N_j)$ corresponds to the dimensions of the subband $x_{j+1}^{(o)}$.
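The GGD density (8) and the entropy estimate (9) translate directly into a few lines of Python. The sketch below assumes that $\alpha$ and $\beta$ are already known (for instance from a maximum likelihood fit, which is not shown); the function names are ours.

```python
import math
import numpy as np

def ggd_pdf(x, alpha, beta):
    """Generalized Gaussian density of Equation (8)."""
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * np.exp(-(np.abs(x) / alpha) ** beta)

def ggd_entropy_estimate(coeffs, alpha, beta):
    """Sample estimate (in bits) of the differential entropy, Equation (9)."""
    c = np.asarray(coeffs, dtype=float)
    data_term = np.sum(np.abs(c) ** beta) / (c.size * alpha ** beta * math.log(2.0))
    norm_term = math.log2(beta / (2.0 * alpha * math.gamma(1.0 / beta)))
    return data_term - norm_term
```

Two sanity checks follow from the text: with $\beta = 2$ and $\alpha = \sigma\sqrt{2}$ the density reduces to a Gaussian of standard deviation $\sigma$, and for Laplace-distributed samples ($\beta = 1$) the estimate approaches the known differential entropy $\log_2(2e\alpha)$ as the number of samples grows. For fixed $\alpha$ and $\beta$, the only data-dependent term is the sum of $|x|^{\beta}$.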

Let $\left(\bar{x}_{j+1}^{(o)}(m,n)\right)_{1 \le m \le M_j,\, 1 \le n \le N_j}$ be the outputs of a uniform quantizer with quantization step $q$ driven with the real-valued coefficients $\left(x_{j+1}^{(o)}(m,n)\right)_{1 \le m \le M_j,\, 1 \le n \le N_j}$. The coefficients $\bar{x}_{j+1}^{(o)}(m,n)$ can be viewed as realizations of a random variable $\bar{X}_{j+1}^{(o)}$ taking its values in $\{\ldots, -2q, -q, 0, q, 2q, \ldots\}$. At high resolution, it was proved in [43] that the following relation holds between the discrete entropy $H$ of $\bar{X}_{j+1}^{(o)}$ and the differential entropy $h$ of $X_{j+1}^{(o)}$:
$$H(\bar{X}_{j+1}^{(o)}) \approx h(X_{j+1}^{(o)}) - \log_2(q).$$
(10)
Thus, from Equation (9), we see [44] that the entropy $H(\bar{X}_{j+1}^{(o)})$ of $\bar{X}_{j+1}^{(o)}$ is (up to a dividing factor and an additive constant) approximately equal to:
$$\sum_{m=1}^{M_j} \sum_{n=1}^{N_j} \left|x_{j+1}^{(o)}(m,n)\right|^{\beta_{j+1}^{(o)}}.$$

This shows that there exists a close link between the minimization of the entropy of the detail wavelet coefficients and the minimization of their $\ell_{\beta_{j+1}^{(o)}}$-norm. This suggests in particular that most of the existing studies minimizing the $\ell_2$-norm of the detail signals aim at minimizing their entropy by assuming a Gaussian model.

Based on these results, we have analyzed the detail wavelet coefficients generated by the decomposition based on the lifting structure NSLS(2,2)-OPT-L2 described in Section 6. Figure 2 shows the distribution of each detail subband for the "einst" image when the prediction filters are optimized by minimizing the $\ell_2$-norm of the detail coefficients. The maximum likelihood technique is used to estimate the $\beta$ parameter.
Figure 2

The GGD of the: (a) horizontal detail subband $x_1^{(HL)}$ ($\beta_1^{(HL)} = 1.07$), (b) vertical detail subband $x_1^{(LH)}$ ($\beta_1^{(LH)} = 1.14$), (c) diagonal detail subband $x_1^{(HH)}$ ($\beta_1^{(HH)} = 1.15$). The detail coefficients of the "einst" image are optimized by minimizing their $\ell_2$-norm.

It is important to note that the shape parameters of the resulting detail subbands are closer to $\beta = 1$ than to $\beta = 2$. Further experiments performed on a large dataset of images (endnote b) have shown that the average $\beta$ values are closer to 1 (typical values range from 0.5 to 1.5). These observations suggest that minimizing the $\ell_1$-norm may be more appropriate than $\ell_2$ minimization. In addition, the former approach has the advantage of producing sparse representations.

3.2 $\ell_1$ minimization technique

Instead of minimizing the $\ell_2$-norm of the detail coefficients $x_{j+1}^{(o)}$ as done in [37], we propose in this section to optimize each of the prediction filters by minimizing the following $\ell_1$ criterion:
$$\forall o \in \{HL,LH,HH\},\ \forall i \in \{1,2,3\}, \quad \mathcal{J}_{\ell_1}(p_j^{(o)}) = \sum_{m=1}^{M_j} \sum_{n=1}^{N_j} \left|x_{i,j}(m,n) - (p_j^{(o)})^\top \tilde{x}_j^{(o)}(m,n)\right|$$
(11)

where $x_{i,j}(m,n)$ is the $(i+1)$th polyphase component to be predicted, $\tilde{x}_j^{(o)}(m,n)$ is the reference vector containing the samples used in the prediction step, and $p_j^{(o)}$ is the prediction operator vector to be optimized ($L$ will subsequently designate its length). Although the criterion in (11) is convex, a major difficulty that arises in solving this problem stems from the fact that the function to be minimized is not differentiable. Recently, several optimization algorithms have been proposed to solve nonsmooth minimization problems like (11). These problems have been traditionally addressed with linear programming [45]. Alternatively, a flexible class of proximal optimization algorithms has been developed and successfully employed in a number of applications. A survey on these proximal methods can be found in [46]. These methods are also closely related to augmented Lagrangian methods [47]. In our context, we have employed the Douglas-Rachford algorithm which is an efficient optimization tool for this problem [48].

3.2.1 The Douglas-Rachford algorithm

For minimizing the $\ell_1$ criterion, we will resort to the concept of proximity operators [49], which has been recognized as a fundamental tool in the recent convex optimization literature [50, 51]. The necessary background on convex analysis and proximity operators [52, 53] is given in Appendix A.

Now, we recall that our minimization problem (11) aims at optimizing the prediction filters by minimizing the $\ell_1$-norm of the difference between the current pixel $x_{i,j}$ and its predicted value. We note here that $x_{i,j} = \left(x_{i,j}(m,n)\right)_{1 \le m \le M_j,\, 1 \le n \le N_j}$ can be viewed as an element of the Euclidean space $\mathbb{R}^{K_j}$, where $K_j = M_j \times N_j$. Thus, the minimization problem (11) can be rewritten as:
$$\forall o \in \{HL,LH,HH\},\ \forall i \in \{1,2,3\}, \quad \min_{z_j^{(o)} \in V} \sum_{m=1}^{M_j} \sum_{n=1}^{N_j} \left|x_{i,j}(m,n) - z_j^{(o)}(m,n)\right|$$
(12)
where $V$ is the vector space defined as
$$V = \left\{ z_j^{(o)} = \left(z_j^{(o)}(m,n)\right)_{1 \le m \le M_j,\, 1 \le n \le N_j} \in \mathbb{R}^{K_j} \;\middle|\; \exists\, p_j^{(o)} \in \mathbb{R}^{L},\ \forall (m,n) \in \{1,\ldots,M_j\} \times \{1,\ldots,N_j\},\ z_j^{(o)}(m,n) = (p_j^{(o)})^\top \tilde{x}_j^{(o)}(m,n) \right\}.$$
Based on the definition of the indicator function $\iota_V$ (see Appendix A), Problem (12) is equivalent to the following minimization problem:
$$\forall o \in \{HL,LH,HH\},\ \forall i \in \{1,2,3\}, \quad \min_{z_j^{(o)} \in \mathbb{R}^{K_j}} \sum_{m=1}^{M_j} \sum_{n=1}^{N_j} \left|x_{i,j}(m,n) - z_j^{(o)}(m,n)\right| + \iota_V(z_j^{(o)}).$$
(13)
Therefore, Problem (13) can be viewed as a minimization of a sum of two functions $f_1$ and $f_2$ defined by:
$$f_1(z_j^{(o)}) = \|x_{i,j} - z_j^{(o)}\|_1 = \sum_{m=1}^{M_j} \sum_{n=1}^{N_j} \left|x_{i,j}(m,n) - z_j^{(o)}(m,n)\right|$$
(14)
$$f_2(z_j^{(o)}) = \iota_V(z_j^{(o)}).$$
(15)

In this case, the Douglas-Rachford algorithm can be applied to provide an appealing numerical solution to Problem (13) (see Appendix B).
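For illustration, a generic Douglas-Rachford iteration for Problem (13) can be sketched in Python as follows. This is a textbook form of the algorithm written under our own assumptions (fixed step $\gamma$, fixed relaxation parameter, fixed number of iterations, hypothetical function names), and it is not meant to reproduce the exact recursion of Appendix B. The matrix `A` stacks the reference vectors $\tilde{x}_j^{(o)}(m,n)^\top$ row by row, so that $V$ is its column space. The proximity operator of $\gamma f_1$ is a soft-thresholding centered at $x_{i,j}$, and that of $f_2$ is the orthogonal projection onto $V$.

```python
import numpy as np

def soft_threshold(v, t):
    """Componentwise soft-thresholding with threshold t >= 0."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_predictor_dr(x, A, gamma=1.0, relax=1.0, n_iter=500):
    """Minimize ||x - A p||_1 over p with a Douglas-Rachford iteration on
    f1(z) = ||x - z||_1 and f2 = indicator of V = range(A)."""
    x = np.asarray(x, dtype=float).ravel()
    A = np.asarray(A, dtype=float)
    A_pinv = np.linalg.pinv(A)
    t = x.copy()
    for _ in range(n_iter):
        z = A @ (A_pinv @ t)                               # prox of f2: projection onto V
        reflected = 2.0 * z - t
        prox_f1 = x + soft_threshold(reflected - x, gamma) # prox of gamma * f1
        t = t + relax * (prox_f1 - z)
    return A_pinv @ t   # coefficients of the projection of the final iterate

def l1_cost(p, x, A):
    """Value of the l1 criterion (11) for the predictor p."""
    return float(np.abs(np.asarray(x, dtype=float).ravel() - A @ p).sum())
```

On data whose prediction residual contains a few large outliers, the predictor returned by `l1_predictor_dr` typically reaches a lower value of `l1_cost` than the least-squares predictor, which is pulled toward the outliers.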

Although it is an iterative algorithm, we have observed experimentally that the convergence of the Douglas-Rachford algorithm is generally reached after a small number of iterations (often between 30 and 60 iterations). As an example, we plot in Figure 3a (resp. 3b) the evolution of the criterion $\mathcal{J}_{\ell_1}(p_0^{(HH)})$ (resp. $\mathcal{J}_{\ell_1}(p_0^{(LH)})$) w.r.t. the iteration number for this algorithm.
Figure 3

Convergence of the Douglas-Rachford algorithm w.r.t. the iteration number: (a) evolution of $\mathcal{J}_{\ell_1}(p_0^{(HH)})$, (b) evolution of $\mathcal{J}_{\ell_1}(p_0^{(LH)})$.

Once the different terms involved in the iterative algorithm (33) are defined, the algorithm can be applied and further extended to optimize all the prediction filters.

4 Global prediction error minimization technique

4.1 Motivation

Up to now, each prediction filter $p_j^{(o)}$ ($o \in \{HL,LH,HH\}$) has been separately optimized by minimizing the $\ell_1$-norm of the corresponding detail signal $x_{j+1}^{(o)}$, which seems appropriate to determine $p_j^{(LH)}$ and $p_j^{(HL)}$. However, it can be noticed from Figure 1 that the diagonal detail signal $x_{j+1}^{(HH)}$ is also used through the second and the third prediction steps to compute the vertical and the horizontal detail signals respectively. Therefore, the solution $p_j^{(HH)}$ resulting from the previous optimization method may be suboptimal. As a result, we propose to optimize the prediction filter $p_j^{(HH)}$ by minimizing the global prediction error, as described in detail in the next section.

4.2 Optimization of the prediction filter $p_j^{(HH)}$

More precisely, instead of minimizing the $\ell_1$-norm of $x_{j+1}^{(HH)}$, the filter $p_j^{(HH)}$ will be optimized by minimizing the sum of the $\ell_1$-norms of the three detail subbands $x_{j+1}^{(o)}$. In this respect, we will consider the minimization of the following weighted $\ell_1$ criterion:
$$\mathcal{J}_{w\ell_1}(p_j^{(HH)}) = \sum_{o \in \{HL,LH,HH\}} \sum_{m=1}^{M_j} \sum_{n=1}^{N_j} \kappa_{j+1}^{(o)} \left|x_{j+1}^{(o)}(m,n)\right|$$
(16)

where $\kappa_{j+1}^{(o)}$, $o \in \{HL,LH,HH\}$, are strictly positive weighting terms.

Before focusing on the method employed to minimize the proposed criterion, we should first express $\mathcal{J}_{w\ell_1}$ as a function of the filter $p_j^{(HH)}$ to be optimized.

Let $\left(x_{i,j}^{(1)}(m,n)\right)_{i \in \{0,1,2,3\}}$ be the four outputs obtained from $\left(x_{i,j}(m,n)\right)_{i \in \{0,1,2,3\}}$ following the first prediction step (see Figure 1). Although $x_{i,j}^{(1)}(m,n) = x_{i,j}(m,n)$ for all $i \in \{0,1,2\}$, the use of the superscript will make the presentation below easier. Thus $x_{j+1}^{(o)}$ can be expressed as:
$$x_{j+1}^{(o)}(m,n) = \sum_{i \in \{0,1,2,3\}} \sum_{k,l} h_{i,j}^{(o,1)}(k,l)\, x_{i,j}^{(1)}(m-k,n-l) = \sum_{i \in \{0,1,2\}} \sum_{k,l} h_{i,j}^{(o,1)}(k,l)\, x_{i,j}^{(1)}(m-k,n-l) + \sum_{k,l} h_{3,j}^{(o,1)}(k,l)\, x_{3,j}^{(1)}(m-k,n-l)$$
(17)

where $h_{i,j}^{(o,1)}$ is a filter which depends on the prediction coefficients of $p_j^{(LH)}$ and $p_j^{(HL)}$.

Knowing that
$$x_{3,j}^{(1)}(m,n) = x_{3,j}(m,n) - (p_j^{(HH)})^\top \tilde{x}_j^{(HH)}(m,n)$$
(18)
where $\tilde{x}_j^{(HH)}(m,n) = \left(x_{i,j}(m-r,n-s)\right)_{(r,s) \in \mathcal{P}_j^{(HH)},\, i \in \{0,1,2\}}$ ($\mathcal{P}_j^{(HH)}$ is the support of the predictor $p_j^{(HH)}$), we thus obtain, after some simple calculations,
$$\forall o \in \{HH,LH,HL\}, \quad x_{j+1}^{(o)}(m,n) = y_j^{(o,1)}(m,n) - (p_j^{(HH)})^\top x_j^{(o,1)}(m,n)$$
(19)
where
$$y_j^{(o,1)}(m,n) = \sum_{i \in \{0,1,2\}} \sum_{k,l} h_{i,j}^{(o,1)}(k,l)\, x_{i,j}^{(1)}(m-k,n-l) + \sum_{k,l} h_{3,j}^{(o,1)}(k,l)\, x_{3,j}(m-k,n-l),$$
(20)
$$x_j^{(o,1)}(m,n) = \left(\sum_{k,l} h_{3,j}^{(o,1)}(k,l)\, x_{i,j}(m-k-r,n-l-s)\right)_{(r,s) \in \mathcal{P}_j^{(HH)},\, i \in \{0,1,2\}}.$$
(21)
Consequently, the proposed weighted $\ell_1$ criterion (Equation (16)) can be expressed as:
$$\mathcal{J}_{w\ell_1}(p_j^{(HH)}) = \sum_{o \in \{HL,LH,HH\}} \sum_{m=1}^{M_j} \sum_{n=1}^{N_j} \kappa_{j+1}^{(o)} \left|y_j^{(o,1)}(m,n) - (p_j^{(HH)})^\top x_j^{(o,1)}(m,n)\right|.$$
(22)

It is worth noting that in practice, the determination of $y_j^{(o,1)}(m,n)$ and $x_j^{(o,1)}(m,n)$ does not require finding the explicit expressions of $h_{i,j}^{(o,1)}$, and these signals can be determined numerically as follows:

  • The first term (resp. the second one) in the expression of $y_j^{(o,1)}(m,n)$ in Equation (20) can be found by computing $x_{j+1}^{(o)}(m,n)$ from the components $\left(x_{i,j}^{(1)}(m,n)\right)_{i \in \{0,1,2,3\}}$ while setting $x_{3,j}^{(1)}(m,n) = 0$ (resp. while setting $x_{i,j}^{(1)}(m,n) = 0$ for $i \in \{0,1,2\}$ and $x_{3,j}^{(1)}(m,n) = x_{3,j}(m,n)$).

  • The vector $x_j^{(o,1)}(m,n)$ in Equation (21) can be found as follows. For each $i \in \{0,1,2\}$, the computation of its component $\sum_{k,l} h_{3,j}^{(o,1)}(k,l)\, x_{i,j}(m-k,n-l)$ requires computing $x_{j+1}^{(o)}(m,n)$ by setting $x_{3,j}^{(1)}(m,n) = x_{i,j}(m,n)$ and $x_{i',j}^{(1)}(m,n) = 0$ for $i' \in \{0,1,2\}$. The result of this operation has to be considered for different shift values $(r,s)$ (as can be seen in Equation (21)).
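The numerical procedure described in the two items above can be sketched in Python as follows. The sketch omits the rounding operators so that the second and third prediction steps are linear, uses circular boundaries, and relies on hypothetical function names and data layouts (filters as dictionaries of shifts, one support list per polyphase component); it only illustrates how Equation (19) is obtained without computing the filters $h_{i,j}^{(o,1)}$ explicitly.

```python
import numpy as np

def shift(x, k, l):
    """Circular shift: returns the array with entries x(m - k, n - l)."""
    return np.roll(x, shift=(k, l), axis=(0, 1))

def apply_filter(taps, x):
    """SISO filter {(k, l): weight} with circular boundaries (an assumption)."""
    y = np.zeros(x.shape, dtype=float)
    for (k, l), weight in taps.items():
        y += weight * shift(x, k, l)
    return y

def details_after_first_step(x0, x1, x2, x3_1, P_lh, P_hl):
    """Second and third prediction steps applied to the outputs of the first one
    (rounding omitted, so this map is linear in its inputs)."""
    hh = x3_1
    lh = x2 - (apply_filter(P_lh[0], x0) + apply_filter(P_lh[1], hh))
    hl = x1 - (apply_filter(P_hl[0], x0) + apply_filter(P_hl[1], hh))
    return {"HH": hh, "LH": lh, "HL": hl}

def criterion_terms(x0, x1, x2, x3, supports_hh, P_lh, P_hl):
    """Signals y^(o,1) of Equation (20) and reference vectors x^(o,1) of Equation (21)."""
    zero = np.zeros(np.shape(x0), dtype=float)
    # Both terms of Equation (20) at once: run the steps with x3 left unpredicted.
    y = details_after_first_step(x0, x1, x2, x3, P_lh, P_hl)
    refs = {o: [] for o in y}
    for x_i, support in zip((x0, x1, x2), supports_hh):
        # Feed x_i into the fourth input only: this yields h_3^(o,1) * x_i.
        response = details_after_first_step(zero, zero, zero, x_i, P_lh, P_hl)
        for (r, s) in support:
            for o in refs:
                refs[o].append(shift(response[o], r, s))
    return y, {o: np.stack(cols, axis=-1) for o, cols in refs.items()}
```

With these outputs, the detail subband of orientation `o` obtained for a given diagonal predictor `p` is `y[o] - refs[o] @ p`, where `p` is ordered component by component, following the order of the shifts in `supports_hh`. By linearity, this coincides with running the three prediction steps directly.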

Once the different terms involved in the proposed weighted criterion in Equation (22) are defined (the constant values $\kappa_{j+1}^{(o)}$ are supposed to be known), we will focus now on its minimization. Indeed, unlike the previous criterion (Equation (11)), which consists only of an $\ell_1$ term, the proposed criterion is a sum of three $\ell_1$ terms. To minimize such a criterion (22), one can still use the Douglas-Rachford algorithm through a formulation in a product space [46, 54].

4.2.1 Douglas-Rachford algorithm in a product space

Consider the $\ell_1$ minimization problem:
$$\min_{p_j^{(HH)}} \sum_{o \in \{HL,LH,HH\}} \sum_{m=1}^{M_j} \sum_{n=1}^{N_j} \kappa_{j+1}^{(o)} \left|y_j^{(o,1)}(m,n) - (p_j^{(HH)})^\top x_j^{(o,1)}(m,n)\right|$$
(23)

where $\kappa_{j+1}^{(o)}$, $o \in \{HL,LH,HH\}$, are positive weights.

Since the Douglas-Rachford algorithm described above is designed for the sum of two functions, we can reformulate (23) under this form in the 3-fold product space $\mathcal{H}_j$
$$\mathcal{H}_j = \mathbb{R}^{K_j} \times \mathbb{R}^{K_j} \times \mathbb{R}^{K_j}.$$
(24)
If we define the vector subspace $U$ as
$$U = \left\{ \mathbf{z}_j = \left(z_j^{(HH,1)}, z_j^{(LH,1)}, z_j^{(HL,1)}\right) \in \mathcal{H}_j \;\middle|\; \exists\, p_j^{(HH)} \in \mathbb{R}^{L},\ \forall o \in \{HH,LH,HL\},\ \forall (m,n) \in \{1,2,\ldots,M_j\} \times \{1,2,\ldots,N_j\},\ z_j^{(o,1)}(m,n) = (p_j^{(HH)})^\top x_j^{(o,1)}(m,n) \right\} = \left\{ \mathbf{z}_j \in \mathcal{H}_j \;\middle|\; \exists\, p_j^{(HH)} \in \mathbb{R}^{L},\ \forall (m,n) \in \{1,2,\ldots,M_j\} \times \{1,2,\ldots,N_j\},\ \mathbf{z}_j(m,n) = \mathbf{X}_j(m,n)^\top p_j^{(HH)} \right\} \quad \text{with} \quad \mathbf{X}_j(m,n) = \left(x_j^{(HH,1)}(m,n), x_j^{(LH,1)}(m,n), x_j^{(HL,1)}(m,n)\right),$$
(25)
the minimization problem (Equation (23)) is equivalent to
$$\min_{\mathbf{z}_j \in \mathcal{H}_j} f_3(\mathbf{z}_j) + f_4(\mathbf{z}_j)$$
(26)
where
$$f_3(\mathbf{z}_j) = \sum_{o \in \{HL,LH,HH\}} \sum_{m=1}^{M_j} \sum_{n=1}^{N_j} \kappa_{j+1}^{(o)} \left|y_j^{(o,1)}(m,n) - z_j^{(o,1)}(m,n)\right|, \qquad f_4(\mathbf{z}_j) = \iota_U(\mathbf{z}_j).$$
(27)

We are thus back to a problem involving two functions in a larger space, which is the product space $\mathcal{H}_j$. So, the Douglas-Rachford algorithm can be applied to solve our minimization problem (see Appendix C). Finally, once the prediction filter $p_j^{(HH)}$ is optimized and fixed, it can be noticed that the other prediction filters $p_j^{(HL)}$ and $p_j^{(LH)}$ can be separately optimized by minimizing $\mathcal{J}_{\ell_1}(p_j^{(HL)})$ and $\mathcal{J}_{\ell_1}(p_j^{(LH)})$ as explained in Section 3. This is justified by the fact that the inputs of the filter $p_j^{(HL)}$ (resp. $p_j^{(LH)}$) are independent of the output of the filter $p_j^{(LH)}$ (resp. $p_j^{(HL)}$).
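A possible Python sketch of this product-space formulation is given below. It stacks the three subbands into one long vector, so that $f_3$ becomes a weighted $\ell_1$ distance to the stacked $y_j^{(o,1)}$ and $U$ becomes the column space of the stacked reference matrix. As before, this is a generic Douglas-Rachford iteration under our own assumptions (fixed step and relaxation, fixed iteration count, hypothetical names), not the exact recursion of Appendix C.

```python
import numpy as np

def stack_problem(ys, Xs):
    """Stack the subbands: ys[o] has K entries, Xs[o] has shape (..., L)."""
    y = np.concatenate([np.ravel(v) for v in ys]).astype(float)
    X = np.vstack([np.reshape(M, (-1, np.shape(M)[-1])) for M in Xs]).astype(float)
    return y, X

def weighted_l1_dr(ys, Xs, kappas, gamma=1.0, relax=1.0, n_iter=500):
    """Minimize sum_o kappa_o * ||y_o - X_o p||_1 over p (Problem (23))."""
    y, X = stack_problem(ys, Xs)
    # Per-sample thresholds gamma * kappa_o for the prox of gamma * f3.
    thresh = np.concatenate([np.full(np.size(v), gamma * k) for v, k in zip(ys, kappas)])
    X_pinv = np.linalg.pinv(X)
    t = y.copy()
    for _ in range(n_iter):
        z = X @ (X_pinv @ t)                 # prox of f4: projection onto U
        w = 2.0 * z - t - y
        prox_f3 = y + np.sign(w) * np.maximum(np.abs(w) - thresh, 0.0)
        t = t + relax * (prox_f3 - z)
    return X_pinv @ t

def weighted_l1_cost(p, ys, Xs, kappas):
    """Value of the weighted criterion (22) for the predictor p."""
    total = 0.0
    for v, M, k in zip(ys, Xs, kappas):
        residual = np.ravel(v) - np.reshape(M, (-1, np.shape(M)[-1])) @ p
        total += k * float(np.abs(residual).sum())
    return total
```

Here `ys`, `Xs`, and `kappas` are sequences indexed by the orientation, in the same order; setting all the weights to one gives the unweighted sum of the $\ell_1$-norms of the three detail subbands.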

5 Joint optimization method

5.1 Motivation

From Equations (20) and (21), it can be observed that $y_j^{(o,1)}$ and $x_j^{(o,1)}$, which are used to optimize $p_j^{(HH)}$, depend on the coefficients of the prediction filters $p_j^{(HL)}$ and $p_j^{(LH)}$. On the other hand, since $p_j^{(HL)}$ and $p_j^{(LH)}$ use $x_{j+1}^{(HH)}$ as a reference signal in the second and third prediction steps, their optimal values depend on the optimal prediction filter $p_j^{(HH)}$. Thus, we conclude that the optimization of the filters ($p_j^{(HL)}$, $p_j^{(LH)}$) depends on the optimization of the filter $p_j^{(HH)}$ and vice-versa.

A joint optimization method can therefore be proposed which iteratively optimizes the prediction filters $p_j^{(HH)}$, $p_j^{(HL)}$, and $p_j^{(LH)}$.

5.2 Proposed algorithms

While the optimization of the prediction filters $p_j^{(HL)}$ and $p_j^{(LH)}$ is simple, the optimization of the prediction filter $p_j^{(HH)}$ is less obvious. Indeed, if we examine the criterion $J_{w\ell_1}$, the immediate question that arises is: which values of the weighting parameters will produce the sparsest decomposition?

A simple solution consists of setting all the weights $\kappa_{j+1}^{(o)}$ to one. We are then considering the particular case of the unweighted $\ell_1$ criterion, which simply represents the sum of the $\ell_1$-norms of the three detail subbands $x_{j+1}^{(o)}$. In this case, the joint optimization problem is solved by applying the following simple iterative algorithm at each resolution level j.

5.2.1 First proposed algorithm

Initialize the iteration number it to 0.

  • Optimize separately the three prediction filters as explained in Section 3. The resulting filters will be denoted respectively by $p_j^{(HH,0)}$, $p_j^{(LH,0)}$, and $p_j^{(HL,0)}$.

  • Compute the resulting global unweighted prediction error (i.e., the sum of the $\ell_1$-norms of the three resulting detail subbands).

for it = 1, 2, 3, ...

  • Set $p_j^{(LH)} = p_j^{(LH,it-1)}$, $p_j^{(HL)} = p_j^{(HL,it-1)}$, and optimize $p_j^{(HH)}$ by minimizing $J_{w\ell_1}(p_j^{(HH)})$ (while setting $\kappa_{j+1}^{(o)} = 1$). Let $p_j^{(HH,it)}$ be the new optimal filter at iteration it.

  • Set $p_j^{(HH)} = p_j^{(HH,it)}$, and optimize $p_j^{(LH)}$ by minimizing $J_{\ell_1}(p_j^{(LH)})$. Let $p_j^{(LH,it)}$ be the new optimal filter.

  • Set $p_j^{(HH)} = p_j^{(HH,it)}$, and optimize $p_j^{(HL)}$ by minimizing $J_{\ell_1}(p_j^{(HL)})$. Let $p_j^{(HL,it)}$ be the new optimal filter.

Once the prediction filters are optimized, the update filter is finally optimized as explained in Section 2. In practice, however, once all the filters are optimized and the decomposition is performed, the generated wavelet subbands $x_{j+1}^{(o)}$ are weighted before the entropy encoding (using a JPEG2000 encoder) so that the distortion in the spatial domain is very close to the distortion in the wavelet domain.

More precisely, as we can see in Figure 4, each wavelet subband is multiplied by $w_{j+1}^{(o)}$, the weight corresponding to $x_{j+1}^{(o)}$. Generally, these weights are computed based on the wavelet filters used for the reconstruction process as indicated in [55, 56]. A simple weight computation procedure based on the following assumption can be used. As shown in [55], if the error signal in a subband (i.e., the quantization noise) is white and uncorrelated with the other subband errors, the reconstruction distortion in the spatial domain is a weighted sum of the distortions in each wavelet subband. Therefore, for each subband $x_{j+1}^{(o)}$, a white Gaussian noise of variance $(\sigma_{j+1}^{(o)})^2$ is first added while keeping the remaining subbands noiseless. Then, the resulting distortion in the spatial domain $\hat{D}_s$ is evaluated by taking the inverse transform. Finally, the corresponding subband weight can be estimated as follows:
$$w_{j+1}^{(o)} = \frac{\hat{D}_s \times 4^{j+1}}{(\sigma_{j+1}^{(o)})^2}.$$
(28)

Figure 4

Wavelet-based compression procedure involving a weighting prior to the encoding stage.

This weighting step is very important since standard bit allocation algorithms assume that the quadratic distortion in the wavelet domain is equal to that in the spatial domain, which is not true in the case of biorthogonal wavelets [55]. Therefore, the filters resulting from the first choice of $\kappa_{j+1}^{(o)}$ are suboptimal in the sense that they do not take into account the weighting procedure. For this reason, it has been observed in some experiments (see Section 6) that the basic optimization technique does not achieve the best coding performance.

Thus, a more judicious choice of $\kappa_{j+1}^{(o)}$ should take into account the weighting procedure applied to the wavelet coefficients before the entropy encoding process. Furthermore, if we consider the case $\beta_{j+1}^{(o)} = 1$ in the general formula of Equation (9), the differential entropy of $X_{j+1}^{(o)}$ multiplied by $w_{j+1}^{(o)}$ becomes:
$$\frac{1}{M_j N_j\, \alpha_{j+1}^{(o)} \ln(2)} \sum_{m=1}^{M_j} \sum_{n=1}^{N_j} \left| x_{j+1}^{(o)}(m,n) \right| + \log_2\!\big( 2\, \alpha_{j+1}^{(o)} w_{j+1}^{(o)} \big)$$
(29)
where $\alpha_{j+1}^{(o)}$ can be estimated by using a classical maximum likelihood estimate. Thus, it can be observed from Equation (29) that the first term of the resulting entropy, which corresponds to a weighted $\ell_1$-norm of $x_{j+1}^{(o)}$, is inversely proportional to $\alpha_{j+1}^{(o)}$. Consequently, in order to obtain a criterion (Equation 16) that results in a good approximation of the entropy (29), a more reasonable choice of $\kappa_{j+1}^{(o)}$ is as follows:
$$\kappa_{j+1}^{(o)} = \frac{1}{\alpha_{j+1}^{(o)}}.$$
(30)
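
The estimation of $\alpha_{j+1}^{(o)}$ and the resulting weight of Equation (30) can be sketched as follows. This is a minimal illustration, assuming that the case $\beta_{j+1}^{(o)} = 1$ corresponds to the Laplacian density $\exp(-|x|/\alpha)/(2\alpha)$, for which the maximum likelihood estimate of $\alpha$ is the mean absolute value of the coefficients. The function names are hypothetical and the subband is assumed to be a NumPy array holding the $M_j \times N_j$ coefficients.

```python
import numpy as np

def laplacian_scale_mle(subband):
    """ML estimate of alpha for the assumed density exp(-|x|/alpha) / (2*alpha)."""
    alpha = float(np.mean(np.abs(subband)))
    if alpha == 0.0:
        raise ValueError("all-zero subband: alpha cannot be estimated")
    return alpha

def kappa_from_subband(subband):
    """Weight of Equation (30): kappa = 1 / alpha."""
    return 1.0 / laplacian_scale_mle(subband)

def weighted_subband_entropy(subband, w):
    """Evaluate Equation (29) (in bits) for a subband multiplied by the weight w."""
    x = np.asarray(subband, dtype=float)
    alpha = laplacian_scale_mle(x)
    # First term: l1-norm of the subband scaled by 1 / (M_j N_j alpha ln 2).
    l1_term = np.sum(np.abs(x)) / (x.size * alpha * np.log(2.0))
    return float(l1_term + np.log2(2.0 * alpha * w))

# Toy 2 x 2 subband, for illustration only.
x_toy = np.array([[1.0, -1.0], [2.0, -2.0]])
print(laplacian_scale_mle(x_toy), kappa_from_subband(x_toy))
```

With the maximum likelihood estimate plugged in, the first term of (29) reduces to $1/\ln(2)$, so the quantity that varies with the filters is carried by the logarithmic term through $\alpha_{j+1}^{(o)}$ and $w_{j+1}^{(o)}$.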

Since the resulting entropy of each subband uses weights which also depend on the prediction filters (as mentioned above), we propose an iterative algorithm that alternates between optimizing all the filters and redefining the weights. This algorithm, which is performed for each resolution level j, is as follows.

5.2.2 Second proposed algorithm

Initialize the iteration number it to 0.

  • Optimize separately the three prediction filters as explained in Section 3. The resulting filters will be denoted respectively by $p_j^{(HH,0)}$, $p_j^{(LH,0)}$, and $p_j^{(HL,0)}$.

  • Optimize the update filter (as explained in Section 2).

  • Compute the weights $w_{j+1}^{(o,0)}$ of each detail subband as well as the constant values $\kappa_{j+1}^{(o,0)}$.

for it = 1, 2, 3, ...

  • Set $p_j^{(LH)} = p_j^{(LH,it-1)}$, $p_j^{(HL)} = p_j^{(HL,it-1)}$, and optimize $p_j^{(HH)}$ by minimizing $J_{w\ell_1}(p_j^{(HH)})$. Let $p_j^{(HH,it)}$ be the new optimal filter.

  • Set $p_j^{(HH)} = p_j^{(HH,it)}$, and optimize $p_j^{(LH)}$ by minimizing $J_{\ell_1}(p_j^{(LH)})$. Let $p_j^{(LH,it)}$ be the new optimal filter.

  • Set $p_j^{(HH)} = p_j^{(HH,it)}$, and optimize $p_j^{(HL)}$ by minimizing $J_{\ell_1}(p_j^{(HL)})$. Let $p_j^{(HL,it)}$ be the new optimal filter.

  • Optimize the update filter (as explained in Section 2).

  • Compute the new weights $w_{j+1}^{(o,it)}$ as well as $\kappa_{j+1}^{(o,it)}$.
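
The control flow of the steps above can be sketched as a generic loop. This is only an outline under stated assumptions: every callable passed to `joint_optimization` is a hypothetical placeholder for the corresponding optimization step (Sections 2 to 4), and the fixed iteration count `n_iter` is an illustrative stopping rule, not one prescribed by the text.

```python
def joint_optimization(init, optimize_hh, optimize_lh, optimize_hl,
                       optimize_update, compute_weights, n_iter=7):
    """Alternate between filter optimization and weight computation.

    All callables are hypothetical placeholders supplied by the caller:
      init()                        -> (p_hh, p_lh, p_hl), separate optimization
      optimize_hh(p_lh, p_hl, kap)  -> p_hh, weighted l1 criterion
      optimize_lh(p_hh)             -> p_lh, l1 criterion
      optimize_hl(p_hh)             -> p_hl, l1 criterion
      optimize_update(p_hh, p_lh, p_hl)    -> update filter u
      compute_weights(p_hh, p_lh, p_hl, u) -> (w, kappa)
    """
    # Iteration 0: separate optimization, update filter, initial weights.
    p_hh, p_lh, p_hl = init()
    u = optimize_update(p_hh, p_lh, p_hl)
    w, kappa = compute_weights(p_hh, p_lh, p_hl, u)

    history = []
    for _ in range(n_iter):
        # HH filter first, with LH and HL frozen at their previous values.
        p_hh = optimize_hh(p_lh, p_hl, kappa)
        # LH and HL filters next, each using the freshly optimized HH filter.
        p_lh = optimize_lh(p_hh)
        p_hl = optimize_hl(p_hh)
        # Update filter, then new weights for the next iteration.
        u = optimize_update(p_hh, p_lh, p_hl)
        w, kappa = compute_weights(p_hh, p_lh, p_hl, u)
        history.append((p_hh, p_lh, p_hl, u, w, kappa))
    return history
```

If `compute_weights` always returns weights $\kappa_{j+1}^{(o)} = 1$, the prediction-filter steps of this loop coincide with those of the first proposed algorithm.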

Let us now make some observations concerning the convergence of the proposed algorithm. Since the goal of the second weighting procedure is to better approximate the entropy, we have computed, at the end of each iteration it, the differential entropy of the three resulting detail subbands. More precisely, the evaluated criterion, obtained from Equation (29) by setting $\alpha_{j+1}^{(o)} = 1 / \kappa_{j+1}^{(o)}$ and summing over the three detail subbands, is given by:
$$\sum_{o \in \{HL, LH, HH\}} \left( \frac{\kappa_{j+1}^{(o,it)}}{M_j N_j \ln(2)} \sum_{m=1}^{M_j} \sum_{n=1}^{N_j} \left| x_{j+1}^{(o)}(m,n) \right| + \log_2\!\left( \frac{2\, w_{j+1}^{(o,it)}}{\kappa_{j+1}^{(o,it)}} \right) \right).$$
(31)
Figure 5 illustrates the evolution of this criterion with respect to the iteration number of the algorithm. It can be noticed that the decrease of the criterion is mainly achieved during the early iterations (within about 7 iterations).
Figure 5

Convergence of the optimization algorithm w.r.t the iteration number it when it is performed at: (a) j = 1, (b) j = 2.

6 Experimental results

Simulations were carried out on two kinds of still images originally quantized over 8 bpp, which are either single views or stereoscopic ones. A large dataset composed of 50 still images$^{b}$ and 50 stereo images$^{c}$ has been considered. The gain related to the optimization of the NSLS operators, using different minimization criteria, was evaluated in these contexts. In order to show the benefits of the proposed $\ell_1$ optimization criterion, we provide the results for the following decompositions carried out over three resolution levels:

  • The first one is the LS corresponding to the 5/3 transform, also known as the (2,2) wavelet transform [7]. In the following, this method will be designated by NSLS(2,2).

  • The second method consists of optimizing the prediction and update filters as proposed in [20, 38]. More precisely, the prediction filters are optimized by minimizing the $\ell_2$-norm of the detail coefficients whereas the update filter is optimized by minimizing the reconstruction error. This optimization method will be designated by NSLS(2,2)-OPT-GM.

  • The third approach corresponds to our previous method presented recently in [37]. While the prediction filters are optimized in the same way as the second method, the update filter is optimized by minimizing the difference between the approximation signal and the decimated version of the output of an ideal low-pass filter. We emphasize here that the prediction filters are optimized separately. This method will be denoted by NSLS(2,2)-OPT-L2.

  • The fourth method modifies the optimization stage of the prediction filters by using the $\ell_1$-norm instead of the $\ell_2$-norm. The optimization of the update filter is similar to the technique used in the third method. In what follows, this method will be designated by NSLS(2,2)-OPT-L1.

  • The fifth method consists of jointly optimizing the prediction filters by using the proposed weighted $\ell_1$ minimization technique where the weights $\kappa_{j+1}^{(o)}$ are set to $1 / \alpha_{j+1}^{(o)}$. The optimization of the update filter is similar to the technique used in the third and fourth methods. This optimization method will be designated by NSLS(2,2)-OPT-WL1. We have also tested this optimization method when the weights $\kappa_{j+1}^{(o)}$ are set to 1. In this case, the method will be denoted by NSLS(2,2)-OPT-WL1 ($\kappa_{j+1}^{(o)} = 1$).

Figures 6 and 7 show the scalability in quality of the reconstruction procedure by providing the variations of the PSNR versus the bitrate for the images "castle" and "straw" using JPEG2000 as entropy codec. A more exhaustive evaluation was also performed by applying the different methods to 50 still images$^{b}$. The average PSNR per image is illustrated in Figure 8.
Figure 6

PSNR (in dB) versus the bitrate (bpp) after JPEG2000 progressive encoding for the "castle" image.

Figure 7

PSNR (in dB) versus the bitrate (bpp) after JPEG2000 progressive encoding for the "straw" image.

Figure 8

Average PSNR (in dB) computed over 50 still images versus the bitrate (in bpp) after JPEG2000 progressive encoding.

These plots show that NSLS(2,2)-OPT-L2 outperforms NSLS(2,2) by 0.1-0.5 dB. It can also be noticed that NSLS(2,2)-OPT-L2 and NSLS(2,2)-OPT-GM perform similarly in terms of reconstruction quality. An improvement of 0.1-0.3 dB is obtained by using the $\ell_1$ minimization technique instead of the $\ell_2$ one. Finally, the joint optimization technique (NSLS(2,2)-OPT-WL1) outperforms the separate optimization technique (NSLS(2,2)-OPT-L1), improving the PSNR by 0.1-0.2 dB. The gain becomes larger (up to 0.55 dB) when compared with NSLS(2,2)-OPT-L2. It is important to note that setting the weights $\kappa_{j+1}^{(o)}$ to 1 (NSLS(2,2)-OPT-WL1 ($\kappa_{j+1}^{(o)} = 1$)) can lead to a degradation of about 0.1-0.25 dB compared with NSLS(2,2)-OPT-WL1 on some images.

Figures 9 and 10 display the reconstructed images of "lena" and "einst". In addition to the PSNR and SSIM metrics, the quality of the reconstructed images is also compared in terms of VSNR (Visual Signal-to-Noise Ratio), which was found to be an efficient metric for quantifying the visual fidelity of natural images [57]: it is based on physical luminances and visual angle (rather than on digital pixel values and pixel-based dimensions) to accommodate different viewing conditions. It can be observed that the weighted $\ell_1$ minimization technique significantly improves the visual quality of reconstruction. The difference in VSNR (resp. PSNR) between NSLS(2,2)-OPT-L2 and NSLS(2,2)-OPT-WL1 ranges from 0.35 dB to 0.6 dB (resp. 0.25 dB to 0.3 dB). Comparing Figure 9c (resp. Figure 10c) with Figure 9d (resp. Figure 10d), the visual improvement achieved by our method can be mainly seen in the hat and face of Lena (resp. in Einstein's face).
Figure 9

Reconstructed image at 0.15 bpp using: (a) Original "lena" image, (b) NSLS(2,2), (c) NSLS(2,2)-OPT-L2, (d) NSLS(2,2)-OPT-WL1.

Figure 10

Reconstructed image at 0.1 bpp using: (a) Original "einst" image, (b) NSLS(2,2), (c) NSLS(2,2)-OPT-L2, (d) NSLS(2,2)-OPT-WL1.

The second part of the experiments is concerned with stereo images. Most of the existing studies in this field rely on disparity compensation techniques [58, 59]. The basic principle of this technique consists first of estimating the disparity map. Then, one image is considered as a reference image and the other is predicted in order to generate a prediction error referred to as a residual image. Finally, the disparity field, the reference image and the residual one are encoded [58, 60]. In this context, Moellenhoff and Maier [61] analyzed the characteristics of the residual image and proved that such images have properties different from natural images. This suggests that transforms that work well for natural images may not be as well-suited for residual images. For this reason, we also proposed to apply these optimization methods for encoding the reference image and the residual one. The resulting rate-distortion curves for the "white house" and "pentagon" stereo images are illustrated in Figures 11 and 12. A more exhaustive evaluation was also performed by applying the different methods to 50 stereo images$^{c}$. The average PSNR per image is illustrated in Figure 13. Figure 14 displays the reconstructed target image of the "pentagon" stereo pair. It can be observed that the proposed joint optimization method leads to an improvement of 0.35 dB (resp. 0.016) in VSNR (resp. SSIM) compared with the decomposition in which the prediction filters are optimized separately. For instance, it can be noticed that the edges of the Pentagon building as well as the roads are better reconstructed in Figure 14d.
Figure 11

PSNR (in dB) versus the bitrate (bpp) after JPEG2000 progressive encoding for the "white house" stereo images.

Figure 12

PSNR (in dB) versus the bitrate (bpp) after JPEG2000 progressive encoding for the "pentagon" stereo images.

Figure 13

Average PSNR (in dB) computed over 50 stereo images versus the bitrate (in bpp) after JPEG2000 progressive encoding.

Figure 14

Reconstructed target image at 0.15 bpp using: (a) Original target image for the "pentagon" stereo images, (b) NSLS(2,2), (c) NSLS(2,2)-OPT-L2, (d) NSLS(2,2)-OPT-WL1.

For completeness, the performance of the proposed method (NSLS(2,2)-OPT-WL1) has also been compared with the 9/7 transform retained for the lossy mode of JPEG2000 standard. Table 1 shows the performance of the latter methods in terms of PSNR, SSIM, and VSNR. Since the human eye cannot always distinguish the subjective image quality at middle and high bitrate, the results were restricted to the lower bitrate values.
Table 1

Performance of the proposed method vs the 9/7 transform (WL1 denotes NSLS(2,2)-OPT-WL1; PSNR and VSNR in dB)

| Image | Metric | WL1, 0.05 bpp | 9/7, 0.05 bpp | WL1, 0.1 bpp | 9/7, 0.1 bpp | WL1, 0.15 bpp | 9/7, 0.15 bpp | WL1, 0.2 bpp | 9/7, 0.2 bpp |
|---|---|---|---|---|---|---|---|---|---|
| elaine | PSNR | 27.85 | 27.75 | 30.25 | 30.31 | 31.23 | 31.35 | 31.76 | 31.92 |
| | SSIM | 0.669 | 0.659 | 0.716 | 0.715 | 0.739 | 0.739 | 0.754 | 0.756 |
| | VSNR | 18.44 | 18.09 | 23.10 | 23.05 | 25.60 | 25.50 | 27.28 | 27.42 |
| castle | PSNR | 25.10 | 25.09 | 27.08 | 27.18 | 28.36 | 28.51 | 29.51 | 29.58 |
| | SSIM | 0.725 | 0.712 | 0.790 | 0.780 | 0.825 | 0.821 | 0.855 | 0.851 |
| | VSNR | 17.54 | 17.22 | 21.55 | 21.10 | 23.74 | 23.40 | 25.80 | 25.32 |
| einst | PSNR | 27.51 | 27.58 | 29.12 | 29.24 | 29.92 | 30.12 | 30.50 | 30.70 |
| | SSIM | 0.603 | 0.601 | 0.654 | 0.655 | 0.687 | 0.689 | 0.710 | 0.715 |
| | VSNR | 15.33 | 15.25 | 18.62 | 18.71 | 20.37 | 20.47 | 21.59 | 21.94 |
| lena | PSNR | 26.70 | 26.68 | 29.59 | 29.56 | 31.25 | 31.47 | 32.70 | 32.90 |
| | SSIM | 0.747 | 0.734 | 0.818 | 0.808 | 0.851 | 0.850 | 0.871 | 0.873 |
| | VSNR | 15.94 | 15.73 | 20.56 | 20.18 | 24.06 | 23.95 | 26.12 | 26.15 |
| cameraman | PSNR | 26.51 | 26.43 | 29.81 | 30.33 | 31.84 | 32.63 | 33.61 | 34.44 |
| | SSIM | 0.783 | 0.774 | 0.847 | 0.842 | 0.887 | 0.892 | 0.914 | 0.915 |
| | VSNR | 16.74 | 16.34 | 21.73 | 21.66 | 24.94 | 25.70 | 27.75 | 28.34 |
| boat | PSNR | 24.65 | 24.55 | 26.82 | 26.86 | 28.43 | 28.54 | 29.52 | 29.74 |
| | SSIM | 0.675 | 0.661 | 0.753 | 0.746 | 0.806 | 0.802 | 0.837 | 0.836 |
| | VSNR | 13.41 | 13.03 | 17.14 | 16.89 | 20.24 | 19.76 | 22.19 | 21.89 |
| peppers | PSNR | 25.75 | 25.50 | 29.24 | 29.17 | 30.88 | 31.16 | 31.12 | 32.38 |
| | SSIM | 0.720 | 0.705 | 0.789 | 0.778 | 0.818 | 0.815 | 0.834 | 0.832 |
| | VSNR | 16.00 | 15.51 | 21.87 | 21.19 | 25.18 | 25.00 | 27.22 | 27.09 |
| plane | PSNR | 24.19 | 23.84 | 30.66 | 29.88 | 33.99 | 33.10 | 36.13 | 35.82 |
| | SSIM | 0.809 | 0.754 | 0.890 | 0.871 | 0.917 | 0.903 | 0.931 | 0.921 |
| | VSNR | 9.48 | 7.72 | 17.73 | 15.51 | 21.28 | 20.30 | 24.68 | 24.12 |
| average | PSNR | 24.88 | 24.72 | 27.67 | 27.73 | 29.24 | 29.46 | 30.45 | 30.65 |
| | SSIM | 0.647 | 0.633 | 0.727 | 0.720 | 0.773 | 0.771 | 0.803 | 0.802 |
| | VSNR | 14.50 | 13.98 | 18.90 | 18.62 | 21.77 | 21.71 | 23.90 | 23.85 |

The average evaluation was computed over 50 still images.

The values in bold have been used to identify the method achieving the best coding performance.

While the proposed method yields a lower PSNR than the 9/7 transform for some images, it can be noticed from Table 1 that better results are obtained in terms of perceptual quality. For instance, Figures 15 and 16 illustrate some reconstructed images. It can be observed that the proposed method (NSLS(2,2)-OPT-WL1) achieves a gain of about 0.2-0.4 dB (resp. 0.01-0.013) in terms of VSNR (resp. SSIM). Furthermore, Figures 17 and 18 display the reconstructed target image for the stereo image pairs "shrub" and "spot5". While NSLS(2,2)-OPT-WL1 and the 9/7 transform show similar visual quality for the "spot5" pair, the proposed method leads to a better quality of reconstruction than the 9/7 transform for the "shrub" stereo images.
Figure 15

Zoom applied on the reconstructed "lena" image at 0.05 bpp using: (a) 9/7 transform (b) NSLS(2,2)-OPT-WL1.

Figure 16

Zoom applied on the reconstructed "lena" image at 0.1 bpp using: (a) 9/7 transform (b) NSLS(2,2)-OPT-WL1.

Figure 17

Zoom applied on the reconstructed target image for the "shrub" stereo images at 0.1 bpp using: (a) 9/7 transform (b) NSLS(2,2)-OPT-WL1.

Figure 18

Zoom applied on the reconstructed target image for the "spot5" stereo images at 0.1 bpp using: (a) 9/7 transform (b) NSLS(2,2)-OPT-WL1.

Before concluding the article, let us study the complexity of the proposed sparsity criteria for the optimization of the prediction filters. Table 2 gives the number of iterations and the execution time for the $\ell_1$ and weighted $\ell_1$ minimization techniques when considering different image sizes. These results have been obtained with a Matlab implementation on an Intel Core 2 (2.93 GHz) architecture. As expected, the execution time increases with the image size. Furthermore, we note that the $\ell_1$ minimization technique is very fast whereas the weighted $\ell_1$ technique needs an additional time of about 0.3-2.6 seconds. This increase is due to the fact that the algorithm is reformulated in a three-fold product space as explained in Section 4.2. However, since the Douglas-Rachford algorithm in a product space has some blocks which can be implemented in parallel, the complexity can be reduced significantly (up to three times) with an appropriate implementation on a multicore architecture. These results, as well as the good compression performance in terms of reconstruction quality, confirm the effectiveness of the proposed sparsity criteria.
Table 2

Computation time (s) of the sparse optimization methods for the design of each prediction filter

| Criterion | Plane (256 × 256): it | Plane: time (s) | Girl (256 × 256): it | Girl: time (s) | Boat (512 × 512): it | Boat: time (s) | Cameraman (512 × 512): it | Cameraman: time (s) |
|---|---|---|---|---|---|---|---|---|
| $\ell_1$ criterion: $p_0^{(HL)}$ | 22 | 0.09 | 27 | 0.09 | 30 | 0.38 | 60 | 0.81 |
| $\ell_1$ criterion: $p_0^{(LH)}$ | 55 | 0.15 | 28 | 0.09 | 31 | 0.39 | 100 | 1.13 |
| weighted $\ell_1$ criterion: $p_0^{(HH)}$ | 30 | 0.42 | 35 | 0.49 | 49 | 3.08 | 30 | 2.01 |


7 Conclusion

In this article, we have studied different optimization techniques for the design of filters in an NSLS structure. A new criterion has been presented for the optimization of the prediction filters in this context. The idea consists of jointly optimizing these filters by iteratively minimizing a weighted $\ell_1$ criterion. Experimental results carried out on still images and stereo image pairs have illustrated the benefits which can be drawn from the proposed optimization technique. In future work, we plan to extend this optimization method to LS with more than two stages, such as the P-U-P and P-U-P-U structures.

Appendix

A Some background on convex optimization

The main definitions which will be useful to understand our optimization algorithms are briefly summarized below:

  • $\mathbb{R}^K$ is the usual K-dimensional Euclidean space with norm $\|\cdot\|$.

  • The distance function to a nonempty set $C \subset \mathbb{R}^K$ is defined by
    $$\forall x \in \mathbb{R}^K, \quad d_C(x) = \inf_{y \in C} \|x - y\|.$$
  • The projection of $x \in \mathbb{R}^K$ onto a nonempty closed convex set $C \subset \mathbb{R}^K$ is the unique point $P_C(x) \in C$ such that $d_C(x) = \|x - P_C(x)\|$.

  • The indicator function of C is given by
    $$\forall x \in \mathbb{R}^K, \quad \iota_C(x) = \begin{cases} 0 & \text{if } x \in C, \\ +\infty & \text{otherwise.} \end{cases}$$
    (32)
  • $\Gamma_0(\mathbb{R}^K)$ is the class of functions from $\mathbb{R}^K$ to $]-\infty, +\infty]$ which are lower semi-continuous, convex, and not identically equal to $+\infty$.

  • The proximity operator of $f \in \Gamma_0(\mathbb{R}^K)$ is $\mathrm{prox}_f : \mathbb{R}^K \to \mathbb{R}^K : x \mapsto \arg\min_{y \in \mathbb{R}^K} f(y) + \frac{1}{2} \|x - y\|^2$. It is important to note that the proximity operator generalizes the notion of a projection operator onto a closed convex set C in the sense that $\mathrm{prox}_{\iota_C} = P_C$, and it moreover possesses most of its attractive properties [49] that make it particularly well-suited for designing iterative minimization algorithms.
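
The definitions above can be checked numerically in one dimension. The following sketch is an illustration only: it evaluates the proximity operator by brute force on a grid (a hypothetical helper, accurate only up to the grid step) and compares it with two closed forms, the soft-thresholding rule for $\gamma|\cdot|$ and the projection for the indicator function of an interval.

```python
import numpy as np

def soft_threshold(x, gamma):
    """Closed-form proximity operator of gamma * |.| (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - gamma, 0.0)

def prox_brute_force(f, x, grid):
    """Minimize f(y) + 0.5 * (x - y)**2 over a 1-D grid of candidate values y."""
    cost = f(grid) + 0.5 * (x - grid) ** 2
    return float(grid[np.argmin(cost)])

grid = np.linspace(-5.0, 5.0, 10001)  # grid step of 0.001
gamma = 0.5

# prox of gamma * |.| at x = 2 versus the soft-thresholding rule.
approx_l1 = prox_brute_force(lambda y: gamma * np.abs(y), 2.0, grid)
exact_l1 = float(soft_threshold(2.0, gamma))

# prox of the indicator function of C = [-1, 1] at x = 2 versus the projection.
indicator = lambda y: np.where((y >= -1.0) & (y <= 1.0), 0.0, np.inf)
approx_proj = prox_brute_force(indicator, 2.0, grid)
exact_proj = float(np.clip(2.0, -1.0, 1.0))

print(approx_l1, exact_l1, approx_proj, exact_proj)
```

The two pairs of values agree up to the grid resolution, which illustrates the statement $\mathrm{prox}_{\iota_C} = P_C$ on a simple convex set.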

B The Douglas Rachford algorithm

The solution of Problem (13) (the minimization of the sum of the two functions $f_1$ and $f_2$) is obtained by the following iterative algorithm:
$$\text{Set } t_{j,0}^{(o)} \in \mathbb{R}^{K_j},\ \gamma > 0,\ \lambda \in\, ]0, 2[, \text{ and, for } k = 0, 1, 2, \ldots \quad \begin{cases} z_{j,k}^{(o)} = \mathrm{prox}_{\gamma f_2}\, t_{j,k}^{(o)} \\ t_{j,k+1}^{(o)} = t_{j,k}^{(o)} + \lambda \left( \mathrm{prox}_{\gamma f_1}\big( 2 z_{j,k}^{(o)} - t_{j,k}^{(o)} \big) - z_{j,k}^{(o)} \right). \end{cases}$$
(33)
An important feature of this algorithm is that it proceeds by splitting, in the sense that the functions $f_1$ and $f_2$ are dealt with in separate steps: in the first step, only function $f_2$ is required to obtain $z_{j,k}^{(o)}$ and, in the second step, only function $f_1$ is involved to obtain $t_{j,k+1}^{(o)}$. Furthermore, it can be seen that the algorithm requires computing two proximity operators, $\mathrm{prox}_{\gamma f_1}$ and $\mathrm{prox}_{\gamma f_2}$, at each iteration. One can find in [46] closed-form expressions of the proximity operator of various functions in $\Gamma_0(\mathbb{R})$. In our case, the proximity operator of $\gamma f_1$ is given by the soft-thresholding rule:
$$\forall t_{j,k}^{(o)} \in \mathbb{R}^{K_j}, \quad \mathrm{prox}_{\gamma f_1}\big( t_{j,k}^{(o)} \big) = \big( \pi_{j,k}^{(o)}(m,n) \big)_{1 \le m \le M_j,\ 1 \le n \le N_j}$$
(34)
where $\pi_{j,k}^{(o)}(m,n) = \mathrm{soft}_{[-\gamma, \gamma]}\big( t_{j,k}^{(o)}(m,n) - x_{i,j}(m,n) \big) + x_{i,j}(m,n)$ and
$$\forall \alpha \in \mathbb{R}, \quad \mathrm{soft}_{[-\gamma, \gamma]}(\alpha) = \begin{cases} \mathrm{sign}(\alpha) (|\alpha| - \gamma) & \text{if } |\alpha| > \gamma, \\ 0 & \text{otherwise.} \end{cases}$$
(35)
Concerning $\gamma f_2$, it is easy to check that its proximity operator is expressed as:
$$\forall t_{j,k}^{(o)} \in \mathbb{R}^{K_j}, \quad \mathrm{prox}_{\gamma f_2}\big( t_{j,k}^{(o)} \big) = P_V\big( t_{j,k}^{(o)} \big) = \big( \hat{z}_{j,k}^{(o)}(m,n) \big)_{1 \le m \le M_j,\ 1 \le n \le N_j} = \big( (p_{j,k}^{(o)})^{\top} \tilde{x}_j^{(o)}(m,n) \big)_{1 \le m \le M_j,\ 1 \le n \le N_j}$$
(36)

where $p_{j,k}^{(o)} = \Big( \sum_{m,n} \tilde{x}_j^{(o)}(m,n) \big( \tilde{x}_j^{(o)}(m,n) \big)^{\top} \Big)^{-1} \sum_{m,n} \tilde{x}_j^{(o)}(m,n)\, t_{j,k}^{(o)}(m,n)$.

Finally, it is important to note that it has been shown (see [62] and references therein) that every sequence $(z_{j,k}^{(o)})_{k}$ generated by the Douglas-Rachford algorithm (33) converges to a solution to problem (13) provided that the parameters $\gamma$ and $\lambda$ are fixed as indicated.
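
Iteration (33), together with the proximity operators (34) and (36), can be sketched in a few lines. This is a minimal illustration under stated assumptions: $f_1$ is taken to be the $\ell_1$ distance to the reference signal (as suggested by Equation (34)) and $f_2$ the indicator function of the subspace V of predictable signals; the function name, the default values of `gamma`, `lam` and `n_iter`, and the data layout (one row of `X` per pixel, holding $\tilde{x}_j^{(o)}(m,n)^{\top}$) are illustrative choices, not those of the authors.

```python
import numpy as np

def soft_threshold(v, gamma):
    """Soft-thresholding rule of Equation (35), applied entrywise."""
    return np.sign(v) * np.maximum(np.abs(v) - gamma, 0.0)

def l1_prediction_filter(X, x_ref, gamma=1.0, lam=1.5, n_iter=200):
    """Douglas-Rachford sketch for min_p sum |x_ref - X p|.

    X     : (K, L) array, one row per pixel (predictor samples)
    x_ref : (K,) array, samples to be predicted
    """
    X = np.asarray(X, dtype=float)
    x_ref = np.asarray(x_ref, dtype=float)
    t = x_ref.copy()                                  # t_0
    for _ in range(n_iter):
        # prox of gamma*f2: projection onto V, i.e. a least-squares fit (36).
        p, *_ = np.linalg.lstsq(X, t, rcond=None)
        z = X @ p
        # prox of gamma*f1 at 2z - t: shifted soft-thresholding (34).
        r = 2.0 * z - t
        prox_f1 = x_ref + soft_threshold(r - x_ref, gamma)
        # Relaxed update of t with lam in ]0, 2[ (33).
        t = t + lam * (prox_f1 - z)
    p, *_ = np.linalg.lstsq(X, t, rcond=None)         # filter attached to the last t
    return p

# Toy usage: a signal that is exactly predictable is a fixed point of the iteration.
rng = np.random.default_rng(0)
X_toy = rng.standard_normal((50, 3))
p_true = np.array([0.5, -0.25, 1.0])
p_hat = l1_prediction_filter(X_toy, X_toy @ p_true, n_iter=20)
```

Solving the least-squares problem with `np.linalg.lstsq` is equivalent to the normal equations given after Equation (36) when the matrix $\sum_{m,n} \tilde{x}_j^{(o)}(m,n) (\tilde{x}_j^{(o)}(m,n))^{\top}$ is invertible.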

C The Douglas-Rachford algorithm in a product space

The solution of problem (26) (the minimization of the sum of the two functions $f_3$ and $f_4$) is obtained by the following iterative algorithm:
$$\text{Set } t_{j,0} \in \mathcal{H}_j,\ \gamma > 0,\ \lambda \in\, ]0, 2[, \text{ and, for } k = 0, 1, 2, \ldots \quad \begin{cases} z_{j,k} = \mathrm{prox}_{\gamma f_4}\, t_{j,k} \\ t_{j,k+1} = t_{j,k} + \lambda \left( \mathrm{prox}_{\gamma f_3}( 2 z_{j,k} - t_{j,k} ) - z_{j,k} \right). \end{cases}$$
(37)
Note that the above algorithm requires computing the proximity operators of two new functions, $\gamma f_3$ and $\gamma f_4$. Concerning the proximity operator of $\gamma f_3$, we have
$$\forall t_{j,k} = \big( t_{j,k}^{(HH,1)}, t_{j,k}^{(LH,1)}, t_{j,k}^{(HL,1)} \big) \in \mathcal{H}_j, \quad \mathrm{prox}_{\gamma f_3}(t_{j,k}) = \begin{pmatrix} \mathrm{soft}_{[-\gamma \kappa_{j+1}^{(HH)},\, \gamma \kappa_{j+1}^{(HH)}]}\big( t_{j,k}^{(HH,1)} \big) \\ \mathrm{soft}_{[-\gamma \kappa_{j+1}^{(LH)},\, \gamma \kappa_{j+1}^{(LH)}]}\big( t_{j,k}^{(LH,1)} \big) \\ \mathrm{soft}_{[-\gamma \kappa_{j+1}^{(HL)},\, \gamma \kappa_{j+1}^{(HL)}]}\big( t_{j,k}^{(HL,1)} \big) \end{pmatrix}$$
(38)
where
$$\forall o \in \{HH, LH, HL\}, \quad \mathrm{soft}_{[-\gamma \kappa_{j+1}^{(o)},\, \gamma \kappa_{j+1}^{(o)}]}\big( t_{j,k}^{(o,1)} \big) = \Big( \mathrm{soft}_{[-\gamma \kappa_{j+1}^{(o)},\, \gamma \kappa_{j+1}^{(o)}]}\big( t_{j,k}^{(o,1)}(m,n) \big) \Big)_{1 \le m \le M_j,\ 1 \le n \le N_j}.$$
Concerning $\gamma f_4$, its proximity operator is given by:
$$\mathrm{prox}_{\gamma f_4}(t_{j,k}) = P_U(t_{j,k}) = \big( \hat{Z}_{j,k}(m,n) \big)_{1 \le m \le M_j,\ 1 \le n \le N_j} = \big( X_j(m,n)^{\top} p_{j,k}^{(HH)} \big)_{1 \le m \le M_j,\ 1 \le n \le N_j}$$
(39)
where
$$p_{j,k}^{(HH)} = \Big( \sum_{m,n} X_j(m,n) X_j(m,n)^{\top} \Big)^{-1} \sum_{m,n} X_j(m,n)\, t_{j,k}(m,n).$$
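
The product-space iteration (37) can be sketched in the same way, by stacking the three orientations into one long vector. Several points are assumptions rather than statements from the text: the function name and default parameters are hypothetical; the inputs are dictionaries keyed by orientation; and the thresholding is applied to the residual $t - y$ and shifted back by $y$, mirroring Equation (34), which is one way of reading (38) when the reference signals $y_j^{(o,1)}$ are nonzero.

```python
import numpy as np

def weighted_l1_filter(X, y, kappa, gamma=1.0, lam=1.5, n_iter=200):
    """Douglas-Rachford sketch for min_p sum_o kappa[o] * sum |y[o] - X[o] p|.

    X     : dict, orientation -> (K, L) array with rows x_j^{(o,1)}(m,n)^T
    y     : dict, orientation -> (K,) array of reference samples
    kappa : dict, orientation -> positive weight
    """
    orients = ("HH", "LH", "HL")
    A = np.vstack([np.asarray(X[o], dtype=float) for o in orients])     # (3K, L)
    b = np.concatenate([np.asarray(y[o], dtype=float) for o in orients])
    # One threshold gamma * kappa[o] per stacked sample.
    thr = np.concatenate([np.full(len(y[o]), gamma * kappa[o]) for o in orients])

    t = b.copy()                                       # t_0 in the product space
    for _ in range(n_iter):
        # prox of gamma*f4: projection onto U, a single least-squares fit (39).
        p, *_ = np.linalg.lstsq(A, t, rcond=None)
        z = A @ p
        # prox of gamma*f3 at 2z - t: soft-thresholding with per-subband thresholds.
        r = 2.0 * z - t
        prox_f3 = b + np.sign(r - b) * np.maximum(np.abs(r - b) - thr, 0.0)
        # Relaxed update with lam in ]0, 2[ (37).
        t = t + lam * (prox_f3 - z)
    p, *_ = np.linalg.lstsq(A, t, rcond=None)
    return p

# Toy usage: three exactly predictable subbands sharing the same filter.
rng = np.random.default_rng(0)
p_true = np.array([0.5, -0.25, 1.0])
X_toy = {o: rng.standard_normal((40, 3)) for o in ("HH", "LH", "HL")}
y_toy = {o: X_toy[o] @ p_true for o in X_toy}
p_hat = weighted_l1_filter(X_toy, y_toy, {"HH": 1.0, "LH": 0.5, "HL": 2.0}, n_iter=50)
```

The thresholding step acts independently on the three blocks of the stacked vector, so those blocks could be processed in parallel; only the least-squares projection couples the three orientations.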

Endnotes

Declarations

Acknowledgements

Part of this study has been presented in [63].

Authors’ Affiliations

(1)
Télécom ParisTech
(2)
Ecole Supérieure des Communications de Tunis (SUP'COM-Tunis), Université de Carthage
(3)
Laboratoire d'Informatique Gaspard Monge and CNRS UMR 8049, Université Paris-Est

References

  1. Donoho DL, Johnstone IM: Ideal spatial adaptation by wavelet shrinkage. Biometrika 1994, 81(3):425-455. 10.1093/biomet/81.3.425MathSciNetView ArticleGoogle Scholar
  2. Antonini M, Barlaud M, Mathieu P, Daubechies I: Image coding using wavelet transform. IEEE Trans Image Process 1992, 1(2):205-220. 10.1109/83.136597View ArticleGoogle Scholar
  3. Woods JW: Subband Image Coding. Kluwer Academic Publishers, Norwell, MA, USA; 1990.Google Scholar
  4. Mallat S: A Wavelet Tour of Signal Processing. Academic Press, San Diego; 1998.Google Scholar
  5. Sweldens W: The lifting scheme: a custom-design construction of biorthogonal wavelets. Volume 3. Appl Comput Harmonic Anal; 1996:186-200.Google Scholar
  6. Arai K: Preliminary study on information lossy and lossless coding data compression for the archiving of ADEOS data. IEEE Trans Geosci Remote Sens 1990, 28: 732-734. 10.1109/TGRS.1990.573001View ArticleGoogle Scholar
  7. Calderbank AR, Daubechies I, Sweldens W, Yeo BL: Wavelet transforms that map integers to integers. Appl Comput Harmonic Anal 1998, 5(3):332-369. 10.1006/acha.1997.0238MathSciNetView ArticleGoogle Scholar
  8. Taubman D, Marcellin M: JPEG2000: Image Compression Fundamentals, Standards and Practice. Kluwer Academic Publishers, Norwell, MA, USA; 2001.Google Scholar
  9. Gerek ON, Çetin AE: Adaptive polyphase subband decomposition structures for image compression. IEEE Trans Image Process 2000, 9(10):1649-1660. 10.1109/83.869176View ArticleGoogle Scholar
  10. Gouze A, Antonini M, Barlaud M, Macq B: Optimized lifting scheme for two-dimensional quincunx sampling images. In IEEE International Conference on Image Processing. Volume 2. Thessa-loniki, Greece; 2001:253-258.Google Scholar
  11. Benazza-Benyahia A, Pesquet JC, Hattay J, Masmoudi H: Block-based adaptive vector lifting schemes for multichannel image coding. EURASIP Int J Image Video Process 2007, 10. (2007)Google Scholar
  12. Heijmans H, Piella G, Pesquet-Popescu B: Building adaptive 2D wavelet decompositions by update lifting. In IEEE International Conference on Image Processing. Volume 1. Rochester, New York, USA; 2002:397-400.View ArticleGoogle Scholar
  13. Chokchaitam S: A non-separable two-dimensional LWT for an image compression and its theoretical analysis. Thammasat Internat J Sci Technol 2004, 9: 35-43.Google Scholar
  14. Sun YK: A two-dimensional lifting scheme of integer wavelet transform for lossless image compression. In International Conference on Image Processing. Volume 1. Singapore; 2004:497-500.Google Scholar
  15. Chappelier V, Guillemot C: Oriented wavelet transform for image compression and denoising. IEEE Trans Image Process 2006, 15(10):2892-2903.View ArticleGoogle Scholar
  16. Gerek ON, Çetin AE: A 2D orientation-adaptive prediction filter in lifting structures for image coding. IEEE Trans Image Process 2006, 15: 106-111.View ArticleGoogle Scholar
  17. Ding W, Wu F, Wu X, Li S, Li H: Adaptive directional lifting-based wavelet transform for image coding. IEEE Trans Image Process 2007, 10(2):416-427.MathSciNetView ArticleGoogle Scholar
  18. Boulgouris NV, Strintzis MG: Reversible multiresolution image coding based on adaptive lifting. In IEEE International Conference on Image Processing. Volume 3. Kobe, Japan; 1999:546-550.Google Scholar
  19. Claypoole RL, Davis G, Sweldens W, Baraniuk RG: Nonlinear wavelet transforms for image coding. the 31st Asilomar Conference on Signals, Systems and Computers 1997, 1: 662-667.Google Scholar
  20. Gouze A, Antonini M, Barlaud M, Macq B: Design of signal-adapted multidimensional lifting schemes for lossy coding. IEEE Trans Image Process 2004, 13(12):1589-1603. 10.1109/TIP.2004.837556View ArticleGoogle Scholar
  21. Solé J, Salembier P: Generalized lifting prediction optimization applied to lossless image compression. IEEE Signal Process Lett 2007, 14(10):695-698.View ArticleGoogle Scholar
  22. Chang CL, Girod B: Direction Adaptive discrete wavelet transform for image compression. IEEE Trans Image Process 2007, 16(5):1289-1302.MathSciNetView ArticleGoogle Scholar
  23. Rolon JC, Salembier P: Generalized lifting for sparse image representation and coding. In Picture Coding Symposium. Lisbon, Portugal; 2007.Google Scholar
  24. Liu Y, Ngan KN: Weighted adaptive lifting-based wavelet transform for image coding. IEEE Trans Image Process 2008, 17(4):500-511.MathSciNetView ArticleGoogle Scholar
  25. Mallat S: Geometrical grouplets. Appl Comput Harmonic Anal 2009, 26(2):161-180. 10.1016/j.acha.2008.03.004MathSciNetView ArticleGoogle Scholar
  26. Candes JE, Donoho LD: New tight frames of curvelets and optimal representations of objects with piecewise C2singularities. Commun Pure Appl Math 2004, 57(2):219-266. 10.1002/cpa.10116MathSciNetView ArticleGoogle Scholar
  27. Do MN, Vetterli M: The contourlet transform: an efficient directional multiresolution image representation. IEEE Trans Image Process 2005, 14(12):2091-2106.MathSciNetView ArticleGoogle Scholar
  28. Chappelier V, Guillemot C, Marinkovic S: Image coding with iterated contourlet and wavelet transforms. In International Conference on Image Processing. Volume 5. Singapore; 2004:3157-3160.Google Scholar
  29. Pennec EL, Mallat S: Sparse geometric image representations with bandelets. IEEE Trans Image Process 2005, 14(4):423-438.MathSciNetView ArticleGoogle Scholar
  30. Kingsbury NG: Complex wavelets for shift invariant analysis and filtering of signals. J Appl Comput Harmonic Anal 2001, 10: 234-253. 10.1006/acha.2000.0343MathSciNetView ArticleGoogle Scholar
  31. Fowler JE, Boettcher JB, Pesquet-Popescu B: Image coding using a complex dual-tree wavelet transform. In the European Signal Processing Conference. Poznan, Poland; 2007:994-998.Google Scholar
  32. Boettcher JB, Fowler JE: Video coding using a complex wavelet transform and set partitioning. IEEE Signal Process Lett 2007, 14(9):633-636.
  33. Reeves TH, Kingsbury NG: Overcomplete image coding using iterative projection-based noise shaping. In International Conference on Image Processing. Volume 3. Rochester, NY; 2007:597-600.
  34. Natarajan BK: Sparse approximate solutions to linear systems. SIAM J Comput 1995, 24(2):227-234. 10.1137/S0097539792240406
  35. Donoho D: Compressed sensing. IEEE Trans Inf Theory 2006, 52(4):1289-1306.
  36. Kaaniche M, Pesquet JC, Benazza-Benyahia A, Pesquet-Popescu B: Two-dimensional non separable adaptive lifting scheme for still and stereo image coding. In IEEE International Conference on Acoustics, Speech and Signal Processing. Dallas, Texas, USA; 2010:1298-1301.
  37. Kaaniche M, Benazza-Benyahia A, Pesquet-Popescu B, Pesquet JC: Non separable lifting scheme with adaptive update step for still and stereo image coding. Elsevier Signal Processing: Special issue on Advances in Multirate Filter Bank Structures and Multiscale Representations 2011, 91(12):2767-2782.
  38. Pesquet-Popescu B: Two-stage adaptive filter bank. First filing date 1999/07/27, official filing number 99401919.8, European patent number EP1119911. 1999.
  39. LoPresto SM, Ramchandran K, Orchard MT: Image coding based on mixture modeling of wavelet coefficients and a fast estimation quantization framework. In Data Compression Conference. Snowbird, USA; 1997:221-230.
  40. Mallat S: A theory for multiresolution signal decomposition. IEEE Trans Pattern Anal Mach Intell 1989, 11: 674-693. 10.1109/34.192463
  41. Payan F, Antonini M: An efficient bit allocation for compressing normal meshes with an error-driven quantization. Comput Aid Geometr Design (Special Issue On Geometric Mesh Processing) 2005, 22(5):466-486.
  42. Do MN, Vetterli M: Wavelet-based texture retrieval using generalized Gaussian density and Kullback-Leibler distance. IEEE Trans Image Process 2002, 11(2):146-158. 10.1109/83.982822
  43. Gish H, Pierce JN: Asymptotically efficient quantizing. IEEE Trans Inf Theory 1969, 14(5):676-683.
  44. Petrisor T, Pesquet-Popescu B, Pesquet JC: A compressed sensing approach to frame-based multiple description coding. In IEEE International Conference on Acoustics, Speech and Signal Processing. Honolulu, HI; 2007:709-712.
  45. Chen S, Donoho D, Saunders M: Atomic decomposition by basis pursuit. SIAM Rev 2001, 43: 129-159. 10.1137/S003614450037906X
  46. Combettes PL, Pesquet JC: Proximal splitting methods in signal processing. In Fixed-Point Algorithms for Inverse Problems in Science and Engineering. Edited by: Bauschke HH, Burachik R, Combettes PL, Elser V, Luke DR, Wolkowicz H. Springer-Verlag, New York; 2010.
  47. Afonso M, Bioucas-Dias JM, Figueiredo MAT: An augmented Lagrangian approach to the constrained optimization formulation of imaging inverse problems. IEEE Trans Image Process 2011, 20(3):681-695.
  48. Eckstein J, Bertsekas DP: On the Douglas-Rachford splitting methods and the proximal point algorithm for maximal monotone operators. Math Programm 1992, 55: 293-318. 10.1007/BF01581204
  49. Moreau JJ: Proximité et dualité dans un espace hilbertien. Bulletin de la Société Mathématique de France 1965, 93: 273-288.
  50. Chaux C, Combettes P, Pesquet JC, Wajs V: A variational formulation for frame based inverse problems. Inverse Probl 2007, 23(4):1495-1518. 10.1088/0266-5611/23/4/008
  51. Combettes PL, Wajs VR: Signal recovery by proximal forward-backward splitting. Multiscale Model Simul 2005, 4(4):1168-1200. 10.1137/050626090
  52. Hiriart-Urruty JB, Lemaréchal C: Convex Analysis and Minimization Algorithms. Springer-Verlag, Berlin, London; 1993.
  53. Rockafellar RT: Convex Analysis. Volume 28. Princeton University Press, N.J; 1970.
  54. Briceno-Arias LM, Combettes PL, Pesquet JC, Pustelnik N: Proximal algorithms for multicomponent image recovery problems. J Math Imag Vision 2010, 41(1-2):3-22.
  55. Usevitch B: Optimal bit allocation for biorthogonal wavelet coding. In Data Compression Conference. Snowbird, USA; 1996:387-395.
  56. Parrilli S, Cagnazzo M, Pesquet-Popescu B: Distortion evaluation in transform domain for adaptive lifting schemes. In International Workshop on Multimedia Signal Processing. Cairns, Queensland, Australia; 2008:200-205.
  57. Chandler DM, Hemami SS: VSNR: A wavelet-based visual signal-to-noise ratio for natural images. IEEE Trans Image Process 2007, 16(9):2284-2298.
  58. Boulgouris NV, Strintzis MG: A family of wavelet-based stereo image coders. IEEE Trans Circuits Syst Video Technol 2002, 12(10):898-903.
  59. Kaaniche M, Benazza-Benyahia A, Pesquet-Popescu B, Pesquet JC: Vector lifting schemes for stereo image coding. IEEE Trans Image Process 2009, 18(11):2463-2475.
  60. Frajka T, Zeger K: Residual image coding for stereo image compression. Opt Eng 2003, 42: 182-189. 10.1117/1.1526492
  61. Moellenhoff MS, Maier MW: Characteristics of disparity-compensated stereo image pair residuals. Signal Process: Image Commun 1998, 14: 49-55.
  62. Combettes PL, Pesquet JC: A Douglas-Rachford splitting approach to nonsmooth convex variational signal recovery. IEEE J Sel Top Signal Process 2007, 1: 564-574.
  63. Kaaniche M, Pesquet JC, Benazza-Benyahia A, Pesquet-Popescu B: Schémas de lifting adaptatifs via des critères parcimonieux. In Colloque GRETSI. Bordeaux, France; 2011:4.

Copyright

© Kaaniche et al; licensee Springer. 2012

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.