4.1 Motivation
Up to now, each prediction filter $\mathbf{p}_j^{(o)}$, $o \in \{HL, LH, HH\}$, has been optimized separately by minimizing the $\ell_1$ norm of the corresponding detail signal $x_{j+1}^{(o)}$, which seems appropriate for determining $\mathbf{p}_j^{(LH)}$ and $\mathbf{p}_j^{(HL)}$. However, it can be noticed from Figure 1 that the diagonal detail signal $x_{j+1}^{(HH)}$ is also used in the second and third prediction steps to compute the vertical and horizontal detail signals, respectively. Therefore, the solution $\mathbf{p}_j^{(HH)}$ resulting from the previous optimization method may be suboptimal. For this reason, we propose to optimize the prediction filter $\mathbf{p}_j^{(HH)}$ by minimizing the global prediction error, as described in detail in the next section.
4.2 Optimization of the prediction filter $\mathbf{p}_j^{(HH)}$
More precisely, instead of minimizing the $\ell_1$ norm of $x_{j+1}^{(HH)}$ alone, the filter $\mathbf{p}_j^{(HH)}$ will be optimized by minimizing the sum of the $\ell_1$ norms of the three detail subbands $x_{j+1}^{(o)}$. To this end, we consider the minimization of the following weighted $\ell_1$ criterion:
\mathcal{J}_{w\ell_1}\left(\mathbf{p}_j^{(HH)}\right) = \sum_{o \in \{HL,LH,HH\}} \sum_{m,n} \kappa_{j+1}^{(o)} \left| x_{j+1}^{(o)}(m,n) \right|
(16)
where $\kappa_{j+1}^{(o)}$, $o \in \{HL, LH, HH\}$, are strictly positive weighting terms.
Before focusing on the method employed to minimize the proposed criterion, we first express $\mathcal{J}_{w\ell_1}$ as a function of the filter $\mathbf{p}_j^{(HH)}$ to be optimized.
Let $\left(x_{i,j}^{(1)}(m,n)\right)_{i \in \{0,1,2,3\}}$ denote the four outputs obtained from $\left(x_{i,j}(m,n)\right)_{i \in \{0,1,2,3\}}$ after the first prediction step (see Figure 1). Although $x_{i,j}^{(1)}(m,n) = x_{i,j}(m,n)$ for all $i \in \{0,1,2\}$, the use of the superscript will make the presentation below easier. Thus, $x_{j+1}^{(o)}$ can be expressed as:
\begin{array}{ll}
x_{j+1}^{(o)}(m,n) &= \displaystyle\sum_{i \in \{0,1,2,3\}} \sum_{k,l} h_{i,j}^{(o,1)}(k,l)\, x_{i,j}^{(1)}(m-k,\, n-l) \\
 &= \displaystyle\sum_{i \in \{0,1,2\}} \sum_{k,l} h_{i,j}^{(o,1)}(k,l)\, x_{i,j}^{(1)}(m-k,\, n-l) + \sum_{k,l} h_{3,j}^{(o,1)}(k,l)\, x_{3,j}^{(1)}(m-k,\, n-l)
\end{array}
(17)
where $h_{i,j}^{(o,1)}$ is a filter which depends on the prediction coefficients of $\mathbf{p}_j^{(LH)}$ and $\mathbf{p}_j^{(HL)}$.
Knowing that
x_{3,j}^{(1)}(m,n) = x_{3,j}(m,n) - \left(\mathbf{p}_j^{(HH)}\right)^{\mathsf{T}} \tilde{\mathbf{x}}_j^{(HH)}(m,n)
(18)
where $\tilde{\mathbf{x}}_j^{(HH)}(m,n) = \left(x_{i,j}(m-r,\, n-s)\right)_{(r,s) \in \mathcal{P}_j^{(HH)},\, i \in \{0,1,2\}}$ ($\mathcal{P}_j^{(HH)}$ being the support of the predictor $\mathbf{p}_j^{(HH)}$), we thus obtain, after some simple calculations,
\forall o \in \{HH, LH, HL\}, \quad x_{j+1}^{(o)}(m,n) = y_j^{(o,1)}(m,n) - \left(\mathbf{p}_j^{(HH)}\right)^{\mathsf{T}} \mathbf{x}_j^{(o,1)}(m,n)
(19)
where
y_j^{(o,1)}(m,n) = \sum_{i \in \{0,1,2\}} \sum_{k,l} h_{i,j}^{(o,1)}(k,l)\, x_{i,j}^{(1)}(m-k,\, n-l) + \sum_{k,l} h_{3,j}^{(o,1)}(k,l)\, x_{3,j}(m-k,\, n-l),
(20)
\mathbf{x}_j^{(o,1)}(m,n) = \left( \sum_{k,l} h_{3,j}^{(o,1)}(k,l)\, x_{i,j}(m-k-r,\, n-l-s) \right)_{(r,s) \in \mathcal{P}_j^{(HH)},\, i \in \{0,1,2\}}.
(21)
Consequently, the proposed weighted $\ell_1$ criterion (Equation (16)) can be expressed as:
\mathcal{J}_{w\ell_1}\left(\mathbf{p}_j^{(HH)}\right) = \sum_{o \in \{HL,LH,HH\}} \sum_{m,n} \kappa_{j+1}^{(o)} \left| y_j^{(o,1)}(m,n) - \left(\mathbf{p}_j^{(HH)}\right)^{\mathsf{T}} \mathbf{x}_j^{(o,1)}(m,n) \right|.
(22)
It is worth noting that, in practice, the determination of $y_j^{(o,1)}(m,n)$ and $\mathbf{x}_j^{(o,1)}(m,n)$ does not require finding the explicit expressions of $h_{i,j}^{(o,1)}$; these signals can be determined numerically as follows:

The first term (resp. the second one) in the expression of $y_j^{(o,1)}(m,n)$ in Equation (20) can be obtained by computing $x_{j+1}^{(o)}(m,n)$ from the components $\left(x_{i,j}^{(1)}(m,n)\right)_{i \in \{0,1,2,3\}}$ while setting $x_{3,j}^{(1)}(m,n) = 0$ (resp. while setting $x_{i,j}^{(1)}(m,n) = 0$ for $i \in \{0,1,2\}$ and $x_{3,j}^{(1)}(m,n) = x_{3,j}(m,n)$).

The vector $\mathbf{x}_j^{(o,1)}(m,n)$ in Equation (21) can be obtained as follows. For each $i \in \{0,1,2\}$, the computation of its component $\sum_{k,l} h_{3,j}^{(o,1)}(k,l)\, x_{i,j}(m-k,\, n-l)$ requires computing $x_{j+1}^{(o)}(m,n)$ after setting $x_{3,j}^{(1)}(m,n) = x_{i,j}(m,n)$ and $x_{i',j}^{(1)}(m,n) = 0$ for $i' \in \{0,1,2\}$. The result of this operation then has to be considered for the different shift values $(r,s)$ (as can be seen in Equation (21)).
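As a sanity check, the numerical procedure above can be sketched on a 1-D toy analogue, in which the role of the detail computation (Equation (17)) is played by a small linear operator built from hypothetical filters $h_i$; the filters, signal length, and predictor support below are illustrative assumptions, not the paper's actual transform. By linearity, zeroing selected inputs isolates each term of Equations (20) and (21), and Equation (19) is then verified numerically:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 64                                    # toy signal length (assumption)

def cconv(h, x):
    """Circular convolution: (h * x)[n] = sum_k h[k] x[n - k]."""
    return sum(hk * np.roll(x, k) for k, hk in enumerate(h))

# Hypothetical filters h_{i,j}^{(o,1)} for one subband o, and input signals.
h = [rng.standard_normal(3) for _ in range(4)]
x = [rng.standard_normal(N) for _ in range(4)]        # x_{0,j}, ..., x_{3,j}

def detail(x0, x1, x2, x3_slot):
    """Linear map playing the role of Equation (17)."""
    return (cconv(h[0], x0) + cconv(h[1], x1)
            + cconv(h[2], x2) + cconv(h[3], x3_slot))

# Predictor p_j^{(HH)} with toy support P = {0, 1} on each of x_0, x_1, x_2.
P = [0, 1]
p = rng.standard_normal((3, len(P)))

# First prediction step (Equation (18)): subtract the prediction from x_3.
x3_1 = x[3] - sum(p[i, ri] * np.roll(x[i], r)
                  for i in range(3) for ri, r in enumerate(P))
d_direct = detail(x[0], x[1], x[2], x3_1)             # x_{j+1}^{(o)}

z = np.zeros(N)
# y_j^{(o,1)}: first term with the fourth input zeroed, second term with the
# first three inputs zeroed and x_{3,j} (not x_{3,j}^{(1)}) in the last slot.
y = detail(x[0], x[1], x[2], z) + detail(z, z, z, x[3])

# Components of x_j^{(o,1)}: shifted copies of x_i fed into the fourth slot.
xvec = {(i, r): detail(z, z, z, np.roll(x[i], r))
        for i in range(3) for r in P}

# Equation (19): x_{j+1}^{(o)} = y_j^{(o,1)} - p^T x_j^{(o,1)}.
d_recon = y - sum(p[i, ri] * xvec[(i, r)]
                  for i in range(3) for ri, r in enumerate(P))
assert np.allclose(d_direct, d_recon)
```

The agreement between `d_direct` and `d_recon` holds exactly (up to floating-point rounding) because every operation involved is linear in its inputs.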
Once the different terms involved in the proposed weighted criterion in Equation (22) are defined (the constant values $\kappa_{j+1}^{(o)}$ are assumed to be known), we now focus on its minimization. Indeed, unlike the previous criterion (Equation (11)), which consists of a single $\ell_1$ term, the proposed criterion is a sum of three $\ell_1$ terms. To minimize such a criterion (Equation (22)), one can still use the Douglas-Rachford algorithm through a formulation in a product space [46, 54].
4.2.1 Douglas-Rachford algorithm in a product space
Consider the $\ell_1$ minimization problem:
\underset{\mathbf{p}_j^{(HH)}}{\min} \sum_{o \in \{HL,LH,HH\}} \sum_{m,n} \kappa_{j+1}^{(o)} \left| y_j^{(o,1)}(m,n) - \left(\mathbf{p}_j^{(HH)}\right)^{\mathsf{T}} \mathbf{x}_j^{(o,1)}(m,n) \right|
(23)
where $\kappa_{j+1}^{(o)}$, $o \in \{HL, LH, HH\}$, are positive weights.
Since the Douglas-Rachford algorithm described above is designed for the sum of two functions, we can reformulate (23) in this form in the 3-fold product space $\mathbb{H}_j$:
\mathbb{H}_j = \mathbb{R}^{K_j} \times \mathbb{R}^{K_j} \times \mathbb{R}^{K_j}.
(24)
If we define the vector subspace $U$ as
\begin{array}{l}
U = \Big\{ \mathbf{Z}_j = \big( \mathbf{z}_j^{(HH,1)},\, \mathbf{z}_j^{(LH,1)},\, \mathbf{z}_j^{(HL,1)} \big)^{\mathsf{T}} \in \mathbb{H}_j \;\Big|\; \exists\, \mathbf{p}_j^{(HH)} \in \mathbb{R}^{L} \ \text{such that}\ \forall o \in \{HH, LH, HL\}, \\
\quad \forall (m,n) \in \{1,\dots,M_j\} \times \{1,\dots,N_j\}, \quad z_j^{(o,1)}(m,n) = \big(\mathbf{p}_j^{(HH)}\big)^{\mathsf{T}} \mathbf{x}_j^{(o,1)}(m,n) \Big\}, \\
\text{i.e.,}\quad \mathbf{Z}_j(m,n) = \mathbf{X}_j(m,n)^{\mathsf{T}}\, \mathbf{p}_j^{(HH)} \quad \text{with} \quad \mathbf{X}_j(m,n) = \big( \mathbf{x}_j^{(HH,1)}(m,n),\, \mathbf{x}_j^{(LH,1)}(m,n),\, \mathbf{x}_j^{(HL,1)}(m,n) \big),
\end{array}
(25)
the minimization problem (Equation 23) is equivalent to
\underset{{\mathbf{z}}_{j}\in {\mathbb{H}}_{j}}{\text{min}}{f}_{3}\left({\mathbf{z}}_{j}\right)+{f}_{4}\left({\mathbf{z}}_{j}\right)
(26)
where
\begin{array}{ll}
f_3(\mathbf{z}_j) &= \displaystyle\sum_{o \in \{HL,LH,HH\}} \sum_{m,n} \kappa_{j+1}^{(o)} \left| y_j^{(o,1)}(m,n) - z_j^{(o,1)}(m,n) \right| \\
f_4(\mathbf{z}_j) &= \iota_U(\mathbf{z}_j).
\end{array}
(27)
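Both proximity operators involved in Equation (27) admit simple closed forms; the following is a standard computation, restated here in the present notation as a convenience. Since $f_3$ is separable, its proximity operator acts componentwise: for a scalar $v$, a step size $\gamma > 0$, and writing $y = y_j^{(o,1)}(m,n)$ and $\kappa = \kappa_{j+1}^{(o)}$, the change of variable $u = v - y$ reduces the problem to the prox of $\gamma\kappa|\cdot|$, yielding a soft-thresholding centered at $y$:

\operatorname{prox}_{\gamma \kappa \left| y - \cdot \right|}(v) = y + \operatorname{sign}(v - y)\, \max\left\{ \left| v - y \right| - \gamma \kappa,\; 0 \right\},

while the proximity operator of $f_4 = \iota_U$ is the projection onto $U$, i.e., the least-squares fit of a single vector $\mathbf{p}_j^{(HH)}$ to the components of $\mathbf{z}_j$.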
We are thus back to a problem involving two functions in a larger space, namely the product space $\mathbb{H}_j$, so the Douglas-Rachford algorithm can be applied to solve our minimization problem (see Appendix C). Finally, once the prediction filter $\mathbf{p}_j^{(HH)}$ has been optimized and fixed, the other prediction filters $\mathbf{p}_j^{(HL)}$ and $\mathbf{p}_j^{(LH)}$ can be optimized separately by minimizing $\mathcal{J}_{\ell_1}\left(\mathbf{p}_j^{(HL)}\right)$ and $\mathcal{J}_{\ell_1}\left(\mathbf{p}_j^{(LH)}\right)$, as explained in Section 3. This is justified by the fact that the inputs of the filter $\mathbf{p}_j^{(HL)}$ (resp. $\mathbf{p}_j^{(LH)}$) are independent of the output of the filter $\mathbf{p}_j^{(LH)}$ (resp. $\mathbf{p}_j^{(HL)}$).
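To make the product-space mechanics concrete, the following sketch runs a Douglas-Rachford iteration on a small synthetic instance of problem (23); the dimensions, weights, and data are illustrative assumptions, the projection onto $U$ is computed by least squares, and the prox of $f_3$ is a soft-thresholding centered at $y_j^{(o,1)}$. This is a minimal sketch of the technique, not the implementation detailed in Appendix C.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance (assumptions): L predictor taps, N positions per subband,
# three subbands o in {HL, LH, HH} with weights kappa.
L, N = 4, 200
p_true = rng.standard_normal(L)
X = rng.standard_normal((3, N, L))               # vectors x_j^{(o,1)}(m,n)
Y = np.einsum('onl,l->on', X, p_true) + 0.05 * rng.standard_normal((3, N))
kappa = np.array([1.0, 1.0, 2.0])

A = X.reshape(3 * N, L)                          # stacked linear operator
y = Y.reshape(-1)
w = np.repeat(kappa, N)                          # per-component weights

def prox_f3(v, gamma):
    """prox of v -> sum_i w_i |y_i - v_i|: soft-thresholding centered at y."""
    u = v - y
    return y + np.sign(u) * np.maximum(np.abs(u) - gamma * w, 0.0)

def proj_U(v):
    """Projection onto U: least-squares fit of one p, then z = A p."""
    p = np.linalg.lstsq(A, v, rcond=None)[0]
    return A @ p, p

# Douglas-Rachford iteration for f3 + iota_U in the product space.
gamma, t = 1.0, np.zeros(3 * N)
for _ in range(500):
    z, p_hat = proj_U(t)                         # prox of f4 = iota_U
    t = t + prox_f3(2.0 * z - t, gamma) - z
```

After the loop, `p_hat` approaches the filter that generated the data; the subspace constraint is what couples the three subband blocks through the single unknown $\mathbf{p}_j^{(HH)}$, exactly as in the definition of $U$.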