Noisy image magnification with total variation regularization and order-changed dictionary learning

Xu, Jian; Chang, Zhiguo; Fan, Jiulun; Zhao, Xiaoqiang; Wu, Xiaomin; Wang, Yanzi

doi:10.1186/s13634-015-0225-y

Research
Open access
Published: 06 May 2015

EURASIP Journal on Advances in Signal Processing volume 2015, Article number: 41 (2015)

Abstract

Noisy low-resolution (LR) images are common in real applications, but many existing image magnification algorithms cannot produce good results from a noisy LR image. We propose a two-step image magnification algorithm to solve this problem. The proposed algorithm combines the advantages of regularization-based and learning-based methods. The first step is based on total variation (TV) regularization, and the second step is based on sparse representation. In the first step, we add a constraint to the TV regularization model so that it magnifies the LR image and suppresses its noise at the same time. In the second step, we propose an order-changed dictionary training algorithm to train dictionaries that are dominated by texture details. Experimental results demonstrate that the proposed algorithm performs better than many other algorithms when the noise is not severe. The proposed algorithm also provides better visual quality on natural LR images.

1 Introduction

Image magnification aims to enlarge a low-resolution (LR) image while recovering some high-resolution (HR) details. Existing methods fall into three categories: methods based on up-scaling [1], methods based on reconstruction [2-5], and methods based on learning [6]. Some up-scaling methods, such as bilinear interpolation and bicubic interpolation (BI) [7], are popular because of their low computational complexity, but they tend to produce blurred edges and artifacts since they use the same kernel for all kinds of local textures. Reconstruction-based methods aim to reconstruct the HR image by inverting the degradation process [2], and they rely on how well the reconstruction model fits reality. In most cases, up-scaling and reconstruction methods need less memory than learning-based methods. However, it is difficult to fit complicated natural conditions with simple mathematical models, so these methods cannot recover many texture details. Learning-based methods are more flexible [6]. They use training images to learn the relationship between HR and LR images, and many existing works have demonstrated their effectiveness for high magnification factors.

There are two important aspects of learning-based algorithms: the feature extraction method and the learning model.

Many existing feature extraction methods can be used for the image magnification problem, including gradient features [6,8], Gabor features [9], fields of experts (FoE) features [10], and histograms of oriented gradients (HoG) [11]. To handle different texture features with different strategies, the input image can be separated into edge and texture components [12], shape and texture components [13,14], different texture regions [15], or different frequency bands [16,17].

The main idea of many existing learning-based models is to learn the relationship between the LR and HR images. Neighbor embedding (NE) is based on the assumption that LR and HR local patches have similar geometries in two distinct feature spaces [18]. However, finding neighbors among millions of data samples is very time-consuming for NE-based algorithms. Canonical correlation analysis (CCA) [19-21] assumes that corresponding HR and LR images have high inner-product similarity after a transformation. Compared with NE-based methods, CCA accomplishes the transformation with lower computational complexity. Sparse representation-based models are widely used in image processing [22] because of their good generalization ability. Yang et al. [6,8] proposed a classical model that transforms the HR and LR images into a unified subspace, in which the HR and LR images are assumed to have the same sparse representations. Coupled dictionary training is an important step in accomplishing this transformation. Yang et al. proposed joint learning [6] and coupled learning [23] algorithms to train coupled dictionaries. The joint learning algorithm concatenates the LR and HR patch pairs to convert the coupled dictionary training task into a single dictionary training task. However, reliable sparse representations are not guaranteed to be found in the test phase. Yang’s coupled learning algorithm [23] updates the LR and HR dictionaries alternately with steepest descent. Zeyde et al. [8] use a single dictionary training algorithm to train the LR dictionary and then generate the HR dictionary by solving a least squares problem. Xu et al. [24] alternately update the LR and HR dictionaries with K-singular value decomposition (K-SVD). Among these dictionary training algorithms, Zeyde’s algorithm has the lowest time complexity.
Since requiring the LR and HR sparse representations to be exactly the same is too strict a condition, some tools (such as neural networks [25] and linear transformations [26,27]) are employed to model the relationship between the two sparse representations. To accelerate sparse representation-based algorithms, Timofte et al. group the dictionary atoms [28] or the training samples [29] to reduce the time needed to calculate the sparse representations. Some algorithms provide excellent results on special image classes (such as faces [30] and buildings [31]). Besides the abovementioned tools, support vector regression (SVR) [32], kernel-based regression [33], deep convolutional neural networks [34], and fuzzy rule-based prediction [35] are also used to solve the image magnification problem.

In real applications, the obtained LR images often contain noise (for example, photos taken in low light or under strong interference). Since many existing algorithms do not handle noisy LR images well, we propose an algorithm to address this shortcoming. Its goal is to reconstruct a clear HR image from a noisy LR image. The proposed algorithm combines the advantages of the regularization-based method and the learning-based method: we first use the regularization-based method to suppress the noise and then use the learning-based method to recover the details. For brevity, we call the proposed method the total variation and order-changed dictionary training (TV-OCDT) algorithm.

Our contributions can be summarized as follows:

1) We propose a constraint for the total variation (TV) regularization-based image magnification model. The constraint helps to suppress the noise and recover sharp edges.
2) We propose an order-changed dictionary training algorithm to train the coupled dictionaries. The traditional dictionary training algorithm first trains the LR dictionary and then generates the HR dictionary from it. In contrast, we first train the HR dictionary and then generate the LR dictionary from it. This strategy changes the dominant content of the dictionaries so that the texture details can be recovered well. Experimental results show that the proposed algorithm is superior to the compared algorithms on noisy images.

The remainder of this paper is organized as follows. Section 2 describes the proposed algorithm. The experimental results are presented in Section 3. Section 4 concludes this paper.

2 The proposed algorithm

How should a noisy input LR image be handled? An obvious idea is to denoise the LR image first and then magnify it. This is difficult in practice, however, since the textures in the LR image are dense and incomplete. Therefore, we propose an algorithm to solve this problem. The framework of the proposed algorithm is shown in Figure 1. First, a TV regularization-based algorithm accomplishes magnification and denoising simultaneously. The details of the proposed TV regularization model are described in Section 2.1. The TV regularization damages some texture details, so we then use an OCDT algorithm to compensate for them. The details of this step are given in Section 2.2.

2.1 TV regularization with LR constraint

In real applications, we often obtain noisy LR images. If we directly apply a magnification algorithm to these images, the noise is magnified as well. Denoising first is not a good choice either: many existing denoising algorithms [36] work very well on HR images but cannot be applied effectively to LR images, because the textures are dense and incomplete (as shown in Figure 2).

To fit the recovered HR image to the input LR image $\mathbf{L}^{s}$, the well-known iterative back projection (IBP) algorithm [38] is widely used in image magnification. It can be executed without storing any auxiliary data (such as data samples or dictionaries) and has low computational complexity.

The model of IBP is as follows:

$$ {\mathbf{Z}}^{s,IBP} = \arg \mathop {\min }\limits_{\mathbf{Z}}{\left\|{{F(\mathbf{Z})} - {\mathbf{L}}^{s}} \right\|_{2}^{2}} $$

(1)

where $\mathbf{Z}^{s,IBP}$ is the HR image reconstructed by IBP and $F(\cdot)$ denotes down-sampling by BI.

Model (1) can be solved with the following iteration:

$$ {\mathbf{Z}}^{J+1}={\mathbf{Z}}^{J}+\lambda[ U({\mathbf{L}}^{s}- F({\mathbf{Z}}^{J}))] $$

(2)

where $\mathbf{Z}^{J}$ is the output of the Jth iteration and $U(\cdot)$ denotes up-sampling by BI.
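The update in Equation 2 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: block averaging and pixel replication stand in for the bicubic operators $F(\cdot)$ and $U(\cdot)$, and the names `downsample`, `upsample`, `ibp`, the factor `s`, the step `lam`, and the optional start image `Z0` are hypothetical choices of the sketch, not settings taken from the paper.

```python
import numpy as np

def downsample(Z, s):
    """Stand-in for F(.): average each s-by-s block (the paper uses BI)."""
    h, w = Z.shape
    return Z.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def upsample(L, s):
    """Stand-in for U(.): replicate each pixel s-by-s (the paper uses BI)."""
    return np.kron(L, np.ones((s, s)))

def ibp(L_s, s, lam=0.5, iters=20, Z0=None):
    """Iterate Z <- Z + lam * U(L_s - F(Z)) as in Equation 2."""
    Z = upsample(L_s, s) if Z0 is None else Z0.astype(float).copy()
    for _ in range(iters):
        # Back-project the LR residual onto the HR grid.
        Z = Z + lam * upsample(L_s - downsample(Z, s), s)
    return Z
```

With these stand-in operators the LR residual shrinks by a factor of (1 - lam) per iteration, which also makes it plain that any noise in the LR input is carried straight into the HR estimate.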

If the input is a noisy image, the IBP model propagates the noise to the HR image and may even magnify it (as shown in Figure 3). Therefore, constraints should be added to the IBP model so that it can suppress the noise.

The traditional TV regularization [39] assumes that HR images have a small TV norm. The reconstructed image is obtained from:

$$ {\mathbf{Z}}^{s,TV} = \arg \mathop {\min}\limits_{\mathbf{Z}}\{{\int{|G({\mathbf{Z}})|}+\frac{\lambda}{2}\left\|{F({\mathbf{Z})} - {\mathbf{L}}^{s}} \right\|_{2}^{2}}\} $$

(3)

where $\mathbf{Z}^{s,TV}$ is the HR image reconstructed by TV regularization, $\lambda$ is a Lagrange multiplier, and $G(\cdot)$ calculates the gradients. However, we found that the traditional TV regularization is not effective enough at suppressing the noise in image magnification (as shown in Figure 3).

To suppress the noise in LR images, we add another constraint. We propose the following model:

$$ \begin{aligned} {{\mathbf{Z}}^{s,TV}} &= \arg \mathop {\min }\limits_{{\mathbf{Z}}} \left\{ \int {{{\left| {G({\mathbf{Z}})} \right|}}}+ {\lambda_{1}}\int {{{\left| {G \left({F({\mathbf{Z}})} \right)} \right|}}} \right.\\ &\quad+\left. \frac{\lambda_{2}}{2}\left\| {F({\mathbf{Z})} - {\mathbf{L}}^{s}} \right\|_{2}^{2} \right\} \end{aligned} $$

(4)

where $\lambda_{1}$ and $\lambda_{2}$ are the Lagrange multipliers. The added constraint requires the LR image $F(\mathbf{Z})$ to have a small TV norm as well. It is inspired by the classical Rudin-Osher-Fatemi (ROF) TV denoising model [40].

Following [39], model (4) is solved as follows.

Define the following functions:

$$ \begin{aligned} &{\mathbf{\Phi}}_{1}({\mathbf{X}})={[ ({\mathbf{X}})_{x}^{2}+ ({\mathbf{X}})_{y}^{2}+\epsilon{\mathbf{1}}]\cdot^{\frac{3}{2}}}\\ &{\mathbf{\Phi}}_{2}({\mathbf{X}})=({\mathbf{X}})_{xx} \bullet(({\mathbf{X}})_{y}^{2}+\epsilon{\mathbf{1}})\\ &{\mathbf{\Phi}}_{3}({\mathbf{X}})=({\mathbf{X}})_{xy} \bullet({\mathbf{X}})_{x}\bullet({\mathbf{X}})_{y}\\ &{\mathbf{\Phi}}_{4}({\mathbf{X}})=({\mathbf{X}})_{yy} \bullet(({\mathbf{X}})_{x}^{2}+\epsilon{\mathbf{1}}) \end{aligned} $$

(5)

where $(\cdot)\cdot ^{\frac {3}{2}}$ raises every element of the matrix to the power $\frac {3}{2}$, $\mathbf{1}$ denotes a matrix of ones of the proper size, and ‘∙’ denotes element-wise multiplication. The gradient terms are generated with four gradient filters: ${\mathbf {f}}_{1}=[-1\ 0\ 1], {\mathbf {f}}_{2}={\mathbf {f}}_{1}^{T}, {\mathbf {f}}_{3}=[1\ 0\ -2\ 0\ 1], {\mathbf {f}}_{4}={\mathbf {f}}_{3}^{T} $, where ‘T’ denotes transposition.

$$\begin{array}{*{20}l} ({\mathbf{X}})_{x}&={{\mathbf{X}}}*{\mathbf{f}_{1}}\ \ \ \ \ ({\mathbf{X}})_{y}={{\mathbf{X}}}*{\mathbf{f}_{2}} \\ ({\mathbf{X}})_{xx}&={{\mathbf{X}}}*{\mathbf{f}_{3}}\ \ \ \ \ ({\mathbf{X}})_{yy}={{\mathbf{X}}}*{\mathbf{f}_{4}} \end{array} $$

(6)

where ‘∗’ denotes convolution.
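As a rough illustration of Equations 5 and 6, the derivative maps and the four $\mathbf{\Phi}$ terms can be computed as below. Several points are assumptions of this sketch rather than statements from the paper: the second-derivative filter is taken to be the zero-sum filter [1 0 -2 0 1], image borders are replicated, and the mixed term $(\mathbf{X})_{xy}$, which Equation 6 does not define, is assumed to be obtained by filtering with $\mathbf{f}_{1}$ and then $\mathbf{f}_{2}$. The helper names `conv1d`, `derivatives`, and `phis` are hypothetical.

```python
import numpy as np

F1 = np.array([-1.0, 0.0, 1.0])            # f1 (f2 is its transpose)
F3 = np.array([1.0, 0.0, -2.0, 0.0, 1.0])  # assumed f3 (f4 is its transpose)

def conv1d(X, f, axis):
    """Convolve X with the 1-D filter f along `axis`, replicating the borders."""
    r = len(f) // 2
    pad = [(0, 0), (0, 0)]
    pad[axis] = (r, r)
    P = np.pad(np.asarray(X, dtype=float), pad, mode="edge")
    n = X.shape[axis]
    out = np.zeros(X.shape, dtype=float)
    for k, w in enumerate(f):
        sl = slice(2 * r - k, 2 * r - k + n)   # flipped index = true convolution
        out += w * (P[:, sl] if axis == 1 else P[sl, :])
    return out

def derivatives(X):
    """Return (X_x, X_y, X_xx, X_yy, X_xy); x runs along columns."""
    Xx = conv1d(X, F1, axis=1)
    Xy = conv1d(X, F1, axis=0)
    Xxx = conv1d(X, F3, axis=1)
    Xyy = conv1d(X, F3, axis=0)
    Xxy = conv1d(Xx, F1, axis=0)   # assumption: mixed term = f1 followed by f2
    return Xx, Xy, Xxx, Xyy, Xxy

def phis(X, eps=1.0):
    """Element-wise Phi_1 ... Phi_4 of Equation 5."""
    Xx, Xy, Xxx, Xyy, Xxy = derivatives(X)
    phi1 = (Xx**2 + Xy**2 + eps) ** 1.5
    phi2 = Xxx * (Xy**2 + eps)
    phi3 = Xxy * Xx * Xy
    phi4 = Xyy * (Xx**2 + eps)
    return phi1, phi2, phi3, phi4
```

Because true convolution flips the filter, `Xx` and `Xy` come out as negated central differences; the sign cancels in every $\mathbf{\Phi}$ term, since the first derivatives appear only squared or as the product $(\mathbf{X})_{xy}(\mathbf{X})_{x}(\mathbf{X})_{y}$.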

The model (4) is solved by the following iterative formula:

$$ \begin{aligned} {\mathbf{Z}}^{J+1}&={\mathbf{Z}}^{J}+{\mathbf{A}}^{J}+\lambda_{1}{\mathbf{B}}^{J}+\lambda_{2}{\mathbf{C}}^{J}\\ {\mathbf{A}}^{J}&=({\mathbf{\Phi}}_{2}({\mathbf{Z}}^{J})-2{\mathbf{\Phi}}_{3}({\mathbf{Z}}^{J})+{\mathbf{\Phi}}_{4}({\mathbf{Z}}^{J}))\bullet/{\mathbf{\Phi}}_{1}({\mathbf{Z}}^{J})\\ {\mathbf{B}}^{J}&=U\left[\left({\mathbf{\Phi}}_{2}(F({\mathbf{Z}}^{J}))-2{\mathbf{\Phi}}_{3}(F({\mathbf{Z}}^{J}))\right.\right.\\ &\quad+\left.\left.{\mathbf{\Phi}}_{4}(F({\mathbf{Z}}^{J}))\right)\bullet/{\mathbf{\Phi}}_{1}(F({\mathbf{Z}}^{J}))\right]\\ {\mathbf{C}}^{J}&=U[{\mathbf{L}}^{s}- F({\mathbf{Z}}^{J})] \end{aligned} $$

(7)

where ‘∙/’ denotes element-wise division and $\epsilon$ is a positive parameter that avoids singularity; it is set to 1 according to [39].

Figure 3 compares the images before and after TV regularization. As shown, the proposed TV regularization constraint helps to suppress the noise and reconstruct sharp step edges. More experimental results in Section 3 further demonstrate the effect of the proposed TV model.
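One update of Equation 7 might look like the following NumPy sketch. It is an illustration under assumptions, not the authors' implementation: block averaging and pixel replication replace the bicubic $F(\cdot)$ and $U(\cdot)$, the second-derivative filter is assumed to be [1 0 -2 0 1], the mixed derivative is assumed to be two successive central differences, borders are replicated, and `tau` is a hypothetical step-size knob that is not in the paper (`tau = 1` reproduces Equation 7 as written).

```python
import numpy as np

def d1(X, axis):
    """Central difference X[n+1] - X[n-1] (filter f1 up to sign), edge-padded."""
    P = np.pad(X, [(1, 1) if a == axis else (0, 0) for a in (0, 1)], mode="edge")
    n = X.shape[axis]
    return np.take(P, np.arange(2, n + 2), axis) - np.take(P, np.arange(n), axis)

def d2(X, axis):
    """Second difference X[n+2] - 2 X[n] + X[n-2] (assumed filter f3)."""
    P = np.pad(X, [(2, 2) if a == axis else (0, 0) for a in (0, 1)], mode="edge")
    n = X.shape[axis]
    return (np.take(P, np.arange(4, n + 4), axis) - 2.0 * X
            + np.take(P, np.arange(n), axis))

def curvature(X, eps=1.0):
    """(Phi_2 - 2 Phi_3 + Phi_4) ./ Phi_1, the bracketed term of Equation 7."""
    Xx, Xy = d1(X, 1), d1(X, 0)
    Xxy = d1(Xx, 0)                        # assumption: mixed derivative
    phi1 = (Xx**2 + Xy**2 + eps) ** 1.5
    phi2 = d2(X, 1) * (Xy**2 + eps)
    phi3 = Xxy * Xx * Xy
    phi4 = d2(X, 0) * (Xx**2 + eps)
    return (phi2 - 2.0 * phi3 + phi4) / phi1

def downsample(Z, s):
    """Stand-in for F(.): s-by-s block average."""
    h, w = Z.shape
    return Z.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def upsample(L, s):
    """Stand-in for U(.): s-by-s pixel replication."""
    return np.kron(L, np.ones((s, s)))

def tv_step(Z, L_s, s, lam1=0.6, lam2=0.6, tau=1.0):
    """Z^{J+1} = Z^J + tau * (A^J + lam1 * B^J + lam2 * C^J)."""
    FZ = downsample(Z, s)
    A = curvature(Z)                       # TV term on the HR image
    B = upsample(curvature(FZ), s)         # TV term on the LR image F(Z)
    C = upsample(L_s - FZ, s)              # data-fidelity term
    return Z + tau * (A + lam1 * B + lam2 * C)
```

A short run could be written as `Z = upsample(L_s, 3)` followed by a few calls `Z = tv_step(Z, L_s, 3)`. The sketch does not establish whether `tau = 1` is stable for a given intensity range, so a smaller `tau` may be worth trying when experimenting.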

2.2 OCDT algorithm

Since most of the smooth regions and step edges have been recovered by the previous step, we use a sparse representation-based algorithm to recover the missing texture details. Most sparse representation-based image magnification methods work on patches, and we adopt the same strategy. As in Zeyde’s method [8], we produce four gradient maps of $\mathbf{Z}^{s,TV}$ with the four filters $({\mathbf {f}}_{1},{\mathbf {f}}_{2},{\mathbf {f}}_{3},{\mathbf {f}}_{4})$. These gradient maps are divided into patches, and the four gradient patches at a given position are concatenated into a $4p^{2}$-dimensional column vector to form the data set $\{{\mathbf {p}}^{s,d}_{i}\}^{N}_{i=1}$, where ${\mathbf {p}}^{s,d}_{i}$ is the feature vector of the ith patch and N is the number of patches. The difference between the desired HR image $\mathbf{Z}^{s}$ and $\mathbf{Z}^{s,TV}$ is calculated as:

$$ {\mathbf{Z}}^{s,e}={\mathbf{Z}}^{s}-{\mathbf{Z}}^{s,TV} $$

(8)

The data set $\left \{{\mathbf {p}}^{s,e}_{i}\right \}^{N}_{i=1}$ is generated by dividing $\mathbf{Z}^{s,e}$ into patches, where ${\mathbf {p}}^{s,e}_{i}$ and ${\mathbf {p}}^{s,d}_{i}$ are corresponding patches.
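The construction of the two training sets can be sketched as follows. The function names, the patch `step`, and the row-major vectorisation of each patch are assumptions made for illustration; the four gradient maps are taken as given inputs.

```python
import numpy as np

def extract_patches(img, p, step=1):
    """Return a (p*p, N) matrix whose columns are vectorised p-by-p patches."""
    h, w = img.shape
    cols = [img[i:i + p, j:j + p].ravel()
            for i in range(0, h - p + 1, step)
            for j in range(0, w - p + 1, step)]
    return np.stack(cols, axis=1)

def build_training_sets(grad_maps, Z_true, Z_tv, p, step=1):
    """Build P_d (4*p*p, N) from four gradient maps and P_e (p*p, N) from Z^{s,e}."""
    # Stack the four gradient patches at each position into one feature vector.
    P_d = np.concatenate([extract_patches(g, p, step) for g in grad_maps], axis=0)
    # Equation 8: the residual that the TV step failed to recover.
    P_e = extract_patches(Z_true - Z_tv, p, step)
    return P_d, P_e
```

Column i of `P_d` and column i of `P_e` come from the same image position, which is what makes the two data sets a set of corresponding patch pairs.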

The classical coupled dictionary training model [23] is:

$$ \begin{aligned} &\min\limits_{{\mathbf{D}}^{d},{\mathbf{D}}^{e},\{{\mathbf{\alpha}}_{i}\}}\sum_{i=1}^{N}\left(\left\|{\mathbf{p}}^{s,e}_{i}- {\mathbf{D}}^{e}{\mathbf{\alpha}}_{i}\right\|_{2}^{2}+\left\|{\mathbf{p}}^{s,d}_{i}-{\mathbf{D}}^{d}{\mathbf{\alpha}}_{i}\right\|_{2}^{2}\right),\\ &s.t.\ {\left\| {{\mathbf{\alpha}}_{i}} \right\|_{0}}\leq \hat{T},\ \left\|{\mathbf{d}}^{d}_{r}\right\|_{2}\leq 1,\ \left\|{\mathbf{d}}^{e}_{r}\right\|_{2}\leq 1,\ r=1,2,\cdots,n \end{aligned} $$

(9)

where ${\mathbf{\alpha}}_{i}$ is the sparse representation, ${\mathbf{D}}^{d}$ and ${\mathbf{D}}^{e}$ are the dictionaries corresponding to $\{{\mathbf {p}}^{s,d}_{i}\}^{N}_{i=1}$ and $\{{\mathbf {p}}^{s,e}_{i}\}^{N}_{i=1}$, respectively, ${\mathbf {d}}^{d}_{r}$ and ${\mathbf {d}}^{e}_{r}$ are their rth dictionary atoms, and $\hat {T}$ is the sparsity constraint.

This model can be easily solved with Zeyde’s method [8]. However, it is not appropriate here to generate ${\mathbf{D}}^{d}$ first and then calculate ${\mathbf{D}}^{e}$ from it; the order of the dictionary training should be changed. We observed that $\mathbf{Z}^{s,TV}$ is dominated by smooth regions and step edges but lacks texture details. The smooth regions can be well recovered by BI, so the smooth training patches are dropped in the dictionary training stage, as in Yang’s method [23]. If we trained ${\mathbf{D}}^{d}$ first, it would be dominated by step edges. But we need coupled dictionaries dominated by texture details, since the step edges have already been recovered by TV regularization. $\mathbf{Z}^{s,e}$ contains the texture details of $\mathbf{Z}^{s}$ that are missing from $\mathbf{Z}^{s,TV}$. Therefore, we first train ${\mathbf{D}}^{e}$ and then calculate ${\mathbf{D}}^{d}$ from ${\mathbf{D}}^{e}$.

To train ${\mathbf{D}}^{e}$, we solve the standard single dictionary training problem with K-SVD [41]:

$$ \begin{aligned} &\mathop{\min }\limits_{{{\mathbf{D}}^{e}},\left\{ {{\mathbf{\alpha}}_{i}} \right\}} \sum\limits_{i = 1}^{N} {\left\| {{\mathbf{p}}^{s,e}_{i} - {{\mathbf{D}}^{e}}{\mathbf{\alpha}}_{i}} \right\|_{2}^{2} },\\ &s.t.\ {\left\| {{\mathbf{d}}_{r}^{e}} \right\|_{2}} \le 1,{{\left\| {{\mathbf{\alpha}}_{i}} \right\|}_{0}}\leq \hat{T}, r = 1,2, \cdots,n \end{aligned} $$

(10)

The next issue is how to estimate a ${\mathbf{D}}^{d}$ that provides a similar sparse representation ${\mathbf{\alpha}}_{i}$ for ${\mathbf {p}}^{s,d}_{i}$. We use the following model to represent each atom ${\mathbf {d}}^{e}_{r}$ as a combination of the training patches:

$$ {{\mathbf{\beta }}_{r}} = \arg \mathop {\min }\limits_{\mathbf{z}} {\left\| {\mathbf{z}} \right\|_{2}},\ s.t.\ {\mathbf{d}}^{e}_{r}={\mathbf{P}}^{s,e} {\mathbf{z}},\ r = 1,2, \cdots,n $$

(11)

where ${\mathbf {P}}^{s,e}=\left [{\mathbf {p}}^{s,e}_{1}, {\mathbf {p}}^{s,e}_{2},\ldots, {\mathbf {p}}^{s,e}_{N}\right ]$.

The solution of Equation 11 is:

$$ {{\mathbf{\beta }}_{r}} = {\mathbf{P}}^{s,e+} {\mathbf{d}}^{e}_{r} \,=\, \left({{\mathbf{P}}^{s,e}}^{T}{\mathbf{P}}^{s,e}+\varsigma{\mathbf{I}}\right)^{-1}{{\mathbf{P}}^{s,e}}^{T}{{\mathbf{d}}^{e}_{r}},\ r = 1,2, \cdots,n $$

(12)

where $\varsigma$ (set to 0.1) is a small positive parameter to avoid singularity and $\mathbf{I}$ is the identity matrix.

${\mathbf{D}}^{d}$ is calculated with:

$$ {\mathbf{D}}^{d}={\mathbf{P}}^{s,d}{\mathbf{B}} $$

(13)

where ${\mathbf {P}}^{s,d}=\left [{\mathbf {p}}^{s,d}_{1}, {\mathbf {p}}^{s,d}_{2},\ldots, {\mathbf {p}}^{s,d}_{N}\right ]$ and $\mathbf{B}=[{\mathbf{\beta}}_{1},{\mathbf{\beta}}_{2},\cdots,{\mathbf{\beta}}_{n}]$.

Figure 4 shows the visual quality before and after OCDT. As shown, the output of the TV step lacks texture details, and the OCDT step recovers them.
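Equations 12 and 13 amount to a ridge-regularised least squares fit followed by one matrix product, which might be sketched as below. The function name `lr_dictionary` and its argument names are hypothetical. The sketch solves a small $p^{2} \times p^{2}$ system by using the identity $(\mathbf{P}^{T}\mathbf{P}+\varsigma\mathbf{I})^{-1}\mathbf{P}^{T}=\mathbf{P}^{T}(\mathbf{P}\mathbf{P}^{T}+\varsigma\mathbf{I})^{-1}$, which gives the same $\mathbf{B}$ as Equation 12 without forming an $N \times N$ matrix.

```python
import numpy as np

def lr_dictionary(P_e, P_d, D_e, zeta=0.1):
    """Return (D_d, B) following Equations 12 and 13.

    P_e : (m, N) matrix of residual patches p^{s,e}_i
    P_d : (4m, N) matrix of gradient feature vectors p^{s,d}_i
    D_e : (m, n) dictionary trained on P_e (e.g. by K-SVD)
    """
    m = P_e.shape[0]
    # Push-through identity: (P^T P + zI)^{-1} P^T = P^T (P P^T + zI)^{-1},
    # so only an m-by-m system is solved instead of an N-by-N one.
    B = P_e.T @ np.linalg.solve(P_e @ P_e.T + zeta * np.eye(m), D_e)
    D_d = P_d @ B        # Equation 13
    return D_d, B
```

Each column of `B` expresses one atom of ${\mathbf{D}}^{e}$ as a combination of residual patches, and applying the same combination to the corresponding feature vectors yields the matching atom of ${\mathbf{D}}^{d}$.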

2.3 Summary of the proposed algorithm

The dictionary training scheme of the TV-OCDT algorithm is summarized in Algorithm 1. Since it is difficult to estimate the noise level of a natural image in real applications, we trained the dictionaries with clean training images. The test images with different noise levels are all recovered with the same dictionaries.

The reconstruction stage of the proposed algorithm is summarized in Algorithm 2.

3 Experiments

In this section, we first introduce the experimental settings and compare our algorithm with five state-of-the-art algorithms. Then, we discuss two influential factors in the sparse representation stage (the patch size and the dictionary size) and the parameters of the TV regularization stage (the two Lagrange multipliers and the iteration number). Finally, we analyze the time complexity of the proposed algorithm.

3.1 Experimental settings

In our experiments, we magnify the input LR image by factors of 3 and 4. We use the training images from the software package of Yang’s algorithm [6]. Figure 5 shows some training images. We collected 100,000 coupled patches as the external training database. Figure 6 shows some LR test images. The color training images are transformed into gray images. We only use the patches that contain texture information; the smooth patches are discarded. Specifically, any ${\mathbf {p}}^{s,e}_{i}$ whose variance is below 12 is discarded, together with its corresponding ${\mathbf {p}}^{s,d}_{i}$.

For the overlapped regions between adjacent patches, averaging is usually used to obtain the final pixel values. The median is considered more robust than the average, since it is less affected by a handful of bad values. In our algorithm, a Gaussian weighted average is employed for more accurate results. The median of the overlapping pixel values $(u_{1},u_{2},\cdots,u_{v})$ is used as the mean $\mu$ of the Gaussian distribution, and the variance $\theta^{2}$ is calculated from $\mu$ and these values by Equation 15, where v is the number of overlapping pixel values. The final pixel value $u^{*}$ is calculated by Equation 14. Compared with the traditional average, the Gaussian weighted average reduces the effect of outliers that are far from the desired value.

$$ {u^{*}} = \frac{{{\sum\nolimits}_{i = 1}^{v} {\exp \left({ - {{\left| {{u_{i}} - {\rm{\mu }}} \right|}^{2}}/2{{\rm{\theta }}^{2}}} \right){u_{i}}} }}{{{\sum\nolimits}_{i = 1}^{v} {\exp \left({ - {{\left| {{u_{i}} - {\rm{\mu }}} \right|}^{2}}/2{{\rm{\theta }}^{2}}} \right)} }} $$

(14)

$$ {{\rm{\theta }}^{2}} = \frac{1}{v}{\sum\nolimits}_{i = 1}^{v} {{{\left({{u_{i}} - {\rm{\mu }}} \right)}^{2}}} $$

(15)
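Equations 14 and 15 can be illustrated for a single output pixel as follows. The function name `fuse` is hypothetical, and returning the median when all overlapping values coincide (so that $\theta^{2}=0$) is an assumption of the sketch, since the paper does not discuss that case.

```python
import numpy as np

def fuse(values):
    """Gaussian weighted average of the overlapping values of one pixel."""
    u = np.asarray(values, dtype=float)
    mu = np.median(u)                       # median used as the Gaussian mean
    theta2 = np.mean((u - mu) ** 2)         # Equation 15
    if theta2 == 0.0:
        return float(mu)                    # all values agree (assumed handling)
    w = np.exp(-(u - mu) ** 2 / (2.0 * theta2))
    return float(np.sum(w * u) / np.sum(w)) # Equation 14
```

For the overlapping values (10, 10, 10, 100), the plain average is 32.5, whereas the weight of the outlier is only exp(-2), so the fused value stays close to 10 (about 13.88).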

3.2 Comparison with other methods

We compare our algorithm with five state-of-the-art algorithms: Zeyde’s method [8], Anchored Neighborhood Regression (ANR) [28], ANR plus [29], the Statistical Prediction Model (SPM) [25], and the Deep Convolutional Network (DCN) [34]. The parameter settings of our method are described in the following sections. In Tables 1 and 2, we compare the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) [44] of the reconstructed HR images for different noise standard deviations σ. PSNR and SSIM are both quantitative measures of image quality. Figures 7 and 8 show the visual comparison. Although the proposed method obtains lower PSNR and SSIM values than the others for noise-free images, its reconstructed images have sharper edges and fewer artifacts. When the standard deviation of the noise is less than 10, the proposed algorithm obtains the highest PSNR and SSIM values in most cases, and its average values are higher than those of the other methods. The visual comparison shows that the proposed algorithm suppresses the noise well when the noise standard deviation is less than 10. When the standard deviation of the noise is larger than 10, the PSNR and SSIM values of the proposed algorithm are lower than those of some existing algorithms, and the visual comparison shows that none of the algorithms obtains good visual quality. Therefore, how to magnify images with severe noise remains an open problem.
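For reference, the PSNR used in such comparisons follows the standard definition sketched below. The peak value of 255 (8-bit gray images) and the function name `psnr` are assumptions of the sketch; the paper does not state the intensity range.

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference and a test image."""
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    mse = np.mean((ref - test) ** 2)        # mean squared error
    if mse == 0:
        return float("inf")                 # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))
```

A uniform error of one gray level gives an MSE of 1 and therefore a PSNR of about 48.13 dB with this peak value.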

Table 1 The average PSNR and SSIM values for different methods (3 ×)

Table 2 The average PSNR and SSIM values for different methods (4 ×)

To test the robustness of the proposed algorithm, we also test these algorithms on some natural images. Figure 9 shows the results. The test images were taken with a Coolpad 8750 mobile phone in its LR mode with the flash on, in a ground parking lot at 21:30 in the evening. The results show that the proposed algorithm provides sharper edges with fewer artifacts.

3.3 Effects of the parameters

3.3.1 Effects of the dictionary size and patch size

The dictionary size and the patch size greatly affect all sparse representation-based image magnification methods. Since a larger overlap benefits denoising, we use an overlap of p−1 for each patch size p. In the TV regularization step, the iteration number is set to 200; both Lagrange multipliers are set to 0.6 for the 3 × magnification, and $\lambda_{1}$ is set to 0.4 and $\lambda_{2}$ to 0.6 for the 4 × magnification (these parameters are discussed in the next section). Tables 3 and 4 show the average PSNR and SSIM values for different dictionary and patch sizes.

Table 3 The average PSNR and SSIM values for different patch and dictionary sizes (3 ×)

Table 4 The average PSNR and SSIM values for different patch and dictionary sizes (4 ×)

For the 3 × magnification, all the best PSNR values and most of the best SSIM values are obtained when the dictionary size is 512. Therefore, we choose 512 as the dictionary size for the 3 × magnification. When the noise standard deviation is 5, the best patch size is 3. When it is 10 or 20, the best patch size is 5. When it is 15, the best patch size is 7. Therefore, no single patch size is best for every noise standard deviation. However, the worst values appear when the patch size is 7, and a larger patch size also results in higher time complexity. Therefore, we choose 5 in our experiments, since it is the best choice for two of the standard deviation values.

For the 4 × magnification, most of the best PSNR and SSIM values are obtained by the dictionary size 1,024. The patch size 7 gets the highest PSNR and SSIM values for most of the standard deviations. Therefore, we choose 1,024 as the best dictionary size and 7 as the best patch size.

3.3.2 Effects of the parameters in TV regularization

The iteration number $\hat {J}$ and the two Lagrange multipliers are important parameters for TV regularization. Tables 5 and 6 show how the PSNR and SSIM values change with the iteration number. For the 3 × magnification, 200 is clearly the best iteration number. For the 4 × magnification, the best iteration number is 200 when the noise standard deviation is 5 or 20, and 300 when the noise standard deviation is 10 or 15. Since more iterations cost more time, we choose 200 in our experiments.

Table 5 The average PSNR and SSIM values for different iteration numbers (3×)

Table 6 The average PSNR and SSIM values for different iteration numbers (4×)

Tables 7 and 8 show the average PSNR and SSIM values for different Lagrange multipliers. As shown, the best Lagrange multipliers differ across noise standard deviations and magnification factors. According to the comparisons in Section 3.2, our method is superior to the others when the noise standard deviation is less than 10, and the advantage is most significant when the noise standard deviation is 5. Therefore, we choose the best Lagrange multipliers according to the results for a noise standard deviation of 5. For the 3 × magnification, $\lambda_{1}$ and $\lambda_{2}$ are both set to 0.6. For the 4 × magnification, $\lambda_{1}$ is set to 0.4 and $\lambda_{2}$ is set to 0.6.

Table 7 The average PSNR and SSIM values for different Lagrange multipliers (3 ×)

Table 8 The average PSNR and SSIM values for different Lagrange multipliers (4 ×)

3.4 Time complexity analysis

The time complexity is analyzed as follows. Suppose the cost of calculating one unknown pixel value with the interpolation algorithm is c; then the TV regularization requires $O(cK_{1}K_{2}\hat {J})$ flops according to Equation 7. The orthogonal matching pursuit (OMP) algorithm [43,45] is used to calculate the sparse representations. OMP needs $O\left ({{4p^{2}}n\hat {T}} \right)$ flops to calculate the sparse representation of a given feature vector [41]. Therefore, the overall time complexity of the testing phase is $O(4K_{1}K_{2}p^{2}n\hat {T}+cK_{1}K_{2}\hat {J})$. Our algorithm is tested on a PC with a 3.6 GHz AMD FX8150 CPU and 16 GB of memory running Windows. Table 9 compares the time costs of the different methods. Our time cost is larger than that of Zeyde’s method, ANR, and ANR plus, but smaller than that of SPM and DCN.
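The sparse coding step counted in this analysis can be sketched with a plain OMP loop. This is a textbook-style illustration, not the optimized implementation of [43,45]; the function name `omp`, the stopping tolerance, and the assumption of unit-norm dictionary atoms are choices of the sketch.

```python
import numpy as np

def omp(D, x, T):
    """Greedy sparse coding: at most T non-zeros such that D @ alpha ~ x.

    D is a (dim, n) dictionary whose columns are assumed to have unit norm.
    """
    x = np.asarray(x, dtype=float)
    alpha = np.zeros(D.shape[1])
    residual = x.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(T):
        if np.linalg.norm(residual) < 1e-10:
            break                            # x is already represented
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break                            # no new atom can help
        support.append(k)
        # Re-fit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    alpha[support] = coef
    return alpha
```

In a coupled-dictionary setting such as the one described in Section 2.2, one would typically code a feature vector over ${\mathbf{D}}^{d}$ with `omp` and then form the texture patch as `D_e @ alpha`; this usage is an assumption here, since Algorithm 2 is not reproduced in the text.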

Table 9 Time cost comparison (Second)

4 Conclusions

The ability to handle noisy LR images strongly affects the performance of an image magnification algorithm in real applications. In this paper, we propose an algorithm to magnify a noisy LR image. The algorithm combines the ideas of regularization-based and learning-based methods. The experimental results demonstrate that the proposed algorithm performs well when the standard deviation of the noise is not very high. However, some problems remain to be solved. First, neither the existing algorithms nor the proposed algorithm can deal with LR images with severe noise. Figures 7 and 8 show that when the noise standard deviation is higher than 10, the visual quality is not ideal for any of these methods. Second, the performance on natural images is not good enough, and many texture details still cannot be recovered. Better ways to deal with complex natural conditions should be investigated in future research.

References

X Liu, D Zhao, J Zhou, W Gao, H Sun, Image interpolation via graph-based bayesian label propagation. IEEE Trans. Image Process. 23(3), 1084–1096 (2014).
Article MathSciNet Google Scholar
W Dong, G Shi, X Li, L Zhang, X Wu, Image reconstruction with locally adaptive sparsity and nonlocal robust regularization. Signal Process. Image Commun. 27(10), 1109–1122 (2012).
Article Google Scholar
W Dong, L Zhang, R Lukac, G Shi, Sparse representation based image interpolation with nonlocal autoregressive modeling. IEEE Trans. Image Process. 22(4), 1382–1394 (2013).
Article MathSciNet Google Scholar
W Dong, L Zhang, G Shi, X Li, Nonlocally centralized sparse representation for image restoration. IEEE Trans. Image Process. 22(4), 1620–1630 (2013).
Article MathSciNet Google Scholar
K Zhang, X Gao, D Tao, X Li, Single image super-resolution with non-local means and steering kernel regression. IEEE Trans. Image Process. 21(11), 4544–4556 (2012).
Article MathSciNet Google Scholar
J Yang, J Wright, TS Huang, Y Ma, Image super-resolution via sparse representation. IEEE Trans. Image Process. 19(11), 2861–2873 (2010).
Article MathSciNet Google Scholar
H Hou, H Andrews, Cubic splines for image interpolation and digital filtering. IEEE Trans. Acous. Speech Signal Process. 26(6), 508–517 (1978).
Article MATH Google Scholar
R Zeyde, M Protter, M Elad, On single image scale-up using sparse-representation. Lecture Notes Comput. Sci. 6920(1), 711–730 (2010).
MathSciNet Google Scholar
K Nguyen, S Sridharan, S Denman, C Fookes, in IEEE Conference on Computer Vision and Pattern Recognition. Feature-domain super-resolution framework for gabor-based face and iris recognition (IEEEProvidence, RI, USA, 2012), pp. 2642–2649.
Q Zhu, L Sun, C Cai, Non-local neighbor embedding for image super-resolution through foe features. Neurocomputing. 141(2), 211–222 (2014).
Article Google Scholar
X Gao, K Zhang, D Tao, X Li, Image super-resolution with sparse neighbor embedding. IEEE Trans. Image Process. 21(7), 3194–3205 (2012).
Article MathSciNet Google Scholar
J Li, W Gong, W Li, F Pan, Single-image super-resolution reconstruction based on global non-zero gradient penalty and non-local laplacian sparse coding. Digital Signal Process. 26(1), 101–112 (2014).
Article Google Scholar
A Akyol, M Gökmen, Super-resolution reconstruction of faces by enhanced global models of shape and texture. Pattern Recognit. 45(12), 4103–4116 (2012).
JS Park, SW Lee, An example-based face hallucination method for single-frame, low-resolution facial images. IEEE Trans. Image Process. 17(10), 1806–1816 (2008).
Article MathSciNet Google Scholar
J Sun, J Zhu, MF Tappen, in IEEE Conference on Computer Vision and Pattern Recognition. Context-constrained hallucination for image super-resolution (IEEESan Francisco, CA, USA, 2010), pp. 231–238.
H Chavez-Roman, V Ponomaryov, Super resolution image generation using wavelet domain interpolation with edge extraction via a sparse representation. IEEE Geoscience Remote Sensing Lett. 11(10), 1777–1781 (2014).
M Nazzal, H Ozkaramanli, Wavelet domain dictionary learning-based single image superresolution. Signal Image Video Process. 1, 1–11 (2014).
H Chang, DY Yeung, Y Xiong, in Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1. Super-resolution through neighbor embedding (IEEE, Washington, DC, USA, 2004), pp. 275–282.
H Huang, H He, Super-resolution method for face recognition using nonlinear mappings on coherent features. IEEE Trans. Neural Netw. 22(1), 121–130 (2011).
L An, N Thakoor, B Bhanu, in IEEE International Conference on Image Processing. Vehicle logo super-resolution by canonical correlation analysis (IEEE, Orlando, FL, USA, 2012), pp. 2229–2232.
L An, B Bhanu, Face image super-resolution using 2D CCA. Signal Process. 103(1), 184–194 (2014).
T Ogawa, M Haseyama, Image inpainting based on sparse representations with a perceptual metric. EURASIP J. Adv. Signal Process. 2013(1), 1–26 (2013).
J Yang, Z Wang, Z Lin, S Cohen, T Huang, Coupled dictionary training for image super-resolution. IEEE Trans. Image Process. 21(8), 3467–3478 (2012).
J Xu, C Qi, Z Chang, in IEEE International Conference on Image Processing. Coupled K-SVD dictionary training for super-resolution (IEEE, Paris, France, 2014), pp. 3910–3914.
T Peleg, M Elad, A statistical prediction model based on sparse representations for single image super-resolution. IEEE Trans. Image Process. 23(6), 2569–2582 (2014).
L He, H Qi, R Zaretzki, in IEEE Conference on Computer Vision and Pattern Recognition. Beta process joint dictionary learning for coupled feature spaces with application to single image super-resolution (Portland, Oregon, USA, 2013), pp. 345–352.
S Wang, L Zhang, Y Liang, Q Pan, in IEEE Conference on Computer Vision and Pattern Recognition. Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis (IEEE, Providence, RI, USA, 2012), pp. 2216–2223.
R Timofte, V De Smet, L Van Gool, in IEEE International Conference on Computer Vision. Anchored neighborhood regression for fast example-based super-resolution (IEEE, Portland, Oregon, USA, 2013), pp. 1920–1927.
R Timofte, V De Smet, L Van Gool, in Asian Conference of Computer Vision. A+: Adjusted anchored neighborhood regression for fast super-resolution (Singapore City, Singapore, 2014), pp. 1–15.
J Shi, X Liu, C Qi, Global consistency, local sparsity and pixel correlation: A unified framework for face hallucination. Pattern Recognit. 47(11), 3520–3534 (2014).
C Fernandez-Granda, EJ Candes, in IEEE International Conference on Computer Vision. Super-resolution via transform-invariant group-sparse regularization (IEEE, Sydney, Australia, 2013), pp. 3336–3343.
KS Ni, TQ Nguyen, Image superresolution using support vector regression. IEEE Trans. Image Process. 16(6), 1596–1610 (2007).
KI Kim, Y Kwon, Single-image super-resolution using sparse regression and natural image prior. IEEE Trans. Pattern Anal. Mach. Intell. 32(6), 1127–1133 (2010).
C Dong, CC Loy, K He, X Tang, in Proceedings of European Conference on Computer Vision. Learning a deep convolutional network for image super-resolution (Zurich, Switzerland, 2014), pp. 1–16.
P Purkait, NR Pal, B Chanda, A fuzzy-rule-based approach for single frame super resolution. IEEE Trans. Image Process. 23(5), 2277–2290 (2014).
A Buades, B Coll, J-M Morel, in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2. A non-local algorithm for image denoising (IEEE, San Diego, CA, USA, 2005), pp. 60–65.
RC Gonzalez, RE Woods, Digital Image Processing (Pearson Education, India, 2009).
M Irani, S Peleg, Improving resolution by image registration. CVGIP: Graphical Models Image Process. 53(3), 231–239 (1991).
A Marquina, SJ Osher, Image super-resolution by TV-regularization and Bregman iteration. J. Scientific Comput. 37(3), 367–382 (2008).
LI Rudin, S Osher, E Fatemi, Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena. 60(1), 259–268 (1992).
M Aharon, M Elad, A Bruckstein, K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans. Signal Process. 54(11), 4311–4322 (2006).
LN Smith, M Elad, Improving dictionary learning: Multiple dictionary updates and coefficient reuse. IEEE Signal Process. Lett. 20(1), 79–82 (2013).
R Rubinstein, M Zibulevsky, M Elad, Efficient Implementation of the K-SVD Algorithm using Batch Orthogonal Matching Pursuit, CS Technion (2008). http://www.cs.technion.ac.il/~ronrubin/Publications/KSVD-OMP-v2.pdf.
Z Wang, AC Bovik, HR Sheikh, EP Simoncelli, Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004).
JA Tropp, AC Gilbert, Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans. Inf. Theory. 53(12), 4655–4666 (2007).

Acknowledgements

This work was supported by the National Science Foundation of China (Grant nos. 61340040, 61202183, 61102095), the Science and Technology Plan in Shaanxi Province of China (No. 2014KJXX-72), and the Young Teachers Foundation of Xi’an University of Posts & Telecommunications.

Author information

Authors and Affiliations

School of Telecommunication and Information Engineering, Xi’an University of Posts and Telecommunications, Weiguo Road, Xi’an, 710121, China
Jian Xu, Jiulun Fan, Xiaoqiang Zhao, Xiaomin Wu & Yanzi Wang
Image Processing and Recognition Center, Xi’an Jiaotong University, Xianning Road, Xi’an, 710049, China
Jian Xu
School of Information Engineering, Chang’an University, Erhuan Road, Xi’an, 710064, China
Zhiguo Chang

Corresponding author

Correspondence to Jian Xu.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

JX wrote Sections 1 and 2. ZC wrote Sections 3 and 4. JF revised the paper and added many analyses about the experimental results. XZ, XW, and YW did the experiments and collected the data. All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Xu, J., Chang, Z., Fan, J. et al. Noisy image magnification with total variation regularization and order-changed dictionary learning. EURASIP J. Adv. Signal Process. 2015, 41 (2015). https://doi.org/10.1186/s13634-015-0225-y

Received: 07 August 2014
Accepted: 13 April 2015
Published: 06 May 2015
DOI: https://doi.org/10.1186/s13634-015-0225-y
