
Optimal graph edge weights driven NLMs with multi-layer residual compensation

Abstract

Non-local means (NLMs) filters play an essential role in image denoising, restoration, inpainting, etc., owing to their simple theory and effective performance. However, as the noise level increases, the denoising accuracy of NLMs decreases significantly. This paper further develops the NLMs-based denoising method to remove noise with less loss of image details. This is achieved by embedding an NLMs kernel driven by optimal graph edge weights into a multi-layer residual compensation framework. Unlike the patch similarity-based weights in traditional NLMs filters, the edge weights derived from the optimal graph Laplacian regularization consider (1) the distance between the target pixel and the candidate pixel, (2) the local gradient and (3) the patch similarity. After the weights are defined, the graph-based NLMs kernel is placed in a multi-layer framework. The corresponding primal and residual terms at each layer are finally fused with learned weights to recover the image. Experimental results show that our method is effective and robust, especially for piecewise smooth images.

1 Introduction

Image denoising is one of the most fundamental and important tasks in image processing and computer vision. Generally speaking, it aims at retrieving the clean image \({{u}_{ori}}\in {{\mathbb {R}}^{S_1\times S_2}}\) from an observed noisy image \({{u}}\in {{\mathbb {R}}^{S_1\times S_2}}\). Typically, the noisy image is assumed to be corrupted by Gaussian noise. In the past decades, numerous denoising methods have been proposed. Regarding the way to separate \(u_{ori}\) from u, the denoising methods can be divided into two classes: those implemented in the spatial domain and those implemented in the transform domain.

In the spatial domain, classical methods denoise an image by averaging pixels with different weights, e.g., the equal weights of the box-car filter, the distance-dependent weights of the Gaussian filter, and the weights computed from geometric and radiometric distances in the bilateral filter [11, 12, 19]. Besides, various extensions have been proposed to balance smoothness and detail, e.g., averaging in local windows with adaptive size [22] or local regions with adaptive shape [32]. In contrast to these connected local regions, Buades et al. proposed a method that averages pixels in non-local regions, named non-local means (NLMs) [3]. The main idea of NLMs is to select similar pixels in a non-local region (even the whole image) and then average them with different weights. Since the similarity between two pixels is computed from the patches around them, NLMs are robust to noise and perform well. However, the NLMs filter removes some image details such as edges and rare textures during denoising, especially when the SNR is low. Although total variation (TV) regularization models [7, 17, 26] have been combined with NLMs approaches to deal with rare patches (for which no similar pixels are found), they still cannot address this issue.

The basic idea of denoising in the transform domain is to separate noise from the observed signal in that domain. In general, the noise is randomly distributed, changes rapidly across the image, and has almost uniform power over the whole frequency domain. Clean images usually change slowly in local areas, and their power is generally concentrated at low frequencies. Based on this observation, various transforms have been used for denoising, e.g., the curvelet transform [25], the wavelet transform [6] and the graph Fourier transform [15, 18, 29]. Among all transform-based methods, block-matching and 3D filtering (BM3D) is the most popular and widely used [8]. The BM3D filter is effective because it combines the NLMs theory with wavelet transform-based denoising. Besides transforms with fixed bases, data-driven transforms have also been widely used in image denoising tasks, including PCA [2], sparse coding [14], dictionary learning [9] and compressed sensing [10].

More recently, machine learning-based denoising methods, especially deep learning-based approaches, have attracted considerable attention. A deep network was first applied to image denoising in [16], in which the auto-encoder network does not need manually set parameters for removing the noise. Zhang et al. then proposed DnCNN to deal with image denoising, super-resolution, and JPEG image deblocking [34]. A generative adversarial network (GAN) is used to remove blind noise in [4]. In addition, the attention mechanism [30] and batch re-normalization [31] have been introduced in denoising tasks and achieve excellent performance. In short, recent deep learning approaches can yield better results than traditional filters. However, they require a considerable amount of high-quality training data, which is not always available in practice.

This paper aims at developing the NLMs by introducing graph signal processing theory and a multi-layer framework. Graph signal processing (GSP) is a powerful and well-developed tool for analyzing signals on graphs [5, 20, 24, 33]. Traditional image processing methods regard the image domain as a regular 2D grid. But if one treats each pixel of an image as a node of a graph and constructs proper links between nodes, one can interpret an image as a signal on an irregular graph. The computation of similarity between pixels and patches is then no longer restricted by the regular grid. In this way, the image information can be exploited more comprehensively by using GSP. The optimal graph Laplacian regularization (OGLR) method derives the optimal metric space in the sense of minimum mean square error (MMSE), thus defining the optimal edge weights. Unlike the patch similarity-based weights in the classic NLMs filters, the OGLR defines the weights by considering (1) the distance between the target pixel and the candidate pixel, (2) the local gradient and (3) the patch similarity. The OGLR method yields good results for image denoising. However, it needs a large number of iterations to achieve comparable performance. Moreover, it involves many matrix inversions, which makes the method time-consuming. Hence in our paper, we replace the weights in the classical NLMs by the graph edge weights from the OGLR algorithm and embed the newly obtained NLMs into a multi-layer framework. The main contributions of this paper are threefold:

Firstly, our method uses the edge weight defined on a graph structure to compute the similarity between nodes and patches, which behaves better than the traditional NLMs. Experiments show that our method is comparable with the state-of-the-art methods, both visually and quantitatively. Furthermore, our method is better at denoising piecewise smooth images, especially when the noise level is high.

Secondly, a multi-layer representation is performed in our method to remove the noise while preserving the details. Obviously, the multi-layer strategy helps to smooth the image better. In addition, for the sake of recovering image details, the residual terms at each layer (the difference between the input and output of the NLMs filter) are combined with the smooth filtered image.

Last but not least, the coefficients of each component derived from each layer, including the smooth filtered image and the residual terms, are learned by the least squares method. These coefficients can be set as default parameters for any image.

Note that a similar idea has been proposed in [28]. However, the key difference is that our method adapts the NLMs filter parameters (the graph Laplacian regularization) to the input image, instead of a fixed filter presented in [28].

This paper is organized as follows. Section 2 introduces the related work on graph construction, the OGLR algorithm and the multi-layer representation of filtered images. The proposed denoising method is detailed in Sect. 3. Experiments and results are presented in Sects. 4 and 5, respectively. Finally, conclusions are given in Sect. 6.

2 Related work

2.1 Graph construction

Let \({\mathcal {G}}({\mathcal {V}}, {\mathcal {E}})\) denote a graph structure, \({\mathcal {V}}=\{v_i\}_{i=1}^N\) is a set of N nodes, and \({\mathcal {E}} = \{e_{ij}\}\) a set of edges. If two different nodes \(v_i\) and \(v_j\) are connected, there exists an edge weight \({w_{i,j}}\) describing the affinity between these two nodes. Generally, the larger the \({w_{i,j}}\) is, the more similar or correlated the nodes \(v_i\) and \(v_j\) are. A widely used form of the weight \({w_{i,j}}\) is as follows:

$$\begin{aligned} {w_{i,j}} = \left\{ \begin{array}{ll} \exp \left( -\frac{\phi _{ij}^{2}}{2\varepsilon ^{2} } \right) , &\quad \text {if} \left| \phi _{ij} \right| \le r,\\ 0, &\quad \text {otherwise}, \end{array} \right. \end{aligned}$$
(1)

where \(\phi _{ij}\) measures the distance between two nodes \(v_i\) and \(v_j\), and r is a threshold. Note that \(\phi _{ij}\) does not necessarily correspond to the Euclidean distance between the nodes. Typically, \(w_{ij}\) is non-negative. Apparently, the larger the distance between two nodes, the smaller \(w_{i,j}\) is. The weighted affinity matrix \({\mathbf {W}}\in {\mathbb {R}}^{N\times N}\) is then formed by the weight \({w_{i,j}}\) and measures the similarity between nodes. The degree matrix \(\mathbf{D } \in {\mathbb {R}}^{N\times N}\) is a diagonal matrix with each entry the degree (sum of each row of \({\mathbf {W}}\)) of each node.

A graph signal \({\mathbf {u}}\) is often defined as a discrete signal on the nodes of the graph \({\mathbf {u}}:{\mathcal {V}}\rightarrow {\mathbb {R}}\). The discrete signal \({\mathbf {u}}\) can be regarded as a vector \({\mathbf {u}}\in {\mathbb {R}}^{N}\), where the i-th entry represents the signal value at the i-th node in \({\mathcal {V}}\). In terms of 2D discrete images, each pixel represents a node, and the pixel intensity stands for the signal value.
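As a minimal sketch of Eq. (1) and of the degree matrix, the following NumPy code builds \({\mathbf {W}}\) and \({\mathbf {D}}\) for a tiny graph. The function names, the use of the Euclidean distance for \(\phi _{ij}\), and the decision to keep the self-weights \(w_{i,i}=1\) are illustrative assumptions and are not prescribed by the formulation above.

```python
import numpy as np

def gaussian_edge_weights(features, eps, r):
    """Affinity matrix W of Eq. (1) from per-node feature vectors.

    features: (N, M) array, one row per node (hypothetical input layout).
    eps: kernel bandwidth (epsilon in Eq. (1)).
    r: distance threshold; pairs farther apart than r get weight 0.
    """
    features = np.asarray(features, dtype=float)
    # Pairwise Euclidean distances phi_ij (one possible choice of distance).
    diff = features[:, None, :] - features[None, :, :]
    phi = np.sqrt((diff ** 2).sum(axis=-1))
    W = np.exp(-phi ** 2 / (2.0 * eps ** 2))
    W[phi > r] = 0.0
    return W

def degree_matrix(W):
    """Diagonal degree matrix D with D_ii equal to the i-th row sum of W."""
    return np.diag(W.sum(axis=1))

# Three nodes with scalar signal values 0, 1 and 5.
W = gaussian_edge_weights([[0.0], [1.0], [5.0]], eps=1.0, r=2.0)
D = degree_matrix(W)
```

In this toy example the first two nodes are linked, while the third is farther than r from both and keeps only its self-weight.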

2.2 The optimal graph Laplacian regularization algorithm

The OGLR algorithm seeks a metric space in which to measure the similarity of image patches [21]. For each pixel location in a 2D image, a vector \({\mathbf{v }}_{i}\) of length M is constructed by using a set of exemplar functions \(\left\{ \text{f}_{m}\right\} _{m=1}^{M}\):

$$\begin{aligned} {{{\mathbf{v }}}_i} = [{{\text{f}}_1}(i){,} {{\text{f}}_2}(i){,} \ldots {,} {{\text{f}}_M}(i)]. \end{aligned}$$
(2)

The set of vectors \(\{{{\mathbf{v }}}_i\}_{i=1}^N\) is used to build the weighted graph \({\mathcal {G}}({\mathcal {V}}, {\mathcal {E}}, {\mathbf {W}})\) with N vertices, where N is the total number of pixels. The determination of the exemplar functions is induced from a continuous graph Laplacian regularizer, described by an anisotropic Dirichlet energy functional E(u):

$$\begin{aligned} {{E}({u}) = \int _\varOmega {\nabla {{{u}}^T}{{\mathbf {G}}^{ - 1}}} \nabla {{u}}{(\sqrt{\det {\mathbf {G}}} )^{2\gamma - 1}}{\text{ds}}}, \end{aligned}$$
(3)

where \(s\in \varOmega\) is the pixel location. The metric tensor \({\mathbf {G}}: \varOmega \mapsto {\mathbb {R}}^{2\times 2}\) can be viewed as the structure tensor at s constructed according to the gradients \(\{\nabla \text{f}_{m}\}_{m= 1}^{M}\):

$$\begin{aligned} {\mathbf {G}} = \sum _{m=1}^M\nabla \text{f}_{{\rm m}}{\nabla \text{f}_{{\rm m}}}^T \end{aligned}$$
(4)

An optimal metric tensor \({\mathbf {G}}^*\) can be estimated by considering the noise model from patch gradients in the MMSE sense:

$$\begin{aligned} {{\mathbf {G}}^*} = {{\tilde{g}}}{{{\tilde{g}}}^T} + {\alpha _g}I, \end{aligned}$$
(5)

where \({{\tilde{g}}}\) is the average gradient of a patch, and the constant \({\alpha _g}\) is determined by the covariance of the patch. With the estimated \({{\mathbf {G}}^*}\), the exemplar functions can be expressed in the following form:

$$\begin{aligned} \begin{aligned} {{\rm f}}_1^*(i)&= \sqrt{{\alpha _g}} {x_i},\\ {{\rm f}}_2^*(i)&= \sqrt{{\alpha _g}} {y_i},\\ {{\rm f}}_3^*(i)&= \frac{1}{{L + {{\sigma _g^2} / {\sigma _p^2}}}}\sum _{l = 0}^{L - 1}{z_l}, \end{aligned} \end{aligned}$$
(6)

where \(({x_i},{y_i})\) are the coordinates of pixel i, and \(\{{z}_l\}_{l=0}^{L-1}\) is a set of L non-local patches that are similar to \({z_0}\). These similar patches are obtained by using the K-nearest-neighbour (KNN) algorithm, which seeks the L patches with the smallest Euclidean distance from the current patch \({z_0}\). Here \({\sigma _p}\) is a given constant over the whole noisy image, and \({\sigma _g^2}\) is the estimated variance of the gradient of the patch. \(\text{f}_1^*\) and \(\text{f}_2^*\) indicate the spatial relationship between pixels, while \(\text{f}_3^*\) represents the average pixel intensity of a target patch. Note that the coefficients in \(\text{f}_1^*\), \(\text{f}_2^*\) and \(\text{f}_3^*\) balance the contributions of the spatial and intensity factors. Hence, the three exemplar functions defined in Eq. (6) can be used to construct the optimal graph edge weight.
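The exemplar functions of Eq. (6) can be stacked into one feature vector per pixel, as sketched below. This is only an illustration under stated assumptions: the function name and argument layout are hypothetical, the similar patches are assumed to have been found already, \(\text{f}_3^*\) is read as a pixel-wise weighted average over the patch cluster, and \(\alpha _g\), \(\sigma _g^2\) and \(\sigma _p^2\) are passed in rather than estimated.

```python
import numpy as np

def exemplar_features(patches, alpha_g, sigma_g2, sigma_p2):
    """Stack f1*, f2*, f3* of Eq. (6) into one feature vector per pixel.

    patches: (L, h, w) array; patches[0] is the target patch z_0 and the
             others are its similar patches (assumed given, e.g. from KNN).
    Returns an (h*w, 3) array whose i-th row is the vector v_i.
    """
    patches = np.asarray(patches, dtype=float)
    L, h, w = patches.shape
    ys, xs = np.mgrid[0:h, 0:w]            # pixel coordinates inside the patch
    f1 = np.sqrt(alpha_g) * xs             # spatial term, x direction
    f2 = np.sqrt(alpha_g) * ys             # spatial term, y direction
    f3 = patches.sum(axis=0) / (L + sigma_g2 / sigma_p2)  # intensity term
    return np.stack([f1.ravel(), f2.ravel(), f3.ravel()], axis=1)

# Four identical 2 x 3 patches of ones, with no gradient variance.
v = exemplar_features(np.ones((4, 2, 3)), alpha_g=4.0, sigma_g2=0.0, sigma_p2=1.0)
```

A larger \(\alpha _g\) stretches the spatial coordinates, so spatial distance then dominates \(\phi _{ij}\), whereas a smaller \(\alpha _g\) lets the intensity term decide.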

2.3 The multi-layer framework

A K-layer tree structure can represent the image hierarchy from fine to coarse, and all the leaf nodes can sum up to the input image. In the multi-layer scheme, the output filtered image \({{u_{out}}}\) can be described by a smooth term and several detail terms:

$$\begin{aligned} {{u_{out}}} = {\beta _0}{{u}_{smooth}} + {\beta _1}{{u}_{detail_1}} + \cdots + {\beta _{K }}{{u}_{detail_K}}, \end{aligned}$$
(7)

where \(\{\beta _{{0}},\beta _{{1}},\ldots ,\beta _{{\rm K}}\}\) is a set of coefficients that controls the smoothness and the detail preservation of the output image. More details on the multi-layer scheme can be found in [27, 28].

3 Graph edge weights driven NLMs in a multi-layer framework

Although the OGLR algorithm has excellent filtering performance, it needs numerous iterations to achieve a comparable result. Moreover, it involves a large number of matrix inversions during the denoising process, which leads to a very high computational cost. Hence in this paper, we take advantage of the edge weights defined in OGLR and apply them to the NLMs algorithm. We then embed the newly obtained NLMs method into a multi-layer scheme. The multi-layer scheme decomposes the input image from fine to coarse scale, where the fine scales are used to preserve image details and the coarse scale helps smooth the image. The proposed pipeline is shown in Fig. 1.

Fig. 1
figure 1

Flowchart of the proposed algorithm

3.1 NLMs kernel

The NLMs kernel is computed based on the graph edge weights. With the exemplar functions \(\left\{\text{f}_{{\rm m}}\right\} _{m=1}^{M}\), the vector \({{\mathbf{v }}}_i\) on node \(v_i\) is as follows:

$$\begin{aligned} {{{\mathbf{v }}}_i} = \left[ \sqrt{{\alpha _g}} {x_i}, \sqrt{{\alpha _g}} {y_i}, 1/(L + {\sigma _g^2} / {\sigma _p^2})\sum \limits {_{l = 0}^{L - 1}}{z_l}\right] . \end{aligned}$$
(8)

Then the distance \(\phi _{ij}\) in Eq. (1) between node \(v_i\) and \(v_j\) can be obtained by:

$$\begin{aligned} \phi _{ij} =\Vert {\mathbf{v }}_i-{\mathbf{v }}_j\Vert , \end{aligned}$$
(9)

where \(\Vert \cdot \Vert\) is the \(\ell _2\) norm and \(\phi _{ij}\) determines the weighted affinity matrix \({\mathbf {W}}\) according to Eq. (1). The diagonal elements of the degree matrix \({\mathbf {D}}\) are defined as:

$$\begin{aligned} {\mathbf {D}}_{ii} = \sum _j w_{ij}. \end{aligned}$$
(10)

The NLMs kernel \({\mathbf {F}}\) is a normalized version of the weight matrix and obtained by the product of \({\mathbf {D}}^{-1}\) and \({\mathbf {W}}\):

$$\begin{aligned} {\mathbf {F}}={\mathbf {D}}^{-1}{\mathbf {W}} \end{aligned}$$
(11)

The NLMs kernel \({\mathbf {F}}\) is similar to the graph-based bilateral filter [12] and the classical NLMs kernel [3]. The difference is that \({\mathbf {F}}\) considers both the spatial relationship between pixels and the average intensity of patches. In addition, the spatial relationship and the average intensity are weighted by the gradient estimates, which helps to improve the denoising performance. On the one hand, when the image is corrupted by high-level noise, the spatial relationship between pixels dominates the denoising process (as in a Gaussian filter). On the other hand, when the signal-to-noise ratio (SNR) is high, the average intensity plays a more critical role.

It is worth noting that the OGLR algorithm denoises the target patch \(z_0\) by inverting a matrix built from the graph Laplacian, i.e., \(z^*=({\mathbf {I}} + \tau {\mathbf {L}})^{-1} z_0\). Although \({\mathbf {L}}\) is small, the inversion still takes considerable time. In contrast, our NLMs kernel works forward, which avoids the inversion required by the OGLR algorithm (except for the inverse of the diagonal matrix \({\mathbf {D}}\), which costs only linear time). Hence, our method works much faster than the OGLR method.
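The contrast between the forward kernel and the inverse-based update can be sketched in a few lines. The code below is an illustration only: the function names are hypothetical, the 3 x 3 affinity matrix is made up, and the Laplacian is assumed to be the combinatorial one, \({\mathbf {L}}={\mathbf {D}}-{\mathbf {W}}\), which may differ from the exact operator used in OGLR.

```python
import numpy as np

def apply_forward_kernel(W, z):
    """Apply F = D^{-1} W to z without forming or inverting any matrix."""
    W = np.asarray(W, dtype=float)
    degrees = W.sum(axis=1)              # diagonal of D, Eq. (10)
    return (W @ z) / degrees             # row-wise division replaces D^{-1}

def solve_inverse_step(W, z, tau):
    """Solve (I + tau * L) x = z, assuming the combinatorial L = D - W."""
    W = np.asarray(W, dtype=float)
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.solve(np.eye(len(W)) + tau * L, z)

# Made-up symmetric affinities between three pixels.
W = np.array([[1.0, 0.6, 0.0],
              [0.6, 1.0, 0.3],
              [0.0, 0.3, 1.0]])
z = np.array([10.0, 12.0, 30.0])
forward = apply_forward_kernel(W, z)     # one weighted average per pixel
inverse = solve_inverse_step(W, z, tau=0.5)
```

The forward path needs one matrix-vector product and one element-wise division, whereas the inverse path has to solve a linear system for every patch. Both operators leave a constant signal unchanged, which is consistent with their role as smoothers.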

3.2 Determine the coefficients with least square

The set of coefficients \(\{\beta _k\}\) in the multi-layer scheme plays a significant role in achieving good denoising performance. In this paper, instead of using parameters according to the s-curve functions proposed in [28], we regard the determination of \(\{\beta _k\}\) as a regression problem and apply the least square algorithm to solve it. Our cost function is as follows:

$$\begin{aligned} \begin{aligned} C&= \min _{\{\beta _0,\ldots ,\beta _K\}} \sum _{p=1}^P \left\| \left( \beta _0{\mathbf {F}}^K + \sum _{k=1}^K\beta _k({{\mathbf {I}}}-{\mathbf {F}}) {\mathbf {F}}^{K-k} \right) {z_{p}} - {z_{0p}}\right\| _2^2, \end{aligned} \end{aligned}$$
(12)

where K is the number of layers, P is the total number of training patches, \(z_p\) represents the p-th noisy image patch, and \(z_{0p}\) stands for the p-th noise-free image patch. The aim of Eq. (12) is to find a set of coefficients \(\{\beta _k\}\), one per filter term, that minimizes the difference between the filtered noisy patches and the clean patches.
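Because the cost in Eq. (12) is linear in the coefficients, it can be solved as an ordinary least squares problem. The sketch below shows one way to set this up with NumPy; the function names are hypothetical, each patch is assumed to be flattened to a vector with its own kernel \({\mathbf {F}}\), and the solver choice (`np.linalg.lstsq`) is an assumption rather than the authors' implementation.

```python
import numpy as np

def layer_terms(F, z, K):
    """Columns F^K z, (I-F)F^{K-1} z, ..., (I-F) z, returned as (n, K+1)."""
    powers = [np.asarray(z, dtype=float)]
    for _ in range(K):
        powers.append(F @ powers[-1])        # powers[k] = F^k z
    smooth = powers[K]
    # Detail term k (k = 1..K) is (I - F) F^{K-k} z = F^{K-k} z - F^{K-k+1} z.
    details = [powers[K - k] - powers[K - k + 1] for k in range(1, K + 1)]
    return np.stack([smooth] + details, axis=1)

def fit_beta(kernels, noisy, clean, K):
    """Least squares estimate of (beta_0, ..., beta_K) over all patches.

    kernels, noisy, clean: equally long sequences holding, for each patch,
    its kernel F (n, n), the noisy patch (n,) and the clean patch (n,).
    """
    A = np.vstack([layer_terms(F, z, K) for F, z in zip(kernels, noisy)])
    b = np.concatenate([np.asarray(z0, dtype=float) for z0 in clean])
    beta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return beta
```

Each patch contributes n rows to the design matrix, so even a modest number of patches gives a strongly over-determined system for the K + 1 unknowns.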

Note that during the training process, we treat images with different noise levels separately. In other words, each noise level is assigned its own set of optimal coefficients. For each noise level, when the training process is finished, we estimate the noise variance according to the newly obtained \(\{\beta _k\}\). If the estimated noise is higher than a given threshold \(\sigma _{th}\), we recommend training \(\{\beta _k\}\) again with the newly obtained \(\{\beta _k\}\).

Additionally, the number of layers K is also an important parameter. Details will be discussed in Sects. 3.3 and 4.2.

3.3 NLMs with K residual compensation

The NLMs filter can be embedded into the multi-layer scheme, and the output filtered image then consists of one smooth term and K residuals:

$$\begin{aligned} \begin{aligned} {{{u}_{out}} = {\beta _0}{{\mathbf {F}}^K}{u}+ {\beta _1}({\mathbf {I}}-{\mathbf {F}}){\mathbf {F}}^{K - 1}{u} +...+ {\beta _{K-1}}({\mathbf {I}}-{\mathbf {F}}){\mathbf {F}}{u} + {\beta _{K}}({\mathbf {I}} - {\mathbf {F}}){u}.} \end{aligned} \end{aligned}$$
(13)

Since \({\mathbf {F}}\) is the normalized affinity matrix, it acts as a low-pass filter according to graph Fourier transform theory. \(({{\mathbf {I}}}-{\mathbf {F}})\) is the (random-walk) normalized Laplacian, and it functions as a high-pass filter [5]. \({\mathbf {F}}^K u\) stands for the smooth term, which is obtained by cascading K low-pass filters \({\mathbf {F}}\). The remaining K terms are the corresponding detail terms. When \(K = 1\), Eq. (13) degenerates to:

$$\begin{aligned} u_{out}=\beta _0 {\mathbf {F}}u+\beta _1 ({{\mathbf {I}}}-{\mathbf {F}})u, \end{aligned}$$

where the filtered image is composed of one smooth term \({\mathbf {F}}u\) and one residual detail term \(({{\mathbf {I}}}-{\mathbf {F}})u\). Thus, the larger K is, the smoother \(u_{out}\) will be. The value of K can be neither too large nor too small. Too few layers may lead to an incomplete representation of the image, so that the noise is not removed effectively, e.g., some details are not restored, or the homogeneous parts of the filtered image are not smooth enough. Too many layers, however, result in a heavy computational load and a long running time for only a slight performance improvement. The choice of K will be discussed in Sect. 4.2.
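A short worked identity, added here as a sanity check rather than taken from the original derivation, makes the role of the coefficients explicit. Setting every coefficient in Eq. (13) to one gives a telescoping sum:

$$\begin{aligned} {\mathbf {F}}^K u + \sum _{k=1}^{K} ({\mathbf {I}}-{\mathbf {F}}){\mathbf {F}}^{K-k} u = {\mathbf {F}}^K u + \sum _{k=1}^{K} \left( {\mathbf {F}}^{K-k} - {\mathbf {F}}^{K-k+1}\right) u = {\mathbf {F}}^K u + \left( {\mathbf {I}} - {\mathbf {F}}^{K}\right) u = u. \end{aligned}$$

Hence the \(K+1\) terms form an exact decomposition of the noisy input, in line with the statement in Sect. 2.3 that the leaf nodes sum up to the input image. Any denoising effect therefore comes from coefficients that differ from one, for instance detail coefficients that shrink the terms carrying most of the noise.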

With the learned coefficients \(\{\beta _k\}\) and the number of layers K, the proposed method is summarized in Algorithm 1. In addition, the flowchart of the proposed graph-based NLMs with multi-layer residual compensation is shown in Fig. 1, where a noisy image with noise standard deviation \(\sigma = 50\) is used as an example.

figure a
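The following is a minimal sketch of how Eq. (13) could be evaluated once \({\mathbf {F}}\) and \(\{\beta _k\}\) are available. It is not a transcription of Algorithm 1: the function name is hypothetical, the image (or patch) is assumed to be flattened to a vector, and the choice to use repeated matrix-vector products instead of explicit matrix powers is a design assumption of this sketch.

```python
import numpy as np

def multilayer_nlm(F, u, beta):
    """Evaluate Eq. (13): beta[0] F^K u + sum_k beta[k] (I - F) F^{K-k} u."""
    beta = np.asarray(beta, dtype=float)
    K = len(beta) - 1
    current = np.asarray(u, dtype=float)         # F^0 u
    out = np.zeros_like(current)
    # Walk from the finest layer (k = K) to the coarsest (k = 1), so that
    # only matrix-vector products are needed and F^K is never formed.
    for k in range(K, 0, -1):
        filtered = F @ current                   # F^{K-k+1} u
        out += beta[k] * (current - filtered)    # (I - F) F^{K-k} u
        current = filtered
    return out + beta[0] * current               # current is now F^K u

# Toy 1D example with a tridiagonal smoothing kernel and K = 4 layers.
n = 5
W = np.eye(n) + 0.5 * (np.eye(n, k=1) + np.eye(n, k=-1))
F = W / W.sum(axis=1, keepdims=True)
u = np.arange(n, dtype=float) ** 2
u_out = multilayer_nlm(F, u, beta=[1.0, 0.2, 0.1, 0.05, 0.0])
```

With this ordering, K applications of \({\mathbf {F}}\) to a vector suffice for all K + 1 terms, which avoids the cost of computing dense matrix powers.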

4 Experimentation

4.1 Experimental setup

We verify the effectiveness of the proposed method on both natural images and depth images. Additive white Gaussian noise (AWGN) is added to these images, with standard deviations \({\sigma }\) ranging from 10 to 50. Depending on the noise level \(\sigma\), the patch size in our experiments ranges from 10 to 22, and the step size \({N_S}\) from 2 to 6. In the implementation, the normalization parameter \(\gamma\) in Eq. (3) was empirically set to 0.6 for the natural images and to 1 for the depth images. The constant \({\sigma _p}\) in Eq. (6) is set to \({10^{6}}\) and the patch cluster size L ranges from 5 to 50. The noise threshold mentioned in Sect. 3.2 is \(\sigma _{th}=5\).

The test images are from public datasets such as the BSDS500 dataset [1] and the Middlebury Stereo Datasets [23].

We compare our method with the original NLMs [3], OGLR [21] and two other state-of-the-art methods, i.e., block-matching and 3D filtering (BM3D) [8] and the ADNet method [30]. The peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) index are used to evaluate the performance of these methods.
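For reference, the PSNR can be computed as in the sketch below. The function name and the default peak value of 255 (8-bit images) are assumptions of this example; the text does not state which implementation or peak value was used in the experiments.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)   # mean squared error
    if mse == 0:
        return float("inf")                      # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

As a worked example, unclipped AWGN with \(\sigma = 50\) on an 8-bit image has a mean squared error close to \(50^2 = 2500\), which corresponds to \(10\log _{10}(255^2/2500)\approx 14.15\) dB before any denoising.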

4.2 Determination of number of layers K

To find the most appropriate number of layers K, we test six different images. Five levels of noise are tested separately, i.e., \(\sigma = 10,20, 30, 40,50\). The average PSNR and SSIM of the test images under different noise levels are computed for different K. In our experiments, K ranges from 1 to 6. The maximum of K is set to 6 because when \(K>6\), computing the matrix powers takes a long time, which is contrary to our motivation.

Figure 2 shows the PSNR (a) and SSIM (b) results as functions of K. We can see from (a) that when \(K\ge 4\), the PSNR converges to a fixed value for all five noise levels. As for the SSIM in (b), the more layers there are, the higher the SSIM, but when \(K\ge 4\), it does not improve significantly. Hence, to balance the PSNR, SSIM, and time cost, we make a compromise by setting \(K=4\) in our algorithm according to Fig. 2.

Fig. 2
figure 2

From top to bottom: a the average PSNR (dB) for different numbers of layers, b the average SSIM for different numbers of layers

4.3 Determination of \(\{\beta _k\}\)

In the process of training the coefficient set \(\{\beta _k\}\) in Eq. (13), three depth images (cloth, Aloe and flowerpot) and three natural images (barbara, chips and man) are used as the training data, as shown in Fig. 3. Each image is divided into 3000 to 6000 image patches under different noise levels with varying patch sizes.

The learned sets \(\{\beta _k\}\) are shown in Tables 1 and 2. Both tables indicate that the smooth term \({\mathbf {F}}^K u\) of Eq. (13) plays the major role in the denoising process. Additionally, although the coefficients of the residual terms are small, and some are even negative, they also play important roles in retaining detailed information and removing noise.

Fig. 3
figure 3

The images used to train the coefficient set \(\{ \beta _{k} \}\). Depth images: a cloth, b Aloe, c flowerpot; natural images: d barbara, e chips, f man

Table 1 The learned \(\{\beta _k\}\) for depth images with different levels of noise
Table 2 The learned \(\{\beta _k\}\) for real natural images with different levels of noise

5 Results and discussion

Figures 4, 5, 6, and 7 depict the denoising performance of the five methods on four depth images (wood, bowling, lampshade, and teddy) with a noise level of \(\sigma = 50\), the difference between the original depth images and the filtered images, and the corresponding zoomed parts. Figures 8, 9, and 10 show the denoising results of the five methods on three natural images (bird, house and jar) with the same noise level \(\sigma = 50\). From left to right are: the original image, the noisy image, and the results obtained by NLM, BM3D, OGLR, ADNet and the proposed method.

For the image wood, the horizontal line in the center of the image is seriously blurred by OGLR and BM3D. Moreover, the homogeneous parts are still corrupted and not well restored. ADNet can preserve edges very well visually. However, it generates some undesirable parts, such as the black point in the lower-left corner and the black segment in the center. In our case, although the edges are not preserved as well as ADNet, the homogeneous parts are well smoothed. The middle row shows the figures of differences between the original image and filtered images. Dark contours and areas indicate that there are significant differences. From the difference figures, we can see that our result is more similar to the original clean image. In terms of both visual performance and the indexes, our method provides the best denoising result.

For the image bowling, the PSNR and SSIM of the proposed method are superior to the other three methods. The edge between the bowling ball and pin is blurred, even distorted, by BM3D. ADNet generates a deformation on the edge of the ball. The deformation may occur because the training data do not include images with this kind of content and shape. Our method and OGLR provide better results, while our result is smoother in the homogeneous regions. Additionally, the difference figures demonstrate that our method has excellent denoising performance and preserves the brightness very well.

The images lampshade and teddy are more complex than the above two images, with more details and weak edges, as shown in Figs. 6 and 7. The results show that our method can restore the image very well in smooth areas, but cannot preserve the sharp corners, e.g., the corner of the zoomed parts. In addition, our method generates some artifacts in the homogeneous areas. In fact, this phenomenon is common to NLM theory-based methods, including NLM, BM3D and OGLR. When the SNR is low, i.e., the noise is strong, some neighboring pixels in a patch may be mistaken for line structures. These structures are then enhanced or preserved during denoising, which produces the artifacts.

Figures 8 and 9 show the denoising results for two natural images, bird and house, at noise level \(\sigma = 50\). It is clear that the proposed method achieves satisfactory performance in the homogeneous regions of the image, like the sky in bird and the shadow of the roof in house. From the enlarged sub-images in Fig. 9, one can see that BM3D, OGLR and ADNet generate some undesirable artifacts and destroy the edge of the shadow, while our method provides smoother results with fewer artifacts. Figure 10 shows the results on jar: the carved patterns on the jar can barely be seen after denoising. This suggests that our method is less effective at maintaining the details of texture-rich images.

Figure 11 displays examples on two color images, tape and pepper, at noise level \(\sigma = 50\). The color images are denoised channel by channel. Visually, the denoising results of our method are competitive with those of the other methods.

Tables 3 and 4 report the PSNR and SSIM of the proposed method and four other state-of-the-art methods on several depth images and real natural images. The highest values are in bold, and the second-best are underlined. From the results, we can see that our proposed method is comparable with the state-of-the-art methods. In addition, it outperforms the OGLR method in nearly all cases, especially for large noise levels \(\sigma\), where our method becomes more competitive. Our method also performs better on piecewise smooth depth images than on real natural images. However, when dealing with texture-rich images such as jar and flower, our result is not as good as that of ADNet.

The above experiments imply that our method is good at denoising piecewise smooth images. To further verify this conclusion, we test our method on the Middlebury Stereo Datasets 2006 [13]. Table 5 summarizes the mean PSNR and SSIM over 10 depth images obtained by the five methods. The results show that our method is effective at denoising depth images, i.e., piecewise smooth images, and that it outperforms the other methods when the noise level \(\sigma >30\). This is because we use the multi-layer framework: the term \({\mathbf {F}}^K {\mathbf {u}}\) is obtained after K filtering passes, which results in a very smooth term with almost no noise. The residual terms act as supplements that help to restore some details from the noise. Furthermore, our method takes considerably less time than the OGLR method. For instance, our method takes around 30 seconds to process an image of 500 × 300 pixels at a noise level of \(\sigma = 10\), whereas OGLR takes about 90 seconds. However, compared with other approaches such as BM3D and ADNet, the graph-based methods take longer, which is a common drawback of graph-based methods.

Fig. 4
figure 4

Denoising results of depth image (wood) with the noise level \(\sigma = 50\) [top], the difference between the original depth images and the filtered images [middle], and zoom-in image [bottom]. From the left (a) to right (g) are: the original image, the noisy image, the results obtained by NLM, BM3D, OGLR, ADNet and the proposed method respectively

Fig. 5
figure 5

Denoising results of depth image (bowling) with the noise level \(\sigma = 50\) [top], the difference between the original depth images and the filtered images [middle], and zoom-in image [bottom]. From the left (a) to right (g) are: the original image, the noisy image, the results obtained by NLM, BM3D, OGLR, ADNet and the proposed method respectively

Fig. 6
figure 6

Denoising results of depth image (lampshade) with the noise level \(\sigma = 50\) [top], the difference between the original depth images and the filtered images [middle], and zoom-in image [bottom]. From the left (a) to right (g) are: the original image, the noisy image, the results obtained by NLM, BM3D, OGLR, ADNet and the proposed method respectively

Fig. 7
figure 7

Denoising results of depth image (teddy) with the noise level \(\sigma = 50\) [top], the difference between the original depth images and the filtered images [middle], and zoom-in image [bottom]. From the left (a) to right (g) are: the original image, the noisy image, the results obtained by NLM, BM3D, OGLR, ADNet and the proposed method respectively

Fig. 8
figure 8

Denoising results of natural image (bird) with the noise level \(\sigma = 50\) [top] and zoom-in image [bottom]. From the left (a) to right (g) are: the original image, the noisy image, the results obtained by NLM, BM3D, OGLR, ADNet and the proposed method respectively

Fig. 9
figure 9

Denoising results of natural image (house) with the noise level \(\sigma = 50\) [top] and zoom-in image [bottom]. From the left (a) to right (g) are: the original image, the noisy image, the results obtained by NLM, BM3D, OGLR, ADNet and the proposed method respectively

Fig. 10
figure 10

Denoising results of natural image (jar) with the noise level \(\sigma = 50\) [top]. From the left (a) to right (g) are: the original image, the noisy image, the results obtained by NLM, BM3D, OGLR, ADNet and the proposed method respectively

Fig. 11

Denoising results of color images (tape and Pepper) with the noise variance \(\sigma = 50\). From the left (a) to right (g) are: the original image, the noisy image, the results obtained by NLM, BM3D, OGLR, ADnet and the proposed method respectively

Table 3 Image denoising on depth images with NLM, BM3D, OGLR, ADNet and our method: performance comparisons in PSNR (left, in dB) and SSIM (right)
Table 4 Image denoising on natural images with NLM, BM3D, OGLR, ADNet and our method: performance comparisons in PSNR (left, in dB) and SSIM (right)
Table 5 The results of image denoising on depth dataset
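
For reference, PSNR is conventionally computed from the mean squared error between the clean image and the denoised image. The sketch below assumes 8-bit images (peak value 255) and uses NumPy. The function name `psnr` and the `peak` parameter are illustrative and are not taken from the paper.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    `peak` is the maximum possible pixel value (255 is assumed here for
    8-bit images; this is an assumption, not a setting from the paper).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

A higher PSNR means a smaller mean squared error. SSIM additionally compares local structure and is usually taken from an existing library rather than reimplemented.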

6 Conclusion

In this paper, we propose a graph-based NLMs algorithm for image denoising. The edge weights defined in the OGLR algorithm serve as the NLMs kernel, and a multi-layer residual compensation strategy is then used to recover image details. The coefficients of the smooth term and the residual terms in the multi-layer representation are learned by the least mean square method. We verify the effectiveness of our method on both natural images and depth images. The proposed method outperforms the original OGLR method in PSNR, SSIM and time cost. Compared with other state-of-the-art methods, including the classical NLMs, BM3D and ADNet, it provides comparable or better results. In particular, it performs especially well on piecewise smooth images when the noise level is high.
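
The fusion step summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: a plain box filter stands in for the graph-based NLMs kernel, the layer recursion (filter, subtract, filter the residual again) is one common way to build a multi-layer residual representation, and the names `box_filter`, `layer_terms`, `fit_fusion_weights` and `fuse` are hypothetical. Only the idea of learning the fusion coefficients by least squares comes from the text.

```python
import numpy as np

def box_filter(img, radius=1):
    """Stand-in smoother (assumption): mean over a (2r+1)x(2r+1) window."""
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img, dtype=np.float64)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / size ** 2

def layer_terms(noisy, smooth, n_layers=3):
    """Return the smooth term followed by the filtered residual of each layer."""
    terms = []
    current = np.asarray(noisy, dtype=np.float64)
    for _ in range(n_layers):
        filtered = smooth(current)
        terms.append(filtered)
        current = current - filtered  # residual handed to the next layer
    return terms

def fit_fusion_weights(terms, clean):
    """Least-squares coefficients so that sum_k c_k * terms[k] approximates clean."""
    A = np.stack([t.ravel() for t in terms], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(clean, dtype=np.float64).ravel(), rcond=None)
    return coeffs

def fuse(terms, coeffs):
    """Weighted sum of the layer terms."""
    return sum(c * t for c, t in zip(coeffs, terms))

# Toy usage on a piecewise-constant image with synthetic Gaussian noise.
rng = np.random.default_rng(0)
clean = np.zeros((32, 32))
clean[:, 16:] = 100.0
noisy = clean + rng.normal(0.0, 20.0, clean.shape)
terms = layer_terms(noisy, box_filter, n_layers=3)
coeffs = fit_fusion_weights(terms, clean)
fused = fuse(terms, coeffs)
```

Because the coefficient vector (1, 0, 0) is one feasible choice, the least-squares fit can never do worse on the fitting data than the smooth term alone. In this toy usage the coefficients are fitted and applied on the same image purely for illustration; in practice they would be learned on training images and then applied to unseen ones.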

Availability of data and materials

The data that support the findings of this study are available on request from the corresponding author F.Y.

Abbreviations

NLMs: Non-local means

TV: Total variation

BM3D: Block-matching and 3D transform-domain filtering

PCA: Principal component analysis

GSP: Graph signal processing

OGLR: Optimal graph Laplacian regularization

AWGN: Additive white Gaussian noise

PSNR: Peak signal-to-noise ratio

SSIM: Structural similarity

References

  1. P. Arbelaez, M. Maire, C. Fowlkes, J. Malik, Contour detection and hierarchical image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 33(5), 898–916 (2010)

    Article  Google Scholar 

  2. T. Bouwmans, S. Javed, H. Zhang, Z. Lin, R. Otazo, On the applications of robust PCA in image and video processing. Proc. IEEE 106(8), 1427–1457 (2018)

    Article  Google Scholar 

  3. A. Buades, B. Coll, J. Morel, A non-local algorithm for image denoising. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), vol. 2, pp. 60–65 (2005)

  4. J. Chen, J. Chen, H. Chao, Y. Ming, Image blind denoising with generative adversarial network based noise modeling. in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2018)

  5. G. Cheung, E. Magli, Y. Tanaka, M. Ng, Graph spectral image processing. Proc. IEEE 106(5), 907–930 (2018)

    Article  Google Scholar 

  6. D. Cho, T.D. Bui, Multivariate statistical modeling for image denoising using wavelet transforms. Signal Process.: Image Commun. 20(1), 77–89 (2005)

    Google Scholar 

  7. C. Couprie, L. Grady, L. Najman, J.-C. Pesquet, H. Talbot, Dual constrained TV-based regularization on graphs. SIAM J. Image Sci. 6(3), 1246–1273 (2013)

  8. K. Dabov, A. Foi, V. Katkovnik, K. Egiazarian, Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Process. 16(8), 2080–2095 (2007)

  9. W. Dong, X. Li, L. Zhang, G. Shi, Sparsity-based image denoising via dictionary learning and structural clustering. in CVPR 2011 (IEEE, 2011), p. 457–464 (2011)

  10. K. Egiazarian, A. Foi, V. Katkovnik, Compressed sensing image reconstruction via recursive spatially adaptive filtering. in 2007 IEEE International Conference on Image Processing, vol. 1 (IEEE, 2007). p. I–549

  11. M. Elad, On the origin of the bilateral filter and ways to improve it. IEEE Trans. Image Process. 11(10), 1141–1151 (2002)

    Article  MathSciNet  Google Scholar 

  12. A. Gadde, S.K. Narang, A. Ortega, Bilateral filter: graph spectral interpretation and extensions. in 2013 IEEE International Conference on Image Processing (IEEE, 2013). p. 1222–1226

  13. H. Hirschmüller, D. Scharstein, Evaluation of cost functions for stereo matching. in IEEE Conference on Computer Vision and Pattern Recognition (2007)

  14. A. Hyvärinen, P. Hoyer, E. Oja, Image denoising by sparse code shrinkage. in Intelligent Signal Processing (Citeseer, 1999)

  15. A. Kheradmand, P. Milanfar, A general framework for regularized, similarity-based image restoration. IEEE Trans. Image Process. 23(12), 5136–5151 (2014)

    Article  MathSciNet  Google Scholar 

  16. J. Liang, R. Liu, Stacked denoising autoencoder and dropout together to prevent overfitting in deep neural network, in 2015 8th International Congress on Image and Signal Processing (CISP) (2015)

  17. C. Louchet, L. Moisan, Total variation as a local filter. SIAM J. Image Sci. 4(2), 651–694 (2011)

    Article  MathSciNet  Google Scholar 

  18. F.G. Meyer, X. Shen, Perturbation of the eigenvectors of the graph Laplacian: application to image denoising. Appl. Comput. Harmon. Anal. 36(2), 326–334 (2014)

    Article  MathSciNet  Google Scholar 

  19. P. Milanfar, A tour of modern image filtering: New insights and methods, both practical and theoretical. IEEE Signal Process. Mag. 30(1), 106–128 (2012)

    Article  Google Scholar 

  20. M. Onuki, S. Ono, M. Yamagishi, Y. Tanaka, Graph signal denoising via trilateral filter on graph spectral domain. IEEE Trans. Signal Inform. Process. Over Netw. 2(2), 137–148 (2016)

    Article  MathSciNet  Google Scholar 

  21. J. Pang, G. Cheung, Graph Laplacian regularization for image denoising: analysis in the continuous domain. IEEE Trans. Image Process. 26(4), 1770–1785 (2017)

    Article  MathSciNet  Google Scholar 

  22. J.-M. Park, W.-J. Song, W. Pearlman, Speckle filtering of SAR images based on adaptive windowing. IEE Proc.-Vis. Image Signal Process. 146(4), 191–197 (1999)

    Article  Google Scholar 

  23. D. Scharstein, H. Hirschmüller, Y. Kitajima, G. Krathwohl, N. Nešić, X. Wang, P. Westling, High-resolution stereo datasets with subpixel-accurate ground truth, in German conference on pattern recognition (Springer, 2014). p. 31–42

  24. D.I. Shuman, S.K. Narang, P. Frossard, A. Ortega, P. Vandergheynst, The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Process. Mag. 30(3), 83–98 (2013)

    Article  Google Scholar 

  25. J.-L. Starck, E.J. Candès, D.L. Donoho, The curvelet transform for image denoising. IEEE Trans. Image Process. 11(6), 670–684 (2002)

    Article  MathSciNet  Google Scholar 

  26. C. Sutour, C.A. Deledalle, J.F. Aujol, Adaptive regularization of the NL-means: application to image and video denoising. IEEE Trans. Image Process. 23(8), 3506–3521 (2014)

    Article  MathSciNet  Google Scholar 

  27. H. Talebi, P. Milanfar, Nonlocal image editing. IEEE Trans. Image Process. 23(10), 4460–4473 (2014)

    Article  MathSciNet  Google Scholar 

  28. H. Talebi, P. Milanfar, Fast multilayer Laplacian enhancement. IEEE Trans. Comput. Imaging 2(4), 496–509 (2016)

    Article  MathSciNet  Google Scholar 

  29. G. Taubin, A signal processing approach to fair surface design, in Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques (1995). p. 351–358

  30. C. Tian, X. Yong, Z. Li, W. Zuo, L. Fei, H. Liu, Attention-guided CNN for image denoising. Neural Netw. 124, 177–129 (2020)

  31. C. Tian, X. Yong, W. Zuo, Image denoising using deep CNN with batch renormalization. Neural Netw. 121, 461–473 (2020)

    Article  Google Scholar 

  32. G. Vasile, E. Trouve, J.S. Lee, V. Buzuloiu, Intensity-driven adaptive-neighborhood technique for polarimetric and interferometric SAR parameters estimation. IEEE Trans. Geosci. Remote Sens. 44(6), 1609–1621 (2006)

    Article  Google Scholar 

  33. J. Zeng, G. Cheung, M. Ng, J. Pang, C. Yang, 3D point cloud denoising using graph Laplacian regularization of a low dimensional manifold model. IEEE Trans. Image Process. 29, 3474–3489 (2019)

  34. K. Zhang, W. Zuo, Y. Chen, D. Meng, L. Zhang, Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 26(7), 3142–3155 (2016)

Download references

Acknowledgements

Not applicable.

Funding

This research was funded by the National Natural Science Foundation of China (61625305).

Author information

Contributions

Conceptualization, methodology, writing-original draft, FY; software, validation, XC; supervision, funding acquisition, LC. All authors approved the final, submitted version of the manuscript.

Corresponding author

Correspondence to Fang Yang.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Yang, F., Chen, X. & Chai, L. Optimal graph edge weights driven nlms with multi-layer residual compensation. EURASIP J. Adv. Signal Process. 2021, 88 (2021). https://doi.org/10.1186/s13634-021-00800-z

  • DOI: https://doi.org/10.1186/s13634-021-00800-z

Keywords