Robust flash denoising/deblurring by iterative guided filtering
© Seo and Milanfar; licensee Springer. 2012
Received: 23 June 2011
Accepted: 6 January 2012
Published: 6 January 2012
A practical problem addressed recently in computational photography is that of producing a good picture of a poorly lit scene. The consensus approach for solving this problem involves capturing two images and merging them. In particular, using a flash produces one (typically high signal-to-noise ratio [SNR]) image and turning off the flash produces a second (typically low SNR) image. In this article, we present a novel approach for merging two such images. Our method is a generalization of the guided filter approach of He et al. and significantly improves its performance. In particular, we analyze the spectral behavior of the guided filter kernel using a matrix formulation, and introduce a novel iterative application of the guided filter. These iterations consist of two parts: a nonlinear anisotropic diffusion of the noisier image, and a nonlinear reaction-diffusion (residual) iteration of the less noisy one. The results of these two processes are combined in an unsupervised manner. We demonstrate that the proposed approach outperforms state-of-the-art methods for both flash/no-flash denoising and deblurring.
Recently, several techniques [1–5] to enhance the quality of flash/no-flash image pairs have been proposed. While the flash image is better exposed, the lighting is not soft, and generally results in specularities and an unnatural appearance. Meanwhile, the no-flash image tends to have a relatively low signal-to-noise ratio (SNR) while containing the natural ambient lighting of the scene. The key idea of flash/no-flash photography is to create a new image that is closest to the look of the real scene by having the details of the flash image while maintaining the ambient illumination of the no-flash image. Eisemann and Durand used bilateral filtering to give the flash image the ambient tones from the no-flash image. On the other hand, Petschnigg et al. focused on reducing noise in the no-flash image and transferring details from the flash image to the no-flash image by applying joint (or cross) bilateral filtering. Agrawal et al. removed flash artifacts, but did not test their method on no-flash images containing severe noise. As opposed to the visible flash used by [2–4], Krishnan and Fergus recently used both near-infrared and near-ultraviolet illumination for low-light image enhancement. Their so-called "dark flash" provides high-frequency detail in a less intrusive way than a visible flash does, even though it results in incomplete color information. All these methods ignored motion blur by either depending on a tripod setting or choosing a sufficiently fast shutter speed. In practice, however, images captured under low-light conditions with a hand-held camera often suffer from motion blur caused by camera shake.
More recently, Zhuo et al. proposed a flash deblurring method that recovers a sharp image by combining a blurry image and a corresponding flash image. They integrated a so-called flash gradient into a maximum-a-posteriori framework and solved the optimization problem by alternating between blur kernel estimation and sharp image reconstruction. This method outperformed many state-of-the-art single image deblurring [8–10] and color transfer methods. However, the final output of this method looks somewhat blurry because the model only deals with a spatially invariant motion blur.
Others have used multiple pictures of a scene taken at different exposures to generate high dynamic range images. This is called multi-exposure image fusion, which shares some similarity with our problem in that it seeks a new image that is of better quality than any of the input images. However, flash/no-flash photography is generally more difficult because there is only a pair of images. Enhancing a low SNR no-flash image with a spatially variant motion blur with the help of only a single flash image is still a challenging open problem.
2 Overview of the proposed approach
We have provided a significantly expanded statistical derivation and description of the guided filter and its properties in Section 3 and the Appendix.
We provide many more experimental results for both flash/no-flash denoising and deblurring in Section 5.
We describe the key ideas of diffusion and residual iteration and their novel relevance to iterative guided filtering in the Appendix.
We prove the convergence of the proposed iterative estimator in the Appendix.
As supplemental material, we share our project website, where flash/no-flash relighting examples are also presented.
In Section 3, we outline the guided filter and study its statistical properties. We describe how we actually estimate the linear model coefficients a, b, c, d and α, β, and we provide an interpretation of the proposed iterative framework in matrix form in Section 4. In Section 5, we demonstrate the performance of the system with some experimental results, and finally we conclude the article in Section 6.
3 The guided filter and its properties
Next, we study some fundamental properties of the guided filter kernel in matrix form.
where z is a vector of pixels in Z and W is only a function of z. The filter output can be analyzed as the product of a matrix of weights W with the vector of the given input image y.
The matrix W is symmetric as shown in Equation 8 and the sum of each row of W is equal to one (W 1_N = 1_N) by definition. However, as seen in Equation 6, the definition of the weights does not necessarily imply that the elements of the matrix W are positive in general. While this is not necessarily a problem in practice, we find it useful for our purposes to approximate this kernel with a proper admissible kernel. That is, for the purposes of analysis, we approximate W as a positive valued, symmetric positive definite matrix with rows summing to one, as similarly done in . For the details, we refer the reader to Appendix A.
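The properties listed above can be checked numerically on a toy problem. The following is a minimal sketch, not the authors' code: Equation 6 is not reproduced in this text, so the sketch assumes the standard guided filter kernel of He et al., in which each window contributes 1 + (z_i - μ_k)(z_j - μ_k)/(σ_k² + ε) to the weight between pixels i and j. It also assumes a 1-D guide signal with circular windows, so that every pixel lies in the same number of windows. The function name `guided_kernel_1d` and the parameters `radius` and `eps` are hypothetical.

```python
import numpy as np

def guided_kernel_1d(z, radius=2, eps=1e-2):
    """Dense guided-filter kernel matrix for a 1-D guide z (circular windows).

    Assumed form (He et al.): each window k adds
    1 + (z_i - mu_k)(z_j - mu_k) / (var_k + eps) to W[i, j] for i, j in the window.
    """
    n = z.size
    size = 2 * radius + 1
    assert n >= size, "signal must be at least one window long"
    W = np.zeros((n, n))
    for k in range(n):
        idx = np.arange(k - radius, k + radius + 1) % n  # pixels in window k
        d = z[idx] - z[idx].mean()                       # centered guide values
        W[np.ix_(idx, idx)] += 1.0 + np.outer(d, d) / (z[idx].var() + eps)
    return W / size**2

rng = np.random.default_rng(0)
z = np.cumsum(rng.standard_normal(32))  # toy guide signal
W = guided_kernel_1d(z)

# Symmetry, unit row sums, and the smallest entry (which need not be positive).
print(np.allclose(W, W.T), np.allclose(W.sum(axis=1), 1.0), W.min())
```

Under these assumptions the matrix is symmetric and each row sums to one by construction, while the sign of individual entries depends on the guide signal, which is why a positive approximation is useful for analysis.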
4 Iterative application of local LMMSE predictors
where we compute .
The idea of using these averaged coefficients , is analogous to the simplest form of aggregating multiple local estimates from overlapped patches in the image denoising and super-resolution literature. The aggregation helps the filter output look locally smooth and contain fewer artifacts. Recall that and correspond to the base layer and the detail layer, respectively. The effect of the regularization parameters ε1 and ε2 is quite the opposite in each case in the sense that the higher ε2 is, the more detail through can be obtained; whereas the lower ε1 ensures that the image content in is not over-smoothed.
where n is the iteration number and τ_n > 0 is set to be a monotonically decaying function of n such that converges. Figure 3 shows an example to illustrate that the resulting coefficients at the 20th iteration predict the underlying data better than α1, β1 do. Similarly, improves upon as shown in Figure 4. This iteration is closely related to diffusion and residual iteration, two important methods that we describe briefly below and in more detail in the Appendix.
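The aggregation of overlapping local linear fits can be illustrated in one dimension. This is a minimal sketch under stated assumptions: it implements one pass of the standard guided filter of He et al. (a local linear fit of the input to the guide in each window, followed by averaging of the fitted coefficients), with circular boundaries for simplicity. It does not implement the iterative scheme proposed in this article, and the names `box_mean`, `guided_filter_1d`, `slope`, and `offset` are hypothetical.

```python
import numpy as np

def box_mean(x, radius):
    """Circular moving average over a window of 2*radius + 1 samples."""
    return sum(np.roll(x, s) for s in range(-radius, radius + 1)) / (2 * radius + 1)

def guided_filter_1d(z, y, radius=2, eps=1e-2):
    """One pass of the (assumed) standard guided filter: guide z, input y."""
    mean_z = box_mean(z, radius)
    mean_y = box_mean(y, radius)
    cov_zy = box_mean(z * y, radius) - mean_z * mean_y
    var_z = box_mean(z * z, radius) - mean_z**2
    slope = cov_zy / (var_z + eps)    # per-window linear coefficient
    offset = mean_y - slope * mean_z  # per-window offset
    # Aggregate: average the coefficients of all windows covering each pixel.
    return box_mean(slope, radius) * z + box_mean(offset, radius)

rng = np.random.default_rng(0)
z = rng.random(32)                             # toy guide (e.g., flash signal)
y = 0.5 * z + 0.1 * rng.standard_normal(32)    # toy noisy input
out = guided_filter_1d(z, y)
```

A larger `eps` pushes `slope` toward zero, so the output approaches a plain local average of the input; a smaller `eps` lets more of the guide's structure through.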
5 Experimental results
In this section, we apply the proposed approach to flash/no-flash image pairs for denoising and deblurring. We convert images Z and Y from RGB color space to CIE Lab, and perform iterative guided filtering separately in each resulting channel. The final result is converted back to RGB space for display. We used the implementation of the guided filter from the author's website (footnote o). All figures in this section are best viewed in color (footnote p).
5.1 Flash/no-flash denoising
5.1.1 Visible flash 
5.1.2 Dark flash 
5.2 Flash/no-flash deblurring
6 Summary and future work
The guided filter has proved to be more effective than the joint bilateral filter in several applications. Yet we have shown that it can be improved significantly further. We analyzed the spectral behavior of the guided filter kernel using a matrix formulation and improved its performance by applying it iteratively. Iterations of the proposed method consist of a combination of diffusion and residual iteration. We demonstrated that the proposed approach yields outputs that preserve not only the fine details of the flash image but also the ambient lighting of the no-flash image. The proposed method outperforms state-of-the-art methods for flash/no-flash image denoising and deblurring. It would be interesting to see if the performance of other nonparametric filter kernels such as bilateral filters and locally adaptive regression kernels can be further improved in our iterative framework. It is also worthwhile to explore several other applications such as joint upsampling, image matting, mesh smoothing [24, 25], and specular highlight removal, where the proposed approach can be employed.
Positive definite and symmetric row-stochastic approximation of W
Furthermore, the vectors r and c are unique to within a scalar (i.e., αr and c/α). Sinkhorn's algorithm for obtaining r and c in effect involves repeated normalization of the rows and columns (see Algorithm 1 for details) so that they sum to one, and is provably convergent and optimal in the cross-entropy sense.
Algorithm 1 Algorithm for scaling a matrix A to a nearby doubly-stochastic matrix
Given a matrix A, let (N, N) be size(A) and initialize r = ones(N, 1);
for k = 1 : iter
    c = 1 ./ (A' * r);
    r = 1 ./ (A * c);
end
C = diag(c); R = diag(r);
The scaled matrix is then R * A * C.
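For readers who prefer an executable form, Algorithm 1 can be sketched in Python with NumPy. This is a minimal sketch, assuming a strictly positive input matrix and a fixed iteration count; the function name `sinkhorn_scale` and the test matrix are hypothetical and not part of the original article.

```python
import numpy as np

def sinkhorn_scale(A, iters=200):
    """Scale a positive matrix A to R @ A @ C with (nearly) unit row and column sums.

    Returns the scaled matrix together with the scaling vectors r and c.
    """
    A = np.asarray(A, dtype=float)
    r = np.ones(A.shape[0])
    for _ in range(iters):
        c = 1.0 / (A.T @ r)  # normalize the columns of diag(r) @ A
        r = 1.0 / (A @ c)    # normalize the rows of A @ diag(c)
    return (r[:, None] * A) * c[None, :], r, c

rng = np.random.default_rng(0)
A = rng.random((6, 6)) + 0.1
A = 0.5 * (A + A.T)          # symmetric, strictly positive toy matrix
S, r, c = sinkhorn_scale(A)

# Row sums are one after the final row update; column sums approach one.
print(S.sum(axis=1), S.sum(axis=0))
```

Because the last update in each pass is the row normalization, the rows sum to one up to rounding, while the column sums converge to one as the iteration proceeds. For a symmetric input, the converged result is symmetric as well.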
Diffusion and residual iteration
where is a scaled version of by, and therefore the left-hand side of the above is a discretization of the derivative operator , and as detailed in , W - I is effectively the nonlinear Laplacian operator corresponding to the kernel in (6).
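The link between repeated filtering and diffusion can be made concrete with a small numerical sketch. It rests on simplifying assumptions: a fixed, data-independent smoothing matrix stands in for the guided filter kernel (which in the article depends on the guide image), and the step size is absorbed into W. The names `diffuse`, `W`, and `y` here are illustrative only.

```python
import numpy as np

n = 64
I = np.eye(n)
# Assumed stand-in for W: circulant [1/4, 1/2, 1/4] averaging matrix, which is
# symmetric, has rows summing to one, and has eigenvalues in [0, 1].
W = 0.5 * I + 0.25 * (np.roll(I, 1, axis=1) + np.roll(I, -1, axis=1))

rng = np.random.default_rng(0)
y = np.sin(2 * np.pi * np.arange(n) / n) + 0.3 * rng.standard_normal(n)

def diffuse(W, y, steps):
    """Explicit diffusion steps z <- z + (W - I) z, i.e. z <- W z."""
    z = y.copy()
    for _ in range(steps):
        z = z + (W - np.eye(len(y))) @ z  # (W - I) plays the role of a Laplacian
    return z

z10 = diffuse(W, y, 10)
# Ten diffusion steps equal ten applications of the filter: z10 = W^10 y.
print(np.allclose(z10, np.linalg.matrix_power(W, 10) @ y))
```

With a symmetric row-stochastic W, each step preserves the mean of the signal and attenuates every other component, which is the smoothing behavior expected of a diffusion process.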
An alternative to repeated applications of the filter W is to consider the residual signals, defined as the difference between the estimated signal and the measured signal. This results in a variation of the diffusion estimator which uses the residuals as an additional forcing term. The net result is a type of reaction-diffusion process. In statistics, the use of the residuals in improving estimates has a rather long history, dating at least back to the study of Tukey, who termed the idea "twicing". More recently, the idea has been suggested in the applied mathematics community under the rubric of Bregman iterations, and in the machine learning and statistics literature as L2-boosting.
where F_n is a polynomial function of W of order n + 1. The first iterate is precisely the "twicing" estimate of Tukey.
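The residual iteration can be checked numerically as well. The sketch below assumes the standard twicing-style recursion, in which the filtered residual is added back to the current estimate starting from the plain filtered signal, and it uses a fixed smoothing matrix as a stand-in for the guided filter kernel. Under these assumptions the n-th iterate equals (I - (I - W)^(n+1)) y, a polynomial in W of order n + 1. The name `residual_iterate` is hypothetical.

```python
import numpy as np

n = 64
I = np.eye(n)
# Assumed stand-in for W: circulant [1/4, 1/2, 1/4] averaging matrix.
W = 0.5 * I + 0.25 * (np.roll(I, 1, axis=1) + np.roll(I, -1, axis=1))

rng = np.random.default_rng(1)
y = rng.standard_normal(n)

def residual_iterate(W, y, steps):
    """Start from z_0 = W y, then repeat z <- z + W (y - z)."""
    z = W @ y
    for _ in range(steps):
        z = z + W @ (y - z)  # add back the filtered residual
    return z

z1 = residual_iterate(W, y, 1)
z5 = residual_iterate(W, y, 5)

# First iterate is the twicing estimate (2W - W^2) y.
print(np.allclose(z1, (2 * W - W @ W) @ y))
# Fifth iterate matches the closed form (I - (I - W)^6) y.
print(np.allclose(z5, (I - np.linalg.matrix_power(I - W, 6)) @ y))
```

In contrast to diffusion, which removes more detail at every step, this recursion restores detail that the first filtering pass removed, so the two behave in opposite directions as the iteration count grows.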
Convergence of the proposed iterative estimator
where the inequality follows from the knowledge that 0 ≤ λ_N ≤ ... ≤ λ_3 ≤ λ_2 < λ_1 = 1. Furthermore, in Section 4 we defined τ_n to be a monotonically decreasing sequence such that . Hence, all eigenvalues λ_i(P_n) are upper bounded by the constant c, independent of the number of iterations n, ensuring the stability of the iterative process.
m This is generally defined as the difference between the estimated signal and the measured signal Z, but in our context refers to the detail signal.
n We refer the reader to Appendix C for proof of convergence of the proposed iterative estimator.
o http://personal.ie.cuhk.edu.hk/~hkm007/
p We refer the reader to the project website http://users.soe.ucsc.edu/~milanfar/research/rokaf/.html/IGF/
q The window sizes p for W_d and W were set to 21 and 5, respectively, for all the denoising examples.
r The window sizes p for W_d and W were set to 41 and 81, respectively, for the deblurring examples to deal with displacement between the flash and no-flash image.
s Note that due to the use of residuals, this is a different initialization than the one used in the diffusion iterations.
We thank Dilip Krishnan for sharing the post-processing code for dark-flash examples. This study was supported by the AFOSR Grant FA 9550-07-01-0365 and NSF Grant CCF-1016018. This study was done while the first author was at the University of California.
- He K, Sun J, Tang X: Guided image filtering. Proceedings of European Conference Computer Vision (ECCV) 2010.
- Petschnigg G, Agrawala M, Hoppe H, Szeliski R, Cohen M, Toyama K: Digital photography with flash and no-flash image pairs. ACM Trans Graph 2004, 23(3):664-672.
- Eisemann E, Durand F: Flash photography enhancement via intrinsic relighting. ACM Trans Graph 2004, 23(3):673-678.
- Agrawal A, Raskar R, Nayar S, Li Y: Removing photography artifacts using gradient projection and flash-exposure sampling. ACM Trans Graph 2005, 24: 828-835. doi:10.1145/1073204.1073269
- Zhuo S, Guo D, Sim T: Robust flash deblurring. IEEE Conference on Computer Vision and Pattern Recognition 2010.
- Tomasi C, Manduchi R: Bilateral Filtering for Gray and Color Images. Proceedings of the 1998 IEEE International Conference of Computer Vision, Bombay, India 1998, 836-846.
- Krishnan D, Fergus R: Dark flash photography. ACM Trans Graph 2009, 28(4):594-611.
- Shan Q, Jia J, Brown MS: Globally optimized linear windowed tone-mapping. IEEE Trans Vis Comput Graph 2010, 16(4):663-675.
- Fergus R, Singh B, Hertzmann A, Roweis ST, Freeman WT: Removing camera shake from a single image. ACM Trans Graph (SIGGRAPH) 2006, 25: 787-794. doi:10.1145/1141911.1141956
- Yuan L, Sun J, Quan L, Shum HY: Progressive inter-scale and intra-scale non-blind image deconvolution. ACM Trans Graph 2008, 27(3):1-10.
- Tai YW, Jia J, Tang CK: Local color transfer via probabilistic segmentation by expectation-maximization. IEEE Conference on Computer Vision and Pattern Recognition 2005.
- Hasinoff W: Variable-aperture photography. PhD Thesis, Department of Computer Science, University of Toronto; 2008.
- Seo H, Milanfar P: Computational photography using a pair of flash/no-flash images by iterative guided filtering. IEEE International Conference on Computer Vision (ICCV) 2011. Submitted
- Buades A, Coll B, Morel JM: A review of image denoising algorithms, with a new one. Multi-scale Model Simulat (SIAM) 2005, 4(2):490-530.
- Takeda H, Farsiu S, Milanfar P: Kernel regression for image processing and reconstruction. IEEE Trans Image Process 2007, 16(2):349-366.
- Buades A, Coll B, Morel JM: Nonlocal image and movie denoising. Int J Comput Vis 2008, 76(2):123-139. doi:10.1007/s11263-007-0052-1
- Hofmann T, Scholkopf B, Smola AJ: Kernel methods in machine learning. Ann Stat 2008, 36(3):1171-1220. doi:10.1214/009053607000000677
- Milanfar P: A tour of modern image processing. IEEE Signal Process Mag 2011. [http://users.soe.ucsc.edu/~milanfar/publications/journal/ModernTour_FinalSubmission.pdf]
- Protter M, Elad M, Takeda H, Milanfar P: Generalizing the non-local-means to super-resolution reconstruction. IEEE Trans Image Process 2009, 18: 36-51.
- Perona P, Malik J: Scale-space and edge detection using anisotropic diffusion. IEEE Trans Pattern Anal Mach Intell 1990, 12(9):629-639.
- Tukey JW: Exploratory Data Analysis. Addison Wesley, Reading, MA; 1977.
- Kopf J, Cohen MF, Lischinski D, Uyttendaele M: Joint bilateral upsampling. ACM Trans Graph 2007, 26(3):96. doi:10.1145/1276377.1276497
- He K, Sun J, Tang X: Fast matting using large kernel matting Laplacian matrices. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2010.
- Fleishman S, Drori I, Cohen-Or D: Bilateral mesh denoising. ACM Trans Graph 2003, 22(3):950-953. doi:10.1145/882262.882368
- Jones T, Durand F, Desbrun M: Non-iterative feature preserving mesh smoothing. ACM Trans Graph 2003, 22(3):943-949. doi:10.1145/882262.882367
- Yang Q, Wang S, Ahuja N: Real-time specular highlight removal using bilateral filtering. ECCV 2010.
- Knight PA: The Sinkhorn-Knopp algorithm: convergence and applications. SIAM J Matrix Anal Appl 2008, 30: 261-275. doi:10.1137/060659624
- Sinkhorn R: A relationship between arbitrary positive matrices and doubly stochastic matrices. Ann Math Stat 1964, 35(2):876-879. doi:10.1214/aoms/1177703591
- Darroch J, Ratcliff D: Generalized iterative scaling for log-linear models. Ann Math Stat 1972, 43: 1470-1480. doi:10.1214/aoms/1177692379
- Singer A, Shkolnisky Y, Nadler B: Diffusion interpretation of nonlocal neighborhood filters for signal denoising. SIAM J Imaging Sci 2009, 2: 118-139. doi:10.1137/070712146
- Cottet G, Germain L: Image processing through reaction combined with nonlinear diffusion. Math Comput 1993, 61: 659-673. doi:10.1090/S0025-5718-1993-1195422-2
- Osher S, Burger M, Goldfarb D, Xu J, Yin W: An iterative regularization method for total variation-based image restoration. Multiscale Model Simulat 2005, 4(2):460-489. doi:10.1137/040605412
- Buhlmann P, Yu B: Boosting with the L2 loss: regression and classification. J Am Stat Assoc 2003, 98(462):324-339. doi:10.1198/016214503000125
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.