Skip to main content

Efficient non-uniform deblurring based on generalized additive convolution model

Abstract

Image with non-uniform blurring caused by camera shake can be modeled as a linear combination of the homographically transformed versions of the latent sharp image during exposure. Although such a geometrically motivated model can well approximate camera motion poses, deblurring methods in this line usually suffer from the problems of heavy computational demanding or extensive memory cost. In this paper, we develop generalized additive convolution (GAC) model to address these issues. In GAC model, a camera motion trajectory can be decomposed into a set of camera poses, i.e., in-plane translations (slice) or roll rotations (fiber), which can both be formulated as convolution operation. Moreover, we suggest a greedy algorithm to decompose a camera motion trajectory into a more compact set of slices and fibers, and together with the efficient convolution computation via fast Fourier transform (FFT), the proposed GAC models concurrently overcome the difficulties of computational cost and memory burden, leading to efficient GAC-based deblurring methods. Besides, by incorporating group sparsity of the pose weight matrix into slice GAC, the non-uniform deblurring method naturally approaches toward the uniform blind deconvolution. Experimental results show that GAC-based deblurring methods can obtain satisfactory deblurring results compared to both state-of-the-art uniform and non-uniform deblurring methods and are much more efficient than non-uniform deblurring methods.

1 Introduction

Image blur is generally inevitable due to various factors such as defocus and camera shake. Blind deblurring from a real-world blurry image that needs to estimate the blur procedure and latent sharp image is a very ill-posed problem. Recently, several hardware-assisted approaches had been developed [16], by which additional information can be acquired to reduce the ill-posedness of the blind deblurring. These hardware-assisted approaches can provide much easier deviation of the blur or sharp image, but require complex camera configurations [1, 2] or dedicatedly designed hardware support [36], far from taking the place of traditional imaging devices. At the same time, in the era of ubiquitous acquisition of digital images using portable imaging devices, e.g., digital camera, mobile phone, camera shake is often unavoidable during exposure procedure, a major cause that ruins a photograph. For effective and efficient deblurring, it is crucial to develop appropriate forward blur model that can well explain the image degradation process of camera shake and then to design proper regularizers and efficient optimization algorithms.

Earlier approaches on camera shake deblurring usually assume that the blur is spatially invariant, due to which the blurry observation can be simply modeled as convolution of sharp image and blur kernel, and deblurring can thus be modeled as a blind deconvolution problem [712], where much attention had been devoted to design effective optimization algorithms. Since the convolution operation can be computed efficiently via fast Fourier transform (FFT), these blind deconvolution algorithms are commonly efficient. On one hand, Fergus et al. [7] adopted a mixture of Gaussians for representing the distribution of gradients and introduced a variational Bayesian (VB) approach [13] to estimate the blur kernel. The theoretical and experimental analysis by Levin et al. [8] demonstrated the advantages of VB over maximum a posterior (MAP) and inspired a number of VB-based blind deconvolution algorithms [7, 8, 14, 15]. On the other hand, by enforcing sparser priors [10, 16, 17] or exploiting edge prediction step [9, 1821] to select salient edges in latent image, several carefully designed MAP algorithms can also exhibit promising performance. Most recently, discriminative learning approaches had been developed to learn proper priors for better estimating blur kernel [2225].

However, camera shake blurring is often spatially varying and consequently cannot be simply modeled as a single blur kernel. Thus, how to reasonably model camera shake blurring plays a central role in non-uniform deblurring problem that can be categorized from two aspects. One direct strategy is to approximate the non-uniform blur as multiple blur kernels, where the blurry image is divided into several regions, and each region takes uniform blur. The multiple blur kernels can then be estimated from each region by performing uniform blind deconvolution method locally [26, 27]. The method proposed by Cao et al. [27] incorporated natural and text-specific dictionaries for the blind deblurring of natural scene text image. For non-uniform deblurring, they simply adopted the strategy of estimating the dedicated blur kernel for each text field. However, this kind of approach does not consider the global constraints on local blur kernels based on camera motion trajectory. The other geometrical strategy is based on the projective motion path blur model (PMPB) [28], where a camera shake trajectory can be decomposed into a sequence of camera poses lying in the 6D camera pose space (x-, y-, z-axes rotation and x-, y-, z-axes translation), and consequently, the sharp image projected by each pose is a homography, which is then weighted according to its exposure time, resulting in the blurry image. By far, there are mainly two simplified 3D geometrical models to approximate the 6D subspace, i.e., Whyte et al. [29] suggested to employ x-(pitch), y-(yaw), and z-(roll) axes rotation, while Gupta et al. [30] proposed to adopt x- and y-axes (in-plane) translation and z-axis rotation. By combining the global camera motion constraint and the efficient filter flow (EFF) framework [31], Hirsch et al. suggested to construct local uniform blur models guided by the geometrical constraint [32].

Even in the simplified 3D camera pose subspace [29, 30], non-uniform deblurring methods still suffer the problem of high computational cost or extensive memory burden, placing prominent restriction on its wide applications. To speed up the blur estimation step, Gupta et al. [30] pre-computed a sparse matrix for each homography transform, and thus, the forward non-uniform blur operator can be equivalently defined as the weighted sum of the homography transform matrices, and Hu and Yang [33] further restricted the possible camera pose in a low dimensional subspace. Although the pre-computation techniques can relatively relax the computational inefficiency, these accelerated methods [30, 33] take increasing memory burden to store the huge homography transform matices. Even so, subsequent computation of homography transform matrices is still computationally costly, and thus, it is interesting to ask that is it possible to design forward blur model to benefit from FFT for efficient non-uniform deblurring?

In this paper, we propose generalized additive convolution (GAC) model, by which a camera motion trajectory can be decomposed into in-plane translations (slice) or roll rotations (fiber), resulting in slice GAC model or fiber GAC model. In slice GAC model, a homography is formulated as the rotation of the convolution image of slice and sharp image, and in fiber GAC model, a homography is the inverse polar transform of the convolution of a kernel with the transformed sharp image (please refer to Section 3.2 for detailed proof), in which the convolution operation can be efficiently computed with the help of FFT. By this way, the GAC-based forward blur models only require several FFTs and a number of pixel-wise operations, significantly reducing computational complexity, and concurrently only the basis kernels and a number of rotation matrices are required to be stored so that the problem of memory burden is also relaxed. Furthermore, a greedy algorithm is proposed to generate a more compact set of slices and fibers from any camera motion trajectory, resulting in hybrid GAC model, a more promising way to concurrently solve the problems of computational inefficiency and memory burden. As for the optimization algorithm, we adopt the MAP framework to alternatively estimate non-uniform blur and latent image, in which generalized accelerated proximal gradient (GAPG) algorithm [34], a much efficient optimization algorithm for non-blind deconvolution, is employed for GAC-based non-uniform deblurring.

Moreover, in slice GAC, we introduce group sparsity, i.e., l 2,1-norm, on the pose weight matrix along the rotation angle dimension, interestingly providing a way to make non-uniform deblurring approach toward uniform blind deconvolution. Under alternative minimization framework, we also propose an effective solution to solve this problem. With the group sparsity constraint, more slices with rotation angles around 0 can be activated, and extremely when all the slices except angle 0 are inactivated, the non-uniform delburring method can degrade to uniform blind deconvolution.

Experimental results show that the proposed GAC method can obtain comparable or better deblurring results than the competing uniform and non-uniform deblurring methods. Compared to non-uniform deblurring method, GAC has a much lower peak memory usage than [16, 30, 33] and is much more efficient than the state-of-the-art camera shake removal methods [16, 29, 30, 32, 33].

We summarize our contributions as follows:

  • We develop GAC framework together with two GAC models, i.e., slice GAC and fiber GAC, for forward modeling of camera shake blurring, by which FFT can be employed for efficient computation.

  • To further reduce the computational complexity, a greedy algorithm is proposed to generate hybrid GAC from any camera motion trajectory, resulting in a more compact GAC model.

  • As for the optimization algorithm, we adopt a fast gradient method, i.e., GAPG [34], to estimate latent image, contributing to the faster convergence of the deblurring algorithm.

  • By imposing group sparsity of pose weight matrix in slice GAC, we interestingly provide a way to connect uniform and non-uniform deblurring.

The rest of the paper is organized as follows. Section 2 provides a brief review on camera shake models and optimization algorithms. Section 3 presents the proposed GAC models, which in Section 4 are embedded in non-uniform deblurring, and can be efficiently solved via GAPG. Section 5 presents experimental results, and finally, Section 6 concludes this paper.

2 Prerequisites and related work

For efficient non-uniform deblurring, it is crucial to design forward blur models and efficient optimization algorithms. In this section, we briefly review the original projective motion path blur (PMPB) model and its simplified 3D approximation, and the optimization algorithms in MAP-based deblurring.

2.1 Homography-based model

In the PMPB model, a camera shaken image is geometrically the integration of what the camera “sees” along the motion trajectory that can be modeled in a 6D camera pose space including x-, y-, and z-axes rotation and x-, y-, and z-axes translation. With the discretion of the time t and camera pose space, the integration can be rewritten as the weighted summation of all homographies in the discrete camera pose space, where a homography is the transformed sharp image projected by corresponding camera pose.

For each pose j along the camera motion trajectory, the corresponding homography can be represented as:

$$\begin{array}{@{}rcl@{}} {\boldsymbol{H}_{j}} = \boldsymbol{C}\left({\boldsymbol{R}_{j}} + \frac{\boldsymbol{t}_{j}}{d}\left[ \begin{array}{ccc} 0&0&1 \end{array} \right] \right){\boldsymbol{C}^{- 1}} \end{array} $$
((1))

where C is the matrix of camera intrinsic parameters, d is the depth of the scene, and R j and t j are the rotation matrix and translation vector, respectively.

Given H j , we can construct the corresponding warping matrix K j . Figure 1 shows a flowchart of PMPB model, and the PMPB model treats the process of non-uniform blur as a summation over the images transformed by the warping matrix

$$\begin{array}{@{}rcl@{}} \boldsymbol{b} = {\sum\nolimits}_{j} {{w_{j}}} {\boldsymbol{K}_{j}}\boldsymbol{x} + \boldsymbol{v} = \boldsymbol{Kx} + \boldsymbol{v} \end{array} $$
((2))
Fig. 1
figure 1

PMPB model for camera shake blurring

where w j is the fraction of time the camera spent at pose j, \(\boldsymbol {b}\in {{\mathbb {R}}^{n\times n}}\) is the blurry image, \(\boldsymbol {x}\in {{\mathbb {R}}^{n\times n}}\) is the latent sharp image, K is the sparse warping matrix with size n 2×n 2, and v denotes the additive Gaussian white noise.

PMPB is faithful to real-world camera motion, but has too many unknowns to be estimated. Fortunately, recent studies have shown that the full 6D camera pose space can be well approximated by the discrete 3D subspace of pitch, yaw, and roll rotations [29], or the 3D subspace of in-plane translations and roll rotation in wide focal lengths range [30], which are both effective [35]. In [33], constrained camera pose subspace was further introduced to refine the set of camera poses. Interestingly, the global camera motion constraint can be adopted to guide the construction of local blur kernels [32]. However, the method in [29] suffers from the problem of heavy computational load to compute huge warping matrices, while the methods in [30, 33] suffer from the extensive memory burden problem to store huge warping matrices.

2.2 Additive convolution model

Deng et al. [36] suggested an additive convolution (AC) model for non-blind deblurring, where non-uniform blur is modeled as the weighted summation of the convolution of the sharp image with a set of basis convolution kernels

$$\begin{array}{@{}rcl@{}} {\boldsymbol{b}} = {\sum\nolimits}_{i = 1}^{C} {{{\boldsymbol{\alpha }}_{i}} \circ \left({{\boldsymbol{x}} \otimes {{\boldsymbol{k}}_{i}}} \right)} \end{array} $$
((3))

where k i is the i-th basis convolution kernel, α i denotes the i-th weight matrix, denotes the convolution operation, denotes the pixel-wise multiplication operator, and C is the number of basis convolution kernels. With FFT, the computational complexity of AC model is only \(O(Cn^{2}\mathop {\log }n)\). Furthermore, principal component analysis (PCA) is adopted to learn the basis convolution kernels in advance. However, when applied to blind non-uniform deblurring, both the basis kernels and the weighted maps should be updated in each iteration, making it unpractical in blind deblurring of camera shaken images.

In this paper, we propose a fast forward blur model, i.e., GAC, to represent non-uniform blur caused by camera shake, which is distinctly different with the existing models. Instead of utilizing sparse warping matrix [29, 30, 33], GAC models reformulate homography as convolution related to slice and fiber and can achieve better trade-off between computational cost and memory complexity. Different from [32], GAC is an efficient implementation of the global geometric model rather than the locally uniform approximation. Compared with [36], the basis convolution kernels and the weighted maps can be efficiently constructed based on the camera motion trajectory, and thus is easy to update.

2.3 MAP-based blind deblurring

Given the blurry image b, MAP strategy is usually adopted to estimate the blur K and the latent sharp image x by minimizing the negative logarithm of posterior probability with respect to K and x,

$$\begin{array}{*{20}l} -{\log} p({\boldsymbol K},{\boldsymbol{x}}|{\boldsymbol{b}})\propto&-{\log} p({\boldsymbol{y}}|{\boldsymbol{K}},{\boldsymbol{x}})-{\log} p({\boldsymbol{x}})\\ &-{\log} p({\boldsymbol{K}})\\ {\propto}&{\frac{1}{2}{\left\| {\boldsymbol{K}}{\boldsymbol{x}} - \boldsymbol{b} \right\|}^{2}} + {\Phi_{1}}({\boldsymbol{x}}) + {\Phi_{2}}({\boldsymbol{K}}) \end{array} $$
((4))

where Φ 1(x) denotes the regularizer on the latent sharp image and Φ 2(K) denotes the regularizer on the blur operator. For non-uniform blur, \(\boldsymbol {K} \boldsymbol {x} = {\sum \nolimits }_{j} {{w_{j}}} {{\boldsymbol {K}}_{j}}{\boldsymbol {x}}\phantom {\dot {i}\!}\), where the blur operator is parameterized by the weights w j . Like the alternative updating strategy commonly adopted in uniform blind deconvolution [16, 18], the estimation of weight matrix W and latent image x should be alternatively performed.

To constrain the ill-posed problem for better deblurring quality, proper regularizers should be imposed. In 3D camera pose subspace, only a few poses are active given a camera motion trajectory so that it is reasonable to introduce sparsity on W. In [29, 33], l 2-norm of weight matrix or its gradient is imposed, where the estimation of W is a linear least square problem, which can be solved by gradient-based optimization method, while in [30], non-convex regularizer is introduced, which is optimized by an iterative re-weighted least squares (IRLS).

As for updating the latent clear image, the edge prediction step is usually necessary to guarantee the algorithm converge to the desired solution, so that the sparsity, e.g., total variation [29, 33], is also imposed. In earlier researches, simple but efficient optimization method, e.g., Rachardson-Lucy (RL) algorithm, was adopted to solve the problem [28]; however, the deblurring results often suffer from ringing artifacts. As a non-blind deconvolution problem, fast gradient-based optimization method has been intensively studied [37, 38]. The iterative shrinkage thresholding algorithm (IST) [37] was first proposed, and due to its simplicity and efficiency, two accelerated IST-based algorithms had been developed, i.e., FISTA [39] and TwIST [38], which both possess higher convergence rate. Furthermore, Zuo and Lin proposed the GAPG algorithm [34], which further accelerated the convergence rate of APG algorithm.

In this paper, the GAC-based deblurring problem is solved in the alternative minimization framework, where the updating of pose weight matrix with both sparsity and group sparsity regularizers can be solved, and the updating of latent clear image can be efficiently solved by GAPG algorithm.

3 Generalized additive convolution model for camera shake

In this section, we first propose the general form of the GAC model and then decompose camera motion trajectory into slices and fibers, which provides a solution to specify GAC model for efficient modeling of camera shake blurring. Finally, we propose a greedy algorithm to generate hybrid GAC model and discuss the memory and computational complexity of the GAC model.

3.1 Generalized additive convolution model

The form of the GAC model is defined as

$$\begin{array}{@{}rcl@{}} {\boldsymbol{b}} = {\sum\nolimits}_{i = 1}^{C} {{{\boldsymbol{f}}_{i}}\left({{{\boldsymbol{g}}_{i}}\left({\boldsymbol{x}} \right) \otimes {{\boldsymbol{k}}_{i}}} \right)} \end{array} $$
((5))

where f i and g i are two pixel-wise operators. GAC is the generalization of the AC model [36], by defining f i (x)=α i x and g i (x)=x. Moreover, since both f i and g i are pixel-wise operators, the computational complexity of GAC is the same as that of the AC model, i.e., O(C n 2log2 n), where n 2 is the number of pixels of the image x. The smaller the C value indicates that GAC model is more efficient. Thus, the key issue of GAC model is to better specify k i , f i , and g i to reduce the C value.

3.2 Decomposition of slice and fiber

In this subsection, we show that camera poses in the 3D camera pose subspace [30] can be decomposed into slices and fibers, which then provides proper choices to design k i , f i , and g i in GAC model to reduce C value.

In the 3D camera pose subspace [30], pose in a camera motion trajectory is parameterized as

$$\begin{array}{@{}rcl@{}} {\boldsymbol{\theta }} = \left({{\theta_{z}},{t_{x}},{t_{y}}} \right) \end{array} $$
((6))

where θ z is the roll angle and t x and t y are the translations along x- and y- axes, respectively. For each pose in the 3D camera subspace, θ j =(θ z,j ,t x,j ,t y,j ), by defining

$$\begin{array}{@{}rcl@{}} {{\boldsymbol{M}}_{{{\boldsymbol{\theta }}_{j,r}}}} = \left[ {\begin{array}{ccc} {\cos {{{\theta }}_{z,j}}}&{ - \sin {{{\theta }}_{z,j}}}&{{t_{x,j}}}\\ {\sin {{{\theta }}_{z,j}}}&{\cos {{{\theta }}_{z,j}}}&{{t_{y,j}}}\\ 0&0&1 \end{array}} \right] \end{array} $$
((7))

the homography [40] can be defined as:

$$\begin{array}{@{}rcl@{}} \begin{array}{l} {{\boldsymbol{H}}_{\boldsymbol{\theta }}}_{_{j}} = {\boldsymbol{C}}{{\boldsymbol{M}}_{\boldsymbol{\theta }}}_{_{j}}{{\boldsymbol{C}}^{- 1}} \\ \quad \quad = {\boldsymbol{C}}\left({{{\boldsymbol{R}}_{\theta} }_{_{z,j}} + \frac{1}{d}*{{\boldsymbol{t}}_{j}} \left[ {\begin{array}{ccc} 0&0&1 \end{array}} \right]} \right) {{\boldsymbol{C}}^{- 1}}\\ \quad \quad = {\boldsymbol{C}} \left[ {\begin{array}{ccc} {\cos ({\theta_{z,j}})}&{ - \sin ({\theta_{z,j}})}&{{t_{x,j}}}\\ {\sin ({\theta_{z,j}})}&{\cos ({\theta_{z,j}})}&{{t_{y,j}}}\\ 0&0&1 \end{array}} \right] {{\boldsymbol{C}}^{- 1}} \end{array} \end{array} $$
((8))

where t j =[t x,j ,t y,j ,1]T is the translation vector, d is the depth of the scene.

As in [30, 33], we also assume that the camera intrinsic parameters are known in advance, and the calibration matrix has the standard form:

$$\begin{array}{@{}rcl@{}} {\boldsymbol{C}} = \left| {\begin{array}{ccc} \alpha_{x}&0&{{x_{0}}}\\ 0&\alpha_{y}&{{y_{0}}}\\ 0&0&1 \end{array}} \right| \end{array} $$
((9))

where \(\alpha _{x}=f*\frac {{{m}_{x\_{\text {ccd}}}}}{{{m}_{\textit {x\_im}}}}\phantom {\dot {i}\!}\) is the scale factors relating pixels to distance, \({{m}_{x\_{\text {ccd}}}}\) is the maximum width of the CCD, \({{m}_{x\_{\text {im}}}}\) is the maximum width of the related image, \(\alpha _{y}=f*\frac {{{m}_{y\_{\text {ccd}}}}}{{{m}_{y\_{\text {im}}}}}\phantom {\dot {i}\!}\) is quantified in the same manner, f is the focal length, and (x 0,y 0) is the center coordinate.

Given \({{\boldsymbol {H}}_{\boldsymbol {\theta }}}_{_{j}}\phantom {\dot {i}\!}\), we can construct the corresponding warping matrix \({{\boldsymbol {K}}_{\boldsymbol {\theta }}}_{_{j}}\phantom {\dot {i}\!}\) (more detailed explanations can be found in [30, 33, 36]). We then have that the homography transform can be decomposed into in-plane translation followed by roll rotation, yielding the following proposition.

Proposition 1.

$$\begin{array}{@{}rcl@{}} {{\boldsymbol{K}}_{\boldsymbol{\theta }}}_{_{j}} = {{\boldsymbol{K}}_{\boldsymbol{\theta}}}_{_{j,r}}{{\boldsymbol{K}}_{\boldsymbol{\theta }}}_{_{j,t}} \end{array} $$

Proof.

By defining

$$\begin{array}{@{}rcl@{}} {{\boldsymbol{R}}_{{{\boldsymbol{\theta }}_{j,r}}}} = \left[ {\begin{array}{ccc} {\cos {{{\theta }}_{z,j}}}&{ - \sin {{{\theta }}_{z,j}}}&0\\ {\sin {{{\theta }}_{z,j}}}&{\cos {{{\theta }}_{z,j}}}&0\\ 0&0&1 \end{array}} \right] \end{array} $$
$$\begin{array}{@{}rcl@{}} {{\boldsymbol{T}}_{\boldsymbol{\theta }}}_{_{j,t}} = \left[ {\begin{array}{ccc} 1&0&{{t_{x,j}}}\\ 0&1&{{t_{y,j}}}\\ 0&0&1 \end{array}} \right] \end{array} $$

and according to the definition of \({{\boldsymbol {M}}_{\boldsymbol {\theta }}}_{_{j}}\phantom {\dot {i}\!}\) in Eq. (7), it is obvious to see that \({{\boldsymbol {M}}_{\boldsymbol {\theta }}}_{_{j}} = {{\boldsymbol {T}}_{\boldsymbol {\theta }}}_{_{j,t}}{{\boldsymbol {R}}_{\boldsymbol {\theta }}}_{_{j,r}}\phantom {\dot {i}\!}\). Based on the definition of \({{\boldsymbol {H}}_{\boldsymbol {\theta }}}_{_{j}}\phantom {\dot {i}\!}\) in Eq. (8), we can define \({{\boldsymbol {H}}_{\boldsymbol {\theta }}}_{_{j,r}}\phantom {\dot {i}\!}\) and \({{\boldsymbol {H}}_{\boldsymbol {\theta }}}_{_{j,t}}\phantom {\dot {i}\!}\) as \({{\boldsymbol {H}}_{\boldsymbol {\theta }}}_{_{j,r}} = {\boldsymbol {C}}{{\boldsymbol {R}}_{\boldsymbol {\theta }}}_{_{j,r}}{{\boldsymbol {C}}^{-1}}\phantom {\dot {i}\!}\) and \({{\boldsymbol {H}}_{\boldsymbol {\theta }}}_{_{j,t}} = {\boldsymbol {C}}{{\boldsymbol {T}}_{\boldsymbol {\theta }}}_{_{j,t}}{{\boldsymbol {C}}^{-1}}\phantom {\dot {i}\!}\). One can easily see that

$$\begin{array}{*{20}l} {{\boldsymbol{H}}_{\boldsymbol{\theta }}}_{_{j,t}}{{\boldsymbol{H}}_{\boldsymbol{\theta }}}_{_{j,r}}&={\boldsymbol{C}}{{\boldsymbol{T}}_{\boldsymbol{\theta }}}_{_{j,t}}{{\boldsymbol{C}}^{- 1}}{\boldsymbol{C}}{{\boldsymbol{R}}_{\boldsymbol{\theta }}}_{_{j,r}}{{\boldsymbol{C}}^{- 1}}\\ &={\boldsymbol{C}}{{\boldsymbol{T}}_{\boldsymbol{\theta }}}_{_{j,t}}{{\boldsymbol{R}}_{\boldsymbol{\theta }}}_{_{j,r}}{{\boldsymbol{C}}^{- 1}} = {{\boldsymbol{H}}_{\boldsymbol{\theta }}}_{_{j}} \end{array} $$

and we then have

$$\begin{array}{@{}rcl@{}} \begin{array}{c} {\boldsymbol{H}}_{{{\boldsymbol{\theta }}_{j}}}^{- 1} = {\boldsymbol{H}}_{{{\boldsymbol{\theta}}_{j,r}}}^{- 1}{\boldsymbol{H}}_{{{\boldsymbol{\theta }}_{j,t}}}^{- 1} \end{array} \end{array} $$
((10))

Based on the definition of \({{\boldsymbol {K}}_{\boldsymbol {\theta }}}_{_{j}}{\boldsymbol {x}}\phantom {\dot {i}\!}\), for the pixel at the location [l x1,l x2,1]T, \({{\boldsymbol {K}}_{\boldsymbol {\theta }}}_{_{j}}{\boldsymbol {x}}\phantom {\dot {i}\!}\) assigns the value of the pixel located at \({\boldsymbol {H}}_{{{\boldsymbol {\theta }}_{j}}}^{- 1}\left ({\left [l_{x1}, l_{x2}, 1\right ]^{T}}\right)\) of x to the pixel located at [l x1,l x2,1] of \({{\boldsymbol {K}}_{\boldsymbol {\theta }}}_{_{j}}{\boldsymbol {x}}\phantom {\dot {i}\!}\). Based on Eq. (10), \({\boldsymbol {H}}_{{{\boldsymbol {\theta }}_{j}}}^{- 1}\left ({\left [l_{x1}, l_{x2}, 1\right ]^{T}}\right)\) can also be explained as \({\boldsymbol {H}}_{{{\boldsymbol {\theta }}_{j,r}}}^{- 1}{\boldsymbol {H}}_{{{\boldsymbol {\theta }}_{j,t}}}^{- 1}\left ({\left [l_{x1}, l_{x2}, 1\right ]^{T}}\right)\), which means that \({{\boldsymbol {K}}_{{{\boldsymbol {\theta }}_{j}}}}{\boldsymbol {x}} = {{\boldsymbol {K}}_{{{\boldsymbol {\theta }}_{j,r}}}}{{\boldsymbol {K}}_{{{\boldsymbol {\theta }}_{j,t}}}}{\boldsymbol {x}}\phantom {\dot {i}\!}\).

Proposition 1 indicates that \({{\boldsymbol {K}}_{{{\boldsymbol {\theta }}_{j}}}}{\boldsymbol {x}}\phantom {\dot {i}\!}\) is the combination of two atom operations, i.e., first translating the image x by t x,j and t y,j along x-axis and y-axis respectively, and then rotating the translated image by roll angle θ z,j . So we can rewrite \({{\boldsymbol {K}}_{{{\boldsymbol {\theta }}_{j,r}}}}{\boldsymbol {x}}\phantom {\dot {i}\!}\) and \({{\boldsymbol {K}}_{{{\boldsymbol {\theta }}_{j,t}}}}{\boldsymbol {x}}\phantom {\dot {i}\!}\) as

$$\begin{array}{@{}rcl@{}} {{\boldsymbol{K}}_{{{\boldsymbol{\theta }}_{j,r}}}}{\boldsymbol{x}} = {{\boldsymbol{R}}_{{{{\theta }}_{z,j}}}}({\boldsymbol{x}}) \end{array} $$
((11))
$$\begin{array}{@{}rcl@{}} {{\boldsymbol{K}}_{{{\boldsymbol{\theta }}_{j,t}}}}{\boldsymbol{x}} = {{\boldsymbol{k}}_{{{\boldsymbol{t}}_{j}}}} \otimes {\boldsymbol{x}} \end{array} $$
((12))

where \({{\boldsymbol {R}}_{{{{\theta }}_{z,j}}}}\phantom {\dot {i}\!}\) denotes the pixel-wise image rotation operation, and the translation convolution kernel is defined as

$$ {{\boldsymbol{k}}_{{{\boldsymbol{t}}_{j}}}}(x,y) \left\{ \begin{array}{ll} 1,&\text{if}\;x = {t_{x,j}}\;\text{and}\;y = {t_{y,j}}\\ 0,&\text{else} \end{array}\right. $$
((13))

The non-uniform forward blur model Eq. (2) can be further formulated as

$$\begin{array}{@{}rcl@{}} {\boldsymbol{Kx}} = {\sum\nolimits}_{{{\boldsymbol{\theta }}_{j}}} {{w_{{{\boldsymbol{\theta }}_{j}}}}{{\boldsymbol{K}}_{{{\boldsymbol{\theta }}_{j}}}}{\boldsymbol{x}}} = {\sum\nolimits}_{{{\boldsymbol{\theta }}_{j}}} {{w_{{{\boldsymbol{\theta }}_{j}}}}{{\boldsymbol{K}}_{{\theta_{j,r}}}}{{\boldsymbol{K}}_{{\theta_{j,t}}}} {\boldsymbol{x}}} \end{array} $$
((14))

where \({w_{{{\boldsymbol {\theta }}_{j}}}}\phantom {\dot {i}\!}\) denotes the contribution (weight) of pose θ j . It should be noted that the poses along a real camera motion trajectory form a connected 1D path, and thus, the weights of most poses are zeros. So in Eq. (14), we can only consider the subset of poses with positive weights, \({\mathcal {P}} = \left \{ {{{\boldsymbol {\theta }}_{j}}:{w_{{{\boldsymbol {\theta }}_{j}}}} > 0} \right \}\phantom {\dot {i}\!}\).

Finally, we define two special classes of subsets of \({\mathcal {P}}\): slice and fiber. A slice \({{\mathcal {S}}_{\theta } }\) is defined as \({{\mathcal {S}}_{\theta }} = \{ {{\boldsymbol {\theta }}_{j}} = \{ {\theta _{z,j}},{t_{x,j}},{t_{y,j}}\} : {{\boldsymbol {\theta }}_{j}} \in {\mathcal {P}} {\; \text {and} \;} {\theta _{z,j}} = \theta \}\phantom {\dot {i}\!}\), while a fiber \({{\mathcal {F}}_{\boldsymbol {t}}}\) is defined as \({{\mathcal {F}}_{\boldsymbol {t}}} = \{ {{\boldsymbol {\theta }}_{j}} = \{ {\theta _{z,j}}, {t_{x,j}},{t_{y,j}}\} :{{\boldsymbol {\theta }}_{j}} \in {\mathcal {P}} {\; \text {and} \;}({t_{x,j}},{t_{y,j}}) = {\boldsymbol {t}}\}\phantom {\dot {i}\!}\). Actually, \({\mathcal {P}}\) can be decomposed into a number of non-intersected slices and fibers. In the following, we will introduce how to construct slices and fibers from \({\mathcal {P}}\) and how to reformulate camera shake as the GAC model.

3.3 Slice-based GAC

Figure 2 shows an example of a slice of camera poses which have the same roll angle. Given a slice \({{\mathcal {S}}_{\theta } }\), the blur caused by camera motion within \({{\mathcal {S}}_{\theta } }\) can be formulated as

$$\begin{array}{*{20}l} {{\boldsymbol{K}}_{\theta} }{\boldsymbol{x}}&=\sum\limits_{{{\boldsymbol{\theta }}_{j}} \in {{\mathcal{S}}_{\theta} }} {{w_{{{\boldsymbol{\theta }}_{j}}}}{{\boldsymbol{K}}_{\theta,r}}{{\boldsymbol{K}}_{{{\boldsymbol{\theta }}_{j,t}}}}{\boldsymbol{x}}}\\ &=\sum\limits_{{{\boldsymbol{\theta }}_{j}} \in {{\mathcal{S}}_{\theta} }} {{w_{{{\boldsymbol{\theta }}_{j}}}}{{\boldsymbol{R}}_{\theta} } ({{\boldsymbol{k}}_{{{\boldsymbol{t}}_{j}}}} \otimes {\boldsymbol{x}})} \\ &={{\boldsymbol{R}}_{\theta} }\left({\left({\sum\limits_{{{\boldsymbol{\theta }}_{j}} \in {{\mathcal{S}}_{\theta} }} {{w_{{{\boldsymbol{\theta }}_{j}}}}{{\boldsymbol{k}}_{{{\boldsymbol{t}}_{j}}}}}} \right) \otimes {\boldsymbol{x}}} \right)\\ &={{\boldsymbol{R}}_{\theta} }\left({{{\boldsymbol{k}}_{{{\mathcal{S}}_{\theta} }}} \otimes {\boldsymbol{x}}} \right) \end{array} $$
((15))
Fig. 2
figure 2

Poses of a camera motion trajectory which form a slice

where \({{\boldsymbol {k}}_{{{\mathcal {S}}_{\theta } }}}\) denotes the slice kernel with respect to the roll angle θ.

For a general camera motion trajectory \({\mathcal {P}}\), we first classify the poses in \({\mathcal {P}}\) into a number of non-intersected slices with \({\mathcal {P}} = {\cup _{\theta }}\{{{\mathcal {S}}_{\theta }}\}\), and then, the non-uniform blur in Eq. (14) can be equivalently reformulated as

$$\begin{array}{@{}rcl@{}} {\boldsymbol{Kx}} = {\sum\nolimits}_{\boldsymbol{\theta }} {{{\boldsymbol{R}}_{\theta} }\left({{{\boldsymbol{k}}_{{{\mathcal{S}}_{\theta} }}} \otimes {\boldsymbol{x}}} \right)} \end{array} $$
((16))

It is obvious that this is a GAC model with f θ (x)=R θ (x) and g θ (x)=x. If the range of the roll angles is discretized into n z intervals, we can see that the number C in slice GAC should be not higher than n z .

Similarly, we can define the adjoint operator K T y as

$$\begin{array}{@{}rcl@{}} {{\boldsymbol{K}}^{T}}{\boldsymbol{y}} = {\sum\nolimits}_{\boldsymbol{\theta }} {{{{\boldsymbol{\tilde k}}}_{{{\mathcal{S}}_{\theta} }}} \otimes {\boldsymbol{R}}_{\theta}^{T}({\boldsymbol{y}})} \end{array} $$
((17))

where \({{\boldsymbol {\tilde k}}_{{{\mathcal {S}}_{\theta } }}}\) is the adjoint operator of \({{\boldsymbol {k}}_{{{\mathcal {S}}_{\theta } }}}\) constructed by flipping the \({{\boldsymbol {k}}_{{{\mathcal {S}}_{\theta } }}}\) upside-down and left-to-right, and \({\boldsymbol {R}}_{\theta }^{T}\) is the strict adjoint operator of the discrete version of R θ . It is obvious that the adjoint operator in Eq. (17) is also a GAC model and can be efficiently computed.

Finally, we discuss some implementation issues of slice GAC. To implement R θ , one can simply adopt the Matlab command imrotate. To enhance the efficiency, we maintain a lookup table (LUT) for each discrete roll angle to record the correspondence of the coordinates before and after rotation. By discretizing the range of roll angles into n z intervals, we pre-compute and store n z LUTs. In the continuous case, R θ is the adjoint operator of R θ , but in the discrete case, the error caused by discretization and interpolation cannot be overlooked. Thus, instead of using R θ , we adopt the strictly adjoint operator \({\boldsymbol {R}}_{\theta }^{T}\).

3.4 Fiber-based GAC

Figure 3 shows an example of a fiber of camera poses which have the same translation t=[t x ,t y ]T, which applies a visual validation of our fiber-based GAC model. Given a fiber \({{\mathcal {F}}_{\boldsymbol {t}}}\), the non-uniform blur caused by the camera motion along \({{\mathcal {F}}_{\boldsymbol {t}}}\) can be formulated as

$$\begin{array}{*{20}l} {{\boldsymbol{K}}_{\boldsymbol{t}}}{\boldsymbol{x}}&=\sum\limits_{{\boldsymbol{\theta }_{j}} \in {{\mathcal{F}}_{\boldsymbol{t}}}} {{w_{{\boldsymbol{\theta }_{j}}}} {{\boldsymbol{K}}_{\boldsymbol{\theta}_{j},r}}{{\boldsymbol{K}}_{{{\boldsymbol{\theta }}_{j,t}}}}{\boldsymbol{x}}} \\ &=\sum\limits_{{{\boldsymbol{\theta }}_{j}} \in {{\mathcal{F}}_{\boldsymbol{t}}}} {{w_{{{\boldsymbol{\theta }}_{j}}}}{{\boldsymbol{R}}_ {{\theta_{z,j}}}}({{\boldsymbol{K}}_{{{\boldsymbol{\theta }}_{j,t}}}}{\boldsymbol{x}})} \\ &=\text{IPT}\left({{{\boldsymbol{w}}_{\boldsymbol{t}}} \otimes \text{PT}({{\boldsymbol{K}}_{{{\boldsymbol{\theta}}_{\boldsymbol{t}}}}}{\boldsymbol{x}})} \right) \end{array} $$
((18))
Fig. 3
figure 3

Simulated blurry images obtained using fiber GAC model Eq. (20) and using the 3D camera pose subspace model in [30]. a Fiber, b sharp image, c blurry-image based on the model in [30], and d blurry image based on fiber GAC model

where θ t =[0,t x ,t y ]T, \(\boldsymbol {K}_{{\boldsymbol {\theta }}_{\boldsymbol {t}}} \boldsymbol {x}\phantom {\dot {i}\!}\) denotes the in-plane translation operation, and PT(·) and IPT(·) stand for the polar transform and inverse polar transform [41], respectively. In the polar transform, we use the same interval to discretize angular and roll angles, and thus, the basis filter w t can be defined as

$$\begin{array}{@{}rcl@{}} {{\boldsymbol{w}}_{\boldsymbol{t}}} = \left[{w_{{\boldsymbol{t}},{\theta_{1}}}},{w_{{\boldsymbol{t}},{\theta_{1}}}}, \cdots,{w_{{\boldsymbol{t}},{\theta_{nz}}}}\right] \end{array} $$
((19))

where θ 1 is the minimal roll angle and θ nz is the maximal roll angle.

We compare the model in Eq. (18) with the 3D camera pose subspace model in [30] for the simulation of blur caused by camera shake along a fiber. Figure 3 shows the blurry images obtained by using these two methods. One can easily see that the difference between Fig. 3 c, d is insignificant, which demonstrates that the model in Eq. (18) can be used for modeling camera shaken image with a series of roll rotations.

For a general camera motion trajectory \({\mathcal {P}}\), we first classify the poses in \({\mathcal {P}}\) into a number of non-intersected fibers with \({\mathcal {P}} = { \cup _{\boldsymbol {t}}}\{ {{\mathcal {F}}_{\boldsymbol {t}}}\}\), and then, the non-uniform blur in Eq. (14) can be equivalently reformulated as

$$\begin{array}{@{}rcl@{}} {\boldsymbol{Kx}} = {\sum\nolimits}_{\boldsymbol{t}} \boldsymbol{K}_{\boldsymbol{t}} \boldsymbol{x} = {\sum\nolimits}_{\boldsymbol{t}} {\text{IPT}\left({{{\boldsymbol{w}}_{\boldsymbol{t}}} \otimes \text{PT}({{\boldsymbol{K}}_{{{\boldsymbol{\theta }}_{\boldsymbol{t}}}}}{\boldsymbol{x}})} \right)} \end{array} $$
((20))

It is obvious that this is a GAC model with f t (x)=IPT(x) and \({{\boldsymbol {g}}_{\boldsymbol {t}}}({\boldsymbol {x}}) = \text {PT}({{\boldsymbol {K}}_{\boldsymbol {\theta }_{\boldsymbol {t}}}}{\boldsymbol {x}})\phantom {\dot {i}\!}\). If the range of the in-plane translations is discretized into n x ×n y intervals, then the number C in fiber GAC should be not higher than n x n y .

We can then define the adjoint operator K T y as

$$\begin{array}{@{}rcl@{}} {{\boldsymbol{K}}^{T}}{\boldsymbol{y}} = {\sum\nolimits}_{\boldsymbol{t}} {{{\boldsymbol{K}}_{- \boldsymbol{\theta}_{\boldsymbol{t}}}}\left({P{T^{T}}\left({{{{\boldsymbol{\tilde w}}}_{\boldsymbol{t}}} \otimes IP{T^{T}}({\boldsymbol{x}})} \right)} \right)} \end{array} $$
((21))

where \({{\boldsymbol {\tilde w}}_{\boldsymbol {t}}}\) is the adjoint operator of w t , and PTT and IPTT are the adjoint operators of PT and IPT, respectively. To enhance the computational efficiency, two extra LUTs are pre-computed to record the correspondence of polar and inverse polar transform, respectively. Moreover, we use the strict adjoint operators of PT and IPT, i.e., PTT and IPTT, to avoid the inconsistence caused by discretization and interpolation.

3.5 Hybrid GAC for modeling

For GAC, the key to save computational cost is to reduce C, the number of basis filters. Given a general camera motion trajectory as shown in Fig. 4, neither pure slice-based nor pure fiber-based GAC can guarantee sufficiently small value of C. However, in a hybrid (slice and fiber mixed) decomposition, only two slices and two fibers are required to model the camera motion trajectory, so that the computational complexity is significantly reduced. Thus, we propose a greedy method to decompose a camera motion trajectory into a hybrid set of slices and fibers to reduce the C value.

Fig. 4
figure 4

Decomposing a camera motion trajectory into a slice set and a fiber set

Given the pose subset \({\mathcal {P}}\) and the 3D weight matrix W with W(θ z ,t x ,t y ), the weight of the pose θ=(θ z ,t x ,t y ), in each iteration, the proposed method first finds a candidate slice \({{\mathcal {S}}_{{{\hat \theta }_{z}}}}\), and a candidate fiber \(\phantom {\dot {i}\!}{\mathcal {F}}_{{{\hat t}_{x}},{{\hat t}_{y}}}\), compare their relative contributions, and then choose the slice or fiber with higher weights. By this way, we can obtain a slice set \(\phantom {\dot {i}\!}\left \{ {\left ({{\theta _{j}},{{\mathcal {S}}_{j}}, {{\boldsymbol {k}}_{j}}} \right):j = 1,...,{n_{s}}} \right \}\) and a fiber set \(\phantom {\dot {i}\!}\left \{ {\left ({{\boldsymbol {t}_{i}},{{\mathcal {F}}_{i}},{{\boldsymbol {w}}_{j}}} \right):i = 1,...,{n_{f}}} \right \}\). As shown in Fig. 4, the proposed greedy algorithm can successfully decompose the camera motion trajectory into two slices and two fibers. The detailed algorithm is summarized in Algorithm 1.

Based on the slice set and the fiber set, the non-uniform caused by camera shake can be reformulated as

$$\begin{array}{*{20}l} {\boldsymbol{Kx}} =&{\sum\nolimits}_{i = 1}^{{n_{f}}} {\text{IPT}\left({{{\boldsymbol{w}}_{i}} \otimes \text{PT}({{\boldsymbol{K}}_{\boldsymbol{\theta}_{{\boldsymbol{t}}_{i}}}}{\boldsymbol{x}})} \right)} \\ &+{\sum\nolimits}_{j = 1}^{{n_{s}}} {{{\boldsymbol{R}}_{{\theta_{j}}}}\left({{{\boldsymbol{k}}_{j}} \otimes {\boldsymbol{x}}} \right)} \end{array} $$
((22))

The adjoint operator K T y is then defined as

$$\begin{array}{*{20}l} {\boldsymbol{K}^{T}\boldsymbol{y}}=&{\sum\nolimits}_{i = 1}^{{n_{f}}} \boldsymbol{K}_{\boldsymbol{-\theta}_{\boldsymbol{t}}} {\text{PT}^{T}\left({{{\boldsymbol{\tilde w}}_{i}} \otimes \text{IPT}^{T}({\boldsymbol{y}})} \right)} \\ &+{\sum\nolimits}_{j = 1}^{{n_{s}}} { {{\boldsymbol{\tilde k}}_{j}} \otimes {{\boldsymbol{R}}^{T}_{{\theta_{j}}}} \left({\boldsymbol{y}} \right)} \end{array} $$
((23))

3.6 Discussions

With FFT, the computational complexity of the model in Eq. (22) is O((n s +n f )n 2 log2n). If n s and n f are small, GAC would be more efficient than the other methods. Let n z be the number of intervals for the roll angle. It is reasonable to assume that (n s +n f )<n z ; otherwise, we can use the pure slice-based GAC model.

To further improve the computational efficiency, the LUT method can be adopted for fast image rotation, polar and inverse polar transform, and n z +2 LUTs should be pre-computed and stored in memory. In [30, 33], a sparse n 2×n 2 matrix was constructed for each pose in the 3D camera pose subspace. Compared with the model in [30, 33], GAC can achieve much better tradeoff between memory and computational complexity.

In [32], camera shake blurring is approximated as the sum of R 2 uniformly blurry patches, and the computational complexity of the model in [32] is O(R 2(n/R+w)2 log2(n/R+w)), where w×w is the size of the blur kernel. Compared with [32], when R and w are higher, GAC would be computationally more efficient, and our experimental results also validates the efficiency of GAC against the model by Hirsch et al. [32]. Moreover, the model in [32] is a locally uniform approximation of the camera shake blurring model, while GAC can be strictly equivalent with the geometric model in [30].

4 GAC-based non-uniform deblurring

In this section, by incorporating the proposed GAC forward blur model into existing deblurring model, the GAC-based non-uniform deblurring can be efficiently solved. Then, by imposing l 2,1-norm regularizer on pose weight matrix W, the slice GAC deblurring can approach toward uniform blind deconvolution.

4.1 GAC-based deblurring via GAPG

In typical non-uniform deblurring methods, TV regularizer on latent image and l 2-norm regularizer on pose weight matrix are imposed, yielding the deblurring formulation

$$\begin{array}{@{}rcl@{}} \mathop {\min }\limits_{{\boldsymbol{W}},{\boldsymbol{x}}} {\left\| {\sum\limits_{{\boldsymbol{\theta }} \in {\mathcal{P}}} {{w_{\boldsymbol{\theta }}} {{\boldsymbol{K}}_{\boldsymbol{\theta }}}{\boldsymbol{x}} - {\boldsymbol{b}}}} \right\|^{2}} + \lambda\mathop{\text{TV}}({\boldsymbol{x}}) + \tau\|{\boldsymbol{W}}\|^{2} \end{array} $$
((24))

where λ and τ are trade-off parameters. By substituting the forward blur model with any GAC model, we come to the GAC-based non-uniform deblurring method, which can be solved by alternatively updating pose weight matrix W and latent clear image x. By fixing x, we use the method in [33] to update W, while by fixing W, we develop an efficient solution to x based on GAPG algorithm.

In [34], GAPG was developed to solve uniform non-blind deblurring, where the Lipschitz constant is generalized to diagonal matrix that can guarantee faster convergence rate. By introducing two auxiliary variables d h and d v , we reformulate the TV-based model into the following equivalent problem

$$\begin{array}{*{20}l} \mathop{\min}\limits_{{\boldsymbol{x}},{{\boldsymbol{d}}_{h}},{{\boldsymbol{d}}_{v}}}&\frac{1}{2}\left(\right.\mu\left\|{{\boldsymbol{Kx}}-{\boldsymbol{b}}}\right\|^{2}+\left\|{{{\boldsymbol{d}}_{v}}-{{\boldsymbol{D}}_{v}}{\boldsymbol{x}}}\right\|^{2} \\ &+\left\|{{{\boldsymbol{d}}_{h}}-{{\boldsymbol{D}}_{h}}{\boldsymbol{x}}}\right\|^{2}\left.\right)+\lambda\mu{\left\|{\left({{{\boldsymbol{d}}_{v}}\:{{\boldsymbol{d}}_{h}}}\right)}\right\|_{\text{TV}}} \end{array} $$
((25))

where μ is the relaxation parameter and D h and D v are the horizontal and vertical gradient operators, respectively. According to [34], in each iteration, several sub-problems should be solved for the updating of x, D h , and D v , respectively. We use the same method as [34] for updating D h and D v and choose the proper Lipschitz matrix for updating x. For the updating of x, the subproblem is formulated as

$$\begin{array}{*{20}l} \mathop{\min}\limits_{\boldsymbol{x}}&\frac{\lambda_{\max}}{2}\left\|\vphantom{\frac{0}{0}}\right.\boldsymbol{x}-\left\{\vphantom{\frac{0}{0}}\right.{\boldsymbol{y}}_{\boldsymbol{x}}-{\lambda_{\max}^{-1}}\left[\vphantom{\frac{0}{0}}\right.{\boldsymbol{K}}^{T}\left({{\boldsymbol{Ky}}_{\boldsymbol{x}}-{\boldsymbol{b}}}\right)\\ &+{\boldsymbol{D}}_{v}^{T}\left({{\boldsymbol{D}}_{v}{\boldsymbol{y}}_{\boldsymbol{x}}-{\boldsymbol{y}}_{{{\boldsymbol{d}}}_{v}}}\right)\\ &+{\boldsymbol{D}}_{h}^{T}\left({{\boldsymbol{D}}_{h}{\boldsymbol{y}}_{\boldsymbol{x}}-{\boldsymbol{y}}_{{{\boldsymbol{d}}}_{h}}}\right)\left.\vphantom{\frac{0}{0}}\right]\left.\vphantom{\frac{0}{0}}\right\}\left.\vphantom{\frac{0}{0}}\right\| \end{array} $$
((26))

where \({\boldsymbol {y}}_{\boldsymbol {x}},{\boldsymbol {y}}_{{{\boldsymbol {d}}_{v}}},\phantom {\dot {i}\!}\) and \({\boldsymbol {y}}_{{{\boldsymbol {d}}_{h}}}\phantom {\dot {i}\!}\) have the same definition as in [34]. Here, K T(Ky x b) is computed based on Eq. 22 and Eq. 23. In [34], λ max is set based on the inequality \({\lambda _{\max }} \le {\left ({\sqrt \mu {{\left \| {\boldsymbol {K}} \right \|}_{2}} + \sqrt \eta {{\left \| {{{\boldsymbol {D}}_{h}}} \right \|}_{2}} + \sqrt \eta {{\left \| {{{\boldsymbol {D}}_{v}}} \right \|}_{2}}} \right)^{2}}\phantom {\dot {i}\!}\) with η=1, D h 2≤2, and D v 2≤2 [34]. The diagonal Lipschitz matrix is

$$ {\boldsymbol{L}_{f}}=\text{diag}(\lambda_{\text{max}{\boldsymbol I}},\eta {\boldsymbol I}, \eta{\boldsymbol{I}}) $$
((27))

By using smaller λ max and η, GAPG algorithm will converge faster. For non-uniform blur, we choose K2=1 based on the following proposition.

Proposition 2.

$$ {\left\| {\boldsymbol{K}} \right\|_{2}} \le 1 $$

Proof.

In matrix analysis, we have

$$\begin{array}{@{}rcl@{}} {\left\| {\boldsymbol{K}} \right\|_{2}} \le \sqrt {{{\left\| {\boldsymbol{K}} \right\|}_{1}}{{\left\| {\boldsymbol{K}} \right\|}_{\infty} }} \end{array} $$

where ·1 and · denote the largest l 1-norm of the columns and rows, respectively. For the uniform blur procedure b=Kx, where each row of K is a shifted version of blur kernel K, Zuo and Lin had proved that the l 1-norm of the row of matrix K corresponding to the (i,j)-th entry of b is ||k||1 [34]. For the non-uniform blur matrix K, (i,j)-th entry of b has its own blur kernel k (i,j), and consequently, ||K|| =max(i,j)||k (i,j)||1. By the non-negative constraint k i ≥0,i and the normalization constraint \({\sum \nolimits }_{i}{{{k}_{i}}=1}\), l 1-norm of any blur kernel is 1, i.e., ||k (i,j)||1=1. Thus, we have ||K|| =1. Similarly, ||K||1=1 can be derivated from [34]. Thus, the inequality is proved.

The updating of subproblems should be performed several iterations. With the properly chosen Lipschitz matrix, each variable has it own Lipschitz constant, rather than the largest one adopted in APG algorithm, so that the GAPG-based deblurring method has a faster convergence rate.

4.2 Connection with uniform blind deconvolution

We impose group sparsity, i.e., l 2,1-norm, on weight matrix W along roll angle dimension to connect GAC-based non-uniform deblurring and uniform blind deconvolution

$$\begin{array}{@{}rcl@{}} \mathop {\min }\limits_{{\boldsymbol{W}},{\boldsymbol{x}}} {\left\| {\sum\limits_{{\boldsymbol{\theta }} \in {\mathcal{P}}} {{w_{\boldsymbol{\theta }}} {{\boldsymbol{K}}_{\boldsymbol{\theta }}}{\boldsymbol{x}} - {\boldsymbol{b}}}} \right\|^{2}}\!\!\! +\lambda \mathop{\text{TV}}({\boldsymbol{x}}) + \sum\limits_{\theta_{j}}\tau_{\theta_{j}}\|{\boldsymbol{W}_{\theta_{j}}}\|^{2} \end{array} $$
((28))

where \(\tau _{\theta _{j}}\) is the trade-off parameter controlling the weight of all slices with angle θ j , and by setting \(\tau _{\theta _{j}=0}<\tau _{\theta _{i}},\forall i> j\), the slices with larger rotation angle will be gradually inactivated, and naturally, the non-uniform deblurring will approach toward uniform blind deconvolution.

To solve the this problem, we also adopt the alternative minimization strategy, where the updating of latent clear image shares the same solution via GAPG as Section 4.1. Due to the non-smoothness of l 2,1-norm regularizer, we propose an effective solution to solve it. By introducing auxiliary variable W , the problem can be reformulated as

$$\begin{array}{@{}rcl@{}} \mathop {\min }\limits_{{\boldsymbol{W}},{\boldsymbol{W}'}} {\left\| {\sum\limits_{{\boldsymbol{\theta }} \in {\mathcal{P}}} {{w_{\boldsymbol{\theta }}} {{\boldsymbol{K}}_{\boldsymbol{\theta }}}{\boldsymbol{x}} \,-\, {\boldsymbol{b}}}} \right\|^{2}}\!\!\! +\!\! \sum\limits_{\theta_{j}}\tau_{\theta_{j}}\|{\boldsymbol{W'}_{\theta_{j}}}\|^{2}\!\,+\,\frac{\delta}{2}\|{\boldsymbol W\!\,-\,\boldsymbol W'}\|^{2} \end{array} $$
((29))

where δ is the positive penalty parameter. For the W-subproblem, it is also a linear least square problem, which can be efficiently solved [33], and for the W -subproblem, it is a standard l 2,1-norm optimization problem, which can be solved by group shrinkage operator [42].

The slice GAC with group sparsity provides a natural connection of non-uniform deblurring and uniform deconvolution. When there are only camera in-plane translations, i.e., no roll rotation, the camera shake blurring will be uniform, and the slice GAC can play like the traditional uniform blind deconvolution method. Otherwise, roll rotations will cause the non-uniform blur, where slice GAC can play like non-uniform deblurring method.

Figure 5 shows a deblurring example of the standard slice GAC (S-GAC) and slice GAC with group sparsity (GS-GAC) on a real camera-shaken image. From the distribution of pose weights shown as Fig. 5 d, the group sparsity constrains the slices gathering at small rotation angles, where especially slices with angle 0 dominate, while in slice GAC, the activated poses are more randomly distributed. Thus, for the blurry image with slight roll rotation shown as Fig. 5 a, GS-GAC performs much better than S-GAC, significantly suffering less artifacts.

Fig. 5
figure 5

Comparison of standard S-GAC and GS-GAC. a Original camera shaken image. b Deblurring result by standard S-GAC method. c Deblurring result by GS-GAC. d Distribution of slice angles where the weights with the same rotation angle are summated

5 Experimental results

In this section, we evaluate the performance of proposed GAC deblurring methods. First, three GAC variants, i.e., slice-based GAC (S-GAC), fiber-based GAC (F-GAC), and hybrid GAC (H-GAC), are compared in terms of computational time, memory usage, and visual quality. Due to its superiority over S-GAC and F-GAC, H-GAC is then compared with the state-of-the-art geometrical methods, i.e., Whyte et al. [29], Gupta et al. [30], and Hu and Zhang [33], and the state-of-the-art EFF-based methods, i.e., Hirsch et al. [32], Xu et al. [16], and Cao et al. [27]. Finally, in the comparison with the state-of-the-art uniform deblurring methods [9, 21, 24, 43], slice GAC with GS-GAC is adopted. All the experiments are conducted on a PC with Intel Core i5-2400 3.10 GHz CPU and 16 G RAM memory, and the proposed method is implemented in Matlab.

5.1 Comparison of three GAC variants

On three real-world camera-shaken images shown as Fig. 6, three GAC variants are evaluated in terms of running time and memory usage. As shown in Table 1, the three GAC methods have comparable peak memory cost to each other, while in terms of running time, H-GAC is more efficient than both S-GAC and F-GAC, since the set of slices and fibers decomposed by H-GAC is usually more compact, thus having lower computational complexity.

Fig. 6
figure 6

The results of different GAC models on three real camera-shaken images. For the close-ups, from top to bottom rows are the input real camera-shaken images, deblurring results by F-GAC, deblurring results by S-GAC, deblurring results by H-GAC. a Books. b Butcher shop. c Statue

Table 1 Running time (s) and peak memory usage (GB) of different GAC models

From the restoration quality shown as Fig. 6, H-GAC and S-GAC perform better than F-GAC in recovering more details and achieving more visual plausible deblurring quality. Due to its superiority over both S-GAC and F-GAC, H-GAC is adopted in the following comparison with non-uniform deblurring methods.

5.2 Comparison with geometrical methods

We use five real-world camera-shaken images, shown in the top row of Fig. 7, to compare GAC with three geometrical methods, i.e., Whyte et al. [29], Gupta et al. [30], and Hu et al. [33], where their deblurring results are obtained by running source codes or executable programs provided by authors. Although GAC costs more memory than that of Whyte et al.’s [29] shown as Table 2, it is at least 100 × faster, and in terms of deblurring quality shown as Fig. 7, GAC performs much better than that of Whyte et al.’s [29] in achieving clearer and plausible texture details. At the same time, Gupta et al.’s [30], Hu and Yang’s [33], and the proposed GAC model actually adopt the same 3D subspace to approximate the full 6D camera pose space and consequently can obtain the similar deblurring results. Thus, it is more critical to evaluate their performance in terms of computational efficiency and memory usage. Tables 2 and 3 show that GAC not only is at least 2.5 × faster but also significantly relaxes memory burden than Gupta et al.’s [30] and Hu and Yang’s [33].

Fig. 7
figure 7

Visual comparison of GAC model with three geometrical methods on five images. From top to bottom rows: blurry images, close-ups of blurry images, deblurring results of Whyte et al. [29], Gupta et al. [30], Hu and Yang [33], and GAC, respectively. a Cabin. b Car-in-yard. c Dim-Petrol-Station. d Petrol-Station. e Six-Books

Table 2 Running time (s) of the four geometrical methods
Table 3 Peak memory usage (GB) of the four geometrical methods

5.3 Comparison with non-geometrical methods

Using the images in Fig. 8, we further compare GAC with four non-geometrical methods proposed by Harmeling et al. [44], Hirsch et al. [32], Xu et al. [16], and Cao et al. [27].

Fig. 8
figure 8

Set of real camera-shaken images for comparing with non-geometrical methods

Since neither source code nor executable program of Harmeling et al. [44] and Hirsch et al. [32] is available, we collected the deblurring results from their papers or websites. The non-geometrical methods greatly rely on the reasonability of region division and often sacrifice image details to smooth out possible artifacts at region boundaries. As shown in Fig. 9 a, b, GAC can achieve more visually plausible deblurring results, while the results by Harmeling et al. [44] and Hirsch et al. [32] are visually over-smoothed. Although the method by Cao et al. [27] is designed for better recovering clear text, the camera shake blurring in text field is not fully removed and significant artifacts are included in the results, like the first two images in Fig. 10. Moreover, GAC also performs better than the method by Cao et al. [27] in text-less fields. Xu et al. [16] and Cao et al. [27] respectively provided the executable program and source codes, so in Table 4, we report the CPU running time and memory usage comparison with Xu et al. [16] and Cao et al. [27] on several camera-shaken images shown in Fig. 8, from which one can see that GAC also performs better than Xu et al. or Cao et al. [27]. As for deblurring quality shown as Fig. 9 c, GAC can obtain comparable if not superior deblurring result than Xu et al. [16].

Fig. 9
figure 9

Deblurring effect comparison with non-geometrical methods. The top row from left to right are by Harmeling et al. [44], Hirsch et al. [32], and Xu et al. [16], respectively. The bottom row are deblurring results by GAC. a Butcher Shop. b Coke. c Statue

Fig. 10
figure 10

Deblurring effect comparison with Cao et al. [27]. a Original. b Cao et al. [27]. c GAC

Table 4 Running time (s) and peak memory usage (GB) comparison of Cao et al. [27] and hybrid GAC

5.4 Comparison with uniform deblurring methods

In the comparison experiments with the state-of-the-art uniform deblurring methods, i.e., Shan et al. [9], Xu et al. [21], Zuo et al. [24], and Pan et al. [43], we adopt GS-GAC, which performs better in handling spatially invariant blurry images. In Fig. 11, the first image is of non-uniform blur with slight roll rotations, from which one can see that GAC can achieve more visually plausible deblurring results, such as the statue’s face is recovered more clearly, while the uniform deblurring methods usually suffer from ringing artifacts. Then, on two spatially invariant blurred images, GAC can also achieve satisfactory deblurring results, since the imposed group sparsity enforces the GAC method to play like uniform deblurring method.

Fig. 11
figure 11

Deblurring effect comparison with uniform deblurring methods on uniformly blurred images. First column is the real blurry image. From the second column, left to right are deblurring results by Shan et al. [9], Xu et al. [21], Pan et al. [43], and Zuo et al. [24], respectively. The last column are deblurring results by GAC. a Original. b Shan et al. [9]. c Xu et al. [21]. d Pan et al. [44]. e Zuo et al. [24]. f GAC

6 Conclusions

In this paper, by designing GAC model to geometrically represent camera shake blurring, non-uniform deblurring is efficiently addressed. Since the slices and fibers decomposed from the camera motion trajectory can be formulated as convolution, all the GAC methods can exploit FFT for efficient optimization. Compared with the methods in [30, 33], the proposed method only needs several FFTs and to store several basis convolution kernels and look-up-tables in memory. By incorporating group sparsity into pose weight matrix, the GAC-based deblurring methods can also work like uniform blind deconvolution, better handling uniform blurry images. Compared with non-uniform deblurring methods, GAC method has a much lower peak memory usage and is much more efficient. Compared with uniform deblurring methods, GAC method also can achieve satisfactory deblurring results.

References

  1. S McCloskey, Y Ding, J Yu, Design and estimation of coded exposure point spread functions. IEEE Trans. Pattern Anal. Mach. Intell.34(10), 2071–2077 (2012).

    Article  Google Scholar 

  2. S McCloskey, in ECCV. Velocity-dependent shutter sequences for motion deblurring (SpringerCrete, Greece, 2010).

    Google Scholar 

  3. A Levin, R Fergus, F Durand, WT Freeman, Image and depth from a conventional camera with a coded aperture. ACM Trans. Graph.26(3), 70 (2007).

    Article  Google Scholar 

  4. N Joshi, SB Kang, CL Zitnick, R Szeliski, Image deblurring using inertial measurement sensors. ACM Trans. Graph.29(4), 30 (2010).

    Article  Google Scholar 

  5. Y-W Tai, H Du, MS Brown, S Lin, Correction of spatially varying image and video motion blur using a hybrid camera. IEEE Trans. Pattern Anal. Mach. Intell.32(6), 1012–1028 (2010).

    Article  Google Scholar 

  6. Y-W Tai, N Kong, S Lin, SY Shin, in CVPR. Coded exposure imaging for projective motion deblurring (IEEESan Francisco, USA, 2010), pp. 2408–2415.

    Google Scholar 

  7. R Fergus, B Singh, A Hertzmann, ST Roweis, WT Freeman, Removing camera shake from a single photograph. ACM Trans. Graph.25(3), 787–794 (2006).

    Article  Google Scholar 

  8. A Levin, Y Weiss, F Durand, WT Freeman, in CVPR. Understanding and evaluating blind deconvolution algorithms (IEEEMiami, USA, 2009), pp. 1964–1971.

    Google Scholar 

  9. Q Shan, J Jia, A Agarwala, High-quality motion deblurring from a single image. ACM Trans. Graph.27(3), 73 (2008).

    Article  Google Scholar 

  10. D Krishnan, T Tay, R Fergus, in CVPR. Blind deconvolution using a normalized sparsity measure (IEEEColorado Springs, USA, 2011), pp. 233–240.

    Google Scholar 

  11. TS Cho, S Paris, BKP Horn, WT Freeman, in CVPR. Blur kernel estimation using the radon transform (IEEEColorado Springs, USA, 2011), pp. 241–248.

    Google Scholar 

  12. Z Hu, M-H Yang, in ECCV, 7576. Good regions to deblur (SpringerFirenze, Italy, 2012), pp. 59–72.

    Google Scholar 

  13. J Miskin, DJ MacKay, in Adv. Ind. Compon. Anal. Ensemble learning for blind image separation and deconvolution (SpringerEdinburgh, UK, 2000), pp. 123–141.

    Chapter  Google Scholar 

  14. A Levin, Y Weiss, F Durand, WT Freeman, in CVPR. Efficient marginal likelihood optimization in blind deconvolution (IEEEColorado Springs, USA, 2011), pp. 2657–2664.

    Google Scholar 

  15. SD Babacan, R Molina, MN Do, AK Katsaggelos, in ECCV, 7577. Bayesian blind deconvolution with general sparse image priors (SpringerFirenze, Italy, 2012), pp. 341–355.

    Google Scholar 

  16. L Xu, S Zheng, J Jia, in CVPR. Unnatural l0 sparse representation for natural image deblurring (IEEEColumbus, USA, 2013), pp. 1107–1114.

    Google Scholar 

  17. J Kotera, F Šroubek, P Milanfar, in Compt. Anal. Images Patterns, 8048. Blind deconvolution using alternating maximum a posteriori estimation with heavy-tailed priors (SpringerYork, UK, 2013), pp. 59–66.

    Chapter  Google Scholar 

  18. S Cho, S Lee, Fast motion deblurring. ACM Trans. Graph.28(5), 145 (2009).

    Article  Google Scholar 

  19. N Joshi, R Szeliski, D Kriegman, in CVPR. PSF estimation using sharp edge prediction (IEEEAnchorage, USA, 2008), pp. 1–8.

    Google Scholar 

  20. L Sun, S Cho, J Wang, J Hays, in ICCP. Edge-based blur kernel estimation using patch priors (IEEECluj-Napoca, Romania, 2013), pp. 1–8.

    Google Scholar 

  21. L Xu, J Jia, in ECCV. Two-phase kernel estimation for robust motion deblurring (SpringerCrete, Greece, 2010), pp. 157–170.

    Google Scholar 

  22. K Schelten, S Nowozin, J Jancsary, C Rother, S Roth, in WACV. Interleaved regression tree field cascades for blind image deconvolution (IEEEHawaii, USA, 2015).

    Google Scholar 

  23. CJ Schuler, M Hirsch, S Harmeling, B Schölkopf, in NIPS. Learning to deblur (NIPSMontréal, Canada, 2014).

    Google Scholar 

  24. W Zuo, D Ren, S Gu, L Lin, L Zhang, in CVPR. Discriminative learning of iteration-wise priors for blind deconvolution (IEEEBoston, USA, 2015), pp. 3232–3240.

    Google Scholar 

  25. Z Hu, M-H Yang, Learning good regions to deblur images. Int. J. Comp. Vis.115(3), 345–362 (2015).

    Article  MathSciNet  Google Scholar 

  26. H Ji, K Wang, in CVPR. A two-stage approach to blind spatially-varying motion deblurring (IEEERhode Island, USA, 2012), pp. 73–80.

    Google Scholar 

  27. X Cao, W Ren, W Zuo, X Guo, H Foroosh, Scene text deblurring using text-specific multiscale dictionaries. IEEE Trans. Image Process.24(4), 1302–1314 (2015).

    Article  MathSciNet  Google Scholar 

  28. Y-W Tai, P Tan, MS Brown, Richardson-Lucy deblurring for scenes under a projective motion path. IEEE Trans. Pattern Anal. Mach. Intell.33(8), 1603–1618 (2011).

    Article  Google Scholar 

  29. O Whyte, J Sivic, A Zisserman, J Ponce, Non-uniform deblurring for shaken images. Int. J. Comp. Vis.98(2), 168–186 (2012).

    Article  MathSciNet  MATH  Google Scholar 

  30. A Gupta, N Joshi, CL Zitnick, M Cohen, B Curless, in ECCV, 6311. Single image deblurring using motion density functions (SpringerCrete, Greece, 2010), pp. 171–184.

    Google Scholar 

  31. M Hirsch, S Sra, B Schölkopf, S Harmeling, in CVPR. Efficient filter flow for space-variant multiframe blind deconvolution (IEEESan Francisco, USA, 2010), pp. 607–614.

    Google Scholar 

  32. M Hirsch, CJ Schuler, S Harmeling, B Schölkopf, in ICCV. Fast removal of non-uniform camera shake (IEEEBarcelona, Spain, 2011), pp. 463–470.

    Google Scholar 

  33. Z Hu, M-H Yang, in BMVC. Fast non-uniform deblurring using constrained camera pose subspace (BMVAGuildford, UK, 2012).

    Google Scholar 

  34. W Zuo, Z Lin, A generalized accelerated proximal gradient approach for total-variation-based image restoration. IEEE Trans. Image Process.20(10), 2748–2759 (2011).

    Article  MathSciNet  Google Scholar 

  35. R Köhler, M Hirsch, B Mohler, B Schölkopf, S Harmeling, in ECCV. Recording and playback of camera shake: benchmarking blind deconvolution with a real-world database (SpringerFirenze, Italy, 2012), pp. 27–40.

    Google Scholar 

  36. H Deng, W Zuo, H Zhang, D Zhang, An additive convolution model for fast restoration of nonuniform blurred images. Int. J. Comp. Math.91(11), 2446–2466 (2014).

    Article  MathSciNet  MATH  Google Scholar 

  37. I Daubechies, M Defrise, C De Mol, An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Comm. Pure. Appl. Math.57(11), 1413–1457 (2004).

    Article  MathSciNet  MATH  Google Scholar 

  38. JM Bioucas-Dias, MA Figueiredo, A new twist: two-step iterative shrinkage/thresholding algorithms for image restoration. IEEE Trans. Image Process.16(12), 2992–3004 (2007).

    Article  MathSciNet  Google Scholar 

  39. A Beck, M Teboulle, Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Trans. Image Process.18(11), 2419–2434 (2009).

    Article  MathSciNet  Google Scholar 

  40. S Stančin, S Tomažič, Angle estimation of simultaneous orthogonal rotations from 3d gyroscope measurements. Sensors. 11(9), 8536–8549 (2011).

    Article  Google Scholar 

  41. R Matungka, YF Zheng, RL Ewing, Image registration using adaptive polar transform. IEEE Trans. Image Process.18(10), 2340–2354 (2009).

    Article  MathSciNet  Google Scholar 

  42. J Yang, W Yin, Y Zhang, Y Wang, A fast algorithm for edge-preserving variational multichannel image restoration. SIAM J. Imag. Sci.2(2), 569–592 (2009).

    Article  MathSciNet  MATH  Google Scholar 

  43. J Pan, Z Hu, Z Su, M-H Yang, in CVPR. Deblurring text images via l0-regularized intensity and gradient prior (IEEEColumbus, USA, 2014), pp. 2901–2908.

    Google Scholar 

  44. S Harmeling, M Hirsch, B Schölkopf, in NIPS. Space-variant single-image blind deconvolution for removing camera shake (NIPSVancouver, Canada, 2010), pp. 829–837.

    Google Scholar 

Download references

Acknowledgements

This work is supported in part by NSFC grants (61271093, 61471146, 61173086) and the program of ministry of education for new century excellent talents (NCET-12-0150).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Wangmeng Zuo.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Deng, H., Ren, D., Zhang, D. et al. Efficient non-uniform deblurring based on generalized additive convolution model. EURASIP J. Adv. Signal Process. 2016, 22 (2016). https://doi.org/10.1186/s13634-016-0318-2

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s13634-016-0318-2

Keywords