Clustering KSVD for sparse representation of images
EURASIP Journal on Advances in Signal Processing volume 2019, Article number: 47 (2019)
Abstract
K-singular value decomposition (KSVD) is a frequently used dictionary learning (DL) algorithm that alternates between sparse coding and dictionary updating. The sparse coding process generates sparse coefficients for each training sample, and the sparse coefficients induce clustering features. In applications such as image processing, the features of different clusters vary dramatically. However, all the atoms of the dictionary jointly represent the features, regardless of clusters. This may reduce the accuracy of sparse representation. To address this problem, we develop the clustering KSVD (CKSVD) algorithm for DL and the corresponding greedy algorithm for sparse representation. The atoms are divided into a set of groups, and each group of atoms is employed to represent the image features of a specific cluster. Hence, the features of all clusters can be utilized and the number of redundant atoms is reduced. Additionally, two practical extensions of the CKSVD are provided. Experimental results demonstrate that the proposed methods provide more accurate sparse representations of images than the conventional KSVD and its existing extensions. The proposed clustering DL model also has the potential to be applied to online DL.
1 Introduction
Sparse representation aims to model signals as sparse linear combinations of the atoms in a dictionary, and this technique is widely used in various fields of image processing [1–4]. Let \(\boldsymbol {z}\in \mathbb {R}^{n}\) and \(\boldsymbol {D}\in \mathbb {R}^{n\times q}, q\geq n\) denote a signal and an overcomplete dictionary, respectively. The sparse representation of z with respect to the dictionary D is expressed as z≈Ds. The sparse coefficient vector \(\boldsymbol {s}\in \mathbb {R}^{q}\) satisfies ∥s∥_{0}≤k and ∥z−Ds∥_{2}≤ε, where ∥·∥_{0} denotes the number of nonzero entries of a vector, and k and ε represent the maximum number of sparse coefficients and the sparse representation error, respectively. In general, the dictionaries used for sparse representation can be divided into two categories: analytical dictionaries and learned dictionaries. Analytical dictionaries such as wavelet dictionaries can be universally applied and are easy to obtain. However, their moderate sparse representation accuracy limits their applications. For better performance, the overcomplete dictionary D is commonly obtained from the DL process using a set of training samples \(\boldsymbol {Z}\in \mathbb {R}^{n\times \xi }\), expressed as:
\[ \min_{\boldsymbol{D},\boldsymbol{S}} \|\boldsymbol{Z}-\boldsymbol{D}\boldsymbol{S}\|_{F}^{2} \quad \text{s.t.} \quad \|\boldsymbol{s}_{t}\|_{0}\leq k,\ \|\boldsymbol{d}_{i}\|_{2}=1, \tag{1} \]
where t∈{1,2,⋯,ξ}, i∈{1,2,⋯,q}, and ∥·∥_{F} denotes the Frobenius norm. The notation d_{i} denotes the ith column of the dictionary D, which is also referred to as the ith atom. S is the sparse coefficient matrix with respect to Z and D, and it is obtained together with D in the DL process; s_{t} is the tth column of S.
To date, researchers have proposed various DL algorithms. In [5], Engan et al. propose the well-known DL method named “the method of optimal directions (MOD).” The MOD consists of two iterative processes, sparse coefficient computation and dictionary updating. The dictionary is updated globally by a least squares (LS) computation in terms of the training samples and sparse coefficients. In [6], Aharon et al. propose another LS-based algorithm for DL, referred to as the KSVD. In contrast to the global update strategy, the KSVD updates the atoms of the dictionary one at a time. The MOD, the KSVD, and their extensions are used for batch DL, i.e., all training samples are input simultaneously. However, when the training samples cannot be obtained all at once, online learning is required. In [7], Mairal et al. propose the online DL (ODL) algorithm, which updates the atoms using only the newly input samples. This algorithm allows training samples to be input successively and thus realizes online learning. Additionally, a set of DL methods extended from the MOD, the KSVD, and the ODL have been proposed in order to improve the sparse representation accuracy or reduce the computational complexity [8–13].
Among these algorithms, the KSVD is frequently used in image processing due to its generality and low complexity. The KSVD algorithm consists of two processes, sparse coding and dictionary updating, which are executed alternately. In the sparse coding process, at most k sparse coefficients for each training sample are computed via greedy algorithms, inducing clustering features [14–16]. In the KSVD algorithm, all the atoms of the dictionary jointly represent the training images, regardless of clusters. While representing different training samples, an atom may be employed by different clusters of features. In image processing applications, the features of different clusters vary dramatically, and therefore this phenomenon may reduce the accuracy of sparse representation. In [9], Nazzal et al. utilize the residual of training samples to train a set of subdictionaries. However, the subdictionaries are not distinguished by different clusters. In [11], Smith and Elad improve the KSVD by considering only the used atoms in the dictionary updating process. In [31], Tariyal et al. propose deep DL by combining the concepts of DL and deep learning. The multiple DL framework is developed for multiple levels of dictionaries. In [30], Yi et al. build a hierarchical sparse representation framework that consists of the local histogram-based model, the weighted alignment pooling model, and the sparsity-based discriminative model. In [28], Rubinstein et al. propose the approximate KSVD method to reduce the computational complexity, which can be regarded as another implementation of the KSVD. In [29], Mairal et al. develop a multiscale DL framework based on an efficient quadtree decomposition of the learned dictionary.
In this study, we aim to utilize the clusters of features of the training samples. We divide the atoms of the learned dictionary into a set of groups, and each group serves a specific cluster of features. This strategy improves the DL process in two respects. First, besides the image features of the original training samples, we also consider the features of the residuals of different clusters, which reduces the number of redundant atoms. Second, we ensure that an arbitrary atom of the dictionary is used only for the features of one specific cluster. Hence, the atom is not influenced by the features of other clusters. Based on this strategy, we propose the CKSVD algorithm, as well as the corresponding greedy recovery algorithm for computing sparse representations. Compared to the conventional KSVD, the CKSVD improves the sparse reconstruction accuracy without increasing the requirements or computational complexity of the DL process. Based on the clustering DL model, we also provide two practical extensions of the CKSVD, which achieve adaptive sparsity and dynamic refinement of atoms, respectively.
The remainder of this paper is organized as follows. Section 2 describes the aim of this study and introduces the proposed method. Section 3 provides the extended methods of the CKSVD. Section 4 presents the experimental results. Section 5 discusses the proposed clustering model and its potential to be applied to online learning. Section 6 draws a conclusion.
2 Proposed method
In this section, we first review the conventional KSVD algorithm and describe the problem that needs to be addressed. Next, we introduce the proposed CKSVD algorithm.
2.1 Problem formulation
Given the training samples \(\boldsymbol {Z}\in \mathbb {R}^{n\times \xi }\) and the initialized dictionary \(\boldsymbol {D}\in \mathbb {R}^{n\times q}, q\geq n\), the sparse coding process is given by:
\[ \min_{\boldsymbol{S}} \|\boldsymbol{Z}-\boldsymbol{D}\boldsymbol{S}\|_{F}^{2} \quad \text{s.t.} \quad \|\boldsymbol{s}_{t}\|_{0}\leq k, \tag{2} \]
where t=1,2,⋯,ξ. The process is executed by computing the sparse representation for each training sample, expressed as:
\[ \min_{\boldsymbol{s}_{t}} \|\boldsymbol{z}_{t}-\boldsymbol{D}\boldsymbol{s}_{t}\|_{2}^{2} \quad \text{s.t.} \quad \|\boldsymbol{s}_{t}\|_{0}\leq k, \tag{3} \]
where the vector \(\boldsymbol {z}_{t}\in \mathbb {R}^{n}\) denotes an arbitrary training sample. The above problem is commonly solved by greedy algorithms such as orthogonal matching pursuit (OMP) [17]. Specifically, in each iteration of the process for solving (3), the atom that yields the largest inner product with the residual of z_{t} is selected. Thus, k atoms are selected successively, inducing k clusters of features. For image samples, the features of different clusters vary greatly, but the atoms jointly represent the features without considering the clusters. Hence, an arbitrary atom of the dictionary may be affected by features of different clusters.
Here, we provide an example to describe this issue by employing the test image “Koala” with the size of 480×320 from the standard Berkeley image dataset [18]. We divided the image into patches with the size of 4×4, i.e., n=16. We vectorized the patches and used them as the training samples for the KSVD algorithm. The dictionary \(\boldsymbol {D}\in \mathbb {R}^{n\times q}\) was initialized as a Gaussian random matrix. We set the total number of atoms and the maximum number of sparse coefficients for each sample to q=3n and k=3, respectively. After ten iterations, the output dictionary was used for sparse coding of the image “Koala” via the OMP algorithm. We divided the image into a set of patches and coded each patch using the learned dictionary. The original image, the clusters of image features, and the residual images are presented in Fig. 1. We obtain the residual image R_{1} by removing the first cluster of features H_{1} from the original image X, where H_{1} can be regarded as the sparse representation obtained with the first atom selected in the greedy recovery process. Similarly, we obtain R_{2} and R_{3} by removing H_{2} and H_{3} from R_{1} and R_{2}, respectively.
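The patch preparation described above can be sketched as follows. This is a minimal illustration, assuming a grayscale image stored as a 2-D array and non-overlapping patches; the text does not state whether the patches overlap, and the function name `image_to_patches` is hypothetical.

```python
import numpy as np

def image_to_patches(image, patch=4):
    """Split a 2-D image into non-overlapping patch x patch blocks.

    Returns an array of shape (patch**2, number_of_patches) whose columns
    are the vectorized blocks (row-major order inside each block).
    Non-overlapping tiling is an assumption of this sketch.
    """
    h, w = image.shape
    h_crop, w_crop = h - h % patch, w - w % patch  # drop incomplete border blocks
    blocks = (image[:h_crop, :w_crop]
              .reshape(h_crop // patch, patch, w_crop // patch, patch)
              .swapaxes(1, 2)                      # -> (block_row, block_col, patch, patch)
              .reshape(-1, patch * patch))
    return blocks.T

# Stand-in for a 320 x 480 grayscale image (synthetic values, not the "Koala" image).
image = np.arange(320 * 480, dtype=float).reshape(320, 480)
Z = image_to_patches(image, patch=4)  # training matrix with n = 16 rows
```

With 4×4 blocks, each column of `Z` plays the role of one training sample \(\boldsymbol {z}_{t}\) with n=16.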
An atom of the learned dictionary can be used to represent several patches. When an atom is used to represent a patch, it may be invoked in the first (H_{1}), the second (H_{2}), or the third (H_{3}) cluster (as indicated in Fig. 1). Thus, in Table 1 and Fig. 2, we summarize how many times each atom is invoked by the different clusters of features. Most atoms are invoked by more than one cluster of features.
To further illustrate this issue, we selected the 12th atom of the learned dictionary and collected the features that invoked it. The result is displayed in Fig. 3. The features that invoked the 12th atom belong to two different clusters, and the features of these clusters differ greatly. In other words, the 12th atom of the dictionary is employed to represent two different types of features. A single atom cannot provide an accurate representation for both types of features. Hence, the atom has to compromise among these features to minimize the overall representation error, and the performance of the learned dictionary is degraded as a result. The graphical representation of the atom is presented in Fig. 3, indicating that it contains part of the characteristics of the first cluster of features and part of the characteristics of the second cluster. This implies that the learned atom is a compromise between the two types of features.
To address this problem, we propose the CKSVD algorithm for DL and the corresponding greedy algorithm for sparse recovery, which will be introduced in the following section.
2.2 CKSVD for sparse representation of images
The proposed DL algorithm is also composed of two iterative processes. As shown in Fig. 4, for the sparse coding process, we divide the atoms into k groups, each of which serves one cluster of features.
In other words, we divide the dictionary into k subdictionaries, expressed as D=[D_{1},D_{2},⋯,D_{k}]. We propose a greedy algorithm to solve the sparse recovery problem, which is described in Algorithm 1. In the lth iterative cycle, the features of the lth cluster are considered, and therefore we only search the atoms in the lth subdictionary. In other words, only the atoms of D_{l} can be selected in the lth iterative cycle. Among these atoms, the one that is most correlated with the residual obtained in the (l−1)th iteration is selected to represent the feature of the lth cluster. We compute the sparse coefficient for the current cluster based on the LS method:
\[ s_{\omega} = (\boldsymbol{b}^{T}\boldsymbol{b})^{-1}\boldsymbol{b}^{T}\boldsymbol{r}_{l-1}, \tag{4} \]
where b is the atom of D indexed by ω, \(\boldsymbol{r}_{l-1}\) is the residual obtained in the (l−1)th iteration, with \(\boldsymbol{r}_{0}=\boldsymbol{z}_{t}\), and ω is obtained by steps 3 and 4 in Algorithm 1. Then, the residual of the sample is updated by:
\[ \boldsymbol{r}_{l} = \boldsymbol{r}_{l-1} - s_{\omega}\boldsymbol{b}. \tag{5} \]
The above process is executed until the maximum number of coefficients is reached or the residual is small enough.
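The cluster-wise greedy coding described above can be sketched as follows. This is an illustrative reading of the textual description rather than a transcription of Algorithm 1, whose listing is not reproduced here; the function name, the stopping tolerance `eps`, and the use of a normalized correlation to pick the atom are assumptions of this sketch.

```python
import numpy as np

def clustered_greedy_coding(z, subdicts, eps=1e-6):
    """Select one atom from each subdictionary in turn (sketch).

    z        : signal of length n
    subdicts : list of k arrays D_l of shape (n, q_l) with nonzero columns
    eps      : hypothetical tolerance on the residual norm
    Returns the coefficient vector over D = [D_1, ..., D_k] and the residual.
    """
    sizes = [D_l.shape[1] for D_l in subdicts]
    offsets = np.concatenate(([0], np.cumsum(sizes)[:-1]))
    s = np.zeros(sum(sizes))
    r = np.asarray(z, dtype=float).copy()
    for D_l, offset in zip(subdicts, offsets):
        if np.linalg.norm(r) <= eps:
            break
        # Only atoms of the l-th subdictionary are candidates in the l-th cycle.
        correlations = D_l.T @ r / np.linalg.norm(D_l, axis=0)
        omega = int(np.argmax(np.abs(correlations)))
        b = D_l[:, omega]
        coef = (b @ r) / (b @ b)   # single-atom least-squares coefficient
        s[offset + omega] = coef
        r = r - coef * b           # residual handed to the next cluster
    return s, r

# Usage with random data (not the paper's experimental setup).
rng = np.random.default_rng(0)
subdicts = [rng.standard_normal((16, 16)) for _ in range(3)]
z = rng.standard_normal(16)
s, r = clustered_greedy_coding(z, subdicts)
```

Because each cycle searches a single subdictionary, the returned vector has at most one nonzero entry per group, which is the property the dictionary update relies on.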
Algorithm 1 is employed to compute the sparse coefficients of each training sample z_{t}. Next, the dictionary updating process is executed. In contrast to the conventional KSVD, the proposed method initializes the subdictionaries \(\{\boldsymbol {D}_{l}\in \mathbb {R}^{n\times q_{l}}\}\). A larger q leads to better performance of the dictionary but increases the complexity of the DL process and subsequent applications. For simplicity and without loss of generality, in this study, we assume that all subdictionaries contain the same number of atoms. For an arbitrary atom d_{i}, we first find the training samples that have used d_{i}, and their indexes are denoted as γ_{i}. Then, we focus on the training samples indexed by γ_{i}, i.e., \(\boldsymbol {Z}_{\gamma _{i}}\), and compute the residual of these samples with the atom d_{i} excluded, expressed as:
\[ \boldsymbol{R}_{\gamma_{i}} = \boldsymbol{Z}_{\gamma_{i}} - \sum_{j\neq i}\boldsymbol{d}_{j}\tilde{\boldsymbol{s}}_{\gamma_{i}}^{j}, \tag{6} \]
where \(\boldsymbol {R}_{\gamma _{i}}\) denotes this residual and \(\tilde {\boldsymbol {s}}_{\gamma _{i}}^{j}\) represents the jth row of \(\boldsymbol {S}_{\gamma _{i}}\). In fact, when d_{i} is excluded, the other atoms used for \(\boldsymbol {Z}_{\gamma _{i}}\) do not belong to the group that contains d_{i}, and this property can be utilized to reduce the computational complexity. Next, we apply the singular value decomposition (SVD) to the residual \(\boldsymbol {R}_{\gamma _{i}}\), expressed as:
\[ \boldsymbol{R}_{\gamma_{i}} = \boldsymbol{U}\boldsymbol{\Delta}\boldsymbol{V}^{*}. \tag{7} \]
We update the atom d_{i} to be the first column of U, denoted as u_{1}, and update the sparse coefficient row \(\tilde {\boldsymbol {s}}_{\gamma _{i}}^{i}\) to be the product of Δ_{1,1} and the first column of V, denoted as v_{1}.
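The atom update described above can be sketched with NumPy as follows. This is a minimal illustration under the assumption that the coefficient matrix S stores one row per atom and one column per training sample; the function name `update_atom` is hypothetical, and the speed-up from skipping atoms of the same group is not implemented.

```python
import numpy as np

def update_atom(Z, D, S, i):
    """Rank-1 SVD update of atom i and its coefficient row (in place).

    Z : (n, xi) training samples, D : (n, q) dictionary, S : (q, xi) coefficients.
    Returns the indexes of the samples that use atom i.
    """
    gamma = np.flatnonzero(S[i, :])      # samples that use atom i
    if gamma.size == 0:
        return gamma                     # unused atom is left unchanged in this sketch
    # Residual of these samples with the contribution of atom i excluded.
    R = Z[:, gamma] - D @ S[:, gamma] + np.outer(D[:, i], S[i, gamma])
    U, sing, Vh = np.linalg.svd(R, full_matrices=False)
    D[:, i] = U[:, 0]                    # new atom: first left singular vector
    S[i, gamma] = sing[0] * Vh[0, :]     # new row: leading singular value times v_1
    return gamma
```

Since the leading singular triplet is the best rank-one approximation of the residual, this step cannot increase the Frobenius error on the samples indexed by γ_{i}, and the updated atom has unit norm by construction.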
3 Extensions of CKSVD
The proposed idea not only leads to the CKSVD method but also builds a framework for DL. In other words, the CKSVD can still be extended for better performance. Next, we introduce two practical extensions.
3.1 Sparsitywise CKSVD
For the standard CKSVD, we fix the sparsity level, i.e., the number of sparse coefficients, for each training sample. However, this may lead to underfitting or overfitting of the sparse representation. To address this problem, we develop the sparsity-wise CKSVD (SwCKSVD). We employ multiple atoms, instead of a single atom, to represent a cluster of features. To obtain the SwCKSVD from the CKSVD, we set termination conditions that determine the number of atoms used for a training sample. The sparse recovery strategy is summarized in Algorithm 3. For each cluster, the sparse coding is realized via an iterative process. The parameter a_{max} controls the maximum expected number of used atoms. In step 9, whether more atoms are required is determined by examining whether the residual is still sufficiently correlated with the remaining atoms. This operation allows the sparsity to adapt to each training sample, aiming to achieve a satisfactory representation accuracy with as few atoms as possible. The parameter ρ controls the threshold of the termination condition, and we empirically suggest ρ=0.4∼0.6.
3.2 Dynamic CKSVD
Although the sparse coding strategies for the KSVD, the CKSVD, and the SwCKSVD are different, their dictionary updating strategies are the same. They all use the first principal component of the SVD result to update the dictionary and sparse coefficients (see steps 9 and 10 in Algorithm 2), while ignoring the other components. Under the framework of the CKSVD, the first cluster contributes most to the representation, and later clusters contribute less. For instance, the second principal component of the SVD of a residual \(\boldsymbol {R}_{\gamma _{i}}\) in the first cluster, expressed as \(\boldsymbol {u}_{2}\Delta _{2,2}\boldsymbol {v}^{*}_{2}\), may be more significant than the first principal component of the SVD of a residual in the second cluster. Based on this consideration, we extend the CKSVD to the dynamic CKSVD (DCKSVD), for which the atoms of different clusters are refined after each iterative cycle. The dictionary updating strategy is provided in Algorithm 4. In each iterative cycle, we use the second component of the SVD result with respect to the most used atom in D_{l} to replace the least used atom in the subdictionary of the next cluster. This operation makes the dictionaries dynamic: the atoms are refined after each iterative cycle, and those that contribute little to the representation are abandoned.
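The replacement step described above can be sketched as follows. This is only an illustration of the idea in the text, not a transcription of Algorithm 4, whose listing is not reproduced here; the function name and the per-atom usage counter `usage_next` are hypothetical, and how the residual of the most used atom is obtained is left to the caller.

```python
import numpy as np

def refine_next_subdictionary(R_most_used, D_next, usage_next):
    """Replace the least used atom of the next subdictionary (in place).

    R_most_used : residual matrix of the most used atom in the current subdictionary
    D_next      : next subdictionary, shape (n, q_next)
    usage_next  : number of training samples using each atom of D_next
    Returns the index of the replaced atom, or None if no second component exists.
    """
    U, sing, _ = np.linalg.svd(R_most_used, full_matrices=False)
    if sing.size < 2:
        return None
    j = int(np.argmin(usage_next))   # least used atom in the next cluster
    D_next[:, j] = U[:, 1]           # second left singular vector becomes the new atom
    return j
```

The inserted atom is orthogonal to the direction already captured by the first principal component, so it carries information that the current cluster's update would otherwise discard.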
4 Results
In this section, we provide the experimental results and analysis. The Berkeley dataset is employed for the experiments [18]. The experiments are organized as follows. First, we conducted the standard CKSVD-based DL process and the conventional KSVD-based DL process using the training dataset, in order to verify the improvement of the CKSVD over the conventional KSVD. Second, we applied the different dictionaries to compressive sensing, a typical application in the field of image processing. Besides the KSVD and the standard CKSVD, we also employed the SwCKSVD, the DCKSVD, and two existing methods extended from the KSVD, proposed in [28] and [29], respectively.
4.1 Experiments on sparse representation
The Berkeley dataset contains various images, which are divided into a training dataset and a test dataset. For this part of the experiments, we first used the training dataset to execute the DL process based on the proposed method and the conventional KSVD method, respectively. We divided the training dataset into 19,194 patches with a size of 4×4, i.e., n=16, and set the initialized dictionaries to Gaussian random matrices. We considered different numbers of atoms, different maximum numbers of sparse coefficients, and different numbers of iterative cycles. The parameters are the same for both methods in each set of experiments. After the DL process, we used the obtained dictionaries for the sparse representation of the test dataset. The test dataset contains 100 test images with the size of 480×312. The images are divided into patches, which are represented using the trained dictionaries. The accuracy is evaluated by the peak signal-to-noise ratio (PSNR), and the results are presented in Figs. 5, 6, and 7.
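For reference, the PSNR used as the accuracy measure can be computed as in the following sketch. The peak value of 255 assumes 8-bit intensities, which the text does not state explicitly, and the function name is hypothetical.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape.

    peak=255 assumes 8-bit intensities (an assumption of this sketch).
    """
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    mse = np.mean((ref - est) ** 2)      # mean squared error over all pixels
    if mse == 0:
        return float("inf")              # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

A higher PSNR corresponds to a smaller mean squared error between the original image and its sparse representation.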
It can be noted that, with the same parameters, the dictionaries trained by the CKSVD provide more accurate sparse representations of the test images. As the number of iterations increases, the performance of the dictionaries improves. When the number of iterations exceeds 10, the improvement becomes slow. Similarly, the accuracy of sparse representations benefits from a larger q_{0}. However, the growth slows down steadily, and too large a q_{0} would increase the computational complexity of sparse coding and dictionary updating. A larger number of sparse coefficients can also improve the performance of the dictionaries. However, setting k too large is not recommended, as it would reduce the sparsity of the image representations.
4.2 Applied to compressive sensing
Compressive sensing (CS) is a technique that compressively samples signals and reconstructs them from fewer measurements, in order to reduce the cost of signal transmission and storage [19–23]. The signal x can be compressively sampled by a sensing matrix Φ, expressed as:
\[ \boldsymbol{y} = \boldsymbol{\Phi}\boldsymbol{x}, \tag{8} \]
where \(\boldsymbol {y} \in \mathbb {R}^{m}\) is the measurement vector and \(\boldsymbol {\Phi }\in \mathbb {R}^{m\times n}\) with m<n. When x denotes the vectorization of an image patch, its sparse representation in terms of the dictionary \(\boldsymbol {D}\in \mathbb {R}^{n\times q}\) can be written as:
\[ \boldsymbol{x} = \boldsymbol{D}\boldsymbol{s}, \tag{9} \]
where \(\boldsymbol {s}\in \mathbb {R}^{q}\) is the sparse coefficient vector. The operation that recovers the sparse coefficients s from the measurements y, the sensing matrix Φ, and the dictionary D is referred to as reconstruction, which is expressed by:
\[ \min_{\boldsymbol{s}} \|\boldsymbol{s}\|_{0} \quad \text{s.t.} \quad \boldsymbol{y} = \boldsymbol{\Phi}\boldsymbol{D}\boldsymbol{s}. \tag{10} \]
This problem can be directly solved by greedy algorithms [17, 24–26], and then the original signal x can be obtained by (9).
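The sampling and greedy reconstruction steps above can be sketched for a single patch as follows. This is a generic OMP written for illustration, not the authors' code; the dictionary here is a random unit-norm matrix rather than a learned one, and all sizes, the seed, and the scaling of the sensing matrix are assumptions.

```python
import numpy as np

def omp(A, y, k, tol=1e-10):
    """Orthogonal matching pursuit: explain y with at most k columns of A."""
    support = []
    s = np.zeros(A.shape[1])
    coef = np.zeros(0)
    r = np.asarray(y, dtype=float).copy()
    norms = np.linalg.norm(A, axis=0)
    for _ in range(k):
        if np.linalg.norm(r) <= tol:
            break
        scores = np.abs(A.T @ r) / norms
        if support:
            scores[support] = 0.0        # never reselect a chosen column
        support.append(int(np.argmax(scores)))
        # Joint least-squares fit on all selected columns, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        r = y - A[:, support] @ coef
    if support:
        s[support] = coef
    return s

rng = np.random.default_rng(1)
n, q, m, k = 64, 128, 32, 3              # 8x8 patch, illustrative sizes
D = rng.standard_normal((n, q))
D /= np.linalg.norm(D, axis=0)           # stand-in for a learned dictionary
s_true = np.zeros(q)
s_true[[5, 40, 99]] = [1.5, -2.0, 1.0]
x = D @ s_true                           # patch that is exactly 3-sparse in D
Phi = rng.standard_normal((m, n)) / np.sqrt(m)   # Gaussian sensing matrix
y = Phi @ x                              # compressive measurements
s_hat = omp(Phi @ D, y, k)               # recover coefficients from y, Phi, and D
x_hat = D @ s_hat                        # reconstructed patch
```

The greedy search runs on the product ΦD, so the cost per patch grows with the number of atoms q, which is one reason the total number of atoms is kept equal across the compared methods.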
In this part of the experiments, we considered the application of the proposed methods to compressive sensing. We trained the dictionaries from the samples introduced in Section 4.1 using the compared methods. For the training process, we divided the training images into 36,000 patches with sizes of 6×6 and 8×8. For the size of 6×6, i.e., n=36, the maximum number of coefficients for all methods was set to 6 and the total number of atoms was set to 324; for the size of 8×8, i.e., n=64, the maximum number of coefficients for all methods was set to 9 and the total number of atoms was set to 576. The number of atoms is the same for all compared methods. Hence, for the standard CKSVD, which selects only one atom for each cluster, we set the numbers of clusters to 6 and 9. For the SwCKSVD, we set the number of clusters to 3, and therefore the maximum numbers of used atoms for each cluster were 2 and 3 for the two patch sizes. The maximum number of iterative cycles was set to 20 for all compared methods. We selected three typical images, “Elephant,” “Horse,” and “Penguin,” from the test dataset as original images and divided them into patches of the same size as the training samples. Each patch was compressed using a Gaussian random matrix with m measurements. This technique is also referred to as block CS [27]. The measurements were then reconstructed using the OMP algorithm, and the accuracy was measured by the PSNR. To reduce the influence of the randomness introduced by the Gaussian random sensing matrices, we repeated each trial 50 times. The average results are presented in Tables 2, 3, and 4.
The results demonstrate that the PSNR of the images reconstructed with the dictionaries trained by the CKSVD is much higher than that obtained with the dictionary trained by the conventional KSVD, regardless of the original image, patch size, and number of measurements. The reason is that greedy CS reconstruction requires the number of measurements to be at least twice the number of sparse coefficients; otherwise, the primary sparse coefficients may not be completely reconstructed, and the reconstruction accuracy would be significantly degraded. On the other hand, the number of sparse coefficients required for accurate representation using the proposed method is much smaller than that using the conventional KSVD, as the main features can be represented by the coefficients and atoms of the first cluster alone. Take the experiments with the patch size of 8×8 as an example. For the conventional KSVD, the maximum number of sparse coefficients is 9. For the proposed methods, the number of clusters is 3, i.e., each cluster contains at most three coefficients. In CS reconstruction with the dictionaries trained by the conventional KSVD, the image may be reconstructed accurately only when all nine sparse coefficients are found. With the dictionaries trained by the proposed method, however, finding only the coefficients of the first cluster, i.e., 3 coefficients, can lead to a satisfactory reconstruction. Therefore, the proposed method has an advantage in CS reconstruction, especially at low sampling ratios. As extensions of the CKSVD, both the SwCKSVD and the DCKSVD improve over the standard CKSVD, benefiting from the adaptive sparsity and the dynamic refinement of atoms, respectively. The CKSVD, the SwCKSVD, and the DCKSVD also outperform the methods proposed in [28] and [29]. Besides the PSNR comparison, we also provide a visual comparison of the results.
The reconstructed images based on the patch size of 8×8 with m=14 and m=32 are presented in Figs. 8 and 9. It can be noted that the quality of the images reconstructed with the proposed methods is visibly higher than that of the images reconstructed with the conventional KSVD and the methods proposed in [28] and [29].
Besides the Gaussian-initialized dictionary, we also used the discrete cosine transform (DCT) basis and a pre-learned dictionary as the initialized dictionary. For the DCT case, we first generated an overcomplete DCT basis containing the same number of atoms as the Gaussian-initialized dictionary. Then, we employed all the compared methods to conduct the DL processes. The learned dictionaries were utilized for CS reconstruction of the “Horse” image. Other experimental settings remained unchanged. The results are provided in Table 5.
For the experiment with the pre-learned dictionary, we first chose 10,000 patches for pre-training using the conventional KSVD. Then, the pre-learned dictionary was trained again with another 10,000 patches using all compared methods. The initialized dictionaries were the overcomplete DCT basis. Other experimental settings were the same as those used in the previous CS reconstruction experiments. The PSNR of the reconstructed “Horse” image using different dictionaries (including the pre-learned dictionaries) is summarized in Table 6.
The results in Table 5 demonstrate that the overcomplete DCT basis can also be utilized as the initialized dictionary for the proposed DL methods. Compared to the Gaussian random-initialized dictionary, the DCT-initialized dictionary provides better accuracy, regardless of the DL method. In Table 6, it can be noted that the pre-learned dictionaries can still be trained with new samples, with or without using the online DL methods, and the performance improves markedly after the new training process. Similarly, the proposed methods outperform the conventional methods.
4.3 Applied to image denoising
Besides the CS, we also performed the experiments to apply the proposed method to image denoising. Two images were selected for this part of experiments. The first is the “Koala” image from the Berkeley image dataset with the size of 320×480, and it has been introduced in Section 2.1. The other is the standard test image “Pepper” with the size of 512×512. The experimental setup is described as follows.

We first added the Gaussian white noise to the original images with zero mean and the variance of σ.

We selected 20,000 patches and 40,000 patches of the noised “Koala” image and the noised “Pepper” image, respectively. The size of all patches was 8×8.

We used the vectorization of these patches to conduct the DL process for two test images.

We employed the learned dictionary for denoising by using the strategy given in [32].
We initialized all dictionaries as the overcomplete DCT basis with the size of 64×256 and set the maximum number of sparse coefficients for each training sample to 6. The maximum number of iterative cycles was set to 10. We set the noise level to σ=15 and σ=20. For each noise level, we repeated the above process. A more detailed description of the denoising experiments can be found in [32]. The conventional KSVD, the methods introduced in [28] and [29], and the proposed CKSVD were employed for comparison. The results are presented in Figs. 10 and 11. It can be noted from the results that the dictionaries trained by the proposed CKSVD provide the most accurate denoised images, regardless of the original image and the noise level.
5 Discussion
As mentioned in Section 3, this study not only develops a KSVD-based method but also provides a clustering DL model. The potential and advantage of the clustering model mainly come from two aspects. First, the subdictionaries of different clusters are isolated from each other. Thus, an atom of the learned dictionary can concentrate on a specific type of feature, leading to greater utilization of atoms. In other words, a common phenomenon in the conventional DL model can be avoided, namely that some atoms are widely employed by training samples whereas others are seldom used. Second, the clustering DL model makes it possible to adjust the sparsity for different training samples and therefore to reduce the underfitting or overfitting of the sparse representation. We provide the SwCKSVD, which adaptively selects the number of used atoms for each cluster. We believe that the adaptive strategy can also be implemented by adjusting the number of clusters. This potential is supported by the fact that the SwCKSVD performs markedly better than the standard CKSVD.
Future work could consider extending the clustering DL model to online learning. In this study, we focus on batch DL, and the dictionary updating strategy is based on the SVD. We believe the proposed clustering DL model is not limited to batch DL but can be extended to the online DL problem. In [7], the standard ODL method is proposed, in which information-storing variables are updated when a new training sample arrives. The variables are then used to update the learned dictionary through an optimization approach. For clustering DL, we may utilize a separate set of information-storing variables for the subdictionary of each cluster. When a new sample arrives, we could employ Algorithm 1 or Algorithm 3 to compute the sparse coefficients with respect to the subdictionaries of the different clusters. Then, the sparse coefficients are used to update the information-storing variables of each cluster. Finally, we could update the subdictionaries based on the information-storing variables, such that clustering ODL is achieved. We believe clustering ODL has the potential to be applied in cases where training samples cannot be obtained simultaneously.
6 Conclusions
We proposed a DL method named CKSVD for sparse representation of images. In the CKSVD, the atoms of the dictionary are divided into a set of groups, and each group of atoms serves the image features of a specific cluster. Hence, the features of all clusters can be utilized and redundant atoms are avoided. Based on this strategy, we introduced the CKSVD and two practical extensions. Experimental results demonstrated that the proposed methods provide more accurate sparse representations of images than the conventional KSVD algorithm and its extensions.
Availability of data and materials
Please contact the authors for data requests.
Abbreviations
CKSVD: Clustering K-singular value decomposition
CS: Compressive sensing
DCKSVD: Dynamic CKSVD
DL: Dictionary learning
KSVD: K-singular value decomposition
LS: Least squares
MOD: Method of optimal directions
OMP: Orthogonal matching pursuit
PSNR: Peak signal-to-noise ratio
SwCKSVD: Sparsity-wise CKSVD
References
R. Rubinstein, A. M. Bruckstein, M. Elad, Dictionaries for sparse representation modeling. Proc. IEEE. 98(6), 1045–1057 (2010).
X. Lu, D. Wang, W. Shi, D. Deng, Group-based single image super-resolution with online dictionary learning. EURASIP J. Adv. Signal Process. 2016(84), 1–12 (2016).
V. Naumova, K. Schnass, Fast dictionary learning from incomplete data. EURASIP J. Adv. Signal Process. 2018(12), 1–21 (2018).
L. Zhang, W. Zuo, D. Zhang, LSDT: latent sparse domain transfer learning for visual adaptation. IEEE Trans. Image Process. 25(3), 1177–1191 (2016).
K. Engan, S. O. Aase, J. H. Husøy, Multi-frame compression: theory and design. EURASIP Signal Process. 90(2), 2121–2140 (2000).
M. Aharon, M. Elad, A. Bruckstein, The KSVD: an algorithm for designing of overcomplete dictionaries for sparse representation. IEEE Trans. Signal Process. 54(11), 4311–4322 (2006).
J. Mairal, F. Bach, J. Ponce, G. Sapiro, Online learning for matrix factorization and sparse coding. J. Mach. Learn. Res. 11, 19–60 (2010).
B. Dumitrescu, P. Irofti, Regularized KSVD. IEEE Signal Process. Lett. 24(3), 309–313 (2017).
M. Nazzal, F. Yeganli, H. Ozkaramanli, A strategy for residual component-based multiple structured dictionary learning. IEEE Signal Process. Lett. 22(11), 2059–2063 (2015).
J. K. Pant, S. Krishnan, Compressive sensing of electrocardiogram signals by promoting sparsity on the second-order difference and by using dictionary learning. IEEE Trans. Biomed. Circuits Syst. 8(2), 293–302 (2014).
L. N. Smith, M. Elad, Improving dictionary learning: multiple dictionary updates and coefficient reuse. IEEE Signal Process. Lett. 20(1), 79–82 (2013).
R. Zhao, Q. Wang, Y. Shen, J. Li, Multidimensional dictionary learning algorithm for compressive sensing-based hyperspectral imaging. J. Electron. Imaging. 25(6), 063013 (2016).
K. Skretting, K. Engan, Recursive least squares dictionary learning algorithm. IEEE Trans. Signal Process. 58(4), 2121–2130 (2010).
J. A. Tropp, Greed is good: algorithmic results for sparse approximation. IEEE Trans. Inf. Theory. 50(10), 2231–2242 (2004).
E. J. Candès, J. Romberg, T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory. 52(2), 489–509 (2006).
E. J. Candès, T. Tao, Decoding by linear programming. IEEE Trans. Inf. Theory. 51(12), 4203–4215 (2005).
J. A. Tropp, A. C. Gilbert, Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans. Inf. Theory. 53(12), 4655–4666 (2007).
D. Martin, C. Fowlkes, D. Tal, J. Malik, in Proc. IEEE Int. Conf. Comput. Vis. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics (IEEE, Vancouver, 2001), pp. 416–423.
D. L. Donoho, Compressed sensing. IEEE Trans. Inf. Theory. 52(4), 1289–1306 (2006).
E. J. Candès, Compressive sampling. Int. Congress of Mathematicians, Madrid, Spain. 3, 1433–1452 (2006).
A. Massa, P. Rocca, G. Oliveri, Compressive sensing in electromagnetics - a review. IEEE Anten. Propag. Mag. 57(1), 224–238 (2015).
D. Craven, B. McGinley, L. Kilmartin, M. Glavin, E. Jones, Compressed sensing for bioelectric signals: a review. IEEE J. Biomed. Health Inf. 19(2), 539–540 (2015).
Y. Zhang, L. Y. Zhang, et al., A review of compressive sensing in information security field. IEEE Access. 4, 2507–2519 (2016).
D. Nion, N. D. Sidiropoulos, Tensor algebra and multidimensional harmonic retrieval in signal processing for MIMO radar. IEEE Trans. Signal Process. 58(11), 5693–5705 (2010).
W. Dai, O. Milenkovic, Subspace pursuit for compressive sensing signal reconstruction. IEEE Trans. Inf. Theory. 55(5), 2230–2249 (2009).
D. L. Donoho, Y. Tsaig, I. Drori, J. L. Starck, Sparse solution of underdetermined systems of linear equations by stagewise orthogonal matching pursuit. IEEE Trans. Inf. Theory. 58(2), 1094–1121 (2012).
L. Gan, in Proc. IEEE Int. Conf. Digit. Signal Process. Block compressed sensing of natural images (IEEE, Wales, 2007), pp. 403–406.
R. Rubinstein, M. Zibulevsky, M. Elad, Efficient Implementation of the KSVD Algorithm Using Batch Orthogonal Matching Pursuit. Technical Report CS-2008-08 (Technion University, Haifa, 2008).
J. Mairal, G. Sapiro, M. Elad, Learning multiscale sparse representations for image restoration. Multiscale Model. Simul. 7(1), 214–241 (2008).
Y. Yi, Y. Cheng, C. Xu, Visual tracking based on hierarchical framework and sparse representation. Multimed. Tools Appl. 77(13), 16267–16289 (2018).
S. Tariyal, A. Majumdar, R. Singh, M. Vatsa, Deep dictionary learning. Multimed. Tools Appl. 4, 10096–10109 (2016).
M. Elad, M. Aharon, Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans. Image Process. 15(12), 3736–3745 (2006).
Acknowledgements
The authors would like to thank the National Key R&D Program of China and the National Natural Science Foundation of China for the financial support.
Funding
This work was supported by the National Key R&D Program of China under Grant 2017YFD0700302 and by the National Natural Science Foundation of China under Grant 51705193.
Author information
Contributions
JF provided the methodology. RZ wrote the original manuscript. HY reviewed and edited the manuscript. JF and LR funded this study. All authors read and approved the final manuscript.
Ethics declarations
Consent for publication
This manuscript does not contain any individual person’s data in any form.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Fu, J., Yuan, H., Zhao, R. et al. Clustering KSVD for sparse representation of images. EURASIP J. Adv. Signal Process. 2019, 47 (2019). https://doi.org/10.1186/s1363401906504
DOI: https://doi.org/10.1186/s1363401906504
Keywords
 Dictionary learning
 Sparse representation
 Image processing