Hard versus fuzzy c-means clustering for color quantization
 Quan Wen^{1} and
 M Emre Celebi^{2}
https://doi.org/10.1186/168761802011118
© Wen and Celebi; licensee Springer. 2011
Received: 2 March 2011
Accepted: 25 November 2011
Published: 25 November 2011
Abstract
Color quantization is an important operation with many applications in graphics and image processing. Most quantization methods are essentially based on data clustering algorithms. Recent studies have demonstrated the effectiveness of the hard c-means (k-means) clustering algorithm in this domain. Other studies reported similar findings pertaining to the fuzzy c-means algorithm. Interestingly, none of these studies directly compared the two types of c-means algorithms. In this study, we implement fast and exact variants of the hard and fuzzy c-means algorithms with several initialization schemes and then compare the resulting quantizers on a diverse set of images. The results demonstrate that fuzzy c-means is significantly slower than hard c-means, and that with respect to output quality, the former algorithm is neither objectively nor subjectively superior to the latter.
1 Introduction
True-color images typically contain thousands of colors, which makes their display, storage, transmission, and processing problematic. For this reason, color quantization (reduction) is commonly used as a preprocessing step for various graphics and image processing tasks. In the past, color quantization was a necessity due to the limitations of the display hardware, which could not handle the more than 16 million possible colors in 24-bit images. Although 24-bit display hardware has become more common, color quantization still maintains its practical value [1]. Modern applications of color quantization in graphics and image processing include: (i) compression [2], (ii) segmentation [3], (iii) text localization/detection [4], (iv) color-texture analysis [5], (v) watermarking [6], (vi) non-photorealistic rendering [7], and (vii) content-based retrieval [8].
The process of color quantization consists mainly of two phases: palette design (the selection of a small set of colors that represents the original image colors) and pixel mapping (the assignment of each input pixel to one of the palette colors). The primary objective is to reduce the number of unique colors, N', in an image to C, C ≪ N', with minimal distortion. In most applications, 24-bit pixels in the original image are reduced to 8 bits or fewer. Since natural images often contain a large number of colors, faithful representation of these images with a limited size palette is a difficult problem.
Color quantization methods can be broadly classified into two categories [9]: image-independent methods that determine a universal (fixed) palette without regard to any specific image [10] and image-dependent methods that determine a custom (adaptive) palette based on the color distribution of the images. Despite being very fast, image-independent methods usually give poor results since they do not take into account the image contents. Therefore, most of the studies in the literature consider only image-dependent methods, which strive to achieve a better balance between computational efficiency and visual quality of the quantization output.
Numerous image-dependent color quantization methods have been developed in the past three decades. These can be categorized into two families: preclustering methods and postclustering methods [1]. Preclustering methods are mostly based on the statistical analysis of the color distribution of the images. Divisive preclustering methods start with a single cluster that contains all N' image colors. This initial cluster is recursively subdivided until C clusters are obtained. Well-known divisive methods include median-cut [11], octree [12], variance-based method [13], binary splitting method [14], and greedy orthogonal bipartitioning method [15]. On the other hand, agglomerative preclustering methods [16–18] start with N' singleton clusters each of which contains one image color. These clusters are repeatedly merged until C clusters remain. In contrast to preclustering methods that compute the palette only once, postclustering methods first determine an initial palette and then improve it iteratively. Essentially, any data clustering method can be used for this purpose. Since these methods involve iterative or stochastic optimization, they can obtain higher quality results when compared to preclustering methods at the expense of increased computational time. Clustering algorithms adapted to color quantization include hard c-means [19–22], competitive learning [23–27], fuzzy c-means [28–32], and self-organizing maps [33–35].
In this paper, we compare the performance of hard and fuzzy c-means algorithms within the context of color quantization. We implement several efficient variants of both algorithms, each one with a different initialization scheme, and then compare the resulting quantizers on a diverse set of images. The rest of the paper is organized as follows. Section 2 reviews the notions of hard and fuzzy partitions and gives an overview of the hard and fuzzy c-means algorithms. Section 3 describes the experimental setup and compares the hard and fuzzy c-means variants on the test images. Finally, Section 4 gives the conclusions.
2 Color quantization using c-means clustering algorithms
2.1 Hard versus fuzzy partitions
Let X = {x_{1}, x_{2}, . . . , x_{N}} ⊂ ℝ^{D} denote the data set to be partitioned into C clusters, and let U = [u_{ik}] be a C × N matrix with u_{ik} ∈ {0, 1} ∀i, k that represents a hard C-partition of X (Equation 1). Row i of U, say U_{i} = (u_{i1}, u_{i2}, . . . , u_{iN}), exhibits the characteristic function of the ith partition (cluster) of X: u_{ik} is 1 if x_{k} is in the ith partition and 0 otherwise; $\sum_{i=1}^{C}u_{ik}=1\quad\forall k$ means that each x_{k} is in exactly one of the C partitions; $0<\sum_{k=1}^{N}u_{ik}<N\quad\forall i$ means that no partition is empty and no partition is all of X, i.e. 2 ≤ C ≤ N. For obvious reasons, U is often called a partition (membership) matrix.
The concept of hard C-partition can be generalized by relaxing the first condition in Equation 1 to u_{ik} ∈ [0, 1], in which case the partition matrix U is said to represent a fuzzy C-partition of X [37]. In a fuzzy partition matrix U, the total membership of each x_{k} is still 1, but since 0 ≤ u_{ik} ≤ 1 ∀i, k, it is possible for each x_{k} to have an arbitrary distribution of membership among the C fuzzy partitions {U_{i}}.
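As a small illustration of these definitions, the following Python sketch checks whether a C × N membership matrix satisfies the conditions of a fuzzy or a hard C-partition. The function names and the tolerance parameter are hypothetical and serve only this example.

```python
from typing import List

Matrix = List[List[float]]  # C rows (clusters) by N columns (data points)


def is_fuzzy_partition(U: Matrix, tol: float = 1e-9) -> bool:
    """Check the fuzzy C-partition conditions on a C x N membership matrix."""
    C, N = len(U), len(U[0])
    # Every membership must lie in [0, 1].
    if any(u < -tol or u > 1 + tol for row in U for u in row):
        return False
    # The memberships of each point must sum to 1 over the C clusters.
    if any(abs(sum(U[i][k] for i in range(C)) - 1.0) > tol for k in range(N)):
        return False
    # No cluster may be empty and no cluster may absorb all of X.
    return all(tol < sum(row) < N - tol for row in U)


def is_hard_partition(U: Matrix) -> bool:
    """A hard C-partition is a fuzzy one whose memberships are all 0 or 1."""
    return is_fuzzy_partition(U) and all(u in (0, 1) for row in U for u in row)


hard = [[1, 1, 0], [0, 0, 1]]
fuzzy = [[0.9, 0.6, 0.2], [0.1, 0.4, 0.8]]
print(is_hard_partition(hard), is_hard_partition(fuzzy), is_fuzzy_partition(fuzzy))
```

In the fuzzy example every column still sums to 1, but each point distributes its membership over both clusters.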
2.2 Hard c-means (HCM) clustering algorithm
HCM seeks a hard C-partition of X that minimizes the sum of squared distances between the input vectors and their cluster centers, $J(U,V)=\sum_{k=1}^{N}\sum_{i=1}^{C}u_{ik}\,(d_{ik})^{2}$ (Equation 2), where U is a hard partition matrix as defined in §2.1, V = {v_{1}, v_{2}, . . . , v_{C}} ⊂ ℝ^{D} is a set of C cluster representatives (centers), i.e. v_{i} is the center of hard cluster U_{i} ∀i, and d_{ik} denotes the Euclidean $\left({\mathcal{L}}_{2}\right)$ distance between input vector x_{k} and cluster center v_{i}, i.e. d_{ik} = ‖x_{k} − v_{i}‖_{2}.
This problem is known to be NP-hard even for C = 2 [39] or D = 2 [40], but a heuristic method developed by Lloyd [41] offers a simple solution. Lloyd's algorithm starts with C arbitrary centers, typically chosen uniformly at random from the data points. Each point is then assigned to the nearest center, and each center is recalculated as the mean of all points assigned to it. These two steps are repeated until a predefined termination criterion is met.
The complexity of HCM is $\mathcal{O}\left(NC\right)$ per iteration for a fixed D value. In color quantization applications, D often equals three since the clustering procedure is usually performed in a three-dimensional color space such as RGB or CIEL*a*b* [42].
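The two alternating steps of Lloyd's algorithm can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in this study: the function names are hypothetical, the termination criterion (no change in the assignments, or an iteration cap) is one simple choice among several, and an empty cluster simply keeps its previous center.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd(points: List[Point], C: int, iters: int = 100,
          seed: int = 0) -> Tuple[List[Point], List[int]]:
    rng = random.Random(seed)
    # Start with C centers chosen uniformly at random from the data points.
    centers = rng.sample(points, C)
    labels = [-1] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(C), key=lambda i: sq_dist(p, centers[i]))
                      for p in points]
        if new_labels == labels:  # assumed termination criterion
            break
        labels = new_labels
        # Update step: each center becomes the mean of its assigned points.
        for i in range(C):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # an empty cluster keeps its previous center
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centers, labels = lloyd(points, C=2)
print(sorted(centers), labels)
```

Each iteration evaluates N × C distances, which is the source of the per-iteration cost discussed next.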
From a clustering perspective, HCM has the following advantages:
◊ It is conceptually simple, versatile, and easy to implement.
◊ It has a time complexity that is linear in N and C.
◊ It is guaranteed to terminate [43] with a quadratic convergence rate [44].
Due to its gradient descent nature, HCM often converges to a local minimum of its objective functional [43] and its output is highly sensitive to the selection of the initial cluster centers. Adverse effects of improper initialization include empty clusters, slower convergence, and a higher chance of getting stuck in bad local minima. From a color quantization perspective, HCM has two additional drawbacks. First, despite its linear time complexity, the iterative nature of the algorithm renders the palette generation phase computationally expensive. Second, the pixel mapping phase is inefficient, since for each input pixel a full search of the palette is required to determine the nearest color. In contrast, preclustering methods often manipulate and store the palette in a special data structure (binary trees are commonly used), which allows for fast nearest neighbor search during the mapping phase. Note that these drawbacks are shared by the majority of postclustering methods, including the fuzzy c-means algorithm.
We have recently proposed a fast and exact HCM variant called Weighted Sort-Means (WSM) that utilizes data reduction and accelerated nearest neighbor search [21, 22]. When initialized with a suitable preclustering method, WSM has been shown to outperform a large number of classic and state-of-the-art quantization methods including median-cut [11], octree [12], variance-based method [13], binary splitting method [14], greedy orthogonal bipartitioning method [15], neuquant [33], split and merge method [18], adaptive distributing units method [23, 26], finite-state HCM method [19], and stable-flags HCM method [20].
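The full-search pixel mapping described above can be sketched as follows. This is a hypothetical helper for illustration only, not the accelerated mapping scheme used in the experiments; it makes the cost explicit, since every pixel is compared against all C palette colors.

```python
from typing import List, Sequence, Tuple

Color = Tuple[int, int, int]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_pixels(pixels: Sequence[Color], palette: Sequence[Color]) -> List[int]:
    """Return, for each pixel, the index of the nearest palette color (full search)."""
    return [min(range(len(palette)), key=lambda j: sq_dist(p, palette[j]))
            for p in pixels]


palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]
pixels = [(10, 10, 10), (250, 240, 245), (200, 30, 20)]
print(map_pixels(pixels, palette))
```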
In this study, WSM is used in place of HCM since both algorithms give numerically identical results. However, in the remainder of this paper, WSM will be referred to as HCM for reasons of uniformity.
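The data reduction idea can be illustrated with a short Python sketch: duplicate pixels are collapsed into unique colors with occurrence counts, and cluster centers are then computed as weighted means. This is only an illustration of the principle, with hypothetical function names; it is not the WSM implementation, whose sorting-based nearest neighbor acceleration is described in [21, 22].

```python
from collections import Counter
from typing import List, Sequence, Tuple

Color = Tuple[int, int, int]


def reduce_colors(pixels: Sequence[Color]) -> Tuple[List[Color], List[int]]:
    """Collapse duplicate pixels into unique colors with occurrence counts."""
    counts = Counter(pixels)
    colors = list(counts)
    return colors, [counts[c] for c in colors]


def weighted_center(colors: Sequence[Color],
                    weights: Sequence[int]) -> Tuple[float, ...]:
    """Center of a cluster computed from unique colors and their counts."""
    total = sum(weights)
    return tuple(sum(w * c[d] for c, w in zip(colors, weights)) / total
                 for d in range(3))


pixels = [(255, 0, 0)] * 3 + [(0, 0, 255)]
colors, weights = reduce_colors(pixels)
print(colors, weights, weighted_center(colors, weights))
```

Because the weighted mean over the unique colors equals the plain mean over all pixels, clustering the reduced data does not change the resulting centers, while the number of points processed per iteration drops from the number of pixels to the number of unique colors.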
2.3 Fuzzy c-means (FCM) clustering algorithm
FCM generalizes HCM to fuzzy C-partitions of X by minimizing the objective functional $J_{m}(U,V)=\sum_{k=1}^{N}\sum_{i=1}^{C}(u_{ik})^{m}(d_{ik})^{2}$ (Equation 3), where the parameter 1 ≤ m < ∞ controls the degree of membership sharing between fuzzy clusters in X.
A naive implementation of FCM has a complexity of $\mathcal{O}\left(N{C}^{2}\right)$ per iteration, which is quadratic in the number of clusters. In this study, a linear complexity formulation, i.e. $\mathcal{O}\left(NC\right)$, described in [46] is used. In order to take advantage of the peculiarities of color image data (presence of duplicate samples, limited range, and sparsity), the same data reduction strategy used in WSM is incorporated into FCM.
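One FCM iteration can be sketched in Python as follows, assuming the standard alternating updates for m > 1: each membership u_{ik} is proportional to (d_{ik}^2)^{−1/(m−1)}, normalized over the C clusters, and each center v_{i} is the mean of the data weighted by (u_{ik})^m. Computing the normalizing sum once per point keeps the membership update linear in C. This is one simple way to obtain linear cost and is not necessarily the formulation of [46]; all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fcm_memberships(points: List[Point], centers: List[Point],
                    m: float) -> List[List[float]]:
    """Membership update (requires m > 1); returns a C x N matrix."""
    p = 1.0 / (m - 1.0)  # exponent applied to the squared distances
    C = len(centers)
    U = [[0.0] * len(points) for _ in range(C)]
    for k, x in enumerate(points):
        d2 = [sq_dist(x, v) for v in centers]
        zeros = [i for i, d in enumerate(d2) if d == 0.0]
        if zeros:  # the point coincides with one or more centers
            for i in zeros:
                U[i][k] = 1.0 / len(zeros)
            continue
        inv = [d ** -p for d in d2]
        total = sum(inv)  # computed once per point, hence O(C) work per point
        for i in range(C):
            U[i][k] = inv[i] / total
    return U


def fcm_centers(points: List[Point], U: List[List[float]],
                m: float) -> List[Point]:
    """Center update: mean of the points weighted by u_ik ** m."""
    dim = len(points[0])
    centers = []
    for row in U:
        w = [u ** m for u in row]
        total = sum(w)
        centers.append(tuple(sum(wk * x[d] for wk, x in zip(w, points)) / total
                             for d in range(dim)))
    return centers


points = [(0.0,), (1.0,), (9.0,), (10.0,)]
U = fcm_memberships(points, [(0.5,), (9.5,)], m=2.0)
print(fcm_centers(points, U, m=2.0))
```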
3 Experimental results and discussion
3.1 Image set and performance criteria
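The effectiveness of a quantization method was quantified by the mean absolute error (MAE) and mean squared error (MSE) measures (Equation 6): $\mathrm{MAE}\left(\mathbf{I},\widehat{\mathbf{I}}\right)=\frac{1}{HW}\sum_{r=1}^{H}\sum_{c=1}^{W}{\left\Vert \mathbf{I}(r,c)-\widehat{\mathbf{I}}(r,c)\right\Vert}_{1}$ and $\mathrm{MSE}\left(\mathbf{I},\widehat{\mathbf{I}}\right)=\frac{1}{HW}\sum_{r=1}^{H}\sum_{c=1}^{W}{\left\Vert \mathbf{I}(r,c)-\widehat{\mathbf{I}}(r,c)\right\Vert}_{2}^{2}$, where I and $\widehat{\mathbf{I}}$ denote, respectively, the H × W original and quantized images in the RGB color space. MAE and MSE represent the average color distortion with respect to the ${\mathcal{L}}_{1}$ (City-block) and ${\mathcal{L}}_{2}^{2}$ (squared Euclidean) norms, respectively. Note that most of the other popular evaluation measures in the color quantization literature such as peak signal-to-noise ratio (PSNR), normalized MSE, root MSE, and average color distortion [24, 34] are variants of MAE or MSE.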
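The two measures can be computed as in the following Python sketch, which assumes that both images are given as flat lists of RGB tuples of equal length H × W. The function name is hypothetical.

```python
from typing import Sequence, Tuple

Color = Tuple[int, int, int]


def mae_mse(original: Sequence[Color],
            quantized: Sequence[Color]) -> Tuple[float, float]:
    """Average L1 and squared L2 color distortion over all pixels."""
    n = len(original)
    mae = sum(sum(abs(a - b) for a, b in zip(p, q))
              for p, q in zip(original, quantized)) / n
    mse = sum(sum((a - b) ** 2 for a, b in zip(p, q))
              for p, q in zip(original, quantized)) / n
    return mae, mse


original = [(10, 20, 30), (0, 0, 0)]
quantized = [(12, 20, 27), (0, 4, 0)]
print(mae_mse(original, quantized))
```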
The efficiency of a quantization method was measured by CPU time in milliseconds, which includes the time required for both the palette generation and the pixel mapping phases. The fast pixel mapping algorithm described in [49] was used in the experiments. All of the programs were implemented in the C language, compiled with the gcc v4.4.3 compiler, and executed on an Intel Xeon E5520 2.26 GHz machine. The time figures were averaged over 20 runs.
3.2 Comparison of HCM and FCM
The following well-known preclustering methods were used in the experiments:

Median-cut (MC) [11]: This method starts by building a 32 × 32 × 32 color histogram that contains the original pixel values reduced to 5 bits per channel by uniform quantization (bit-cutting). This histogram volume is then recursively split into smaller boxes until C boxes are obtained. At each step, the box that contains the largest number of pixels is split along the longest axis at the median point, so that the resulting sub-boxes each contain approximately the same number of pixels. The centroids of the final C boxes are taken as the color palette.

Octree (OCT) [12]: This two-phase method first builds an octree (a tree data structure in which each internal node has up to eight children) that represents the color distribution of the input image and then, starting from the bottom of the tree, prunes the tree by merging its nodes until C colors are obtained. In the experiments, the tree depth was limited to 6.

Variance-based method (WAN) [13]: This method is similar to MC with the exception that at each step the box with the largest weighted variance (squared error) is split along the major (principal) axis at the point that minimizes the marginal squared error.

Greedy orthogonal bipartitioning method (WU) [15]: This method is similar to WAN with the exception that at each step the box with the largest weighted variance is split along the axis that minimizes the sum of the variances on both sides.
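The bit-cutting step that median-cut uses to build its 32 × 32 × 32 histogram can be sketched as follows, assuming 8-bit RGB channels. The function name is hypothetical and the sketch covers only the histogram construction, not the recursive box splitting.

```python
from collections import Counter
from typing import Sequence, Tuple

Color = Tuple[int, int, int]


def bitcut_histogram(pixels: Sequence[Color], bits: int = 5) -> Counter:
    """Histogram of 8-bit RGB pixels after keeping the top `bits` bits per channel."""
    shift = 8 - bits
    return Counter((r >> shift, g >> shift, b >> shift) for r, g, b in pixels)


hist = bitcut_histogram([(255, 255, 255), (250, 249, 248), (0, 7, 8)])
print(hist)
```

With 5 bits per channel, each histogram cell covers an 8 × 8 × 8 cube of the original RGB space, so nearby colors such as the first two pixels above fall into the same cell.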
Four variants of HCM/FCM, each one initialized with a different preclustering method, were tested. Each variant was executed until it converged. Convergence was determined by the following commonly used criterion [50]: (J_{(i−1)} − J_{(i)})/J_{(i)} ≤ ε, where J_{(i)} denotes the value of the objective functional (Eqs. (2) and (3) for HCM and FCM, respectively) at the end of the ith iteration. The convergence threshold was set to ε = 0.001.
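The convergence test amounts to a one-line check on two consecutive objective values, sketched below. The function name is hypothetical, and the guard for a zero objective is an added assumption to avoid division by zero.

```python
def has_converged(prev_J: float, curr_J: float, eps: float = 0.001) -> bool:
    """Relative decrease of the objective between two consecutive iterations."""
    if curr_J == 0.0:  # assumed guard: a zero objective cannot be improved further
        return True
    return (prev_J - curr_J) / curr_J <= eps


print(has_converged(100.0, 99.95), has_converged(100.0, 90.0))
```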
The weighting exponent (m) value recommended for color quantization applications ranges between 1.3 [30] and 2.0 [31]. In the experiments, four different m values were tested for each of the FCM variants: 1.25, 1.50, 1.75, and 2.00.
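MAE comparison of the quantization methods (columns per image: Init, HCM, and FCM with m = 1.25, 1.50, 1.75, and 2.00)
C  |  Method  |  Hats: Init  |  HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00  |  Motocross: Init  |  HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00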
32  MC  30  16  16  16  16  15  26  19  19  19  18  18 
OCT  19  15  15  15  15  15  21  17  18  18  18  18  
WAN  26  15  15  15  15  15  24  18  18  18  18  18  
WU  18  15  15  15  15  15  21  18  18  17  17  18  
64  MC  18  12  12  11  11  11  20  15  15  14  14  14 
OCT  13  10  10  10  10  10  15  13  13  13  13  13  
WAN  18  11  11  10  10  11  19  14  14  13  13  14  
WU  12  10  10  10  10  10  15  13  13  13  13  13  
128  MC  13  9  8  8  8  8  16  12  11  11  11  11 
OCT  9  7  7  7  7  7  12  10  10  10  10  10  
WAN  11  8  7  7  7  7  15  10  10  10  10  11  
WU  9  7  7  7  7  7  12  10  10  10  10  10  
256  MC  10  7  6  6  6  6  13  9  9  9  8  9 
OCT  6  5  5  5  5  5  9  8  8  8  8  8  
WAN  9  5  5  5  5  5  12  8  8  8  8  8  
WU  6  5  5  5  5  5  9  8  8  8  8  8  
C  |  Method  |  Flowers and Sill: Init  |  HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00  |  Cover Girl: Init  |  HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00
32  MC  20  14  14  14  13  13  22  16  15  14  14  14 
OCT  15  12  12  12  12  12  17  14  14  14  13  13  
WAN  17  12  12  12  12  12  18  14  14  14  14  14  
WU  14  12  12  12  12  12  16  14  14  14  14  14  
64  MC  14  11  10  10  10  10  16  11  11  11  11  10 
OCT  11  9  9  9  9  9  12  10  10  10  10  10  
WAN  12  9  9  9  9  9  15  11  11  10  10  11  
WU  10  9  9  9  9  9  12  10  10  10  10  10  
128  MC  12  8  8  8  7  7  13  9  8  8  8  8 
OCT  8  7  7  7  7  7  9  8  7  7  7  8  
WAN  9  7  7  7  7  7  12  8  8  8  8  8  
WU  8  7  7  7  7  7  9  8  8  8  8  8  
256  MC  9  6  6  6  6  6  11  7  7  6  6  6 
OCT  6  5  5  5  5  5  7  6  6  6  6  6  
WAN  8  5  5  5  5  5  10  6  6  6  6  6  
WU  6  5  5  5  5  5  7  6  6  6  6  6  
C  |  Method  |  Parrots: Init  |  HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00  |  Poolballs: Init  |  HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00
32  MC  28  21  21  20  21  21  12  9  9  9  7  7 
OCT  24  20  20  20  20  20  8  6  6  6  6  6  
WAN  25  21  20  20  20  20  11  6  6  6  6  6  
WU  23  20  20  20  20  20  7  7  6  6  6  6  
64  MC  22  15  15  15  15  15  9  6  6  6  5  5 
OCT  18  15  15  15  15  15  5  4  4  3  3  4  
WAN  19  15  15  15  15  15  9  4  4  4  4  4  
WU  17  15  15  15  15  15  5  4  4  4  4  4  
128  MC  16  12  12  12  12  12  7  5  5  5  4  3 
OCT  14  11  11  11  11  11  3  2  2  2  2  2  
WAN  15  11  11  11  11  12  9  3  3  3  3  3  
WU  13  11  11  11  11  11  4  3  3  3  2  2  
256  MC  13  9  9  9  9  9  7  4  3  3  3  2 
OCT  10  9  8  8  9  9  2  2  2  2  2  2  
WAN  12  9  9  9  9  9  8  2  2  2  2  2  
WU  10  9  8  8  9  9  4  2  2  2  2  2 
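MSE comparison of the quantization methods (columns per image: Init, HCM, and FCM with m = 1.25, 1.50, 1.75, and 2.00)
C  |  Method  |  Hats: Init  |  HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00  |  Motocross: Init  |  HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00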
32  MC  618  159  169  163  175  185  427  217  209  229  236  253 
OCT  293  185  184  187  214  242  301  197  203  249  277  280  
WAN  624  162  160  165  172  201  446  194  193  220  235  291  
WU  213  157  157  156  163  172  268  191  191  194  198  208  
64  MC  192  91  87  86  87  99  232  125  123  119  125  134 
OCT  132  79  79  78  87  94  159  111  112  122  129  142  
WAN  311  89  83  84  100  110  292  112  111  117  122  141  
WU  103  72  75  75  79  85  147  109  109  111  121  126  
128  MC  111  47  45  45  50  52  154  76  74  72  75  86 
OCT  65  43  43  43  48  52  96  65  65  69  76  91  
WAN  106  44  42  44  48  51  169  66  66  68  72  85  
WU  52  38  40  40  42  46  87  63  63  65  70  84  
256  MC  63  29  27  26  28  31  100  49  45  45  48  57 
OCT  34  22  24  25  28  33  54  39  39  42  48  55  
WAN  53  21  23  24  26  30  92  39  39  40  44  53  
WU  30  21  23  23  25  28  51  38  38  39  43  50  
C  |  Method  |  Flowers and Sill: Init  |  HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00  |  Cover Girl: Init  |  HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00
32  MC  257  117  117  114  112  120  269  142  132  127  130  135 
OCT  155  102  102  102  109  120  182  127  127  128  131  137  
WAN  198  102  100  101  107  114  230  126  127  129  133  137  
WU  134  101  100  101  103  108  162  126  125  126  129  133  
64  MC  113  66  64  64  65  70  145  79  78  76  80  85 
OCT  88  58  57  58  66  75  105  72  72  75  78  87  
WAN  98  56  55  56  59  64  157  75  75  77  83  88  
WU  71  53  56  57  59  61  93  71  72  73  76  82  
128  MC  84  42  39  38  39  43  104  52  45  44  47  56 
OCT  47  33  33  34  37  42  62  42  42  44  47  52  
WAN  57  29  32  33  35  39  102  44  43  45  50  57  
WU  40  30  32  32  34  38  55  41  40  41  44  49  
256  MC  48  23  24  23  24  27  68  32  29  28  29  34 
OCT  26  19  21  21  24  27  36  25  25  25  29  33  
WAN  37  18  20  20  22  25  63  26  25  26  28  32  
WU  26  18  20  20  22  24  33  24  24  24  26  31  
C  |  Method  |  Parrots: Init  |  HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00  |  Poolballs: Init  |  HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00
32  MC  418  240  240  241  274  285  136  74  72  71  66  61 
OCT  342  247  246  246  255  265  130  74  67  75  85  88  
WAN  376  246  239  246  254  263  112  49  49  50  52  54  
WU  299  234  234  237  244  256  68  50  50  50  50  54  
64  MC  274  137  137  138  140  157  64  39  39  39  28  30 
OCT  191  133  132  135  140  155  48  29  27  28  29  34  
WAN  233  131  131  132  141  164  59  22  22  22  22  24  
WU  167  130  130  131  135  155  31  22  21  21  22  23  
128  MC  147  82  80  82  86  95  38  22  21  19  15  15 
OCT  111  79  78  79  85  97  20  12  12  12  13  16  
WAN  153  78  77  80  88  97  45  12  11  11  11  12  
WU  95  77  77  78  83  91  17  11  10  10  11  11  
256  MC  96  50  49  49  53  62  27  13  10  9  8  8 
OCT  64  48  47  50  54  61  9  6  5  6  6  7  
WAN  92  44  47  49  55  61  38  6  6  5  6  6  
WU  58  46  46  48  52  59  11  6  5  5  6  6 
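CPU time comparison of the quantization methods, in milliseconds (columns per image: HCM, and FCM with m = 1.25, 1.50, 1.75, and 2.00)
C  |  Method  |  Hats: HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00  |  Motocross: HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00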
32  MC  48  2,664  3,238  3,192  934  84  11,797  7,749  9,244  1,895 
OCT  80  1,883  2,032  1,656  691  110  4,139  5,034  4,054  912  
WAN  45  3,406  2,709  2,980  762  60  4,261  2,971  4,013  715  
WU  50  1,976  2,227  1,854  425  60  4,547  4,751  4,016  974  
64  MC  59  10,536  11,059  5,494  1,211  101  29,081  24,021  24,858  5,640 
OCT  97  5,045  7,353  5,533  1,379  130  10,154  8,752  9,366  1,857  
WAN  62  9,350  9,729  10,303  1,501  94  12,531  8,842  10,308  3,160  
WU  54  4,228  4,756  4,822  1,332  71  6,361  6,903  8,441  2,020  
128  MC  108  20,269  19,945  15,815  2,879  156  49,930  54,102  57,146  14,704 
OCT  141  12,700  11,745  8,799  2,444  180  22,410  20,504  18,866  5,297  
WAN  89  22,871  13,143  11,544  2,071  125  17,472  19,467  23,061  5,683  
WU  76  12,719  11,191  11,114  2,300  113  15,604  14,833  13,684  5,049  
256  MC  267  42,670  51,559  35,602  6,126  607  144,758  116,915  131,130  28,752 
OCT  306  20,287  19,512  17,806  5,039  328  39,101  42,906  37,946  7,988  
WAN  202  26,505  20,574  18,794  5,649  380  50,621  45,127  38,105  9,152  
WU  191  19,058  20,692  18,763  5,434  284  39,098  43,176  32,835  8,767  
C  |  Method  |  Flowers and Sill: HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00  |  Cover Girl: HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00
32  MC  56  5,591  5,633  5,243  1,385  55  6,067  6,772  7,402  1,545 
OCT  81  2,618  4,151  3,447  645  82  1,992  2,615  2,026  584  
WAN  42  2,240  2,525  2,625  709  45  1,934  1,988  1,975  613  
WU  42  2,111  1,585  1,590  547  41  1,927  1,692  2,264  511  
64  MC  62  10,508  9,098  8,938  1,970  77  14,165  24,945  18,248  4,979 
OCT  99  9,091  6,579  7,396  1,369  100  6,431  6,775  4,570  1,803  
WAN  58  5,413  4,060  4,491  1,067  59  6,540  9,785  7,905  2,574  
WU  53  3,887  3,992  3,434  1,005  62  5,745  4,913  4,242  1,409  
128  MC  124  35,372  31,854  28,658  4,198  120  47,186  45,248  34,731  9,428 
OCT  120  9,787  11,505  11,709  2,375  130  12,311  13,002  9,794  2,290  
WAN  86  10,875  10,344  11,189  2,378  103  19,432  12,332  13,069  3,347  
WU  84  9,145  12,170  9,570  2,897  95  11,016  9,889  8,602  2,872  
256  MC  368  63,209  64,305  46,177  9,147  403  84,079  104,289  71,327  19,082 
OCT  291  30,560  27,794  23,475  4,738  279  31,042  27,404  25,272  6,417  
WAN  223  28,113  21,109  33,265  5,994  238  33,780  31,421  35,709  6,883  
WU  226  19,480  19,660  19,310  5,480  216  27,107  25,100  26,488  7,728  
C  |  Method  |  Parrots: HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00  |  Poolballs: HCM  |  FCM 1.25  |  FCM 1.50  |  FCM 1.75  |  FCM 2.00
32  MC  74  8,209  9,359  6,894  1,917  15  1,076  813  1,004  518 
OCT  124  8,127  8,586  13,018  2,408  31  980  1,041  974  305  
WAN  65  8,465  4,977  4,095  1,172  15  549  467  441  116  
WU  60  3,793  3,346  3,071  1,362  15  729  1,080  1,274  201  
64  MC  120  16,492  16,168  18,400  4,936  17  1,556  1,504  2,819  708 
OCT  132  10,659  8,395  9,286  2,773  36  3,261  2,625  2,692  519  
WAN  85  11,756  12,993  8,709  3,065  19  1,133  1,396  1,103  371  
WU  80  6,438  6,155  6,665  2,184  20  1,353  1,056  867  314  
128  MC  158  49,581  49,913  42,309  12,247  33  2,492  5,939  4,760  849 
OCT  181  28,474  27,161  26,921  5,902  51  3,032  2,385  3,310  1,042  
WAN  136  30,827  20,314  23,764  6,878  36  3,576  4,150  2,517  767  
WU  122  15,272  19,182  20,661  6,875  33  4,816  3,629  3,484  581  
256  MC  536  128,094  103,153  104,613  20,178  224  15,378  10,863  9,566  2,499 
OCT  391  54,419  57,325  41,750  10,665  144  6,091  6,194  5,398  1,306  
WAN  380  63,969  59,283  50,189  16,601  120  6,372  4,831  6,123  1,292  
WU  306  42,535  38,776  43,910  12,148  113  4,977  5,865  7,330  1,291 
⊳ The most effective initialization method is WU, whereas the least effective one is MC.
⊳ Both HCM and FCM reduce the quantization distortion regardless of the initialization method used. However, the percentage of MAE/MSE reduction is more significant for some initialization methods than others. In general, HCM/FCM is more likely to obtain a significant improvement in MAE/MSE when initialized by an ineffective preclustering algorithm such as MC or WAN. This is not surprising given that such ineffective methods generate outputs that are likely to be far from a local minimum, and hence HCM/FCM can significantly improve upon their results.
⊳ With respect to MAE, the HCM variant and the four FCM variants have virtually identical performance.
⊳ With respect to MSE, the performances of the HCM variant and the FCM variant with m = 1.25 are indistinguishable. Furthermore, the effectiveness of the FCM variants degrades with increasing m value.
⊳ On average, HCM is 92 times faster than FCM. This is because HCM uses hard memberships, which make possible various computational optimizations that do not affect the accuracy of the algorithm [51–55]. On the other hand, due to the intensive fuzzy membership calculations involved, accelerating FCM is significantly more difficult, which is why the majority of existing acceleration methods involve approximations [56–60]. Note that the fast HCM/FCM implementations used in this study give exactly the same results as the conventional HCM/FCM.
⊳ The FCM variant with m = 2.00 is the fastest since, among the m values tested in this study, only m = 2.00 leads to integer exponents in Equations 4 and 5.
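The remark on integer exponents can be checked with exact arithmetic. The sketch below assumes that Equations 4 and 5 are the standard FCM updates, in which distance ratios are raised to the power 2/(m − 1) in the membership update and memberships are raised to the power m in the center update. The function names are hypothetical.

```python
from fractions import Fraction
from typing import Tuple


def exponents(m: str) -> Tuple[Fraction, Fraction]:
    """Exponents of the assumed standard FCM updates for a given m."""
    mf = Fraction(m)
    return 2 / (mf - 1), mf  # (membership update, center update)


def all_integer(m: str) -> bool:
    """True if both update exponents are integers."""
    return all(e.denominator == 1 for e in exponents(m))


for m in ("1.25", "1.50", "1.75", "2.00"):
    dist_exp, memb_exp = exponents(m)
    print(m, dist_exp, memb_exp, all_integer(m))
```

Under this assumption, m = 1.25 and m = 1.50 give integer exponents in the membership update (8 and 4) but not in the center update, so only m = 2.00 allows both updates to be computed with plain multiplications instead of general power evaluations.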
It could be argued that HCM's objective functional, Equation 2, is essentially equivalent to MSE, Equation 6, and therefore it is unreasonable to expect FCM to outperform HCM with respect to MSE unless m ≈ 1.00. However, neither HCM nor FCM minimizes MAE and yet their MAE performances are nearly identical. Hence, it can be safely concluded that FCM is not superior to HCM with respect to quantization effectiveness. Moreover, due to its simple formulation, HCM is amenable to various optimization techniques, whereas FCM's formulation permits only modest acceleration. Therefore, HCM should definitely be preferred over FCM when computational efficiency is of prime importance.
4 Conclusions
In this paper, hard and fuzzy c-means clustering algorithms were compared within the context of color quantization. Fast and exact variants of both algorithms with several initialization schemes were compared on a diverse set of publicly available test images. The results indicate that fuzzy c-means does not seem to offer any advantage over hard c-means. Furthermore, due to the intensive membership calculations involved, fuzzy c-means is significantly slower than hard c-means, which makes it unsuitable for time-critical applications. In contrast, as was also demonstrated in a recent study [22], an efficient implementation of hard c-means with an appropriate initialization scheme can serve as a fast and effective color quantizer.
Declarations
Acknowledgements
This publication was made possible by grants from the Louisiana Board of Regents (LEQSF200811RDA12), US National Science Foundation (0959583, 1117457), and National Natural Science Foundation of China (61050110449).
References
 Brun L, Trémeau A: Digital Color Imaging Handbook. CRC Press; 2002:589638. Ch. Color QuantizationGoogle Scholar
 Yang CK, Tsai WH: Color image compression using quantization, thresholding, and edge detection techniques all based on the momentpreserving principle. Pattern Recognit Lett 1998,19(2):205215.View ArticleGoogle Scholar
 Deng Y, Manjunath B: Unsupervised segmentation of colortexture regions in images and video. IEEE Trans Pattern Anal Mach Intell 2001,23(8):800810.View ArticleGoogle Scholar
 Sherkat N, Allen T, Wong S: Use of colour for handfilled form analysis and recognition. Pattern Anal Appl 2005,8(1):163180.MathSciNetView ArticleGoogle Scholar
 Sertel O, Kong J, Catalyurek UV, Lozanski G, Saltz JH, Gurcan MN: Histopathological image analysis using modelbased intermediate representations and color texture: follicular lymphoma grading. J Signal Process Syst 2009,55(13):169183.View ArticleGoogle Scholar
 Kuo CT, Cheng SC: Fusion of color edge detection and color quantization for color image watermarking using principal axes Analysis. Pattern Recognit 2007,40(12):36913704.View ArticleMATHGoogle Scholar
 Wang S, Cai K, Lu J, Liu X, Wu E: Realtime coherent stylization for augmented reality. Visual Comput 2010,26(68):445455.View ArticleGoogle Scholar
 Deng Y, Manjunath B, Kenney C, Moore M, Shin H: An efficient color representation for image retrieval. IEEE Trans Image Process 2001,10(1):140147.View ArticleMATHGoogle Scholar
 Xiang Z: Handbook of Approximation Algorithms and Metaheuristics. Chapman & Hall/CRC; 2007:8618617. Ch. Color QuantizationGoogle Scholar
 Mojsilovic A, Soljanin E: Color quantization and processing by fibonacci lattices. IEEE Trans Image Process 2001,10(11):17121725.MathSciNetView ArticleMATHGoogle Scholar
 Heckbert P: Color image quantization for frame buffer display. ACM SIGGRAPH Comput Graph 1982,16(3):297307.View ArticleGoogle Scholar
 Gervautz M, Purgathofer W: New Trends in Computer Graphics. Springer; 1988:219231. Ch. A Simple Method for Color Quantization: Octree QuantizationView ArticleGoogle Scholar
 Wan S, Prusinkiewicz P, Wong S: Variancebased color image quantization for frame buffer display. Color Res Appl 1990,15(1):5258.View ArticleGoogle Scholar
 Orchard M, Bouman C: Color quantization of images. IEEE Trans Signal Process 1991,39(12):26772690.View ArticleGoogle Scholar
 Wu X: Graphics Gems. Volume II. Academic Press; 1991:126133. Ch. Efficient Statistical Computations for Optimal Color QuantizationView ArticleGoogle Scholar
 Balasubramanian R, Allebach J: A new approach to palette selection for color images. J Imaging Technol 1991,17(6):284290.Google Scholar
 Velho L, Gomez J, Sobreiro M: Color image quantization by pairwise clustering. Proceedings of the 10th Brazilian Symposium on Computer Graphics and Image Processing 1997, 203210.View ArticleGoogle Scholar
 Brun L, Mokhtari M: Two high speed color quantization algorithms. Proceedings of the 1st International Conference on Color in Graphics and Image Processing 2000, 116121.Google Scholar
 Huang YL, Chang RF: A fast finitestate algorithm for generating RGB palettes of color quantized images. J Inf Sci Eng 2004,20(4):771782.Google Scholar
 Hu YC, Lee MG: Kmeans based color palette design scheme with the use of stable flags. J Electron Imaging 2007,16(3):033003.View ArticleGoogle Scholar
 Celebi ME: Fast color quantization using weighted sortmeans clustering. J Opt Soc Am A 2009,26(11):24342443.View ArticleGoogle Scholar
 Celebi ME: Improving the performance of K-means for color quantization. Image Vis Comput 2011, 29(4):260-271.
 Uchiyama T, Arbib M: An algorithm for competitive learning in clustering problems. Pattern Recognit 1994, 27(10):1415-1421.
 Verevka O, Buchanan J: Local K-means algorithm for colour image quantization. Proceedings of the Graphics/Vision Interface Conference 1995, 128-135.
 Scheunders P: Comparison of clustering algorithms applied to color image quantization. Pattern Recognit Lett 1997, 18(11-13):1379-1384.
 Celebi ME: An effective color quantization method based on the competitive learning paradigm. Proceedings of the 2009 International Conference on Image Processing, Computer Vision, and Pattern Recognition 2009, 2:876-880.
 Celebi ME, Schaefer G: Neural gas clustering for color reduction. Proceedings of the 2010 International Conference on Image Processing, Computer Vision, and Pattern Recognition 2010, 429-432.
 Kok CW, Chan SC, Leung SH: Color quantization by fuzzy quantizer. Proceedings of the SPIE Nonlinear Image Processing IV Conference 1993, 235-242.
 Cak S, Dizdar E, Ersak A: A fuzzy colour quantizer for renderers. Displays 1998, 19(2):61-65.
 Ozdemir D, Akarun L: Fuzzy algorithm for color quantization of images. Pattern Recognit 2002, 35(8):1785-1791.
 Kim DW, Lee KH, Lee D: A novel initialization scheme for the fuzzy c-means algorithm for color clustering. Pattern Recognit Lett 2004, 25(2):227-237.
 Schaefer G, Zhou H: Fuzzy clustering for colour reduction in images. Telecommun Syst 2009, 40(1-2):17-25.
 Dekker A: Kohonen neural networks for optimal colour quantization. Netw Comput Neural Syst 1994, 5(3):351-367.
 Papamarkos N, Atsalakis A, Strouthopoulos C: Adaptive color reduction. IEEE Trans Syst Man Cybern Part B 2002, 32(1):44-56.
 Chang CH, Xu P, Xiao R, Srikanthan T: New adaptive color quantization method based on self-organizing maps. IEEE Trans Neural Netw 2005, 16(1):237-249.
 Bezdek JC: Pattern Recognition with Fuzzy Objective Function Algorithms. Springer; 1981.
 Ruspini EH: Numerical methods for fuzzy clustering. Inf Sci 1970, 2(3):319-350.
 Ghosh J, Liu A: The Top Ten Algorithms in Data Mining. Chapman and Hall/CRC; 2009:21-35. Ch. K-Means
 Aloise D, Deshpande A, Hansen P, Popat P: NP-hardness of Euclidean sum-of-squares clustering. Mach Learn 2009, 75(2):245-248.
 Mahajan M, Nimbhorkar P, Varadarajan K: The planar k-means problem is NP-hard. Theor Comput Sci 2011, in press.
 Lloyd S: Least squares quantization in PCM. IEEE Trans Inf Theory 1982, 28(2):129-136.
 Celebi ME, Kingravi H, Celiker F: Fast colour space transformations using minimax approximations. IET Image Process 2010, 4(2):70-80.
 Selim SZ, Ismail MA: K-means-type algorithms: A generalized convergence theorem and characterization of local optimality. IEEE Trans Pattern Anal Mach Intell 1984, 6(1):81-87.
 Bottou L, Bengio Y: Advances in Neural Information Processing Systems. Volume 7. MIT Press; 1995:585-592. Ch. Convergence Properties of the K-Means Algorithms
 Csiszar I, Tusnady G: Information geometry and alternating minimization procedures. Stat Decis 1984, (Suppl 1):205-237.
 Kolen JF, Hutcheson T: Reducing the time complexity of the fuzzy c-means algorithm. IEEE Trans Fuzzy Syst 2002, 10(2):263-267.
 Franzen RW: Kodak Lossless True Color Image Suite. 1999. [http://www.r0k.us/graphics/kodak/]
 Dekker A: NeuQuant: Fast High-Quality Image Quantization. 1994. [http://members.ozemail.com.au/~dekker/NEUQUANT.HTML]
 Hu YC, Su BH: Accelerated pixel mapping scheme for colour image quantisation. The Imaging Sci J 2008, 56(2):68-78.
 Linde Y, Buzo A, Gray R: An algorithm for vector quantizer design. IEEE Trans Commun 1980, 28(1):84-95.
 Phillips S: Acceleration of k-means and related clustering algorithms. Proceedings of the 4th International Workshop on Algorithm Engineering and Experiments 2002, 166-177.
 Kanungo T, Mount D, Netanyahu N, Piatko C, Silverman R, Wu A: An efficient k-means clustering algorithm: analysis and implementation. IEEE Trans Pattern Anal Mach Intell 2002, 24(7):881-892.
 Elkan C: Using the triangle inequality to accelerate k-means. Proceedings of the 20th International Conference on Machine Learning 2003, 147-153.
 Lai J, Liaw YC: Improvement of the k-means clustering filtering algorithm. Pattern Recognit 2008, 41(12):3677-3681.
 Hamerly G: Making k-means even faster. Proceedings of the 2010 SIAM International Conference on Data Mining 2010, 130-140.
 Cheng TW, Goldgof DB, Hall LO: Fast fuzzy clustering. Fuzzy Sets Syst 1998, 93(1):49-56.
 Hoppner F: Speeding up fuzzy c-means: using a hierarchical data organisation to control the precision of membership calculation. Fuzzy Sets Syst 2002, 128(3):365-376.
 Eschrich S, Ke J, Hall LO, Goldgof DB: Fast accurate fuzzy clustering through data reduction. IEEE Trans Fuzzy Syst 2003, 11(2):262-270.
 Chen YS, Chen BT, Hsu WH: Efficient fuzzy c-means clustering for image data. J Electron Imaging 2005, 14(1):013017.
 Hathaway RJ, Bezdek JC: Extending fuzzy and probabilistic clustering to very large data sets. Comput Stat Data Anal 2006, 51(1):215-234.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.