Texture Classification using Sparse Frame Based Representations

A new method for supervised texture classification, denoted Frame Texture Classification Method (FTCM), is proposed. The method is based on a deterministic texture model in which a small image block, taken from a texture region, is modelled as a sparse linear combination of frame elements. FTCM has two phases: In the design phase a frame is trained for each texture class based on given texture example images. The design method is an iterative procedure in which the representation error, given a sparseness constraint, is minimized. In the classification phase each pixel in a test image is labelled by analyzing its spatial neighborhood. This block is represented by each of the frames designed for the texture classes under consideration


I. INTRODUCTION
Most surfaces exhibit texture. For human beings it is quite easy to recognize different textures, but it is more difficult to precisely define a texture. Under all circumstances, a texture may be regarded as a region where some elements or primitives are repeated and arranged according to a placement rule.
Tuceryan and Jain [1] list more possible definitions, and give a comprehensive overview of texture classification. Possible applications can be grouped into: 1) texture analysis, i.e. find some appropriate properties for a texture, 2) texture classification, i.e. identify the texture class in a homogeneous region, 3) texture segmentation, i.e. find a boundary map between different texture regions of an image. The boundary map may be used for object recognition and scene interpretation in areas such as medical diagnostics, geophysical interpretation, industrial automation and image indexing. Finally, 4) texture synthesis, i.e. generate artificial textures to be used for example in computer graphics or image compression. Some examples of applications are presented in [2]-[6].
Typically, texture classification algorithms have two main parts: A local feature vector is found, which is subsequently used for texture classification or segmentation. The methods for feature extraction may be loosely grouped as statistical, geometrical, model-based, and signal processing (filtering) methods [1]. For the filtering methods the feature vectors are often built as variance estimates, i.e. local energy measures, for each of the sub-bands of a filter bank. Also, there are numerous classification or pattern recognition methods available: The Bayes classifier is probably the most common one [7], [8]. The min- or max-selector is a simple one that can be used if each entry in the feature vector measures the similarity to, or corresponds to, a texture class. Other methods are the nearest neighbor classifier, vector quantization (codebook vectors represent each class) [9], learning vector quantization (LVQ) (codebook vectors define the decision borders) [10]-[12], neural networks, a watershed-based algorithm [13], and support vector machines (SVM) [14].
One approach to texture classification may be to focus on the feature extraction part, and make it easy to decide the texture class from the feature vector [15]. The opposite approach is to make the feature extraction as simple as possible, for example by feeding the gray-level values for the pixels in image blocks directly to the classifier [16]. The FTCM belongs to the first approach, as the overall classification scheme is quite similar to the scheme used in [11], the main distinction being that we have replaced the filter part by a sparse representation part. On the other hand we also recognize relationships to the opposite approach. The SVM scheme as used in [16] finds a set of support vectors for each texture, and this set identifies a hyperplane which separates the given texture from the rest of the textures, while FTCM finds a set of frame vectors for each texture, and this set is trained to efficiently represent the given texture by a sparse linear combination and thus to identify the texture. Also, FTCM has much in common with texture classification using vector quantization [9]. Actually, FTCM may be regarded as a generalization of the vector quantization approach.
This paper is organized as follows: Sparse frame based representations are briefly explained in Section II. Section III presents the texture model and gives a motivation for the Frame Texture Classification Method. FTCM is a supervised texture classification method and it has two main parts: Firstly, training is done to build the frames based on some example images for each texture class, see Section IV. Secondly, in Section V, we describe the classification or segmentation using these frames to label the pixels of a test image. Finally, in Section VI, the experimental results are presented both for synthetic textures based on the texture model, and for natural textures.

II. SPARSE FRAME BASED REPRESENTATIONS
A set of $N$-dimensional vectors spanning the space $\mathbb{R}^N$, $\{f_k\}_{k=1}^{K}$, where $K \geq N$, is a frame. In this paper frames are represented as follows: A frame is given by a matrix $F$ of size $N \times K$, $K \geq N$, where the columns are the frame vectors, $f_k$. A column vector of $N$ signal samples is formed from a two-dimensional image block (size $N_1 \times N_2$) that is simply rearranged into a column vector (length $N = N_1 N_2$). The column vector is denoted $x_l$ to indicate that it is one out of $L$ available signal blocks. Such signal (image) blocks can be represented by a weighted sum of frame vectors

$$\tilde{x}_l = \sum_{k=1}^{K} w_l(k) f_k. \qquad (1)$$

This is a signal expansion that, depending on the selection of weights, $w_l(k)$, may be an exact or an approximate representation of the signal block. The weights, $w_l(k)$, can be represented by a column vector, $w_l$, of length $K$. It is convenient to collect the $L$ signal vectors and the corresponding weight vectors into matrices,

$$X = [x_1\ x_2\ \cdots\ x_L], \quad \tilde{X} = [\tilde{x}_1\ \tilde{x}_2\ \cdots\ \tilde{x}_L], \quad W = [w_1\ w_2\ \cdots\ w_L]. \qquad (2)$$

The synthesis equation (1) may now be written as

$$\tilde{X} = F W.$$

In a sparse representation many of the weights in the signal expansion (1) are zero. To quantify the degree of sparseness we use the number $s$, which is the number of non-zero weights allowed in the sparse representation of each signal block, $x_l$. The value of $s$ is the same for all signal blocks.
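The synthesis equation and the sparseness measure can be illustrated with a minimal NumPy sketch. The random frame, the sizes, and the helper names `synthesize` and `sparseness` are illustrative assumptions, not part of the method.

```python
import numpy as np

def synthesize(F, W):
    """Approximate the signal blocks as F @ W (frame F is N x K, weights W are K x L)."""
    return F @ W

def sparseness(W):
    """Count the non-zero weights used for each signal block (each column of W)."""
    return np.count_nonzero(W, axis=0)

# Hypothetical sizes: 3 x 3 blocks (N = 9), K = 2N frame vectors, s = 4.
rng = np.random.default_rng(0)
N, K, L, s = 9, 18, 5, 4
F = rng.standard_normal((N, K))          # stand-in for a trained frame
W = np.zeros((K, L))
for l in range(L):
    support = rng.choice(K, size=s, replace=False)  # which frame vectors are used
    W[support, l] = rng.standard_normal(s)          # their weights

X_approx = synthesize(F, W)              # N x L matrix of represented blocks
```

Each column of `X_approx` is a weighted sum of exactly `s` columns of `F`, which is the situation described by the expansion (1) with a sparseness constraint.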
A frame can be designed or trained to give a good sparse representation of a set of $L$ training vectors. A linear combination of basis vectors from an arbitrary basis of $\mathbb{R}^N$ can be used to represent each vector in the training set. Such representations will in general be dense, i.e. they have $N$ non-zero coefficients. A large frame, using all the $L$ training vectors as frame vectors, can be used to give the ultimate sparse representation: each of the training vectors can be represented by only one frame vector. In this work we use rather small frames where $N < K \ll L$, typically $2N \leq K \leq 4N$.
Each of the training vectors can now be well approximated by a sparse linear combination of the frame vectors, allowing only $s$ non-zero weights to be used in the expansion. The $K$ frame vectors can be designed to minimize the sum of representation errors for a given sparseness.
The problem of finding the sparse weight vector, for a given sparseness, such that the 2-norm¹ of the residual is minimized, is an NP-hard problem [17]. Many practical solutions employ greedy vector selection algorithms, such as Matching Pursuit (MP), Orthogonal Matching Pursuit (OMP), and Order Recursive Matching Pursuit (ORMP). When reading this paper, it is not necessary to know the details of these methods. They are thoroughly described elsewhere [18]-[24]. All we need to know is that the vector selection algorithm used here, ORMP, finds the weights in a sparse representation.
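To make the idea of greedy vector selection concrete, the sketch below implements plain Orthogonal Matching Pursuit in NumPy. Note that this is OMP, not the ORMP variant used in this paper; it is shown only because it is the shortest member of the family to write down. The function name `omp` and its interface are assumptions made for illustration.

```python
import numpy as np

def omp(F, x, s):
    """Greedy sparse approximation of x with at most s columns of F (plain OMP).

    Returns the weight vector w (length K) and the residual x - F @ w.
    """
    x = np.asarray(x, dtype=float)
    K = F.shape[1]
    norms = np.linalg.norm(F, axis=0)
    norms[norms == 0] = 1.0                  # avoid division by zero
    residual = x.copy()
    selected = []
    coef = np.zeros(0)
    for _ in range(s):
        # Pick the frame vector most correlated with the current residual.
        corr = np.abs(F.T @ residual) / norms
        if selected:
            corr[selected] = -1.0            # never pick the same vector twice
        selected.append(int(np.argmax(corr)))
        # Re-fit all selected weights by least squares (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(F[:, selected], x, rcond=None)
        residual = x - F[:, selected] @ coef
        if np.linalg.norm(residual) < 1e-12:
            break
    w = np.zeros(K)
    w[selected] = coef
    return w, residual
```

With `s = 1` the selection reduces to picking the single best-matching frame vector, which is the vector quantization case discussed later in the paper (up to the scaling of the chosen vector).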

III. THE TEXTURE MODEL
Textures are often described by random models and statistical properties [25]-[27]. Random models often seem to capture the essential properties of the textures quite well, as can be seen from the textures synthesized by these models [28], and obviously most natural textures have a random element. We will here present a deterministic texture model which will fit many periodic textures quite well. Based on this model the Frame Texture Classification Method (FTCM) emerges as a natural method for texture classification. The main result of this section is that it is reasonable to model a small texture image block as a sparse linear combination of frame elements. The results-oriented reader may wish to jump to Section IV.
The idea behind the proposed texture model is quite simple: A texture is modelled as a tiled floor, where all tiles are identical. The color, or gray-level, at a given position on the floor is given by an underlying continuous periodic two-dimensional function which we denote $c(x, y)$; an image is a regular sampling of this function. In this section we will show that all image blocks can be represented as a linear combination of only four elements, where the four elements are taken from a set, i.e. a frame, with a finite number of elements. The FTCM directly uses this model. In the training phase it finds a frame for each texture, and in the classification phase representations, or approximations, of blocks from a test image are found as linear combinations of four elements. Because of this close connection we may say that the model explains the good performance of FTCM, or alternatively, the good performance of FTCM validates the model.

¹ In this paper we use the 2-norm, $\|x\|^2 = \sum_{n=1}^{N} x(n)^2$, for vectors and the trace or Frobenius norm for matrices, $\|A\|^2 = \sum_i \sum_j A(i, j)^2$.
One period of the periodic function $c(x, y)$ defines a square tile where each side has unit length, i.e. $c(x, y) = c(x - \lfloor x \rfloor, y - \lfloor y \rfloor)$. In this model the function is defined by a finite number of control points placed on the tile. This is illustrated in Fig. 1 where two complete tiles and parts of their neighboring tiles are shown. The 16 control points on each tile are regularly distributed on a 4 × 4 grid; the control points can be labelled $c_{ij}$, where only the indexes are shown in the figure. Generally, in this model, the $M = M_1 M_2$ control points are placed on a rectangular $M_1 \times M_2$ grid.
The color of any point on a tile (on the floor) is given as a bilinear interpolation of the closest control points, i.e. $c(x, y) = a_1 c_{i_1 j_1} + a_2 c_{i_2 j_2} + a_3 c_{i_3 j_3} + a_4 c_{i_4 j_4}$. The bilinear interpolation is actually a convex combination, with $a_1 + a_2 + a_3 + a_4 = 1$ and $0 \leq a_k \leq 1$. For example, the color value for the center of a tile in Fig. 1 is $c(x, y) = \ldots$. We also note that some parts of $c(x, y)$ within a tile need control points from neighboring tiles in forming the interpolation. We let the coordinate system be aligned to match a tile, such that the center of the first tile is given by $(x, y) = (\frac{1}{2}, \frac{1}{2})$, and the corners are $(0, 0)$, $(1, 0)$, $(0, 1)$, and $(1, 1)$.
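The periodic, bilinearly interpolated function $c(x, y)$ can be sketched in a few lines of NumPy. The exact placement of the control points within the tile is not recoverable from the text alone, so the sketch assumes that control point $(i, j)$ sits at the center of cell $(i, j)$ of a regular $M_1 \times M_2$ subdivision of the unit tile; the function name `tile_color` and the index convention (first index along $x$) are likewise assumptions.

```python
import numpy as np

def tile_color(C, x, y):
    """Evaluate c(x, y) by bilinear interpolation of the control-point grid C.

    C has shape (M1, M2). Control point (i, j) is ASSUMED to sit at
    ((i + 0.5) / M1, (j + 0.5) / M2) on the unit tile (zero-based indexes).
    """
    M1, M2 = C.shape
    # Periodicity: c(x, y) = c(x - floor(x), y - floor(y)).
    u = (x - np.floor(x)) * M1 - 0.5      # position measured in control-point units
    v = (y - np.floor(y)) * M2 - 0.5
    i0, j0 = int(np.floor(u)), int(np.floor(v))
    a, b = u - i0, v - j0                 # fractional parts, 0 <= a, b < 1
    i1, j1 = (i0 + 1) % M1, (j0 + 1) % M2 # wrap into the neighboring tile
    i0, j0 = i0 % M1, j0 % M2
    # Convex combination of the four closest control points.
    return ((1 - a) * (1 - b) * C[i0, j0] + a * (1 - b) * C[i1, j0]
            + (1 - a) * b * C[i0, j1] + a * b * C[i1, j1])
```

The four weights are non-negative and sum to one, so the result is always a convex combination, and the modulo operations supply the control points of neighboring tiles where the interpolation needs them.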
Samples of $c(x, y)$ on a rectangular sampling grid, not necessarily aligned with the coordinate system implied by the first tile, constitute the digital texture image. By choosing
• the number and positions of control points in a tile,
• the gray-level value (color) of each of the control points,
• the orientation of the sampling grid relative to the coordinate system aligned with the tiles, denoted by angle $\alpha$, and finally,
• the distance between neighboring sampling points, denoted by $\delta$, in the sampling grid,
we obtain a digital texture image. A small block of pixels from this image, for example a 3 × 3 block ($N = 9$ pixels), forms a vector of length $N$, $x = [x(1), x(2), \ldots, x(9)]^T$. How the numbering is done is not important, but we may assume that $x(1)$ is the upper left pixel and the rest are numbered columnwise. We note that the location of pixel $x(1)$ may be anywhere on the floor, but since translations by unit lengths up and down will give exactly the same value for $x(1)$, and also the vector $x$ will be unchanged by such translations, the location of $x(1)$ can be restricted to be on the first tile.

April 29, 2005 DRAFT
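As an illustration of how a digital texture image follows from these four choices, the sketch below samples the tiled floor on a rotated grid. It is a sketch under stated assumptions: control points are assumed to sit at the cell centers of a regular subdivision of the tile, the rotation direction and the `origin` parameter are our own conventions, and the name `sample_texture` is hypothetical.

```python
import numpy as np

def sample_texture(C, delta, alpha_deg, shape, origin=(0.0, 0.0)):
    """Sample the periodic bilinear function defined by control points C.

    delta is the distance between neighboring sampling points, alpha_deg the
    rotation of the sampling grid in degrees, shape the (rows, cols) of the image.
    """
    M1, M2 = C.shape
    alpha = np.deg2rad(alpha_deg)
    r, c = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]), indexing="ij")
    # Rotated and scaled sampling grid on the floor.
    x = origin[0] + delta * (c * np.cos(alpha) - r * np.sin(alpha))
    y = origin[1] + delta * (c * np.sin(alpha) + r * np.cos(alpha))
    # Periodic bilinear interpolation (control points ASSUMED at cell centers).
    u = (x - np.floor(x)) * M1 - 0.5
    v = (y - np.floor(y)) * M2 - 0.5
    i0 = np.floor(u).astype(int)
    j0 = np.floor(v).astype(int)
    a, b = u - i0, v - j0
    i1, j1 = (i0 + 1) % M1, (j0 + 1) % M2
    i0, j0 = i0 % M1, j0 % M2
    return ((1 - a) * (1 - b) * C[i0, j0] + a * (1 - b) * C[i1, j0]
            + (1 - a) * b * C[i0, j1] + a * b * C[i1, j1])

# Example call with arbitrary control-point values and the sampling
# parameters quoted for Fig. 2 (delta = 0.187, alpha = 15 degrees).
C_demo = np.random.default_rng(0).uniform(0.0, 1.0, size=(4, 4))
image = sample_texture(C_demo, delta=0.187, alpha_deg=15.0, shape=(64, 64))
```

Because every pixel is a convex combination of control-point values, the image never leaves the range spanned by the control points, and sampling with unit distance on an aligned grid returns a constant image, as the periodicity requires.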
Having the texture image specified as above, i.e. by control points and by a sampling grid, we realize that all possible vectors $x$ can be formed by translating the position of $x(1)$ within a tile. An infinite number of different vectors $x$ can be formed. For gray-level images this set of vectors is a subset of the space $\mathbb{R}^N$. We may say that this set defines the texture. The challenge now is to make an efficient description of this set, in a way that makes it easy to decide whether a test vector belongs to this set or not. In the following we argue that all vectors from this infinite set, corresponding to a specific texture, can be represented as a linear (convex) combination of four frame vectors taken from a finite subset of vectors containing at most $MN^2$ vectors, where again $M$ denotes the number of control points in each tile. This finite set is a frame and its elements are frame vectors. Note that the frame vectors span the space $\mathbb{R}^N$, but adding a sparseness constraint during representation makes them "span" only a subspace, which contains all the $x$ vectors. This subspace is the union of a finite number of $s$-dimensional spaces, where $s$ is the number of frame vectors allowed in the sparse representation, here $s = 4$.
In Fig. 2 the marked upper left pixel, $x(1)$, is above and to the right of control point $c_{13}$. Its value is a linear combination of the values in the four neighboring control points $c_{13}$, $c_{23}$, $c_{14}$ and $c_{24}$. If $x(1)$ is translated anywhere within the small box with these control points as corners, it is still a linear combination of the same control points. At a corner $x(1)$ will take the value of the control point. This observation can also be stated as follows: Within a small rectangular box of the tile, the value $x(1)$ will be a linear combination of its values at the corner points. This is true as long as no horizontal or vertical line through any control point passes through the small box. The same statement is obviously also valid for another pixel, for example $x(2)$ below $x(1)$. The left part of Fig. 3 illustrates the situation when we consider two points simultaneously. The points are entries in the vector $x = [x(1), x(2)]^T$; in this example $N = 2$. Translating this vector means that we translate both its entries the same distance vertically and horizontally. The position of $x(2)$ is given by the position of $x(1)$ and their relative distance given by the sampling grid. This implies that the positions of all entries in $x$, and thus the value of $x$, are given by the position of $x(1)$ within the tile. In the figure a box is plotted around $x(1)$, such that when $x(1)$ moves within this box $x(2)$ moves within the box plotted around $x(2)$. The neighboring control points will not change for either of the pixels. This can also be stated as follows: Placing $x(1)$ within a small rectangular box of the tile, the value of vector $x$ will be a linear combination of its values at the corner points. This is true as long as the box around $x(1)$ is so small that none of the entries of the vector involves new control points. The dotted lines in the right part of Fig. 3 divide the tile into such boxes.
Placing $x(1)$ on an intersection between the dotted lines, the corresponding vector $x$ can be stored as a frame vector $f_k$. Collecting all these frame vectors into a frame, we observe that any $x$ generated by this texture model can be represented as a linear combination of four frame vectors.
This reasoning can easily be extended to a larger vector $x$ of length $N$. We will now find how many small boxes the tile should be divided into for this case. First we move $x(1)$, and the sampling grid to which $x(1)$ is attached, vertically within the tile. Everywhere the position of an entry of vector $x$ crosses one of the horizontal lines that can be drawn through a control point, we draw a horizontal line through $x(1)$. This will give at most $M_2 N$ horizontal lines. Then we move $x(1)$ horizontally within the tile. Everywhere the position of an entry of vector $x$ crosses one of the vertical lines that can be drawn through a control point, we draw a vertical line through $x(1)$. This will give at most $M_1 N$ vertical lines. Placing $x(1)$ at one of the $M_1 N \cdot M_2 N = M N^2$ intersections between a horizontal and a vertical line, we will have a corresponding vector $x$. These vectors constitute the elements of a finite frame. All vectors $x$, with $x(1)$ anywhere on the tile, and which are the elements of the set that defines this specific texture image, can be represented as a linear (convex) combination of four frame vectors taken from the frame containing at most $M N^2$ vectors.
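As a worked instance of this bound, take the 4 × 4 grid of control points of Fig. 1 ($M_1 = M_2 = 4$, $M = 16$) and a 3 × 3 pixel block ($N = 9$):

```latex
\begin{aligned}
\text{horizontal lines: } & M_2 N = 4 \cdot 9 = 36, \\
\text{vertical lines: }   & M_1 N = 4 \cdot 9 = 36, \\
\text{frame vectors: }    & M_1 N \cdot M_2 N = M N^2 = 16 \cdot 81 = 1296 .
\end{aligned}
```

For the same tile and a 5 × 5 block ($N = 25$) the bound grows to $16 \cdot 625 = 10000$ frame vectors, which indicates why a much smaller frame is preferable in practice.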
To take advantage of this model in a practical way some shortcuts are taken. First, we note that finding the correct frame for an example texture is not possible unless we have available the model parameters, and even then the number of frame vectors will often be quite large. By using fewer frame vectors, $K \ll M N^2$, we accept that the test vector will only be approximated by the sparse representation. Secondly, only a limited number of combinations of the frame vectors should be used in the sparse representation. In this model the frame vectors are the $x$ vectors taken when $x(1)$ is placed on the corners of the many small boxes that a tile can be divided into. The four frame vectors used in a sparse representation should belong together; they should be the four corners of one of these small boxes. By allowing any combination of the frame vectors to be used, we do not have to consider a relative position of the frame vectors. Thirdly, the representation (approximation) according to the model should strictly be a bilinear interpolation between four points. It would be just as reasonable to define the periodic function $c(x, y)$ by a linear interpolation between three control points (in a triangular grid).
Taking these three shortcuts, we can use the frame design method, first presented in [29] and used for texture images in [30], to design a frame that represents a texture class. The method is briefly described in the next section.

[Fig. 4 block diagram. Labels: frame parameters (block size, K, and the number of frame vectors to use, s); the training example texture images; preprocessing; the training vectors, X; training.]
One frame is trained for each texture class. The very first step in the FTCM training phase is to decide the frame parameters. These parameters can be chosen quite freely:
• The shape, usually rectangular, and size of the block around each pixel. The pixels within this block are organized as a column vector of length $N$.
• The number of vectors in the frame, $K$, may be chosen quite freely. As a rule of thumb, found from the comprehensive experiments done, we may use $N \leq K \leq 5N$.
Having set the frame parameters, the next step is to build the training vectors from the texture example images. As suggested before, this can be as simple as rearranging the pixels from small image blocks, which may partly overlap each other, into column vectors, or it can be more involved.
The sets of training vectors are arranged into $N \times L$ matrices, as in Equation (2), and denoted $X^{(i)}$ for texture class $i = 1, 2, \ldots, C$. Later, during classification, the test vectors should of course be formed by the same procedure as for the training vectors.
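The simplest way of building the training vectors, rearranging partly overlapping image blocks into columns, could look as follows. This is a minimal sketch; the function name `image_to_vectors` and the `step` parameter controlling the overlap are assumptions made for illustration.

```python
import numpy as np

def image_to_vectors(image, n1, n2, step=1):
    """Collect all n1 x n2 blocks of an image as columns of an N x L matrix.

    Blocks are taken every `step` pixels (step=1 gives maximally overlapping
    blocks) and each block is rearranged columnwise into a vector of length
    N = n1 * n2.
    """
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    columns = []
    for r in range(0, rows - n1 + 1, step):
        for c in range(0, cols - n2 + 1, step):
            block = image[r:r + n1, c:c + n2]
            columns.append(block.reshape(-1, order="F"))  # columnwise numbering
    return np.stack(columns, axis=1)
```

Calling the same function on the example images (training) and on the test image (classification) ensures that the test vectors are formed by the same procedure as the training vectors.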
In the training the parameter set, $N$, $K$ and $s$, is fixed. For each frame to design, $F^{(i)}$, we use the corresponding set of training vectors, $X^{(i)}$, generated from the example images. For notational convenience we skip the superscript indexes below. As explained in Section II the synthesis equation can be written as $\tilde{X} = FW$, where $\tilde{X}$ is the sparse approximation of $X$. We want to find the frame, $F$, of size $N \times K$, and the sparse coefficient vectors, $w_l$, that minimize the sum of the squared errors. The objective function to be minimized is

$$J = J(F, W) = \|X - FW\|^2.$$

Finding the optimal solution to this problem is difficult if not impossible. We split the problem into two parts to make it more tractable, similar to what is done in the GLA design algorithm for VQ codebooks [31]. The iterative solution strategy presented below results in good, but in general suboptimal, solutions to the problem.
The algorithm starts with a user-supplied initial frame $F_0$, usually $K$ arbitrary vectors from the set of training vectors, and then improves it by iteratively repeating two main steps:
1) $W_t$ is found by vector selection using frame $F_t$. The objective function is $J(W) = \|X - F_t W\|^2$, and a sparseness constraint is imposed on $W$.
2) $F_{t+1}$ is found from $X$ and $W_t$, where the objective function is $J(F) = \|X - F W_t\|^2$. This gives:

$$F_{t+1} = X W_t^T (W_t W_t^T)^{-1}.$$

Then we increment $t$ and go to step 1.
Here $t$ is the iteration number. The first step is suboptimal due to the use of practical vector selection algorithms, while the second step finds the $F$ that minimizes the objective function.
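The two-step iteration can be sketched in NumPy as below. Several choices here are assumptions made for illustration and are not taken from the paper: plain OMP stands in for ORMP in step 1, the pseudo-inverse is used in step 2 so that the update is defined even when $W_t W_t^T$ is singular, unused frame vectors are kept unchanged, and the frame vectors are normalized after each update. All function names are hypothetical.

```python
import numpy as np

def sparse_code(F, X, s):
    """Step 1: find sparse weights W (K x L) with plain OMP, s vectors per column."""
    W = np.zeros((F.shape[1], X.shape[1]))
    for l in range(X.shape[1]):
        x = X[:, l]
        residual, selected = x.copy(), []
        for _ in range(s):
            corr = np.abs(F.T @ residual)
            if selected:
                corr[selected] = -1.0        # do not reselect a frame vector
            selected.append(int(np.argmax(corr)))
            coef, *_ = np.linalg.lstsq(F[:, selected], x, rcond=None)
            residual = x - F[:, selected] @ coef
        W[selected, l] = coef
    return W

def mod_update(X, W, F_old):
    """Step 2: least-squares frame for fixed W; unused frame vectors are kept."""
    F_new = X @ W.T @ np.linalg.pinv(W @ W.T)
    used = np.any(W != 0, axis=1)
    F = F_old.copy()
    F[:, used] = F_new[:, used]
    return F

def normalize_columns(F):
    """Scale every frame vector to unit 2-norm (an assumption of this sketch)."""
    norms = np.linalg.norm(F, axis=0)
    norms[norms == 0] = 1.0
    return F / norms

def design_frame(X, K, s, iterations=20, seed=0):
    """Alternate steps 1 and 2, starting from K arbitrary training vectors."""
    rng = np.random.default_rng(seed)
    start = rng.choice(X.shape[1], size=K, replace=False)
    F = normalize_columns(X[:, start].copy())
    errors = []
    for _ in range(iterations):
        W = sparse_code(F, X, s)                         # step 1
        errors.append(np.linalg.norm(X - F @ W) ** 2)    # objective J(F, W)
        F = normalize_columns(mod_update(X, W, F))       # step 2
    return F, errors
```

For a fixed $W_t$ the update in `mod_update` never increases the objective, which mirrors the statement that the second step is optimal while the first is not.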
In a texture classification context the frame concept has been used together with the discrete wavelet transform, see [7], [14], [32], [33]. We must point out that the frame in FTCM has a different role.
In the discrete wavelet frame transform context the frame is used as the analysis filter bank; the frame arises when the wavelet sub-bands are not down-sampled. If a perfect reconstruction synthesis filter bank exists (many can exist [34]), the outputs of the analysis filter bank can be regarded as an alternative representation of the image. In FTCM the analysis filter bank is replaced by a matching pursuit algorithm, and the frame is used to synthesize the signal as in (1). Also, the FTCM uses several frames, each giving one element of the feature vector, as opposed to the filter bank approach where each sub-band gives one element of the feature vector.

Fig. 5. The setup for the classification approach in FTCM; this setup is similar to a common setup in texture classification used in [11].

V. CLASSIFICATION
Texture classification of a test image, containing regions of different textures, is the task of classifying each pixel of the test image to belong to a certain texture. This is done by generating test vectors from the test image. The classifying process for the FTCM is illustrated in Fig. 5.
A test vector is represented in a sparse way using each of the different frames that were trained for the textures under consideration, the set of $C$ frames $\{F^{(i)}\}$. Each sparse representation of each test vector $x_l$ gives a representation error, $r_l^{(i)}$ for frame $i$. Each test vector $x_l$ corresponds to a pixel of the test image. Classification consists of selecting the index $i$ for which the norm squared of the representation error is minimized. Direct classification based on the norm squared of the representation error for each test vector (pixel) gives quite large classification errors, but the results can be substantially improved by smoothing the error images. Smoothing is reasonable since it is likely that neighboring pixels belong to the same texture. For smoothing, Randen and Husøy [11] concluded that the separable Gaussian low-pass filter is the better choice, and this is also the filter used here. The unit pulse response for the 1-D kernel of this filter is

$$h(n) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2} n^2 / \sigma^2}.$$

The parameter $\sigma$ gives the bandwidth of the smoothing filter. The effect of smoothing is mainly that more smoothing gives lower resolution and better classification within the texture regions. The cost is often more classification errors along the borders between different texture regions.
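The smoothing and selection steps can be sketched as follows, given one error image per texture class (the squared norm of the representation error at each pixel). The kernel truncation at four standard deviations, the renormalization of the truncated kernel to unit sum, the edge padding, and the function names are assumptions of this sketch, not details taken from the paper.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Sampled 1-D Gaussian, truncated at 4 sigma and renormalized to unit sum."""
    radius = int(np.ceil(4 * sigma))
    n = np.arange(-radius, radius + 1)
    h = np.exp(-0.5 * (n / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
    return h / h.sum()

def smooth(error_image, sigma):
    """Separable Gaussian low-pass filtering: filter the rows, then the columns."""
    h = gaussian_kernel(sigma)
    radius = len(h) // 2
    padded = np.pad(error_image, radius, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, h, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, h, mode="valid"), 0, rows)

def classify(error_images, sigma):
    """Label each pixel with the class whose smoothed error image is smallest."""
    smoothed = np.stack([smooth(e, sigma) for e in error_images])
    return np.argmin(smoothed, axis=0)
```

A larger `sigma` averages over a wider neighborhood, which removes isolated misclassifications inside a region but blurs the position of the borders between regions.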
To improve texture segmentation a nonlinearity may be included before the smoothing filter is applied [35]. The nonlinearity is applied on $\|r_l^{(i)}\|^2$, i.e. a scalar property is calculated by a nonlinear function $f(\|r\|^2)$. The function may be the square root to get the magnitude of the error, or the inverse sine of the magnitude, which gives the angle between the signal vector and its sparse approximation, or a logarithmic operation. Experiments we have done [30] indicate that usually the logarithmic nonlinearity is the better choice.
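The three nonlinearities can be written as one small helper. This is a sketch under stated assumptions: for the angle we assume the error magnitude is divided by the norm of the signal vector before the inverse sine is taken, and a small constant `eps` is added inside the logarithm to avoid taking the logarithm of zero. The function name and its parameters are hypothetical.

```python
import numpy as np

def nonlinearity(err_sq, x_norm_sq=None, kind="log", eps=1e-12):
    """Map the squared error norm to a scalar feature before smoothing.

    err_sq    : squared norm of the representation error, ||r||^2
    x_norm_sq : squared norm of the signal vector, needed only for kind="angle"
    """
    err_sq = np.asarray(err_sq, dtype=float)
    if kind == "sqrt":                      # magnitude of the error
        return np.sqrt(err_sq)
    if kind == "log":                       # logarithmic operation
        return np.log(err_sq + eps)
    if kind == "angle":                     # angle between x and its approximation
        ratio = np.sqrt(err_sq / np.maximum(x_norm_sq, eps))
        return np.arcsin(np.clip(ratio, 0.0, 1.0))
    raise ValueError(f"unknown nonlinearity: {kind}")
```

The helper accepts whole error images as arrays, so it can be applied to each class's error image before the smoothing filter.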

A. Synthesized textures
The experiments presented here demonstrate the close connection between the texture model and the FTCM. Let us define two tiles that both give braided textures, tile A defined by a 4 × 4 ($M = 16$) grid of control points and tile B defined by a 6 × 6 ($M = 36$) grid of control points. The intensity values for the control points are: 0.5 0 0.5 0 1 0 1 1 0.5 0 0.5 0 0.5 0.5 0 0.5 0.5 0 0.5 0.5 0 0.5 0.5 0 0.5 0.5 0 0.5 0.5 0 0.5 0.5 0 0.5 0.5 0. From Fig. 6 we see that the black and white bands are wider on tile A than on tile B; tile B will have more of the gray background. Based on these tiles we define six textures using different values for the sample distance $\delta$ and the rotation angle $\alpha$. We generate example images of each texture, which are used for training of the frames. We also make a test image, Fig. 6, consisting of segments from all the six texture classes. Visually the textures seem quite similar and are quite difficult to distinguish from each other just by looking at them. Many frames were designed, using different sets of frame parameters, for each of the six textures.
We always used image blocks of size 5 × 5 to form the training vectors of length $N = 25$, while the number of frame vectors $K$ and sparseness $s$ varied. We used these frames to classify the test image; the results are shown in Fig. 7. Here we have used a quite narrow low-pass filter, $\sigma = 2$, and the classification results are almost perfect. For most cases the number of wrongly classified pixels is less than 1%, often less than 0.5%, which means that only a few pixels along the texture borders are wrongly classified. Even the vector quantization case, $s = 1$, does quite well when the number of frame (codebook) vectors, $K$, is large. We observe that the smaller frames, $K \leq 50$, do quite well for sparseness choices $s = 3$ and $s = 4$, which is the sparseness suggested by the model of Section III. Also without filtering (results not shown here) more than 90% of the pixels were correctly classified for $s > 1$ and $K \geq 150$, while for $s = 1$ and $K = 200$, 70% of the pixels were correctly classified. Without filtering we clearly saw that as the number of frame vectors increased the results improved, as we would expect from the model.
The conclusion so far is not surprising: When the textures are generated in accordance with the model, texture classification using FTCM, motivated by the model, achieves excellent results.

B. Natural textures
We also test the FTCM on some real data, and we choose to use the nine test images of Randen and Husøy [11]. These consist of 77 different natural textures, taken from three different and commonly used texture sources: the Brodatz album, the MIT Vision Texture database, and the MeasTex Image Texture Database. The test images are denoted (a) to (i) and are shown in Fig. 11 in [11], where a more detailed description of the test images can also be found². Due to space considerations only test image (c) is shown in this paper, see the left part of Fig. 10. The same test images were also used in other papers [8], [13], [16], [36], [37].

[Figure caption fragment: ... (N, K, s). The number of vectors to use in the sparse approximation, s, is along the x-axis. Here, the width of the low-pass filter is given by σ = 8.]
The procedures of Sections IV and V were used: The first step is to design the $C = 77$ class-specific frames from the example images of all the texture classes under consideration. Many different frame parameter sets were used in our experiments. This was done to find which parameter sets perform best on natural textures. The texture classification capabilities of the FTCM were tested using the procedure from Section V.
The nonlinearity was logarithmic and Gaussian smoothing filters were used; the bandwidths used were in the range $\sigma = 2$ to $\sigma = 16$. To find the best parameter sets we performed experiments whose results are summarized in Fig. 8, where the mean classification error rate of the nine test images is shown for all the 36 different frame parameter sets, and in Fig. 9, where 6 parameter sets are used with varying degrees of smoothing. We see that having $s = 3$ or $s = 4$ gives the smallest classification error rate for all the frame sizes investigated. This is in line with the results on synthetic textures and the model presented in Section III. For the tests with the FTCM and $s = 3$ or $s = 4$ the number of wrongly classified pixels is almost halved compared to the cases when $s = 1$ and when compared to the results of [11]. We also note that the frame size in FTCM is important, especially for the cases where $s > 1$. The model suggests that the number of frame vectors to use should be quite large, and these results show that the classification result gets better as the number of frame vectors, $K$, increases. Practical reasons stop us from using larger values of $K$.
Another interesting observation is: The number of vectors used in the representation, $s$, should be increased when the parameter $N$ is increased. For $N = 25$ the frames where $s = 3$ perform best, while for $N = 49$ the frames where $s = 4$ perform best. This observation can be explained by the fact that when $N$ is larger the number of vectors to select must be larger to have the same sparseness ratio, $s/N$, or to have a reasonably good representation of the test vectors.
The effect of the smoothing filter is illustrated in Fig. 10: Little smoothing, $\sigma = 4$, gives many error regions scattered in the test image, while more smoothing, $\sigma = 12$, gives better classification within the texture regions, but the cost is often more classification errors along the borders between texture regions. Fig. 10 also shows that the fine texture in the lower region is easier to identify than the coarser textures in the rest of the test image.
As a last step we compare the results of FTCM with other methods. Table I shows the classification errors, given as percentage of wrongly classified pixels, for different methods (rows) and the nine test images (a) to (i). Some of the best classification results from [11] are shown in the upper part of Table I. The same test images were also used in other papers [8], [13], [16], [36], [37], and results from these are shown in the next part of the table. It should be noted, however, that these latter results are not necessarily directly comparable since we do not know the exact experiment setup used. The lower part of Table I shows the results for some of the parameter sets used in the FTCM.
The methods from [11] listed in Table I are now briefly explained: "f8a" and "f16b" use subband energies of textures filtered through a tree-structured bank of quadrature mirror filters (QMF). The filters are finite impulse response (FIR) filters of lengths 8 and 16, respectively. The method denoted "Daub-4" uses the Daubechies filters of length 4, and the same structure as used for the QMF filters. The referred results use the non-dyadic subband decomposition illustrated in Fig. 6d in [11]. The methods denoted "$J_{MS}$" and "$J_U$" are FIR filters optimized for maximal energy separation [15]. The last two methods use co-occurrence and autoregressive features. For more details of the classification methods referred to, and for results of more methods, we recommend [11]. For the methods in the middle part of Table I please consult the given references.
The results for the vector quantization case, FTCM with $s = 1$, give an average error rate of approximately 30 percent, Fig. 8, which is comparable to the best results of [11]. The mean for the method "f16b" was 25.9 percent wrongly classified pixels, while the parameter set 49 × 50 for $N \times K$ and $\sigma = 12$ gave 25.4 percent wrongly classified pixels, see Table I. Even though the means are comparable, the results for the individual test images vary significantly. For the test image (h) the result is 39.8 for the "f16b" filtering method, and 29.6 for FTCM with frame size 49 × 50 and $\sigma = 12$, while for the test image (i) the results are 28.5 and 37.1 respectively. Generally, we note that the different filtering methods and the autoregressive method perform better on test image (i) than on test image (h), and that the co-occurrence method and the FTCM (two exceptions in Table I) perform better on test image (h) than on test image (i).
The conclusion of the experiments can be summarized as follows: For the nine test images used, the FTCM performs very well. There is little improvement achieved when increasing the block size from 5 × 5 to 7 × 7 pixels. It is better to increase the number of frame vectors; $K = 200$ is marginally better than $K = 100$, as can be seen from Table I. The number of frame vectors to use in the sparse representation should be $s = 3$ or $s = 4$ according to the model, and this is confirmed by the experiments both on synthetic and natural textures. The optimal width of the low-pass filter, given by $\sigma$, depends more on the texture characteristics and on the boundaries between texture patches in the test image than on the frame parameters. For example, the fine textures in test image (a) are best classified using a small value of $\sigma$. The average result for these test images is best for $10 \leq \sigma \leq 12$. The experiments here indicate that a frame size of 25 × 200, $s = 3$, and $\sigma = 10$ is a good choice.

VII. CONCLUSION
In this paper we have presented the Frame Texture Classification Method for supervised texture segmentation of images. Methods both for training, based on texture example images, and for classification of test images were described, together with a theoretical model motivating the method. The method is conceptually simple and straightforward, but it is computationally demanding, especially the training part. The classification results are excellent: for many test images the number of wrongly classified pixels is more than halved compared to the many methods presented in the large comparative study of Randen and Husøy [11]. The results presented also compare favorably with those presented in several other recent contributions.

Fig. 1. Two complete tiles of a tiled floor. The control points are marked and labelled.

Fig. 2 illustrates sampling; in this example we have δ = 0.187 and α = 15 degrees. The texture model described above is capable of generating a wide variety of textured images; some examples are shown in Fig. 6. We will now look closer at a small block of pixels from the texture image; in Fig. 2 a 3 × 3 block (N = 9 pixels) is marked. This block forms a size N vector, x = [x(1), x(

Fig. 2. A part of a tiled floor with sample points. The control points are marked as dots, and the sample points (centers of the image pixels) as small circles.

Fig. 4. The setup for training of frames in FTCM is very similar to the general frame design setup [30].

• The sparseness to use, represented by the number of frame vectors used in the sparse representation, s. The main objective is to choose a value of s that provides good discrimination between the different textures. The experimental part of this paper confirms that the values suggested by the model, s = 3 and s = 4, are suitable.
April 29, 2005 DRAFT

Fig. 6. The synthesized test image on the top and its reference below. The reference shows how the different regions of the synthesized test image are built.

Fig. 7. Error rate, i.e. number of mislabelled pixels divided by total number of pixels, in classification of the test image in Fig. 6. Here we have low-pass filtering with a quite narrow filter, σ = 2.

Fig. 8. Average error rate, i.e. number of mislabelled pixels divided by total number of pixels, in classification of the natural texture test images. Each point represents a unique frame parameter set, (N, K, s). The number of vectors to use in

5 × 5 and 7 × 7 pixel blocks were used, giving training and test vectors of lengths N = 25 and N = 49. The numbers of frame vectors in each frame were K = {25, 50, 100, 200} for N = 25 and K = {50, 100} for N = 49. This gives six different sizes for the frames. The number of frame vectors in the sparse representation ranged from s = 1 to s = 6. For each parameter set a frame was designed for each of the texture classes of interest; the number of training vectors was L = 10000. The design of all the frames needed several days of computer time, one to five minutes for each frame, but this task must be done only once.
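The parameter sets described above can be enumerated directly. The short sketch below only restates the values given in the text; the variable names are illustrative.

```python
# Frame sizes (N, K): four values of K for N = 25 and two for N = 49.
frame_sizes = [(25, K) for K in (25, 50, 100, 200)] + [(49, K) for K in (50, 100)]

# Number of frame vectors in the sparse representation, s = 1, ..., 6.
sparseness_values = range(1, 7)

# Every combination (N, K, s); one frame per texture class is designed for each.
parameter_sets = [(N, K, s) for (N, K) in frame_sizes for s in sparseness_values]

num_frame_sizes = len(frame_sizes)        # six frame sizes, as stated in the text
num_parameter_sets = len(parameter_sets)  # 6 sizes x 6 values of s = 36 sets
```

With six frame sizes and six values of s there are 36 parameter sets, each requiring one frame per texture class, which is consistent with the long total design time reported.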

Fig. 9. Average error rate in classification of the natural texture test images. Each line represents a unique frame parameter set, (N, K, s). Note the small range for the y-axis. The bandwidth of the smoothing filter, σ, is along the x-axis.

TABLE I. CLASSIFICATION ERRORS, GIVEN AS PERCENTAGE WRONGLY CLASSIFIED PIXELS, FOR DIFFERENT METHODS AND NATURAL TEST IMAGES. THE RESULTS IN THE MIDDLE PART ARE NOT NECESSARILY DIRECTLY COMPARABLE TO THE REST.