Invariant texture analysis has attracted growing attention because, in many practical applications, the training and testing samples do not have identical or similar orientations and are not acquired from the same viewpoint, which degrades texture analysis. The local binary pattern (LBP) has been widely applied to texture classification owing to its simplicity, efficiency, and rotation invariance. In this paper, an integrated local binary pattern (ILBP) scheme is proposed that combines the original rotation invariant LBP, an improved contrast rotation invariant LBP, and a direction rotation invariant LBP, and thereby overcomes the original LBP's neglect of contrast and direction information. In addition, to address another major drawback of LBP, namely its locality, which leaves the shape and spatial structure of the holistic texture image unexpressed, Zernike moment features are fused with the improved LBP texture features, because Zernike moments are orthogonal and rotation invariant and can be calculated easily and rapidly to an arbitrarily high order. Experimental results show that the proposed method can be remarkably superior to other state-of-the-art methods when rotation invariant texture features are extracted and classified.

1 Introduction

Texture analysis is an attractive topic in image processing and pattern recognition. It plays a vital role in many important applications such as object tracking and recognition, remote sensing, and similarity-based image retrieval [1–4]. Guo et al. [5] summarized four primary problems in texture analysis: image classification based on texture content, image segmentation into homogeneous texture regions, texture synthesis for graphics applications, and shape information acquisition from texture cues.

Analyzing real-world texture is very difficult, mainly because of uncertain factors such as inhomogeneity, illumination changes, and variability of texture appearance. Early research focused on using statistical features to classify texture images. Haralick et al. [6] first proposed using co-occurrence statistics to describe texture features. In the 1990s, the Gabor filtering method of Manjunath and Ma [7] was regarded as one of the best techniques in texture analysis. Although these methods perform well, they generally assume, explicitly or implicitly, that the training and testing samples have identical or similar orientations or are acquired from the same viewpoint [8]. In many practical applications, however, this assumption cannot be guaranteed. Practical experience also shows that human observers can classify texture images correctly no matter how the images are rotated. Therefore, invariant texture analysis is highly demanded in both theoretical research and practical application.

More and more attention has been paid to invariant texture analysis, and an excellent review is given by Zhang and Tan [8]. Among these methods, Kashyap and Khotanzad [9] were the first to study rotation invariant texture classification, using a circular autoregressive model whose parameters are invariant to image rotation. Choe and Kashyap [10] proposed an autoregressive fractional difference model with rotation invariant parameters. The hidden Markov model [11] has also been used to explore rotation invariant texture classification. In addition, wavelet analysis is an excellent tool for obtaining rotation invariant texture features. For example, Jafari-Khouzani and Soltanian-Zadeh [12] proposed extracting wavelet energy features that contain texture orientation information to classify texture images, and a polar, analytic form of a two-dimensional Gabor wavelet [13] was used to derive rotation invariant texture features. More recently, methods based on statistical learning were proposed by Varma and Zisserman [14, 15], in which a rotation invariant texton library is first built from a training set, and a testing texture image is then classified according to its texton distribution. Crosier and Griffin [16] used basic image features (BIF) for texture classification and obtained excellent results. Furthermore, some pioneering work on scale and affine invariant texture classification has been done using fractal analysis [17] and affine adaptation [18].

The local binary pattern (LBP) has earned a strong reputation for its effectiveness, speed, and rotation invariance since it was first mentioned by Harwood et al. [19] and later introduced to the public by Ojala et al. [20]. Many researchers have developed LBP methods based on Ojala's idea. For example, Zhao et al. [21], Maani et al. [22], and Ahonen et al. [23] each improved the LBP method using frequency domain analysis. Mäenpää [24] pointed out that texture can be regarded as a two-dimensional phenomenon characterized by two orthogonal properties, patterns and the strength of the patterns, and that these two measures supplement each other in a very useful way. However, the original LBP ignores precisely 'the strength of the pattern', as well as direction information. Guo et al. [5] proposed an adaptive LBP method that includes the directional statistical information of texture for rotation invariant texture classification. Motivated by their work, the original rotation invariant LBP, an improved contrast rotation invariant LBP, and a direction rotation invariant LBP are combined into an integrated LBP (ILBP), shown using the dashed line and box in Figure 1, to represent the texture information of the image. This combination can effectively overcome the inherent deficiency of the original LBP, which ignores contrast and direction information.

Although an LBP descriptor can achieve excellent performance, it describes only local gray-level differences and lacks the shape and spatial expression of the holistic texture image. Furthermore, unlike homogeneous textures such as bricks or sand, which have uniform statistical features, inhomogeneous textures such as clouds or flowers generally do not yield robust texture features under conventional algorithms designed for homogeneous textures [25]. Zernike moments are a desirable choice for making up the shape and spatial information of the holistic texture image that is missed when LBP texture features are extracted.

Moments and functions of moments can capture global information of the image and have been successfully utilized as pattern features in many applications such as image recognition [26] and image retrieval [25]. Zernike moments are derived from the theory of orthogonal polynomials. Khotanzad and Hong [26] have suggested that orthogonal moments like Zernike moments are better than other types of moments in terms of information redundancy and image representation. Compared with other orthogonal moments, Zernike moments possess the rotation invariant property and can be calculated easily and rapidly to an arbitrarily high order.

Therefore, a promising rotation invariant texture classification method is proposed which combines ILBP features with rotation invariant Zernike moment features. These two kinds of features describe the local and holistic information of texture images, respectively. With this fusion strategy, excellent performance is obtained in extensive experiments on comprehensive texture databases, including the Columbia-Utrecht Reflection and Texture (CUReT) database [27], the Outex database [28], and the KTH-TIPS database [29]. The framework of the proposed method is shown using a solid line and box in Figure 1.

The rest of the paper is organized as follows. Section 2 explains the original LBP. Section 3 presents the proposed method in which contrast and direction information of LBP are considered, and shape and space information of the holistic image obtained by Zernike moments are fused during the course of feature extraction. The experimental results of the proposed method and the other compared methods are shown in Section 4. Finally, a conclusion is drawn.

2 Original LBP texture model

2.1 Basic LBP model

Ojala et al. [20] used LBP as a texture descriptor of the image, as shown in Figure 2; each texton is composed of a central pixel and its neighbors. Taking the central pixel as the threshold of the texton, the LBP code can be described using the following equation:

LBP(x_c, y_c) = \sum_{p=0}^{P-1} s(g_p - g_c) 2^{p}    (1)

where s(x) is a sign function, s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}, (x_c, y_c) is an allowable position of the central pixel, g_c is the value of the central pixel, g_p is the value of the p-th neighbor, and P is the number of neighbors.
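As a small illustration of the thresholding step, the following Python sketch computes the sign bits and the LBP code of a single texton. The function names `sign` and `lbp_code` are illustrative only, the neighbor ordering is an arbitrary assumption, and the pixel values are borrowed from the first example texton of Section 3.1.1.

```python
def sign(x):
    """Step function s(x): 1 if x >= 0, otherwise 0."""
    return 1 if x >= 0 else 0


def lbp_code(center, neighbors):
    """LBP code: sum over p of s(g_p - g_c) * 2**p."""
    return sum(sign(g - center) << p for p, g in enumerate(neighbors))


# Example texton: central pixel 50 and its P = 8 neighbors.
center = 50
neighbors = [82, 90, 30, 75, 124, 69, 39, 104]

bits = [sign(g - center) for g in neighbors]
code = lbp_code(center, neighbors)
print(bits, code)
```

With this ordering the first neighbor contributes the least significant bit; a different starting neighbor would give a circularly shifted code.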

By counting the frequencies of the LBP codes that occur at all allowable positions in the image, the texture spectrum histogram S[h] (h = 0, 1, …, 2^{P} − 1) can be obtained using the following equation:

where f(x_c, y_c) = \begin{cases} 1, & LBP(x_c, y_c) = h \\ 0, & \mathrm{otherwise} \end{cases}, and u × v is the size of the image.
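The counting procedure can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes the square 3 × 3 neighborhood (P = 8), treats only interior pixels as allowable positions, and returns raw counts without normalization. The name `lbp_histogram` and the neighbor ordering are hypothetical.

```python
def lbp_histogram(image):
    """Histogram of 8-neighbor LBP codes over the interior pixels of a 2-D list."""
    # (dy, dx) offsets of the 8 neighbors; the ordering is an arbitrary choice.
    offsets = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
               (0, -1), (1, -1), (1, 0), (1, 1)]
    hist = [0] * 256  # one bin per possible code, h = 0 .. 2**8 - 1
    rows, cols = len(image), len(image[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gc = image[y][x]
            code = 0
            for p, (dy, dx) in enumerate(offsets):
                if image[y + dy][x + dx] >= gc:  # s(g_p - g_c) = 1
                    code |= 1 << p
            hist[code] += 1
    return hist


# A flat 4 x 4 image: every interior texton yields the all-ones code 255.
flat = [[5] * 4 for _ in range(4)]
hist = lbp_histogram(flat)
```

Dividing each bin by the number of allowable positions would turn the counts into relative frequencies, which is one common convention when images of different sizes are compared.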

Subsequently, Ojala et al. [1] extended the square LBP to a circular form with an arbitrary radius R and number of neighbors P. Supposing that the coordinate of the central pixel g_c is (x_c, y_c), the coordinate of the neighbor g_p is (x_c + R cos(2πp/P), y_c − R sin(2πp/P)). The values of neighbors that do not fall on the image grid can be calculated by interpolation. The relative position of the central pixel and its neighbors is shown in Figure 3.
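The circular sampling can be sketched as below. The text only states that an interpolation method is used; bilinear interpolation is assumed here as one common choice, and the helper names `circular_neighbors` and `bilinear` are hypothetical.

```python
import math


def circular_neighbors(xc, yc, P, R):
    """Coordinates (xc + R cos(2*pi*p/P), yc - R sin(2*pi*p/P)) for p = 0..P-1."""
    return [(xc + R * math.cos(2 * math.pi * p / P),
             yc - R * math.sin(2 * math.pi * p / P)) for p in range(P)]


def bilinear(image, x, y):
    """Bilinear interpolation of image (list of rows) at a real-valued (x, y).

    Assumes 0 <= x <= width - 1 and 0 <= y <= height - 1.
    """
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1 = min(x0 + 1, len(image[0]) - 1)
    y1 = min(y0 + 1, len(image) - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


# Sample P = 8 neighbors at radius R = 1 around pixel (2, 2) of a 5 x 5 ramp image.
ramp = [[x + 10 * y for x in range(5)] for y in range(5)]
samples = [bilinear(ramp, x, y) for x, y in circular_neighbors(2, 2, 8, 1)]
```

The sampled values could then be thresholded against the central pixel exactly as in the square case.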

2.2 Uniform and rotation invariant LBP

A practical problem exists in the LBP described above. As the number of neighbors increases, the dimension of the histogram grows rapidly. For example, if P is 16, the dimension of the histogram is 2^{16} = 65,536. The texture spectrum is therefore too long to be convenient in practice.

In the LBP code, the number of spatial transitions (bitwise 0/1 changes) can be described as:

U(LBP_{P,R}) = |s(g_{P-1} - g_c) - s(g_0 - g_c)| + \sum_{p=1}^{P-1} |s(g_p - g_c) - s(g_{p-1} - g_c)|    (3)

When U(LBP_{P,R}) ≤ 2, the LBP is defined as a uniform pattern LBP_{P,R}^{u2}, of which there are P(P − 1) + 2 distinct patterns [1]. Using only the uniform patterns greatly shortens the histogram spectrum feature, and this simplification is feasible: experiments and observation show that uniform LBPs are fundamental properties of texture, accounting for the vast majority of patterns, sometimes over 90%. Detailed experimental results are listed in Section 4.
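The transition count and the number of uniform patterns can be checked with a few lines of Python. The names `uniformity` and `count_uniform` are illustrative; the sketch enumerates all 2^P codes, so it is only practical for small P.

```python
def uniformity(bits):
    """Number of circular 0/1 transitions U in a binary pattern."""
    P = len(bits)
    return sum(bits[p] != bits[(p - 1) % P] for p in range(P))


def count_uniform(P):
    """Count the P-bit patterns with at most two circular transitions."""
    total = 0
    for code in range(2 ** P):
        bits = [(code >> p) & 1 for p in range(P)]
        if uniformity(bits) <= 2:
            total += 1
    return total


print(uniformity([1, 1, 0, 1, 1, 1, 0, 1]))  # pattern from Section 3.1.1
print(count_uniform(8), 8 * (8 - 1) + 2)
```

For P = 8 the enumeration agrees with the closed form P(P − 1) + 2.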

Furthermore, it is not difficult to observe that rotating an LBP does not change its structure: the original LBP and the rotated LBP have the same circular order of bits and the same number of bitwise 0/1 changes, as shown in Figure 4. To obtain a rotation invariant texture description, Ojala et al. [1] gave the following definition:

LBP_{P,R}^{ri} = \min \{ ROR(LBP_{P,R}, p) \mid p = 0, 1, \ldots, P-1 \}    (4)

where ri denotes rotation invariance and ROR(x, p) denotes rotating the LBP code x p times around the center pixel, i.e., a circular bitwise shift by p positions. That is to say, the LBP with the minimal decimal value represents all LBPs belonging to the same family. Figure 4 shows some LBPs pertaining to the same family. The rotation invariant uniform LBP LBP_{P,R}^{riu2} can be calculated using the following equation:

LBP_{P,R}^{riu2} = \begin{cases} \sum_{p=0}^{P-1} s(g_p - g_c), & U(LBP_{P,R}) \le 2 \\ P + 1, & \mathrm{otherwise} \end{cases}    (5)

where riu2 denotes the rotation invariant uniform pattern, which has P + 2 distinct values. Thus the dimension of the texture spectrum histogram is greatly reduced. By counting the frequencies of LBP_{P,R}^{riu2} at all allowable pixel positions in the image, the texture spectrum histogram S_{original} can be obtained.
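Both mappings are short enough to sketch directly. The following Python fragment is an illustration under the usual definitions of Ojala et al.; the helper names `ror`, `lbp_ri`, and `lbp_riu2` are hypothetical.

```python
def ror(code, p, P):
    """Circularly rotate a P-bit code right by p positions."""
    p %= P
    mask = (1 << P) - 1
    return ((code >> p) | (code << (P - p))) & mask


def lbp_ri(code, P):
    """Rotation invariant code: the minimum over all P circular rotations."""
    return min(ror(code, p, P) for p in range(P))


def lbp_riu2(bits):
    """riu2 label: number of ones if the pattern is uniform, otherwise P + 1."""
    P = len(bits)
    transitions = sum(bits[p] != bits[p - 1] for p in range(P))
    return sum(bits) if transitions <= 2 else P + 1


# Two rotated versions of the same pattern map to the same ri code.
print(lbp_ri(0b00001110, 8), lbp_ri(0b11100000, 8))

# All 256 codes for P = 8 collapse to P + 2 = 10 riu2 labels.
labels = {lbp_riu2([(c >> p) & 1 for p in range(8)]) for c in range(256)}
print(sorted(labels))
```

The riu2 histogram therefore has only P + 2 bins, compared with 2^P bins for the raw codes.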

3 Integrated LBP and Zernike moments model

As mentioned above, LBP pays attention to detailed local information when texture features are extracted, and its major drawback in texture analysis is this locality. Zernike moment features are just the opposite: they emphasize the holistic and shape information of images but lack local detail. Therefore, LBP and Zernike moments complement each other in describing image information. Moreover, both measures can be described as a histogram spectrum, so it is very convenient to fuse them.

3.1 Integrated rotation invariant LBP model

Besides the original rotation invariant pattern LBP_{P,R}^{riu2}, two other kinds of rotation invariant LBP are proposed: the contrast rotation invariant LBP, denoted C_LBP_{P,R}^{riu2}, and the direction rotation invariant LBP, denoted O_LBP_{P,R}^{ri}. These three kinds of rotation invariant LBP are collectively referred to as the ILBP model.

3.1.1 Contrast rotation invariant LBP

Although the rotation invariant pattern LBP_{P,R}^{riu2} can achieve excellent performance, this kind of LBP texture representation describes only the sign of the change between the central pixel and its neighbors. It cannot state explicitly how large that change is. For example, consider two local textons whose central pixels are both 50 and whose neighbors are {82,90,30,75,124,69,39,104} and {79,68,24,82,136,73,45,233}, respectively. Although their LBP codes are both {1,1,0,1,1,1,0,1}, the absolute values of the contrast change between the central pixel and the neighbors differ: they are {32,40,20,25,74,19,11,54} and {29,18,26,32,86,23,5,183}, respectively. To supplement this missing information, the contrast rotation invariant LBP is added to describe the texture images alongside the original LBP_{P,R}^{riu2}. Let C_p denote the absolute value of the contrast change between the central pixel and the p-th neighbor in every texton, i.e., C_p = |g_p − g_c|; the LBP of C_p can be obtained by the following equation:

C_LBP_{P,R} = \sum_{p=0}^{P-1} s(C_p - \mu_C) 2^{p}    (6)

where μ_C is the mean of the absolute contrast changes C_p between the central pixel and the neighbors in every texton, i.e., \mu_C = \frac{1}{P} \sum_{p=0}^{P-1} C_p. If processing similar to (5) is applied to C_LBP_{P,R}, the contrast rotation invariant C_LBP_{P,R}^{riu2} is obtained. By counting the frequencies of C_LBP_{P,R}^{riu2} at all allowable pixel positions in the image, the texture spectrum histogram S_C can be obtained.
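The two example textons can be worked through numerically. The sketch below assumes that the contrast bits are obtained by thresholding C_p at the per-texton mean μ_C with the same step function s(x) used for the original LBP; the function name `contrast_bits` is hypothetical.

```python
def contrast_bits(center, neighbors):
    """Return (C_p list, mu_C, thresholded bits) for one texton."""
    c = [abs(g - center) for g in neighbors]   # C_p = |g_p - g_c|
    mu = sum(c) / len(c)                       # per-texton mean mu_C
    bits = [1 if cp >= mu else 0 for cp in c]  # s(C_p - mu_C)
    return c, mu, bits


c1, mu1, b1 = contrast_bits(50, [82, 90, 30, 75, 124, 69, 39, 104])
c2, mu2, b2 = contrast_bits(50, [79, 68, 24, 82, 136, 73, 45, 233])
print(mu1, b1)
print(mu2, b2)
```

Under this assumption the two textons, which share the same sign pattern, receive different contrast patterns, which is the extra information the contrast LBP is meant to supply.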

3.1.2 Direction rotation invariant LBP

For the stochastic texture images as shown in Figure 5a, the direction information is not apparent. But for the periodic or partly periodic texture images as shown in Figure 5b, the direction information is obvious. In the real world, most of the texture images contain the directional cue, so supplementing direction information in the discriminative features is worth trying.

The mean μ_{Op} and variance σ_{Op} of C_p over the whole texture image are used to describe the direction information along the orientation 2πp/P. The specific equations are shown below.

Therefore, two vectors μ_O = [μ_{O0}, μ_{O1}, …, μ_{O,P−1}] and σ_O = [σ_{O0}, σ_{O1}, …, σ_{O,P−1}] representing direction information can be obtained. Figure 6 shows an example of the directional information μ_O and σ_O for one texture image and for the same image rotated by 90°. It can be observed that μ_O and σ_O contain strong directional information and can be used to revise the histogram spectrum feature of the images so that more similarities between an image and its rotated versions are mined. μ_O and σ_O can each be converted into a rotation invariant LBP by using the means of μ_O and σ_O as the thresholds. The direction rotation invariant information O_{μ}_LBP_{P,R}^{ri} and O_{σ}_LBP_{P,R}^{ri} of the holistic texture image can be obtained using the following equations:

where \overline{\mu}_{Op} = \frac{1}{P} \sum_{p=0}^{P-1} \mu_{Op} and \overline{\sigma}_{Op} = \frac{1}{P} \sum_{p=0}^{P-1} \sigma_{Op}. O_{μ}_LBP_{P,R}^{ri} and O_{σ}_LBP_{P,R}^{ri} together represent the direction rotation invariant O_LBP_{P,R}^{ri} of the whole texture image. How the histogram spectrum feature of the image is revised using the direction rotation invariant O_LBP_{P,R}^{ri} is described in detail in Section 3.3.2.
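One possible reading of this procedure is sketched below in Python. Several details are assumptions rather than the authors' definitions: the per-orientation statistics are taken as the plain mean and population variance of C_p over all interior pixels, a square 3 × 3 neighborhood (P = 8, R = 1) is used, the thresholds are the means of the two vectors, and rotation invariance is obtained by taking the minimum code over all circular shifts. The name `direction_codes` is hypothetical.

```python
def direction_codes(image):
    """Return (mu_O, sigma_O, O_mu code, O_sigma code) for a 2-D list image."""
    # (dy, dx) offsets ordered counterclockwise, starting from the right neighbor.
    offsets = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
               (0, -1), (1, -1), (1, 0), (1, 1)]
    P = len(offsets)
    rows, cols = len(image), len(image[0])
    mu, var = [], []
    for dy, dx in offsets:
        # C_p for this orientation at every interior pixel.
        c = [abs(image[y + dy][x + dx] - image[y][x])
             for y in range(1, rows - 1) for x in range(1, cols - 1)]
        m = sum(c) / len(c)
        mu.append(m)
        var.append(sum((v - m) ** 2 for v in c) / len(c))

    def ri_code(values):
        # Threshold at the mean of the vector, then take the minimal rotation.
        mean = sum(values) / P
        bits = [1 if v >= mean else 0 for v in values]
        return min(sum(bits[(p + r) % P] << p for p in range(P))
                   for r in range(P))

    return mu, var, ri_code(mu), ri_code(var)


# Horizontal stripes and the same image rotated by 90 degrees (clockwise).
stripes = [[10 * (y % 2) for _ in range(6)] for y in range(6)]
rotated = [list(row) for row in zip(*stripes[::-1])]
print(direction_codes(stripes)[2], direction_codes(rotated)[2])
```

In this toy case the direction vector of the rotated image is a circular shift of the original one, so both images receive the same rotation invariant code.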

3.2 Rotation invariant Zernike moments model

Although LBP is an excellent method in both performance and efficiency, it ignores the shape and spatial information of the holistic texture image. To supplement this missing information, rotation invariant Zernike moment features are used and fused. Because the basis set of ordinary moments is not orthogonal, Zernike [30] introduced a set of complex polynomials {V_{nm}(x, y)} that form a complete orthogonal set over the interior of the unit circle, i.e., x^{2} + y^{2} ≤ 1. The form of these polynomials is described as:

V_{nm}(x, y) = V_{nm}(\rho, \theta) = R_{nm}(\rho) \exp(j m \theta)    (11)

where n is a non-negative integer, m is an integer (positive, negative, or zero) subject to the constraints that n − |m| is even and |m| ≤ n, ρ is the length of the vector from the origin to the pixel (x, y), θ is the counterclockwise angle between that vector and the x axis, and R_{nm}(ρ) is the radial polynomial given by the following equation:

R_{nm}(\rho) = \sum_{s=0}^{(n-|m|)/2} (-1)^{s} \frac{(n-s)!}{s! \left(\frac{n+|m|}{2}-s\right)! \left(\frac{n-|m|}{2}-s\right)!} \rho^{n-2s}    (12)

These polynomials are orthogonal and satisfy

\iint_{x^{2}+y^{2} \le 1} V_{nm}^{*}(x, y) V_{pq}(x, y) \, dx \, dy = \frac{\pi}{n+1} \delta_{np} \delta_{mq}    (13)

where \delta_{ab} = \begin{cases} 1, & a = b \\ 0, & \mathrm{otherwise} \end{cases}. Zernike moments are the projection of the image function onto these orthogonal basis functions. The Zernike moment of order n with repetition m for a continuous image function f(x, y) is:

A_{nm} = \frac{n+1}{\pi} \iint_{x^{2}+y^{2} \le 1} f(x, y) V_{nm}^{*}(\rho, \theta) \, dx \, dy    (14)

For a digital texture image f(x, y), the integrals are replaced by summations:

A_{nm} = \frac{n+1}{\pi} \sum_{x} \sum_{y} f(x, y) V_{nm}^{*}(\rho, \theta), \quad x^{2} + y^{2} \le 1    (15)

When calculating the Zernike moments of a given image, the center of the image is taken as the origin and pixel coordinates are mapped into the unit circle. Pixels falling outside the circle are not used. For a real-valued image, A_{nm}^{*} = A_{n,−m}, so |A_{nm}| = |A_{n,−m}|. It can be shown theoretically that Zernike moments have a rotation invariant property: if the Zernike moments of an image and of the same image rotated by an angle θ are denoted A_{nm} and A'_{nm}, respectively, they satisfy the following relation:

A'_{nm} = A_{nm} \exp(-j m \theta)    (16)

That is, rotation changes only the phase of a moment, so the magnitude |A_{nm}| is rotation invariant. If the image is preprocessed using some simple methods [26], Zernike moments are also invariant to translation and scale. Using (15), Zernike moments of different orders can be obtained, such as A_{00}, A_{11}, A_{20}, A_{22}, and so on. The vector S_Z composed of Zernike moments of different orders is used as the histogram spectrum feature to describe the image information, and the specific form is:
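As a rough illustration of the Zernike moment computation described in this section, the following Python sketch evaluates the radial polynomial and a discrete moment for a small square image. It is a sketch under stated assumptions, not the authors' implementation: pixel centers are mapped linearly onto [−1, 1], no pixel-area normalization is applied, and the function names `radial_poly` and `zernike_moment` are hypothetical.

```python
import cmath
import math


def radial_poly(n, m, rho):
    """Zernike radial polynomial R_nm(rho); requires n - |m| even and |m| <= n."""
    m = abs(m)
    total = 0.0
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * math.factorial(n - s)
                 / (math.factorial(s)
                    * math.factorial((n + m) // 2 - s)
                    * math.factorial((n - m) // 2 - s)))
        total += coeff * rho ** (n - 2 * s)
    return total


def zernike_moment(image, n, m):
    """Discrete Zernike moment A_nm of a square N x N image (list of rows)."""
    N = len(image)
    acc = 0j
    for r in range(N):
        for c in range(N):
            # Map pixel centers to [-1, 1] with the image center as origin.
            x = (2 * c - N + 1) / (N - 1)
            y = (N - 1 - 2 * r) / (N - 1)
            rho = math.hypot(x, y)
            if rho > 1.0:
                continue  # pixels outside the unit circle are not used
            theta = math.atan2(y, x)
            # Conjugate basis function V*_nm = R_nm(rho) * exp(-j m theta).
            acc += image[r][c] * radial_poly(n, m, rho) * cmath.exp(-1j * m * theta)
    return (n + 1) / math.pi * acc


# The magnitude |A_nm| should be unchanged when the image is rotated by 90 degrees.
img = [[r * 5 + c + 1 for c in range(5)] for r in range(5)]
rot = [list(row) for row in zip(*img[::-1])]
print(abs(zernike_moment(img, 2, 2)), abs(zernike_moment(rot, 2, 2)))
```

A 90° rotation maps the pixel grid onto itself, so the magnitudes agree up to floating-point error; for arbitrary angles, interpolation and discretization would introduce small deviations.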

After the ILBP and Zernike moment features of the image are respectively obtained through the above description, the fusion feature is constructed and revised, then a final classification decision is made.

3.3.1 Construction of fusion feature

Because the LBP and Zernike moment features are both in histogram spectrum form, it is very convenient to fuse them. Many experiments were carried out with serial, parallel, and joint fusion methods, and the serial method gave the most stable and best performance. The serial method is very simple and can be described as:

where F denotes the fused histogram spectrum feature. The histogram spectrum S_{original} of the original rotation invariant LBP_{P,R}^{riu2} and the histogram spectrum S_C of the contrast rotation invariant C_LBP_{P,R}^{riu2} can also be serially fused on their own. The related experimental results are given in Section 4.

3.3.2 Revision of fusion feature

In Section 3.1.2, a method for acquiring the directional information of the image was proposed. Here, the method for revising the fused histogram spectrum feature using the direction rotation invariant O_LBP_{P,R}^{ri}, which comprises O_{μ}_LBP_{P,R}^{ri} and O_{σ}_LBP_{P,R}^{ri}, is elaborated. The equation is described as:

where F′ is the revised fusion histogram spectrum feature; μ(O_{μ}) and σ(O_{μ}) are respectively the mean and variance of the direction rotation invariant O_{μ}_LBP_{P,R}^{ri} of all training images; μ(O_{σ}) and σ(O_{σ}) are respectively the mean and standard deviation of the direction rotation invariant O_{σ}_LBP_{P,R}^{ri} of all training images; and c_{1} and c_{2} are positive parameters. Besides the fusion histogram spectrum feature F, O_LBP_{P,R}^{ri} can also revise other histogram spectrum features, such as S_{original} generated by the original rotation invariant LBP_{P,R}^{riu2}, S_C generated by the contrast rotation invariant C_LBP_{P,R}^{riu2}, and even S_Z calculated from the rotation invariant Zernike moments.

3.4 Classifier and multiscale fusion idea

The nearest neighbor rule is an effective and simple classification criterion. There are many good measures of the difference or similarity between two histograms, such as the log-likelihood ratio and the chi-square statistic [1]. The chi-square distance is chosen in the experiments because it performs well in terms of both speed and recognition rate. It is described as:

d(F'_{train}, F'_{test}) = \sum_{i=1}^{N} \frac{(F'_{train,i} - F'_{test,i})^{2}}{F'_{train,i} + F'_{test,i}}

where d is the chi-square distance between the revised fusion histogram F'_{train} of the training image and the revised fusion histogram F'_{test} of the testing image, the subscript i denotes the corresponding bin, and N is the number of bins.

The multiscale fusion idea can also be used to improve the classification accuracy of the proposed method, i.e., multiple descriptors with various (P, R) are used simultaneously. Because operators at different scales cover different structural spaces of the image, multiple scale descriptors can capture richer and more complete texture information.
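The classification step can be sketched as follows. This is a minimal illustration rather than the authors' code: serial fusion is taken to be plain concatenation of histogram spectra, bins whose denominator is zero are skipped in the chi-square distance (an assumption), and the names `serial_fusion`, `chi_square`, and `nearest_neighbor` are hypothetical. The same concatenation helper could also join descriptors computed at several (P, R) scales.

```python
def serial_fusion(*spectra):
    """Serially connect several histogram spectra into one feature vector."""
    fused = []
    for spectrum in spectra:
        fused.extend(spectrum)
    return fused


def chi_square(f_train, f_test):
    """Chi-square distance between two histograms; empty bins are skipped."""
    return sum((a - b) ** 2 / (a + b)
               for a, b in zip(f_train, f_test) if a + b > 0)


def nearest_neighbor(train_feats, train_labels, test_feat):
    """Label of the training feature with the smallest chi-square distance."""
    dists = [chi_square(f, test_feat) for f in train_feats]
    return train_labels[dists.index(min(dists))]


# Toy example with two training histograms and one test histogram.
train = [serial_fusion([0.7, 0.3], [1.0, 0.0]),
         serial_fusion([0.2, 0.8], [0.0, 1.0])]
labels = ["class_a", "class_b"]
test = serial_fusion([0.6, 0.4], [0.9, 0.1])
print(nearest_neighbor(train, labels, test))
```

When spectra of very different magnitude are concatenated, some form of per-spectrum normalization may be needed so that one feature does not dominate the distance; the text does not specify this.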

4 Experimental results

Many experiments have been elaborately designed and executed with the aim of demonstrating the effectiveness of the proposed method.

4.1 The database

Three large and comprehensive texture databases are chosen in this study: the CUReT database [27], the Outex database [28], and the KTH-TIPS database [29]. The CUReT database includes 61 classes of real-world textures, each imaged under different combinations of illumination and viewing angle. Following Guo et al. [5], 92 sufficiently large images with a viewing angle of less than 60° are selected in each class. The first 23 images in each class are used as training images, so there are 1,403 (61 × 23 = 1,403) training models and 4,209 (61 × 69 = 4,209) testing samples. This design simulates a situation in which only a small and less comprehensive set of training images is available.

In the Outex database, each texture is captured using six spatial resolutions (100, 120, 300, 360, 500, and 600 dpi), nine rotation angles (0°, 5°, 10°, 15°, 30°, 45°, 60°, 75°, and 90°), and three different simulated illuminants ('horizon', 'inca', and 'TL84'). The experimental images include canvas (46 classes), cardboard (1 class), carpet (12 classes), chips (12 classes), and wallpaper (17 classes), i.e., 99 classes of texture images altogether in the Outex database. Each class contains 27 images (3 illuminants, 9 angles, and a spatial resolution of 600 dpi). The first 9 images ('horizon' illuminant, 9 angles, and a spatial resolution of 600 dpi) in each class are chosen as training images. Therefore, there are 891 (99 × 9 = 891) training models and 1,782 (99 × 18 = 1,782) testing samples.

The KTH-TIPS database contains 10 texture classes such as crumpled aluminum foil, sponge, brown bread, etc. Each texture is captured under 9 scales, 3 different illumination directions, and 3 different poses. Therefore, there are 81 images per material. The first 21 images in each class are chosen as training images. Therefore, there are 210 (10 × 21 = 210) training models and 600 (10 × 60 = 600) testing samples.

The proposed method is compared with state-of-the-art LBP methods, including LBP_{P,R}^{riu2} [1], the variance method (VAR_{P,R}) [1], LBP_{P,R}^{riu2}/VAR_{P,R} [1], the adaptive LBP method (ALBPF_{P,R}^{riu2}) [5], and the LBP histogram Fourier (LBPHF) method [21] (concatenating the sign LBP histogram Fourier and magnitude LBP histogram Fourier features). VAR_{P,R} and LBP_{P,R}^{riu2}/VAR_{P,R} were set to 128 and 16 bins, respectively. All images are converted to gray scale. To remove the effect of global intensity and contrast, each texture image was normalized to have an average intensity of 128 and a standard deviation of 20 [1].
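The intensity normalization can be illustrated with a short Python sketch. It assumes a simple linear rescaling of the gray levels and the population standard deviation (division by the number of pixels); whether the result is clipped or rounded to 8-bit values is not stated in the text, so neither is done here. The function name `normalize` is hypothetical.

```python
import math


def normalize(image, target_mean=128.0, target_std=20.0):
    """Linearly rescale a 2-D list so its mean is 128 and its std is 20."""
    pixels = [v for row in image for v in row]
    mean = sum(pixels) / len(pixels)
    std = math.sqrt(sum((v - mean) ** 2 for v in pixels) / len(pixels))
    if std == 0:
        # A constant image has no contrast to rescale.
        return [[target_mean for _ in row] for row in image]
    return [[(v - mean) / std * target_std + target_mean for v in row]
            for row in image]


out = normalize([[0, 10], [20, 30]])
flat = [v for row in out for v in row]
new_mean = sum(flat) / len(flat)
new_std = math.sqrt(sum((v - new_mean) ** 2 for v in flat) / len(flat))
print(new_mean, new_std)
```

Because the rescaling is monotonic, it leaves the sign-based LBP codes unchanged and affects only contrast-dependent measures.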

4.2 The feasibility of uniform LBP

To show the effectiveness of LBP_{P,R}^{u2} for dimensionality reduction, the proportions of LBP_{P,R}^{u2} among all patterns are calculated. Some statistical results are shown in Table 1; the images are selected from the Outex database.

As can be seen from Table 1, uniform LBPs account for the vast majority of local binary patterns, sometimes over 90%. Therefore, it is feasible to use the uniform LBP to reduce the dimensionality of the histogram spectrum.

4.3 Experimental results on CUReT database

In the experiments, different combinations of the three kinds of rotation invariant LBP operators and the rotation invariant Zernike moments are compared. '/O' denotes revising the histogram spectrum by the direction rotation invariant LBP. 'C' denotes capturing the histogram spectrum features by the contrast rotation invariant LBP. 'Z' denotes the Zernike moments method. '_' denotes connecting two or three kinds of histogram spectrum features in series. For example, LBP_{P,R}^{riu2}_C_Z represents serially connecting the original rotation invariant LBP_{P,R}^{riu2}, the contrast rotation invariant C_LBP_{P,R}^{riu2}, and the rotation invariant Zernike moments A_{nm}. LBP_{P,R}^{riu2}_C_Z/O represents revising the fusion feature LBP_{P,R}^{riu2}_C_Z by the direction rotation invariant O_LBP_{P,R}^{ri}. The number 5, 8, or 10 denotes the order of the Zernike moments. VZ_MR4 and VZ_MR8 denote the MR4 and MR8 variants of the MR filter bank method, respectively. Table 2 lists the experimental results on the CUReT database using the different methods.

Several observations can be made from Table 2. Firstly, the recognition rate of the contrast rotation invariant C_LBP_{P,R}^{riu2} (represented by 'C' in Table 2) alone is worse than that of the original rotation invariant LBP_{P,R}^{riu2}. For example, the recognition rates of LBP_{P,R}^{riu2} reach 62.25%, 64.93%, and 68.33% when (P, R) is (8,1), (16,2), and (24,3), respectively, whereas the results of C_LBP_{P,R}^{riu2} are 52.58%, 51.41%, and 50.18% in the same cases. This shows that LBP_{P,R}^{riu2} contains richer information than C_LBP_{P,R}^{riu2}.

Secondly, the contribution of contrast information, for both VAR_{P,R} and C_LBP_{P,R}^{riu2}, decreases as the number of neighbors and the size of the texton increase. This suggests that the difference between the central pixel and the neighbors becomes less reliable as the texton grows. In contrast, the recognition rate of the original rotation invariant LBP_{P,R}^{riu2} grows as the number of neighbors and the size of the texton increase.

Thirdly, among the compared LBP methods, LBPHF and the adaptive LBP method obtain better results. Among the non-LBP methods, the results of MR8 are better than those of MR4 because of its richer feature representation.

Fourthly, for Zernike moment features, the recognition rate grows as the order increases, because the higher the order, the richer the detailed information contained in the Zernike moment histogram spectrum. Fifthly, directional information can improve the recognition results of different features, including LBP, Zernike moments, and the fusion histogram spectrum.

Finally, fusion can effectively boost the recognition results. For example, when P = 8 and R = 1, the recognition rates obtained by LBP_{P,R}^{riu2}, C_LBP_{P,R}^{riu2}, and Zernike moments (order 10) alone are 62.25%, 52.58%, and 36.07%, respectively. However, when the fusion features LBP_{P,R}^{riu2}_C and LBP_{P,R}^{riu2}_C_Z are used, the recognition rates reach 67.31% and 76.36%, respectively.

By applying the multiscale idea described in Section 3, better results can be obtained. For example, recognition rates reach 77.33% and 81.94% when the fusion features LBP_{P,R}^{riu2}_C_{8,1+16,2+24,3} and LBP_{P,R}^{riu2}_C_Z_{8,1+16,2+24,3}, which combine different radii and different numbers of neighbors, are used. Recognition rates reach 81.84% and 78.33% when the fusion feature LBP_{P,R}^{riu2}_C_{16,1+16,2+16,3}, with different radii and the same number of neighbors, and the fusion feature LBP_{P,R}^{riu2}_C_Z_{8,2+16,2+24,2}, with the same radius and different numbers of neighbors, are used, respectively. Here, the Zernike moment features are computed using moments up to order 10, and the multiscale fusion features are obtained by simply connecting the histogram features of the different scales. Better performance can be expected if more ingenious fusion strategies are used [31]. Because the results of the LBPHF method are the most stable among the compared methods, the recognition rate of its multiscale fusion feature LBPHF_{8,1+16,2+24,3}, with different radii and different numbers of neighbors, was also calculated; it reaches 71.77%.

4.4 Experimental results on Outex database

In this section, all the experiments are carried out using the same methods, and the results are listed in Table 3. Because the images in the Outex database are larger than those in the CUReT database, most methods run out of memory when the number of neighbors P is 24 and the radius R is 3; the only exceptions are \mathrm{VAR}_{P,R} and \mathrm{LBP}_{P,R}^{\mathrm{riu2}}/\mathrm{VAR}_{P,R}. Therefore, the results for P = 24 and R = 3 are not listed in Table 3.

As can be seen from Table 3, firstly, the results of the original rotation invariant \mathrm{LBP}_{P,R}^{\mathrm{riu2}} are better than those of the contrast rotation invariant C\_\mathrm{LBP}_{P,R}^{\mathrm{riu2}}. Secondly, the results of \mathrm{LBP}_{P,R}^{\mathrm{riu2}} improve as the number of neighbors and the size of the texton increase, whereas the results of C\_\mathrm{LBP}_{P,R}^{\mathrm{riu2}} show the opposite trend.

Thirdly, for the Zernike moment method, the recognition rate grows as the order increases, which is the same trend as on the CUReT database. In addition, the results of Zernike moments are excellent, mainly because of two factors. On the one hand, the images in the Outex database strongly emphasize changes in rotation angle. On the other hand, Zernike moment features are rotation invariant and describe the shape and spatial information of the image well, so they are well suited to recognizing images with different rotation angles. Because Zernike moments already fully exploit the direction information of the image, the proposed direction rotation invariant LBP hardly changes the original feature histogram.
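The rotation invariance of Zernike moment magnitudes can be checked numerically with a small sketch. The code below uses the standard textbook definition of Zernike moments on the unit disk; it is not the authors' feature extraction, and the function names, the pixel-to-disk mapping, and the random test image are assumptions made for illustration. A 90-degree rotation is used because it maps the pixel grid onto itself exactly.

```python
import numpy as np
from math import factorial

def radial_poly(n, m, rho):
    """Zernike radial polynomial R_n^m evaluated on the array rho."""
    m = abs(m)
    out = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * factorial(n - s)
                 / (factorial(s)
                    * factorial((n + m) // 2 - s)
                    * factorial((n - m) // 2 - s)))
        out += coeff * rho ** (n - 2 * s)
    return out

def zernike_magnitudes(image, max_order):
    """Return |Z_nm| for all 0 <= m <= n <= max_order with n - m even."""
    image = np.asarray(image, dtype=float)
    size = image.shape[0]  # assumes a square image
    # Map pixel centers to coordinates in (-1, 1), symmetric about the center.
    coords = (2 * np.arange(size) + 1 - size) / size
    x, y = np.meshgrid(coords, coords)
    rho = np.hypot(x, y)
    theta = np.arctan2(y, x)
    inside = rho <= 1.0  # only pixels inside the unit disk contribute
    pixel_area = (2.0 / size) ** 2
    feats = []
    for n in range(max_order + 1):
        for m in range(n % 2, n + 1, 2):
            basis = radial_poly(n, m, rho) * np.exp(-1j * m * theta)
            z = (n + 1) / np.pi * np.sum(image[inside] * basis[inside]) * pixel_area
            feats.append(abs(z))
    return np.array(feats)

rng = np.random.default_rng(0)
img = rng.random((32, 32))           # hypothetical test image
original = zernike_magnitudes(img, 10)
rotated = zernike_magnitudes(np.rot90(img), 10)
print(len(original), np.allclose(original, rotated))
```

Rotating the image only multiplies each moment by a unit-modulus phase factor, so the magnitudes agree up to floating-point error. For arbitrary angles the agreement is only approximate because of resampling.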

Finally, the fusion method remarkably improves the results of the texture features. For example, when P = 16 and R = 2, \mathrm{LBP}_{P,R}^{\mathrm{riu2}} and C\_\mathrm{LBP}_{P,R}^{\mathrm{riu2}} obtain recognition rates of 31.03% and 15.38%, respectively, whereas the fusion features \mathrm{LBP}_{P,R}^{\mathrm{riu2}}\_C and \mathrm{LBP}_{P,R}^{\mathrm{riu2}}\_C\_Z reach 32.72% and 71.16%, respectively. Here, the Zernike moments are calculated up to order 10. However, the fusion results are worse than the results of Zernike moments alone. This phenomenon can be explained from a signal processing point of view: when the quality difference between two signal sources is too large, the fused result degrades because the weaker signal acts like noise and disturbs the stronger one. Therefore, the recognition rates of the fusion feature \mathrm{LBP}_{P,R}^{\mathrm{riu2}}\_C\_Z are worse than those of Zernike moments alone but much better than those of a single texture feature such as \mathrm{LBP}_{P,R}^{\mathrm{riu2}} or contrast LBP, and even better than those of the fusion feature \mathrm{LBP}_{P,R}^{\mathrm{riu2}}\_C.
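The disturbing effect of a weak feature on a strong one can be illustrated with a toy example. The data, the names, and the nearest-neighbor classifier with Euclidean distance are all hypothetical choices made for this sketch; they are not taken from the experiments reported here.

```python
import numpy as np

def nearest_label(train_feats, train_labels, query):
    """Return the label of the training sample closest to the query (Euclidean)."""
    dists = np.linalg.norm(np.asarray(train_feats) - np.asarray(query), axis=1)
    return train_labels[int(np.argmin(dists))]

# Hypothetical toy data: one training sample per class.
labels = ["A", "B"]
strong_train = np.array([[0.0], [1.0]])  # discriminative feature
weak_train = np.array([[0.0], [5.0]])    # unreliable feature with a large range

# Query sample that truly belongs to class A, but whose weak feature is off.
strong_query = np.array([0.1])
weak_query = np.array([5.0])

pred_strong = nearest_label(strong_train, labels, strong_query)
pred_fused = nearest_label(
    np.hstack([strong_train, weak_train]),
    labels,
    np.concatenate([strong_query, weak_query]),
)
print(pred_strong, pred_fused)  # A B
```

With the strong feature alone the query is assigned to class A, whereas after concatenation the unreliable feature dominates the distance and the query is assigned to class B.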

The multiscale method is also applied to the Outex database, and excellent performance is obtained. For example, the recognition rates reach 72.17%, 68.86%, and 74.41%, respectively, when the fusion feature with different radii and different numbers of neighbors, \mathrm{LBP}_{P,R}^{\mathrm{riu2}}\_C\_Z_{8,1+16,2}, the fusion feature with different radii and the same number of neighbors, \mathrm{LBP}_{P,R}^{\mathrm{riu2}}\_C\_Z_{16,1+16,2}, and the fusion feature with the same radius and different numbers of neighbors, \mathrm{LBP}_{P,R}^{\mathrm{riu2}}\_C\_Z_{8,2+16,2}, are used. Here, the Zernike moments are calculated up to order 10. Furthermore, we also calculated the recognition rate of the LBPHF method with the fusion feature of different radii and different numbers of neighbors, \mathrm{LBPHF}_{8,1+16,2}, which reaches 56.73%.

4.5 Experimental results on KTH-TIPS database

In this section, all the experiments are carried out using the same methods, and the results are listed in Table 4. Because most of the trends are similar to those on the CUReT and Outex databases, only the differing phenomena are reported here. Firstly, the recognition rates of the \mathrm{ALBPF}_{P,R}^{\mathrm{riu2}} and LBPHF methods decrease as the number of neighbors and the size of the texton increase. Secondly, compared with the results on the CUReT and Outex databases, contrast information plays a much more pronounced role, and its results are sometimes even better than those of \mathrm{LBP}_{P,R}^{\mathrm{riu2}}. The reason may be that the images in the KTH-TIPS database contain sharp scale changes.

In addition, the multiscale method further improves the results. For example, the recognition rates reach 64.50%, 62.33%, and 63.83%, respectively, when the fusion feature with different radii and different numbers of neighbors, \mathrm{LBP}_{P,R}^{\mathrm{riu2}}\_C\_Z_{8,1+16,2+24,3}, the fusion feature with different radii and the same number of neighbors, \mathrm{LBP}_{P,R}^{\mathrm{riu2}}\_C\_Z_{16,1+16,2+16,3}, and the fusion feature with the same radius and different numbers of neighbors, \mathrm{LBP}_{P,R}^{\mathrm{riu2}}\_C\_Z_{8,2+16,2+24,2}, are used. Here, the Zernike moments are calculated up to order 10. Furthermore, we also calculated the recognition rate of the LBPHF method with the fusion feature of different radii and different numbers of neighbors, \mathrm{LBPHF}_{8,1+16,2+24,3}, which reaches 55.83%.

In summary, the proposed method obtains more accurate, stable, and robust results than the other methods, including \mathrm{LBP}_{P,R}^{\mathrm{riu2}}, \mathrm{VAR}_{P,R}, \mathrm{LBP}_{P,R}^{\mathrm{riu2}}/\mathrm{VAR}_{P,R}, \mathrm{ALBPF}_{P,R}^{\mathrm{riu2}}, LBPHF, and MR. Although Zernike moment features alone perform outstandingly on the Outex database, they are less stable than the proposed method because their results on the CUReT and KTH-TIPS databases are very poor. In addition, the multiscale idea further improves the recognition results notably.

5 Conclusions

LBP is an excellent tool for texture classification because of its simplicity, efficiency, and rotation invariance. However, two main drawbacks weaken its performance: it ignores contrast and direction information, and it lacks an expression of the shape and spatial layout of the holistic texture image. To make up for the missing information, rotation invariant contrast and direction information are added to the original rotation invariant LBP texture feature; the resulting scheme is called ILBP. In addition, Zernike moments are fused into the improved LBP texture features because they effectively describe the shape and spatial information of the holistic image, are orthogonal and rotation invariant, and can be calculated easily and rapidly to an arbitrarily high order. Experimental results on the large and comprehensive CUReT, Outex, and KTH-TIPS texture databases show that the proposed method achieves superior performance compared with other classic LBP and non-LBP methods, and that the multiscale idea further improves the recognition results remarkably.

References

Ojala T, Pietikäinen M, Mäenpää T: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Machine Intell. 2002, 24: 971-987. 10.1109/TPAMI.2002.1017623

Zhang L, Zou B, Zhang J, Zhang Y: Classification of Polarimetric SAR Image based on Support Vector Machine using Multiple-Component Scattering Model and Texture Features. EURASIP J. Adv. Signal Proc. 2010, 3: 1-10.

Wang Y, He DJ, Yu CC, Jiang TQ, Liu ZW: Multimodal biometrics approach using face and ear recognition to overcome adverse effects of pose changes. J. Electron. Imaging 2012, 21: 043026-1-043026-11.

Guo ZH, Zhang L, Zhang D, Zhang S: Rotation invariant texture classification using adaptive LBP with directional statistical features. In Proceedings of the 7th IEEE International Conference on Image Processing. IEEE, Hong Kong, China; 2010:285-288.

Manjunath B, Ma W: Texture features for browsing and retrieval of image data. IEEE Trans. Pattern Anal. Machine Intell. 1996, 18: 837-842. 10.1109/34.531803

Varma M, Zisserman A: Texture classification: Are filter banks necessary? In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, Madison, USA; 2003:691-698.

Varma M, Zisserman A: A statistical approach to texture classification from single images. Int. J. Comput. Vision 2005, 62: 61-81. 10.1007/s11263-005-4635-4

Harwood D, Ojala T, Pietikäinen M, Kelman S, Davis L: Texture classification by center-symmetric auto-correlation, using Kullback discrimination of distributions. Pattern Recogn. Lett. 1995, 16: 1-10. 10.1016/0167-8655(94)00061-7

Ojala T, Pietikäinen M, Harwood D: A comparative study of texture measures with classification based on featured distributions. Pattern Recogn. 1996, 29: 51-59. 10.1016/0031-3203(95)00067-4

Mäenpää T: The local binary pattern approach to texture analysis-extensions and applications. Ph.D. dissertation. Dept. Elect. Inf. Eng., University of Oulu, Oulu, Finland; 2003.

Dana KJ, van Ginneken B, Nayar SK, Koenderink JJ: Reflectance and texture of real world surfaces. ACM Trans. Graphic. 1999, 18: 1-34. 10.1145/300776.300778

Ojala T, Mäenpää T, Pietikäinen M, Viertola J, Kyllönen J, Huovinen S: Outex-new framework for empirical evaluation of texture analysis algorithm. In Proceedings of the International Conference on Pattern Recognition. IEEE, Quebec, Canada; 2002:701-706.

Zernike F: Diffraction theory of the cut procedure and its improved form, the phase contrast method. Physica 1934, 1: 689-704. 10.1016/S0031-8914(34)80259-5

The authors sincerely thank postdoctoral researcher Zhenhua Guo from Tsinghua University and Professor Guoying Zhao from the University of Oulu for sharing the source code of the adaptive LBP method and the LBP histogram Fourier method. This work was supported by the National Natural Science Foundation of China (NSFC) under Grant No. 61171068.

Author information

Authors and Affiliations

Department of Computer and Information Engineering, Beijing Technology and Business University, Beijing, China

Yu Wang & Yi Chen

Department of Mechanical Engineering, Yanshan University, Qinhuangdao City, Hebei Province, China

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Wang, Y., Zhao, Y. & Chen, Y. Texture classification using rotation invariant models on integrated local binary pattern and Zernike moments. EURASIP J. Adv. Signal Process. 2014, 182 (2014). https://doi.org/10.1186/1687-6180-2014-182