Various approaches have been proposed in the literature for texture characterization of images. Some are based on statistical properties, others on fractal measures, and others on multi-resolution analysis. These approaches were originally applied to mono-band images, although most of them have since been extended to multi-band texture images by including the additional information between spectral bands. In this article, we investigate the problem of texture characterization for multi-band images. We aim to add spectral information to classical texture analysis methods, which only treat gray-level spatial variations. To achieve this goal, we propose a spatial and spectral gray level dependence method (SSGLDM) that extends the concept of the gray level co-occurrence matrix (GLCM) by assuming the presence of joint texture information between spectral bands. We propose new multi-dimensional functions for estimating the second-order joint conditional probability density of spectral vectors. These functions can be represented in a structure form that allows the occurrences to be computed while keeping the corresponding components of the spectral vectors. In addition, new texture feature measurements related to the SSGLDM, which describe multi-spectral image properties, are proposed. Extensive experiments have been carried out on 624 textured multi-spectral images for use in prostate cancer diagnosis, and quantitative results showed the efficiency of this method compared to the GLCM. The results indicate a significant improvement in terms of global accuracy rate. Thus, the proposed approach can provide clinically useful information for discriminating pathological tissue from healthy tissue.

1. Introduction

Recently, spectral imaging technology has become a topic of growing interest in color-reproduction, remote-sensing, medical imaging, and other systems. These increased research efforts are likely to propagate into other application areas such as computer vision and pattern recognition. Usually, texture analysis is an efficient measure to estimate the structural orientation, roughness, smoothness or regularity differences of diverse regions in an image scene.

The traditional and most popular tool for texture-based segmentation and classification is the gray level co-occurrence matrix (GLCM) [1, 2], but other techniques have also been proposed with success: description through wavelet coefficient statistics [3], Markov random field [4] or Markov chain [5] models. Nevertheless, texture analysis remains a difficult problem when applied to color [6], multispectral or hyperspectral images, where each image pixel takes its values in a multidimensional space. For this purpose, a great deal of work has been done on modeling multispectral and hyperspectral texture [7–12].

The main idea of extracting texture from hyperspectral images is the use of combined spectral and spatial information. Numerous approaches rely on Gabor filters [13], co-occurrence matrices [14–19], or mathematical morphology [20–22].

In [23], the authors used the potential of the spectral/textural approach to improve the classification accuracy of intra-urban land cover types; Clausi [24] studied the effect of gray-level quantization on the ability of co-occurrence probability statistics; Kiema [25] examined the gray-level co-occurrence based texture image fused to thematic mapper (TM) imagery to expand the object feature base to include both spectral and spatial features; while Bau and Healey [26] used a bank of rotation- and scale-invariant Gabor feature vectors to represent the spectral/spatial properties of a region.

In addition, color opponent features were first introduced in color texture characterization with fairly good performance [27] and later extended to deal with multi-band texture images [28]. Other methods combine color and texture information for the segmentation of color images [29]. Moreover, several studies examined the spatial interaction within each channel and the interaction between spectral channels, either applying gray level texture techniques to each channel independently [30] or using 3D colour histograms as a way to combine information from all colour channels [31].

Recently, it has been shown that the concept of spatial co-occurrence matrix could be generalized by assuming texture joint information between spectral bands [17, 32].

However, the main problem of texture analysis of multi-band images is related to the high dimension of the data and its high correlation.

Driven by classification or discrimination accuracy, one would expect that, as the number of multi or hyperspectral bands increases, the accuracy of classification should also increase. Nonetheless, this is not the case in a model-based analysis [33–35]. Redundancy in data can cause convergence instability of models. Furthermore, variations due to noise in redundant data propagate through a classification or discrimination model. Thus, processing a large number of multi or hyperspectral bands can result in higher classification inaccuracy than processing a subset of relevant bands without redundancy [34, 35].

One way of overcoming this problem is to apply a suitable band selection method before the classification task. The reason is that a band selection procedure reduces the data to a lower dimensional subspace with practically no loss of relevant information [36]. In addition, the computational requirements for processing large hyperspectral data sets might be prohibitive, and a method for selecting a data subset is therefore sought [33].

Several approaches have been investigated for removing the information redundancy resulting from highly correlated bands [33, 37–40]. Most of the methods involve two separate tasks: (a) feature band selection, i.e., selecting the bands that indicate the particular material well; and (b) redundancy reduction, i.e., removing the feature bands that contribute redundant information [41]. Recently, information theory has also been used for feature band selection [42, 43]: the amount of information in a subset of features (bands) is analyzed, and the degree of independence between image bands serves as a relevance criterion.

In this article, we investigate the analysis of multispectral image textures through a joint spectral and spatial analysis of texture. We aim to add spectral information to classical texture analysis methods, which only treat gray-level spatial variations. We apply the proposed method to textured multispectral images of prostate cancer. These images have been chosen to reflect different grades of malignancy in prostatic tissues, and they correspond to different structural patterns as well as apparent textures.

The remainder of this article is organized as follows: Section 2 describes a new method for multi and hyperspectral texture analysis, based on joint spatial and spectral information; while Section 3 is concerned with some comparative results. Finally, Section 4 gives a conclusion of the article.

2. Spatial and spectral gray level dependence method (SSGLDM)

2.1. N-mode flattening matrix of a tensor

A multi-band image can be represented as a 3-D data array, also called a third-order tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$, where the three indices correspond to the pixel location (two spatial indices) and the spectral band. Each element of $\mathcal{X}$ is denoted $x_{i_1 i_2 i_3}$, with i_{1} = 1, ..., I_{1}, i_{2} = 1, ..., I_{2}, i_{3} = 1, ..., I_{3}; ℝ is the set of real numbers.

A tensor can be transformed into an n-mode matrix. The n-mode flattened matrix X_{n} of the tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ is an I_{n} × M_{n} matrix where:

M_{n} = I_{p} × I_{q}, with p, q ≠ n.

By definition, X_{n} is generated by the column vectors of the n-mode flattened matrix. The columns of X_{n} are I_{n}-dimensional vectors obtained from $\mathcal{X}$ by varying the index i_{n} and keeping the other indices fixed [44]. These vectors are called "n-mode vectors". The three n-mode flattenings of a third-order tensor are shown in Figure 1.
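The flattening can be illustrated with a minimal Python sketch. The function name `unfold`, the nested-list tensor layout `X[i1][i2][i3]` and the order in which the columns are enumerated are assumptions made for illustration; only the shape I_{n} × M_{n} and the definition of the n-mode vectors come from the text.

```python
from itertools import product

def unfold(tensor, mode):
    """Return the n-mode flattened matrix of a 3-D nested-list tensor.

    `mode` is zero-based (the 3-mode flattening of the text is mode=2).
    Each column is one n-mode vector: the index of `mode` varies while
    the two other indices stay fixed. The column order is an assumption.
    """
    dims = (len(tensor), len(tensor[0]), len(tensor[0][0]))
    others = [axis for axis in range(3) if axis != mode]
    columns = []
    for fixed in product(range(dims[others[0]]), range(dims[others[1]])):
        column = []
        for i in range(dims[mode]):
            idx = [0, 0, 0]
            idx[mode] = i
            idx[others[0]], idx[others[1]] = fixed
            column.append(tensor[idx[0]][idx[1]][idx[2]])
        columns.append(column)
    # Transpose the list of columns into I_n rows of M_n entries.
    return [list(row) for row in zip(*columns)]

# A 2 x 3 image with 2 spectral bands, stored as X[i1][i2][i3].
X = [[[1, 10], [2, 20], [3, 30]],
     [[4, 40], [5, 50], [6, 60]]]
X3 = unfold(X, mode=2)  # 2 rows (bands), 6 columns (pixels)
```

In this toy case each row of `X3` is one spectral band read pixel by pixel, so each column of `X3` is the spectral vector of one pixel.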

2.2. SSGLDM algorithm

The idea of the SSGLDM is based on the estimation of the second-order joint conditional probability density multi-dimensional functions P (V, V_{Δ}|d, θ).

Each P (V, V_{Δ}|d, θ) is the probability that two spectral vectors V and V_{Δ}, with components (i_{1}, ..., i_{k}, ..., i_{n}) and (j_{1}, ..., j_{l}, ..., j_{n}) respectively, occur for a given distance d and direction θ (see Figure 2), where n is the number of spectral bands, (i_{k}, j_{l}) ∈ [1, N_{g}]^{2}, and N_{g} is the number of gray levels.

Throughout the article, we consider multi-spectral data as a third-order tensor denoted by $\mathcal{X}$, whose entries are accessed via three indices.

Unlike the spatial gray level dependence method, in which the estimated second-order joint conditional density functions can be described by a matrix, we propose to represent P (V, V_{Δ}|d, θ) as a structure made of an array I_{occu}, which is the vector of occurrences, and a matrix $M_{V,V_\Delta}$, which is the matrix of the components of V and V_{Δ}. Each row of $M_{V,V_\Delta}$ holds the components of one pair (V, V_{Δ}). This structure makes it possible to compute the occurrences while keeping the corresponding component vectors of V and V_{Δ}. For example, for the first value of I_{occu} (i.e., I_{occu}(1)), the corresponding component vectors of V and V_{Δ} are in the first row of $M_{V,V_\Delta}$ (see Figure 3), so $I_{\text{occu}}(1 \mid d, \theta) = P(V^{(1)}, V_\Delta^{(1)} \mid d, \theta)$, where $V^{(1)} = M_{V,V_\Delta}(1, 1:n)$ and $V_\Delta^{(1)} = M_{V,V_\Delta}(1, n+1 : 2 \times n)$. Here (1:n) designates columns 1 to n.

In order to compute the matrix $M_{V,V_\Delta}$, we define two sub-tensors $\mathcal{A}$ and $\mathcal{B}$, extracted from the tensor data $\mathcal{X}$ for each direction θ as shown in Figure 4. The 3-mode flattening matrices of $\mathcal{A}$ and $\mathcal{B}$ are A_{3} and B_{3}, respectively. Let T denote the matrix that horizontally concatenates $\mathbf{A}_3^t$ and $\mathbf{B}_3^t$, and vertically concatenates $\mathbf{B}_3^t$ and $\mathbf{A}_3^t$, where $\mathbf{A}_3^t$ and $\mathbf{B}_3^t$ denote the transposes of A_{3} and B_{3} respectively. T can be represented as follows:

$$T = \begin{pmatrix} \mathbf{A}_3^t & \mathbf{B}_3^t \\ \mathbf{B}_3^t & \mathbf{A}_3^t \end{pmatrix}$$

This procedure allows us to compute the spatial and spectral gray level dependence for both θ and (θ + π).

Finally, the matrix $M_{V,V_\Delta}$ is obtained by keeping each distinct row of T only once, while the number of repetitions of each row is stored in an array of occurrences called I_{occu}. At the end of the process, I_{occu} is normalized with respect to the size of $M_{V,V_\Delta}$, so that each component represents the probability of occurrence of a given combination (V, V_{Δ}). The spatial and spectral gray level dependence algorithm can be given as follows:

STEP 1: For a given distance d and angle θ, extract the two sub-tensors $\mathcal{A}$ and $\mathcal{B}$ from the tensor data $\mathcal{X}$.

STEP 2: Compute the 3-mode flattening matrices A_{3}, B_{3}.

STEP 3: Build the matrix T.

STEP 4: Compute $M_{V,V_\Delta}$ by keeping only the rows of T with no repetitions.

STEP 5: Compute I_{occu}.
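The five steps can be sketched in Python as follows. This is only an illustrative sketch, not the authors' Matlab implementation: the function name `ssgldm`, the displacement arguments `dr` and `dc` (row and column offsets standing for d and θ) and the image layout are assumptions. The counts are divided here by the total number of rows of T so that the entries of I_{occu} sum to one, which is one possible reading of the normalization described above.

```python
from collections import Counter

def ssgldm(image, dr, dc):
    """Sketch of STEPs 1-5 for one displacement (dr, dc).

    image[r][c] is the spectral vector (list of n gray levels) of a pixel.
    Returns (M, I_occu): the distinct rows (V, V_delta) and their
    normalized occurrence counts.
    """
    rows, cols = len(image), len(image[0])
    pairs = []
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                v, v_delta = tuple(image[r][c]), tuple(image[r2][c2])
                # STEPs 1-3: one row of [A3^t B3^t] and one row of
                # [B3^t A3^t], so theta and theta + pi are both counted.
                pairs.append(v + v_delta)
                pairs.append(v_delta + v)
    # STEP 4: keep each distinct row once; STEP 5: count repetitions.
    counts = Counter(pairs)
    M = sorted(counts)
    total = len(pairs)  # assumed normalization: total number of rows of T
    I_occu = [counts[row] / total for row in M]
    return M, I_occu

# 2 x 2 image with 2 bands, d = 1 and theta = 0 (one pixel to the right).
img = [[[0, 1], [0, 1]],
       [[1, 1], [0, 1]]]
M, I_occu = ssgldm(img, dr=0, dc=1)
```

Each row of `M` has 2n entries, the first n being the components of V and the last n those of V_{Δ}, which mirrors the structure form described in the text.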

2.3. Proposed generalized textures features related to SSGLDM

Haralick texture features comprise 14 features, as summarized in [1]. However, the most widely used in the literature are energy, entropy, inertia, local homogeneity and correlation. In this article we propose seven new generalized texture features.

Suppose that $V^{(k)} = (i_1^{(k)}, i_2^{(k)}, \dots, i_n^{(k)})$ and $V_\Delta^{(k)} = (j_1^{(k)}, j_2^{(k)}, \dots, j_n^{(k)})$, where 1 ≤ k ≤ m, and m, with $1 \le m \le N_g^{2n}$, is the length of the vector of occurrences, N_{g} is the number of gray levels and n is the number of bands.

For convenience, $P(V^{(k)}, V_\Delta^{(k)} \mid d, \theta)$ will be written as $P(V^{(k)}, V_\Delta^{(k)})$ in the following equations. Energy:

This descriptor is also known as contrast or difference moment. It is a measure of image intensity contrast or the spatial/spectral variations present in an image to show the texture fineness.

The inertia value is high when similar spectral vectors are adjacent to each other in the input image and provides a measure of coarseness (similarity).

The correlation descriptor measures the linear dependency of spectral vectors in the co-occurrence matrix.
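Energy and entropy depend only on the probabilities themselves, so they can be computed directly from the vector of occurrences. The sketch below assumes that the generalized descriptors keep the classical Haralick form (sum of squared probabilities, and Shannon entropy with the natural logarithm); the function names and the sample vector are hypothetical.

```python
import math

def energy(i_occu):
    """Sum of squared probabilities (classical Haralick energy)."""
    return sum(p * p for p in i_occu)

def entropy(i_occu):
    """Shannon entropy in nats; zero-probability entries are skipped."""
    return -sum(p * math.log(p) for p in i_occu if p > 0)

i_occu = [0.5, 0.25, 0.25]   # hypothetical normalized occurrences
e = energy(i_occu)           # 0.375
h = entropy(i_occu)          # 1.5 * ln(2)
```

The descriptors that involve distances between spectral vectors (inertia, local homogeneity, correlation) would additionally need the rows of the component matrix and are not sketched here.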

In practice, as the number of bands increases, the joint probability $P(V^{(k)}, V_\Delta^{(k)})$ decreases. Consequently, the values of $P(V^{(k)}, V_\Delta^{(k)})$ become nonsignificant, which implies a poor overall classification [17, 32].
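To give a sense of scale, the following arithmetic uses the image dimensions reported in Section 3.1 (128 × 128 pixels, 16 bands) with N_{g} = 32, d = 1 and θ = 0°. The count of pixel pairs assumes that pairs falling outside the image border are discarded.

```latex
\begin{align*}
\text{possible pairs } (V, V_\Delta) \text{ for } n = 16:\quad & N_g^{2n} = 32^{32} = 2^{160} \approx 1.5 \times 10^{48} \\
\text{observed pairs } (\theta \text{ and } \theta + \pi):\quad & 2 \times 128 \times 127 = 32\,512 \\
\text{possible pairs for a single band } (n = 1):\quad & N_g^{2} = 32^{2} = 1024
\end{align*}
```

With 16 bands, the observed pairs can therefore cover only a vanishing fraction of the possible combinations, so most non-zero probabilities rest on very few occurrences.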

To overcome this issue, we suggest applying the proposed SSGLDM to a subset of bands in the multi-spectral image, instead of processing the whole set of bands. The subset of bands can be chosen so as to remove the information redundancy resulting from highly correlated bands.

2.4. Band selection by minimization of dependent information

In [43], the extraction of selected subsets of spectral image bands can be obtained by means of a criterion based on the minimization of the dependent information (MDI), which consists of a relation between the joint entropy and the union of the conditional entropies of the considered set of image bands. This criterion could be defined by the following expression:

where $A_{i_1}$ is a random variable representing the image band i_{1}; $A_{i_2}, \dots, A_{i_n}$ are the complementary variables of $A_{i_1}$; $H(A_1, \dots, A_n)$ is the joint entropy, which represents the total amount of joint information of the set of variables A_{1}, ..., A_{n}; and $H(A_{i_1} \mid A_{i_2}, \dots, A_{i_n})$ is the conditional entropy, which represents the amount of independent information in image band $A_{i_1}$ once the remaining image bands $A_{i_2}, \dots, A_{i_n}$ have been measured.
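The two quantities that enter the criterion can be estimated from pixel values, as in the following minimal sketch. It does not implement the MDI criterion of [43] itself; it only shows empirical joint and conditional entropies, using the chain rule H(A_k | rest) = H(all bands) − H(rest). The function names and the toy bands are hypothetical, and each band is assumed to be a flat list of quantized gray levels.

```python
import math
from collections import Counter

def joint_entropy(bands):
    """Empirical joint entropy (bits) of bands given as equal-length lists."""
    samples = list(zip(*bands))
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def conditional_entropy(bands, k):
    """H(A_k | all other bands) = H(all bands) - H(all bands except A_k)."""
    rest = bands[:k] + bands[k + 1:]
    return joint_entropy(bands) - joint_entropy(rest)

a1 = [0, 0, 1, 1]
a2 = [0, 1, 0, 1]
a3 = [0, 0, 1, 1]   # exact copy of a1, hence fully redundant

h_joint = joint_entropy([a1, a2])                  # 2 bits
h_a3 = conditional_entropy([a1, a2, a3], 2)        # 0 bits: a3 adds nothing
h_a2 = conditional_entropy([a1, a2, a3], 1)        # 1 bit of independent information
```

A band whose conditional entropy is close to zero carries almost no independent information, which is the kind of redundancy a band selection step aims to remove.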

This technique behaves as an unsupervised feature selection criterion, providing very satisfactory results with respect to classification accuracy when using the selected bands, even outperforming the supervised methods used in the comparison in most situations [43]. This performance justifies our choice of MDI [43] for selecting the most relevant bands in the second part of our experiments.

3. Experimental results

The first part of this section is concerned with multi-spectral texture characterization. For this purpose, extensive experiments are carried out on many multi-spectral images for use in prostate cancer diagnosis. The results of the proposed method (SSGLDM) are compared with those of GLCM, followed by a comparative performance study and discussion. The second part deals with the curse-of-dimensionality problem. We extract the most relevant bands of the multispectral prostate cancer images using the MDI technique proposed by Sotoca et al. [43]. Thereafter, we provide experimental results showing the impact of the bands selected with the MDI criterion on the classification accuracy of the proposed method and of GLCM.

3.1. Data set

Over the last decade, multi-spectral imagery in prostate cancer has become a very useful tool for analyzing and diagnosing pathologies. Prostate cancer is the second most common cancer in men after skin cancer, and the second leading cause of cancer death after lung cancer. The best-known method for prostate cancer diagnosis is the prostate-specific antigen (PSA) blood test. In the case of a positive diagnosis, the urologist will often advise a needle biopsy, in which a tiny piece of tissue is taken from the prostate for analysis [45]. In this analysis, different grades of malignancy correspond to different structural patterns as well as to different apparent textures. After the analysis, an experienced pathologist selects the images to reflect the different patterns in prostatic tissues. Four major groups usually have to be identified (see Figure 5).

● Stroma: STR (normal muscular tissue)

● Benign prostatic hyperplasia: BPH (a benign condition)

● Prostatic intraepithelial neoplasia: PIN (a precursor state for cancer).

● Prostatic carcinoma: PCa (abnormal tissue development corresponding to cancer).

The data set consists of 624 textured multi-spectral images of 128 by 128 pixels, with each image taken with 16 spectral channels (from 500 to 650 nm) with a 5 nm step [45]. The images were captured using a classical microscope and CCD camera. A liquid crystal tunable filter (LCTF) was inserted in the optical path between the light source and the chilled CCD camera. The LCTF has a bandwidth accuracy of 5 nm.

The captured images have been chosen to reflect different grades of malignancy in prostatic tissues and they are labeled into four classes: 176 cases of cancer (PCa), 160 cases of BPH, 144 cases of PIN and 144 cases of Stroma (STR).^{a} The prostatic sections were routinely viewed at low power (40x objective magnification) by two highly experienced independent pathologists. The pathologist initially views slides at low power, which makes it possible to locate potentially abnormal regions. Subsequent analysis of these regions at high power enables the histological grading of these areas.

The X-Y resolution depends on the magnification chosen, which is usually high for visualization purposes. In this study, the images of 128 by 128 pixels were captured at an X-Y resolution of 0.12 µm/pixel.

Usually, when a biopsy is submitted for analysis, it is very rare that the pathologist finds that the sample is perfectly normal. There must be at least some benign condition that would explain the high levels of PSA that usually justify needle biopsy. So the main issue is to identify benign from malignant and premalignant conditions.

For this purpose, Figure 6 summarizes the different steps of the classification process used in this article. The proposed methodology addresses a very important task in prostate cancer diagnosis and can be viewed as a computer-aided system that automatically classifies pathological prostate images, since each image is assigned to an appropriate class of prostatic tissue.

3.2. Features extraction (Step 1)

The goal of this step of analysis is to characterize the four different groups of multi-spectral images by extracting features in order to make classification possible.

All generalized and traditional features mentioned in Section 2.3 have been used for our experiments, and the best performances have been achieved by choosing d = 1 and the four principal directions (θ = 0°, θ = 45°, θ = 90°, θ = 135°).

We note that the problem of selecting the spatial relationships (i.e., d and θ) that define the co-occurrence matrices is addressed in [46], so that these matrices reflect the structure of the underlying texture as fully as possible.

3.3. Features selection for prediction procedure (Step 2)

A feature selection technique is applied to reduce the number of features before the classification task. Irrelevant features may have negative effects on a prediction procedure. Moreover, the computational complexity of a classification algorithm may suffer from the curse of dimensionality caused by a large number of features. Features can be selected in many different ways. One scheme is to select the features that correlate most strongly with the classification variable.

This has been called maximum-relevance selection. Many heuristic algorithms can be used, such as the sequential forward, backward, or floating selections [42].

On the other hand, features can be selected to be mutually far away from each other, while they still have high correlation to the classification variable. This scheme, termed as minimum-redundancy-maximum-relevance selection [42], has been found to be more powerful than the maximum relevance selection. This justifies the use of this technique in our experiments.

We note that this technique is not applied when the number of features is small.

3.4. Classification process (Step 3)

A classification process usually involves training and testing data which consist of some data instances. Each instance in the training set contains one "target value" (class labels) and several "attributes" (features).

3.4.1. Using SVM for classification

In [47], several neural networks were compared to the SVM for the classification of hyper-spectral data. The robustness of SVMs was demonstrated and the best results were obtained using a non-linear SVM. In addition, [47] studied Radial Basis Functions (RBF) classifiers and SVM and they noted the favorable behavior of the SVM, from both a theoretical and a practical point of view (see Appendix).

Visually, the images illustrated in Figure 5 are very similar, so the SVM is a suitable classifier for our application.

3.5. Results and discussion

The classification performance was assessed using 3-fold cross-validation. The data were randomly split into three sets of roughly equal size. Splitting was carried out such that the proportion of samples per class was roughly equal across the sets. Each run of the 3-fold cross-validation algorithm consisted of a classifier designed on two data subsets (training) while testing was performed on the remaining subset; this was repeated three times. The SVM optimization was implemented using the LIBSVM library through its Matlab interface [48]. The penalty term C was fixed to 200, and the parameter σ^{2} of the Gaussian kernel was selected by fivefold cross-validation [48]: fivefold cross-validation was applied to the training data in order to find the value of σ^{2} that gives the highest classification accuracy rate.
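The stratified split described above can be sketched in plain Python. This is an illustrative sketch rather than the Matlab code used in the experiments: the function name `stratified_folds` and the round-robin dealing scheme are assumptions, and the classifier training itself (LIBSVM in the article) is left out.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=3, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin into the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

# Class sizes taken from the data set description (Section 3.1).
labels = ["PCa"] * 176 + ["BPH"] * 160 + ["PIN"] * 144 + ["STR"] * 144
folds = stratified_folds(labels, k=3)
for test_fold in folds:
    train = [i for fold in folds if fold is not test_fold for i in fold]
    # A classifier would be trained on `train` and evaluated on `test_fold`.
```

The same function could be reused with k = 5 on the training indices for the inner cross-validation that selects σ^{2}.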

First, we compared the performance of SSGLDM with that of GLCM, using the overall classification rate as the comparison criterion, with reference to a set of images manually classified by experts. As in the GLCM, where the co-occurrence matrix is computed for each band, we can apply SSGLDM to each subset of bands in the image. Let S be a subset of α bands in the image (α ≤ N_{b}), where N_{b} is the number of image bands. For example, if S = (B_{l}, B_{l+1}) is a subset of two bands, the SSGLDM can be applied N_{b} − 1 times in the image by varying l from 1 to N_{b} − 1. Thus, GLCM can be seen as a special case of SSGLDM in which S contains only one band.

The number of features is computed for each band in the GLCM, so the total number of features is 7 × (4 directions) × 16 bands = 448. In the case of SSGLDM, however, the number of features depends on the choice of the subset of bands (S). For example, if S contains two bands, the number of features is 7 × (4 directions) × (N_{b} − 1) subsets = 420, where N_{b} = 16. The minimum redundancy-maximum relevance selection algorithm [42] was used to ensure good classification accuracy by selecting the 40 best features from all the features generated by SSGLDM.
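The feature counts above follow from a simple product, sketched below. The function name `feature_count` is hypothetical, and the assumption that a subset of α consecutive bands slides over the N_{b} bands (giving N_{b} − α + 1 subsets) generalizes the two cases stated in the text (α = 1 and α = 2); it is not confirmed for other subset sizes.

```python
def feature_count(n_bands, subset_size, n_features=7, n_directions=4):
    """Number of texture features when a window of `subset_size`
    consecutive bands slides over `n_bands` bands (assumed scheme)."""
    n_subsets = n_bands - subset_size + 1
    return n_features * n_directions * n_subsets

glcm_total = feature_count(16, 1)    # 448, one band at a time (GLCM)
pairs_total = feature_count(16, 2)   # 420, subsets of two bands
cube_total = feature_count(16, 16)   # 28, the whole set of bands at once
```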

Table 1 shows the classification results obtained by using SSGLDM for different subsets of bands (S), for both 32 and 64 gray-level quantization.

It can be clearly observed from Table 1 that the SSGLDM yields better results than the GLCM when S contains fewer than five bands. The best results are obtained with S = 3 bands, for both 32 and 64 gray levels. With 16 bands, however, we obtained a classification accuracy rate of around 93%. This may derive from the high correlation of the data. Moreover, as the set of bands increases, the joint probability P (V, V_{Δ}|d, θ) decreases. Consequently, the values of P (V, V_{Δ}|d, θ) become nonsignificant, which implies a poor overall classification.

To illustrate the classification of the different classes of prostate cancer, we show the confusion matrices obtained by using SSGLDM for different choices of S. Tables 2a,b depict the results using GLCM (S = 1 band), whereas Tables 2c,d give the corresponding results using SSGLDM with S = 16 bands. As can be seen from these tables, the use of SSGLDM with S = 16 yields the worst results in terms of global accuracy rate. However, Tables 2e,f show the improvements achieved when using SSGLDM with S = 3 bands. Note that in all cases, the BPH and Stroma classes present the highest classification error rate because of the similarities between the two classes; however, the use of SSGLDM with S = 3 bands significantly reduces the error rate in these classes.

In addition, computation time tests were performed for both SSGLDM and GLCM. For this purpose, we used a PC with an Intel® Core™ 2 CPU at 2.66 GHz and 3.58 GB of RAM. The two methods were implemented using Matlab 7.1. Times are given for the computation of the co-occurrence matrices and all the features of each method (SSGLDM and GLCM) on the 128 × 128 images mentioned in Section 3.1. As shown in Table 3, the execution time using SSGLDM with S = 2 is almost the same as that of GLCM.

Note also that the time keeps growing as the set of bands increases. However, it decreases for S = 16, because the SSGLDM is then applied only once to the whole set of spectral bands. Thus, one can conclude that the use of SSGLDM with S = 2 is quite a good compromise between classification results and computation time.

3.5.1. Receiver operating characteristic (ROC)

In order to plot ROC curves, the classifier should be tested using different parameters, resulting in different values of the false alarm rate (false positive rate, FPR) and the sensitivity (true positive rate, TPR).

Let π_{i} be the prior probability of class i. In the case of uniform prior probabilities, π_{i} can be written as $\forall i,\ \pi_i = \frac{1}{N}$, where N is the number of classes.

Let us suppose a two-class prediction problem (binary classification), in which the outcomes are labeled either as positive (p) or negative (n) class. This is achieved by considering BPH as the negative diagnosis, while PIN and PCa together form the positive diagnosis outcome (the classification of Stroma is relatively simple because of its homogeneous nature at low resolution) [49, 50].

Since there are two classes, prior probabilities are linked by the relation:

$$\pi_{\text{positive}} = 1 - \pi_{\text{negative}} \qquad (10)$$

ROC curves can then be plotted by varying the value of the prior probability π_{positive}.
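The text does not state how the prior enters the SVM decision, so the following sketch only illustrates the general principle with an assumed Bayes-style rule: a sample is declared positive when π_{positive} times its positive-class likelihood exceeds π_{negative} times its negative-class likelihood. The function name `roc_points`, the decision rule and the toy likelihood values are all hypothetical.

```python
def roc_points(samples, priors):
    """samples: list of (lik_pos, lik_neg, is_positive) triples.
    Returns one (FPR, TPR) point per value of the prior pi_positive."""
    n_pos = sum(1 for _, _, y in samples if y)
    n_neg = len(samples) - n_pos
    points = []
    for pi_pos in priors:
        tp = fp = 0
        for lik_pos, lik_neg, y in samples:
            # Assumed decision rule, with pi_negative = 1 - pi_positive.
            predicted_pos = pi_pos * lik_pos > (1 - pi_pos) * lik_neg
            if predicted_pos and y:
                tp += 1
            elif predicted_pos:
                fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

# Toy data: two positive and two negative samples.
samples = [(0.9, 0.1, True), (0.6, 0.4, True),
           (0.7, 0.3, False), (0.2, 0.8, False)]
curve = roc_points(samples, priors=[0.0, 0.5, 1.0])
```

Sweeping the prior from 0 to 1 moves the operating point from (0, 0), where nothing is declared positive, to (1, 1), where everything is.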

Figure 7 shows a comparative study of the ROC curves using GLCM and SSGLDM, for both 32 and 64 gray levels. Dark, blue and red curves indicate the ROC curves obtained using SSGLDM (S = 3 bands), GLCM and SSGLDM (S = 16 bands) respectively. These results indicate that, for all possible values of the prior probability π_{positive}, the SSGLDM features with S = 3 perform better than those derived from GLCM, for both 32 and 64 gray levels. The results also demonstrate that the proposed technique improves the ability to distinguish cancerous prostate tissue from healthy tissue.

However, the major issue arising from the proposed method is related to the high dimension of the data and its high correlation. This is clearly seen in Tables 1 and 2, where classification accuracy is degraded when using SSGLDM with 16 bands. Thus, the main questions to be answered are: does the band selection procedure improve the overall classification rate when using SSGLDM? How can one obtain the optimal number of bands that maximizes classification accuracy and minimizes computational requirements?

In the remainder of this section, we show the influence of band selection with the MDI criterion (Section 2.4) on the overall classification rate obtained with the SSGLDM.

3.6. SSGLDM using band selection

In order to exploit inter-band correlation for reducing the multi-spectral band representation, the unsupervised band selection technique of Sotoca et al. [43] has been used to extract the most relevant bands of the multispectral prostate cancer database.

All classification rates shown in this section were computed using 3-fold cross-validation, as described above.

Note that the same classification procedure as in Section 3.4 was applied to each data cube constructed from the selected bands. However, in the case of SSGLDM, the feature selection technique [42] was not used because of the small number of features (Figure 6).

To see the impact of the bands selected with the MDI criterion on classification accuracy, we tested the SSGLDM with a variable number of selected bands of the multispectral images. The plot of Figure 8 shows the results when using 2, 3, 4, 5, 6 and 16 selected bands as input data. A clear improvement of the classification accuracy can be observed when using five selected bands. However, the SSGLDM yields its worst results when more than six selected bands are used. Thus, processing a large number of multispectral bands can result in higher classification inaccuracy than processing a subset of relevant bands without redundancy.

On the other hand, the classical GLCM provides lower classification accuracy than SSGLDM. However, when more selected bands are used, GLCM performs better. This is mainly due to the use of minimum redundancy-maximum relevance selection [42], which reduces the number of features and ensures good classification accuracy.

To gain insight into the classification of the different classes of prostate cancer, the confusion matrices of both SSGLDM and GLCM using five selected bands are also given in Table 4. As can be seen from this table, applying the band selection procedure before SSGLDM significantly reduces the error rate in the classes and ensures good discrimination between the different types of tissue.

Figure 9 illustrates the ROC curves obtained with the two methods, GLCM and SSGLDM, using five selected bands as input data. This clearly demonstrates that the proposed SSGLDM results in an improved ability to distinguish cancerous prostate tissue from healthy tissue.

Finally, Table 5 summarizes the processing time for both GLCM and SSGLDM (implemented using Matlab 7.1) versus the number of selected bands. This experiment was conducted on an Intel® Core™ 2 CPU at 2.66 GHz with 3.58 GB of RAM. As mentioned before, times are given for the computation of the co-occurrence matrices and all the features of each method.

It is clear from the table that the use of the band selection technique significantly reduces the computation time. Moreover, comparing the results of the two methods, SSGLDM ran much faster than GLCM, mainly because the SSGLDM features are extracted from the whole data cube of selected bands, unlike the GLCM features, which are computed from each selected band.

4. Conclusion

This article describes a new method that generalizes the concept of the spatial gray level dependence method by assuming the presence of joint texture information between spectral bands. Two ways have been suggested to implement the proposed spatial and spectral gray level dependence method (SSGLDM): (a) applying SSGLDM to each subset of bands in the multi-spectral image; and (b) connecting band selection and SSGLDM by using the MDI criterion before applying SSGLDM. Extensive experiments have been carried out on many multi-spectral images for use in prostate cancer diagnosis, and quantitative results showed the efficiency of this method compared to the GLCM. SSGLDM has also provided better performance in terms of classification accuracy and computational complexity. Finally, this area of research leaves many issues open. Open problems that can be investigated in the future include the following:

(1) The new texture characterization method described in this article focuses on second-order statistics. Therefore, the way forward could be to investigate alternative methods using higher-order statistics.

(2) In this work, the most time-consuming task was, by far, the computation of the generalized co-occurrence matrix, whose cost mainly depends on the distances between spectral vector pairs and on the large number of spectral bands. The nature of the calculation makes it suitable for parallel processing because the same calculations are performed on successive image blocks.

(3) The generalized multi-band texture method proposed in this article addresses the problem of characterizing multi-band texture images. Multi-spectral medical data sets were used to evaluate the proposed algorithm. In future work, the proposed algorithms could be applied to other applications such as hyper-spectral satellite imagery or skin cancer detection.
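The block-wise structure noted in item (2) can be sketched in a few lines. The example below is not the SSGLDM computation; it is a minimal single-band co-occurrence count, with hypothetical function names, in which the reference rows are split into blocks, each block is counted independently, and the partial counts are summed.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def cooccurrence_rows(image, row_start, row_stop, dx=1, dy=0):
    """Count gray-level pairs (image[r][c], image[r+dy][c+dx]) whose
    reference pixel lies in rows [row_start, row_stop)."""
    height, width = len(image), len(image[0])
    counts = Counter()
    for r in range(row_start, row_stop):
        for c in range(width):
            r2, c2 = r + dy, c + dx
            # Neighbours may fall in another block; only the image border is excluded.
            if 0 <= r2 < height and 0 <= c2 < width:
                counts[(image[r][c], image[r2][c2])] += 1
    return counts


def cooccurrence_parallel(image, n_blocks=2, dx=1, dy=0):
    """Split the reference rows into blocks, count each block in a worker,
    and sum the partial counts."""
    height = len(image)
    step = max(1, -(-height // n_blocks))  # ceiling division
    bounds = [(s, min(s + step, height)) for s in range(0, height, step)]
    total = Counter()
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(cooccurrence_rows, image, a, b, dx, dy)
                   for a, b in bounds]
        for future in futures:
            total.update(future.result())  # Counter.update adds counts
    return total


# Small 4 x 4 test image with gray levels 0..3 (illustrative data only).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
serial = cooccurrence_rows(image, 0, len(image))
parallel = cooccurrence_parallel(image, n_blocks=2)
print(parallel == serial)
```

Because every block reads the full image, pairs that straddle a block boundary are counted exactly once, so the summed result equals the serial one. Threads are used here only to keep the sketch short; in CPython they give no real speed-up for pure-Python loops, so a process pool or array-based code would be needed in practice.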

Endnote

^{a}The data set used in this study was provided by the Pathology department team at Queen's University of Belfast under the direction of Prof. Hamilton.

Appendix

Support vector machines (SVM)

The aim of SVM is to produce a model (based on the training data) that predicts the target values of data instances in the testing set, for which only the attributes are given. Given a labeled training data set {(x_{1}, y_{1}), ..., (x_{n}, y_{n})}, where x_{i} ∈ ℝ^{n} and y_{i} ∈ {−1, 1}, the SVM [47, 51] requires the solution of the following optimization problem:
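In its standard soft-margin form, which is assumed here because it matches the symbols w, b, C, ξ_{i}, and φ defined in the next paragraph, this problem reads:

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\, w^{T} w + C \sum_{i=1}^{n} \xi_{i}
\quad \text{subject to} \quad
y_{i}\left( w^{T} \varphi(x_{i}) + b \right) \ge 1 - \xi_{i},
\qquad \xi_{i} \ge 0, \quad i = 1, \dots, n
```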

In the case of a nonlinear classification of samples, the training vectors x_{i} are mapped into a higher (possibly infinite) dimensional space by the function φ; w is the vector of hyperplane coefficients (orientation) and b is a bias term. The regularization parameter C controls the generalization capability of the classifier and must be selected by the user, and ξ_{i} are positive slack variables that account for permitted errors. The decision function is found by solving the convex optimization problem
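The usual dual form of this problem, given as a sketch under the assumption that the standard soft-margin formulation is intended, is:

```latex
\max_{\alpha}\ \sum_{i=1}^{n} \alpha_{i}
- \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
\alpha_{i}\, \alpha_{j}\, y_{i}\, y_{j}\,
\left\langle \varphi(x_{i}), \varphi(x_{j}) \right\rangle
\quad \text{subject to} \quad
0 \le \alpha_{i} \le C, \qquad \sum_{i=1}^{n} \alpha_{i}\, y_{i} = 0
```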

where α_{i} are the Lagrange coefficients. It is worth noting that all φ mappings used in SVM learning occur in the form of inner products. This allows the inner product to be replaced by a kernel function, such as the classical Gaussian kernel K_{σ}, given by this formula:
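A common parameterization of the Gaussian kernel, assumed here because it is consistent with σ acting as a width, is:

```latex
K_{\sigma}(x, z) = \exp\left( -\frac{\lVert x - z \rVert^{2}}{2\sigma^{2}} \right)
```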

where the norm is the Euclidean norm and σ ∈ ℝ^{+} tunes the flexibility of the kernel. A sample x is classified by determining on which side of the hyperplane it lies.
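With the kernel substituted for the inner product, the standard decision rule takes the sign of the kernel expansion. It is stated here as an assumption, and ŷ is a symbol introduced only to denote the predicted label:

```latex
\hat{y} = \mathrm{sgn}\left( \sum_{i=1}^{n} \alpha_{i}\, y_{i}\, K_{\sigma}(x_{i}, x) + b \right)
```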

SVMs are largely a nonparametric method, yet some parameters need to be tuned before the optimization. In the case of the Gaussian kernel, there are two parameters: C, which is the penalty term, and σ, which is the width of the exponential.
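To make the roles of the kernel and of σ concrete, the following minimal sketch evaluates a Gaussian kernel and a kernel decision rule in plain Python. It assumes the common form K_σ(x, z) = exp(−‖x − z‖² / (2σ²)); the function names, the two support vectors, and the coefficient values are hypothetical and are not taken from the experiments. C does not appear because it only acts during training, which this sketch omits.

```python
import math


def gaussian_kernel(x, z, sigma):
    """Assumed form: exp(-||x - z||^2 / (2 * sigma^2)), Euclidean norm."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svm_decide(x, support_vectors, labels, alphas, bias, sigma):
    """Return +1 or -1 from the sign of sum_i alpha_i * y_i * K(x_i, x) + bias."""
    score = bias + sum(
        alpha * y * gaussian_kernel(sv, x, sigma)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )
    return 1 if score >= 0 else -1


# Hypothetical trained model: one support vector per class.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]
alphas = [1.0, 1.0]
bias = 0.0

print(svm_decide((0.1, 0.2), support_vectors, labels, alphas, bias, sigma=1.0))
print(svm_decide((1.9, 2.1), support_vectors, labels, alphas, bias, sigma=1.0))

# A larger sigma makes the kernel flatter: distant points look more similar.
narrow = gaussian_kernel((0.0, 0.0), (1.0, 0.0), sigma=0.5)
wide = gaussian_kernel((0.0, 0.0), (1.0, 0.0), sigma=2.0)
print(narrow < wide)
```

In practice C and σ are usually chosen together, for example by cross-validation over a grid of candidate values, since a small σ with a large C tends to overfit the training samples.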

References

Haralick RM, Shanmugam K, Dinstein IH: Textural features for image classification. IEEE Trans Syst Man Cybern 1973, SMC-3(6):610-621.

Soh LK, Tsatsoulis C: Texture analysis of SAR sea ice imagery using gray level co-occurrence matrices. IEEE Trans Geosci Remote Sens 1999, 37(2):780-795.

Do M, Vetterli M: Wavelet-based texture retrieval using generalized Gaussian density and Kullback-Leibler distance. IEEE Trans Image Process 2002, 11(2):146-158.

Hazel G: Multivariate Gaussian MRF for multispectral scene segmentation and anomaly detection. IEEE Trans Geosci Remote Sens 2000, 38(2):1199-1211.

Derrode S, Mercier G, LeCaillec JM, Garello R: Estimation of sea ice SAR clutter statistics from Pearson's system of distributions. In International Geoscience and Remote Sensing Symposium (IGARSS'01). Volume 1. Sydney, Australia; 2001:190-192.

Benediktsson JA, Pesaresi M, Arnason K: Classification and feature extraction for remote sensing images from urban areas based on morphological transformations. IEEE Trans Geosci Remote Sens 2003, 41(9):1940-1949. 10.1109/TGRS.2003.814625

Kondepudy R, Healey G: Use of invariants for recognition of three-dimensional color textures. J Opt Soc Am A Opt Image Sci 1994, 11(11):3037-3049. 10.1364/JOSAA.11.003037

Rajadell O, García-Sevilla P, Pla F: Textural features for hyperspectral pixel classification. In Proceedings of the 4th Iberian Conference on Pattern Recognition and Image Analysis. Póvoa de Varzim, Portugal; 2009:208-216.

Rellier G, Descombes X, Falzon F, Zerubia J: Texture feature analysis using a Gauss-Markov model in hyperspectral image classification. IEEE Trans Geosci Remote Sens 2004, 42(7):1543-1551.

Lepisto L, Kunttu I, Autio J, Visa A: Classification method for colored natural textures using Gabor filtering. In 12th International Conference on Image Analysis and Processing. Mantova, Italy; 2003:397-401.

Hauta-Kasari M, Parkkinen J, Jaaskelainen T, Lenz R: Generalized cooccurrence matrix for multispectral texture analysis. In Proceedings of the 13th International Conference on Pattern Recognition, ICPR'96. Volume 2. Vienna, Austria; 1996:785-789.

Khelifi R, Adel M, Bourennane S: Texture classification for multi-spectral images using spatial and spectral gray level differences. In IEEE International Conference on Image Processing Theory, Tools and Applications, IPTA'10. Paris, France; 2010:330-333.

Khelifi R, Adel M, Bourennane S: Generalized gray level dependence method for prostate cancer classification. In IEEE International Workshop on Systems, Signal Processing and their Applications (WOSSPA). Tipaza, Algeria; 2011:295-298.

Lepisto L, Kunttu I, Autio J, Visa A: Rock image classification using non-homogeneous textures and spectral imaging. In International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision. Plzen, Czech Republic; 2003:82-86.

Tsai F, Chang CK, Rau JY, Lin TH, Liu GR: 3D computation of gray level co-occurrence in hyperspectral image cubes. Proceedings of the 6th international conference on Energy minimization methods in computer vision and pattern recognition 2007, 4679: 429-440. 10.1007/978-3-540-74198-5_33

Fauvel M, Benediktsson JA, Chanussot J, Sveinsson JR: Spectral and spatial classification of hyperspectral data using SVMs and morphological profiles. IEEE Trans Geosci Remote Sens 2008, 46(11):3804-3814.

Palmason JA, Benediktsson JA, Sveinsson JR, Chanussot J: Classification of hyperspectral data from urban areas using morphological preprocessing and independent component analysis. In Geoscience and Remote Sensing Symposium. Seoul, Korea; 2005:176-179.

Plaza A, Martinez P, Perez R, Plaza J: Spatial/spectral endmember extraction by multidimensional morphological operations. IEEE Trans Geosci Remote Sens 2002, 40(9):2025-2041. 10.1109/TGRS.2002.802494

Puissant A, Hirsch J, Weber C: The utility of texture analysis to improve per-pixel classification for high to very high spatial resolution imagery. Int J Remote Sens 2005, 26(4):733-745. 10.1080/01431160512331316838

Clausi D: An analysis of co-occurrence texture statistics as a function of grey level quantization. Can J Remote Sens 2002, 28(1):45-62. 10.5589/m02-004

Kiema J: Texture analysis and data fusion in the extraction of topographic objects from satellite imagery. Int J Remote Sens 2002, 23(4):767-776. 10.1080/01431160010026005

Bau T, Healey G: Rotation and scale invariant hyperspectral classification using 3D Gabor filters. Proc SPIE Int Soc Opt Eng 2009, 7334(15):73340B-73340B-13.

Jain A, Healey G: A multiscale representation including opponent color features for texture recognition. IEEE Trans Image Process 1998, 7(1):124-128. 10.1109/83.650858

Shi M, Healey G: Hyperspectral texture recognition using a multiscale opponent representation. IEEE Trans Geosci Remote Sens 2003, 41(5):1090-1095. 10.1109/TGRS.2003.811076

Dubuisson-Jolly MP, Gupta A: Color and texture fusion: application to aerial image segmentation and GIS updating. Image Vis Comput 2000, 18(10):823-832. 10.1016/S0262-8856(99)00050-5

Khelifi R, Adel M, Bourennane S: Spatial and spectral dependence co-occurrence method for multi-spectral image texture classification. In IEEE International Conference on Image Processing. Hong Kong, China; 2010:917-9200.

Chang CI, Du Q, Sun TL, Althouse LG: A joint band prioritization and band-decorrelation approach to band selection for Hyperspectral image classification. IEEE Trans Geosci Remote Sens 1999, 37(6):2631-2641. 10.1109/36.803411

Du H, Qi H, Wang X, Snyder WE: Band selection using independent component analysis for Hyperspectral image processing. In Proceedings of 32nd Applied Imagery Pattern Recognition Workshop. Washington DC; 2003:93-98.

Sotoca JM, Pla F, Klaren AC: Unsupervised band selection for Multispectral images using information theory. In Proceedings of International Conference on Pattern Recognition. Cambridge, UK; 2004:510-513.

Wang H, Angelopoulou E: Sensor band selection for multispectral imaging via average normalized information. J Real-Time Image Process 2006, 1(2):109-121. 10.1007/s11554-006-0014-9

Peng H, Long F, Ding C: Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell 2005, 27(8):1226-1238.

Sotoca JM, Pla F, Sanchez JS: Band selection in multispectral images by minimization of dependent information. IEEE Trans Syst Man Cybern Part C Appl Rev 2007, 37(2):258-267.

Letexier D, Bourennane S, Talon JB: Nonorthogonal tensor matricization for hyperspectral image filtering. IEEE Geosci Remote Sens Lett 2008, 5(1):3-7.

Bouatmane S, Roula MA, Bouridane A, Al-Maadeed S: Round-Robin sequential forward selection algorithm for prostate cancer classification and diagnosis using multispectral imagery. In J Mac Vis Appl. Volume 1. Springer Verlag; 2010:1-14.

Roula MA: Machine vision and texture analysis for the automated identification of tissue patterns in prostatic neoplasia. 2004.

Letexier D, Bourennane S: Multidimensional wiener filtering using fourth order statistics of hyperspectral images. In IEEE International Conference on Acoustics, Speech and Signal Processing. Las Vegas, Nevada, USA; 2008:917-920.

I would like to acknowledge the contribution of Dr. MA Roula and the Pathology department team at Queen's University of Belfast, under the direction of Prof. Hamilton, for kindly providing the imaging data used in this study.

Author information

Authors and Affiliations

Groupe GSM, Institut Fresnel, Ecole Centrale Marseille D. U. de Saint Jérôme Av. Escadrille Normandie, 13397, Marseille, France

Open Access
This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Khelifi, R., Adel, M. & Bourennane, S. Multispectral texture characterization: application to computer aided diagnosis on prostatic tissue images. EURASIP J. Adv. Signal Process. 2012, 118 (2012). https://doi.org/10.1186/1687-6180-2012-118