Improvement for detection of microcalcifications through clustering algorithms and artificial neural networks

Quintanilla-Domínguez, Joel; Ojeda-Magaña, Benjamín; Marcano-Cedeño, Alexis; Cortina-Januchs, María G; Vega-Corona, Antonio; Andina, Diego

doi:10.1186/1687-6180-2011-91

Research
Open access
Published: 24 October 2011

Joel Quintanilla-Domínguez^1,3, Benjamín Ojeda-Magaña^1,2, Alexis Marcano-Cedeño^1, María G Cortina-Januchs^1,3, Antonio Vega-Corona^3 & Diego Andina^1

EURASIP Journal on Advances in Signal Processing volume 2011, Article number: 91 (2011)

Abstract

A new method for detecting microcalcifications in regions of interest (ROIs) extracted from digitized mammograms is proposed. The top-hat transform is a technique based on mathematical morphology operations and, in this paper, is used to perform contrast enhancement of the microcalcifications. To improve microcalcification detection, a novel image sub-segmentation approach based on the possibilistic fuzzy c-means algorithm is used. From the original ROIs, window-based features, such as the mean and standard deviation, were extracted; these features were used as an input vector in a classifier. The classifier is based on an artificial neural network to identify patterns belonging to microcalcifications and healthy tissue. Our results show that the proposed method is a good alternative for automatically detecting microcalcifications, which is an important stage in early breast cancer detection.

1 Introduction

Breast cancer is one of the most serious types of cancer that affects women around the world. It is also one of the leading causes of mortality in middle-aged and elderly women. The International Agency for Research on Cancer (IARC) estimates that more than 1 million cases of breast cancer occur worldwide each year, with some 580,000 cases occurring in developed countries and the remainder in developing countries. The risk of a woman developing breast cancer during her lifetime is approximately 11% [1]. Early detection of breast cancer is of vital importance to successful treatment, with the main goal of increasing the probability of survival for patients. Currently, the most reliable and practical method for early detection and screening of breast cancer is mammography. Microcalcifications (MCs) can be an important early sign of breast cancer; they appear as bright spots of calcium deposits. Individual MCs are sometimes difficult to detect because of the surrounding breast tissue and variations in shape, orientation, brightness and diameter [2]. MCs are potential primary indicators of malignant types of breast cancer. Therefore, their detection can be important in preventing and treating the disease. However, it is still difficult to detect all MCs in mammograms because of the poor contrast against the tissue that surrounds them.

Many methodologies have been presented by different authors to detect the presence of MCs in mammograms. These methodologies involve image processing techniques, pattern recognition methods and artificial intelligence approaches. Vega-Corona et al. [3] proposed a method for detecting MCs in digitized mammograms. The method consists of image enhancement by adaptive histogram equalization to improve the visibility of MCs with respect to the background, processing by multiscale wavelets and gray-level statistical techniques for feature extraction, clustering by the k-means algorithm for MC detection and, finally, using feature selection and a classifier based on a general regression neural network (GRNN) and multilayer perceptron (MLP) to classify MCs. Papadopoulos et al. [4] compared five image enhancement algorithms for improving MC cluster detection in mammography. Halkiotis et al. [5] proposed mathematical morphology for MC extraction from a non-uniform background; in this scheme, a set of features is extracted from original mammograms to test two classifiers based on artificial neural networks: an MLP and a radial basis function (RBF) neural network. Fu et al. [6] proposed a method based on two stages. The purpose of the first stage is to locate the suspected MCs; this stage is based on mathematical morphology and border detection to segment the MCs. The second stage is based on feature extraction and selection from the MCs located in the first stage; in the final part of this latter stage, these features are used as an input vector to test two classifiers based on a GRNN and support vector machine (SVM).

In this paper, a method for detecting MCs in the regions of interest (ROIs) extracted from digitized mammograms is presented. The main purpose of this method is to provide an automatic MC detection system that can help radiologists to improve the diagnosis of breast cancer at an early stage. The method is based on image processing, pattern recognition and artificial intelligence techniques. The stages of the method are as follows: image enhancement based on mathematical morphology operations; a novel image sub-segmentation approach based on the possibilistic fuzzy c-means (PFCM) algorithm, which is compared with image segmentation by the k-means algorithm; feature extraction based on window-based features such as the mean and standard deviation; and, finally, the use of a classifier based on an artificial neural network (ANN) to automatically detect MCs. Figure 1 shows a block diagram of the proposed method.

2 ROI image enhancement

Over the past several years, methodologies have been developed for the detection and/or classification of MCs, but the interpretation of MCs continues to be a difficult task mainly because of their fuzzy nature, low contrast and low distinguishability from their surroundings. The difficulty of MC detection depends on factors such as size, shape and distribution with respect to MC morphology. Another important factor that also makes MC detection difficult is the fact that MCs are often located across non-homogeneous backgrounds, and owing to their low contrast against the background, their intensity may be similar to that of noise or other structures [7, 8]. Therefore, in this paper, it is considered important to apply image enhancement.

Mathematical morphology is a discipline within the field of image processing that involves the structural analysis of images. The geometrical structure of an image is determined by locally comparing it with a predefined elementary set called a structuring element (SE). Image processing using morphological transformations is a process of information removal based on size and shape. In this process, irrelevant image content is selectively eliminated; thus, essential image features can be enhanced. Morphological operations are based on the relationships between two sets: an input image, I, and a processing operator, the SE, which is usually much smaller than the input image. By selecting the shape and size of a structuring element, different results can be obtained in the output image. The fundamental morphological operations are erosion and dilation.

The contrast can be defined as the difference in intensity between an image structure and its background. By combining morphological operations, several image processing tasks can be performed; however, in this work, we focus on those morphological operations that achieve contrast enhancement. In [8], a contrast enhancement technique using mathematical morphology is presented, called morphological contrast enhancement. Morphological contrast enhancement is based on morphological operations known as top-hat and bottom-hat transforms, which were proposed in [9]. A top-hat is a residual filter that preserves those features in an image that can fit within the structuring element and removes those that cannot; in other words, the top-hat transform is used to segment objects that differ in brightness from the surrounding background in images with uneven background intensity. The top-hat transform is defined by the following equation:

I_{T} (x, y) = I (x, y) - [(I (x, y) \ominus SE) \oplus SE]

(1)

where I(x, y) is the input image, I_T(x, y) is the transformed image, SE is the structuring element, ⊖ represents the morphological erosion operation, ⊕ represents the morphological dilation operation, and - represents the image subtraction operation. [(I(x, y) ⊖ SE) ⊕ SE] is also known as the morphological opening operation. In previous works such as [8, 10], this technique was used to obtain satisfactory results in MC detection.
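As an illustration of Equation 1, the following is a minimal NumPy sketch of the top-hat transform with a flat, disk-shaped SE. The function names (`disk_se`, `erode`, `dilate`, `top_hat`) and the border handling (padding with +∞ for erosion and -∞ for dilation, so that only in-image pixels are compared) are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def disk_se(size):
    """Flat disk-shaped SE inscribed in a size x size window (size must be odd)."""
    r = size // 2
    y, x = np.ogrid[-r:r + 1, -r:r + 1]
    return x * x + y * y <= r * r

def _flat_filter(image, se, reduce_fn, pad_value):
    """Min or max of the pixels covered by the SE, centred on every pixel."""
    rh, rw = se.shape[0] // 2, se.shape[1] // 2
    padded = np.pad(np.asarray(image, dtype=float), ((rh, rh), (rw, rw)),
                    mode="constant", constant_values=pad_value)
    windows = np.lib.stride_tricks.sliding_window_view(padded, se.shape)
    # Boolean indexing keeps only the window pixels that lie under the SE.
    return reduce_fn(windows[..., se], axis=-1)

def erode(image, se):
    return _flat_filter(image, se, np.min, np.inf)

def dilate(image, se):
    # The disk is symmetric, so reflecting the SE is not needed here.
    return _flat_filter(image, se, np.max, -np.inf)

def top_hat(image, se):
    """Equation 1: the image minus its morphological opening."""
    opening = dilate(erode(image, se), se)
    return np.asarray(image, dtype=float) - opening

# Toy example: one bright pixel on a flat background.
roi = np.full((7, 7), 10.0)
roi[3, 3] = 50.0
enhanced = top_hat(roi, disk_se(3))
```

In this toy case the opening removes the isolated bright pixel, so the top-hat output keeps only that pixel's excess over the background, which is the behaviour the contrast enhancement relies on.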

3 Image segmentation by partitional clustering algorithms

Image segmentation is an important task in the field of image processing and computer vision and involves the identification of objects or regions with the same features in an image. The aim of image segmentation is to divide an image into non-overlapping subregions that are homogeneous with respect to some features such as gray-level intensity or texture. The level to which the subdivision is carried out depends on the problem being solved [11].

Depending on the specific application, several methods based on different principles have been used for image segmentation, such as histogram thresholding [12, 13], edge detection [14, 15], region growing [16–18], fractal models [19–22], ANNs [23], swarm-based algorithms [24] and clustering techniques [3, 25–29].

In this paper, partitional clustering algorithms are considered for image segmentation, because of the great similarity between segmentation and clustering, although clustering was developed for feature space, whereas segmentation was developed for the spatial domain of an image.

The clustering techniques represent non-supervised pattern classification into groups or classes. The partitional clustering techniques are based on cluster analysis, which is the organization of a set of patterns (vector of measurements or a point in a d-dimensional space) into clusters based on similarity [30]. In the context of image segmentation, the set of patterns can be represented by an image in a d-dimensional space that depends on the number of features used to represent the pixels, where each point in this d-dimensional space will be named a pixel pattern. Within the same context, the clusters correspond to some semantic meaning in the image, which is referred to as an object. Therefore, the main goal of the clustering process is to obtain groups or classes from an unlabeled data set based on their similarities to facilitate further knowledge extraction. The similarity is evaluated according to a distance measure between the patterns and the prototypes or centers of the groups, and each pattern is assigned to the nearest or most similar prototype. However, this process must distribute all of the data to the different groups, even if some pixels are not very representative of the group as a whole [26]. In the field of medical imaging, segmentation plays an important role because it facilitates the delineation of anatomical structures and other regions that can be of interest. For the specific case of MC detection, several works based on image segmentation using partitional clustering algorithms have been proposed, such as [3, 25–27]. Two clustering techniques based on partitional clustering algorithms are compared in this paper to improve the MC detection.

3.1 k-means

The k-means or hard c-means (HCM) algorithm [31] is one of the simplest unsupervised learning algorithms that can solve the well-known clustering problem. The objective of the clustering algorithms is to cluster a given data set into several groups such that the data within a group are more similar to one another than those outside the group. Achieving such a partition requires a similarity measure that considers two vectors and returns a value reflecting their similarity. The k-means algorithm partitions a given data set into c clusters and computes cluster centers V = [v₁, v₂, ..., v_c], so that the following objective function is minimized:

J (Z; U, V) = \sum_{i = 1}^{c} \sum_{k = 1}^{N} μ_{i k} {∥z_{k} - v_{i}∥}^{2}

(2)

where ||z_k - v_i||² is the chosen distance measure between a data point z_k and the cluster center v_i, and is an indicator of the distance of the data points from their cluster prototypes. V = [v₁, v₂, ..., v_c] is the vector of prototypes of the c clusters, which are calculated according to:

v_{i} = \frac{1}{| A_{i} |} \sum_{z_{k} \in A_{i}} z_{k}

(3)

where |A_i| represents the number of data points belonging to cluster i.

To clarify, the procedures of the k-means algorithm are described as follows:

1. Initialize the cluster centers v_i, i = 1, ..., c. This is typically achieved by randomly selecting c points from the data set.
2. Determine μ_ik, i = 1, 2, ..., c, k = 1, 2, ..., N, by Equation 4:
$U = [μ_{i k}], \quad μ_{i k} = \begin{cases} 1 & \text{if } {∥z_{k} - v_{i}∥}^{2} \leq {∥z_{k} - v_{j}∥}^{2}, \ \forall j \neq i \\ 0 & \text{otherwise} \end{cases}$
(4)
3. Compute the objective function according to Equation 2. Stop if either it has converged or the improvement is below a threshold.
4. Update the cluster centers v_i using Equation 3, and then proceed to Step 2.
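The four steps above can be sketched in NumPy as follows. This is an illustrative implementation only: the function name `kmeans`, the stopping tolerance `tol`, the iteration cap and the random seed are assumptions, not values reported in the paper.

```python
import numpy as np

def kmeans(Z, c, max_iter=100, tol=1e-6, seed=0):
    """Hard c-means on an (N, d) data array Z with c clusters."""
    Z = np.asarray(Z, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: pick c distinct data points as the initial prototypes.
    V = Z[rng.choice(len(Z), size=c, replace=False)]
    prev_J = np.inf
    for _ in range(max_iter):
        # Squared Euclidean distances between every point and every prototype, shape (N, c).
        d2 = ((Z[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        # Step 2 (Equation 4): each point belongs to its nearest prototype.
        labels = d2.argmin(axis=1)
        # Step 3 (Equation 2): objective value for the current partition.
        J = d2[np.arange(len(Z)), labels].sum()
        if prev_J - J <= tol:
            break
        prev_J = J
        # Step 4 (Equation 3): move each prototype to the mean of its members.
        for i in range(c):
            members = Z[labels == i]
            if len(members) > 0:
                V[i] = members.mean(axis=0)
    return V, labels, J

# Toy example: two well-separated groups of points.
points = np.array([[0, 0], [0, 1], [1, 0],
                   [100, 100], [100, 101], [101, 100]], dtype=float)
centers, labels, J = kmeans(points, c=2)
```

Because the assignment in Step 2 is hard, every pattern is forced into exactly one cluster, which is the limitation that motivates the fuzzy and possibilistic variants discussed next.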

3.2 PFCM clustering algorithm

The PFCM is one of the most recently developed partitional clustering algorithms, which has the advantages of the fuzzy c-means (FCM) as well as the possibilistic c-means (PCM) algorithms. The FCM has a constraint that makes it very sensitive to outliers. To relax this constraint of the FCM, Krishnapuram and Keller [32] developed the PCM clustering algorithm, which allows us to identify the degree of typicality that a data point has with respect to the group to which it belongs. The PCM has the problem, however, that sometimes the prototypes of clusters coincide, generating erroneous partitions of the feature space; for this reason, the PCM is not always successful. To solve the problems of the FCM (outlier sensitivity) and PCM (coincident clusters) clustering algorithms, Pal et al. [33] proposed a hybridized PFCM clustering model, where the function to be optimized is given by Equation 5:

J_{p f c m} (Z; U, T, V) = \sum_{i = 1}^{c} \sum_{k = 1}^{N} (a μ_{i k}^{m} + b t_{i k}^{η}) \times {∥z_{k} - v_{i}∥}^{2} + \sum_{i = 1}^{c} γ_{i} \sum_{k = 1}^{N} {(1 - t_{i k})}^{η},

(5)

and is subject to the constraints $\sum_{i = 1}^{c} μ_{i k} = 1 \forall k; 0 \leq μ_{i k}, t_{i k} \leq 1$ with the constants a > 0, b > 0, m > 1 and η > 1. The values of a and b represent the relative importance of membership and typicality values in the computation of the prototypes, respectively. The parameters m and η represent the absolute weight of the membership value and typicality value, respectively. To reduce the effect of outliers, one can set b > a and m > η.

Theorem PFCM [33]: If D_ikA = ||z_k - v_i|| > 0 for every i and k, m > 1, η > 1, and Z contains at least c distinct data points, then $(U, T, V) \in M_{fcm} \times M_{pcm} \times ℜ^{c \times N}$ may minimize J_pfcm only if:

\begin{gathered} μ_{i k} = {\left(\sum_{j = 1}^{c} {\left(\frac{D_{i k A_{i}}}{D_{j k A_{i}}}\right)}^{2 / (m - 1)}\right)}^{- 1} \\ 1 \leq i \leq c; 1 \leq k \leq N \end{gathered}

(6)

\begin{gathered} t_{i k} = \frac{1}{1 + {\left(\frac{b}{γ_{i}} D_{i k A_{i}}^{2}\right)}^{1 / (η - 1)}} \\ 1 \leq i \leq c; 1 \leq k \leq N \end{gathered}

(7)

\begin{gathered} v_{i} = \frac{\sum_{k = 1}^{N} (a μ_{i k}^{m} + b t_{i k}^{η}) z_{k}}{\sum_{k = 1}^{N} (a μ_{i k}^{m} + b t_{i k}^{η})} \\ 1 \leq i \leq c \end{gathered}

(8)

γ_{i} = K \frac{\sum_{k = 1}^{N} μ_{i k}^{m} {∥z_{k} - v_{i}∥}^{2}}{\sum_{k = 1}^{N} μ_{i k}^{m}}

(9)

The iterative process of this algorithm is presented in [33].
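For orientation, the update rules in Equations 6 to 9 can be sketched in NumPy as below. This is not the reference implementation from [33]: the function name `pfcm`, the optional `V_init` argument, the short preliminary FCM run used to estimate γ_i, the value K = 1, the small constant `eps` and the stopping rule are all assumptions made for this example. The Euclidean norm is used for the distances.

```python
import numpy as np

def pfcm(Z, c, a=1.0, b=2.0, m=2.0, eta=2.0, K=1.0,
         V_init=None, fcm_iter=20, n_iter=100, tol=1e-6, seed=0):
    """Returns memberships U (c, N), typicalities T (c, N) and prototypes V (c, d)."""
    Z = np.asarray(Z, dtype=float)
    rng = np.random.default_rng(seed)
    if V_init is None:
        V = Z[rng.choice(len(Z), size=c, replace=False)].copy()
    else:
        V = np.array(V_init, dtype=float)
    eps = 1e-12  # keeps distances strictly positive, as the theorem requires

    def sq_dist(V):
        # Squared Euclidean distances, shape (c, N).
        return ((Z[None, :, :] - V[:, None, :]) ** 2).sum(axis=2) + eps

    def memberships(D2):
        # Equation 6, written with squared distances: (D_ik / D_jk)^(2/(m-1)).
        ratio = (D2[:, None, :] / D2[None, :, :]) ** (1.0 / (m - 1.0))
        return 1.0 / ratio.sum(axis=1)

    # Assumption: a short FCM run provides the partition used to estimate gamma.
    for _ in range(fcm_iter):
        Um = memberships(sq_dist(V)) ** m
        V = (Um @ Z) / Um.sum(axis=1, keepdims=True)
    D2 = sq_dist(V)
    Um = memberships(D2) ** m
    gamma = K * (Um * D2).sum(axis=1) / Um.sum(axis=1)  # Equation 9

    for _ in range(n_iter):
        D2 = sq_dist(V)
        U = memberships(D2)                                                 # Equation 6
        T = 1.0 / (1.0 + (b * D2 / gamma[:, None]) ** (1.0 / (eta - 1.0)))  # Equation 7
        W = a * U ** m + b * T ** eta
        V_new = (W @ Z) / W.sum(axis=1, keepdims=True)                      # Equation 8
        shift = np.abs(V_new - V).max()
        V = V_new
        if shift < tol:
            break
    return U, T, V

# Toy example with two compact groups and explicit initial prototypes.
data = np.array([[0, 0], [0, 1], [1, 0],
                 [10, 10], [10, 11], [11, 10]], dtype=float)
U, T, V = pfcm(data, c=2, V_init=[[0, 0], [10, 10]])
```

Note that the memberships of each point sum to one across clusters, whereas the typicalities are not constrained in this way, so a point far from every prototype receives low typicality everywhere.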

To segment the MCs in ROI images, a novel technique based on the PFCM clustering algorithm is used. This technique is called image sub-segmentation and was proposed by Ojeda-Magaña et al. [26].

Proposed approach for the detection of MCs by sub-segmentation

1. Obtain the data vector.
2. Assign a value to the parameters (a, b, m, η).
3. Segment the image by taking into account the number of more representative regions, which in this case is two: a suspicious region with the presence of MCs (S₁) and normal tissue (S₂); the S₂ region is considered to be devoid of MCs.
4. Run the PFCM algorithm to obtain:
- The membership matrix U.
- The typicality matrix T.
5. Obtain the maximum typicality value for each pixel:
$T_{max} = \max_{i} [t_{i k}], i = 1, \dots, c .$
(10)
6. Select a value for the threshold α.
7. With α and the T_max matrix, separate all of the pixels into two sub-matrices (T₁, T₂), with the first matrix:
$T_{1} = T_{max} \geq α$
(11)
containing the typical pixels of both regions (Stypical₁) and (Stypical₂), and the second matrix:
$T_{2} = T_{max} < α$
(12)
containing the atypical pixels of both regions (Satypical₁) and (Satypical₂); in this case, the atypical pixels are of most interest, especially the atypical pixels of (S₁).
8. From the labeled pixels z_k of the T₁ sub-matrix, the following subregions can be generated:
$T_{1} = \mathit{Stypical}_{1}, \dots, \mathit{Stypical}_{i}, i = 1, \dots, c .$
(13)
and from the T₂ sub-matrix:
$T_{2} = \mathit{Satypical}_{1 + i}, \dots, \mathit{Satypical}_{2 i}, i = 1, \dots, c .$
(14)
such that each region S_i, i = 1, ..., c is defined by:
$S_{i} = \mathit{Stypical}_{i} \cup \mathit{Satypical}_{i + c} .$
(15)
9. Select the sub-matrix T₁ or T₂ of interest for the corresponding analysis.

In this work, T₂ is the sub-matrix of interest.
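Steps 5 to 8 amount to thresholding the typicality matrix, which can be sketched as follows. The function name `sub_segment` and the zero-based label encoding are hypothetical; in particular, assigning each pixel to the region whose typicality is largest is an assumption of this sketch, since the labeling rule is not spelled out above.

```python
import numpy as np

def sub_segment(T, alpha):
    """Split pixels into typical and atypical sub-regions from a (c, N) typicality matrix.

    Returns T_max (Equation 10) and a label per pixel: label i marks Stypical of
    region i, label i + c marks Satypical of region i (zero-based version of
    Equation 15).
    """
    T = np.asarray(T, dtype=float)
    c = T.shape[0]
    region = T.argmax(axis=0)      # assumed labeling rule: region with highest typicality
    t_max = T.max(axis=0)          # Equation 10
    atypical = t_max < alpha       # Equation 12; the complement is Equation 11
    labels = region + c * atypical.astype(int)
    return t_max, labels

# Toy typicality matrix for c = 2 regions and N = 3 pixels.
T_demo = np.array([[0.90, 0.01, 0.20],
                   [0.05, 0.03, 0.60]])
t_max, sub_labels = sub_segment(T_demo, alpha=0.04)
```

In the toy matrix, only the second pixel falls below α, so it is the only one placed in an atypical sub-region.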

4 Microcalcification classification by ANN

Artificial neural networks (ANNs) are biologically inspired networks based on the neuron organization and decision-making process of the human brain [34]. In other words, they are simplified mathematical models inspired by the brain. ANNs are used in a wide variety of data processing applications where real-time data analysis and information extraction are required. One advantage of the ANN approach is that most of the intense computation takes place during the training process. Once ANNs are trained for a particular task, operation is relatively fast and unknown samples can be rapidly identified in the field. An ANN can approximate a function of multiple inputs and outputs. As a consequence, ANNs can be used for a variety of applications, among which are classification in medical applications [3, 5, 23, 35], descriptive modeling, clustering, function approximation, time series prediction [36] and sonar or radar detection [37]. Classification is one of the most frequently encountered decision-making tasks in human activity. A classification problem occurs when an object needs to be assigned to a predefined group or class based on a number of observed patterns related to that object. In this paper, a classifier based on an ANN is used to separate patterns that correspond to pixels belonging to healthy tissue from patterns that correspond to pixels belonging to microcalcifications, which we will call the normal tissue (NT) class and the MCs class, respectively. For this purpose, a multilayer perceptron (MLP) is used. The MLP is the most popular ANN for many practical applications, such as pattern recognition. The functionality of the MLP topology is determined by a learning algorithm, back propagation (BP) [38], which is based on the method of steepest descent [39]. For updating connection weights, it is the algorithm most commonly used by the ANN scientific community.

5 Methodology and results

To test our method, a set of ten ROI images was selected from several mammograms of the mini-MIAS database provided by the Mammographic Image Analysis Society (MIAS) [40]. The size of each mammogram from this database is 1,024 × 1,024 pixels, with a spatial resolution of 200 μm/pixel. These mammograms were reviewed by an expert radiologist, and all abnormalities were identified and classified. The areas in which abnormalities such as MCs were located were taken as ROIs. In this work, ROI images measuring 256 × 256 pixels were used. Figure 2a shows some ROI images used in this work.

5.1 Morphological enhancement

The morphological top-hat transform is used to enhance ROI images, with the aim of detecting objects that differ in brightness from the surrounding background; in our case, it was used to increase the contrast between the MCs and the background. During image enhancement, the same SE at three different sizes, 3 × 3, 5 × 5 and 7 × 7, was applied to perform the top-hat transform. The SE used in this work was a flat disk-shaped SE. Figure 2 shows the original ROI images processed by the top-hat transform with an SE of size 7 × 7.

5.2 Image segmentation by clustering

5.2.1 Data vector creation

A data vector Z for each ROI is generated for each of the images obtained from the previous stage. Thus, a unidimensional vector (x_se) is built by mapping the images to the pixels as follows:

{[{I_{T} (x, y)}_{1 \leq x \leq R, 1 \leq y \leq C}]}_{se} \to x_{se} = {\{x_{se}^{(q)}\}}_{q = 1, \dots, R \times C}

(16)

where se is the size of the SE, $x_{se}^{(q)}$ is the gray-level of the qth pixel of I_T when the image is decomposed row by row, and R and C correspond to the size of the image. Then, the data vector Z can be written as follows:

Z = {[x_{3 \times 3}, x_{5 \times 5}, x_{7 \times 7}]}^{T}

(17)

For data vector Z, two proposed clustering techniques are then applied to obtain a label for each pattern belonging to each cluster of the partition of feature space, where only one cluster corresponds to MCs, which generally appear in a group of just a few patterns (pixels), and the remaining clusters correspond to normal (healthy) tissue.
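The mapping in Equations 16 and 17 can be sketched in NumPy as follows. The function name `build_data_vector` is hypothetical, and arranging Z with one row per pixel pattern and one column per SE size is an assumed layout chosen so that clustering code can consume it directly.

```python
import numpy as np

def build_data_vector(enhanced_images):
    """Stack row-by-row flattened images (Equation 16) into Z (Equation 17).

    enhanced_images: same-sized top-hat outputs, one per SE size.
    Returns an (R*C, number_of_images) array; row q is the pattern of pixel q.
    """
    columns = [np.asarray(img).ravel(order="C") for img in enhanced_images]
    return np.stack(columns, axis=1)

# Toy example with three 2 x 2 "enhanced" images standing in for the 3 x 3, 5 x 5 and 7 x 7 results.
toy_images = [np.array([[1, 2], [3, 4]]),
              np.array([[5, 6], [7, 8]]),
              np.array([[9, 10], [11, 12]])]
Z_demo = build_data_vector(toy_images)
```

Each pixel therefore becomes a three-dimensional pattern whose components are its enhanced gray levels at the three SE sizes.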

The initial conditions and results for each proposed clustering technique are presented below.

5.2.2 Segmentation by k-means

The initial conditions for this approach are as follows:

Cluster number: 2 to 4.
Prototypes: initialized as random values.
Distance measure: Euclidean distance function.

Figure 3 shows segmented ROI images with different cluster values obtained after applying the proposed k-means algorithm to the data vector Z.

5.2.3 Sub-segmentation by PFCM

In this case, the approach presented in Section 3.2 is applied, and the initial conditions are as follows:

Cluster number: 2.
Prototypes: initialized as random values.
Distance measure: Euclidean distance function.
a = 1, b = 2, m = 2, η = 2; thresholds α = 0.04, 0.03 and 0.02.

Figure 4 shows segmented ROI images with different threshold values (α) obtained after applying the approach presented in Section 3.2 to the data vector Z.

According to the results obtained from the clustering process by k-means and PFCM, Table 1 shows the number of patterns assigned to classes MCs and NT, respectively, for our set of ten ROI images.

Table 1 Number of patterns assigned to MCs and NT

5.2.4 Feature extraction

Two window-based features, the mean and the standard deviation, defined in Equations 18 and 19, respectively, are extracted.

I_{μ} (x, y) = \frac{1}{R \times C} \sum_{x = 1}^{R} \sum_{y = 1}^{C} f (x, y)

(18)

I_{σ} (x, y) = {[\frac{1}{R \times C} \sum_{x = 1}^{R} \sum_{y = 1}^{C} {(f (x, y) - I_{μ} (x, y))}^{2}]}^{1 ∕ 2}

(19)

where I_μ, I_σ and f(x, y) represent the mean, standard deviation and the gray-level value of a pixel located in (x, y), respectively. These features are extracted from original ROI images within rectangular windows; in this work, we used three different pixel block windows with sizes (ws), 3 × 3, 5 × 5 and 7 × 7. In our work, each image obtained by this process is considered a feature that can be used to generate a set of patterns. In this set of patterns, there are patterns that represent the MCs and NT classes. We refer to this set of patterns as the feature vector (FV). We know a priori that, for each image used in this work, there are pixels belonging to the MCs and NT class. This FV is considered an input vector for the classifier. The FV is formed as follows:

FV = [i_{μ 3 \times 3}, i_{σ 3 \times 3}, i_{μ 5 \times 5}, i_{σ 5 \times 5}, i_{μ 7 \times 7}, i_{σ 7 \times 7}]

(20)

where:

{[{I_{μ} (x, y)}_{1 \leq x \leq R, 1 \leq y \leq C}]}_{w s} \to i_{μ_{w s}} = {\{i_{μ_{w s}}^{(q)}\}}_{q = 1, \dots, R \times C}

(21)

{[{I_{σ} (x, y)}_{1 \leq x \leq R, 1 \leq y \leq C}]}_{w s} \to i_{σ_{w s}} = {\{i_{σ_{w s}}^{(q)}\}}_{q = 1, \dots, R \times C}

(22)
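A possible NumPy sketch of Equations 18 to 22 is given below. The function names `window_mean_std` and `feature_vector` are hypothetical, and the reflective padding at the image border is an assumption, since border handling is not described in the text. The standard deviation uses the 1/(R × C) normalization of Equation 19.

```python
import numpy as np

def window_mean_std(image, ws):
    """Mean and standard deviation in a ws x ws window centred on each pixel."""
    r = ws // 2
    padded = np.pad(np.asarray(image, dtype=float), r, mode="reflect")  # assumed border rule
    windows = np.lib.stride_tricks.sliding_window_view(padded, (ws, ws))
    mu = windows.mean(axis=(-2, -1))      # Equation 18
    sigma = windows.std(axis=(-2, -1))    # Equation 19 (population form, ddof = 0)
    return mu, sigma

def feature_vector(image, sizes=(3, 5, 7)):
    """Build FV (Equation 20): one row per pixel, columns ordered as mean, std per window size."""
    columns = []
    for ws in sizes:
        mu, sigma = window_mean_std(image, ws)
        columns.extend([mu.ravel(), sigma.ravel()])  # Equations 21 and 22
    return np.stack(columns, axis=1)

# Toy checks: a constant image and a small ramp image.
flat = np.full((8, 8), 7.0)
FV_flat = feature_vector(flat)
ramp = np.arange(9, dtype=float).reshape(3, 3)
mu3, sigma3 = window_mean_std(ramp, 3)
```

For the constant image every mean equals the gray level and every standard deviation is zero, which gives a quick sanity check of the window statistics.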

The labels of the two classes of the FV were obtained by the previous process. Owing to the large number of patterns that do not belong to the MCs class, with respect to the number of patterns that do belong to the MCs class, balancing was performed. Table 2 shows the subsets of the patterns for the MCs and NT classes.

Table 2 Results of balancing

5.3 Microcalcification classification by ANN

An MLP was used to classify the patterns as NT or MCs, with the purpose of automatically identifying MCs in ROIs extracted from mammograms. To compare the performance of the classifiers, different network structures were trained and tested with the same training data set and the same testing data set. The best results were obtained with the following structure and parameters:

1. Number of input neurons equal to the number of attributes in FV: 6.
2. Number of hidden layers: 1.
3. Hidden neurons: see Table 4.
4. Output neurons: 1 (all classifications present two classes).
5. Learning rate: 1.
6. Activation function: sigmoidal, with values in [0, 1].
7. All weights randomly initialized.
8. Training phase: back propagation (BP).
9. Training stop conditions:
   (a) epochs: 2000.
   (b) mean squared error (MSE): 0.001.
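The configuration listed above can be sketched as a small NumPy MLP trained by back propagation. This is only an illustration: the function names, the number of hidden neurons (`n_hidden=5`, a placeholder, since the actual values are given in Table 4), the weight initialization range, the use of full-batch gradient descent and the random seed are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_mlp(X, y, n_hidden=5, lr=1.0, max_epochs=2000, target_mse=1e-3, seed=0):
    """One-hidden-layer MLP with sigmoid units, trained by batch back propagation."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    rng = np.random.default_rng(seed)
    # All weights and biases randomly initialized (the range is an assumption).
    W1 = rng.uniform(-0.5, 0.5, (X.shape[1], n_hidden))
    b1 = rng.uniform(-0.5, 0.5, n_hidden)
    W2 = rng.uniform(-0.5, 0.5, (n_hidden, 1))
    b2 = rng.uniform(-0.5, 0.5, 1)
    n = len(X)
    for _ in range(max_epochs):                 # stop condition (a): epoch limit
        H = sigmoid(X @ W1 + b1)                # hidden activations
        out = sigmoid(H @ W2 + b2)              # network output in (0, 1)
        err = out - y
        mse = float(np.mean(err ** 2))
        if mse <= target_mse:                   # stop condition (b): MSE target
            break
        # Back propagation of the squared-error gradient (steepest descent).
        d_out = err * out * (1.0 - out)
        d_hid = (d_out @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ d_out) / n
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (X.T @ d_hid) / n
        b1 -= lr * d_hid.mean(axis=0)
    return (W1, b1, W2, b2), mse

def predict(params, X):
    W1, b1, W2, b2 = params
    return sigmoid(sigmoid(np.asarray(X, dtype=float) @ W1 + b1) @ W2 + b2).ravel()

# Toy two-class problem standing in for NT (0) versus MCs (1) patterns.
X_toy = np.array([[0.0], [0.0], [1.0], [1.0]])
y_toy = np.array([0, 0, 1, 1])
params, final_mse = train_mlp(X_toy, y_toy)
scores = predict(params, X_toy)
```

With the real feature vector, `X` would have six columns (Equation 20) and the single output would be thresholded to decide between the NT and MCs classes.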

In this paper, we used patterns extracted from the FV set to train and test our classifiers: 80% of the patterns were used for training, and 20% of the patterns were used for testing (see Table 3).

Table 3 Number of patterns used for training and testing for each classifier

Table 4 shows the optimal network structure and parameters for each FV.

Table 4 The best network structure and parameters for each database

A confusion matrix was built to determine the probability of MC detection versus the probability of false MC detection. Table 5 shows the performance of the classifiers presented in this work. The performance of the proposed method was evaluated by means of receiver operating characteristic (ROC) curve analysis. The ROC curve is a two-dimensional measure of classification performance and is widely used in biomedical applications to assess the performance of diagnostic tests. The ROC curve is a plot of the sensitivity versus 1 − specificity (the false-positive rate) for the different possible cut-points of a diagnostic test. Figure 5 shows the ROC curve and the area under the curve (AUC) for the classifiers with different network structures used in this work.
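To make these quantities concrete, the following NumPy sketch computes a confusion matrix at one cut-point and the area under the ROC curve by the trapezoidal rule. The function names and the toy labels and scores are hypothetical and are not data from this study.

```python
import numpy as np

def confusion_counts(y_true, y_score, threshold=0.5):
    """Return (TP, FP, TN, FN) when scores >= threshold are called MCs."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_score) >= threshold
    tp = int(np.sum(y_pred & y_true))
    fp = int(np.sum(y_pred & ~y_true))
    tn = int(np.sum(~y_pred & ~y_true))
    fn = int(np.sum(~y_pred & y_true))
    return tp, fp, tn, fn

def roc_auc(y_true, y_score):
    """ROC points (false-positive rate, sensitivity) over all cut-points, and the AUC."""
    thresholds = np.concatenate(([np.inf], np.unique(y_score)[::-1]))
    fpr, tpr = [], []
    for t in thresholds:
        tp, fp, tn, fn = confusion_counts(y_true, y_score, t)
        tpr.append(tp / (tp + fn))   # sensitivity
        fpr.append(fp / (fp + tn))   # 1 - specificity
    fpr, tpr = np.array(fpr), np.array(tpr)
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
    return fpr, tpr, auc

# Toy example: 1 = MCs, 0 = NT.
labels_demo = np.array([0, 0, 1, 1])
scores_demo = np.array([0.1, 0.4, 0.35, 0.8])
tp, fp, tn, fn = confusion_counts(labels_demo, scores_demo, threshold=0.5)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
fpr_demo, tpr_demo, auc_demo = roc_auc(labels_demo, scores_demo)
```

Sweeping the cut-point from high to low traces the curve from (0, 0) to (1, 1); an AUC of 1 corresponds to perfect separation and 0.5 to chance.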

Table 5 Confusion matrices and performance of the classifiers

Finally, Figure 6 shows the results of MC detection in the ROIs using the methodology proposed in this paper.

6 Discussion and conclusions

According to the performance of the classifiers as determined by means of the ROC curves (Figure 5) and the final images obtained (Figure 6), the proposed method is a promising alternative for automatically detecting MCs in ROIs extracted from digitized mammograms. This method involves several techniques that contribute to the MC detection stage. The image segmentation stage is one of the most difficult stages when using partitional clustering algorithms, because these clustering algorithms are applied in the feature space. Therefore, if the image contains noise or is not very homogeneous, image segmentation by clustering can be inaccurate. Thus, an image processing technique based on mathematical morphology was used to solve this problem. In the segmentation stage, two partitional clustering algorithms were used: k-means and PFCM. The k-means is the most popular technique, and its advantages and drawbacks are well known. With the PFCM, a new method for image segmentation called image sub-segmentation was used, in which the degrees of typicality of each data point were used to partition an image into two regions: one region with tissue suspected of harboring MCs and the other with normal (healthy) tissue. Then, the most atypical data points (pixels) of each region were identified; these data include possible abnormalities present in these regions, especially the region suspected of possessing MCs, because these atypical data, or abnormalities, represent the pixels belonging to potential MCs. For the ROI images used in this paper, both clustering algorithms used to perform image segmentation gave good results, although these results depend largely on good feature extraction and, in this paper, on the image enhancement stage. Once the MCs were detected from the original ROIs, window-based features such as the mean and standard deviation were extracted, which were then used as input vectors in a classifier.
To perform this classification task, ANNs proved to be an excellent alternative. In this paper, a classifier based on the MLP was used. In the ROI images, the MCs class represented a lower percentage of pixels with respect to the number of pixels belonging to the healthy or normal tissue class. Therefore, balancing between patterns belonging to the MCs class and to the NT class was performed to obtain better results during the classification stage. Finally, according to the results obtained by applying our proposed method to these ROI images, the implemented method can detect pixels corresponding to microcalcifications or healthy tissue, thus fulfilling the aim of this paper.

References

Pal N, Bhowmick B, Patel S, Pal S, Das J: A multi-stage neural network aided system for detection of microcalcifications in digitized mammograms. Neurocomputing 2008,71(13-15):2625-2634. 10.1016/j.neucom.2007.06.015
Article Google Scholar
Wei L, Yang Y, Nishikawa R: Microcalcification classification assisted by content-based image retrieval for breast cancer diagnosis. Pattern Recognit 2009,42(6):1126-1132. 10.1016/j.patcog.2008.08.028
Article Google Scholar
Vega-Corona A, Álvarez A, Andina D: Feature Vectors Generation for Detection of Microcalcifications in Digitized Mammography Using Neural Net-works, vol. 2687. Artificial Neural Nets Problem Solving Methods, LNCS 2003, 583-590.
Chapter Google Scholar
Papadopoulos A, Fotiadis D, Costaridou L: Improvement of microcalcification cluster detection in mammography utilizing image enhancement techniques. Comput Biol Med 2008,38(10):1045-1055. 10.1016/j.compbiomed.2008.07.006
Article Google Scholar
Halkiotis S, Botsis T, Rangoussi M: Automatic detection of clustered mi-crocalcifications in digital mammograms using mathematical morphology and neural networks. Signal Proces 2007,87(7):1559-1568. 10.1016/j.sigpro.2007.01.004
Article MATH Google Scholar
Fu J, Lee S, Wong S, Yeh J, Wang A, Wu H: Image segmentation feature selection and pattern classification for mammographic microcalcifications. Com-put Med Imaging Graph 2005,29(6):419-429.
Article Google Scholar
Cheng H, Cai X, Chen X, Hu L, Lou X: Computer-aided detection and classification of microcalcifications in mammograms: a survey. Pattern Recognit 2003,36(12):2967-2991. 10.1016/S0031-3203(03)00192-4
Article MATH Google Scholar
Wirth M, Fraschini M, Lyon J: Contrast enhancement of microcalcifications in mammograms using morphological enhancement and non-flat structuring elements. 17th IEEE Symposium on Computer-Based Medical System 2004, 134-139.
Chapter Google Scholar
Meyer F: Iterative image transformations for an automatic screening of cervical smears. J Histochem Cytochem 1979,27(1):128-135. 10.1177/27.1.438499
Stojić T, Reljin B: Enhancement of microcalcifications in digitized mammograms: Multifractal and mathematical morphology approach. FME Trans 2010, 38: 1-9.
Gonzalez R, Woods R: Digital Image Processing. Prentice Hall; 2002.
Ge J, Hadjiiski M, Sahiner B, Wei J, Helvie M, Zhou C, Chan H: Computer-aided detection system for clustered microcalcifications: comparison of performance on full-field digital mammograms and digitized screen-film mammograms. Phys Med Biol 2007,52(4):981-1000. 10.1088/0031-9155/52/4/008
Wu Y, Huang Q, Peng Y, Situ W: Detection of microcalcifications in digital mammograms based on dual-threshold. International Workshop on Digital Mammography, IWDM 2006, 4046: 347-354. LNCS 10.1007/11783237_47
Veni G, Regentova E, Zhang L: Detection of Clustered Microcalcifications with Susan Edge Detector, Adaptive Contrast Thresholding and Spatial Filters. Lecture notes in computer science 2008, 5112: 837-843. 10.1007/978-3-540-69812-8_83
Jevtic A, Quintanilla-Dominguez J, Cortina-Januchs M, Andina D: Edge detection using ant colony search algorithm and multiscale contrast enhancement. Systems, Man and Cybernetics, SMC 2009. IEEE International Conference on, 2009 2193-2198.
Qiu G, Jianhua Z, Shengyong C, Todd-Pokropek A: Automatic segmentation of micro-calcification based on sift in mammograms. International Conference on BioMedical Engineering and Informatics 2008, 2: 13-17.
Bankman I, Nizialek T, Simon I, Gatewood O, Weinberg I, Brody W: Segmentation algorithms for detecting microcalcifications in mammograms. IEEE Trans Inf Technol Biomed 1997,1(2):141-149. 10.1109/4233.640656
Rojas-Domínguez A, Nandi AK: Toward breast cancer diagnosis based on automated segmentation of masses in mammograms. Pattern Recognit 2009,42(6):1138-1148. 10.1016/j.patcog.2008.08.006
Stojić T, Reljin I, Reljin B: Adaptation of multifractal analysis to segmentation of microcalcifications in digital mammograms. Phys A Stat Mech Appl 2006, 367: 494-508.
Piñuela JA, Andina D, McInnes K, Tarquis AM: Wavelet analysis in a structured clay soil using 2-d image. Nonlinear Process Geophys 2007, 14: 425-434. 10.5194/npg-14-425-2007
Dar-Ren C, Ruey-Feng C, Chii-Jen C, Minng-Feng H, Shou-Jen K, Che CS-T, Shin-Jer H, Moon WK: Classification of breast ultrasound image using fractal features. Clin Imaging 2005,29(4):235-245. 10.1016/j.clinimag.2004.11.024
Li H, Liu R, Lo S: Fractal modeling and segmentation for the enhancement of microcalcifications in digital mammograms. IEEE Trans Med Imaging 1997,16(6):785-798.
Verma B: Impact of multiple clusters on neural classification of rois in digital mammograms. International Joint Conference on Neural Networks 2009, 3220-3223.
Jevtic A, Quintanilla-Dominguez J, Barron-Adame J, Andina D: Image segmentation using ant system-based clustering algorithm. Soft Computing Models in Industrial and Environmental Applications, 6th International Conference SOCO 2011 2011., 87:
Bhattacharya M, Das A: Fuzzy logic based segmentation of microcalcification in breast using digital mammograms considering multiresolution. Int Mach Vis Image Process Conf 2007, 98-105.
Ojeda-Magaña B, Quintanilla-Domínguez J, Ruelas R, Andina D: Images sub-segmentation with the pfcm clustering algorithm. 7th IEEE International Conference on Industrial Informatics 2009, 499-503.
Bougioukos P, Glotsos D, Kostopoulos S, Daskalakis A, Kalatzis I, Dimitropoulos N, Nikiforidis G, Cavouras D: Fuzzy c-means-driven fhce contextual segmentation method for mammographic microcalcification detection. Imaging Sci J 2009,58(3):146-154.
Yang G, Zhou G, Yin Y, Yang X: K-means based fingerprint segmentation with sensor interoperability. EURASIP J Adv Signal Process 2010., 2010:
Mu J, Liu X, Kamocka M, Xu Z, Alber M, Rosen E, Chen D: Segmentation, reconstruction, and analysis of blood thrombus formation in 3d 2-photon microscopy images. EURASIP J Adv Signal Process 2010., 2010:
Jain A, Murty M, Flynn P: Data clustering: a review. ACM Comput Surv 1999., 31:
MacQueen J: Some methods for classification and analysis of multivariate observations. Fifth Berkeley Symposium on Mathematical Statistics and Probability 1967, 1: 281-297.
Krishnapuram R, Keller JM: A possibilistic approach to clustering. IEEE Trans Fuzzy Syst 1993,1(2):98-110. 10.1109/91.227387
Pal N, Pal S, Keller J, Bezdek J: A possibilistic fuzzy c-means clustering algorithm. IEEE Trans Fuzzy Syst 2005,13(4):517-530.
Andina D, Pham D: Computational Intelligence for Engineering and Manufacturing. 1st edition. Springer; 2007.
Papadopoulos A, Fotiadis D, Likas A: An automatic microcalcification detection system based on a hybrid neural network classifier. Artif Intell Med 2002,25(2):149-167. 10.1016/S0933-3657(02)00013-1
Cortina-Januchs M, Barrón-Adame J, Vega-Corona A, Andina D: Prevision of industrial SO₂ pollutant concentration applying anns. 7th IEEE International Conference on Industrial Informatics, INDIN 2009, 510-515.
Andina D, Sanz-Gonzalez J: On the problem of binary detection with neural networks. Circuits and Systems Proceedings., Proceedings of the 38th Midwest Symposium on 1995, 1: 554-557.
Basheer I, Hajmeer M: Artificial neural networks: fundamentals, computing, design, and application. J Microbiol Methods 2000,43(1):3-31. 10.1016/S0167-7012(00)00201-3
Hagan M, Demuth H, Beale M: Neural Network Design. PWS Pub Co, Boston; 1996.
Suckling J, Parker J, Dance D: The mammographic image analysis society digital mammogram database. Excerpta Medica International Congress Series 1994, 1069: 375-378.

7 Acknowledgements

The authors wish to thank the Group for Automation in Signal and Communications (GASC) of the Technical University of Madrid, the Laboratorio de Inteligencia Computacional (LABINCO) of the University of Guanajuato, the National Council for Science and Technology (CONACyT), the Department of Project Engineering (CUCEI) of the University of Guadalajara, and Dr. Bernhard Angele.

Author information

Authors and Affiliations

Technical University of Madrid, 28040, Madrid, Spain
Joel Quintanilla-Domínguez, Benjamín Ojeda-Magaña, Alexis Marcano-Cedeño, María G Cortina-Januchs & Diego Andina
University of Guadalajara, 45101, Zapopan Jalisco, Mexico
Benjamín Ojeda-Magaña
University of Guanajuato, 36885, Salamanca Guanajuato, Mexico
Joel Quintanilla-Domínguez, María G Cortina-Januchs & Antonio Vega-Corona

Authors

Joel Quintanilla-Domínguez
Benjamín Ojeda-Magaña
Alexis Marcano-Cedeño
María G Cortina-Januchs
Antonio Vega-Corona
Diego Andina

Corresponding author

Correspondence to Joel Quintanilla-Domínguez.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Quintanilla-Domínguez, J., Ojeda-Magaña, B., Marcano-Cedeño, A. et al. Improvement for detection of microcalcifications through clustering algorithms and artificial neural networks. EURASIP J. Adv. Signal Process. 2011, 91 (2011). https://doi.org/10.1186/1687-6180-2011-91

Received: 15 May 2011
Accepted: 24 October 2011
Published: 24 October 2011
DOI: https://doi.org/10.1186/1687-6180-2011-91
