Improvement for detection of microcalcifications through clustering algorithms and artificial neural networks
© Quintanilla-Dominguez et al; licensee Springer. 2011
Received: 15 May 2011
Accepted: 24 October 2011
Published: 24 October 2011
A new method for detecting microcalcifications in regions of interest (ROIs) extracted from digitized mammograms is proposed. The top-hat transform, a technique based on mathematical morphology operations, is used in this paper to enhance the contrast of the microcalcifications. To improve microcalcification detection, a novel image sub-segmentation approach based on the possibilistic fuzzy c-means algorithm is used. From the original ROIs, window-based features, such as the mean and standard deviation, were extracted; these features were used as an input vector in a classifier. The classifier is based on an artificial neural network and identifies patterns belonging to microcalcifications and healthy tissue. Our results show that the proposed method is a good alternative for automatically detecting microcalcifications, a stage that is an important part of early breast cancer detection.
Breast cancer is one of the most serious types of cancer that affects women around the world. It is also one of the leading causes of mortality in middle-aged and elderly women. The International Agency for Research on Cancer (IARC) estimates that more than 1 million cases of breast cancer occur worldwide each year, with some 580,000 cases occurring in developed countries and the remainder in developing countries. The risk of a woman developing breast cancer during her lifetime is approximately 11%. Early detection of breast cancer is of vital importance to successful treatment, with the main goal of increasing the probability of survival for patients. Currently, the most reliable and practical method for early detection and screening of breast cancer is mammography. Microcalcifications (MCs) can be an important early sign of breast cancer; they appear as bright spots of calcium deposits. Individual MCs are sometimes difficult to detect because of the surrounding breast tissue and variations in shape, orientation, brightness and diameter. MCs are potential primary indicators of malignant types of breast cancer. Therefore, their detection can be important in preventing and treating the disease. However, it is still difficult to detect all MCs in mammograms because of the poor contrast against the tissue that surrounds them.
Many methodologies have been presented by different authors to detect the presence of MCs in mammograms. These methodologies involve image processing techniques, pattern recognition methods and artificial intelligence approaches. Vega-Corona et al. proposed a method for detecting MCs in digitized mammograms. The method consists of image enhancement by adaptive histogram equalization to improve the visibility of MCs with respect to the background, processing by multiscale wavelets and gray-level statistical techniques for feature extraction, clustering by the k-means algorithm for MC detection and, finally, using feature selection and a classifier based on a general regression neural network (GRNN) and multilayer perceptron (MLP) to classify MCs. Papadopoulos et al. compared five image enhancement algorithms for improving MC cluster detection in mammography. Halkiotis et al. proposed mathematical morphology for MC extraction from a non-uniform background; in this scheme, a set of features is extracted from original mammograms to test two classifiers based on artificial neural networks, such as MLP, and a radial basis function (RBF) neural network. Fu et al. proposed a method based on two stages. The purpose of the first stage is to locate the suspected MCs; this stage is based on mathematical morphology and border detection to segment the MCs. The second stage is based on feature extraction and selection from the MCs located in the first stage; in the final part of this latter stage, these features are used as an input vector to test two classifiers based on a GRNN and support vector machine (SVM).
2 ROI image enhancement
Over the past several years, methodologies have been developed for the detection and/or classification of MCs, but the interpretation of MCs continues to be a difficult task mainly because of their fuzzy nature, low contrast and low distinguishability from their surroundings. The difficulty of MC detection depends on factors such as size, shape and distribution with respect to MC morphology. Another important factor that also makes MC detection difficult is the fact that MCs are often located across non-homogeneous backgrounds, and owing to their low contrast against the background, their intensity may be similar to that of noise or other structures [7, 8]. Therefore, in this paper, it is considered important to apply image enhancement.
Mathematical morphology is a discipline within the field of image processing that involves the structural analysis of images. The geometrical structure of an image is determined by locally comparing it with a predefined elementary set called a structuring element (SE). Image processing using morphological transformations is a process of information removal based on size and shape. In this process, irrelevant image content is selectively eliminated; thus, essential image features can be enhanced. Morphological operations are based on the relationships between two sets: an input image, I, and a processing operator, the SE, which is usually much smaller than the input image. By selecting the shape and size of a structuring element, different results can be obtained in the output image. The fundamental morphological operations are erosion and dilation.
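The morphological top-hat transform subtracts the morphological opening of an image from the image itself:

$$I_T(x, y) = I(x, y) - \left[(I(x, y) \ominus SE) \oplus SE\right] \qquad (1)$$

where $I(x, y)$ is the input image, $I_T(x, y)$ is the transformed image, $SE$ is the structuring element, $\ominus$ represents the morphological erosion operation, $\oplus$ represents the morphological dilation operation, and $-$ represents the image subtraction operation. $[(I(x, y) \ominus SE) \oplus SE]$ is also known as the morphological opening operation. In previous works such as [8, 10], this technique was used to obtain satisfactory results in MC detection.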
3 Image segmentation by partitional clustering algorithms
Image segmentation is an important task in the field of image processing and computer vision and involves the identification of objects or regions with the same features in an image. The aim of image segmentation is to divide an image into non-overlapping subregions that are homogeneous with respect to some features such as gray-level intensity or texture. The level to which the subdivision is carried out depends on the problem being solved.
Depending on the specific application, several methods based on different principles have been used for image segmentation, such as histogram thresholding [12, 13], edge detection [14, 15], region growing [16–18], fractal models [19–22], ANNs, swarm-based algorithms and clustering techniques [3, 25–29].
In this paper, partitional clustering algorithms are considered for image segmentation, because of the great similarity between segmentation and clustering, although clustering was developed for feature space, whereas segmentation was developed for the spatial domain of an image.
The clustering techniques represent non-supervised pattern classification into groups or classes. The partitional clustering techniques are based on cluster analysis, which is the organization of a set of patterns (vector of measurements or a point in a d-dimensional space) into clusters based on similarity. In the context of image segmentation, the set of patterns can be represented by an image in a d-dimensional space that depends on the number of features used to represent the pixels, where each point in this d-dimensional space will be named a pixel pattern. Within the same context, the clusters correspond to some semantic meaning in the image, which is referred to as an object. Therefore, the main goal of the clustering process is to obtain groups or classes from an unlabeled data set based on their similarities to facilitate further knowledge extraction. The similarity is evaluated according to a distance measure between the patterns and the prototypes or centers of the groups, and each pattern is assigned to the nearest or most similar prototype. However, this process must distribute all of the data to the different groups, even if some pixels are not very representative of the group as a whole. In the field of medical imaging, segmentation plays an important role because it facilitates the delineation of anatomical structures and other regions that can be of interest. For the specific case of MC detection, several works based on image segmentation using partitional clustering algorithms have been proposed, such as [3, 25–27]. Two clustering techniques based on partitional clustering algorithms are compared in this paper to improve the MC detection.
3.1 k-means
where $|A_i|$ represents the number of data points belonging to cluster $i$.
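For reference, the standard k-means formulation can be written as follows. This is a sketch of the textbook form, assumed (not confirmed) to correspond to the equations that the algorithm steps cite as (2), (3) and (4); $A_i$ denotes the set of patterns $z_k$ assigned to cluster $i$, $v_i$ the cluster center and $u_{ik}$ the hard membership of pattern $k$ in cluster $i$.

```latex
% Objective function: within-cluster sum of squared distances
J = \sum_{i=1}^{c} \sum_{z_k \in A_i} \lVert z_k - v_i \rVert^{2}            \tag{2}

% Cluster center: mean of the patterns assigned to cluster i
v_i = \frac{1}{\lvert A_i \rvert} \sum_{z_k \in A_i} z_k                      \tag{3}

% Hard membership: each pattern belongs to its nearest center
u_{ik} =
\begin{cases}
1 & \text{if } \lVert z_k - v_i \rVert^{2} \le \lVert z_k - v_j \rVert^{2}
    \quad \forall\, j \ne i, \\
0 & \text{otherwise}
\end{cases}                                                                   \tag{4}
```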
1. Initialize the cluster centers $v_i$, $i = 1, \ldots, c$. This is typically achieved by randomly selecting c points from the data set.
2. Determine $u_{ik}$, $i = 1, 2, \ldots, c$, $k = 1, 2, \ldots, N$, by equation (4).
3. Compute the objective function according to (2). Stop if either it has converged or the improvement is below a threshold.
4. Update the cluster centers $v_i$ using (3), and then proceed to Step 2.
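The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function name `kmeans`, the stopping tolerance `tol`, the iteration cap and the random seed are assumptions introduced here.

```python
import numpy as np


def kmeans(Z, c, max_iter=100, tol=1e-6, seed=0):
    """Hard k-means on Z, an (N, d) array of pixel patterns, with c clusters."""
    Z = np.asarray(Z, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: pick c distinct data points as the initial centers.
    V = Z[rng.choice(len(Z), size=c, replace=False)].copy()
    prev_J = np.inf
    for _ in range(max_iter):
        # Step 2: assign each pattern to its nearest center (Euclidean distance).
        D2 = ((Z[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)  # shape (N, c)
        labels = D2.argmin(axis=1)
        # Step 3: objective = sum of squared distances to the assigned centers.
        J = D2[np.arange(len(Z)), labels].sum()
        if prev_J - J < tol:  # improvement below the (assumed) threshold
            break
        prev_J = J
        # Step 4: move each center to the mean of its members.
        for i in range(c):
            members = Z[labels == i]
            if len(members):  # keep the old center if a cluster is empty
                V[i] = members.mean(axis=0)
    return V, labels, J
```

For image segmentation, `Z` would hold one row per pixel pattern, and the returned `labels` can be reshaped back to the image dimensions.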
3.2 PFCM clustering algorithm
and is subject to the constraints with the constants a > 0, b > 0, m > 1 and η > 1. The values of a and b represent the relative importance of membership and typicality values in the computation of the prototypes, respectively. The parameters m and η represent the absolute weight of the membership value and typicality value, respectively. To reduce the effect of outliers, one can set b > a and m > η.
Theorem PFCM: If $D_{ikA} = \|z_k - v_i\| > 0$, for every $i$, $k$, $m > 1$, $\eta > 1$, and if
The iterative process of this algorithm is presented in .
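For reference, the PFCM formulation published by Pal et al. (A possibilistic fuzzy c-means clustering algorithm, listed in the references) can be summarized as below. The notation is assumed to match this section: $u_{ik}$ are memberships, $t_{ik}$ typicalities, $v_i$ prototypes, $D_{ik} = \lVert z_k - v_i \rVert$, and $\gamma_i > 0$ are user-defined constants that are usually estimated from an initial fuzzy c-means run.

```latex
% Objective function
J_{m,\eta}(U, T, V; Z) =
  \sum_{k=1}^{N} \sum_{i=1}^{c} \left( a\,u_{ik}^{m} + b\,t_{ik}^{\eta} \right) D_{ik}^{2}
  + \sum_{i=1}^{c} \gamma_i \sum_{k=1}^{N} \left( 1 - t_{ik} \right)^{\eta}

% Constraints
\sum_{i=1}^{c} u_{ik} = 1 \quad \forall k, \qquad 0 \le u_{ik},\, t_{ik} \le 1

% First-order necessary conditions (update equations)
u_{ik} = \left( \sum_{j=1}^{c} \left( \frac{D_{ik}}{D_{jk}} \right)^{2/(m-1)} \right)^{-1},
\qquad
t_{ik} = \frac{1}{1 + \left( \dfrac{b}{\gamma_i}\, D_{ik}^{2} \right)^{1/(\eta-1)}},
\qquad
v_i = \frac{\sum_{k=1}^{N} \left( a\,u_{ik}^{m} + b\,t_{ik}^{\eta} \right) z_k}
           {\sum_{k=1}^{N} \left( a\,u_{ik}^{m} + b\,t_{ik}^{\eta} \right)}
```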
To segment the MCs in ROI images, a novel technique based on the PFCM clustering algorithm is used. This technique is called image sub-segmentation and was proposed by Ojeda-Magaña et al.
1. Obtain the data vector.
2. Assign a value to the parameters (a, b, m, η).
3. Segment the image by taking into account the number of more representative regions, which in this case is two: suspicious region with the presence of the MCs ($S_1$) and normal tissue ($S_2$); the $S_2$ region is considered to be devoid of MCs.
4. Run the PFCM algorithm to obtain:
   - The membership matrix $U$.
   - The typicality matrix $T$.
5. Obtain the maximum typicality value for each pixel. (10)
6. Select a value for the threshold α.
7. With α and the $T_{max}$ matrix, separate all of the pixels into two sub-matrices ($T_1$, $T_2$), with the first matrix (11) containing the typical pixels of both regions ($S_{typical_1}$) and ($S_{typical_2}$), and the second matrix (12) containing the atypical pixels of both regions ($S_{atypical_1}$) and ($S_{atypical_2}$); in this case, the atypical pixels are of most interest, especially the atypical pixels of ($S_1$).
8. From the labeled pixels $z_k$ of the $T_1$ sub-matrix, the following subregions can be generated (13), and from the $T_2$ sub-matrix (14), such that each region $S_i$, $i = 1, \ldots, c$, is defined by (15).
9. Select the sub-matrix $T_1$ or $T_2$ of interest for the corresponding analysis.

In this work, $T_2$ is the sub-matrix of interest.
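Steps 5 to 8 can be illustrated with a short NumPy sketch that starts from a typicality matrix already produced by PFCM. Several details are assumptions made for illustration only: pixels are labeled by the cluster of maximum typicality, "typical" is taken to mean a maximum typicality of at least α and "atypical" a value below α, and the function name `sub_segment` is hypothetical.

```python
import numpy as np


def sub_segment(T, alpha):
    """Split pixels into typical/atypical subregions from a (c, N) typicality matrix."""
    T = np.asarray(T, dtype=float)
    t_max = T.max(axis=0)      # step 5: maximum typicality of each pixel
    labels = T.argmax(axis=0)  # assumed labeling rule: most typical cluster
    typical = t_max >= alpha   # step 7 (assumed rule): typical vs. atypical pixels
    regions = {}
    for i in range(T.shape[0]):  # step 8: subregions of each region S_i
        regions[i] = {
            "typical": np.flatnonzero((labels == i) & typical),
            "atypical": np.flatnonzero((labels == i) & ~typical),
        }
    return t_max, labels, regions
```

Under these assumptions, the `"atypical"` index sets play the role of the sub-matrix of interest: pixels that are poorly represented by both prototypes.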
4 Microcalcification classification by ANN
Artificial neural networks (ANNs) are biologically inspired networks based on the neuron organization and decision-making process of the human brain. In other words, they are mathematical models of the brain. ANNs are used in a wide variety of data processing applications where real-time data analysis and information extraction are required. One advantage of the ANN approach is that most of the intense computation takes place during the training process. Once ANNs are trained for a particular task, operation is relatively fast and unknown samples can be rapidly identified in the field. An ANN can approximate a function of multiple inputs and outputs. As a consequence, ANNs can be used for a variety of applications, among which are classification in medical applications [3, 5, 23, 35], descriptive modeling, clustering, function approximation, time series prediction and sonar or radar detection. Classification is one of the most frequently encountered decision-making tasks in human activity. A classification problem occurs when an object needs to be assigned to a predefined group or class based on a number of observed patterns related to that object. In this paper, a classifier based on an ANN is used to classify patterns that correspond to pixels belonging to healthy tissue and patterns that correspond to pixels belonging to microcalcifications, which we will call the normal tissue (NT) class and the MCs class, respectively. For this purpose, a multilayer perceptron (MLP) is used. The MLP is the most popular ANN for many practical applications, such as pattern recognition applications. The functionality of the MLP topology is determined by a learning algorithm, back propagation (BP), which is based on the method of steepest descent. For updating connection weights, it is the algorithm most commonly used by the ANN scientific community.
5 Methodology and results
5.1 Morphological enhancement
The morphological top-hat transform is used to enhance ROI images, with the aim of detecting objects that differ in brightness from the surrounding background; in our case, it was used to increase the contrast between the MCs and the background. During image enhancement, the same SE was applied at three sizes (3 × 3, 5 × 5 and 7 × 7) to perform the top-hat transform. The SE used in this work was a flat disk-shaped SE. Figure 2 shows the original ROI images processed by the top-hat transform with an SE of size 7 × 7.
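The enhancement step can be sketched with plain NumPy as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the helper names (`disk`, `erode`, `dilate`, `top_hat`) are hypothetical, the disk is approximated on the pixel grid, and image borders are handled by padding with a neutral value. A library routine such as `scipy.ndimage.white_tophat` could be used instead.

```python
import numpy as np


def disk(size):
    """Flat disk-shaped footprint inscribed in a size x size window (size odd)."""
    r = size // 2
    y, x = np.ogrid[-r:r + 1, -r:r + 1]
    return (x * x + y * y) <= r * r


def _shifted_stack(img, fp, pad_value):
    """Stack one shifted copy of img per active footprint pixel."""
    r = fp.shape[0] // 2
    h, w = img.shape
    padded = np.pad(img, r, mode="constant", constant_values=pad_value)
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy, dx in zip(*np.nonzero(fp))])


def erode(img, fp):
    # Grayscale erosion: minimum over the footprint; padding with the image
    # maximum keeps the border from influencing the result.
    return _shifted_stack(img, fp, img.max()).min(axis=0)


def dilate(img, fp):
    # Grayscale dilation: maximum over the footprint (the disk is symmetric).
    return _shifted_stack(img, fp, img.min()).max(axis=0)


def top_hat(img, size=7):
    """Top-hat transform: image minus its morphological opening."""
    img = np.asarray(img, dtype=float)  # avoid unsigned-integer wrap-around
    fp = disk(size)
    opening = dilate(erode(img, fp), fp)
    return img - opening
```

Bright details smaller than the SE, such as isolated MC pixels, are removed by the opening and therefore survive the subtraction, while a smooth background is largely cancelled.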
5.2 Image segmentation by clustering
5.2.1 Data vector creation
For data vector Z, two proposed clustering techniques are then applied to obtain a label for each pattern belonging to each cluster of the partition of feature space, where only one cluster corresponds to MCs, which generally appear in a group of just a few patterns (pixels), and the remaining clusters correspond to normal (healthy) tissue.
The initial conditions and results for each proposed clustering technique are presented below.
5.2.2 Segmentation by k-means
Cluster number: 2 to 4.
Prototypes: initialized as random values.
Distance measure: Euclidean distance function.
5.2.3 Sub-segmentation by PFCM
Cluster number: 2.
Prototypes: initialized as random values.
Distance measure: Euclidean distance function.
Parameters: a = 1, b = 2, m = 2, η = 2; threshold values α = 0.04, α = 0.03 and α = 0.02.
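A compact NumPy sketch of PFCM with the parameter values listed above is given below. It follows the update equations published by Pal et al. for PFCM, but several choices are assumptions made for illustration: the prototypes are first placed by a few fuzzy c-means iterations, the constants γ_i are estimated from that run with K = 1, the initial prototypes `V0` are passed in explicitly (rather than drawn at random) so that the example is reproducible, and all function names are hypothetical.

```python
import numpy as np


def _sq_dist(Z, V):
    """Squared Euclidean distances, shape (c, N), floored to avoid division by zero."""
    D2 = ((V[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.maximum(D2, 1e-12)


def _memberships(D2, m):
    """Fuzzy memberships; each column (pixel) sums to 1."""
    inv = D2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)


def pfcm(Z, V0, a=1.0, b=2.0, m=2.0, eta=2.0, n_init=20, max_iter=100, tol=1e-6):
    Z = np.asarray(Z, dtype=float)
    V = np.array(V0, dtype=float)
    # Assumed initialization: plain fuzzy c-means iterations.
    for _ in range(n_init):
        W = _memberships(_sq_dist(Z, V), m) ** m
        V = (W @ Z) / W.sum(axis=1, keepdims=True)
    # gamma_i estimated from the FCM result (assumed K = 1).
    D2 = _sq_dist(Z, V)
    W = _memberships(D2, m) ** m
    gamma = (W * D2).sum(axis=1) / W.sum(axis=1)
    for _ in range(max_iter):
        D2 = _sq_dist(Z, V)
        U = _memberships(D2, m)                                   # memberships
        T = 1.0 / (1.0 + (b * D2 / gamma[:, None]) ** (1.0 / (eta - 1.0)))  # typicalities
        W = a * U ** m + b * T ** eta
        V_new = (W @ Z) / W.sum(axis=1, keepdims=True)            # prototypes
        shift = np.abs(V_new - V).max()
        V = V_new
        if shift < tol:
            break
    return U, T, V
```

The returned typicality matrix `T` has one row per cluster and one column per pixel pattern, which is the form needed for thresholding with α in the sub-segmentation procedure.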
[Table: Number of patterns assigned to MCs and NT. Columns: number of patterns by k-means; number of patterns by sub-segmentation with PFCM.]
5.2.4 Feature extraction
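The abstract and the discussion describe window-based features, such as the mean and standard deviation, extracted from the original ROIs. A minimal sketch of such features is shown below. The window size, the reflective border handling and the function name `window_mean_std` are assumptions; the six attributes that make up the feature vector FV are not listed in the text, so this sketch should not be read as the authors' exact feature set.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def window_mean_std(img, w=3):
    """Per-pixel mean and standard deviation over a w x w window (w odd)."""
    r = w // 2
    # Reflect the borders so every pixel has a full window (assumed choice).
    padded = np.pad(np.asarray(img, dtype=float), r, mode="reflect")
    windows = sliding_window_view(padded, (w, w))  # shape (H, W, w, w)
    return windows.mean(axis=(2, 3)), windows.std(axis=(2, 3))
```

Stacking the outputs for several window sizes, one column per feature and one row per pixel, would give a feature matrix suitable as classifier input.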
[Table: Results of balancing. Columns: number of patterns by k-means; number of patterns by sub-segmentation with PFCM.]
5.3 Microcalcification classification by ANN
1. Number of input neurons, equal to the number of attributes in FV: 6.
2. Number of hidden layers: 1.
3. Hidden neurons: see Table 4.
4. Output neurons: 1 (all classifications present two classes).
5. Learning rate: 1.
6. Activation function: sigmoidal, with values in [0, 1].
7. All weights randomly initialized.
8. Training phase: back propagation (BP).
9. Test training conditions:
   - Mean squared error (MSE): 0.001.
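The configuration above can be illustrated with a small NumPy MLP trained by batch back propagation. This is a sketch under assumptions rather than the authors' implementation: the weight initialization range, the use of full-batch updates, the epoch cap `max_epochs` and the function names are all introduced here for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_mlp(X, y, n_hidden=15, lr=1.0, mse_goal=1e-3, max_epochs=5000, seed=0):
    """One-hidden-layer MLP with sigmoid units and a single output neuron."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    n = len(X)
    # Random initial weights (the range is an assumption).
    W1 = rng.uniform(-0.5, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.uniform(-0.5, 0.5, (n_hidden, 1))
    b2 = np.zeros(1)
    for _ in range(max_epochs):
        # Forward pass.
        H = sigmoid(X @ W1 + b1)
        out = sigmoid(H @ W2 + b2)
        err = out - y
        mse = float((err ** 2).mean())
        if mse <= mse_goal:  # stop condition on the training MSE
            break
        # Backward pass: steepest descent on half the mean squared error.
        d_out = err * out * (1.0 - out)
        d_hid = (d_out @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ d_out) / n
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (X.T @ d_hid) / n
        b1 -= lr * d_hid.mean(axis=0)
    return (W1, b1, W2, b2), mse


def predict(params, X):
    """Network output in [0, 1]; threshold at 0.5 to decide between NT and MCs."""
    W1, b1, W2, b2 = params
    return sigmoid(sigmoid(np.asarray(X, dtype=float) @ W1 + b1) @ W2 + b2).ravel()
```

The default `n_hidden=15` mirrors the 6 : 15 : 1 structure that appears among the reported network structures; other hidden-layer sizes can be tried by changing that argument.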
[Table: Number of patterns used for training and testing for each classifier. Column: number of samples.]
[Table: The best network structure and parameters for each database. Column: data set (FV).]
[Table: Confusion matrices and performance of the classifiers. Column: total classification accuracy (%). Network structures listed: 6 : 15 : 1 and 6 : 12 : 1.]
6 Discussion and conclusions
According to the performance of the classifiers as determined by means of the ROC curves (Figure 5) and the final images obtained (Figure 6), the proposed method is a promising alternative for automatically detecting MCs in ROIs extracted from digitized mammograms. This method involves several techniques that contribute to the MC detection stage. The image segmentation stage is one of the most difficult stages when using partitional clustering algorithms, because these clustering algorithms are applied in the feature space. Therefore, if the image contains noise or is not very homogeneous, image segmentation by clustering can be inaccurate. Thus, an image processing technique based on mathematical morphology was used to solve this problem. In the segmentation stage, two partitional clustering algorithms were used: k-means and PFCM. The k-means is the most popular technique, and its advantages and drawbacks are well known. With the PFCM, a new method for image segmentation called image sub-segmentation was used, in which the degrees of typicality of each data point were used to partition an image into two regions: one region with tissue suspected of harboring MCs and the other with normal (healthy) tissue. Then, the most atypical data points (pixels) of each region were identified; these data include possible abnormalities present in these regions, especially the region suspected of possessing MCs, because these atypical data, or abnormalities, represent the pixels belonging to potential MCs. For the ROI images used in this paper, both clustering algorithms used to perform image segmentation gave good results, although these results depend largely on good feature extraction and, in this paper, on the image enhancement stage. Once the MCs were detected from the original ROIs, window-based features such as the mean and standard deviation were extracted, which were then used as input vectors in a classifier.
To perform this classification task, ANNs proved to be an excellent alternative. In this paper, a classifier based on the MLP was used. In the ROI images, the MCs class represented a lower percentage of pixels with respect to the number of pixels belonging to the healthy or normal tissue class. Therefore, balancing between patterns belonging to the MCs class and to the NT class was performed to obtain better results during the classification stage. Finally, according to the results obtained by applying our proposed method to these ROI images, the implemented method can detect pixels corresponding to microcalcifications or healthy tissue, thus fulfilling the aim of this paper.
The authors wish to thank the Group for Automation in Signal and Communications (GASC) of the Technical University of Madrid, the Laboratorio de Inteligencia Computacional (LABINCO) of the Guanajuato University, the National Council for Science and Technology (CONACyT), the Department of Project Engineering (CUCEI) of the University of Guadalajara and Dr. Bernhard Angele.
1. Pal N, Bhowmick B, Patel S, Pal S, Das J: A multi-stage neural network aided system for detection of microcalcifications in digitized mammograms. Neurocomputing 2008, 71(13-15): 2625-2634. doi:10.1016/j.neucom.2007.06.015
2. Wei L, Yang Y, Nishikawa R: Microcalcification classification assisted by content-based image retrieval for breast cancer diagnosis. Pattern Recognit 2009, 42(6): 1126-1132. doi:10.1016/j.patcog.2008.08.028
3. Vega-Corona A, Álvarez A, Andina D: Feature Vectors Generation for Detection of Microcalcifications in Digitized Mammography Using Neural Networks. Artificial Neural Nets Problem Solving Methods, LNCS 2003, 2687: 583-590.
4. Papadopoulos A, Fotiadis D, Costaridou L: Improvement of microcalcification cluster detection in mammography utilizing image enhancement techniques. Comput Biol Med 2008, 38(10): 1045-1055. doi:10.1016/j.compbiomed.2008.07.006
5. Halkiotis S, Botsis T, Rangoussi M: Automatic detection of clustered microcalcifications in digital mammograms using mathematical morphology and neural networks. Signal Proces 2007, 87(7): 1559-1568. doi:10.1016/j.sigpro.2007.01.004
6. Fu J, Lee S, Wong S, Yeh J, Wang A, Wu H: Image segmentation feature selection and pattern classification for mammographic microcalcifications. Comput Med Imaging Graph 2005, 29(6): 419-429.
7. Cheng H, Cai X, Chen X, Hu L, Lou X: Computer-aided detection and classification of microcalcifications in mammograms: a survey. Pattern Recognit 2003, 36(12): 2967-2991. doi:10.1016/S0031-3203(03)00192-4
8. Wirth M, Fraschini M, Lyon J: Contrast enhancement of microcalcifications in mammograms using morphological enhancement and non-flat structuring elements. 17th IEEE Symposium on Computer-Based Medical System 2004, 134-139.
9. Meyer F: Iterative image transformations for an automatic screening of cervical smears. J Histochem Cytochem 1979, 27(1): 128-135. doi:10.1177/27.1.438499
10. Stojić T, Reljin B: Enhancement of microcalcifications in digitized mammograms: Multifractal and mathematical morphology approach. FME Trans 2010, 38: 1-9.
11. Gonzalez R, Woods R: Digital Image Processing. Prentice Hall; 2002.
12. Ge J, Hadjiiski M, Sahiner B, Wei J, Helvie M, Zhou C, Chan H: Computer-aided detection system for clustered microcalcifications: comparison of performance on full-field digital mammograms and digitized screen-film mammograms. Phys Med Biol 2007, 52(4): 981-1000. doi:10.1088/0031-9155/52/4/008
13. Wu Y, Huang Q, Peng Y, Situ W: Detection of microcalcifications in digital mammograms based on dual-threshold. International Workshop on Digital Mammography, IWDM 2006, LNCS 4046: 347-354. doi:10.1007/11783237_47
14. Veni G, Regentova E, Zhang L: Detection of Clustered Microcalcifications with Susan Edge Detector, Adaptive Contrast Thresholding and Spatial Filters. Lecture Notes in Computer Science 2008, 5112: 837-843. doi:10.1007/978-3-540-69812-8_83
15. Jevtic A, Quintanilla-Dominguez J, Cortina-Januchs M, Andina D: Edge detection using ant colony search algorithm and multiscale contrast enhancement. IEEE International Conference on Systems, Man and Cybernetics, SMC 2009, 2193-2198.
16. Qiu G, Jianhua Z, Shengyong C, Todd-Pokropek A: Automatic segmentation of micro-calcification based on sift in mammograms. International Conference on BioMedical Engineering and Informatics 2008, 2: 13-17.
17. Bankman I, Nizialek T, Simon I, Gatewood O, Weinberg I, Brody W: Segmentation algorithms for detecting microcalcifications in mammograms. IEEE Trans Inf Technol Biomed 1997, 1(2): 141-149. doi:10.1109/4233.640656
18. Rojas-Domínguez A, Nandi AK: Toward breast cancer diagnosis based on automated segmentation of masses in mammograms. Pattern Recognit 2009, 42(6): 1138-1148. doi:10.1016/j.patcog.2008.08.006
19. Stojić T, Reljin I, Reljin B: Adaptation of multifractal analysis to segmentation of microcalcifications in digital mammograms. Phys A Stat Mech Appl 2006, 367: 494-508.
20. Piñuela JA, Andina D, McInnes K, Tarquis AM: Wavelet analysis in a structured clay soil using 2-d image. Nonlinear Proces Geophys 2007, 14: 425-234. doi:10.5194/npg-14-425-2007
21. Dar-Ren C, Ruey-Feng C, Chii-Jen C, Minng-Feng H, Shou-Jen K, Che CS-T, Shin-Jer H, Moon WK: Classification of breast ultrasound image using fractal features. Clin Imaging 2005, 29(4): 235-245. doi:10.1016/j.clinimag.2004.11.024
22. Li H, Liu R, Lo S: Fractal modeling and segmentation for the enhancement of microcalcifications in digital mammograms. IEEE Trans Med Imaging 1997, 16(6): 785-798.
23. Verma B: Impact of multiple clusters on neural classification of rois in digital mammograms. International Joint Conference on Neural Networks 2009, 3220-3223.
24. Jevtic A, Quintanilla-Dominguez J, Barron-Adame J, Andina D: Image segmentation using ant system-based clustering algorithm. Soft Computing Models in Industrial and Environmental Applications, 6th International Conference SOCO 2011, 2011, 87.
25. Bhattacharya M, Das A: Fuzzy logic based segmentation of microcalcification in breast using digital mammograms considering multiresolution. Int Mach Vis Image Process Conf 2007, 98-105.
26. Ojeda-Magaña B, Quintanilla-Domínguez J, Ruelas R, Andina D: Images sub-segmentation with the pfcm clustering algorithm. 7th IEEE International Conference on Industrial Informatics 2009, 499-503.
27. Bougioukos P, Glotsos D, Kostopoulos S, Daskalakis A, Kalatzis I, Dimitropoulos N, Nikiforidis G, Cavouras D: Fuzzy c-means-driven fhce contextual segmentation method for mammographic microcalcification detection. Imaging Sci J 2009, 58(3): 146-154.
28. Yang G, Zhou G, Yin Y, Yang X: K-means based fingerprint segmentation with sensor interoperability. EURASIP J Adv Signal Process 2010, 2010.
29. Mu J, Liu X, Kamocka M, Xu Z, Alber M, Rosen E, Chen D: Segmentation, reconstruction, and analysis of blood thrombus formation in 3d 2-photon microscopy images. EURASIP J Adv Signal Process 2010, 2010.
30. Jain A, Murty M, Flynn P: Data clustering: a review. ACM Comput Surv 1999, 31.
31. MacQueen J: Some methods for classification and analysis of multivariate observations. Fifth Berkeley Symposium on Mathematical Statistics and Probability 1967, 1: 281-297.
32. Krishnapuram R, Keller JM: A possibilistic approach to clustering. IEEE Trans Fuzzy Syst 1993, 1(2): 98-110. doi:10.1109/91.227387
33. Pal N, Pal S, Keller J, Bezdek J: A possibilistic fuzzy c-means clustering algorithm. IEEE Trans Fuzzy Syst 2005, 13(4): 517-530.
34. Andina D, Pham D: Computational Intelligence for Engineering and Manufacturing. 1st edition. Springer; 2007.
35. Papadopoulos A, Fotiadis D, Likas A: An automatic microcalcification detection system based on a hybrid neural network classifier. Artif Intell Med 2002, 25(2): 149-167. doi:10.1016/S0933-3657(02)00013-1
36. Cortina-Januchs M, Barrón-Adame J, Vega-Corona A, Andina D: Prevision of industrial SO2 pollutant concentration applying anns. 7th IEEE International Conference on Industrial Informatics, INDIN 2009, 510-515.
37. Andina D, Sanz-Gonzalez J: On the problem of binary detection with neural networks. Proceedings of the 38th Midwest Symposium on Circuits and Systems 1995, 1: 554-557.
38. Basheer I, Hajmeer M: Artificial neural networks: fundamentals, computing, design, and application. J Microbiol Methods 2000, 43(1): 3-31. doi:10.1016/S0167-7012(00)00201-3
39. Hagan M, Demuth H, Beale M: Neural Network Design. PWS Pub Co, Boston; 1996.
40. Suckling J, Parker J, Dance D: The mammographic image analysis society digital mammogram database. Excerpta Medica International Congress Series 1994, 1069: 375-378.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.