Cluster Structure Inference Based on Clustering Stability with Applications to Microarray Data Analysis

Abstract

This paper focuses on the stability-based approach for estimating the number of clusters in microarray data. The cluster stability approach amounts to clustering successive random subsets of the available data and evaluating an index that expresses the similarity of the resulting partitions. We present a method for automatically estimating the number of clusters, starting from the distribution of the similarity index. We investigate how the choice of hierarchical clustering (HC) method and the choice of similarity index each influence the estimation accuracy. The paper introduces a new similarity index based on a partition distance. The performance of the new index and that of other well-known indices is experimentally evaluated by comparing the "true" data partition with the partition obtained at each level of an HC tree. A case study is conducted with a publicly available Leukemia dataset.
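The resampling loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes average-linkage HC, uses the well-known Rand index as a stand-in for the similarity index (the paper's partition-distance index is not reproduced here), and simply collects the index values for each candidate number of clusters. All function names, parameters, and the synthetic data are hypothetical, and only the Python standard library is used.

```python
import random
from itertools import combinations

def average_linkage(points, k):
    """Naive agglomerative clustering (average linkage), cut at k clusters."""
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(points[a], points[b])) ** 0.5

    def linkage(c1, c2):
        return sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        # Merge the pair of clusters with the smallest average distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters[j]
        del clusters[j]  # i < j, so index i is unaffected

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels

def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two partitions agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)

def stability_scores(points, k, n_rounds=20, fraction=0.8, seed=0):
    """Similarity-index values over pairs of random subsets, for a given k."""
    rng = random.Random(seed)
    n = len(points)
    m = max(k, int(fraction * n))
    scores = []
    for _ in range(n_rounds):
        s1 = rng.sample(range(n), m)
        s2 = rng.sample(range(n), m)
        l1 = dict(zip(s1, average_linkage([points[i] for i in s1], k)))
        l2 = dict(zip(s2, average_linkage([points[i] for i in s2], k)))
        # Compare the two partitions only on the points both subsets contain.
        shared = sorted(set(s1) & set(s2))
        if len(shared) >= 2:
            scores.append(rand_index([l1[i] for i in shared],
                                     [l2[i] for i in shared]))
    return scores

# Synthetic stand-in data: three well-separated groups of 10 points each.
_rng = random.Random(1)
points = [(cx + _rng.uniform(-1, 1), cy + _rng.uniform(-1, 1))
          for cx, cy in [(0, 0), (10, 0), (0, 10)] for _ in range(10)]

scores_by_k = {k: stability_scores(points, k) for k in range(2, 6)}
mean_by_k = {k: sum(v) / len(v) for k, v in scores_by_k.items()}
```

Inspecting `scores_by_k` gives the empirical distribution of the index for each candidate number of clusters. Picking the value with the highest mean is only one simple heuristic; the estimation rule proposed in the paper works from the distribution itself and may differ.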

Author information

Corresponding author

Correspondence to Ciprian Doru Giurcăneanu.

About this article

Cite this article

Giurcăneanu, C.D., Tăbuş, I. Cluster Structure Inference Based on Clustering Stability with Applications to Microarray Data Analysis. EURASIP J. Adv. Signal Process. 2004, 545761 (2004). https://doi.org/10.1155/S1110865704309078

Keywords and phrases

  • clustering stability
  • number of clusters
  • hierarchical clustering methods
  • similarity indices
  • partition-distance
  • microarray data