X - International Journal of Information Science and Computer Mathematics (Closed Ed TRF)
Volume 1, Issue 2, Pages 89 - 96
(May 2010)
INCREMENTAL GENETIC k-MODES ALGORITHM FOR CLUSTERING CATEGORICAL DATA SETS
Dharmendra Kumar Roy and Lokesh Kumar Sharma
Abstract: Partitioning a set of objects in a database into homogeneous groups, or clusters, is a fundamental operation in data mining. In this paper, we present the incremental genetic k-modes algorithm (IGKMODE), a clustering algorithm based on the genetic k-modes paradigm (GKMODE) that works well for data with categorical features. The main idea of IGKMODE is to calculate the objective value, the Total Within-Cluster Variation (TWCV), and the cluster centroids incrementally whenever the mutation probability is small. IGKMODE outperforms GKMODE when the mutation probability is small. It inherits the salient feature of GKMODE of always converging to the global optimum, and it provides a better characterization of clusters. The performance of the algorithm has been studied on KDD Cup data sets.
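The incremental idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the simple matching dissimilarity commonly used with k-modes, under which a cluster's contribution to TWCV for one attribute equals the cluster size minus the frequency of that attribute's most common value. All function names (`build_counts`, `cluster_variation`, `twcv`, `move_object`, `modes`) are hypothetical. When a mutation reassigns a single object, only the two affected clusters are updated, and the rest of the objective value is reused.

```python
from collections import Counter


def build_counts(data, labels, k):
    """Return per-cluster, per-attribute value counts and the cluster sizes."""
    n_attrs = len(data[0])
    counts = [[Counter() for _ in range(n_attrs)] for _ in range(k)]
    sizes = [0] * k
    for row, c in zip(data, labels):
        sizes[c] += 1
        for j, value in enumerate(row):
            counts[c][j][value] += 1
    return counts, sizes


def cluster_variation(attr_counts, size):
    """Mismatches between a cluster's objects and its mode (assumed simple matching)."""
    return sum(size - max(cnt.values(), default=0) for cnt in attr_counts)


def twcv(counts, sizes):
    """Full (non-incremental) objective: sum of the variation of every cluster."""
    return sum(cluster_variation(counts[c], sizes[c]) for c in range(len(sizes)))


def move_object(row, src, dst, counts, sizes, total):
    """Reassign one object from cluster src to dst and return the updated TWCV.

    Only the two affected clusters are re-evaluated.
    """
    if src == dst:
        return total
    before = (cluster_variation(counts[src], sizes[src])
              + cluster_variation(counts[dst], sizes[dst]))
    for j, value in enumerate(row):
        counts[src][j][value] -= 1
        if counts[src][j][value] == 0:
            del counts[src][j][value]  # drop values no longer present in src
        counts[dst][j][value] += 1
    sizes[src] -= 1
    sizes[dst] += 1
    after = (cluster_variation(counts[src], sizes[src])
             + cluster_variation(counts[dst], sizes[dst]))
    return total - before + after


def modes(counts):
    """Current mode (most frequent value per attribute) of each cluster."""
    return [[cnt.most_common(1)[0][0] if cnt else None for cnt in cluster]
            for cluster in counts]


# Toy data, purely illustrative: four objects with two categorical attributes.
data = [("a", "x"), ("a", "y"), ("b", "y"), ("b", "y")]
labels = [0, 0, 1, 1]
counts, sizes = build_counts(data, labels, k=2)
total = twcv(counts, sizes)

# A mutation moves object 2 from cluster 1 to cluster 0.
total = move_object(data[2], 1, 0, counts, sizes, total)
labels[2] = 0
```

In this sketch the cost of one reassignment depends on the number of attributes and the sizes of the two touched count tables, not on the number of objects, which is where a saving over full recomputation would come from when few objects change per generation.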
Keywords and phrases: data mining, clustering algorithm, categorical data.
Communicated by Kewen Zhao