An Expansion of X-Means for Automatically Determining the Optimal Number of Clusters – Progressive Iterations of K-Means and Merging of the Clusters

T. Ishioka (Japan)


Non-hierarchical clustering, Information criterion, BIC, Feedback operation, Computer simulation


We extend a non-hierarchical clustering algorithm that determines the optimal number of clusters by using iterations of k-means and a stopping rule based on the Bayesian Information Criterion (BIC). The procedure merges clusters produced by a k-means iteration to avoid unsuitable divisions caused by the order of division. With this additional merging operation, the number of cases with adequate clustering increased across various types of simulation runs. With no prior information about the number of clusters, our method obtains the optimal clustering based on information theory rather than on a heuristic method. The computational complexity of our method is O(N log k) for sample size N and number of final clusters k.
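The split-then-merge idea can be illustrated with a minimal sketch for one-dimensional data. It is not the implementation from the paper. The BIC expressions used here (a hard-assignment Gaussian likelihood with 2 parameters for one cluster and 5 for a split), the min/max initialization of 2-means, the `min_size` threshold, and the restriction of merging to neighbouring clusters are all simplifying assumptions, and every function name is hypothetical.

```python
import math
from statistics import NormalDist, fmean

def _log_lik(xs):
    """Maximized log-likelihood of one 1-D Gaussian fitted to xs."""
    n = len(xs)
    mu = fmean(xs)
    var = max(sum((x - mu) ** 2 for x in xs) / n, 1e-12)  # guard against zero variance
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def bic_single(xs):
    """BIC of a one-cluster model (mean, variance); lower is better."""
    return -2.0 * _log_lik(xs) + 2 * math.log(len(xs))

def bic_split(a, b):
    """BIC of a two-cluster model with hard assignment (assumed form)."""
    n = len(a) + len(b)
    ll = sum(_log_lik(p) + len(p) * math.log(len(p) / n) for p in (a, b))
    return -2.0 * ll + 5 * math.log(n)  # 2 means, 2 variances, 1 mixing weight

def two_means(xs, iters=50):
    """Plain 2-means in 1-D, initialized at the minimum and maximum."""
    c0, c1 = min(xs), max(xs)
    for _ in range(iters):
        a = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        b = [x for x in xs if abs(x - c0) > abs(x - c1)]
        if not a or not b:
            break
        n0, n1 = fmean(a), fmean(b)
        if (n0, n1) == (c0, c1):
            break
        c0, c1 = n0, n1
    return a, b

def split_recursively(xs, min_size=4):
    """Keep dividing a cluster in two while the split lowers the BIC."""
    if len(xs) < 2 * min_size:
        return [xs]
    a, b = two_means(xs)
    if len(a) < min_size or len(b) < min_size:
        return [xs]
    if bic_split(a, b) >= bic_single(xs):
        return [xs]  # stopping rule: the split is not supported
    return split_recursively(a, min_size) + split_recursively(b, min_size)

def merge_pass(clusters):
    """Merge neighbouring clusters whenever one Gaussian has the lower BIC."""
    clusters = sorted(clusters, key=fmean)
    merged = True
    while merged and len(clusters) > 1:
        merged = False
        for i in range(len(clusters) - 1):
            a, b = clusters[i], clusters[i + 1]
            if bic_single(a + b) <= bic_split(a, b):
                clusters[i:i + 2] = [a + b]
                merged = True
                break
    return clusters

def x_means_1d(xs):
    """Progressive 2-means splitting followed by a merging pass."""
    return merge_pass(split_recursively(list(xs)))

def demo_points(center, n=100):
    """Deterministic, Gaussian-shaped test points (normal quantiles)."""
    nd = NormalDist(mu=center, sigma=1.0)
    return [nd.inv_cdf((i + 0.5) / n) for i in range(n)]

clusters = x_means_1d(demo_points(0.0) + demo_points(10.0))
print([len(c) for c in clusters])
```

Calling `merge_pass` directly on a deliberately over-divided input, such as the two halves of one `demo_points` group plus a distant group, shows the role of the merging step: the halves are rejoined because a single Gaussian gives the lower BIC under the assumptions above.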
