An Efficient Clustering Method for High-dimensional Data in Data Mining Applications

J.-W. Chang and Y.-K. Kim (Korea)

Keywords

Cell-based clustering methods, high-dimensional data, data mining

Abstract

Most clustering methods used in data mining applications do not work efficiently on large, high-dimensional data because of the so-called 'curse of dimensionality' [1] and the limits of available memory. In this paper, we propose an efficient cell-based clustering method for handling a large amount of high-dimensional data. Our clustering method provides an efficient cell creation algorithm that uses a space-partitioning technique, and a cell insertion algorithm that constructs clusters as cells whose density exceeds a given threshold. We also propose a new filtering-based index structure that uses an approximation technique. The experimental results show that our cell-based clustering method achieves better cluster construction time and retrieval time than the CLIQUE method. Finally, our clustering method shows good performance on system efficiency, a measure that combines precision and retrieval time.
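
The general idea of keeping only cells whose density exceeds a threshold can be illustrated with a minimal sketch. This is a generic equal-width grid example, not the cell creation or cell insertion algorithm of the paper, which the abstract does not specify. The function name `dense_cells`, its parameters, the equal-width partitioning, and the use of a raw point count as the density are all assumptions made for illustration.

```python
from collections import Counter


def dense_cells(points, lows, highs, splits, threshold):
    """Return {cell: count} for grid cells holding more than `threshold` points.

    Hypothetical helper: each dimension d is cut into splits[d] equal-width
    intervals over [lows[d], highs[d]], and density is a raw point count.
    """
    counts = Counter()
    for point in points:
        cell = []
        for x, lo, hi, k in zip(point, lows, highs, splits):
            width = (hi - lo) / k
            idx = int((x - lo) / width)
            # Clamp so values on or beyond the bounds fall in an edge interval.
            idx = min(max(idx, 0), k - 1)
            cell.append(idx)
        counts[tuple(cell)] += 1
    # Keep only cells whose count exceeds the threshold.
    return {cell: n for cell, n in counts.items() if n > threshold}


points = [(0.10, 0.10), (0.20, 0.15), (0.22, 0.05), (0.90, 0.90)]
result = dense_cells(points, lows=(0.0, 0.0), highs=(1.0, 1.0),
                     splits=(4, 4), threshold=2)
```

Only occupied cells are stored, so memory grows with the number of non-empty cells rather than with the full grid, which matters as the number of dimensions increases.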
