Frequent Pattern Mining from High-Dimensional Data using Record Space Search

K. Mori and R. Orihara (Japan)

Keywords

Data mining, association rule, frequent pattern mining, high-dimensional data, record space search, parallel processing

Abstract

Traditional frequent pattern mining methods search over combinations of attributes, so their computational cost grows exponentially with the dimensionality of the data. The purpose of our work is to develop methods that efficiently extract frequent patterns from very high-dimensional data. We propose HD FPM, which addresses this problem using a record space search and minimum pattern length pruning. The record space search explores combinations of records rather than combinations of attributes; frequent patterns are extracted from the attributes common to each combination of records. Minimum pattern length pruning further reduces the search space. Several experiments on real microarray datasets show that HD FPM outperforms previous closed frequent pattern mining algorithms such as FPclose and CHARM when the minimum support is low. We also propose parallel HD FPM, which addresses the problem using vertical partitioning of the database and parallel processing. Our evaluation of parallel HD FPM on a real microarray dataset with 16 processors shows that it is 13 times faster than the sequential version. In conclusion, HD FPM and parallel HD FPM are effective algorithms for frequent pattern mining from high-dimensional data.
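
To make the idea of searching over combinations of records concrete, the following is a minimal Python sketch, not the authors' HD FPM implementation. The function name `record_space_search`, its parameters, and the toy records are hypothetical, and counting support by rescanning all records is an assumption made for brevity. The sketch relies on one observation: adding a record to a combination can only shrink the set of attributes the combination shares, so a branch whose shared pattern is already shorter than the minimum pattern length can be abandoned.

```python
from typing import Dict, FrozenSet, List, Optional


def record_space_search(
    records: List[FrozenSet[str]],
    min_support: int,
    min_length: int,
) -> Dict[FrozenSet[str], int]:
    """Map each pattern shared by some combination of records to its support.

    Hypothetical illustration: combinations of records are enumerated depth
    first, and the attributes common to each combination form a candidate.
    """
    found: Dict[FrozenSet[str], int] = {}

    def support(pattern: FrozenSet[str]) -> int:
        # Assumption: support is counted by scanning every record.
        return sum(1 for record in records if pattern <= record)

    def extend(start: int, common: Optional[FrozenSet[str]]) -> None:
        for i in range(start, len(records)):
            shared = records[i] if common is None else common & records[i]
            # Minimum pattern length pruning: further records can only
            # shrink the shared attribute set, so skip this branch.
            if len(shared) < min_length:
                continue
            if shared not in found:
                found[shared] = support(shared)
            extend(i + 1, shared)

    extend(0, None)
    # Keep only the patterns that meet the minimum support.
    return {p: s for p, s in found.items() if s >= min_support}


# Toy data: four records over the attributes a to e.
records = [
    frozenset("abcd"),
    frozenset("abc"),
    frozenset("abe"),
    frozenset("cd"),
]
patterns = record_space_search(records, min_support=2, min_length=2)
for pattern, count in sorted(patterns.items(), key=lambda kv: sorted(kv[0])):
    print("".join(sorted(pattern)), count)
```

The search depth in this sketch is bounded by the number of records rather than the number of attributes, which is why the approach suits data with few records and many attributes, such as microarray data. The sketch may revisit the same shared pattern through different record combinations; a practical implementation would need to avoid that redundancy.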
