EquipAsso: An Algorithm based on New Relational Algebraic Operators for Association Rules Discovery

R. Timarán Pereira and M. Millán (Colombia)

Keywords

Data Mining, Association Rules, Large Itemsets, Algebraic Operators, SQL Primitives.

Abstract

The search for interesting relationships among data has always been a research focus in data mining. The overall performance of association rule mining is determined by the discovery of the large itemsets, i.e., the itemsets whose support is above a predetermined minimum support. The algorithms proposed for association rules take different approaches to generating all large itemsets (Apriori, AprioriTid, AprioriHybrid, DHP (Direct Hashing and Pruning), DIC (Dynamic Itemset Counting), Partition, and FP-tree). However, none of these algorithms is based on relational algebra operators, and all of them have been implemented outside database engines. In this paper, an algorithm called EquipAsso is proposed. EquipAsso discovers large itemsets using two new relational algebra operators: Associator and EquiKeep. These two operators facilitate the search for large itemsets and allow the algorithm to be tightly coupled with a relational database system. Associator and EquiKeep are implemented in the SQL SELECT clause as two new primitives, Associator Range and EquiKeep On.
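
To make the notion of a large itemset concrete, the following is a minimal Python sketch that counts the support of every item combination in a small set of transactions and keeps those that reach a minimum support. It is a brute-force illustration only, not the EquipAsso algorithm and not the Associator or EquiKeep operators, whose definitions are not given in this abstract. The function name `large_itemsets`, the `max_size` parameter, the sample transactions, and the choice of "count greater than or equal to the threshold" are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def large_itemsets(transactions, min_support, max_size=None):
    """Return {itemset: support_count} for itemsets with count >= min_support.

    Hypothetical helper: enumerates every combination of items in each
    transaction, which is exponential and only suitable for tiny inputs.
    """
    counts = Counter()
    for transaction in transactions:
        # Deduplicate and sort so each itemset has one canonical tuple form.
        items = sorted(set(transaction))
        limit = len(items) if max_size is None else min(max_size, len(items))
        for size in range(1, limit + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    # Keep only the itemsets that reach the minimum support (assumed >=).
    return {s: c for s, c in counts.items() if c >= min_support}


# Illustrative data, not taken from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "beer", "milk"],
    ["beer", "milk"],
    ["bread", "beer"],
]
result = large_itemsets(transactions, min_support=2)
for itemset, count in sorted(result.items()):
    print(itemset, count)
```

With a minimum support of 2, every single item and every pair qualifies in this sample, while the three-item set appears in only one transaction and is discarded.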
