Finding Digest of Rules: Towards Data-Driven Data Mining

M. Sapir and M. Teverovskiy (USA)


Induction, probabilistic patterns, exploratory data mining, interesting rules, non-redundant rules


We propose a new approach to learning a compact set of robust production rules from data. In most other methods, the selection of interesting rules is governed by subjectively chosen threshold values for criteria such as support and confidence, which makes the induced knowledge subjective. We propose a way to avoid this subjectivity and to induce all of the most important rules that a given data set supports. First, we define a preference relation on the set of rules that compares rules by both generality and validity. Then we define the digest of rules as a minimal subset of rules that contains, for every rule outside the digest, a rule preferable to it. In this way, the digest comprises the most important and diverse knowledge in the data. We propose an algorithm for finding the digest and show that the digest exists and is unique for a given data set. The results are obtained for production rules, which generalize rule types such as association rules and interval rules.
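
The digest definition can be illustrated with a small sketch. The abstract does not state the preference relation itself, so the one used here is an assumption made only for illustration: a rule is preferred to another with the same consequent if its antecedent is a subset of the other's (at least as general) and its confidence is at least as high (at least as valid), with one of the two strictly better. The names `Rule`, `preferred`, and `digest` are hypothetical and are not taken from the paper. Under this assumed relation, which is a strict partial order on a finite set of rules, the digest is simply the set of rules to which no other rule is preferred, and this also shows why it exists and is unique in this setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    antecedent: frozenset  # conditions on the left-hand side of the rule
    consequent: str        # predicted outcome
    confidence: float      # stand-in for "validity" (an assumption)


def preferred(a: Rule, b: Rule) -> bool:
    """Assumed preference: a is at least as general and at least as valid
    as b, and strictly better in at least one of the two respects."""
    if a.consequent != b.consequent:
        return False
    at_least_as_general = a.antecedent <= b.antecedent
    at_least_as_valid = a.confidence >= b.confidence
    strictly_better = a.antecedent < b.antecedent or a.confidence > b.confidence
    return at_least_as_general and at_least_as_valid and strictly_better


def digest(rules):
    """Return the rules that no other rule is preferred to.

    For a strict partial order on a finite set, every remaining rule has
    a preferable rule among these, and none of these can be dropped.
    """
    unique = list(dict.fromkeys(rules))  # drop duplicates, keep order
    return [r for r in unique if not any(preferred(o, r) for o in unique)]


rules = [
    Rule(frozenset({"age>30"}), "buys", 0.80),
    # More specific and less valid than the first rule, so it is left out.
    Rule(frozenset({"age>30", "income=high"}), "buys", 0.75),
    # More specific but more valid, so neither rule above is preferred to it.
    Rule(frozenset({"age>30", "income=high", "urban"}), "buys", 0.95),
    # Incomparable with the others by generality.
    Rule(frozenset({"student"}), "buys", 0.60),
]
result = digest(rules)
```

In this toy data only the second rule is excluded. The quadratic scan is a brute-force illustration of the definition, not the search algorithm proposed in the paper.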
