Rule Induction using Probabilistic Approximations and Data with Missing Attribute Values

Patrick G. Clark and Jerzy W. Grzymala-Busse

Keywords

Data mining, rule induction, rough set theory, probabilistic approximations, parameterized approximations

Abstract

This paper presents results of experiments on rule induction from incomplete data (data with missing attribute values) using probabilistic approximations. Such approximations, broadly studied for many years, are fundamental concepts of variable precision rough set theory and of similar models for dealing with inconsistent data sets. Our main objective was to study how useful probabilistic approximations that differ from ordinary lower and upper approximations are. Our results are rather pessimistic: for eight data sets with two types of missing attribute values, in only one of the 16 cases did some of these probabilistic approximations perform better than ordinary approximations. On the other hand, in another case, some probabilistic approximations performed worse than ordinary approximations. Additionally, we studied how many different probabilistic approximations may exist for a given concept of a data set.
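As an illustration of the last point, the following minimal Python sketch counts the distinct probabilistic approximations of a concept. It assumes the common textbook definition, in which the approximation for a threshold alpha in (0, 1] is the union of all blocks whose conditional probability of the concept is at least alpha. The function names, the toy blocks, and the toy concept are hypothetical and are not taken from the paper; in particular, how the blocks are derived from data with missing attribute values is not modeled here.

```python
from fractions import Fraction


def conditional_probability(concept, block):
    """Pr(concept | block) = |concept & block| / |block|, kept exact."""
    return Fraction(len(concept & block), len(block))


def probabilistic_approximation(concept, blocks, alpha):
    """Union of all blocks whose conditional probability is at least alpha."""
    result = set()
    for block in blocks:
        if conditional_probability(concept, block) >= alpha:
            result |= block
    return frozenset(result)


def distinct_approximations(concept, blocks):
    """Map each positive conditional probability to its approximation.

    Only thresholds equal to a positive conditional probability can change
    the result, so these thresholds enumerate every distinct approximation.
    """
    thresholds = sorted(
        {conditional_probability(concept, b) for b in blocks} - {Fraction(0)}
    )
    return {a: probabilistic_approximation(concept, blocks, a) for a in thresholds}


# Hypothetical toy data: four blocks over cases 1..8 and one concept.
blocks = [
    frozenset({1, 2}),
    frozenset({3, 4, 5}),
    frozenset({6}),
    frozenset({7, 8}),
]
concept = frozenset({1, 2, 3, 6, 7})

approximations = distinct_approximations(concept, blocks)
for alpha, approximation in approximations.items():
    print(alpha, sorted(approximation))
```

In this toy case there are three distinct approximations: alpha = 1 gives the ordinary lower approximation, the smallest positive threshold gives the ordinary upper approximation, and alpha = 1/2 gives the only approximation strictly between them.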
