The Definition and Estimation of Feature Salience in Databases

G. Richards, K. Brazier, and W. Wang (UK)

Keywords

Data mining, feature salience, decision tree.

Abstract

Feature salience estimation is the task of quantifying the relative importance of each feature, in the presence of the other features, in determining the classes of records in a dataset. It is an important task in knowledge discovery in databases because it shows users which features matter most in a dataset. For example, medical practitioners may wish to identify the salience of risk factors for a disease. Salience estimates can also serve as a basis for feature subset selection. In this paper we present a simple definition of feature salience together with a method for estimating it. The effectiveness and limits of the method were investigated by applying it to five synthetic datasets with known properties.
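To make the task concrete, the sketch below scores each feature by how much classification accuracy falls when that feature's column is shuffled. This is ordinary permutation importance, used here only as a stand-in: the abstract does not describe the paper's own definition or estimation method, so nothing below should be read as that method. The toy dataset (class = feature 0 XOR feature 1, with feature 2 irrelevant), the lookup-table classifier, and every function name are assumptions made for illustration; the dataset is not one of the five used in the paper.

```python
import random
from collections import Counter, defaultdict
from itertools import product


def fit_lookup(records, labels):
    """Stand-in classifier: majority class for each distinct feature combination."""
    votes = defaultdict(Counter)
    for rec, lab in zip(records, labels):
        votes[rec][lab] += 1
    return {rec: counts.most_common(1)[0][0] for rec, counts in votes.items()}


def accuracy(model, records, labels):
    """Fraction of records whose predicted class matches the true class."""
    hits = sum(model.get(rec) == lab for rec, lab in zip(records, labels))
    return hits / len(labels)


def permutation_salience(model, records, labels, rng):
    """Accuracy drop per feature when that feature's column alone is shuffled."""
    base = accuracy(model, records, labels)
    scores = []
    for j in range(len(records[0])):
        column = [rec[j] for rec in records]
        rng.shuffle(column)
        permuted = [rec[:j] + (v,) + rec[j + 1:] for rec, v in zip(records, column)]
        scores.append(base - accuracy(model, permuted, labels))
    return scores


# Hypothetical synthetic dataset with known properties:
# three binary features, class = f0 XOR f1, f2 plays no role.
records = list(product((0, 1), repeat=3)) * 25
labels = [f0 ^ f1 for f0, f1, _ in records]

model = fit_lookup(records, labels)
salience = permutation_salience(model, records, labels, random.Random(0))
print(salience)
```

In this toy case neither feature 0 nor feature 1 predicts the class on its own, yet shuffling either one sharply reduces accuracy, while shuffling feature 2 changes nothing. That is the sense of measuring importance "in the presence of other features".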
