A Stability Index for Feature Selection

L.I. Kuncheva (UK)

Keywords

Pattern recognition, Feature selection, Stability index, Sequential Forward Selection (SFS)

Abstract

Sequential forward selection (SFS) is one of the most widely used feature selection procedures. It starts with an empty set and adds one feature at each step. The estimate of the quality of the candidate subsets usually depends on the training/testing split of the data. Therefore, repeated runs of SFS may return different sequences of features. A substantial discrepancy between such sequences signals a problem with the selection. A stability index is proposed here, based on the cardinality of the intersection of the selected subsets and a correction for chance. The experimental results with 10 real data sets indicate that the index can be useful for selecting the final feature subset. If stability is high, then we should return a subset of features based on their total rank across the SFS runs. If stability is low, then it is better to return the feature subset which gave the minimum error across all SFS runs.
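The abstract does not state the formula for the index, so the following is only a minimal sketch of how an intersection-based, chance-corrected index could be computed. It assumes the form (rn - k^2) / (k(n - k)) for two subsets of equal size k drawn from n features with r features in common, and it assumes that the stability of a set of SFS runs is the average of this quantity over all pairs of runs. The function names `consistency_index` and `stability_index` are illustrative, not taken from the paper.

```python
from itertools import combinations


def consistency_index(a, b, n):
    """Chance-corrected overlap of two equal-size feature subsets.

    Assumed form: (r*n - k^2) / (k*(n - k)), where r = |a & b|,
    k = |a| = |b| and n is the total number of features.
    """
    a, b = set(a), set(b)
    k = len(a)
    if len(b) != k:
        raise ValueError("subsets must have the same size")
    if k == 0 or k == n:
        raise ValueError("index is undefined for k = 0 or k = n")
    r = len(a & b)  # cardinality of the intersection
    return (r * n - k * k) / (k * (n - k))


def stability_index(sequences, k, n):
    """Average pairwise consistency of the first k features of each run.

    `sequences` holds one feature ordering per SFS run (assumed input format).
    """
    if len(sequences) < 2:
        raise ValueError("need at least two SFS runs")
    subsets = [seq[:k] for seq in sequences]
    pairs = list(combinations(subsets, 2))
    return sum(consistency_index(a, b, n) for a, b in pairs) / len(pairs)
```

Under these assumptions, identical subsets score 1, and subsets that overlap no more than chance would predict score about 0. For example, `stability_index([[0, 1, 2, 3], [0, 1, 2, 4], [0, 1, 5, 2]], k=3, n=10)` averages the three pairwise values for the first three features of each run.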
