Feature Selection through Feature Clustering for Microarray Gene Expression Data

Choudhury M.M. Wahid, A.B.M. Shawkat Ali, and Kevin S. Tickle


Keywords: data mining, feature selection, gene expression data, k-means clustering


A small subset of the features in a large data set is often sufficient to improve classifier performance for the end user. In this paper we present a novel feature selection approach for high-dimensional gene expression data, based on clustering the features with the well-known k-means algorithm. We apply this cluster-based feature selection approach to the classification of microarray gene expression data, focusing on several cancer patient identification problems. We use box-and-whisker plots to report experimental performance in terms of accuracy and computational time. The experimental results show that the algorithm is well suited to the microarray gene expression domain.
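The abstract does not spell out the algorithm, so the following is only a minimal sketch of what feature selection through k-means feature clustering could look like, not the authors' implementation. It makes three assumptions that do not come from the paper: each gene is treated as a point whose coordinates are its expression values across samples, centroids are initialized with a farthest-first heuristic so the run is deterministic, and the gene closest to each centroid is kept as that cluster's representative. The names `sq_dist` and `select_features_by_clustering` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_features_by_clustering(samples, k, n_iter=50):
    """Cluster the features (columns) with k-means and keep one per cluster.

    samples: list of rows, one per sample, each a list of feature values.
    Returns the sorted column indices of the selected features.
    Assumption (not from the paper): the representative of a cluster is
    the feature nearest to the cluster centroid.
    """
    n_features = len(samples[0])
    k = min(k, n_features)
    # Transpose so that each feature becomes one point to be clustered.
    features = [[row[j] for row in samples] for j in range(n_features)]

    # Farthest-first initialization (an assumed, deterministic choice):
    # start from feature 0, then repeatedly add the feature that is
    # farthest from its nearest already-chosen centroid.
    centroids = [list(features[0])]
    while len(centroids) < k:
        far = max(range(n_features),
                  key=lambda j: min(sq_dist(features[j], c) for c in centroids))
        centroids.append(list(features[far]))

    labels = None
    for _ in range(n_iter):
        # Assignment step: each feature joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(f, centroids[c]))
                      for f in features]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [features[j] for j in range(n_features) if labels[j] == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    # Keep the feature closest to each centroid as the representative.
    selected = []
    for c in range(k):
        members = [j for j in range(n_features) if labels[j] == c]
        if members:
            selected.append(min(members,
                                key=lambda j: sq_dist(features[j], centroids[c])))
    return sorted(selected)


# Toy data: 4 samples x 4 genes; genes 0-1 and genes 2-3 behave alike.
toy = [
    [0.0, 0.1, 10.0, 10.1],
    [0.1, 0.0, 10.1, 10.0],
    [0.0, 0.1, 10.0, 10.1],
    [0.1, 0.0, 10.1, 10.0],
]
chosen = select_features_by_clustering(toy, k=2)
```

On the toy data, `chosen` holds one gene index from each of the two correlated groups. A classifier would then be trained only on those columns. The number of clusters `k` is left as a free parameter here.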
