Annotating Topic Hierarchy based on the Feature Selection Technique

S.S. Tan and T.E. Kong (Malaysia)

Keywords

Natural Language Processing, Feature Subset Selection, Hierarchical Classification, Machine Learning on Text Data

Abstract

This paper proposes an approach that uses a feature selection technique to annotate a topic hierarchy. The fundamental idea is to select a set of keywords from the documents in a topic hierarchy to enrich the hierarchy’s concepts. The approach is developed and tested on an existing web hierarchy, the Yahoo! Shopping hierarchy. Our experiments use a feature selection algorithm that combines feature selection methods from machine learning and text learning to select a set of keywords for each node in the topic hierarchy. We believe that this set of keywords provides more information about the node’s concept. Experimental evaluation on real-world data collected from the web shows that our approach gives promising results and can potentially be used to annotate a web hierarchy.
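
As an illustration of the general idea of selecting keywords for each node, the following is a minimal sketch and not the algorithm used in the paper, whose details the abstract does not give. It assumes a simplified flat mapping from node name to that node's documents, whitespace tokenization, and a smoothed log odds ratio of document frequency as the scoring measure. The function names `select_keywords` and `annotate` and the toy data are hypothetical.

```python
import math
from collections import Counter


def select_keywords(node_docs, other_docs, k=3):
    """Return the k words most characteristic of node_docs.

    Assumption: words are scored by a smoothed log odds ratio of
    document frequency (node documents versus all other documents).
    This is one common text feature selection measure, chosen here
    for illustration only.
    """
    # Document frequency: count each word at most once per document.
    pos = Counter(w for d in node_docs for w in set(d.lower().split()))
    neg = Counter(w for d in other_docs for w in set(d.lower().split()))
    n_pos, n_neg = len(node_docs), len(other_docs)

    scores = {}
    for w in pos:
        # Smoothing keeps both probabilities strictly between 0 and 1.
        p = (pos[w] + 0.5) / (n_pos + 1.0)
        q = (neg[w] + 0.5) / (n_neg + 1.0)
        scores[w] = math.log(p * (1 - q) / ((1 - p) * q))

    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


def annotate(hierarchy, k=3):
    """Attach a keyword list to every node of a {node: [documents]} mapping."""
    annotations = {}
    for node, docs in hierarchy.items():
        others = [d for n, ds in hierarchy.items() if n != node for d in ds]
        annotations[node] = select_keywords(docs, others, k)
    return annotations


# Hypothetical toy data standing in for documents under two shopping nodes.
hierarchy = {
    "Electronics": [
        "digital camera with zoom lens",
        "camera battery charger",
        "laptop battery pack",
    ],
    "Apparel": [
        "cotton shirt with pocket",
        "leather shoes",
        "cotton pack socks",
    ],
}

if __name__ == "__main__":
    for node, keywords in annotate(hierarchy, k=2).items():
        print(node, keywords)
```

In a real hierarchy, a node's document set would probably also include documents from its descendants, and the scoring measure would be replaced by whichever combination of methods is being evaluated.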
