Discovering Taxonomic Relationships from Textual Documents

H.-J. Kim and S.-G. Lee (Korea)

Keywords

information systems, topic hierarchy, term associations, subsumption relations

Abstract

This paper proposes a novel approach to automatically discovering the hierarchical topic structure of a large document set without any linguistic analysis. The method is based on term subsumption relationships derived from term co-occurrence, and attempts to discover hierarchical topic structures that are expressed by fuzzy subsumption relations. Despite the method's simplicity, experiments on well-known document collections such as Yahoo! directory data demonstrate the high quality of the resulting hierarchies.
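
The idea of deriving subsumption from co-occurrence can be illustrated with a minimal sketch. It is not the authors' fuzzy formulation, which the abstract does not specify. It assumes document-level co-occurrence, a crisp conditional-probability test, and an illustrative threshold of 0.8; the function name subsumption_pairs and its parameters are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def subsumption_pairs(docs, threshold=0.8):
    """Return sorted (parent, child, P(parent | child)) triples.

    Assumption: term `parent` subsumes term `child` when the parent
    appears in at least `threshold` of the documents containing the
    child, while the reverse conditional probability is lower.
    `docs` is an iterable of token lists; no linguistic analysis is done.
    """
    doc_freq = defaultdict(int)  # number of documents containing a term
    co_freq = defaultdict(int)   # number of documents containing both terms

    for doc in docs:
        terms = set(doc)
        for term in terms:
            doc_freq[term] += 1
        # Sorting gives each unordered pair a single canonical key.
        for a, b in combinations(sorted(terms), 2):
            co_freq[(a, b)] += 1

    pairs = []
    for (a, b), n_ab in co_freq.items():
        p_a_given_b = n_ab / doc_freq[b]
        p_b_given_a = n_ab / doc_freq[a]
        if p_a_given_b >= threshold and p_b_given_a < p_a_given_b:
            pairs.append((a, b, p_a_given_b))  # a is the broader term
        elif p_b_given_a >= threshold and p_a_given_b < p_b_given_a:
            pairs.append((b, a, p_b_given_a))  # b is the broader term
    return sorted(pairs)


# Toy collection (made up for illustration, not the paper's data).
docs = [
    ["sports", "soccer"],
    ["sports", "tennis"],
    ["sports", "soccer"],
    ["sports"],
]
result = subsumption_pairs(docs)
```

On this toy collection, "sports" occurs in every document that mentions "soccer" or "tennis" but not the other way round, so it is placed above both. A fuzzy variant would keep the conditional probability as a degree of subsumption instead of cutting it at a fixed threshold.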
