Overlapping Clustering Methods for a Japanese Meta Search Engine

M. Ohta, H. Narita, K. Katayama, and H. Ishikawa (Japan)


Web Document Clustering, Overlapping Clustering, Meta Search Engine


We present overlapping clustering methods for a Japanese meta search engine as an alternative to the ranked list that most search engines use to present retrieval results. Whereas ranked retrieval results help users locate a specific piece of information, clustered retrieval results with appropriate cluster labels help them grasp an overview of a general topic or find only the information related to the query that interests them. The proposed methods cluster the retrieval results dynamically in two steps: (1) cluster labels consisting of the most important feature terms extracted from the retrieval results are generated; then (2) each result is classified into one or more (i.e., overlapping) generated clusters based on its relevance to each cluster's feature term. To measure the quality of the generated clusters, we also propose measures of retrieval effectiveness extended specifically for clustered retrieval results, which consider relevance to the cluster labels as well as to the original query. Using the proposed measures, we compared the proposed methods with our previously proposed exclusive clustering method and with Lycos Japan, which provides a clustering function.
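
The two-step procedure described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how feature terms are scored or how relevance is measured, so the sketch assumes that term importance is plain document frequency and that a result is relevant to a label when it contains that term. The function name `cluster_results`, its parameters, and the whitespace tokenization are all hypothetical; real Japanese text would need a morphological analyzer to split it into terms.

```python
from collections import Counter


def cluster_results(snippets, num_labels=3, stopwords=frozenset()):
    """Hypothetical sketch of label-first, overlapping clustering."""
    # Assumption: whitespace tokenization; one set of terms per result.
    term_sets = [
        {t for t in s.lower().split() if t not in stopwords}
        for s in snippets
    ]

    # Step 1: generate cluster labels from the most important terms.
    # Assumption: importance = number of results containing the term.
    doc_freq = Counter(t for terms in term_sets for t in terms)
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    labels = [term for term, _ in ranked[:num_labels]]

    # Step 2: put each result into every cluster whose label it contains,
    # so one result may belong to several clusters (overlap).
    clusters = {label: [] for label in labels}
    for idx, terms in enumerate(term_sets):
        for label in labels:
            if label in terms:
                clusters[label].append(idx)
    return clusters


if __name__ == "__main__":
    results = [
        "apple pie recipe",
        "apple phone review",
        "phone review battery",
    ]
    print(cluster_results(results, num_labels=2))
```

In this toy input the second result contains both selected labels, so it lands in two clusters, which is the overlapping behavior that distinguishes the approach from exclusive clustering.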
