Rules Revisited: Web Page Classification

A. Katsaris and I. Karali (Greece)


web page classification/categorization, heuristics, derivation rules


The importance of web page classification grows with the continuous increase of information available on the Internet. Web page classification serves two purposes: it filters the enormous search space of the Web by considering only relevant pages when attempting to locate a specific kind of information, and it provides some semantic information when high-precision results are sought. To classify a web page, its structure should be considered together with its text content. In this paper, we present our approach, which addresses the problem using derivation rules and heuristics together with analysis of the web page structure at a high semantic level. This approach was implemented in the ExpertCat system.
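As a rough illustration of the general idea of combining page structure with text content in a rule-based classifier, the following minimal Python sketch may help. It is not the ExpertCat implementation: the class PageFeatures, the function classify, the categories, the weights, and every rule in RULES are hypothetical and chosen only for the example. The sketch extracts a few structural features (title words, heading words, link count) and then scores categories with weighted rules.

```python
import re
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3"}


class PageFeatures(HTMLParser):
    """Collects simple structural features of a page (hypothetical feature set)."""

    def __init__(self):
        super().__init__()
        self.title_words = set()
        self.heading_words = set()
        self.body_words = set()
        self.link_count = 0
        self._open = []  # currently open tags among <title> and headings

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_count += 1
        if tag == "title" or tag in HEADING_TAGS:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag in self._open:
            self._open.remove(tag)

    def handle_data(self, data):
        words = re.findall(r"[a-z0-9]+", data.lower())
        if "title" in self._open:
            self.title_words.update(words)
            return
        if any(t in HEADING_TAGS for t in self._open):
            self.heading_words.update(words)
        # Everything outside <title> counts as body text, headings included.
        self.body_words.update(words)


# Each rule: (category, weight, condition over the extracted features).
# All rules and weights below are invented for illustration.
RULES = [
    ("homepage", 2.0, lambda f: {"home", "page"} <= f.title_words),
    ("homepage", 1.0, lambda f: "cv" in f.body_words),
    ("course", 2.0, lambda f: "syllabus" in f.heading_words),
    ("course", 1.0, lambda f: "exam" in f.body_words),
    ("link-index", 1.5, lambda f: f.link_count > 20),  # structural heuristic
]


def classify(html):
    """Return the category with the highest total rule weight, or 'unknown'."""
    features = PageFeatures()
    features.feed(html)
    features.close()
    scores = {}
    for category, weight, condition in RULES:
        if condition(features):
            scores[category] = scores.get(category, 0.0) + weight
    if not scores:
        return "unknown"
    return max(scores, key=scores.get)


page = (
    "<html><head><title>CS101 Home Page</title></head>"
    "<body><h1>Syllabus</h1><p>Final exam in June.</p>"
    "<a href='notes.html'>notes</a></body></html>"
)
print(classify(page))
```

In this sketch a word in a heading counts for more than the same word in running text, which is one simple way structure can inform the decision; a real system would presumably use a much richer rule base and feature set.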
