Classification of Websites as Sets of Feature Vectors

H.-P. Kriegel and M. Schubert (Germany)

Keywords

web mining, website classification, sets of feature vectors.

Abstract

The World Wide Web is the largest source of all kinds of information currently available. Due to its enormous size, retrieving relevant information is a difficult task, for which users often rely on directory services. A directory service provides a huge topic tree containing links for each topic. Because the topics are general, most links point to websites or domains rather than to single web pages. For maintaining a directory service, automatic classification of new websites into the topics of the tree would be very beneficial. Therefore, this paper introduces a new approach to website classification that is based on sets of feature vectors. Compared to previous approaches, our new method requires no preprocessing, yet provides high accuracy while remaining efficient.
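To make the idea of treating a website as a set of feature vectors concrete, the following is a minimal illustrative sketch and not the method of the paper, whose details are not given in this abstract. It assumes term-frequency vectors per page, cosine distance between pages, a symmetric average of nearest-neighbour distances between two sets, and a 1-nearest-neighbour classifier. All function names and the toy data are hypothetical.

```python
from collections import Counter
import math


def page_vector(text):
    """Term-frequency vector for one page (assumed feature choice)."""
    return Counter(text.lower().split())


def make_site(pages):
    """A website is represented as a set (here: list) of page vectors."""
    return [page_vector(p) for p in pages]


def cosine_distance(u, v):
    """1 - cosine similarity; 1.0 if either vector is empty."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def set_distance(site_a, site_b):
    """Assumed set distance: symmetric mean of nearest-neighbour distances."""
    def directed(src, dst):
        return sum(min(cosine_distance(p, q) for q in dst) for p in src) / len(src)
    return 0.5 * (directed(site_a, site_b) + directed(site_b, site_a))


def classify(site, labeled_sites):
    """1-nearest-neighbour over (site, topic) pairs using set_distance."""
    return min(labeled_sites, key=lambda item: set_distance(site, item[0]))[1]


# Toy training data (hypothetical topics and page texts).
training = [
    (make_site(["football scores league", "team player transfer"]), "sports"),
    (make_site(["stock market shares", "bank interest rates"]), "finance"),
]
new_site = make_site(["league table football", "player injury team news"])
print(classify(new_site, training))
```

In this sketch no single page has to summarize the whole site: each page contributes its own vector, and the set distance compares sites page by page.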
