Mining Tables and Lists on the Web for Desired Relations

Y. Wu (PRC) and H. Yokota (Japan)

Keywords

Web Data Mining, Information Recognition, Information Extraction, Relation Information

Abstract

The Web contains many kinds of tables and lists, and these carry a large amount of relation information, yet they are not easy to find with search engines. In this paper, we propose a novel method to recognize and extract, from tables and lists on the Web, instance data that substantiate a desired relation between entities. The method is based on semantic and formal characteristics. We define models to represent a desired relation and a "repeated structure", such as a table or a list on the Web, and introduce a set of functions that measure repeated structures to determine whether they contain a desired relation. We develop algorithms for machine training and for mining the Web. Finally, we present our experimental results.
