P2P based Information Retrieval on Research Support System Papits

T. Ozono and T. Shintani (Japan)


Distributed Information Retrieval, Thesaurus, P2P


We have developed a research activity support system for universities called Papits. The system can be used to find information and people by keyword and through a “Know-Who” search mechanism. Information sources in Papits are distributed, because the members' computers act as information sources. To find relevant information and people in such an environment, a method for finding appropriate information sources is necessary. Peer-to-peer (P2P) technologies make it possible to find information in massively distributed information sources. For example, Napster and Gnutella can be used to find music files over the Internet without centralized servers storing the music files. However, Napster and Gnutella provide only very simple music search functionality, which is not enough to find research information and people. We introduce a thesaurus as a characteristic of an information source. Our assumption is that the thesaurus of an information source can be used to characterize that source. The terms used in the documents of a source differ from source to source, and the meaning of a term depends on the situation in which it is used. These differences therefore characterize the source. In this paper, we propose an algorithm for evaluating the usefulness of a source for a query based on a thesaurus. We also show an experimental result, which indicates that our method is effective for selecting appropriate sources from distributed information sources.

