Improving the Retrieval Accuracy by Dynamically Adjusting Metadata for Document Databases

X. Chen and Y. Kiyoki (Japan)


Semantic information retrieval, document retrieval, vector space model


Information retrieval based on vectorizing the retrieval candidates is widely applied to databases in practice. We have previously proposed a method that vectorizes documents on the basis of document classification. In that method, documents are vectorized and mapped onto a retrieval space, and retrieval is performed by selecting a subspace of the retrieval space according to the given query. The documents on the subspace are then ranked by their mapped values to produce the retrieval result. In this paper, we propose a method that improves retrieval accuracy by dynamically adjusting the weighting values of document vectors according to the query. The weighting values are added only to the document vectors on the selected subspace, so the calculation time is shorter than when weighting values are added to the document vectors in the whole space. As a result, the ranking order of documents on the subspace changes dynamically with each query, and the document most correlated with the query is ranked at the top. Furthermore, we improve the subspace selection process by using the weighting values during selection. Our experimental results show that the efficiency of subspace selection is greatly improved by this method.
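The pipeline described above (select a subspace for the query, weight document vectors only on that subspace, then rank) can be illustrated with a minimal sketch. The abstract does not specify how the subspace is chosen or how the weighting values are computed, so everything concrete here is an assumption made for illustration: the function names `select_subspace` and `rank_on_subspace`, the rule that an axis is selected when its query component exceeds a threshold, and the use of the query components as per-axis multiplicative weights.

```python
def select_subspace(query, threshold=0.0):
    """Return the indices of the axes that form the subspace for this query.

    Assumption (not from the paper): an axis is selected when the
    query's component on that axis exceeds `threshold`.
    """
    return [i for i, q in enumerate(query) if q > threshold]


def rank_on_subspace(doc_vectors, query, threshold=0.0):
    """Rank documents using only the axes of the selected subspace.

    Assumption (not from the paper): the weighting value for each
    selected axis is the query's component on that axis, applied
    multiplicatively to the document's component.
    """
    axes = select_subspace(query, threshold)
    scores = {}
    for doc_id, vec in doc_vectors.items():
        # Weights touch only the selected axes; the remaining axes of the
        # retrieval space are never visited, which is where the saving in
        # calculation time comes from.
        scores[doc_id] = sum(query[i] * vec[i] for i in axes)
    # Highest score first: the most correlated document is ranked at the top.
    return sorted(scores, key=lambda d: scores[d], reverse=True)


# Hypothetical 4-axis retrieval space with three documents.
docs = {
    "d1": [0.9, 0.1, 0.0, 0.3],
    "d2": [0.2, 0.8, 0.5, 0.0],
    "d3": [0.4, 0.4, 0.9, 0.1],
}
query = [1.0, 0.0, 0.5, 0.0]

ranking = rank_on_subspace(docs, query)
print(ranking)
```

With this query only axes 0 and 2 are selected, so each document is scored from two components instead of four. A different query selects different axes and weights, which is what makes the ranking order change from query to query.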
