Measurement of Similarity using Field Association Terms

E.-S. Atlam, K. Morita, and J. Aoe (Japan)

Keywords

Information Retrieval, FA terms, FA-Sim, Recall, Precision.

Abstract

Information retrieval methods typically measure document similarity by considering all the information in texts, and they are relatively inefficient for processing large text collections in heterogeneous subject areas. This paper outlines a new text manipulation system, FA-Sim, that is useful for retrieving information from large heterogeneous texts and for recognizing content similarity in text excerpts. FA-Sim is based on flexible text matching procedures carried out in various contexts and at various field ranks. FA-Sim measures text similarity by using specific Field Association (FA) terms instead of comparing all text information. With FA-Sim, similarity between texts is computed faster and the measured similarity is higher than with two other analysis methods. As a result, Recall and Precision improved significantly, by 39% and 37%, over these two traditional methods.
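As a rough illustration of comparing texts on FA terms only, the following Python sketch scores two texts by the overlap of the FA terms they contain and computes Recall and Precision for a retrieved set. It is not the FA-Sim procedure itself: the Jaccard overlap, the single-word FA term set, and the names `fa_term_similarity` and `precision_recall` are all assumptions made for this example, and contexts and field ranks are not modeled.

```python
import re


def fa_terms_in(text, fa_terms):
    """Return the FA terms that occur as whole words in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return tokens & fa_terms


def fa_term_similarity(text_a, text_b, fa_terms):
    """Jaccard overlap of the FA terms found in each text (an assumed measure)."""
    found_a = fa_terms_in(text_a, fa_terms)
    found_b = fa_terms_in(text_b, fa_terms)
    union = found_a | found_b
    if not union:
        return 0.0  # neither text contains any FA term
    return len(found_a & found_b) / len(union)


def precision_recall(retrieved, relevant):
    """Standard Precision and Recall for sets of document identifiers."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical FA terms for two fields (sports and politics).
FA_TERMS = {"pitcher", "inning", "election", "senate"}

doc_a = "The pitcher threw a perfect inning."
doc_b = "A relief pitcher closed the final inning."
doc_c = "The senate scheduled an election."

same_field = fa_term_similarity(doc_a, doc_b, FA_TERMS)
different_field = fa_term_similarity(doc_a, doc_c, FA_TERMS)
```

Because only the FA terms are compared, the two sports texts score as fully similar even though most of their other words differ, while the sports and politics texts share no FA terms and score zero.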
