Automatic Measuring of Semantic Distances between Word Senses in a Spanish Explanatory Dictionary

A. Gelbukh, G. Sidorov, and L. Chanona-Hernandez (Mexico)


Computational linguistics, word senses, distance in explanatory dictionary, synonyms.


What a semantic distance is and how it should be measured is an interesting problem that has not been investigated very well. Usually, the distance is measured between words; we propose instead to measure the distances between different senses of the same word. One purpose of this measurement is to evaluate how plausible it is to apply word sense disambiguation (WSD) techniques in information retrieval: if word senses are too close (too similar), then, on the one hand, users will be unable to distinguish between them for their information needs, and, on the other hand, WSD methods will not be reliable. Another purpose is to estimate the quality of a dictionary: if it contains many close (similar) senses, the dictionary should be revised. In our experiments, we used the Anaya dictionary of the Spanish language. Dictionary definitions were lemmatized. To measure the distance, we calculated the literal matching between the definitions of two senses and the matching using synonyms, which were taken from a Spanish dictionary of synonyms. The results show that about 90% of senses are different (the distance is rather long), while about 10% are rather similar (the distance is short). Thus, in general, WSD techniques seem to be useful in information retrieval, but in the case of the Anaya dictionary, the roughly 10% of definitions that belong to similar senses should be revised.
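
The two-step matching described above (literal matching, then matching through synonyms) can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the normalization used here (one minus the share of matched lemmas relative to the longer definition) is an assumption, as are the function name `sense_distance`, the dictionary-of-sets format for synonyms, and the sample definitions, which are invented for illustration and are not taken from the Anaya dictionary.

```python
def sense_distance(def_a, def_b, synonyms=None):
    """Return a distance in [0, 1] between two lemmatized sense definitions.

    def_a, def_b: lists of lemmas (already lemmatized, as in the abstract).
    synonyms: hypothetical mapping lemma -> set of synonymous lemmas.
    The normalization is an assumption, not the authors' formula.
    """
    synonyms = synonyms or {}
    remaining = list(def_b)   # lemmas of def_b not yet matched
    unmatched_a = []          # lemmas of def_a with no literal match
    matches = 0

    # Step 1: literal matching of identical lemmas.
    for lemma in def_a:
        if lemma in remaining:
            remaining.remove(lemma)
            matches += 1
        else:
            unmatched_a.append(lemma)

    # Step 2: matching through synonyms for the lemmas left over.
    for lemma in unmatched_a:
        own_synonyms = synonyms.get(lemma, set())
        for other in remaining:
            if other in own_synonyms or lemma in synonyms.get(other, set()):
                remaining.remove(other)
                matches += 1
                break

    longest = max(len(def_a), len(def_b))
    if longest == 0:
        return 0.0  # two empty definitions are treated as identical
    return 1.0 - matches / longest


# Invented example: two senses whose definitions share two lemmas
# literally and one more through a synonym pair.
sense_1 = ["instrumento", "para", "medir", "tiempo"]
sense_2 = ["aparato", "para", "medir", "hora"]
example_synonyms = {"instrumento": {"aparato"}}

literal_only = sense_distance(sense_1, sense_2)
with_synonyms = sense_distance(sense_1, sense_2, example_synonyms)
```

Under these assumptions, a short distance flags a pair of senses as candidates for revision, while a distance near 1 indicates clearly distinct definitions. Dividing by the length of the longer definition keeps the value in [0, 1]; other normalizations would be equally defensible.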

