A. Gelbukh, G. Sidorov, and L. Chanona-Hernandez (Mexico)
Computational linguistics, word senses, distance in explanatory dictionary, synonyms.
The question of what semantic distance is and how it
should be measured is interesting and has not been well
investigated. Usually, distance is measured between
words. We propose to measure the distances between
different senses of the same word. One purpose of this
measurement is to evaluate how plausible it is to apply
word sense disambiguation (WSD) techniques in
information retrieval. Namely, if word senses are too
close (too similar), then, on the one hand, users will be
unable to distinguish between them with respect to their
information need, and, on the other hand, WSD methods
will not be reliable.
Another purpose is to estimate the quality of a
dictionary: if it contains many close (similar) senses,
then the dictionary should be revised. In our experiments,
we used the Anaya dictionary of the Spanish language.
The dictionary definitions were lemmatized. To measure
the distance, we calculated the literal matching between
the definitions of two senses and the matching through
synonyms. The synonyms were taken from a Spanish
dictionary of synonyms. The
results show that about 90% of senses are different (the
distance is rather large), while about 10% are rather
similar (the distance is small). Thus, in general, WSD
techniques seem to be useful in information retrieval, but
in the case of the Anaya dictionary about 10% of the
definitions, those of similar senses, should be revised.
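
The measurement described above (literal matching between two lemmatized definitions, followed by matching through synonyms) could be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the exact formula, so the normalization by the shorter definition, the greedy one-to-one synonym pairing, the function name `sense_distance`, and the toy Spanish data are all hypothetical and are not taken from the Anaya dictionary or from the authors' implementation.

```python
def sense_distance(def_a, def_b, synonyms):
    """Distance in [0, 1] between two lemmatized sense definitions.

    def_a, def_b: lists of lemmas (assumed to be lemmatized already).
    synonyms: dict mapping a lemma to a set of its synonyms (assumed format).
    """
    a, b = set(def_a), set(def_b)
    if not a or not b:
        return 1.0  # assumption: an empty definition is maximally distant

    # Step 1: literal matching, i.e. lemmas shared by both definitions.
    literal = a & b

    # Step 2: synonym matching on the remaining lemmas.
    # Each lemma is paired at most once (greedy pairing, an assumption).
    rest_a = a - literal
    unused_b = b - literal
    synonym_matches = 0
    for w in sorted(rest_a):
        for v in sorted(unused_b):
            if v in synonyms.get(w, ()) or w in synonyms.get(v, ()):
                synonym_matches += 1
                unused_b.remove(v)
                break

    # Assumed normalization: share of the shorter definition that matched.
    similarity = (len(literal) + synonym_matches) / min(len(a), len(b))
    return 1.0 - similarity


# Toy data, invented for illustration only.
toy_synonyms = {"asiento": {"silla"}, "persona": {"gente"}}
sense_1 = ["asiento", "largo", "persona"]
sense_2 = ["silla", "largo", "gente"]
sense_3 = ["entidad", "financiero", "dinero"]

print(sense_distance(sense_1, sense_2, toy_synonyms))  # close senses
print(sense_distance(sense_1, sense_3, toy_synonyms))  # distant senses
```

Under this sketch, pairs of senses whose distance falls below some chosen threshold would be the candidates for dictionary revision; the threshold itself would be a further assumption.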