Analysis of Tree Edit Distance on XML Data

Y.-F. Wu, S.-F. Lin, and H.-C. Yen (Taiwan)


Tree edit distance, XML, algorithm, streaming, unlabeled ordered trees


The problem of comparing tree structures arises in many areas of computer science and engineering, including XML data processing. Tree edit distance is a widely used measure that quantifies the difference between two tree structures. Efficient tree edit distance embedding algorithms are therefore important for comparing large streaming XML document trees. In this paper, we propose a new algorithm to obtain the edit distance between unlabeled ordered trees derived from streaming XML data. Compared with previous work, our contribution is a simpler procedure for obtaining the tree edit distance that does not increase the time or space complexity. We also analyze the upper and lower bounds on the distortion of our algorithm, as well as its error probability.
