Robust Image based Document Comparison using Attributed Relational Graphs

K. Worm and B. Meffert (Germany)

Keywords

Pattern Recognition, Document Similarity, Attributed Relational Graphs, Text Features, Line Profile

Abstract

This paper presents a new approach to measuring similarity between documents using a compact image-based feature representation. Various applications, in particular document management systems, require the comparison of scanned documents for classification. The proposed method focuses on mail piece identification within the postal sorting process. In general, mail pieces resemble one another in structure and differ in their text regions. Concentrating on structural text region features and text line profiles exploits these differences. An attributed relational graph representation combines detailed local information with rough layout information of a document. The method is designed to comply with the strict requirements for postal sorting machines. In particular, the approach is invariant to document rotation and translation and to document surface modifications caused by mail piece handling and transportation. Efficient algorithms allow its use in a real-time environment. The quality and applicability of the approach for mail piece identification have been proven in various tests.
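
To illustrate how a graph representation can be invariant to rotation and translation, the following is a minimal sketch, not the authors' implementation. It assumes each text region is reduced to a centroid and a line count (the class `TextRegion` and the functions `build_arg`, `rigid_transform`, and `graphs_match` are hypothetical names chosen for this example), and it uses pairwise centroid distances as edge attributes because distances do not change under rigid motion. The abstract does not specify the actual node and edge attributes or the graph matching procedure; here the node order is simply assumed to be known.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRegion:
    # Hypothetical node attributes; the real feature set is not given in the abstract.
    cx: float      # centroid x coordinate
    cy: float      # centroid y coordinate
    n_lines: int   # number of text lines in the region (local information)


def build_arg(regions):
    """Build a toy attributed relational graph.

    Nodes carry a local attribute (line count); edges carry the distance
    between region centroids (rough layout information).
    """
    nodes = [r.n_lines for r in regions]
    edges = {}
    for i in range(len(regions)):
        for j in range(i + 1, len(regions)):
            a, b = regions[i], regions[j]
            edges[(i, j)] = math.hypot(a.cx - b.cx, a.cy - b.cy)
    return nodes, edges


def rigid_transform(regions, angle_rad, dx, dy):
    """Rotate all centroids about the origin, then translate them."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [
        TextRegion(c * r.cx - s * r.cy + dx, s * r.cx + c * r.cy + dy, r.n_lines)
        for r in regions
    ]


def graphs_match(g1, g2, tol=1e-6):
    """Compare two graphs attribute by attribute (node order assumed identical)."""
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    if nodes1 != nodes2 or edges1.keys() != edges2.keys():
        return False
    return all(abs(edges1[k] - edges2[k]) <= tol for k in edges1)


regions = [TextRegion(10, 20, 3), TextRegion(100, 40, 5), TextRegion(60, 150, 2)]
moved = rigid_transform(regions, math.radians(30), 15, -7)
same_layout = graphs_match(build_arg(regions), build_arg(moved))
```

In this sketch, rotating and shifting the whole document leaves every edge attribute unchanged, whereas moving a single text region alters the distances attached to it. A practical system would additionally need a matching step that tolerates missing or reordered regions.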
