Document Processing System for Javanese Manuscripts

A. Harjoko and A.R. Widiarti (Indonesia)

Keywords

Document Processing, Javanese Manuscripts, Pixel-level Processing

Abstract

A prototype of a document processing system for Javanese manuscripts was developed. The research was motivated by the need to preserve invaluable old Javanese manuscripts. The manuscripts, which are available in hardcopy only, are not in good condition. In addition, our system helps the younger generation understand old Javanese manuscripts. In our approach, a given text is first read using an optical device and fed into the system. Next, pixel-level processing is performed, comprising binarization, orientation normalization, filling, thinning, and segmentation. The third step is Javanese character recognition. The character recognition algorithm has two essential components: (i) feature extraction, which counts the number of object pixels in each unit of a character image; these features are stored in the Javanese character database, which is afterwards used to verify a given character; and (ii) classification, which uses a modified Euclidean distance to match the Javanese character with its corresponding Latin character. Our system has been tested with pages of two Javanese manuscripts: Menak Sorangan I and Panji Sekar. Experimental results showed that the success rate of our system in recognizing characters of Javanese manuscripts was 81.28%.
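
The two recognition components described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a "unit" is a cell of a regular grid laid over a binarized character image, and it uses the plain Euclidean distance because the abstract does not specify the modification. The function names, the grid size, the toy 4x4 glyphs, and the labels "A" and "B" are all hypothetical placeholders.

```python
import math


def zone_features(image, rows, cols):
    """Count foreground pixels (value 1) in each cell of a rows x cols grid.

    `image` is a binarized character image given as a list of equal-length
    lists of 0/1 values. The grid layout is an assumption of this sketch.
    """
    height, width = len(image), len(image[0])
    features = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            count = sum(
                image[y][x]
                for y in range(top, bottom)
                for x in range(left, right)
            )
            features.append(count)
    return features


def euclidean(a, b):
    """Plain Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def classify(features, database):
    """Return the label whose stored feature vector is nearest to `features`."""
    return min(database, key=lambda label: euclidean(features, database[label]))


# Toy binarized glyphs standing in for segmented characters (hypothetical data).
VERTICAL = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
HORIZONTAL = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

# The "character database": one stored feature vector per placeholder label.
database = {
    "A": zone_features(VERTICAL, 2, 2),
    "B": zone_features(HORIZONTAL, 2, 2),
}

# A vertical stroke with one extra noisy pixel in the bottom-right zone.
NOISY_VERTICAL = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]

predicted = classify(zone_features(NOISY_VERTICAL, 2, 2), database)
print(predicted)
```

In this sketch the noisy glyph differs from the stored vector for "A" in a single zone count, so the nearest-neighbour match still selects "A". A real system would store several samples per character and map each label to its Latin transliteration.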
