An Improvement over Template Matching using K-Means Algorithm for Printed Cursive Script Recognition

T.K. Khan, S.M. Azam, and S. Mohsin (Pakistan)

Keywords

OCR, cursive script recognition, confusion matrix, k-means algorithm, template matching, cluster prediction.

Abstract

Template matching is vital to pattern recognition applications, but its computational cost grows with the size of the template pool. We propose a novel two-stage algorithm for recognizing printed, segmented, isolated characters of any script, and in particular cursive scripts such as Arabic and Urdu. The algorithm uses k-means clustering to guide pattern recognition, considerably reducing the recognition time of a recognizer based on template matching. In the first stage, a confusion matrix of the template pool is generated through template matching; a cluster prediction algorithm is then applied to the confusion matrix, and three different sets of clusters are predicted using k-means clustering. Based on a cost evaluation criterion, the final number of clusters is selected and used to perform the clustering. The proposed approach automatically ensures the same recognition rate as template matching alone, because the cluster centers serve only to narrow the search space of template matching from the global pool of templates to a local one, and the actual recognition is still performed by template matching. The proposed technique is applied to the Urdu character set in the Naskh font style. The implementation results show a recognition speed-up of 8.45 times.
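
The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: templates are toy binary tuples, similarity is the fraction of matching pixels, k-means uses a deterministic farthest-point initialisation, each cluster is represented by the member template whose confusion-matrix row lies closest to the cluster center, and the number of clusters k is passed in directly rather than chosen by the cost evaluation criterion, which the abstract does not specify. All function names are hypothetical.

```python
def similarity(a, b):
    """Fraction of positions where two equal-length binary templates agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def sq_dist(u, v):
    """Squared Euclidean distance between two numeric vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def confusion_matrix(templates):
    """Match every template against every other template."""
    return [[similarity(a, b) for b in templates] for a in templates]

def kmeans(rows, k, iters=20):
    """Plain k-means on the confusion-matrix rows.
    Farthest-point initialisation is an assumption, chosen for repeatability."""
    centers = [list(rows[0])]
    while len(centers) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(far))
    labels = []
    for _ in range(iters):
        # Assign each row to its nearest center, then recompute the centers.
        labels = [min(range(k), key=lambda c: sq_dist(r, centers[c])) for r in rows]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def build_clusters(templates, k):
    """Cluster the template pool and pick one representative per cluster."""
    rows = confusion_matrix(templates)
    labels, centers = kmeans(rows, k)
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    # Assumption: the representative is the member nearest the cluster center.
    reps = {lab: min(members, key=lambda i: sq_dist(rows[i], centers[lab]))
            for lab, members in clusters.items()}
    return labels, clusters, reps

def recognize(sample, templates, clusters, reps):
    """Stage 1: pick a cluster via its representative.
    Stage 2: template matching restricted to that cluster."""
    lab = max(reps, key=lambda l: similarity(sample, templates[reps[l]]))
    return max(clusters[lab], key=lambda i: similarity(sample, templates[i]))

# Toy pool: three "left-heavy" and three "right-heavy" 8-pixel templates.
TEMPLATES = [
    (1, 1, 1, 1, 0, 0, 0, 0), (1, 1, 1, 0, 0, 0, 0, 0), (0, 1, 1, 1, 0, 0, 0, 0),
    (0, 0, 0, 0, 1, 1, 1, 1), (0, 0, 0, 0, 0, 1, 1, 1), (0, 0, 0, 0, 1, 1, 1, 0),
]
labels, clusters, reps = build_clusters(TEMPLATES, k=2)
noisy = (1, 1, 1, 1, 0, 0, 0, 1)  # template 0 with one flipped pixel
print(recognize(noisy, TEMPLATES, clusters, reps))
```

In this sketch a query is compared with one representative per cluster plus the members of a single cluster, rather than with the whole pool, which is where the reduction in matching cost would come from.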
