G.A. Abandah, K.S. Younis, and M.Z. Khedher (Jordan)
Optical character recognition, principal component analysis, classification techniques, Arabic script
Users are still waiting for accurate optical character recognition solutions for Arabic handwritten scripts. This research explores the best sets of feature extraction techniques and studies the accuracy of well-known classifiers on handwritten Arabic letters. Depending on their position in the word, Arabic letters are written in four forms: isolated, initial, medial, and final. Principal component analysis is used to select the best subset of features from a large number of extracted features. We used parametric and non-parametric classifiers and found that a subset of 25 features is needed to reach 84% recognition accuracy with a linear discriminant classifier; using more features does not substantially improve this accuracy. With fewer than 25 features, however, a quadratic discriminant classifier is more accurate than the linear classifier. Classifiers that are parameterized for each of the four forms achieve better accuracy than classifiers that do not make use of this information about the letter form.
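The pipeline described above (reduce the features with principal component analysis, then compare a linear and a quadratic discriminant classifier) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes NumPy, uses synthetic Gaussian data in place of real handwriting features, and every function name (`pca_fit`, `pca_transform`, `fit_gaussians`, `predict`, `demo`) is hypothetical. The linear classifier shares one pooled covariance across classes, while the quadratic one keeps a separate covariance per class.

```python
import numpy as np

def pca_fit(X, k):
    """Return the feature mean and the top-k principal directions (rows of X are samples)."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]         # indices of the k largest
    return mean, eigvecs[:, order]

def pca_transform(X, mean, components):
    """Project samples onto the retained principal directions."""
    return (X - mean) @ components

def fit_gaussians(Z, y):
    """Estimate per-class mean, covariance and prior, plus the pooled covariance."""
    classes = np.unique(y)
    dim = Z.shape[1]
    stats, pooled = {}, np.zeros((dim, dim))
    for c in classes:
        Zc = Z[y == c]
        cov = np.cov(Zc, rowvar=False) + 1e-6 * np.eye(dim)  # small ridge for stability
        stats[c] = (Zc.mean(axis=0), cov, len(Zc) / len(Z))
        pooled += (len(Zc) - 1) * cov
    pooled /= len(Z) - len(classes)
    return classes, stats, pooled

def predict(Z, classes, stats, pooled, quadratic):
    """Gaussian discriminant: per-class covariance if quadratic, pooled if linear."""
    scores = []
    for c in classes:
        mu, cov, prior = stats[c]
        S = cov if quadratic else pooled
        d = Z - mu
        maha = np.einsum("ij,ij->i", d @ np.linalg.inv(S), d)
        logdet = np.linalg.slogdet(S)[1]
        scores.append(-0.5 * maha - 0.5 * logdet + np.log(prior))
    return classes[np.argmax(np.array(scores), axis=0)]

def demo(k=2, seed=0):
    """Synthetic stand-in data: 3 classes, 10 features, reduced to k components."""
    rng = np.random.default_rng(seed)
    y = np.repeat(np.arange(3), 200)
    X = 4.0 * np.eye(3, 10)[y] + rng.normal(size=(len(y), 10))
    train = np.arange(len(y)) % 2 == 0            # even rows train, odd rows test
    mean, comps = pca_fit(X[train], k)
    Z_train = pca_transform(X[train], mean, comps)
    Z_test = pca_transform(X[~train], mean, comps)
    classes, stats, pooled = fit_gaussians(Z_train, y[train])
    acc = {}
    for name, quad in (("linear", False), ("quadratic", True)):
        pred = predict(Z_test, classes, stats, pooled, quad)
        acc[name] = float(np.mean(pred == y[~train]))
    return acc

if __name__ == "__main__":
    print(demo())
```

To mimic the form-specific classifiers mentioned in the abstract, one could, under the same assumptions, call `fit_gaussians` once per letter form (isolated, initial, medial, final) and route each sample to the model for its form. Sweeping `k` in `demo` gives a rough way to see how accuracy changes with the number of retained components on the synthetic data only; it says nothing about the 25-feature result reported here.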