Feature Extraction and Classification for Bilingual Script (Gurmukhi and Roman)

R. Dhir (India)


Keywords: bilingual script, segmentation, identification, projection profile.


The capability to recognize multilingual documents is both novel and useful. It supports many applications, including multilingual access to patent, business, and regulatory information, translation, and keyword spotting in document images. The main purpose of our research is to develop a methodology for a single OCR system that processes bilingual documents typed in both Gurmukhi (Punjabi) and Roman (English) scripts. The OCR system will automatically identify the script of each word in the document and invoke the appropriate recognition engine to recognize that word.
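The word-level flow described in the abstract (identify the script of a word, then hand it to the matching recognition engine) can be illustrated with a minimal sketch. This is not the author's method: the function names, the `headline_ratio` threshold, and the headline test itself are assumptions made for illustration. The test relies only on the fact that Gurmukhi words typically carry a horizontal headline along the top, which shows up as a near-full-width peak in the horizontal projection profile. Word images are assumed to be already segmented and binarized (1 = ink, 0 = background).

```python
from typing import Callable, Dict, List

Image = List[List[int]]  # binary word image: 1 = ink, 0 = background


def horizontal_projection(word: Image) -> List[int]:
    """Count the ink pixels in each row of the word image."""
    return [sum(row) for row in word]


def identify_script(word: Image, headline_ratio: float = 0.8) -> str:
    """Hypothetical heuristic (an assumption, not the paper's classifier):
    a row in the upper half whose ink count is close to the word width
    suggests a Gurmukhi headline; otherwise the word is treated as Roman."""
    profile = horizontal_projection(word)
    width = len(word[0])
    upper_half = profile[: max(1, len(profile) // 2)]
    return "gurmukhi" if max(upper_half) >= headline_ratio * width else "roman"


def recognize_word(word: Image,
                   engines: Dict[str, Callable[[Image], str]]) -> str:
    """Dispatch the word to the recognition engine registered for its script."""
    script = identify_script(word)
    return engines[script](word)
```

Here `engines` is a hypothetical mapping from a script name to a recognizer callable; a real system would plug its Gurmukhi and Roman recognition engines into that mapping and would likely use a richer feature set than a single projection-profile peak.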

