A Lip Reading System for Japanese Language based on the Hypercolumn Neural Networks

T. El. Tobely, N. Tsuruta, and M. Amamiya (Japan)


Lipreading, Hypercolumn neural network, image recognition, visual speech recognition, human-computer interaction


Each letter in the Japanese language ends with a vowel. The lipreading system proposed in this paper exploits this feature: the continuous sequence of input images is quantized into a discrete sequence of vowels, which is then used to recognize different spoken sentences. This quantization is performed by the Hypercolumn neural network model (HCM), which consists of hierarchical layers of Hierarchical Self-Organizing Map (HSOM) neural networks arranged like the cell planes of the Neocognitron (NC) neural network. HCM can recognize images in which objects vary in size, position, and spatial resolution. Results show that the system performs well in the on-line recognition of six different Japanese sentences.
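The vowel quantization idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that some per-frame classifier (a hypothetical stand-in for HCM) has already labelled each mouth image with one of the vowels a, i, u, e, o, or with "-" for a closed or unclear mouth. The example sentences are ordinary romanized greetings chosen for illustration, not the six sentences used in the paper, and the edit-distance matching step is an assumed technique that the abstract does not describe.

```python
from itertools import groupby

VOWELS = "aiueo"


def vowel_template(romaji):
    """Reduce a romanized sentence to its vowel sequence, merging repeats
    (a held vowel and a doubled vowel look alike on the lips)."""
    vowels = [ch for ch in romaji.lower() if ch in VOWELS]
    return "".join(label for label, _ in groupby(vowels))


def collapse_frames(frame_labels, min_run=2):
    """Quantize per-frame labels into a discrete vowel sequence.
    Runs shorter than min_run frames are treated as noise (assumed threshold)."""
    out = []
    for label, run in groupby(frame_labels):
        if label in VOWELS and len(list(run)) >= min_run:
            if not out or out[-1] != label:
                out.append(label)
    return "".join(out)


def edit_distance(a, b):
    """Standard Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def recognize(frame_labels, sentences):
    """Return the candidate sentence whose vowel template is closest
    to the observed vowel sequence."""
    observed = collapse_frames(frame_labels)
    return min(sentences, key=lambda s: edit_distance(observed, vowel_template(s)))


# Hypothetical per-frame labels: a one-frame "u" glitch and a closed-mouth frame.
frames = list("ooouoo-iiiiiaaa")
candidates = ["ohayou", "konnichiwa", "arigatou"]
print(collapse_frames(frames))        # oia
print(recognize(frames, candidates))  # konnichiwa
```

Merging repeated vowels in both the observed sequence and the templates keeps the comparison independent of speaking rate, at the cost of making sentences with identical vowel patterns indistinguishable.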
