P. Patel and K. Ouazzane (UK)
Lipreading, viseme classification, visual features.
Several researchers have demonstrated that a visual speech reading system is a beneficial complement to an audio speech recognition system, because it uses visual speech cues from the speaker's face in noisy environments. However, robust and accurate visual feature extraction and classification are difficult object recognition and classification problems, owing to high variation in pose and lighting and to the dynamic nature of visemes. In this paper, a novel variable-weights approach for classifying visemes is presented and compared with a fixed-weights classification approach. The first approach applies fixed significance factors (weights) to the various components of visemes, including mouth gestures, and assumes that every visual feature has the same significance factor for every phoneme. The second approach is based on the hypothesis that the significance of a visual feature varies from phoneme to phoneme. The efficiency of the variable-weights approach is evaluated by comparing its results with those of the fixed-weights algorithm. The recognition results indicate that the variable-weights approach performs better than the fixed-weights approach. The results presented demonstrate a highly accurate viseme classification approach with an average alphabet detection rate of about 36.9%. Furthermore, on average around 53% of alphabets were accurately detected using the viseme classifier described in this study.
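The difference between the two weighting schemes can be illustrated with a minimal sketch. The abstract does not specify the classifier, the features, or the weight values, so everything below is an assumption made for illustration: a nearest-template classifier with a weighted Euclidean distance, three hypothetical mouth features, and invented templates and weights. It is not the authors' implementation.

```python
# Illustrative sketch only: the features, templates, and weights are hypothetical.
from math import sqrt

# Hypothetical templates over three normalized mouth features:
# (mouth width, mouth height, lip protrusion).
TEMPLATES = {
    "A": (0.6, 0.9, 0.2),
    "O": (0.3, 0.6, 0.9),
    "M": (0.5, 0.05, 0.3),
}

# Fixed-weights scheme: every feature has the same significance for every class.
FIXED_WEIGHTS = (1.0, 1.0, 1.0)

# Variable-weights scheme: each class has its own significance per feature.
# Each tuple sums to 3.0 so that no class is favored by its total weight alone.
VARIABLE_WEIGHTS = {
    "A": (0.5, 2.0, 0.5),
    "O": (0.5, 0.5, 2.0),
    "M": (0.5, 2.0, 0.5),
}


def weighted_distance(x, template, weights):
    """Weighted Euclidean distance between a feature vector and a template."""
    return sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, template)))


def classify_fixed(x):
    """Pick the nearest template using the same weights for all classes."""
    return min(
        TEMPLATES,
        key=lambda c: weighted_distance(x, TEMPLATES[c], FIXED_WEIGHTS),
    )


def classify_variable(x):
    """Pick the nearest template using the weights of each candidate class."""
    return min(
        TEMPLATES,
        key=lambda c: weighted_distance(x, TEMPLATES[c], VARIABLE_WEIGHTS[c]),
    )
```

In this sketch the only difference between the two classifiers is which weight vector enters the distance: one shared vector in the fixed case, and the candidate class's own vector in the variable case.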