Face and Lip Localization in Unconstrained Imagery

B. Crow and J.X. Zhang (USA)


Keywords: automatic visual speech recognition, face tracking, lip tracking, target localization, mean-shift, Bhattacharyya coefficient


When combined with acoustic speech information, visual speech information (lip movement) significantly improves Automatic Speech Recognition (ASR) in acoustically noisy environments. Previous research has demonstrated that the visual modality is a viable tool for identifying speech. However, visual information has yet to be utilized in mainstream ASR systems because of the difficulty of accurately tracking lips under real-world conditions. This paper presents our current progress toward addressing this issue. We derive several algorithms based on a modified HSI color space to locate the face, eyes, and lips, and then test these algorithms on imagery collected in visually challenging environments.
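The keywords list mean-shift target localization with the Bhattacharyya coefficient. The paper itself gives no code, but the similarity measure at the heart of such trackers can be sketched as follows; this is a minimal illustration of the standard coefficient between color histograms, not the authors' implementation:

```python
import numpy as np

def bhattacharyya_coefficient(p, q):
    """Similarity between two color histograms p and q.

    Returns a value in [0, 1]; 1 means the distributions are identical.
    Mean-shift trackers maximize this coefficient between a target
    model histogram and candidate-region histograms.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Normalize so each histogram sums to 1 (guards against raw bin counts).
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(np.sqrt(p * q)))

# Example: a target model histogram vs. a close candidate histogram.
target = [0.2, 0.5, 0.3]
candidate = [0.25, 0.45, 0.3]
print(bhattacharyya_coefficient(target, candidate))  # close to 1.0
```

In a tracker, the candidate histogram is recomputed at each shifted window position, and the window moves toward the location that maximizes this coefficient.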
