WFST-based Large Vocabulary Continuous Speech Decoder for Service Robots

Abdelaziz A. Abdelhamid, Waleed H. Abdulla, and Bruce A. MacDonald


Speech decoder, weighted finite state transducer, human-robot interaction


Current robotic speech recognition systems are restricted to limited domains and mostly support only a small set of allowable commands that make the robot perform some actions. It is now time to take on the challenge of building robotic dialogue systems based on continuous speech. A speech-based robotic interaction system consists of three main components: a speech decoder, dialogue management, and speech synthesis. This paper discusses the design of the speech decoder for our Healthcare robot. The decoder uses a pre-compiled static recognition network built as a weighted finite state transducer (WFST). The current vocabulary contains 64k words with more than 69k entries. Initial results show a word recognition accuracy of 86.52% when tested on speakers with different dialects.

