A CELP Coder using MFCC for Server-based Speech Recognition in Mobile

G.H. Lee, J.S. Yoon, and H.K. Kim (Korea)


CELP speech coder, MFCC, Predictive VQ, Safety-net VQ, Speech recognition


Existing standard speech coders provide high-quality speech communication, but they degrade the performance of speech recognition systems that operate on the speech reconstructed by the coders. The main cause of the degradation is that the spectral envelope parameters used in speech coding are optimized for speech reconstruction rather than for speech recognition. For example, mel-frequency cepstral coefficients (MFCCs) are generally known to give better speech recognition performance than linear prediction coefficients (LPCs), the typical parameter set in speech coding. In this paper, we propose a speech coder that uses MFCCs instead of LPCs to improve speech recognition performance on mobile devices. The main difficulty in using MFCCs is designing an efficient MFCC quantizer at a low bit rate. First, we explore the interframe correlation of MFCCs, which leads to predictive quantization of the MFCCs. Second, we develop a safety-net scheme to make the proposed speech coder robust to channel errors. As a result, we propose an 8.7 kbps CELP coder, and a PESQ test shows that its speech quality is comparable to that of 8 kbps G.729.
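The combination of predictive quantization and a safety net can be illustrated with a minimal sketch. It is not the coder proposed in the paper: the abstract does not give the predictor, the codebook sizes, or the mode-selection rule, so the first-order predictor coefficient `alpha`, the two codebooks `pred_cb` and `safe_cb`, and the minimum-squared-error mode choice below are all assumptions made for illustration. Each MFCC vector is quantized twice, once as a prediction residual and once directly, and the better result is kept together with a one-bit mode flag.

```python
import numpy as np

def nearest(codebook, v):
    """Return (index, codeword) of the codebook row closest to v."""
    idx = int(np.argmin(np.sum((codebook - v) ** 2, axis=1)))
    return idx, codebook[idx]

def encode_frame(x, prev_hat, pred_cb, safe_cb, alpha=0.5):
    """Quantize one MFCC vector x with safety-net predictive VQ.

    prev_hat is the previous frame's reconstructed vector.
    alpha, pred_cb and safe_cb are hypothetical (assumed) parameters.
    Returns (mode, index, x_hat); mode 0 = predictive, 1 = safety net.
    """
    # Predictive path: quantize the residual after first-order prediction.
    pred = alpha * prev_hat
    i_p, c_p = nearest(pred_cb, x - pred)
    xhat_p = pred + c_p
    # Safety-net path: memoryless VQ of the vector itself.
    i_s, xhat_s = nearest(safe_cb, x)
    # Keep whichever path gives the smaller squared error.
    if np.sum((x - xhat_p) ** 2) <= np.sum((x - xhat_s) ** 2):
        return 0, i_p, xhat_p
    return 1, i_s, xhat_s

def decode_frame(mode, index, prev_hat, pred_cb, safe_cb, alpha=0.5):
    """Reconstruct one vector from the mode flag and codebook index."""
    if mode == 0:
        return alpha * prev_hat + pred_cb[index]
    # A safety-net frame does not depend on prev_hat, so an error in an
    # earlier frame stops propagating here.
    return safe_cb[index]
```

In this sketch the decoder needs the previous reconstruction only for predictive frames. That is why a safety-net frame limits how far a channel error can propagate, at the cost of one extra mode bit per frame.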
