Robust Speech Recognition with Nonstationary Noise

D. Zhang and C. Xie (PRC)


hospital noise, robustness, speech recognition, signal processing

The most popular approach to speech recognition uses Mel-frequency cepstral coefficients (MFCCs) as the features and Gaussian mixture models (GMMs) for the classification task. Though MFCCs have been very successful in speech recognition, they have the following two problems: 1) They do not have any physical interpretation, and 2) Liftering of cepstral coefficients has no effect in the re


Fig. 1: Robotic hospital bed based on speech recognition.

Speech recognition is a technology that can improve access to equipment control systems for people with physical disabilities, or in situations that interfere with the use of the hands. In this paper, a robust speech recognition system is introduced into a robotic hospital bed control system to improve robustness in noisy conditions. The proposed system combines second-order frequency filtering (FF2) with the RelAtive SpecTrAl (RASTA) technique. Two experiments were carried out, on both clean and noisy speech, comparing traditional Mel-frequency cepstral coefficients (MFCCs) with the new technique in a conventional HMM/Gaussian mixture model (HMM/GMM) recognition system. The tests show that the recognition system usually achieves better results when RASTA filtering is applied to the FF2 features, especially in less stationary noise conditions. This suggests that combining FF2 with RASTA filtering, one of which operates over frequency and the other over time, may cancel out different noise components in the speech signal.

This paper proposes a speech recognition system for use in hospital ward and surgery settings. Figure 1 shows the new speech-recognition-based robotic hospital bed built in our lab. To improve recognition accuracy in hospital noise, a newly designed technique is introduced that combines frequency filtering of the logarithmic filter-bank energies with RelAtive SpecTrAl (RASTA) processing.
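The feature pipeline described here, frequency filtering of the log filter-bank energies followed by RASTA filtering over time, can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the abstract does not give filter coefficients, so the sketch assumes the forms commonly cited in the literature, namely H(z) = z - z^-1 across bands for FF2 and a band-pass numerator 0.1(2 + z^-1 - z^-3 - 2z^-4) with a single pole for RASTA. The function names `ff2` and `rasta`, the zero padding at the band edges, the causal form of the RASTA filter, and the pole value 0.98 are all assumptions made for the example.

```python
import numpy as np


def ff2(log_fbe):
    """Second-order frequency filtering across filter-bank bands.

    log_fbe: array of shape (frames, bands) holding log filter-bank energies.
    Assumed filter: H(z) = z - z^-1, i.e. out[k] = x[k+1] - x[k-1],
    with zero padding beyond the first and last band (an assumption).
    """
    padded = np.pad(np.asarray(log_fbe, dtype=float), ((0, 0), (1, 1)))
    return padded[:, 2:] - padded[:, :-2]


def rasta(features, pole=0.98):
    """Causal RASTA-style band-pass filter applied along the time axis.

    features: array of shape (frames, dims).
    Assumed transfer function:
        0.1 * (2 + z^-1 - z^-3 - 2 z^-4) / (1 - pole * z^-1)
    The numerator sums to zero, so constant (stationary) offsets decay away.
    """
    x = np.asarray(features, dtype=float)
    numer = 0.1 * np.array([2.0, 1.0, 0.0, -1.0, -2.0])
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        acc = np.zeros(x.shape[1])
        # FIR part: weighted sum of the current and four previous frames.
        for k, b in enumerate(numer):
            if t - k >= 0:
                acc += b * x[t - k]
        # IIR part: leaky integration of the previous output frame.
        if t > 0:
            acc += pole * out[t - 1]
        out[t] = acc
    return out


def ff2_rasta(log_fbe, pole=0.98):
    """FF2 over frequency, then RASTA over time, as the abstract describes."""
    return rasta(ff2(log_fbe), pole=pole)
```

The two stages act on different axes of the same matrix: `ff2` differences neighbouring bands within each frame, while `rasta` filters each resulting coefficient's trajectory across frames, which matches the abstract's remark that one technique works over frequency and the other over time.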
