G. Nokas and E. Dermatas
Microphone arrays, spectrum entropy, speech recognition, direction-of-arrival
Detecting the speaker's position is a crucial task in distant speech recognition applications. In this paper, we present a novel speech beam-former for noisy environments. First, the localization algorithm extracts a set of candidate directions for the signal sources, using the spatial correlation matrices of a microphone array in the frequency domain. The beam-former then identifies the speech signal as the one arriving from the direction in which the signal's spectrum entropy is minimized. The proposed method is evaluated with a phoneme recognition system, using the TIMIT speech corpus and noise recordings from an air conditioner and a computer fan. Extensive experiments, carried out in an anechoic environment at SNRs from −10 to 25 dB, show almost perfect estimation of the speaker's direction-of-arrival in all cases. As a consequence, the recognition rate at 0 dB SNR is four times higher than that obtained with a single microphone. Direction-of-arrival accuracy improves significantly at very low SNRs.
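The minimum-entropy selection step can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that a beam-formed frame is already available for each candidate direction, and it uses the Shannon entropy of the normalized power spectrum as the entropy measure. The function names, the candidate angles, and the convention for silent frames are hypothetical. A pure tone stands in for a voiced speech frame, whose energy is concentrated in a few frequency bins, and uniform random samples stand in for broadband fan noise.

```python
import cmath
import math
import random


def power_spectrum(frame):
    """Power spectrum over the non-negative frequency bins (naive DFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy (in nats) of the normalized power spectrum."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        # Assumed convention: a silent frame is never selected.
        return float("inf")
    probs = [p / total for p in power]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_direction(beam_outputs):
    """Return the candidate direction whose frame has minimum entropy.

    beam_outputs maps a candidate angle (degrees, hypothetical) to the
    samples of the frame beam-formed toward that angle.
    """
    return min(beam_outputs, key=lambda a: spectral_entropy(beam_outputs[a]))


# Illustrative signals only: a tone as a stand-in for voiced speech,
# uniform random samples as a stand-in for broadband noise.
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(64)]

best = select_direction({30: noise, 75: tone})
print("selected direction:", best)
```

The tone's energy falls into a single frequency bin, so its entropy is close to zero, while the noise spreads energy across all bins and yields a much larger value. A practical system would replace the naive DFT with an FFT and average the entropy over several frames.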