Password Secured Speaker Recognition using Time and Frequency Domain Features

K.S. Prasad, K.A. Sheela, and M.M. Latha (India)


Keywords: Speaker Recognition, RBF Neural Network, DTW, TESPAR, MFCC, Vector Quantization, Kalman filter.


Automatic speaker recognition (ASR) is a pattern recognition problem: automatically recognizing speakers from their voices. A password-protected (secured) speaker recognition system adds an extra layer of security, because a person is not only identified by voice but must also utter a particular password correctly in order to access the system. In this paper we present a new modeling scheme for a password-protected speaker recognition system that cascades a modified RBF neural network using a Kalman filtering approach (ANN) with a Dynamic Time Warping (DTW) model. We propose to use TESPAR (Time Encoded Signal Processing And Recognition) features for speaker recognition and MFCC features for password recognition. The key problem is defining the TESPAR alphabet used in the TESPAR coding process; we propose an approach that generates this alphabet using Kohonen neural networks in a vector quantization process. For the recognition process, a modified RBF neural network using an extended Kalman filtering approach (ANN) is used. A DTW-based approach operating on the MFCCs is used to verify the password-related information. The combined model has been tested on 20 speakers, with an overall accuracy of 98.82%.
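
The password-verification stage can be illustrated with a minimal sketch of classic DTW between two sequences of feature vectors. This is not the authors' implementation: the function names `dtw_distance` and `accept`, the Euclidean local cost, the unconstrained warping path, and the acceptance threshold are all assumptions made for illustration, and the toy one-dimensional vectors stand in for real MFCC frames.

```python
import math

def dtw_distance(a, b):
    """Classic DTW distance between two sequences of feature vectors.

    Assumption: Euclidean distance as the local cost and no path
    constraints; the paper does not specify either choice.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats a frame
                                 cost[i][j - 1],      # b advances, a repeats a frame
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def accept(speaker_ok, utterance, template, threshold):
    """Hypothetical cascade decision: grant access only if the speaker
    stage accepted AND the spoken password matches the stored template."""
    return speaker_ok and dtw_distance(utterance, template) <= threshold

# Toy 1-D "feature" sequences: the utterance is the template spoken more slowly.
template = [(0.0,), (1.0,), (2.0,)]
utterance = [(0.0,), (0.0,), (1.0,), (2.0,)]
```

Because DTW lets one frame align with several frames of the other sequence, the slower utterance above aligns with its template at zero cost, which is why DTW suits passwords spoken at varying speeds. In a cascade of this kind, `speaker_ok` would come from the speaker-recognition stage and the threshold would have to be tuned on enrollment data.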
