Automatic Speaker Recognition

M.M. Moon and A. Cheeran (India)


Mel-frequency cepstrum coefficient, vector quantization


An automatic speaker recognition scheme is proposed and developed to identify or verify a person by his or her voice, using a novel method. Every speaker recognition system has two main phases: a training phase and a testing phase. In the training phase, features are extracted from the spoken words; in the testing phase, feature matching takes place. The feature extractor transforms the raw speech signal into a compact but effective representation that is more stable and discriminative than the original signal. The features, or template, thus extracted are stored in a database. During the recognition phase, features are extracted by the same technique and compared with the templates in the database. In the proposed automatic speaker recognition system, the speech features are extracted as Mel-frequency cepstrum coefficients (MFCC). This approach is based on psychophysical studies of human perception of the frequency content of sounds. Vector quantization (VQ) is used for speaker modeling. The final identification decision is based on the matching score: the speaker whose model yields the smallest matching score is selected as the speaker of the test speech sample.
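The VQ modeling and matching steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that feature vectors (MFCC frames in the paper) are already available and substitutes synthetic 3-dimensional vectors for them. It also assumes plain k-means for codebook training and average squared Euclidean distortion as the matching score. The speaker names and the helpers `train_codebook`, `distortion`, `identify`, and `fake_features` are hypothetical, introduced only for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_codebook(vectors, size, iters=20, seed=0):
    """Train a VQ codebook with plain k-means (an assumed stand-in
    for whatever codebook training the paper actually uses)."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, size)]
    for _ in range(iters):
        clusters = [[] for _ in codebook]
        for v in vectors:
            nearest = min(range(size), key=lambda k: sq_dist(v, codebook[k]))
            clusters[nearest].append(v)
        for k, members in enumerate(clusters):
            if members:  # keep the old codeword if its cluster is empty
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook

def distortion(vectors, codebook):
    """Matching score: mean distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)

def identify(test_vectors, codebooks):
    """Return the speaker whose codebook gives the smallest matching score."""
    scores = {name: distortion(test_vectors, cb) for name, cb in codebooks.items()}
    return min(scores, key=scores.get), scores

def fake_features(center, n, rng):
    """Synthetic stand-in for MFCC frames: Gaussian noise around a center."""
    return [[rng.gauss(c, 0.5) for c in center] for _ in range(n)]

rng = random.Random(42)
centers = {"alice": [0.0, 0.0, 0.0], "bob": [5.0, 5.0, 5.0]}

# Training phase: one codebook per enrolled speaker.
codebooks = {name: train_codebook(fake_features(c, 200, rng), size=4)
             for name, c in centers.items()}

# Testing phase: score an unknown sample against every codebook.
test = fake_features(centers["bob"], 50, rng)
best, scores = identify(test, codebooks)
print(best, scores)
```

In a real system the synthetic vectors would be replaced by MFCC frames computed from recorded speech, and the codebook size would be chosen empirically.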
