Robust Speaker Detection using Neural Networks

J.R. Shell (USA)


Neural Networks, Speech Recognition, Modeling


The work proposed in this paper uses neural networks to distinguish speech patterns. The feature extractor is a standard Linear Predictive Coding (LPC) cepstrum coder, which converts the incoming speech signal, captured through a Matlab interface, into LPC cepstrum feature space. A neural network converts each variable-length LPC trajectory of an isolated word into a fixed-length trajectory, producing the fixed-length feature vector that is fed into a recognizer. The recognizer is a Feed Forward (FF) network trained with Back Propagation (BP). It was tested with varying numbers of hidden layers and with hyperbolic tangent and sigmoid transfer functions to evaluate how well the output recognizes the feature vectors of isolated words. The feature vector was normalized and decorrelated using pruning techniques. The training process uses momentum to help reach the global minimum of the error surface while avoiding oscillation in local minima. The goal of the work is twofold: to identify, 100% of the time, a randomly chosen speech pattern from samples of four different speakers uttering the same phrase, and to verify the effectiveness of neural networks as a valid method for pattern recognition.
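
As an illustration of the training scheme described above, the following is a minimal sketch of a feed-forward network with a hyperbolic tangent hidden layer and a sigmoid output layer, trained by back propagation with momentum. It is not the implementation used in this work, which was built around a Matlab interface. The synthetic 12-dimensional vectors stand in for fixed-length LPC cepstrum features, and the function names, layer sizes, learning rate, and momentum value are all assumptions chosen for the example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def make_synthetic_features(n_speakers=4, n_per_speaker=20, n_features=12, seed=0):
    """Hypothetical stand-in for fixed-length LPC cepstrum feature vectors."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(0.0, 1.0, (n_speakers, n_features))
    labels = np.repeat(np.arange(n_speakers), n_per_speaker)
    X = centers[labels] + rng.normal(0.0, 0.1, (labels.size, n_features))
    T = np.eye(n_speakers)[labels]  # one-hot target per speaker
    return X, T


def train_ff_momentum(X, T, n_hidden=8, lr=0.1, momentum=0.9, epochs=500, seed=0):
    """Train a one-hidden-layer network with back propagation and momentum."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], T.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_out))
    b2 = np.zeros(n_out)
    params = [W1, b1, W2, b2]
    velocity = [np.zeros_like(p) for p in params]
    losses = []
    for _ in range(epochs):
        # Forward pass: tanh hidden layer, sigmoid output layer.
        H = np.tanh(X @ W1 + b1)
        Y = sigmoid(H @ W2 + b2)
        err = Y - T
        losses.append(0.5 * np.mean(np.sum(err ** 2, axis=1)))
        # Backward pass for the mean squared error.
        dZ2 = err * Y * (1.0 - Y) / len(X)
        dZ1 = (dZ2 @ W2.T) * (1.0 - H ** 2)
        grads = [X.T @ dZ1, dZ1.sum(axis=0), H.T @ dZ2, dZ2.sum(axis=0)]
        # Momentum update: each step blends the previous step with the new gradient.
        for p, v, g in zip(params, velocity, grads):
            v *= momentum
            v -= lr * g
            p += v  # in-place, so W1, b1, W2, b2 are updated too
    return params, losses


def predict(params, X):
    """Return the index of the most active output unit for each input row."""
    W1, b1, W2, b2 = params
    return np.argmax(sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2), axis=1)


X, T = make_synthetic_features()
params, losses = train_ff_momentum(X, T)
accuracy = np.mean(predict(params, X) == np.argmax(T, axis=1))
print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}, accuracy {accuracy:.2f}")
```

The velocity term carries part of the previous weight change into the current one, which damps oscillation across steep directions of the error surface. Momentum does not by itself guarantee convergence to the global minimum, so in practice the learning rate and momentum value would need tuning on real feature data.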

