Non-Speech Environmental Sound Identification for Surveillance using Self-Organizing-Maps

R. Sitte and L. Willets (Australia)


Signal Processing, Sound Identification, Sound Classification, Pattern Recognition, Self-Organizing Maps, Neural Networks


In this paper we present a new approach to non-speech sound identification using Self Organizing Maps (SOM). We have found that applying the SOM identification as a two-stage process yields an identification rate that surpasses that of other identification attempts. We have experimented with up to sixty different sounds while maintaining a success rate of 70%. Although even higher rates have been reported by others, such rates were typically obtained with fewer than ten different sounds. We also found that as few as four sound samples of each sound are sufficient for training the SOM. Non-speech environmental sound identification differs from speaker and musical instrument identification in that it spans (in theory) an unbounded range of frequencies. This makes the pool of sound patterns almost limitless and, consequently, the pattern matching extensive and difficult.

ACRONYMS
ANN: Artificial Neural Networks
CWT: Continuous Wavelet Transform
DTW: Dynamic Time Warping
GMM: Gaussian Mixture Models
HMM: Hidden Markov Models
LVQ: Learning Vector Quantization
LTS: Long-Term Statistics
MFCC: Mel Frequency Cepstral Coefficients
SOM: Self Organizing Maps
STFT: Short-Time Fourier Transform
