SPEAKER IDENTIFICATION IN EACH OF THE NEUTRAL AND SHOUTED TALKING ENVIRONMENTS BASED ON GENDER-DEPENDENT APPROACH USING SPHMMS

Ismail Shahin

References

  [1] S. Furui, Speaker-dependent-feature extraction, recognition, and processing techniques, Speech Communication, 10, 1991, 505–520.
  [2] S.E. Bou-Ghazale & J.H.L. Hansen, A comparative study of traditional and newly proposed features for recognition of speech under stress, IEEE Transactions on Speech and Audio Processing, 8 (4), 2000, 429–442.
  [3] G. Zhou, J.H.L. Hansen, & J.F. Kaiser, Nonlinear feature based classification of speech under stress, IEEE Transactions on Speech and Audio Processing, 9 (3), 2001, 201–216.
  [4] Y. Chen, Cepstral domain talker stress compensation for robust speech recognition, IEEE Transactions on Acoustics, Speech, and Signal Processing, 36 (4), 1988, 433–439.
  [5] I. Shahin, Improving speaker identification performance under the shouted talking condition using the second-order hidden Markov models, EURASIP Journal on Applied Signal Processing, 5 (4), 2005, 482–486.
  [6] I. Shahin, Enhancing speaker identification performance under the shouted talking condition using second-order circular hidden Markov models, Speech Communication, 48 (8), 2006, 1047–1055.
  [7] I. Shahin, Speaker identification in the shouted environment using suprasegmental hidden Markov models, Signal Processing, 88 (11), 2008, 2700–2708.
  [8] I. Shahin, Employing second-order circular suprasegmental hidden Markov models to enhance speaker identification performance in shouted talking environments, EURASIP Journal on Audio, Speech, and Music Processing, 2010, Article ID 862138, 10 pages, doi:10.1155/2010/862138.
  [9] I. Shahin, Gender-dependent speaker identification under shouted talking condition, 3rd International Conference on Communication, Computer and Power (ICCCP’09), Muscat, Oman, 2009, 332–336.
  [10] J. Adell, A. Bonafonte, & D. Escudero, Analysis of prosodic features: Towards modelling of emotional and pragmatic attributes of speech, XXI Congreso de la Sociedad Española para el Procesamiento del Lenguaje Natural, SEPLN, Granada, Spain, 2005.
  [11] T.S. Polzin & A.H. Waibel, Detecting emotions in speech, Cooperative Multimodal Communication, Second International Conference 1998, CMC, Tilburg, The Netherlands, 1998.
  [12] L.R. Rabiner & B.H. Juang, Fundamentals of speech recognition (Englewood Cliffs, NJ: Prentice Hall, 1993).
  [13] J.H.L. Hansen & S. Bou-Ghazale, Getting started with SUSAS: A speech under simulated and actual stress database, EUROSPEECH-97: International Conf. on Speech Communication and Technology, Rhodes, Greece, 1997, 1743–1746.
  [14] http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC99S78.
  [15] H. Bao, M. Xu, & T.F. Zheng, Emotion attribute projection for speaker recognition on emotional speech, INTERSPEECH 2007, Antwerp, Belgium, 2007, 758–761.
  [16] T. Vogt & E. André, Improving automatic emotion recognition from speech via gender differentiation, Proceedings of Language Resources and Evaluation Conference (LREC 2006), Genoa, Italy, 2006.
  [17] J. Dai, Isolated word recognition using Markov chain models, IEEE Transactions on Speech and Audio Processing, 3 (6), 1995, 458–463.
  [18] O.W. Kwon, K. Chan, J. Hao, & T.W. Lee, Emotion recognition by speech signals, 8th European Conference on Speech Communication and Technology 2003, Geneva, Switzerland, 2003, 125–128.
  [19] I. Luengo, E. Navas, I. Hernaez, & J. Sanches, Automatic emotion recognition using prosodic parameters, INTERSPEECH 2005, Lisbon, Portugal, 2005, 493–496.
  [20] T.H. Falk & W.Y. Chan, Modulation spectral features for robust far-field speaker identification, IEEE Transactions on Audio, Speech and Language Processing, 18 (1), 2010, 90–100.
  [21] J.H.L. Hansen & B.D. Womack, Feature analysis and neural network-based classification of speech under stress, IEEE Transactions on Speech and Audio Processing, 4 (4), 1996, 307–313.
  [22] T. Kinnunen & H. Li, An overview of text-independent speaker recognition: From features to supervectors, Speech Communication, 52 (1), 2010, 12–40.
  [23] T.L. Nwe, S.W. Foo, & L.C. De Silva, Speech emotion recognition using hidden Markov models, Speech Communication, 41 (4), 2003, 603–623.
  [24] D. Ververidis & C. Kotropoulos, Emotional speech recognition: Resources, features, and methods, Speech Communication, 48 (9), 2006, 1162–1181.
  [25] L.T. Bosch, Emotions, speech and the ASR framework, Speech Communication, 40 (1–2), 2003, 213–225.
  [26] W.H. Abdulla & N.K. Kasabov, Improving speech recognition performance through gender separation, Artificial Neural Networks and Expert Systems International Conference (ANNES), Dunedin, New Zealand, 2001, 218–222.
  [27] H. Harb & L. Chen, Gender identification using a general audio classifier, International Conference on Multimedia and Expo 2003 (ICME ’03), 2003, II733–II736.