AUDIO SIGNAL DISCRIMINATION USING EVOLUTIONARY SPECTRUM

A. I. Al-Shoshan

References

[1] G. Tzanetakis & P. Cook, Multifeature audio segmentation for browsing and annotation, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 1, New Paltz, NY, 1999, 91–94.
[2] K. Martin, Towards automatic sound source recognition: identifying musical instruments, Proc. NATO Computational Hearing Advanced Study Institute, Ciocco (Tuscany), Italy, 1, 1998, 54–61.
[3] P. Herrera-Boyer, Towards instrument segmentation for music content description: a critical review of instrument classification techniques, ISMIR, 1, 2000, 115–119.
[4] R.O. Gjerdingen, Using connectionist models to explore complex musical patterns, Computer Music Journal, 13(3), 1989, 67–75.
[5] D. Hörnel & T. Ragg, Learning musical structure and style by recognition, prediction and evolution, in D. Rossiter (Ed.), Int. Computer Music Conf., 1996, 59–62.
[6] M. Leman & P. Van Renterghem, Transputer implementation of the Kohonen feature map for a music recognition task, Proc. of the Second International Transputer Conference: Transputers for Industrial Applications II, Antwerp, BIRA, 1, 1989, 213–216.
[7] C. Stevens & C. Latimer, A comparison of connectionist models of music recognition and human performance, Minds and Machines, 2(4), 1992, 379–400.
[8] M. Kahrs & K. Brandenburg, Applications of digital signal processing to audio and acoustics (London: Kluwer Academic Publishers, 1998).
[9] P. Toiviainen, Modelling the target-note technique of Bebop-style jazz improvisation: an artificial neural network approach, Music Perception, 12(4), 1995, 399–413.
[10] C. Stevens & J. Wiles, Representations of tonal music: a case study in the development of temporal relationships, Proc. Connectionist Models Summer School, Hillsdale, NJ, Erlbaum, 1993, 228–235.
[11] N. Griffith & P.M. Todd, Musical networks (Cambridge, MA: Bradford Books/The MIT Press, 1999).
[12] R. Monelle, Linguistics and semiotics in music (Harwood Academic Publishers, 1992).
[13] B. Feiten & T. Ungvary, Organizing sounds with neural nets, Int. Computer Music Conference, San Francisco, 1991, 441–443.
[14] J.T. Foote, Content-based retrieval of music and audio, SPIE’97, 1997, 138–147.
[15] J. Saunders, Real-time discrimination of broadcast speech/music, IEEE ICASSP’96, 1996, 993–996.
[16] L. Rabiner & B.H. Juang, Fundamentals of speech recognition (NJ: Prentice-Hall, 1993).
[17] K. El-Maleh, A. Samouelian, & P. Kabal, Frame-level noise classification in mobile environments, Proc. IEEE ICASSP’99, 1999, 237–240.
[18] K. El-Maleh, M. Klein, G. Petrucci, & P. Kabal, Speech/music discrimination for multimedia applications, ICASSP, Istanbul, 2000, 2445–2448.
[19] B. Matityaho & M. Furst, Neural network based model for classification of music type, IEEE Catalogue, 95, 1995, 640–645.
[20] J.D. Hoyt & H. Wechsler, Detection of human speech using hybrid recognition models, IEEE, 2, 1994, 330–333.
[21] E. Scheirer & M. Slaney, Construction and evaluation of a robust multifeature speech/music discriminator, ICASSP’97, 2, Munich, Germany, 1997, 1021–1024.
[22] G. Tzanetakis & P. Cook, Audio analysis using the discrete wavelet transform (Report, Computer Science Department, Princeton University, 2001).
[23] D. Roy & C. Malamud, Speaker identification based text to audio alignment for an audio retrieval system, ICASSP’97, 2, Munich, Germany, 1997, 1099–1102.
[24] B. Kedem, Spectral analysis and discrimination by zero-crossings, Proceedings of the IEEE, 74(11), 1986, 1477–1492.
[25] P. Laine, Generating musical patterns using mutually inhibited artificial neurons, Proc. Int. Computer Music Conference, San Francisco, 1997, 422–425.
[26] A.I. Al-Shoshan, A. Al-Atiyah, & K. Al-Mashouq, A three-level speech, music, and mixture classifier, Journal of King Saud University (Engineering Sciences), 16(2), 2003, 23–30.
[27] H. Beigi, S. Maes, J. Sorensen, & U. Chaudhari, A hierarchical approach to large-scale speaker recognition, IEEE ICASSP’99, Phoenix, Arizona, 1999, 105–109.
[28] H. Jin, F. Kubala, & R. Schwartz, Automatic speaker clustering, Proc. of the Speech Recognition Workshop, 1997, 108–111.
[29] R. Meddis & M. Hewitt, Modelling the identification of concurrent vowels with different fundamental frequencies, Journal of the Acoustical Society of America, 91, 1992, 233–245.
[30] B.P. Bogert, M.J.R. Healy, & J.W. Tukey, The frequency analysis of time series for echoes: cepstrum, pseudo-autocovariance, cross-cepstrum, and saphe cracking, in M. Rosenblatt (Ed.), Proc. Symposium on Time Series Analysis (New York: John Wiley and Sons, 1963), 209–243.
[31] A. Eronen & A. Klapuri, Musical instrument recognition using cepstral coefficients and temporal features, Proc. ICASSP, 2000, 513–516.
[32] N.J.L. Griffith, Modelling the influence of pitch duration on the induction of tonality from pitch-use, Proc. Int. Computer Music Conf., San Francisco, 1994, 35–37.
[33] M.B. Priestley, Non-linear and non-stationary time series analysis (NY: Academic Press, 1988).
[34] A.I. Al-Shoshan, LTV system identification using the time-varying autocorrelation function and application to audio signal discrimination, ICSP’02, China, 2002, 419–422.
[35] M. Spina & V. Zue, Automatic transcription of general audio data: preliminary analysis, Proc. Fourth Int. Conf. on Spoken Language Processing (ICSLP 1996), 2, October 1996, 594–597.
[36] M.J. Carey, E.S. Parris, & H. Lloyd-Thomas, A comparison of features for speech, music discrimination, Proc. 1999 IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, 1, 1999, 149–153.
[37] G. Williams & D. Ellis, Speech/music discrimination based on posterior probability features, Eurospeech-99, 2, Budapest, Hungary, September 1999, 687–690.
[38] S. Srinivasan, D. Ponceleon, & D. Petkovic, Towards robust features for classifying audio in the CueVideo system, Proc. 7th ACM Int. Conf. on Multimedia, Orlando, FL, USA, 1999, 393–400.
[39] T. Zhang & C.-C.J. Kuo, Hierarchical classification of audio data for archiving and retrieving, IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP’99), 6, March 1999, 3001–3004.
[40] Y. Nakajima, Y. Lu, M. Sugano, A. Yoneyama, H. Yanagihara, & A. Kurematsu, A fast audio classification from MPEG coded data, IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP’99), 6, March 1999, 3005–3008.
[41] G. Lu & T. Hankinson, An investigation of automatic audio classification and segmentation, Proc. 5th Int. Conf. on Signal Processing (WCCC-ICSP 2000), 2, 2000, 1026–1032.
[42] P.J. Moreno & R. Rifkin, Using the Fisher kernel method for web audio classification, Proc. 2000 IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, 4, June 2000, 2417–2420.
[43] B. Schuller, F. Eyben, & G. Rigoll, Static and dynamic modelling for the recognition of non-verbal vocalisations in conversational speech, PIT 2008, 99–110.
[44] B. Schuller, F. Wallhoff, D. Arsic, & G. Rigoll, Musical signal type discrimination based on large open feature sets, ICME 2006, 1089–1092.
[45] A. Hyvärinen & E. Oja, Independent component analysis: algorithms and applications, Neural Networks, 13(4–5), 2000, 411–430.
[46] G. Mu & D.L. Wang, An extended model for speech segregation, Proc. IEEE, 2001, 1089–1094.
[47] V. Peltonen, Computational auditory scene recognition, Master of Science Thesis, Tampere University of Technology, Department of Information Technology, February 2001.
