cROVER: THE CONTEXT-AUGMENTED ROVER

Kacem Abida, Fakhri Karray, and Wafa Abida

References

  1. [1] J. Fiscus, A post-processing system to yield reduced word error rates: recogniser output voting error reduction (ROVER), Proc. 1997 IEEE Workshop on Automatic Speech Recognition and Understanding, Santa Barbara, CA, 1997, 347–352.
  2. [2] B. Hoffmeister, D. Hillard, S. Hahn, R. Schlüter, M. Ostendorf, and H. Ney, Cross-site and intra-site ASR system combination: comparisons on lattice and 1-best methods, in IEEE Int. Conf. Acoustics, Speech and Signal Processing, 2007, ICASSP 2007, 4, 2007, 1145–1148.
  3. [3] B. Hoffmeister, T. Klein, R. Schlüter, and H. Ney, Frame based system combination and a comparison with weighted ROVER and CNC, Ninth Int. Conf. Spoken Language Processing, ISCA, 2006.
  4. [4] L. Mangu, E. Brill, and A. Stolcke, Finding consensus in speech recognition: word error minimization and other applications of confusion networks, Computer Speech and Language, 14(4), 2000, 373–400.
  5. [5] K. Abida and F. Karray, Systems combination in large vocabulary continuous speech recognition, IEEE Int. Conf. Autonomous and Intelligent Systems (AIS 2010), June 2010.
  6. [6] K.D. Voll, A methodology of error detection: improving speech recognition in radiology, Ph.D. dissertation, Simon Fraser University, 2006.
  7. [7] D.Z. Inkpen and A. Desilets, Semantic similarity for detecting recognition errors in automatic speech transcripts, HLT/EMNLP, The Association for Computational Linguistics, 2005.
  8. [8] A. Sarma and D.D. Palmer, Context-based speech recognition error detection and correction, Proc. HLT-NAACL 2004, Association for Computational Linguistics, 2004, 85–88.
  9. [9] K.D. Voll, M.S. Atkins, and B. Forster, Improving the utility of speech recognition through error detection, Journal of Digital Imaging, 21(4), 2008, 371–377.
  10. [10] S. Kaki, E. Sumita, and H. Iida, A method for correcting errors in speech recognition using the statistical features of character co-occurrence, COLING-ACL, 1998, 653–657.
  11. [11] J. Fiscus, J. Garofolo, M. Przybocki, W. Fisher, and D. Pallett, 1997 English broadcast news speech (HUB4) (Philadelphia: Linguistic Data Consortium, 1998).
  12. [12] D. Huggins-Daines, CMU Sphinx open source models, 2008. [Online]. Available: http://www.speech.cs.cmu.edu/sphinx/models/
  13. [13] T. Brants and A. Franz, Web 1T 5-gram version 1 (Philadelphia: Linguistic Data Consortium, 2006). [Online]. Available: http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13
