Acceleration of the Gaussian Emissions Computation in a Multi-stream based Continuous Speech Recogniser

V.C. Pera and A.J. Araújo (Portugal)


Automatic speech recognition, multi-streaming, field programmable gate arrays, computational complexity, hardware acceleration.


Multi-stream automatic speech recognisers can achieve higher recognition rates than conventional systems. This advantage is particularly evident in recognition tasks where robustness to certain types of noise is critical, which is a very important issue in real-world applications. However, most multi-stream approaches remain confined to research because of their higher computational complexity. The fact that this problem has not been addressed satisfactorily in the literature is the main motivation for this study. In our work we investigated the acceleration of the acoustic likelihood computation, the most time-consuming part of the whole recogniser. This paper presents results on the computational complexity of the Gaussian mixture emission estimation in a multi-stream statistical framework. Some results on how recognition performance depends on the numeric precision at different stages of that process are also presented. To achieve a higher acceleration of some critical computation blocks, a hardware implementation based on Field Programmable Gate Array (FPGA) technology is proposed.
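As an illustration of the computation whose cost is discussed above, the following minimal sketch evaluates a multi-stream Gaussian mixture emission in the log domain. It is not the authors' implementation. It assumes diagonal covariance matrices and the common combination rule in which the per-stream log-likelihoods are summed with stream weights. All function names are hypothetical, and `quantize` is only a simple stand-in for studying reduced numeric precision by rounding to a fixed-point grid.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def log_gmm(x, weights, means, variances):
    """Log-likelihood of x under a Gaussian mixture, using log-sum-exp."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)  # subtract the largest term to avoid underflow
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def log_multistream_emission(streams, stream_weights, gmms):
    """Weighted sum of per-stream GMM log-likelihoods (assumed combination rule)."""
    return sum(
        w * log_gmm(x, *gmm)
        for x, w, gmm in zip(streams, stream_weights, gmms)
    )


def quantize(value, frac_bits):
    """Round value to a fixed-point grid with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


# Two one-dimensional streams, each modelled by a single unit Gaussian.
gmm = ([1.0], [[0.0]], [[1.0]])
score = log_multistream_emission([[0.0], [0.0]], [1.0, 0.5], [gmm, gmm])
coarse_score = quantize(score, 4)  # same score kept with 4 fractional bits
```

The inner loop over feature dimensions, mixture components and streams is a multiply-accumulate pattern, which is the kind of block that lends itself to a parallel hardware implementation. Comparing `score` with `coarse_score` gives a rough sense of how much precision a given word length discards.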
