K. Koutsomyti, S.R. Parr, V.A. Chouliaras, and J. Nunez (UK)
Speech processing devices, Speech Coding, VoIP, Coprocessors, Embedded systems, RISC CPU.
This work quantifies the performance benefit of vectorized implementations of the ITU-T G.729A and G.723.1 speech coding standards. Architecture-level experiments with added custom vector instructions indicate a reduction in the dynamic instruction count of the two workloads on the order of 51% and 65%, respectively, at a vector register length of sixteen 16-bit elements. The identified vector instructions are encapsulated in a configurable vector accelerator that attaches to an open-source RISC CPU. The vector ISA is further extended with a number of custom scalar arithmetic instructions, which yield an additional benefit of 17% and 10%, respectively. We present a new implementation of the combined scalar-vector accelerator that maintains zero load-use latency while reducing the silicon footprint through dynamic allocation of the vector datapath to the scalar coprocessor.
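As a rough illustration of why vectorization lowers the dynamic instruction count, the sketch below compares a scalar multiply-accumulate loop with a modeled vector multiply-accumulate that consumes sixteen 16-bit elements per operation. The function names (`scalar_mac`, `vector_mac`), the operation semantics, and the counting scheme are assumptions made for illustration and are not the instructions defined in this work. The model counts only multiply-accumulate operations and ignores loads, stores, and loop overhead, so it does not reproduce the reported 51% and 65% figures.

```python
VLEN = 16  # vector register length in 16-bit elements, as stated in the abstract


def sat32(x):
    """Clamp to the signed 32-bit range (a simplified stand-in for codec saturation)."""
    return max(-2**31, min(2**31 - 1, x))


def scalar_mac(a, b):
    """Scalar dot product: one modeled multiply-accumulate operation per element pair."""
    acc, ops = 0, 0
    for x, y in zip(a, b):
        acc = sat32(acc + x * y)
        ops += 1
    return acc, ops


def vector_mac(a, b, vlen=VLEN):
    """Hypothetical vector MAC: one modeled operation per block of up to vlen element pairs."""
    acc, ops = 0, 0
    for i in range(0, len(a), vlen):
        # All lanes of the block are multiplied and reduced in a single modeled operation.
        partial = sum(x * y for x, y in zip(a[i:i + vlen], b[i:i + vlen]))
        acc = sat32(acc + partial)
        ops += 1
    return acc, ops


if __name__ == "__main__":
    a = list(range(1, 41))  # 40 samples, the length of one G.729 subframe
    b = [2] * 40
    print(scalar_mac(a, b))
    print(vector_mac(a, b))
```

In this model the two variants agree whenever no intermediate sum saturates. The vector variant saturates once per block rather than once per element, so a real design would need to define that behavior precisely.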