M. Yang, J. Wang, S.Q. Zheng, and Y. Jiang
Polynomial approximation functions, instruction-level parallelism processors, DSPs, parallel functional units, data dependency graph, critical path
The authors propose a general code optimization method for implementing polynomial approximation functions on clustered instruction-level parallelism (ILP) processors. The method first introduces a parallel algorithm with minimized data dependency. It then constructs the data dependency graph (DDG) of that algorithm and uses the proposed parallel scheduling and mapping (PSAM) algorithm to schedule the graph and map it to appropriate clusters and functional units of a specific clustered ILP processor. The PSAM algorithm prioritizes the nodes on the critical path to minimize the total schedule length, and it ensures that the resulting schedule satisfies the resource constraints imposed by the target clustered ILP processor. As a result, the method produces schedule lengths close to the lower bounds determined by the critical path lengths of the DDGs. Experimental results for typical polynomial mathematical functions on the TI ’C67x DSP show that the proposed method achieves significant performance improvement over the traditional computation method.
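The idea of shortening the critical path and then scheduling against limited functional units can be illustrated with a small sketch. The abstract does not name the parallel algorithm or give the details of PSAM, so everything below is an assumption made for illustration: Estrin's scheme stands in for the parallel algorithm, Horner's rule stands in for the traditional method, and `schedule_length` is a plain critical-path-first list scheduler with unit latency. Clusters, inter-cluster transfers, functional unit types, and real instruction latencies are not modeled, and all names (`horner`, `estrin`, `HORNER_DDG`, `ESTRIN_DDG`, `schedule_length`) are hypothetical.

```python
def horner(coeffs, x):
    """Sequential evaluation; coeffs[i] multiplies x**i."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c  # each step depends on the previous one
    return acc


def estrin(coeffs, x):
    """Estrin's scheme (assumed stand-in, not necessarily the paper's algorithm)."""
    terms = list(coeffs) or [0.0]
    power = x
    while len(terms) > 1:
        if len(terms) % 2:
            terms.append(0.0)  # pad so every term has a partner
        # The pairs within one level are independent of each other.
        terms = [terms[i] + terms[i + 1] * power
                 for i in range(0, len(terms), 2)]
        power = power * power
    return terms[0]


# Hand-built DDGs for a degree-3 polynomial: node -> list of predecessors.
# Multiplies (m*) and adds (s*) are separate unit-latency nodes (assumption).
HORNER_DDG = {"m1": [], "s1": ["m1"], "m2": ["s1"],
              "s2": ["m2"], "m3": ["s2"], "s3": ["m3"]}
ESTRIN_DDG = {"m_a": [], "m_b": [], "sq": [],
              "s_a": ["m_a"], "s_b": ["m_b"],
              "m_c": ["s_b", "sq"], "s": ["s_a", "m_c"]}


def schedule_length(preds, units):
    """Greedy list scheduling: each cycle, issue up to `units` ready nodes,
    preferring the node with the longest remaining path to a sink."""
    succs = {n: [] for n in preds}
    for node, parents in preds.items():
        for p in parents:
            succs[p].append(node)

    height = {}

    def remaining_path(n):
        if n not in height:
            height[n] = 1 + max((remaining_path(s) for s in succs[n]), default=0)
        return height[n]

    done, cycles = set(), 0
    while len(done) < len(preds):
        ready = [n for n in preds
                 if n not in done and all(p in done for p in preds[n])]
        ready.sort(key=remaining_path, reverse=True)
        done.update(ready[:units])
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(horner([1, 2, 3, 4], 2.0), estrin([1, 2, 3, 4], 2.0))
    print(schedule_length(HORNER_DDG, 2), schedule_length(ESTRIN_DDG, 2))
```

In this toy model with two functional units, the Horner graph needs 6 cycles because it is a single dependency chain, while the Estrin graph is scheduled in 4 cycles, which equals its critical path length. That mirrors the abstract's claim only qualitatively; it says nothing about the actual figures measured on the ’C67x.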