Architectural Support for 3D Graphics in the Complex Streamed Instruction Set

D. Cheresiz, B. Juurlink, S. Vassiliadis, and H.A.G. Wijshoff (The Netherlands)


Multimedia, 3D graphics, processor architecture


In this paper we extend the previously proposed Complex Streamed Instruction Set (CSI) architecture to provide for floating-point computations and conditional execution in order to efficiently support 3D graphics applications. The CSI extension is evaluated using an industry-standard 3D benchmark and compared to Intel's Streaming SIMD Extensions (SSE). Compared to a 4-way issue superscalar processor extended with SSE and capable of processing 8 single-precision floating-point operations in parallel, the same processor extended with CSI attains speedups of 2.8 and 2.13 on the transform and lighting kernels, respectively, and a speedup of 1.61 on the geometry computations as a whole. We also study how performance scales with the number of floating-point units and observe that the CSI extension utilizes them more efficiently than SSE. Finally, the performance bottlenecks of SSE-enhanced superscalar CPUs on the 3D graphics workload are identified. Results show that the performance of the 4-way issue machines is limited by the issue width and that of the 8-way machines is limited by the number of cache ports.
