S.N. Vadlamani and S.F. Jenks (USA)
Parallel Processing, Thread-Level Parallelism, Producer Consumer Parallelism, Pipelined Parallelism.
In many of the current and next generation Chip Multiprocessors (CMP) and Simultaneous Multithreading (SMT) processor systems, the processors share one or more cache levels along with the memory interface. Being shared resources, the caches and the memory interface are critical to the performance of the overall system. So, while these processor systems offer significant potential for parallelism, programmers and compiler writers must ensure that their applications use the shared caches and the memory interface in an efficient manner. In this paper, we demonstrate that for several important classes of applications, the commonly used spatial decomposition model leads to an inefficient usage of the shared caches and the memory interface due to increased cache misses and memory bus utilization. As an alternative, we propose and define the Synchronized Pipelined Parallelism Model (SPPM) for parallelizing such applications on CMP and SMT processors. We specifically target the Level 2 cache, and show that this model provides better performance than the spatial decomposition model, while maintaining the overall cache misses and the memory bus utilization at around the same levels as those in the original sequential applications.
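To make the producer-consumer structure behind a pipelined model concrete, the following is a minimal sketch and not the authors' SPPM implementation. It assumes a two-stage computation in which a producer thread transforms one block of data at a time and a consumer thread processes each block shortly afterwards, with a bounded queue limiting how far the producer can run ahead. The names `run_pipeline`, `block_size`, and `max_lead` are hypothetical, as are the two stages (doubling and summing). The sketch shows only the synchronization pattern; CPython threads do not execute bytecode in parallel, so it does not reproduce any cache behavior.

```python
import queue
import threading


def run_pipeline(data, block_size=4, max_lead=2):
    """Hypothetical two-stage pipeline: stage 1 doubles a block, stage 2 sums it."""
    # Bounded queue: the producer waits while max_lead blocks are already pending,
    # which keeps the two stages working on nearby blocks.
    handoff = queue.Queue(maxsize=max_lead)
    partial_sums = []

    def producer():
        for start in range(0, len(data), block_size):
            block = [2 * x for x in data[start:start + block_size]]  # stage 1
            handoff.put(block)
        handoff.put(None)  # sentinel: no more blocks

    def consumer():
        while True:
            block = handoff.get()
            if block is None:
                break
            partial_sums.append(sum(block))  # stage 2

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return partial_sums
```

In this sketch, a smaller `max_lead` keeps the consumer closer behind the producer; in a shared-cache setting the assumed intent would be that a block is consumed while it is still cache-resident, but the right block size and lead would depend on the actual cache and application.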