C. Leopold (Germany)
High Performance Computing, Applications, Memory and I/O Systems, Optimization
Tiling is well-known to reduce the number of cache misses in linear relaxation codes. This paper investigates analyti cally how close to optimum the improvement gets. We con sider one time step of the Jacobi and Gauss-Seidel methods on a two-dimensional array of size (N +2) × (N +2). For cache capacity C and line size L, we prove that at least 0.6N2 /(LC) capacity misses are taken, independent on the schedule of operations in the program. Furthermore, we show that tiled codes are off this bound by a factor of 4L. Finally, we reduce the factor to 7 with a sophisticated data layout scheme.
Important Links:
Go Back