Design and Evaluation of an Auto-Memoization Processor

T. Tsumura, I. Suzuki, Y. Ikeuchi, H. Matsuo, H. Nakashima, and Y. Nakashima (Japan)

Keywords

memoization, computational reuse, speculative multithreading, CAM

Abstract

This paper describes the design and evaluation of an auto-memoization processor. The key point of the proposal is that multilevel functions and loops are detected without any additional instructions inserted by the compiler. This general-purpose processor detects functions and loops, and memoizes them automatically and dynamically. Hence, any load module or binary program can gain speedup without recompilation or rewriting. We also propose parallel execution by multiple speculative cores and one main memoizing core. While the main core executes a memoizable region, the speculative cores execute the same region simultaneously, using predicted inputs. This makes it possible to omit the execution of instruction regions whose inputs increase or decrease monotonically, and may make effective use of surplus cores in the coming many-core era. Experiments with the GENEsYs genetic algorithm programs show that the auto-memoization processor achieves a substantial speedup: up to 7.1-fold, and 1.6-fold on average. Experiments with the SPEC CPU95 benchmark suite show that auto-memoization with three speculative cores achieves up to a 2.9-fold speedup for 102.swim, and 1.4-fold on average. They also show that parallel execution by speculative cores reduces cache misses, much as prefetching does.
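
As a rough software analogy of the idea in the abstract (not the hardware design itself), the sketch below memoizes a region keyed by its input and precomputes results for predicted inputs. Everything here is an illustrative assumption: the class and method names are hypothetical, the region takes a single integer input, the predictor simply extends the last observed stride, and the "speculative cores" are simulated sequentially rather than run in parallel.

```python
class MemoTable:
    """Software stand-in for a reuse table: maps a region's input to its output."""

    def __init__(self, region):
        self.region = region   # the memoizable function or loop body (assumed pure)
        self.table = {}        # input -> stored output
        self.history = []      # inputs seen so far, used for stride prediction
        self.hits = 0          # calls answered from the table
        self.executions = 0    # calls the "main core" actually had to execute

    def predict_inputs(self, count):
        """Guess the next `count` inputs by extending the last observed stride."""
        if len(self.history) < 2:
            return []
        last, prev = self.history[-1], self.history[-2]
        stride = last - prev
        return [last + stride * k for k in range(1, count + 1)]

    def speculate(self, count=3):
        """Simulated speculative cores: precompute results for predicted inputs."""
        for x in self.predict_inputs(count):
            if x not in self.table:
                self.table[x] = self.region(x)

    def call(self, x):
        """Main-core path: reuse a stored result if present, else execute and store."""
        if x in self.table:
            self.hits += 1
            result = self.table[x]
        else:
            self.executions += 1
            result = self.region(x)
            self.table[x] = result
        self.history.append(x)
        self.speculate()
        return result


# Usage: inputs that increase monotonically with a fixed stride of 2.
memo = MemoTable(lambda x: x * x)
results = [memo.call(x) for x in range(0, 20, 2)]
print(memo.hits, memo.executions)
```

With a constant stride, only the first two calls are executed on the main path; once a stride is visible, every later input has already been precomputed. A plain memo table alone would never hit on this sequence, since no input repeats, which is the case the predicted-input execution is meant to cover.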
