Exploiting Instruct Recognized Scheme to Improve Processor Performance

Chin-Yung Chen, Jenn Tang, Dong-Liang Lee, and Jih-Fu Tu

References

  1. [1] W.C. Fu & J.H. Patel, Data prefetching in multiprocessor vector cache memories, Proc. 18th Int. Symp. on Computer Architecture, May 1991, 54–63.
  2. [2] J.L. Baer & T.F. Chen, An effective on-chip preloading scheme to reduce data access penalty, Proc. Supercomputing '91, November 1991, 176–186.
  3. [3] T.F. Chen & J.L. Baer, Effective hardware-based data prefetching for high-performance processor, IEEE Trans. on Computers, May 1995, 318–328. doi:10.1109/12.381947
  4. [4] N.P. Jouppi & D. Wall, Available instruction-level parallelism for superscalar and superpipelined machines, Proc. 13th Int. Conf. on Architectural Support for Programming Languages and Operating Systems, April 1989, 272–282. doi:10.1145/70082.68207
  5. [5] K. Hwang, Advanced computer architecture: Parallelism, scalability, programmability (McGraw-Hill, 1997).
  6. [6] J. Smith, Sequential program prefetching in memory hierarchical, IEEE Computer, 47(12), 1978, 7–21.
  7. [7] J.-F. Tu, Y.-H. Wang, & L.-H. Wang, A dynamic data prefetching method of improving the memory latency, Fourth Int. Conf./Exhibition on High Performance Computing in Asia-Pacific Region, Beijing, China, May 14–17, 2000, 13–18. doi:10.1109/HPC.2000.846508
  8. [8] G. Doshi, R. Krishnaiyer, & K. Muthukumar, Optimizing software data prefetches with rotating registers, Proc. 2001 Int. Conf. on Parallel Architectures and Compilation Techniques, 2001, 257–267. doi:10.1109/PACT.2001.953306
  9. [9] D.J. Lilja & S.P. Vander Wiel, A compiler-assisted data prefetch controller, Int. Conf. on Computer Design (ICCD '99), 1999, 372–377.
  10. [10] J.L. Hennessy & D. Patterson, Computer architecture: A quantitative approach, 2nd ed. (Morgan Kaufmann, 1996).
  11. [11] SPEC organization, The user guide of SPEC CPU95, 1996.
  12. [12] P. Cao, E. Felten, & K. Li, Implementation and performance of application-controlled file caching, prefetching, and disk scheme, ACM Trans. on Computer Systems, November 1996, 217–234.
  13. [13] J. Smith & J. Lee, Branch prediction strategies and branch target buffer design, IEEE Trans. on Computer, 17(1), 1984, 6–22.
  14. [14] N.P. Jouppi, Improving direct-mapped cache performance by the addition of a small fully-associative cache and performance buffer, Proc. 17th Annual Int. Symp. on Computer Architecture, May 1990, 364–373. doi:10.1109/ISCA.1990.134547
  15. [15] S. Palacharla & R.E. Kessler, Evaluating stream buffers as a secondary cache replacement, Proc. 21st Annual Int. Symp. on Computer Architecture, April 1994, 24–33.
  16. [16] T.-Y. Yeh & Y.N. Patt, Alternative implementations of two-level adaptive branch prediction, 19th Annual Int. Symp. of Computer Architecture, Gold Coast, Australia, May 1992, 124–134. doi:10.1109/ISCA.1992.753310
  17. [17] B. Calder & D. Grunwald, Next cache line and set prediction, 22nd Annual Int. Symp. on Computer Architecture, Santa Margherita Ligure, Italy, June 1995, 287–297. doi:10.1145/223982.224439
  18. [18] S.S. Pinter, Tango: A hardware-based data prefetching technique for superscalar processors, 17th Annual Int. Symp. of Computer Architecture, San Diego, CA, May 1995, 214–225. doi:10.1109/MICRO.1996.566463
  19. [19] D. Joseph & D. Grunwald, Prefetching using Markov predictors, IEEE Trans. on Computers, 48(2), 1999, 121–133. doi:10.1109/12.752653
