Task Level Pipelining on Multiple Accelerators via FPGA Switch

Takaaki Miyajima, Takuya Kuhara, Toshihiro Hanawa, Hideharu Amano, and Taisuke Boku

Keywords

Interconnect for accelerators, GPU cluster, Accelerator computing

Abstract

We demonstrate task-level pipelining on multiple accelerators with PEACH2. PEACH2, which is implemented on an FPGA, enables ultra-low-latency direct communication among multiple accelerators across computation nodes. Installing PEACH2 tightly couples typical high-performance computation nodes. In this environment, applications can be accelerated by exploiting not only data-level parallelism but also task-level parallelism. Furthermore, multiple tasks can be processed on multiple accelerators in a pipelined manner. In our evaluation, an application implemented in a task-level pipelined manner achieves a 52% speedup compared to a single GPU.
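As a rough illustration of the task-level pipelining idea, the following host-side sketch chains stages through FIFO queues, so that one stage can work on item i+1 while the next stage works on item i. It is not the PEACH2 interface and not the authors' implementation: the names `stage`, `run_pipeline`, `task_a`, and `task_b` are hypothetical, and Python threads and queues stand in for accelerators and the direct links between them.

```python
import queue
import threading

SENTINEL = object()  # unique marker for the end of the input stream


def stage(worker, inbox, outbox):
    """Run one pipeline stage: apply `worker` to each item until SENTINEL arrives."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # pass the shutdown signal to the next stage
            return
        outbox.put(worker(item))


def run_pipeline(items, workers):
    """Chain `workers` with FIFO queues; each stage runs in its own thread."""
    queues = [queue.Queue() for _ in range(len(workers) + 1)]
    threads = [
        threading.Thread(target=stage, args=(w, queues[i], queues[i + 1]))
        for i, w in enumerate(workers)
    ]
    for t in threads:
        t.start()

    # Feed the first stage, then signal the end of the stream.
    for item in items:
        queues[0].put(item)
    queues[0].put(SENTINEL)

    # Drain the last queue; FIFO queues with one thread per stage keep order.
    results = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        results.append(out)

    for t in threads:
        t.join()
    return results


# Hypothetical tasks standing in for kernels running on two accelerators.
def task_a(x):
    return x * 2


def task_b(x):
    return x + 1


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3], [task_a, task_b]))  # prints [3, 5, 7]
```

In this sketch the queues play the role of the inter-accelerator links: the lower the cost of handing an item to the next stage, the more of the overlap between stages turns into actual speedup.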

Important Links:

DOI: 10.2316/P.2014.811-026
From Proceeding (810) Software Engineering / 811: Parallel and Distributed Computing and Networks / 816: Artificial Intelligence and Applications - 2014
