Averages, Distributions and Scalability of MPI Communication Times for Ethernet and Myrinet Networks

N.A.W.A. Hamid and P. Coddington (Australia)

Keywords

MPI benchmarks, parallel computer, network performance.

Abstract

Most modern parallel computers are clusters using Myrinet or Ethernet communication networks. Several studies have been published comparing the performance of these two networks for parallel computing; however, these focus on average performance and do not address the distributions of communication times, which can have long tails due to contention effects. In the case of Ethernet with TCP, retransmit timeouts (RTOs) can also occur. Slow communication events may have a significant impact, particularly for applications requiring frequent synchronization, where the performance is determined by the slowest process. We have analysed the distributions of communication times for standard MPI routines on Ethernet with TCP and Myrinet with GM communication networks on the same cluster, and studied the scalability of the distributions as the number of communicating processes is increased, as well as the effect of RTOs for Ethernet with TCP.
