Y. Qian, M.J. Rashti, and A. Afsahi (Canada)
Collective Communications, MPI, All-gather, InfiniBand, Multi-core Clusters
MPI_Allgather is a collective communication operation that is used intensively in many scientific applications. Due to the high data-exchange volume of MPI_Allgather, an efficient and scalable implementation of this operation is critical to the performance of scientific applications running on emerging multi-core clusters. Mellanox ConnectX is a modern InfiniBand host channel adapter that can serve data communication over multiple simultaneously active connections in a scalable manner. In this work, we exploit this feature of the ConnectX architecture and propose novel multi-connection-aware RDMA-based MPI_Allgather algorithms for different message sizes on multi-core clusters. Our algorithms also use shared memory for intra-node communication. The performance results indicate that the proposed four-group multi-connection-aware gather-based algorithm is the best algorithm for messages up to 32B and achieves up to a 3.1-fold improvement for 4B messages over the native MVAPICH implementation. The proposed single-group multi-connection-aware algorithm is the algorithm of choice for 64B to 2KB messages and outperforms MVAPICH by a factor of 2.3 for 64B messages. The best algorithm for 4-64KB messages is the two-group multi-connection-aware algorithm, which achieves an improvement of up to 1.8 times for 4KB messages over MVAPICH.
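For readers unfamiliar with the operation being optimized, the sketch below shows a minimal, standard use of MPI_Allgather in C. It illustrates only the collective's semantics (every rank ends up with every rank's contribution); it is not the authors' multi-connection-aware RDMA or shared-memory implementation, and the buffer sizes chosen here are arbitrary.

```c
/* Minimal illustration of the standard MPI_Allgather collective that the
 * paper optimizes. Plain MPI usage only; not the proposed algorithms. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each process contributes one integer (its own rank, for illustration). */
    int sendval = rank;
    int *recvbuf = malloc(size * sizeof(int));

    /* After the call, every process holds the contribution of every rank. */
    MPI_Allgather(&sendval, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

    if (rank == 0) {
        for (int i = 0; i < size; i++)
            printf("rank %d contributed %d\n", i, recvbuf[i]);
    }

    free(recvbuf);
    MPI_Finalize();
    return 0;
}
```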