Gilad Shainer, Pak Lui, Tong Liu, Todd Wilde, and Jeffrey Layton
Inter-node latency, Clustering, InfiniBand, WRF
High-performance applications typically require the lowest possible latency so that parallel processes stay as closely synchronized as possible. In the past, this requirement drove the adoption of SMP machines, in which the floating-point elements (CPUs, GPUs) were placed on the same board wherever possible. With growing demand for compute capability and the push to lower the cost of adoption so that large-scale HPC becomes more widely available, clustering has increasingly become the preferred architecture for high-performance computing. In this paper, we investigate the performance impact of “going outside the box,” that is, the penalty of inter-node latency relative to intra-node latency. We review the requirements that inter-node communication elements must meet to minimize this penalty, and we review test results for one of the leading HPC applications. We also explore and suggest new usage models that can be created using technologies that meet these requirements.