Investigating Operating System Noise in Extreme-Scale High-Performance Computing Systems using Simulation

Christian Engelmann


high-performance computing, performance evaluation, operating system noise, parallel discrete event simulation


Hardware/software co-design for future-generation high-performance computing (HPC) systems aims to close the application-architecture performance gap, that is, the gap between the peak capabilities of the hardware and the performance that applications actually realize. Performance profiling of architectures and applications is a crucial part of this iterative process. This paper focuses on operating system (OS) noise as an additional factor to consider in co-design. It takes a first step toward including OS noise in HPC hardware/software co-design by adding a noise injection feature to an existing simulation-based co-design toolkit. The work reuses an existing abstraction that describes OS noise by its frequency (how often the noise recurs) and its period (the duration of each occurrence), and uses it to extend the processor model of the Extreme-scale Simulator (xSim) with synchronized and random OS noise simulation. The results demonstrate this capability by evaluating the impact of OS noise on MPI_Bcast() and MPI_Reduce() in a simulated future-generation HPC system with 2,097,152 compute nodes.
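
The noise abstraction described above can be illustrated with a minimal Python sketch. It is not xSim code and does not reproduce the paper's processor model. All names (`finish_time`, `slowest_rank`, `interval`, `duration`, `phase`) are hypothetical. The sketch assumes that noise recurs every `interval` time units (the inverse of the frequency), that each occurrence blocks computation for `duration` time units (the "period" in the abstract's terminology), that "synchronized" means all ranks share the same noise phase, and that "random" means each rank draws its own phase. Times are integers (for example, microseconds) to keep the arithmetic exact.

```python
import random


def finish_time(start, work, interval, duration, phase):
    """Return the time at which `work` units of computation, begun at `start`,
    complete on a processor whose noise occupies the windows
    [phase + k*interval, phase + k*interval + duration) for every integer k."""
    if not 0 <= duration < interval:
        raise ValueError("duration must satisfy 0 <= duration < interval")
    # Most recent noise window that began at or before `start`.
    k = (start - phase) // interval
    noise_start = phase + k * interval
    t = start
    if t < noise_start + duration:
        # Computation starts inside a noise window: wait until it ends.
        t = noise_start + duration
    next_noise = noise_start + interval
    remaining = work
    # Compute until the next noise window, skip over it, and repeat.
    while remaining > next_noise - t:
        remaining -= next_noise - t
        t = next_noise + duration
        next_noise += interval
    return t + remaining


def slowest_rank(num_ranks, work, interval, duration, synchronized, rng):
    """Completion time of the slowest rank, a rough stand-in for how long
    a collective operation would have to wait for all participants."""
    if synchronized:
        phases = [0] * num_ranks  # assumption: every rank shares one phase
    else:
        phases = [rng.randrange(interval) for _ in range(num_ranks)]
    return max(finish_time(0, work, interval, duration, p) for p in phases)
```

Under these assumptions, calling `slowest_rank` with `synchronized=True` delays every rank by the same amount, while `synchronized=False` lets the slowest rank determine the completion time. This is the kind of effect that a collective such as MPI_Bcast() or MPI_Reduce() would expose at scale.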