Jorge Buenabad-Chávez, Edgar F. Hernández-Ventura, Miguel A. Castro-García, José L. Quiroz-Fabián, Graciela Román-Alonso, and Daniel M. Yellin
Parallel Computing, Load Balancing, Multithreading, Shared Memory, Atomic Operations
In the context of processing data lists in parallel in a multicore platform, various threads share a workload, each using a list to get and insert the data items to be processed; and when a list becomes empty, the owner thread steals data items from another list — thus balancing the workload according to the processing capacity of each thread and transparently to the programmer. We have presented elsewhere two algorithms for work stealing in such context: global locking (GL) synchronises all threads only before a stealing action takes place, and chooses the largest list; low-sync locking (LSL) identifies safe phases to access lists without synchronising all threads. This paper presents thread locking (TL), a new work stealing algorithm that tries to combine the benefits of GL and LSL: no synchronisation under normal operation, and minimal synchronisation during a stealing action. TL synchronises only with a few threads whose lists are the largest, and chooses the list of the first thread that acknowledges synchronisation. We compare the three algorithms using various applications with different granularities and access patterns. Overall LSL and TL show better and more robust performance. LSL performs better with coarse granularity, while TL with fine granularity.
Important Links:
Go Back