Evaluation of Replication and Rescheduling Heuristics for Grid Systems with Varying Resource Availability

M. Chtepen, B. Dhoedt, F. De Turck, P. Demeester, F.H.A. Claeys, and P.A. Vanrolleghem (Belgium)

Keywords

Grid computing, dynamic scheduling, task replication, failure detection

Abstract

As grids typically consist of heterogeneously managed subsystems with strongly varying resources, resource availability should be taken into account in the job scheduling process. This paper introduces several dynamic online scheduling heuristics that reduce the task loss and execution delay resulting from resource failures. The heuristics are based on task replication and on rescheduling of failed tasks. The proposed methods are characterized by their relative simplicity and by the efficiency with which they deal with dynamic grid environments. A discrete-event simulation framework was used to tune and evaluate the algorithms. Grid systems with high and low system load, as well as varying failure patterns, were investigated. The experiments show that the proposed failure-detection-based heuristics provide almost lossless task execution but can decrease system performance, while replication-based algorithms generally yield good throughput on unreliable grids that are not excessively loaded, although without any guarantee on the number of jobs lost.
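
The trade-off described above can be illustrated with a minimal sketch. It is not the paper's heuristics or its simulation framework: the `Resource` class, the `fails` flag, and both function names are hypothetical, and failures are fixed in advance rather than drawn from a failure pattern. It only shows why a fixed number of replicas gives no guarantee against task loss, while rescheduling after a detected failure completes the task as long as some resource stays up, at the cost of additional sequential attempts.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Resource:
    name: str
    fails: bool  # assumption: True if the resource fails before the task finishes


def run_with_replication(resources: List[Resource], replicas: int) -> bool:
    """Submit `replicas` copies at once to the first `replicas` resources.

    The task completes if at least one copy runs on a resource that stays up.
    """
    chosen = resources[:replicas]
    return any(not r.fails for r in chosen)


def run_with_rescheduling(resources: List[Resource]) -> Tuple[bool, int]:
    """Run one copy at a time; on a detected failure, reschedule on the next resource.

    Returns (completed, number_of_attempts).
    """
    attempts = 0
    for r in resources:
        attempts += 1
        if not r.fails:
            return True, attempts
    return False, attempts


grid = [Resource("a", True), Resource("b", True), Resource("c", False)]
two_replicas_ok = run_with_replication(grid, 2)    # both replicas land on failing resources
three_replicas_ok = run_with_replication(grid, 3)  # third replica lands on a stable resource
rescheduled = run_with_rescheduling(grid)          # succeeds only on the third attempt
```

In this toy grid, two replicas lose the task outright, whereas rescheduling completes it only after two failed attempts, which mirrors the loss-versus-delay tension the abstract reports.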
