An Enhanced Model-based Checkpointing Protocol

J. Wu, Y. Luo, and D. Manivannan (USA)


Checkpointing, Recovery, Useless Checkpoints


Checkpointing and rollback recovery are widely used techniques to handle failures in distributed computing systems. Usually we avoid taking checkpoints that are useless during the recovery process. Communication-induced checkpointing algorithms guarantee the usefulness of all the checkpoints and provide considerable autonomy with relatively low overhead. In this paper, we propose an enhanced communication-induced checkpointing algorithm. Our algorithm is likely to have less checkpointing overhead than an existing algorithm in the literature.

