An Improved Failure Detector based on PULL and PUSH Approach

S. Wang and X. Yun (PRC)

Keywords

distributed systems; fault tolerance; failure detector; pull; push

Abstract

The detection of process failures is widely recognized as a crucial problem for fault-tolerant distributed systems, and failure detectors are used in a wide variety of settings, such as network communication protocols, computer cluster management, and group membership protocols. Unfortunately, it is difficult to implement a reliable failure detector in asynchronous distributed systems. This paper proposes a new failure detector that combines the PULL and PUSH approaches in order to tolerate message losses. Theoretical analysis proves that the new algorithm is feasible, and experiments show that it efficiently reduces wrong suspicions caused by message losses while increasing detection time only slightly.
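
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible way to combine the two approaches, not the authors' method. It assumes the monitored process normally pushes heartbeats, and that the monitor falls back to a pull-style probe when a heartbeat is overdue, suspecting the process only if the probe also goes unanswered. The class name `HybridDetector`, the `timeout` parameter, and the `ping` callback are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class HybridDetector:
    """Monitor-side state for one monitored process (illustrative only)."""

    timeout: float               # longest silence tolerated before probing
    ping: Callable[[], bool]     # PULL probe (assumed): True if a reply arrived
    last_heard: float = 0.0      # time of the last sign of life
    suspected: bool = False      # current opinion about the process

    def on_heartbeat(self, now: float) -> None:
        # PUSH path: a heartbeat arrived, so the process is alive.
        self.last_heard = now
        self.suspected = False

    def check(self, now: float) -> bool:
        # Called periodically by the monitor; returns True if suspected.
        if now - self.last_heard > self.timeout:
            # Heartbeat overdue. It may simply have been lost, so fall
            # back to the PULL path before raising a suspicion.
            if self.ping():
                self.last_heard = now
                self.suspected = False
            else:
                self.suspected = True
        return self.suspected


if __name__ == "__main__":
    # A lost heartbeat alone does not cause a suspicion if the probe succeeds.
    alive = HybridDetector(timeout=1.0, ping=lambda: True)
    alive.on_heartbeat(0.0)
    print(alive.check(2.0))

    # A crashed process answers neither heartbeats nor probes.
    crashed = HybridDetector(timeout=1.0, ping=lambda: False)
    crashed.on_heartbeat(0.0)
    print(crashed.check(2.0))
```

In this sketch the extra probe is what trades a small amount of detection time for fewer wrong suspicions: a single lost heartbeat is not enough to suspect a process, but a real crash is only reported after the probe has also failed.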
