Fast Transparent Failover for Reliable Web Service

N. Aghdaie and Y. Tamir (USA)

Keywords

Fault-Tolerant Systems, Network Services, TCP.

Abstract

Fault tolerance schemes can be used to increase the availability and reliability of network services. One aspect of such schemes is service failover -- the reconfiguration of available resources and restoration of state required to continue providing the service despite the loss of some of the resources and corruption of parts of the state. We have previously presented CoRAL, a fault tolerance scheme for Web service based on a redundant standby backup server and logging. The focus of this paper is the implementation and evaluation of client-transparent failover for this scheme. In the event of a primary server failure, active client connections fail over to a spare, where their processing continues seamlessly. If extra server resources are available, a new server can be integrated into the system to reestablish fault-tolerant operation. Our performance results indicate short failover times and low overhead during fault-free operation.
