A. Saeed and M.Y. Javed (Pakistan)
Real-Time, Checkpointing, Laxity, Fault tolerance, Reliability, Transient Faults, Recovery.
A system is reliable only if it is fault tolerant. All operations performed in real-time systems are bound by timing constraints. This paper presents a dynamic strategy that incorporates fault tolerance into real-time systems through checkpointing. Transient faults are one of the major causes of failure in real-time systems, and checkpointing with rollback recovery is an efficient technique for tolerating them. Checkpointing strategies are classified as static, equidistant, and dynamic. Dynamic strategies tolerate faults while ensuring that each task meets its specified deadline. The proposed mechanism uses a watchdog process to detect faults by monitoring the system. On detection of a fault, the system is rolled back to the previous checkpoint. Two algorithms have been developed: a decision test and a scheduler. The first decides whether or not to create a checkpoint, and the second schedules the tasks based on priority. Simulation results have been obtained to analyze the performance of the system. The results show that the proposed checkpointing strategy provides better fault tolerance than equidistant and dynamic strategies under harsh conditions.
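The abstract does not give the decision test or the scheduler, so the following is only an illustrative sketch of how a laxity-based checkpoint decision, rollback on a detected fault, and priority scheduling might fit together. Every name here (`Task`, `laxity`, `should_checkpoint`, `run_task`, `schedule`) is hypothetical, as are the rule used to decide on a checkpoint, the unit time steps, and the convention that a lower priority value runs first. None of it is taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int             # assumption: lower value runs first
    wcet: int                 # execution time needed, in unit time steps
    deadline: int             # absolute deadline
    checkpoint_cost: int = 1  # time needed to save one checkpoint


def laxity(task, now, done):
    """Slack left: time until the deadline minus the remaining work."""
    return task.deadline - now - (task.wcet - done)


def should_checkpoint(task, now, done, unsaved):
    """Hypothetical decision test: save state only if the unsaved work is
    worth more than the checkpoint and the laxity can absorb its cost."""
    return (unsaved > task.checkpoint_cost
            and laxity(task, now, done) >= task.checkpoint_cost)


def run_task(task, start, fault_times=()):
    """Simulate one task; returns (finish_time, met_deadline)."""
    faults = set(fault_times)
    now, done, saved = start, 0, 0
    while done < task.wcet:
        now += 1
        if now in faults:
            # Stand-in for the watchdog: a transient fault is detected in
            # this step, so progress rolls back to the last checkpoint.
            done = saved
            continue
        done += 1
        if done < task.wcet and should_checkpoint(task, now, done, done - saved):
            now += task.checkpoint_cost  # pay the checkpoint overhead
            saved = done
    return now, now <= task.deadline


def schedule(tasks, fault_times=()):
    """Run tasks one after another in priority order (non-preemptive)."""
    now, results = 0, {}
    for task in sorted(tasks, key=lambda t: t.priority):
        now, met = run_task(task, now, fault_times)
        results[task.name] = (now, met)
    return results
```

In this sketch a task with ample laxity pays for checkpoints and loses little work on a fault, while a task with no laxity skips them, because the overhead alone would cause a deadline miss. Faults that fall inside a checkpoint-saving step are ignored for simplicity.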