A model for error recovery with global checkpointing

doi:10.1016/0020-0255(83)90026-9

Journal Article

Krishna Kant

- 01 Sep 1983 -

Information Sciences

- Vol. 30, Iss: 3, pp 225-239

TL;DR

The article proposes a new technique for software fault tolerance in concurrent systems. It combines the traditional global checkpointing mechanism with the recovery block concept to obtain an easily implementable error recovery mechanism.

About:

This article was published in Information Sciences on 1983-09-01. It has received 4 citations to date. The article focuses on the topics: Software fault tolerance & Overhead (computing).

Citations

Journal Article

A survey of rollback-recovery protocols in message-passing systems

Elmootazbellah Nabil Elnozahy, +3 more

- 01 Sep 2002 -

ACM Computing Surveys

TL;DR: This survey covers rollback-recovery techniques that do not require special language constructs and distinguishes between checkpoint-based protocols, which rely solely on checkpointing for system state restoration, and log-based protocols.

Checkpointing and the modeling of program execution time

Victor F. Nicola

TL;DR: This chapter considers several models of checkpointing and recovery in a program in order to derive the distribution of program execution time or its expectation, and it is shown that the expected execution time increases linearly with the processing requirement in the presence of checkpoints.

Journal Article

A quasi-synchronous checkpointing algorithm that prevents contention for stable storage

Dakshnamoorthy Manivannan, +3 more

- 01 Aug 2008 -

Information Sciences

TL;DR: This paper proposes a staggered quasi-synchronous checkpointing algorithm which reduces contention for network stable storage without any synchronization overhead.

Journal Article

On selecting rollback points for error recovery

Raj Sekhar Pamula, +2 more

- 01 Jun 1986 -

Information Sciences

TL;DR: A generalized formulation for checkpoint selection is given from which three different schemes are derived and each scheme is shown to have a smaller expected cost of recovery and a larger optimal checkpoint interval than rolling back to the most recent checkpoint.

Theories of Software Reliability: How Good Are They and How Can They Be Improved?

Bev Littlewood

- 01 Sep 1980 -

IEEE Transactions on Software Engineerin...

TL;DR: An examination of the assumptions used in early bug-counting models of software reliability shows them to be deficient and it is suggested that current theories are only the first step along what threatens to be a long road.

References

System structure for software fault tolerance

Performance-Related Reliability Measures for Computing Systems

On the Optimum Checkpoint Interval

Theories of Software Reliability: How Good Are They and How Can They Be Improved?

Related Papers (5)

A global checkpointing model for error recovery

Simulation analysis of a dynamic checkpointing strategy for real-time systems

Analysis of Checkpointing for Real-Time Systems

Algorithm-based recovery for iterative methods without checkpointing

Adaptive page-level incremental checkpointing based on expected recovery time