
Showing papers by "Francesco Quaglia" published in 1998


Proceedings Article
20 Oct 1998
TL;DR: A communication-induced checkpointing protocol is presented that avoids useless checkpoints by preventing the formation of suspect Z-cycles on the fly, and its performance is compared with that of other protocols.
Abstract: A useless checkpoint corresponds to the occurrence of a checkpoint and communication pattern called a Z-cycle. A recent result shows that ensuring a computation without Z-cycles is a particular application of a property, namely Virtual Precedence (VP), defined on an interval-based abstraction of a computation. We first propose a taxonomy of communication-induced checkpointing protocols based on the way they ensure the VP property. Then we derive a sufficient condition ensuring no Z-cycles in a distributed computation. This condition defines a checkpoint and communication pattern, namely the suspect Z-cycle, such that if no suspect Z-cycle exists in a distributed computation then no Z-cycle exists. Finally, we present a communication-induced checkpointing protocol that avoids useless checkpoints by preventing the formation of suspect Z-cycles on the fly, and we discuss its performance with respect to other protocols.

41 citations


Proceedings Article
01 Jul 1998
TL;DR: A sparse state saving scheme for Time Warp parallel discrete event simulation is presented that bases the selection of the states to be recorded on the event history of the logical processes; results for synthetic workloads are presented for a performance comparison with previous schemes.
Abstract: The paper presents a sparse state saving scheme for Time Warp parallel discrete event simulation. The scheme bases the selection of the states to be recorded on the event history of the logical processes. To this end, statistics on the virtual time advancement of the processes are collected to predict virtual time intervals that are likely to contain rollback points; the states corresponding to the starting points of those intervals are recorded as checkpoints in order to reduce the average coasting forward. The percentage of states to be recorded is defined by a parameter whose value is dynamically recalculated on the basis of the online observation of the variation of a checkpointing-rollback cost function. Simulation results for synthetic workloads are presented for a performance comparison with previous schemes.

19 citations
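
The trade-off behind sparse state saving in the entry above (fewer checkpoints, but more coasting forward after a rollback) can be illustrated with a minimal Python sketch. It assumes a fixed checkpoint interval rather than the history-based selection the abstract describes, and the class `LogicalProcess`, its toy additive state, and the `checkpoint_interval` parameter are hypothetical names introduced only for illustration.

```python
from bisect import bisect_left


class LogicalProcess:
    """Toy logical process whose state is the running sum of event payloads."""

    def __init__(self, checkpoint_interval):
        # Hypothetical fixed policy: save the state every k processed events.
        self.checkpoint_interval = checkpoint_interval
        self.state = 0
        self.processed = []          # (timestamp, payload), in timestamp order
        self.checkpoints = [(0, 0)]  # (events processed so far, state snapshot)

    def process(self, timestamp, payload):
        self.state += payload
        self.processed.append((timestamp, payload))
        if len(self.processed) % self.checkpoint_interval == 0:
            self.checkpoints.append((len(self.processed), self.state))

    def rollback(self, straggler_time):
        """Restore the state preceding straggler_time.

        Returns the number of events re-executed (the coasting-forward length).
        """
        timestamps = [t for t, _ in self.processed]
        # Events with timestamp >= straggler_time must be undone.
        keep = bisect_left(timestamps, straggler_time)
        # Discard checkpoints taken after the rollback point.
        while self.checkpoints[-1][0] > keep:
            self.checkpoints.pop()
        start, snapshot = self.checkpoints[-1]
        self.state = snapshot
        # Coasting forward: replay events between the checkpoint and the rollback point.
        replay = self.processed[start:keep]
        for _, payload in replay:
            self.state += payload
        self.processed = self.processed[:keep]
        return len(replay)


def coasting_length(checkpoint_interval, straggler_time, n_events=8):
    """Process events with timestamps 1..n_events, then roll back once."""
    lp = LogicalProcess(checkpoint_interval)
    for t in range(1, n_events + 1):
        lp.process(t, t)
    replayed = lp.rollback(straggler_time)
    return replayed, lp.state
```

In this sketch, `coasting_length(3, 6)` restores the checkpoint taken after three events and replays two events, whereas `coasting_length(1, 6)` replays none because every state was saved; both end in the same restored state.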


Book Chapter
30 Mar 1998
TL;DR: The basic characteristics a checkpointing protocol needs to work with mobile hosts are shown, namely, a reduced number of checkpoints, the use of incremental checkpointing, and consistent global checkpoints built on the fly.
Abstract: Checkpointing distributed applications involving mobile hosts is an important task to reduce the rollback during recovery from a failure and to manage voluntary disconnections. In this paper we show the basic characteristics a checkpointing protocol needs to work with mobile hosts, namely, a reduced number of checkpoints, the use of incremental checkpointing, and consistent global checkpoints built on the fly. These features must be implemented using as little control information as possible while ensuring little rollback. A comparative analysis of the performance of some interesting communication-induced checkpointing protocols, adapted to a mobile setting, is presented. The analysis has been carried out using discrete event simulation, and several models of host mobility have been considered.

10 citations


Journal Article
TL;DR: This paper presents an analytical model describing the simulation execution time as a function of both the state saving cost and the rollback cost, and derives a methodology that allows each logical process to adapt its state saving period online in order to reduce the simulation execution time.

9 citations


Book Chapter
01 Jan 1998
TL;DR: This paper presents a checkpointing-recovery scheme which reduces the number of forced checkpoints, compared to previous solutions, while piggybacking, on each message, only three integers as control information.
Abstract: Communication-induced checkpointing algorithms require cooperating processes, which take checkpoints at their own pace, to take some forced checkpoints in order to guarantee domino-freeness. In this paper we present a checkpointing-recovery scheme which reduces the number of forced checkpoints, compared to previous solutions, while piggybacking, on each message, only three integers as control information. This is achieved by using information about the history of a process and an equivalence relation between local checkpoints that we introduce in this paper. A simulation study that quantifies this reduction is also presented.

6 citations
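
The notions of a forced checkpoint and piggybacked control information in the entry above can be illustrated with a minimal Python sketch of a simple index-based rule that piggybacks a single integer. This is not the three-integer scheme of the paper, whose details the abstract does not give; the `Process` class, its method names, and the message layout are hypothetical and chosen only for illustration.

```python
class Process:
    """Toy process following a simple index-based checkpointing rule."""

    def __init__(self, name):
        self.name = name
        self.sn = 0  # index (sequence number) of the latest local checkpoint
        self.checkpoints = [(0, "initial")]

    def basic_checkpoint(self):
        # Checkpoint taken at the process's own pace.
        self.sn += 1
        self.checkpoints.append((self.sn, "basic"))

    def send(self):
        # Piggyback the current checkpoint index on the outgoing message.
        return {"sender": self.name, "sn": self.sn}

    def receive(self, message):
        # Forced checkpoint: taken before delivery when the sender's index is ahead,
        # so that checkpoints with the same index stay mutually consistent.
        if message["sn"] > self.sn:
            self.sn = message["sn"]
            self.checkpoints.append((self.sn, "forced"))
        # The message would be delivered to the application here.


def demo():
    p, q = Process("p"), Process("q")
    p.basic_checkpoint()   # p moves to index 1
    q.receive(p.send())    # q is behind, so it takes a forced checkpoint
    p.receive(q.send())    # indices are equal, so no forced checkpoint
    return p.checkpoints, q.checkpoints
```

In this sketch a process is forced to checkpoint whenever it receives a message carrying a larger index than its own; schemes such as the one in the entry above aim to skip some of these forced checkpoints by piggybacking slightly more information.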