Distributed snapshots: determining global states of distributed systems

K. Mani Chandy¹, Leslie Lamport²

University of Texas at Austin¹, SRI International²

01 Feb 1985-ACM Transactions on Computer Systems (ACM)-Vol. 3, Iss: 1, pp 63-75

TL;DR: An algorithm by which a process in a distributed system determines a global state of the system during a computation, which helps to solve an important class of problems: stable property detection.

Abstract: This paper presents an algorithm by which a process in a distributed system determines a global state of the system during a computation. Many problems in distributed systems can be cast in terms of the problem of detecting global states. For instance, the global state detection algorithm helps to solve an important class of problems: stable property detection. A stable property is one that persists: once a stable property becomes true it remains true thereafter. Examples of stable properties are “computation has terminated,” “the system is deadlocked” and “all tokens in a token ring have disappeared.” The stable property detection problem is that of devising algorithms to detect a given stable property. Global state detection can also be used for checkpointing.

Summary

Introduction

This paper presents algorithms by which a process in a distributed system can determine a global state of the system during a computation.
The photographers must take several snapshots and piece the snapshots together to form a picture of the overall scene.
Examples of stable properties are “computation has terminated,” “the system is deadlocked,” and “all tokens in a token ring have disappeared.”
Channels are assumed to have infinite buffers, to be error-free, and to deliver messages in the order sent.

A global state of a distributed system is a set of component process and channel states

The initial global state is one in which the state of each process is its initial state and the state of each channel is the empty sequence.
The system contains one token that is passed from one process to another, and hence the authors call this system the “single-token conservation” system.
Events “p sends M” and “q sends M’” may occur in the initial global state, and the next states after these events are different.

3.1. Motivation for the Steps of the Algorithm

The global-state recording algorithm works as follows: Each process records its own state, and the two processes that a channel is incident on cooperate in recording the channel state.
The algorithm may send messages and require processes to carry out computations; however, the messages and computation required to record the global state must not interfere with the underlying computation.
Now assume that the global state transits to in-c (because p sends the token).
This example suggests that the recorded global state may be inconsistent if the state of c is recorded before p sends a message along c and the state of p is recorded after p sends a message along c, that is, if n > n’.

3.2 Global-State-Detection Algorithm Outline

Marker-Sending Rule for a Process p. For each channel c, incident on, and directed away from p: p sends one marker along c after p records its state and before p sends further messages along c.
Marker-Receiving Rule for a Process q. On receiving a marker along a channel c: if q has not recorded its state, then q records its state and records the state of c as the empty sequence; otherwise, q records the state of c as the sequence of messages received along c after q’s state was recorded and before q received the marker along c.
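
The two marker rules can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the authors' code: the `Process` class, the integer "balance" used as process state, the `MARKER` value, and the two-process topology are all hypothetical choices made for the example. Channels are FIFO queues, as the model requires.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical marker value, distinct from any application message


class Process:
    def __init__(self, name, balance, channels):
        self.name = name
        self.balance = balance        # the process state in this toy system
        self.channels = channels      # shared dict: (sender, receiver) -> deque
        self.recorded_state = None    # process state recorded for the snapshot
        self.channel_state = {}       # incoming channel -> recorded message list
        self.recording = set()        # incoming channels still awaiting a marker

    def record_state(self):
        """Record own state, then apply the marker-sending rule."""
        self.recorded_state = self.balance
        for c in self.channels:
            if c[1] == self.name:     # incoming channel: start recording it
                self.channel_state[c] = []
                self.recording.add(c)
        for c, queue in self.channels.items():
            if c[0] == self.name:     # outgoing channel: marker precedes later messages
                queue.append(MARKER)

    def send(self, dest, amount):
        self.balance -= amount
        self.channels[(self.name, dest)].append(amount)

    def receive(self, src):
        c = (src, self.name)
        msg = self.channels[c].popleft()
        if msg == MARKER:
            # Marker-receiving rule.
            if self.recorded_state is None:
                self.record_state()    # c ends up recorded as the empty sequence
            self.recording.discard(c)  # recording of c is now complete
        else:
            self.balance += msg
            if c in self.recording:    # received after recording, before the marker
                self.channel_state[c].append(msg)


channels = {("p", "q"): deque(), ("q", "p"): deque()}
p = Process("p", 10, channels)
q = Process("q", 5, channels)

p.send("q", 3)      # 3 is in transit on (p, q)
p.record_state()    # p initiates: records 7 and sends a marker behind the 3
q.send("p", 2)      # q has not recorded its state yet
q.receive("p")      # q receives 3 before recording
q.receive("p")      # q receives the marker: records 6, channel (p, q) is empty
p.receive("q")      # p receives 2 after recording: part of channel (q, p)
p.receive("q")      # p receives q's marker: recording of (q, p) ends

recorded_total = (p.recorded_state + q.recorded_state
                  + sum(sum(msgs) for proc in (p, q)
                        for msgs in proc.channel_state.values()))
```

In this run the recorded process states (7 and 6) plus the recorded channel contents (the message 2 on the channel from q to p) sum to the original total of 15, even though the recorded state never occurred as a single instant of the run.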

3.3 Termination of the Algorithm

The marker receiving and sending rules guarantee that if a marker is received along every channel, then each process will record its state and the states of all incoming channels.

Hence if p records its state and there is a path (in the graph representing the system) from p to a process q, then q will record its state in finite time because, by induction, every process along the path will record its state in finite time.
The recorded process and channel states must be collected and assembled to form the recorded global state.
The authors shall not describe algorithms for collecting the recorded information because such algorithms have been described elsewhere [4, 10].
A simple algorithm for collecting information in a system whose topology is strongly connected is for each process to send the information it records along all outgoing channels, and for each process receiving information for the first time to copy it and propagate it along all of its outgoing channels.
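
The flooding scheme just described can be sketched as follows. This is an illustrative simplification, not the algorithm from the cited references: the function name `flood`, the adjacency-list `graph`, and the single global queue standing in for asynchronous message delivery are assumptions made for the example.

```python
from collections import deque
from typing import Dict, List


def flood(graph: Dict[str, List[str]], recorded: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Return, for each process, the recorded information it ends up knowing.

    graph maps a process to the processes its outgoing channels lead to;
    recorded maps a process to the information it recorded locally.
    """
    known = {proc: {proc: recorded[proc]} for proc in graph}
    # Each queue entry is one message in flight: (receiver, origin of the information).
    in_flight = deque((dest, proc) for proc in graph for dest in graph[proc])
    while in_flight:
        receiver, origin = in_flight.popleft()
        if origin not in known[receiver]:            # received for the first time
            known[receiver][origin] = recorded[origin]
            for dest in graph[receiver]:             # propagate on all outgoing channels
                in_flight.append((dest, origin))
    return known


ring = {"p": ["q"], "q": ["r"], "r": ["p"]}          # strongly connected topology
local = {"p": "state of p", "q": "state of q", "r": "state of r"}
collected = flood(ring, local)
```

Because each process forwards a given piece of information only the first time it sees it, the number of messages is bounded, and strong connectivity ensures every process eventually holds every recorded item.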

4. PROPERTIES OF THE RECORDED GLOBAL STATE

To gain an intuitive understanding of the properties of the global state recorded by the algorithm, the authors shall study Example 2.2.
After recording its state, q sends a marker along channel c’.
The recording algorithm was initiated in global state S0 and terminated in global state S3.
Observe that the global state S* recorded by the algorithm is not identical to any of the global states S0, S1, S2, S3 that occurred in the computation.

$\bar{S}_i = S_i$ for all i where $i \ne j$.

Now the authors shall show that the global state after all prerecording events and before all postrecording events in seq’ is S*.
The sequence of messages sent by p along c before p sends a marker along c is the sequence corresponding to prerecorded sends on c. Part (2) now follows.
The purpose of this example is to show how the computation seq’ is derived from the computation seq.

5. STABILITY DETECTION

The authors now solve the stability-detection problem described in Section 1.
A stability-detection algorithm is defined as follows: Input: A stable property y. Output: A Boolean value definite with the property $(y(S_\iota) \rightarrow \mathit{definite})$ and $(\mathit{definite} \rightarrow y(S_\phi))$, where $S_\iota$ and $S_\phi$ are the global states of the system when the algorithm is initiated and when it terminates, respectively.
Definite = false implies that the stable property does not hold when the algorithm is initiated.
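
As the paper's introduction puts it, the basic idea is that a global state S of the system is determined and y(S) is computed. A minimal sketch of that idea follows; the names `detect_stability` and `no_tokens_left`, and the snapshot layout (tokens held per process, messages per channel), are hypothetical and chosen only for illustration. The recording step is passed in as a function so that any snapshot procedure can be plugged in.

```python
from typing import Callable, Dict, List, Tuple

# Assumed snapshot layout: (tokens held per process, messages in transit per channel).
Snapshot = Tuple[Dict[str, int], Dict[str, List[str]]]


def no_tokens_left(snapshot: Snapshot) -> bool:
    """Example stable property y: all tokens have disappeared."""
    procs, chans = snapshot
    return (all(count == 0 for count in procs.values())
            and all("token" not in msgs for msgs in chans.values()))


def detect_stability(record_global_state: Callable[[], Snapshot],
                     y: Callable[[Snapshot], bool]) -> bool:
    """Record a global state, then evaluate y on it to obtain 'definite'."""
    recorded = record_global_state()
    return y(recorded)


# Two canned snapshots stand in for real recordings.
token_in_transit: Snapshot = ({"p": 0, "q": 0}, {"c": ["token"], "c'": []})
all_gone: Snapshot = ({"p": 0, "q": 0}, {"c": [], "c'": []})

definite_a = detect_stability(lambda: token_in_transit, no_tokens_left)
definite_b = detect_stability(lambda: all_gone, no_tokens_left)
```

Note that the property inspects channel contents as well as process states: a token in transit still exists, so the recorded channel states matter for the answer.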
The outline of the current version of the proof was suggested by them.

Distributed Snapshots: Determining Global States of Distributed Systems

K. MANI CHANDY

University of Texas at Austin

and

LESLIE LAMPORT

Stanford Research Institute

This paper presents an algorithm by which a process in a distributed system determines a global state of the system during a computation. Many problems in distributed systems can be cast in terms of the problem of detecting global states. For instance, the global state detection algorithm helps to solve an important class of problems: stable property detection. A stable property is one that persists: once a stable property becomes true it remains true thereafter. Examples of stable properties are “computation has terminated,” “the system is deadlocked” and “all tokens in a token ring have disappeared.” The stable property detection problem is that of devising algorithms to detect a given stable property. Global state detection can also be used for checkpointing.

Categories and Subject Descriptors: C.2.4 [Computer-Communication Networks]: Distributed Systems - distributed applications; distributed databases; network operating systems; D.4.1 [Operating Systems]: Process Management - concurrency; deadlocks, multiprocessing/multiprogramming; mutual exclusion; scheduling; synchronization; D.4.5 [Operating Systems]: Reliability - backup procedures; checkpoint/restart; fault-tolerance; verification

General Terms: Algorithms

Additional Key Words and Phrases: Global States, Distributed deadlock detection, distributed systems, message communication systems

This work was supported in part by the Air Force Office of Scientific Research under Grant AFOSR 81-0205 and in part by the National Science Foundation under Grant MCS 81-04459.

Authors’ addresses: K. M. Chandy, Department of Computer Sciences, University of Texas at Austin, Austin, TX 78712; L. Lamport, Stanford Research Institute, Menlo Park, CA 94025.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.

© 1985 ACM 0734-2071/85/0200-0063 $00.75

ACM Transactions on Computer Systems, Vol. 3, No. 1, February 1985, Pages 63-75.

1. INTRODUCTION

This paper presents algorithms by which a process in a distributed system can determine a global state of the system during a computation. Processes in a distributed system communicate by sending and receiving messages. A process can record its own state and the messages it sends and receives; it can record nothing else. To determine a global system state, a process p must enlist the cooperation of other processes that must record their own local states and send the recorded local states to p. All processes cannot record their local states at precisely the same instant unless they have access to a common clock. We assume that processes do not share clocks or memory. The problem is to devise algorithms by which processes record their own states and the states of communication channels so that the set of process and channel states recorded form a global system state. The global-state-detection algorithm is to be superimposed on the underlying computation: it must run concurrently with, but not alter, this underlying computation.

The state-detection algorithm plays the role of a group of photographers observing a panoramic, dynamic scene, such as a sky filled with migrating birds, a scene so vast that it cannot be captured by a single photograph. The photographers must take several snapshots and piece the snapshots together to form a picture of the overall scene. The snapshots cannot all be taken at precisely the same instant because of synchronization problems. Furthermore, the photographers should not disturb the process that is being photographed; for instance, they cannot get all the birds in the heavens to remain motionless while the photographs are taken. Yet, the composite picture should be meaningful. The problem before us is to define “meaningful” and then to determine how the photographs should be taken.

We now describe an important class of problems that can be solved with the global-state-detection algorithm. Let y be a predicate function defined on the global states of a distributed system D; that is, y(S) is true or false for a global state S of D. The predicate y is said to be a stable property of D if y(S) implies y(S’) for all global states S’ of D reachable from global state S of D. In other words, if y is a stable property and y is true at a point in a computation of D, then y is true at all later points in that computation. Examples of stable properties are “computation has terminated,” “the system is deadlocked,” and “all tokens in a token ring have disappeared.”

Several distributed-system problems can be formulated as the general problem of devising an algorithm by which a process in a distributed system can determine whether a stable property y of the system holds. Deadlock detection [2, 5, 8, 9, 11] and termination detection [1, 4, 10] are special cases of the stable-property detection problem. Details of the algorithm are presented later. The basic idea of the algorithm is that a global state S of the system is determined and y(S) is computed to see if the stable property y holds.

Several algorithms for solving deadlock and termination problems by determining the global states of distributed systems have been published. Gligor and Shattuck [5] state that many of the published algorithms are incorrect and impractical. A reason for the incorrect or impractical algorithms may be that the relationships among local process states, global system states, and points in a distributed computation are not well understood. One of the contributions of this paper is to define these relationships.

Many distributed algorithms are structured as a sequence of phases, where each phase consists of a transient part in which useful work is done, followed by a stable part in which the system cycles endlessly and uselessly. The presence of stable behavior indicates the end of a phase. A phase is similar to a series of iterations in a sequential program, which are repeated until successive iterations produce no change, that is, stability is attained. Stability must be detected so that one phase can be terminated and the next phase initiated [10]. The termination of a computational phase is not identical to the termination of a computation. When a computation terminates, all activities cease: messages are not sent and process states do not change. There may be activity during the stable behavior that indicates the end of a computational phase: messages may be sent and received, and processes may change state, but this activity serves no purpose other than to signal the end of a phase. In this paper, we are concerned with the detection of stable system properties; the cessation of activity is only one example of a stable property.

Strictly speaking, properties such as “the system is deadlocked” are not stable if the deadlock is “broken” and computation is reinitiated. However, to keep exposition simple, we shall partition the overall problem into the problems of (1) detecting the termination of one phase (and informing all processes that a phase has ended) and (2) initiating a new phase. The following is a stable property: “the kth computational phase has terminated,” k = 1, 2, .... Hence, the methods presented in this paper are applicable to detecting the termination of the kth phase for a given k.

In this paper we restrict attention to the problem of detecting stable properties. The problem of initiating the next phase of computation is not considered here because the solution to that problem varies significantly depending on the application, being different for database deadlock detection than for detecting the termination of a diffusing computation.

We have to present our algorithms in terms of a model of a system. The model chosen is not important in itself; we could have couched our discussion in terms of other models. We shall describe our model informally and only to the level of detail necessary to make the algorithms clear.

2. MODEL OF A DISTRIBUTED SYSTEM

A distributed system consists of a finite set of processes and a finite set of channels. It is described by a labeled, directed graph in which the vertices represent processes and the edges represent channels. Figure 1 is an example. Channels are assumed to have infinite buffers, to be error-free, and to deliver messages in the order sent. (The infinite buffer assumption is made for ease of exposition: bounded buffers may be assumed provided there exists a proof that no process attempts to add a message to a full buffer.) The delay experienced by a message in a channel is arbitrary but finite. The sequence of messages received along a channel is an initial subsequence of the sequence of messages sent along the channel. The state of a channel is the sequence of messages sent along the channel, excluding the messages received along the channel.
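
The channel-state definition can be written down directly. The helper below is a hypothetical sketch (the name `channel_state` and the list representation are assumptions): it checks that the received messages form an initial subsequence of the sent messages, as the FIFO assumption requires, and returns what is still in transit.

```python
from typing import List, Sequence


def channel_state(sent: Sequence[str], received: Sequence[str]) -> List[str]:
    """Messages sent along a channel, excluding those already received."""
    if list(received) != list(sent[:len(received)]):
        raise ValueError("received messages are not an initial subsequence of sent")
    return list(sent[len(received):])


in_transit = channel_state(["m1", "m2", "m3"], ["m1"])
```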

Fig. 1. A distributed system with processes p, q, and r and channels c1, c2, c3, and c4.

A process is defined by a set of states, an initial state (from this set), and a set of events. An event e in a process p is an atomic action that may change the state of p itself and the state of at most one channel c incident on p: the state of c may be changed by the sending of a message along c (if c is directed away from p) or the receipt of a message along c (if c is directed towards p). An event e is defined by (1) the process p in which the event occurs, (2) the state s of p immediately before the event, (3) the state s’ of p immediately after the event, (4) the channel c (if any) whose state is altered by the event, and (5) the message M, if any, sent along c (if c is a channel directed away from p) or received along c (if c is directed towards p). We define e by the 5-tuple (p, s, s’, M, c), where M and c are a special symbol, null, if the occurrence of e does not change the state of any channel.

A global state of a distributed system is a set of component process and channel states: the initial global state is one in which the state of each process is its initial state and the state of each channel is the empty sequence. The occurrence of an event may change the global state. Let e = (p, s, s’, M, c); we say that e can occur in global state S if and only if (1) the state of process p in global state S is s and (2) if c is a channel directed towards p, then the state of c in global state S is a sequence of messages with M at its head. We define a function next, where next(S, e) is the global state immediately after the occurrence of event e in global state S. The value of next(S, e) is defined only if event e can occur in global state S, in which case next(S, e) is the global state identical to S except that: (1) the state of p in next(S, e) is s’; (2) if c is a channel directed towards p, then the state of c in next(S, e) is c’s state in S with message M deleted from its head; and (3) if c is a channel directed away from p, then the state of c in next(S, e) is the same as c’s state in S with message M added to the tail.
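
The definitions of "can occur" and next translate almost line for line into code. The following is a minimal sketch, not part of the paper: the class names, the `CHANNELS` table, and the two-process topology (channel c from p to q, channel c' from q to p) are assumptions made for illustration, and `None` plays the role of the special symbol null.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Assumed topology: channel name -> (sending process, receiving process).
CHANNELS: Dict[str, Tuple[str, str]] = {"c": ("p", "q"), "c'": ("q", "p")}


@dataclass(frozen=True)
class Event:
    """The 5-tuple (p, s, s', M, c); M and c are None (null) for internal events."""
    p: str
    s: str
    s_next: str
    M: Optional[str] = None
    c: Optional[str] = None


@dataclass
class GlobalState:
    procs: Dict[str, str]               # process -> process state
    chans: Dict[str, Tuple[str, ...]]   # channel -> messages in transit, head first


def can_occur(S: GlobalState, e: Event) -> bool:
    if S.procs[e.p] != e.s:                           # condition (1)
        return False
    if e.c is not None and CHANNELS[e.c][1] == e.p:   # c is directed towards p
        return S.chans[e.c][:1] == (e.M,)             # condition (2): M at the head
    return True


def next_state(S: GlobalState, e: Event) -> GlobalState:
    if not can_occur(S, e):
        raise ValueError("next(S, e) is undefined: e cannot occur in S")
    procs, chans = dict(S.procs), dict(S.chans)
    procs[e.p] = e.s_next                             # (1) p moves to s'
    if e.c is not None:
        sender, receiver = CHANNELS[e.c]
        if receiver == e.p:
            chans[e.c] = chans[e.c][1:]               # (2) delete M from the head
        elif sender == e.p:
            chans[e.c] = chans[e.c] + (e.M,)          # (3) add M to the tail
    return GlobalState(procs, chans)


S0 = GlobalState({"p": "s1", "q": "s0"}, {"c": (), "c'": ()})
S1 = next_state(S0, Event("p", "s1", "s0", "token", "c"))   # p sends the token
S2 = next_state(S1, Event("q", "s0", "s1", "token", "c"))   # q receives it
```

Returning a fresh `GlobalState` rather than mutating S keeps next a function in the mathematical sense, so earlier states such as `S0` remain available for comparison.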

Let $\mathit{seq} = (e_i : 0 \le i \le n)$ be a sequence of events in component processes of a distributed system. We say that seq is a computation of the system if and only if event $e_i$ can occur in global state $S_i$, $0 \le i \le n$, where $S_0$ is the initial global state and $S_{i+1} = \mathit{next}(S_i, e_i)$ for $0 \le i \le n$.

An alternate model, based on Lamport [6], which views computations as partially ordered sets of events, is given in [7].

Example 2.1. To illustrate the definition of a distributed system, consider a simple system consisting of two processes p and q, and two channels c and c’ as shown in Figure 2.

The system contains one token that is passed from one process to another, and hence we call this system the “single-token conservation” system. Each process has two states, s0 and s1, where s0 is the state in which the process does not possess the token and s1 is the state in which it does. The initial state of p is s1 and of q is s0. Each process has two events: (1) a transition from s1 to s0 with the sending of the token, and (2) a transition from s0 to s1 with the receipt of the token. The state-transition diagram for a process is shown in Figure 3. The global states and transitions are shown in Figure 4.

Fig. 2. The simple distributed system of Examples 2.1 and 2.2.

Fig. 3. State-transition diagram of a process in Example 2.1.

Fig. 4. Global states and transitions of the single-token conservation system.

A system computation corresponds to a path in the global-state-transition diagram (Figure 4) starting at the initial global state. Examples of system computations are: (1) the empty sequence and (2) (p sends token, q receives token, q sends token). The following sequence is not a computation of the system: (p sends token, q sends token), because the event “q sends token” cannot occur while q is in the state s0.

For brevity, the four global states, in order of transition (see Figure 4), will be called (1) in-p, (2) in-c, (3) in-q, and (4) in-c’, to denote the location of the token. This example will be used later to motivate the algorithm.
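
The computations named in Example 2.1 can be checked mechanically. The sketch below is illustrative only: the function `is_computation`, the tuple encoding of events as (p, s, s', M, c), and the channel directions (c from p to q, c' from q to p) are assumptions. It replays a sequence of events from the initial global state and rejects it as soon as an event cannot occur.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, str, str, str, str]   # (p, s, s', M, c)
CHANNELS = {"c": ("p", "q"), "c'": ("q", "p")}   # assumed channel directions


def is_computation(seq: List[Event]) -> bool:
    procs = {"p": "s1", "q": "s0"}                 # initially the token is in p
    chans: Dict[str, List[str]] = {"c": [], "c'": []}
    for (proc, s, s_next, msg, chan) in seq:
        if procs[proc] != s:                       # process is not in state s
            return False
        sender, receiver = CHANNELS[chan]
        if receiver == proc:                       # receive: msg must be at the head
            if not chans[chan] or chans[chan][0] != msg:
                return False
            chans[chan].pop(0)
        elif sender == proc:                       # send: msg joins the tail
            chans[chan].append(msg)
        else:                                      # channel not incident on proc
            return False
        procs[proc] = s_next
    return True


p_sends = ("p", "s1", "s0", "token", "c")
q_receives = ("q", "s0", "s1", "token", "c")
q_sends = ("q", "s1", "s0", "token", "c'")

valid = is_computation([p_sends, q_receives, q_sends])
invalid = is_computation([p_sends, q_sends])
```

The second sequence is rejected for the reason the text gives: q is still in state s0 when "q sends token" is attempted.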

Frequently Asked Questions (10)

Q1. What are the contributions mentioned in the paper "Distributed snapshots: determining global states of distributed systems"?

This paper presents an algorithm by which a process in a distributed system determines a global state of the system during a computation.

Q2. Why do the authors study the stability detection problem?

The authors study the stability-detection problem because it is a paradigm for many practical problems, such as distributed deadlock detection.

Q3. What is the simplest way to record a state?

To ensure that the global-state recording algorithm terminates in finite time, each process must ensure that (L1) no marker remains forever in an incident input channel and (L2) it records its state within finite time of initiation of the algorithm.

Q4. What is the state of a postrecording event?

There may be a postrecording event $e_{j-1}$ before a prerecording event $e_j$ for some j, $\iota < j < \phi$; this can occur only if $e_{j-1}$ and $e_j$ are in different processes (because if $e_{j-1}$ and $e_j$ are in the same process and $e_{j-1}$ is a postrecording event, then so is $e_j$).

Q5. What is the state of a channel in a global state?

The authors say that seq is a computation of the system if and only if event $e_i$ can occur in global state $S_i$, $0 \le i \le n$, where $S_0$ is the initial global state and $S_{i+1} = \mathit{next}(S_i, e_i)$ for $0 \le i \le n$.

Q6. What is the value of next(S, e)?

The authors define a function next, where next(S, e) is the global state immediately after the occurrence of event e in global state S. The value of next(S, e) is defined only if event e can occur in global state S, in which case next(S, e) is the global state identical to S except that: (1) the state of p in next(S, e) is s’; (2) if c is a channel directed towards p, then the state of c in next(S, e) is c’s state in S with message M deleted from its head; and (3) if c is a channel directed away from p, then the state of c in next(S, e) is the same as c’s state in S with message M added to the tail.

Q7. What is the state of ei in seq?

Event ei in seq is called a postrecording event if and only if it is not a prerecording event, that is, if ei is in a process p and p records its state before ei in seq.

Q8. What is the effect of q recording its state?

Termination in finite time is ensured if for every process q: q spontaneously records its state or there is a path from a process p, which spontaneously records its state, to q.

Q9. What is the q-marker-receiving rule for a process?

Marker-Sending Rule for a Process p. For each channel c, incident on, and directed away from p: p sends one marker along c after p records its state and before p sends further messages along c.

Q10. What is the state-transition diagram for q in Example 2.2?

Fig. 6. State-transition diagram for process q in Example 2.2. [Figure residue: a computation through global states S0, S1, S2, and S3 via the events “p sends M”, “q sends M’”, and “p receives M’”.]
