Journal ArticleDOI

Eraser: a dynamic data race detector for multithreaded programs

TL;DR: A new tool, called Eraser, is described for dynamically detecting data races in lock-based multithreaded programs; it uses binary rewriting techniques to monitor every shared-memory reference and verify that consistent locking behavior is observed.
Abstract: Multithreaded programming is difficult and error prone. It is easy to make a mistake in synchronization that produces a data race, yet it can be extremely hard to locate this mistake during debugging. This article describes a new tool, called Eraser, for dynamically detecting data races in lock-based multithreaded programs. Eraser uses binary rewriting techniques to monitor every shared-memory reference and verify that consistent locking behavior is observed. We present several case studies, including undergraduate coursework and a multithreaded Web search engine, that demonstrate the effectiveness of this approach.

Summary (4 min read)

1 Introduction

  • Multi-threading has become a common programming technique.
  • Unfortunately, debugging a multi-threaded program can be difficult: simple errors in synchronization can produce timing-dependent data races that take weeks or months to track down, and for this reason many programmers have resisted using threads.
  • This article describes a tool, called Eraser, that dynamically detects data races in multi-threaded programs.
  • A locking discipline is a programming policy that ensures the absence of data races.
  • Usually a potential data race is a serious error caused by failure to synchronize properly.

2.1 Improving the locking discipline

  • The simple locking discipline the authors have used so far is too strict.
  • There are three very common programming practices that violate the discipline yet are free from any data races: initialization, read-shared data, and read-write locks.
  • Initialization: shared variables are frequently initialized without holding a lock.
  • Read-shared data: some shared variables are written only during initialization and are read-only thereafter, so they can be safely accessed without locks.
  • Read-write locks allow multiple readers to access a shared variable, but allow only a single writer to do so.

2.2 Initialization and read-sharing

  • There is no need for a thread to lock out others if no other thread can possibly hold a reference to the data being accessed, and programmers often take advantage of this observation when initializing newly allocated data.
  • Unfortunately, the authors have no easy way of knowing when initialization is complete.
  • When and if another thread accesses the variable, then the state changes.
  • A write access from a new thread changes the state from Exclusive or Shared to the Shared-Modified state, in which C(v) is updated and races are reported, just as described in the original, simple version of the algorithm.
  • The authors' support for initialization makes Eraser's checking more dependent on the scheduler than they would like.

2.3 Read-write locks

  • Many programs use single-writer, multiple-reader locks as well as simple locks.
  • The authors continue to use the state transitions of Figure 4, but when the variable enters the Shared-Modified state, the checking is slightly different.
  • That is, locks held purely in read mode are removed from the candidate set when a write occurs, as such locks held by a writer do not protect against a data race between the writer and some other reader thread.

3 Implementing Eraser

  • Eraser is implemented for the DIGITAL Unix operating system on the Alpha processor, using the ATOM [Srivastava & Eustace 94] binary modification system.
  • To maintain C(v), Eraser instruments each load and store in the program.
  • Eraser does not instrument loads and stores whose address mode is indirect off the stack pointer, since these are assumed to be stack references, and shared variables are assumed to be in global locations or in the heap.
  • The report also includes the thread ID, memory address, type of memory access, and important register values such as the program counter and stack pointer.
  • The authors have found that this information is usually sufficient for locating the source of the race.

3.1 Representing the candidate lock sets

  • A naïve implementation of lock sets would store a list of candidate locks for each memory location, potentially consuming many times the allocated memory of the program.
  • The authors can avoid this expense by exploiting the fortunate fact that the number of distinct sets of locks observed in practice is quite small.
  • The entries in the table are never deallocated or modified, so each lockset index remains valid for the lifetime of the program.
  • Eraser also caches the result of each intersection, so that the fast case for set intersection is simply a table lookup.
  • All the standard memory allocation routines are instrumented to allocate and initialize a shadow word for each word allocated by the program.

3.2 Performance

  • Performance was not a major goal in their implementation of Eraser; consequently it has many opportunities for optimization.
  • The authors estimate that half of the slowdown is due to the overhead incurred by making a procedure call at every load and store instruction, which could be eliminated by using a version of ATOM that can inline monitoring code [Scales et al. 96].
  • There are also many opportunities for using static analysis to reduce the overhead of the monitoring code, but the authors have not explored them.
  • In spite of their limited performance tuning, the authors have found that Eraser is fast enough to debug most programs, and therefore meets the most essential performance criterion.

3.3 Program annotations

  • As expected, their experience with Eraser showed that it can produce false alarms.
  • Part of their research was aimed at finding effective annotations to suppress false alarms without accidentally losing useful warnings.
  • Many programs implement free lists or private allocators, and Eraser has no way of knowing that a privately recycled piece of memory is protected by a new set of locks; an annotation that marks memory reuse addresses this (see the sketch after this list).
  • True data races were found that did not affect the correctness of the program.
  • Some of these were intentional and others were accidental.
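
A conceptual sketch (ours, in Python) of how a reuse annotation, such as the EraserReuse mentioned in the Vesta discussion below, could interact with the checker's per-word state; the function name, signature, and free-list code here are assumptions for illustration, not the paper's API:

```python
shadow_state = {}                  # per-word monitoring state, keyed by address (simplified)

def eraser_reuse(addr, size):      # hypothetical annotation entry point
    """Forget the locking history of a recycled block so it is treated as new memory."""
    for word in range(addr, addr + size, 4):
        shadow_state.pop(word, None)

class PrivateFreeList:
    """A private allocator that recycles blocks instead of returning them to the system."""
    def __init__(self):
        self.blocks = []
    def release(self, addr, size):
        self.blocks.append((addr, size))
    def allocate(self):
        addr, size = self.blocks.pop()
        eraser_reuse(addr, size)   # without this, stale candidate sets would cause false alarms
        return addr, size
```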

3.4 Race detection in an OS kernel

  • The authors have begun to modify Eraser to detect races in the SPIN operating system [Bershad et al. 95].
  • While the authors do not yet have results in terms of data races found, they have acquired some useful experience about implementing such a tool at the kernel level, which is different from the user level in several ways.
  • In most systems, raising the interrupt level to n ensures that only interrupts of priority greater than n will be serviced until the interrupt level is lowered.
  • When the kernel sets the interrupt level to n, Eraser treats this operation as if the first n interrupt locks had all been acquired (a sketch of this mapping follows this list).
  • The most common example is the use of semaphores to synchronize execution between a thread and an I/O device driver.
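
A sketch of that mapping (ours, in Python; the lock names are invented): setting the interrupt level to n is modeled as holding the first n interrupt locks, so data consistently protected by a sufficiently high interrupt level still satisfies the locking discipline.

```python
NUM_LEVELS = 8
interrupt_locks = [f"intr_level_{i}" for i in range(1, NUM_LEVELS + 1)]  # invented names

locks_held = {}   # thread -> set of locks, as maintained by the Lockset monitor

def on_set_interrupt_level(t, n):
    """Model an spl-style call: level n behaves like acquiring interrupt locks 1..n."""
    held = locks_held.setdefault(t, set())
    held.difference_update(interrupt_locks)   # release the previous level's interrupt locks
    held.update(interrupt_locks[:n])          # acquire the locks for priorities 1..n
```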

4 Experience

  • The authors calibrated Eraser on a number of simple programs that contained common synchronization errors (e.g. forgot to lock, used the wrong lock, etc.) and versions of those programs with the errors corrected.
  • While programming these tests, the authors accidentally introduced a race, and encouragingly, Eraser detected it.
  • It also produced false alarms, which the authors were able to suppress with annotations.
  • The fact that Eraser worked well on the servers is evidence that experienced programmers tend to obey the simple locking discipline even in an environment that offers many more elaborate synchronization primitives.
  • In the remainder of this section the authors report on the details of their experiences with each program.

4.2 Vesta cache server

  • Vesta [Digital Equipment 96b] is an advanced software configuration management system.
  • Configurations are written in a specialized functional language that describes the dependencies and rules used to derive the current state of the software.
  • This is correct because other threads access the log entries with the log head lock held, and threads do not maintain pointers into the log.
  • The authors eliminated the report of these races by moving the EraserReuse annotations to the three Flush routines.
  • The cache server uses a main server thread to wait for incoming RPC requests.

4.3 Petal

  • Petal is a distributed storage system that presents its clients with a huge virtual disk implemented by a cluster of servers and physical disks [Lee & Thekkath 96].
  • Petal implements a distributed consensus algorithm as well as failure detection and recovery mechanisms.
  • The authors found two races where global variables containing statistics were modified without locking.
  • Finally, the authors found one false alarm that they were unable to annotate away.
  • GmapCh Write2 implements a join-like construct to keep the stack frame active until the threads return.

4.4 Undergraduate coursework

  • As a counterpoint to their experience with mature multithreaded server programs, two of their colleagues used Eraser to examine the kinds of synchronization errors found in the homework assignments produced by their undergraduate operating systems class [Choi & Lewis 97].
  • The authors report their results here to demonstrate how Eraser functions with a less sophisticated code base.
  • These assignments can be roughly categorized as low-level (build locks from test-and-set), thread-level (build a small threads package), synchronization-level (build semaphores and mutexes), and application-level (producer/consumer-style problems).
  • Each assignment builds on the implementation of the previous assignment.
  • These were caused by forgetting to take locks, taking locks during writes but not for reads, using different locks to protect the same data structure at different times, and forgetting to re-acquire locks that were released in a loop (two of these patterns are sketched below).
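
Two of these mistakes in miniature (our own Python sketch, not code from the assignments): the first takes a lock for writes but not for reads, and the second protects the same data with a different lock elsewhere; in both cases the candidate lock set for the data becomes empty, so Eraser would report a potential race.

```python
import threading

mu_a, mu_b = threading.Lock(), threading.Lock()
shared = {"count": 0}

def locked_writer():
    with mu_a:
        shared["count"] += 1      # writes take mu_a ...

def unlocked_reader():
    return shared["count"]        # ... but this read takes no lock at all

def writer_with_other_lock():
    with mu_b:                    # same data guarded by a different lock at other times
        shared["count"] += 1
```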

4.5 Effectiveness and Sensitivity

  • Since Eraser uses a testing methodology it cannot prove that a program is free from data races.
  • But the authors believe that Eraser works well compared to manual testing and debugging, and that Eraser’s testing is not very sensitive to the scheduler interleaving.
  • The authors consulted the program history of Ni2 and reintroduced two data races that had existed in previous versions.
  • The first error was an unlocked access to a reference count used to garbage collect file data structures.
  • These races had existed in the Ni2 source code for several months before they were manually found and fixed by the program author.

5 Additional experience

  • The authors describe additional techniques, each of which concerns a form of dynamic checking for synchronization errors in multi-threaded programs that they experimented with and believe is important and promising, but which they did not implement in Eraser.
  • Using an earlier version of Eraser that detected race conditions in multi-threaded Modula-3 programs, the authors found that the Lockset algorithm reported false alarms for Trestle programs[Manasse & Nelson 91] that protected shared locations with multiple locks, because each of two readers could access the location while holding two different locks.
  • This prevented the false alarms, but it is possible for this modification to cause false negatives.
  • A few seconds into formsedit startup, their experimental monitor detected a cycle of locks, showing that no partial order existed (a sketch of this check follows this list).
  • But more work is required to catalog the sound and useful variations on the partial order discipline, and to develop annotations to suppress false alarms.
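
The partial-order check can be sketched as cycle detection over a lock-acquisition graph (our own minimal Python version of the idea, not the authors' monitor): record an edge from every lock already held to the lock being acquired, and warn when a new acquisition would close a cycle.

```python
from collections import defaultdict

acquired_after = defaultdict(set)   # lock -> locks acquired while it was held
locks_held = defaultdict(set)       # thread -> locks currently held

def reachable(src, dst, seen=None):
    """Is there a path src -> ... -> dst in the acquisition-order graph?"""
    if src == dst:
        return True
    seen = seen if seen is not None else set()
    seen.add(src)
    return any(reachable(n, dst, seen) for n in acquired_after[src] if n not in seen)

def on_lock(thread, lock):
    for held in locks_held[thread]:
        if reachable(lock, held):    # adding held -> lock would create a cycle
            print(f"warning: locks {held} and {lock} are not partially ordered")
        acquired_after[held].add(lock)
    locks_held[thread].add(lock)

def on_unlock(thread, lock):
    locks_held[thread].discard(lock)
```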

6 Conclusion

  • Hardware designers have learned to design for testability.
  • Programmers using threads must learn the same.
  • Programmers in the area of operating systems seem to view dynamic race detection tools as esoteric and impractical.
  • As the use of multi-threading expands, so will the unreliability caused by data races, unless better methods are used to eliminate them.
  • The authors believe that the Lockset method implemented in Eraser is promising.


Eraser: A Dynamic Data Race Detector for Multi-Threaded Programs
Stefan Savage
Department of Computer Science and Engineering
University of Washington, Seattle
Michael Burrows Greg Nelson Patrick Sobalvarro
Digital Equipment Corporation
Systems Research Center
Thomas Anderson
Computer Science Division
University of California, Berkeley
Abstract
Multi-threaded programming is difficult and error prone. It
is easy to make a mistake in synchronization that produces a
data race, yet it can be extremely hard to locate this mistake
during debugging. This paper describes a new tool, called
Eraser, for dynamically detecting data races in lock-based
multi-threaded programs. Eraser uses binary rewriting tech-
niques to monitor every shared memory reference and verify
that consistent locking behavior is observed. We present sev-
eral case studies, including undergraduate coursework and a
multi-threaded Web search engine, that demonstrate the ef-
fectiveness of this approach.
1 Introduction
Multi-threading has become a common programming tech-
nique. Most commercial operating systems support threads,
and popular applications like Microsoft Word and Netscape
Navigator are multi-threaded.
Unfortunately, debugging a multi-threaded program can
be difficult. Simple errors in synchronization can produce
timing-dependent data races that can take weeks or months
to track down. For this reason, many programmers have re-
sisted using threads. The difficulties with using threads are
well summarized by John Ousterhout in his 1996 USENIX
presentation “Why Threads are a bad idea (for most pur-
poses)”[Ousterhout 96].
In this paper we describe a tool, called Eraser, that dy-
namically detects data races in multi-threaded programs. We
have implemented Eraser for DIGITAL Unix and used it to
detect data races in a number of programs, ranging from the
AltaVista Web search engine to introductory programming
exercises written by undergraduates.
Previous work in dynamic race detection is based on Lam-
port’s happens-before relation[Lamport 78] and checks that
conflicting memory accesses from different threads are sepa-
rated by synchronization events. Happens-before algorithms
handle many styles of synchronization, but this generality
comes at a cost. We have aimed Eraser specifically at the
lock-based synchronization used in modern multi-threaded
programs. Eraser simply checks that all shared memory ac-
cesses follow a consistent locking discipline. A locking dis-
cipline is a programming policy that ensures the absence of
data races. For example, a simple locking discipline is to re-
quire that every variable shared between threads is protected
by a mutual exclusion lock. We will argue that for many pro-
grams Eraser’s approach of enforcing a locking discipline is
simpler, more efficient, and more thorough at catching races
than the approach based on happens-before. As far as we
know, Eraser is the first dynamic race detection tool to be
applied to multi-threaded production servers.
The remainder of this paper is organized as follows: After
reviewing what a data race is and describing previous work
in race detection, we present the Lockset algorithm used by
Eraser, first at a high level and then at a level low enough
to reveal the main performance-critical implementation tech-
niques. Finally, we describe the experience we have had us-
ing Eraser with a number of multi-threaded programs.
Eraser bears no relationship to the tool by the same name
constructed by John Mellor-Crummey for detecting data
races in shared-memory parallel Fortran programs as part of
the ParaScope Programming Environment[Mellor-Crummey
93].

1.1 Definitions
A lock is a simple synchronization object used for mutual
exclusion; it is either available, or owned by a thread. The
operations on a lock mu are lock(mu) and unlock(mu).
Thus it is essentially a binary semaphore used for mutual ex-
clusion, but differs from a semaphore in that only the owner
of a lock is allowed to release it.
A data race occurs when two concurrent threads access a
shared variable, and:
at least one access is a write, and
the threads use no explicit mechanism to prevent the
accesses from being simultaneous.
If a program has a potential data race, then the effect of
the conflicting accesses to the shared variable will depend on
the interleaving of the thread executions. Although program-
mers occasionally deliberately allow a data race when the
non-determinism seems harmless, usually a potential data
race is a serious error caused by failure to synchronize prop-
erly.
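
To make the definition concrete, here is a minimal illustration (our own sketch in Python, not code from the paper): two threads perform an unsynchronized read-modify-write on a shared counter, so at least one access is a write and nothing prevents the accesses from being simultaneous; guarding every access with the same lock removes the race.

```python
import threading

counter = 0                  # shared variable
mu = threading.Lock()

def racy_increment(n):
    global counter
    for _ in range(n):
        counter += 1         # unsynchronized read-modify-write: a data race

def locked_increment(n):
    global counter
    for _ in range(n):
        with mu:             # consistent locking prevents simultaneous access
            counter += 1

threads = [threading.Thread(target=racy_increment, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)               # may print less than 200000: some updates were lost
```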
1.2 Related work
An early attempt to avoid data races was the pioneering con-
cept of a monitor introduced by C.A.R. Hoare [Hoare 74]. A
monitor is a group of shared variables together with the pro-
cedures that are allowed to access them, all bundled together
with a single anonymous lock that is automatically acquired
and released at the entry and exit of the procedures. The
shared variables in the monitor are out of scope (that is, invis-
ible) outside the monitor, consequently they can be accessed
only from within the monitor’s procedures, where the lock is
held. Thus monitors provide a static, compile-time guarantee
that accesses to shared variables are serialized and therefore
free from data races. Monitors are an effective way to avoid
data races if all shared variables are static globals, but they
don’t protect against data races in programs with dynami-
cally allocated shared variables, a limitation that early users
found was significant[Lampson & Redell 80]. By substitut-
ing dynamic checking for static checking, our work aims to
allow dynamically allocated shared data while retaining as
much of the safety of monitors as possible.
Some attempts have been made to create purely static (that
is, compile-time) race detection systems that work in the
presence of dynamically allocated shared data: for exam-
ple, Sun’s lock
lint [SunSoft 94] and the Extended Static
Checker for Modula-3 [Detlefs et al. 97, Nelson et al. 96].
But these approaches seem problematical since they require
statically reasoning about the program’s semantics.
Most of the previous work in dynamic race detection
has been carried out by the scientic parallel programming
community [Dinning & Schonberg 90, Netzer 91, Mellor-
Crummey 91, Perkovic & Keleher 96] and is based on Lam-
port’s happens-before relation, which we now describe.
Thread 1                 Thread 2
lock(mu);
v := v+1;
unlock(mu);
                         lock(mu);
                         v := v+1;
                         unlock(mu);
Figure 1: Lamport’s happens-before orders events in the same
thread in temporal order, and orders events in different threads if
the threads synchronized with one another between the events.
The happens-before order is a partial order on all events
of all threads in a concurrent execution. Within any single
thread, events are ordered in the order in which they oc-
curred. Between threads, events are ordered according to the
properties of the synchronization objects they access. If one
thread accesses a synchronization object and the next access
to the object is by a different thread, then the first access is
defined to happen before the second if the semantics of the
synchronization object forbid a schedule in which these two
interactions are exchanged in time. For example, Figure 1
shows one possible ordering of two threads executing the
same code segment. The three program statements executed
by Thread 1 are ordered by happens-before because they are
executed sequentially in the same thread. The lock of mu by
Thread 2 is ordered by happens-before with the unlock of
mu by Thread 1 because a lock cannot be acquired before its
previous owner has released it. Finally, the three statements
executed by Thread 2 are ordered by happens-before because
they are executed sequentially within that thread.
If two threads both access a shared variable and the ac-
cesses are not ordered by the happens-before relation, then in
another execution of the program in which the slower thread
ran faster and/or the faster thread ran slower, the two ac-
cesses could have happened simultaneously; that is, a data
race could have occurred, whether or not it actually did oc-
cur. All previous dynamic race detection tools that we know
of are based on this observation. These race detectors mon-

itor every data reference and synchronization operation and
check for conflicting accesses to shared variables that are un-
related by the happens-before relation for the particular exe-
cution they are monitoring.
Unfortunately, tools based on happens-before have two
significant drawbacks. First, they are difficult to implement
efficiently because they require per-thread information about
concurrent accesses to each shared memory location. More
importantly, the effectiveness of tools based on happens-
before is highly dependent on the interleaving produced by
the scheduler. Figure 2 shows a simple example where the
happens-before approach can miss a data race. While there is
a potential data race on the unprotected accesses to y, it will
not be detected in the execution shown in the figure, because
Thread 1 holds the lock before Thread 2, and so the accesses
to y are ordered in this interleaving by happens-before. A
tool based on happens-before would detect the error only if
the scheduler produced an interleaving in which the fragment
of code for Thread 2 occurred before the fragment of code
for Thread 1. Thus, to be effective, a race detector based
on happens-before needs a large number of test cases to test
many possible interleavings. In contrast, the programming
error in Figure 2 will be detected by Eraser with any test
case that exercises the two code paths, because the paths vio-
late the locking discipline for y regardless of the interleaving
produced by the scheduler. While Eraser is a testing tool and
therefore cannot guarantee that a program is free from races,
it can detect more races than tools based on happens-before.
The lock covers technique of Dinning and Schonberg is an
improvement to the happens-before approach for programs
that make heavy use of locks[Dinning & Schonberg 91]. In-
deed, one way to describe our approach would be that we
extend Dinning and Schonberg's improvement and discard the
underlying happens-before apparatus that they were improv-
ing.
2 The Lockset algorithm
In this section we describe how the Lockset algorithm detects
races. The discussion is at a fairly high level; the techniques
used to implement the algorithm efficiently will be described
in the following section.
The first and simplest version of the Lockset algorithm
enforces the simple locking discipline that every shared vari-
able is protected by some lock, in the sense that the lock is
held by any thread whenever it accesses the variable. Eraser
checks whether the program respects this discipline by mon-
itoring all reads and writes as the program executes. Since
Eraser has no way of knowing which locks are intended to
protect which variables, it must infer the protection relation
from the execution history.
For each shared variable v, Eraser maintains the set C(v) of candidate locks for v. This set contains those locks that have protected v for the computation so far. That is, a lock l is in C(v) if, in the computation up to that point, every thread that has accessed v was holding l at the moment of the access.
Thread 1                 Thread 2
y := y+1;
lock(mu);
v := v+1;
unlock(mu);
                         lock(mu);
                         v := v+1;
                         unlock(mu);
                         y := y+1;
Figure 2: The program allows a data race on y, but the error is not
detected by happens-before in this execution interleaving.
When a new variable v is initialized, its candidate set C(v) is considered to hold all possible locks. When the variable is accessed, Eraser updates C(v) with the intersection of C(v) and the set of locks held by the current thread. This process, called lockset refinement, ensures that any lock that consistently protects v is contained in C(v). If some lock l consistently protects v, it will remain in C(v) as C(v) is refined. If C(v) becomes empty, this indicates that there is no lock that consistently protects v.
In summary, here is the first version of the Lockset algorithm:

    Let locks_held(t) be the set of locks held by thread t.
    For each v, initialize C(v) to the set of all locks.
    On each access to v by thread t,
        set C(v) := C(v) ∩ locks_held(t);
        if C(v) = {}, then issue a warning.
Figure 3 illustrates how a potential data race is discovered through lockset refinement. The left column contains program statements, executed in order from top to bottom. The right column reflects the set of candidate locks, C(v), after each statement is executed. This example has two locks, so C(v) starts containing both of them. After v is accessed while holding mu1, C(v) is refined to contain that lock.

Program          locks_held      C(v)
                 {}              {mu1,mu2}
lock(mu1);       {mu1}
v := v+1;                        {mu1}
unlock(mu1);     {}
lock(mu2);       {mu2}
v := v+1;                        {}
unlock(mu2);     {}
Figure 3: If a shared variable is sometimes protected by lock mu1
and sometimes by lock mu2, then no lock protects it for the whole
computation. The figure shows the progressive refinement of the
set of candidate locks C(v) for v. When C(v) becomes empty, the
Lockset algorithm has detected that no lock protects v.
Later, v is accessed again, with only mu2 held. The intersection of the singleton sets {mu1} and {mu2} is the empty set, correctly indicating that no lock protects v.
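
The refinement loop above can be sketched as follows (our own Python sketch; the names on_access and locks_held are ours, and a real checker is driven by instrumented loads, stores, and lock operations rather than explicit calls):

```python
ALL_LOCKS = None                 # stands for "the set of all locks"

candidate = {}                   # C(v): variable -> candidate lock set
locks_held = {}                  # locks_held(t): thread -> locks currently held

def on_access(t, v):
    """Lockset refinement on every read or write of shared variable v by thread t."""
    held = locks_held.get(t, set())
    if candidate.get(v, ALL_LOCKS) is ALL_LOCKS:
        candidate[v] = set(held)             # first refinement of C(v)
    else:
        candidate[v] &= held                 # C(v) := C(v) ∩ locks_held(t)
    if not candidate[v]:
        print(f"warning: no lock consistently protects {v!r}")

# Replaying Figure 3: v is protected first by mu1 and later by mu2.
locks_held["T1"] = {"mu1"}; on_access("T1", "v")   # C(v) = {mu1}
locks_held["T1"] = {"mu2"}; on_access("T1", "v")   # C(v) = {} -> warning
```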
2.1 Improving the locking discipline
The simple locking discipline we have used so far is too
strict. There are three very common programming practices
that violate the discipline yet are free from any data races:
Initialization. Shared variables are frequently initial-
ized without holding a lock.
Read-shared data. Some shared variables are written
during initialization only and are read-only thereafter.
These can be safely accessed without locks.
Read-write locks. Read-write locks allow multiple
readers to access a shared variable, but allow only a sin-
gle writer to do so.
We will extend the Lockset algorithm to accommodate ini-
tialization and read-shared data, and then extend it further to
accommodate read-write locks.
2.2 Initialization and read-sharing
There is no need for a thread to lock out others if no other
thread can possibly hold a reference to the data being ac-
cessed. Programmers often take advantage of this observa-
tion when initializing newly allocated data. To avoid false
alarms caused by these unlocked initialization writes, we de-
lay the refinement of a location's candidate set until after it
has been initialized. Unfortunately, we have no easy way
[State diagram: Virgin -> Exclusive (rd/wr, first thread); Exclusive -> Shared (rd, new thread); Exclusive or Shared -> Shared-Modified (wr, new thread); other reads and writes leave the state unchanged.]
Figure 4: Eraser keeps track of the state of all locations in memory. Newly allocated locations begin in the Virgin state. As various threads read and write a location, its state changes according to the transitions in the figure. Race conditions are reported only for locations in the Shared-Modified state.
of knowing when initialization is complete. Eraser therefore
considers a shared variable to be initialized when it is first
accessed by a second thread. As long as a variable has been
accessed by a single thread only, reads and writes have no
effect on the candidate set.
Since simultaneous reads of a shared variable by multiple
threads are not races, there is also no need to protect a vari-
able if it is read-only. To support unlocked read-sharing for
such data, we report races only after an initialized variable
has become write-shared by more than one thread.
Figure 4 illustrates the state transitions that control when
lockset refinement occurs and when races are reported.
When a variable is first allocated, it is set to the Virgin state,
indicating that the data is new and has not yet been refer-
enced by any thread. Once the data is accessed, it enters
the Exclusive state, signifying that it has been accessed,
but by one thread only. In this state, subsequent reads and
writes by the same thread do not change the variable’s state
and do not update C(v). This addresses the initialization issue, since the first thread can initialize the variable without causing C(v) to be refined. When and if another thread accesses the variable, then the state changes. A read access changes the state to Shared. In the Shared state, C(v) is updated, but data races are not reported, even if C(v) becomes empty. This takes care of the read-shared data issue, since multiple threads can read a variable without causing a race to be reported. A write access from a new thread changes the state from Exclusive or Shared to the Shared-Modified state, in which C(v) is updated and races are reported, just as described in the original, simple version of the algorithm.
Our support for initialization makes Eraser’s checking
more dependent on the scheduler than we would like. Sup-
pose that a thread allocates and initializes a shared variable

without a lock, and erroneously makes the variable accessi-
ble to a second thread before it has completed the initializa-
tion. Then Eraser will detect the error if any of the second
thread’s accesses occur before the rst thread’s nal initial-
ization actions, but otherwise Eraser will miss the error. We
don’t think this has been a problem, but we have no way of
knowing for sure.
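
The state machine of Figure 4 can be sketched as follows (our own Python sketch; the names and representation are ours, while Eraser itself keeps this state in a shadow word alongside each data word): C(v) is left alone in the Exclusive state, refined without reporting in the Shared state, and refined with warnings in the Shared-Modified state.

```python
from enum import Enum, auto

class State(Enum):
    VIRGIN = auto()
    EXCLUSIVE = auto()
    SHARED = auto()
    SHARED_MODIFIED = auto()

state, owner, candidate = {}, {}, {}   # per-variable state, first-accessing thread, C(v)
locks_held = {}                        # thread -> set of locks currently held

def on_access(t, v, is_write):
    s = state.get(v, State.VIRGIN)
    if s is State.VIRGIN:
        state[v], owner[v], candidate[v] = State.EXCLUSIVE, t, None   # None = all locks
        return
    if s is State.EXCLUSIVE and t == owner[v]:
        return                                   # still initializing: no refinement yet
    if is_write:
        state[v] = State.SHARED_MODIFIED         # a write once a second thread is involved
    elif s is State.EXCLUSIVE:
        state[v] = State.SHARED                  # a read by a new thread
    held = locks_held.get(t, set())
    candidate[v] = set(held) if candidate[v] is None else candidate[v] & held
    if state[v] is State.SHARED_MODIFIED and not candidate[v]:
        print(f"warning: possible data race on {v!r}")
```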
2.3 Read-write locks
Many programs use single-writer, multiple-reader locks as
well as simple locks. To accommodate this style we intro-
duce our last refinement of the locking discipline: we require that for each variable v, some lock m protects v, meaning that m is held in write mode for every write of v, and m is held in some mode (read or write) for every read of v.
We continue to use the state transitions of Figure 4, but when the variable enters the Shared-Modified state, the checking is slightly different:

    Let locks_held(t) be the set of locks held in any mode by thread t.
    Let write_locks_held(t) be the set of locks held in write mode by thread t.
    For each v, initialize C(v) to the set of all locks.
    On each read of v by thread t,
        set C(v) := C(v) ∩ locks_held(t);
        if C(v) = {}, then issue a warning.
    On each write of v by thread t,
        set C(v) := C(v) ∩ write_locks_held(t);
        if C(v) = {}, then issue a warning.
That is, locks held purely in read mode are removed from
the candidate set when a write occurs, as such locks held by
a writer do not protect against a data race between the writer
and some other reader thread.
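
Only the Shared-Modified checking changes under this refinement, which can be sketched as follows (our own Python sketch; write_locks_held mirrors the definition above): reads refine C(v) with locks held in any mode, writes refine it with locks held in write mode, so read-mode locks drop out of C(v) at the first write.

```python
candidate = {}          # C(v) for variables that have entered the Shared-Modified state
locks_held = {}         # thread -> locks held in any mode (read or write)
write_locks_held = {}   # thread -> locks held in write mode only

def refine(v, held):
    candidate[v] = candidate[v] & held if v in candidate else set(held)
    if not candidate[v]:
        print(f"warning: no lock consistently protects {v!r}")

def on_read(t, v):
    refine(v, locks_held.get(t, set()))           # a lock in either mode protects a read

def on_write(t, v):
    refine(v, write_locks_held.get(t, set()))     # only write-mode locks protect a write
```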
3 Implementing Eraser
Eraser is implemented for the DIGITAL Unix operating sys-
tem on the Alpha processor, using the ATOM [Srivastava &
Eustace 94] binary modification system. Eraser takes an unmodified program binary as input and adds instrumentation
to produce a new binary that is functionally identical, but in-
cludes calls to the Eraser runtime to implement the Lockset
algorithm.
To maintain C(v), Eraser instruments each load and store in the program. To maintain locks_held(t) for each thread t, Eraser instruments each call to acquire or release a lock, as well as the stubs that manage thread initialization and finalization. To initialize C(v) for dynamically allocated data, Eraser instruments each call to the storage allocator.
Eraser treats each 32-bit word in the heap or global data
as a possible shared variable, since on our platform a 32-bit
word is the smallest memory-coherent unit. Eraser does not
instrument loads and stores whose address mode is indirect
off the stack pointer, since these are assumed to be stack ref-
erences, and shared variables are assumed to be in global
locations or in the heap. Eraser will maintain candidate sets
for stack locations that are accessed via registers other than
the stack pointer, but this is an artifact of the implementation
rather than a deliberate plan to support programs that share
stack locations between threads.
When a race is reported, Eraser indicates the file and line
number at which it was discovered and a backtrace listing of
all active stack frames. The report also includes the thread
ID, memory address, type of memory access, and important
register values such as the program counter and stack pointer.
We have found that this information is usually sufficient for
locating the source of the race. If the cause of a race is still
unclear, the user can direct Eraser to log all the accesses to
a particular variable that result in a change to its candidate
lock set.
3.1 Representing the candidate lock sets
A naïve implementation of lock sets would store a list of
candidate locks for each memory location, potentially con-
suming many times the allocated memory of the program.
We can avoid this expense by exploiting the fortunate fact
that the number of distinct sets of locks observed in practice
is quite small. In fact, we have never observed more than
10,000 distinct sets of locks occurring in any execution of
the Lockset monitoring algorithm. Consequently, we rep-
resent each set of locks by a small integer, a lockset index
into a table whose entries represent the set of locks as sorted
vectors of lock addresses. Hashing is used to eliminate du-
plicates in the table and to find a lockset index from a given
set of locks. The entries in the table are never deallocated or
modied, so each lockset index remains valid for the lifetime
of the program. Eraser also caches the result of each inter-
section, so that the fast case for set intersection is simply a
table lookup. Each lock vector in the table is sorted, so that
when the cache fails, the slow case of the intersection oper-
ation can be performed by a simple comparison of the two
sorted vectors.
For every 32-bit word in the data segment and heap, there
is a corresponding shadow word that is used to contain a 30-
bit lockset index and a 2-bit state condition. In the Exclusive
state, the 30 bits are not used to store a lockset index, but
used instead to store the ID of the thread with exclusive ac-
cess.
All the standard memory allocation routines are instru-
mented to allocate and initialize a shadow word for each
word allocated by the program. When a thread accesses a
memory location, Eraser finds the shadow word by adding a fixed displacement to the location's address.
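
The representation just described can be sketched as follows (our own Python sketch; the real implementation stores sorted vectors of lock addresses and packs the shadow word next to each 32-bit data word, so the bit layout shown here is only illustrative):

```python
lockset_table = []        # lockset index -> sorted tuple of lock ids (never modified)
lockset_ids = {}          # sorted tuple of lock ids -> lockset index
intersect_cache = {}      # (index, index) -> index of the intersection

def intern(locks):
    """Return the small-integer index for a set of locks, creating it if new."""
    key = tuple(sorted(locks))
    if key not in lockset_ids:
        lockset_ids[key] = len(lockset_table)
        lockset_table.append(key)
    return lockset_ids[key]

def intersect(i, j):
    """Fast case is a cache lookup; the slow case intersects the two stored sets."""
    if (i, j) not in intersect_cache:
        intersect_cache[(i, j)] = intern(set(lockset_table[i]) & set(lockset_table[j]))
    return intersect_cache[(i, j)]

# Shadow word: a 2-bit state plus a 30-bit payload, which holds a lockset index
# (or the owning thread's ID while the location is in the Exclusive state).
def pack_shadow(state_bits, payload):
    return ((payload & 0x3FFFFFFF) << 2) | (state_bits & 0x3)

def unpack_shadow(word):
    return word & 0x3, word >> 2
```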

Citations
Journal ArticleDOI
09 May 2003
TL;DR: nesC, as described in this paper, is a programming language for networked embedded systems that represents a new design space for application developers; it is used to implement TinyOS, a small operating system for sensor networks, as well as several significant sensor applications.
Abstract: We present nesC, a programming language for networked embedded systems that represent a new design space for application developers. An example of a networked embedded system is a sensor network, which consists of (potentially) thousands of tiny, low-power "motes," each of which execute concurrent, reactive programs that must operate with severe memory and power constraints.nesC's contribution is to support the special needs of this domain by exposing a programming model that incorporates event-driven execution, a flexible concurrency model, and component-oriented application design. Restrictions on the programming model allow the nesC compiler to perform whole-program analyses, including data-race detection (which improves reliability) and aggressive function inlining (which reduces resource consumption).nesC has been used to implement TinyOS, a small operating system for sensor networks, as well as several significant sensor applications. nesC and TinyOS have been adopted by a large number of sensor network research groups, and our experience and evaluation of the language shows that it is effective at supporting the complex, concurrent programming style demanded by this new class of deeply networked systems.

1,771 citations


Cites methods from "Eraser: a dynamic data race detecto..."

  • ...Eraser [44] detects unprotected shared variables using a modified binary....

Journal ArticleDOI
11 Sep 2000
TL;DR: A verification and testing environment for Java, called Java PathFinder (JPF), which integrates model checking, program analysis, and testing, and uses state compression to handle large states, together with partial-order and symmetry reduction, slicing, abstraction, and runtime analysis techniques to reduce the state space.
Abstract: The majority of the work carried out in the formal methods community throughout the last three decades has (for good reasons) been devoted to special languages designed to make it easier to experiment with mechanized formal methods such as theorem provers and model checkers. In this paper, we give arguments for why we believe it is time for the formal methods community to shift some of its attention towards the analysis of programs written in modern programming languages. In keeping with this philosophy, we have developed a verification and testing environment for Java, called Java PathFinder (JPF), which integrates model checking, program analysis and testing. Part of this work has consisted of building a new Java Virtual Machine that interprets Java bytecode. JPF uses state compression to handle large states, and partial order reduction, slicing, abstraction and run-time analysis techniques to reduce the state space. JPF has been applied to a real-time avionics operating system developed at Honeywell, illustrating an intricate error, and to a model of a spacecraft controller, illustrating the combination of abstraction, run-time analysis and slicing with model checking.

1,459 citations


Cites methods from "Eraser: a dynamic data race detecto..."

  • ...The source of the error, a missing critical section, could, however, have been found automatically using the Eraser data detection algorithm....

  • ...It immediately identified the race condition using the Eraser algorithm, and then launched the model checker on a thread window consisting of those threads involved in the race condition: the Planner and the Executive, locating the deadlock - all within 25 seconds....

  • ...The algorithm described in [38] is relaxed to allow variables to be initialized without locks, and to be read by several threads without locks, if no-one writes....

  • ...We have made experiments where the Eraser module in JPF generates a so-calledrace window consisting of the threads involved in a race condition....

  • ...An example is the data race detection algorithm Eraser [38] developed at Compaq....

Proceedings ArticleDOI
23 Oct 2004
TL;DR: It is found that even well tested code written by experts contains a surprising number of obvious bugs and that simple automatic techniques can be effective at countering the impact of both ordinary mistakes and misunderstood language features.
Abstract: Many techniques have been developed over the years to automatically find bugs in software. Often, these techniques rely on formal methods and sophisticated program analysis. While these techniques are valuable, they can be difficult to apply, and they aren't always effective in finding real bugs.Bug patterns are code idioms that are often errors. We have implemented automatic detectors for a variety of bug patterns found in Java programs. In this extended abstract1, we describe how we have used bug pattern detectors to find serious bugs in several widely used Java applications and libraries. We have found that the effort required to implement a bug pattern detector tends to be low, and that even extremely simple detectors find bugs in real applications.From our experience applying bug pattern detectors to real programs, we have drawn several interesting conclusions. First, we have found that even well tested code written by experts contains a surprising number of obvious bugs. Second, Java (and similar languages) have many language features and APIs which are prone to misuse. Finally, that simple automatic techniques can be effective at countering the impact of both ordinary mistakes and misunderstood language features.

864 citations

Journal ArticleDOI
TL;DR: A comprehensive overview of a broad spectrum of fault localization techniques, each of which aims to streamline the fault localization process and make it more effective by attacking the problem in a unique way is provided.
Abstract: Software fault localization, the act of identifying the locations of faults in a program, is widely recognized to be one of the most tedious, time consuming, and expensive – yet equally critical – activities in program debugging. Due to the increasing scale and complexity of software today, manually locating faults when failures occur is rapidly becoming infeasible, and consequently, there is a strong demand for techniques that can guide software developers to the locations of faults in a program with minimal human intervention. This demand in turn has fueled the proposal and development of a broad spectrum of fault localization techniques, each of which aims to streamline the fault localization process and make it more effective by attacking the problem in a unique way. In this article, we catalog and provide a comprehensive overview of such techniques and discuss key issues and concerns that are pertinent to software fault localization as a whole.

822 citations


Cites background or methods from "Eraser: a dynamic data race detecto..."

  • ...A runtime analysis (such as [144], [321], [392]), on the other hand, is less powerful than a static analysis but also produces fewer false...

  • ...Concurrent programs suffer most from three kinds of access anomalies: data race [32], [321], atomicity violation [110],...

Proceedings ArticleDOI
01 Mar 2008
TL;DR: This study carefully examined concurrency bug patterns, manifestation, and fix strategies of 105 randomly selected real world concurrency bugs from 4 representative server and client open-source applications and reveals several interesting findings that provide useful guidance for concurrency Bug detection, testing, and concurrent programming language design.
Abstract: The reality of multi-core hardware has made concurrent programs pervasive. Unfortunately, writing correct concurrent programs is difficult. Addressing this challenge requires advances in multiple directions, including concurrency bug detection, concurrent program testing, concurrent programming model design, etc. Designing effective techniques in all these directions will significantly benefit from a deep understanding of real world concurrency bug characteristics.This paper provides the first (to the best of our knowledge) comprehensive real world concurrency bug characteristic study. Specifically, we have carefully examined concurrency bug patterns, manifestation, and fix strategies of 105 randomly selected real world concurrency bugs from 4 representative server and client open-source applications (MySQL, Apache, Mozilla and OpenOffice). Our study reveals several interesting findings and provides useful guidance for concurrency bug detection, testing, and concurrent programming language design.Some of our findings are as follows: (1) Around one third of the examined non-deadlock concurrency bugs are caused by violation to programmers' order intentions, which may not be easily expressed via synchronization primitives like locks and transactional memories; (2) Around 34% of the examined non-deadlock concurrency bugs involve multiple variables, which are not well addressed by existing bug detection tools; (3) About 92% of the examined concurrency bugs canbe reliably triggered by enforcing certain orders among no more than 4 memory accesses. This indicates that testing concurrent programs can target at exploring possible orders among every small groups of memory accesses, instead of among all memory accesses; (4) About 73% of the examinednon-deadlock concurrency bugs were not fixed by simply adding or changing locks, and many of the fixes were not correct at the first try, indicating the difficulty of reasoning concurrent execution by programmers.

800 citations


Cites background from "Eraser: a dynamic data race detecto..."

  • ...For example, data race bug detection [37, 42] checks the synchronization among accesses to one variable; some atomicity violation bug detection tools also focus on atomic regions related to one variable [23, 41]....

  • ...(1) Concurrency bug detection Most previous concurrency bug detection research has focused on detecting data race bugs [7, 10, 31,33,37,42] and deadlock bugs [3,10,37]....

References
Journal ArticleDOI
Leslie Lamport1
TL;DR: In this article, the concept of one event happening before another in a distributed system is examined, and a distributed algorithm is given for synchronizing a system of logical clocks which can be used to totally order the events.
Abstract: The concept of one event happening before another in a distributed system is examined, and is shown to define a partial ordering of the events. A distributed algorithm is given for synchronizing a system of logical clocks which can be used to totally order the events. The use of the total ordering is illustrated with a method for solving synchronization problems. The algorithm is then specialized for synchronizing physical clocks, and a bound is derived on how far out of synchrony the clocks can become.

6,804 citations

Proceedings ArticleDOI
03 Dec 1995
TL;DR: This paper describes the motivation, architecture and performance of SPIN, an extensible operating system that provides an extension infrastructure together with a core set of extensible services that allow applications to safely change the operating system's interface and implementation.
Abstract: This paper describes the motivation, architecture and performance of SPIN, an extensible operating system. SPIN provides an extension infrastructure, together with a core set of extensible services, that allow applications to safely change the operating system's interface and implementation. Extensions allow an application to specialize the underlying operating system in order to achieve a particular level of performance and functionality. SPIN uses language and link-time mechanisms to inexpensively export fine-grained interfaces to operating system services. Extensions are written in a type safe language, and are dynamically linked into the operating system kernel. This approach offers extensions rapid access to system services, while protecting the operating system code executing within the kernel address space. SPIN and its extensions are written in Modula-3 and run on DEC Alpha workstations.

1,054 citations

Proceedings ArticleDOI
01 Jun 1994
TL;DR: ATOM as mentioned in this paper is a single framework for building a wide range of customized program analysis tools, including block counting, profiling, dynamic memory recording, instruction and data cache simulation, pipeline simulation, evaluating branch prediction, and instruction scheduling.
Abstract: ATOM (Analysis Tools with OM) is a single framework for building a wide range of customized program analysis tools. It provides the common infrastructure present in all code-instrumenting tools; this is the difficult and time-consuming part. The user simply defines the tool-specific details in instrumentation and analysis routines. Building a basic block counting tool like Pixie with ATOM requires only a page of code.ATOM, using OM link-time technology, organizes the final executable such that the application program and user's analysis routines run in the same address space. Information is directly passed from the application program to the analysis routines through simple procedure calls instead of inter-process communication or files on disk. ATOM takes care that analysis routines do not interfere with the program's execution, and precise information about the program is presented to the analysis routines at all times. ATOM uses no simulation or interpretation.ATOM has been implemented on the Alpha AXP under OSF/1. It is efficient and has been used to build a diverse set of tools for basic block counting, profiling, dynamic memory recording, instruction and data cache simulation, pipeline simulation, evaluating branch prediction, and instruction scheduling.

982 citations

Proceedings ArticleDOI
01 Sep 1996
TL;DR: The design, implementation, and performance of Petal is described, a system that attempts to approximate this ideal in practice through a novel combination of features.
Abstract: The ideal storage system is globally accessible, always available, provides unlimited performance and capacity for a large number of clients, and requires no management. This paper describes the design, implementation, and performance of Petal, a system that attempts to approximate this ideal in practice through a novel combination of features. Petal consists of a collection of network-connected servers that cooperatively manage a pool of physical disks. To a Petal client, this collection appears as a highly available block-level storage system that provides large abstract containers called virtual disks. A virtual disk is globally accessible to all Petal clients on the network. A client can create a virtual disk on demand to tap the entire capacity and performance of the underlying physical resources. Furthermore, additional resources, such as servers and disks, can be automatically incorporated into Petal.We have an initial Petal prototype consisting of four 225 MHz DEC 3000/700 workstations running Digital Unix and connected by a 155 Mbit/s ATM network. The prototype provides clients with virtual disks that tolerate and recover from disk, server, and network failures. Latency is comparable to a locally attached disk, and throughput scales with the number of servers. The prototype can achieve I/O rates of up to 3150 requests/sec and bandwidth up to 43.1 Mbytes/sec.

725 citations

Book ChapterDOI
08 Jun 1998
TL;DR: This talk reports on some of the research results of and the current state of the Extended Static Checking project at DEC SRC.
Abstract: Extended static checking (ESC) is a static program analysis technique that attempts to find common programming errors like null-dereferences, array index bounds errors, type cast errors, deadlocks, and race conditions. An ESC tool is powered by program verification technology, yet it feels to the programmer like a type checker because of the limited ambition of finding only certain kinds of errors. This talk reports on some of the research results of and the current state of the Extended Static Checking project at DEC SRC.

481 citations