CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE
Concurrency Computat.: Pract. Exper. 2005; 00:1–41 Prepared using cpeauth.cls [Version: 2002/09/19 v2.02]
Revocation techniques for Java concurrency

Adam Welc, Suresh Jagannathan, Antony L. Hosking

Department of Computer Sciences, Purdue University, 250 N. University Street, West Lafayette, IN 47907-2066, U.S.A.
SUMMARY
This paper proposes two approaches to managing concurrency in Java using a guarded region abstraction.
Both approaches use revocation of such regions: the ability to undo their effects automatically
and transparently. These new techniques alleviate many of the constraints that inhibit construction
of transparently scalable and robust concurrent applications. The first solution, revocable monitors,
augments existing mutual exclusion monitors with the ability to resolve priority inversion and deadlock
dynamically, by reverting program execution to a consistent state when such situations are detected,
while preserving Java semantics. The second technique, transactional monitors, extends the functionality
of revocable monitors by implementing guarded regions as lightweight transactions that can be executed
concurrently (or in parallel on multiprocessor platforms). The presentation includes discussion of design
and implementation issues for both schemes, as well as a detailed performance study to compare their
behavior with the traditional, state-of-the-art implementation of Java monitors based on mutual exclusion.
KEY WORDS: isolation, atomicity, concurrency, synchronization, Java, speculation
1. Introduction
Managing complexity is a major challenge in constructing robust large-scale server applications
(such as database management systems, application servers, airline reservation systems, etc.). In
a typical environment, large numbers of clients may access a server application concurrently. To
provide satisfactory response time and throughput, applications are often made concurrent. Thus, many
programming languages (e.g., Smalltalk, C++, ML, Modula-3, Java) provide mechanisms that enable
concurrent programming via a thread abstraction, with threads being the smallest unit of concurrent
E-mail: welc@cs.purdue.edu
E-mail: suresh@cs.purdue.edu
E-mail: hosking@cs.purdue.edu
Contract/grant sponsor: National Science Foundation; contract/grant numbers: IIS-9988637, CCR-0085792, STI-0034141
Copyright © 2005 John Wiley & Sons, Ltd.

2 A. WELC, S. JAGANNATHAN, A. L. HOSKING
execution. Another key mechanism offered by these languages is the notion of guarded code regions in
which accesses to shared data performed by one thread are isolated from accesses performed by other
threads, and all updates performed by a thread within a guarded region become visible to the other
threads atomically, once the executing thread exits the region. Guarded regions (e.g., Java synchronized
methods and blocks, Modula-3 LOCK statements) are usually implemented using mutual-exclusion
locks.
In this paper, we explore two new approaches to concurrent programming, comparing their
performance against use of a state-of-the-art mutual exclusion implementation that uses thin locks
to minimize the overhead of locking [4]. Our discussion is grounded in the context of the Java
programming language, but is applicable to any language that offers the following mechanisms:
- Multithreading: concurrent threads of control executing over objects in a shared address space.
- Synchronized blocks: lexically-delimited blocks of code, guarded by dynamically-scoped monitors (locks). Threads synchronize on a given monitor, acquiring it on entry to the block and releasing it on exit. Only one thread may be perceived to execute within a synchronized block at any time, ensuring exclusive access to all monitor-protected blocks.
- Exception scopes: blocks of code in which an error condition can change the normal flow of control of the active thread, by exiting active scopes and transferring control to a handler associated with each block.
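As a concrete (hypothetical) illustration of the first two mechanisms, the following self-contained Java sketch runs two threads over a shared counter guarded by a synchronized block; the class and method names are ours, not part of any system discussed in the paper:

```java
public class GuardedCounter {
    private final Object monitor = new Object();  // dynamically-scoped monitor (lock)
    private int count = 0;

    // Synchronized block: at most one thread executes the guarded region at a time.
    public void increment() {
        synchronized (monitor) {
            count++;  // isolated update; visible to other threads on block exit
        }
    }

    public int get() {
        synchronized (monitor) { return count; }
    }

    // Multithreading: run two concurrent threads over one shared object.
    public static int concurrentCount(int perThread) throws InterruptedException {
        GuardedCounter c = new GuardedCounter();
        Runnable work = () -> { for (int i = 0; i < perThread; i++) c.increment(); };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        return c.get();  // no lost updates: 2 * perThread
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(concurrentCount(10_000));  // prints 20000
    }
}
```

Without the synchronized block, the two unsynchronized read-modify-write sequences could interleave and lose updates.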
Difficulties arising in the use of mutual exclusion locking with multiple threads, such as race conditions, priority inversion, and deadlock, are widely recognized.
Race conditions are a serious issue for non-trivial concurrent programs. A race exists when two
threads can access the same object, and one of the accesses is a write. To avoid races, programmers
must carefully construct their application to trade off performance and throughput (by maximizing
concurrent access to shared data) for correctness (by limiting concurrent access when it could lead to
incorrect behavior), or rely on race detector tools that identify when races occur [7, 8, 18]. Recent work
has advocated higher-level safety properties such as atomicity for concurrent applications [19].
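The distinction between per-operation locking and higher-level atomicity can be made concrete with a small sketch (ours, for illustration only): each map operation below is individually synchronized, yet the check-then-act pair remains racy unless the whole region is guarded as one atomic unit:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: individually synchronized operations do not compose
// into an atomic check-then-act sequence.
public class AtomicityDemo {
    private final Map<String, Integer> map = new HashMap<>();

    public synchronized boolean rawContains(String k) { return map.containsKey(k); }
    public synchronized void rawPut(String k, int v) { map.put(k, v); }

    // Racy: each call is atomic, but another thread may interleave between them,
    // so two threads can both pass the test and the second put clobbers the first.
    public void putIfAbsentRacy(String k, int v) {
        if (!rawContains(k)) rawPut(k, v);
    }

    // Atomic: the entire region is guarded, restoring the higher-level property.
    public synchronized Integer putIfAbsentAtomic(String k, int v) {
        if (!map.containsKey(k)) { map.put(k, v); return null; }
        return map.get(k);
    }

    public synchronized Integer get(String k) { return map.get(k); }

    public static void main(String[] args) {
        AtomicityDemo d = new AtomicityDemo();
        d.putIfAbsentAtomic("x", 1);
        d.putIfAbsentAtomic("x", 2);      // second insert is rejected
        System.out.println(d.get("x"));   // prints 1
    }
}
```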
In languages with priority scheduling of threads, a low-priority thread may hold a lock even while other threads, which may have higher priority, are waiting to acquire it. Priority inversion results when a low-priority thread T_l holds a lock required by some high-priority thread T_h, forcing the high-priority T_h to wait until T_l releases the lock. Even worse, an unbounded number of runnable medium-priority threads T_m may exist, thus preventing T_l from running, making unbounded the time that T_l (and hence T_h) must wait. Such situations can cause havoc in applications where high-priority threads demand some level of guaranteed throughput.
Deadlock results when two or more threads are unable to proceed because each is waiting on a lock held by another. Such a situation is easily constructed for two threads, T_1 and T_2: T_1 first acquires lock L_1 while T_2 acquires L_2, then T_1 tries to acquire L_2 while T_2 tries to acquire L_1, resulting in deadlock. Deadlocks may also result from far more complex interactions among multiple threads and may stay undetected until and beyond application deployment. The ability to resolve a deadlock dynamically is much more attractive than permanently stalling some subset of concurrent threads.
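One programmer-visible way to resolve such a cycle dynamically, sketched below using java.util.concurrent (this is our own illustration, not the revocation mechanism this paper proposes), is to acquire the second lock with a timeout and, on failure, release the first lock and retry:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: acquire both locks or back off. If the second lock
// cannot be taken within the timeout, release the first ("revoke" our
// progress) and retry, so opposite acquisition orders cannot deadlock forever.
public class BackoffLocking {
    public static void withBoth(ReentrantLock a, ReentrantLock b, Runnable critical)
            throws InterruptedException {
        while (true) {
            a.lockInterruptibly();
            boolean gotB = false;
            try {
                gotB = b.tryLock(10, TimeUnit.MILLISECONDS);
                if (gotB) { critical.run(); return; }
            } finally {
                if (gotB) b.unlock();
                a.unlock();
            }
            Thread.sleep(1);  // brief back-off so the opposing thread can finish
        }
    }

    // Two threads acquire the same pair of locks in opposite orders; with
    // plain lock() this is the classic deadlock, here both complete.
    public static int demo() throws InterruptedException {
        ReentrantLock l1 = new ReentrantLock(), l2 = new ReentrantLock();
        int[] done = {0};
        Runnable crit = () -> { synchronized (done) { done[0]++; } };
        Thread t1 = new Thread(() -> {
            try { withBoth(l1, l2, crit); } catch (InterruptedException e) { }
        });
        Thread t2 = new Thread(() -> {
            try { withBoth(l2, l1, crit); } catch (InterruptedException e) { }
        });
        t1.start(); t2.start();
        t1.join(); t2.join();
        return done[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());  // 2: both critical sections completed
    }
}
```

Unlike the transparent revocation proposed in this paper, this idiom requires the programmer to structure code around the retry and cannot undo side effects already performed.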
For real-world concurrent programs with complex module and dependency structures, it is difficult to perform an exhaustive exploration of the space of possible interleavings to determine statically when races, deadlocks, or priority inversions may arise. For such applications, the ability to redress undesirable interactions among scheduling decisions and lock management transparently is very useful.

REVOCATION TECHNIQUES FOR JAVA CONCURRENCY 3
These observations inspire the first solution we propose: revocable monitors. Our technique augments
existing mutual exclusion monitors with the ability to resolve priority inversion dynamically (and
automatically). Some instances of deadlock may be resolved by revocation. However, we note that
deadlocks inherent to a program that are independent of scheduling decisions will manifest themselves
as livelock when revocation is used.
A second difficulty with using mutual exclusion to mediate data accesses among threads is ensuring
adequate performance when running on multi-processor platforms. To manipulate a complex shared
data structure like a tree or heap, applications must either impose a global locking scheme on the
roots, or employ locks at lower-level nodes in the structure. The former strategy is simple, but reduces
realizable concurrency and may induce false exclusion: threads wishing to access a distinct piece of the
structure may nonetheless block while waiting for another thread that is accessing an unrelated piece
of the structure. The latter approach permits multiple threads to access the structure simultaneously,
but incurs implementation complexity, and requires more memory to hold the necessary lock state.
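A common compromise between the two strategies is lock striping, sketched below (our own illustrative code, not from the paper): the structure is partitioned into segments, each guarded by its own monitor, so threads accessing distinct segments do not exclude one another, while the lock state stays bounded by the stripe count:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a striped-lock map: keys hash to one of a fixed
// number of stripes, and only that stripe's monitor is held for an access,
// so threads working on different stripes proceed in parallel.
public class StripedMap {
    private static final int STRIPES = 16;
    private final Object[] locks = new Object[STRIPES];
    private final Map<String, Integer>[] segments;

    @SuppressWarnings("unchecked")
    public StripedMap() {
        segments = new Map[STRIPES];
        for (int i = 0; i < STRIPES; i++) {
            locks[i] = new Object();
            segments[i] = new HashMap<>();
        }
    }

    private int stripe(String key) {
        return (key.hashCode() & 0x7fffffff) % STRIPES;  // non-negative index
    }

    public void put(String key, int value) {
        int s = stripe(key);
        synchronized (locks[s]) { segments[s].put(key, value); }
    }

    public Integer get(String key) {
        int s = stripe(key);
        synchronized (locks[s]) { return segments[s].get(key); }
    }

    public static void main(String[] args) {
        StripedMap m = new StripedMap();
        m.put("left-subtree", 1);
        m.put("right-subtree", 2);
        System.out.println(m.get("left-subtree") + " " + m.get("right-subtree"));
    }
}
```

Striping reduces false exclusion relative to a single root lock, but it still cannot guard multi-key operations atomically, which is where the transactional approach below differs.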
Our solution to this problem is an alternative to lock-based mutual exclusion: transactional
monitors. These extend the functionality of revocable monitors by implementing guarded regions as
lightweight transactions that can be executed concurrently (or in parallel on multiprocessor platforms).
Transactional monitors define the following data visibility property that preserves isolation and
atomicity invariants on shared data protected by the monitor: all updates to objects guarded by a
transactional monitor become visible to other threads only on successful completion of the monitor
transaction.
Because transactional monitors impose serializability invariants on the regions they
protect (i.e., preserve the appearance of serial execution), they can help reduce race conditions by
allowing programmers to more aggressively guard code regions that may access shared data without
paying a significant performance penalty. Since the system dynamically records and redresses state
violations (by revoking the effects of the transaction when a serializability violation is detected),
programmers are relieved from the burden of having to determine when mutual exclusion can safely
be relaxed. Thus, programmers can afford to over-specify code regions that must be guarded, provided
the implementation can relax such over-specification safely and efficiently whenever possible.
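The flavor of such optimistic execution can be sketched at the library level (a deliberately simplified illustration of ours, not the paper's VM-level implementation): a thread snapshots a version number, computes speculatively in private, and commits only if no conflicting commit intervened, revoking and retrying otherwise:

```java
// Hypothetical sketch of an optimistic "transactional" update: the update is
// computed privately and published only if no other commit intervened;
// on conflict the speculative work is discarded and retried.
public class OptimisticCell {
    private int value = 0;
    private int version = 0;

    public synchronized int[] snapshot() {        // returns {value, version}
        return new int[] { value, version };
    }

    // Attempt to commit a value computed against the snapshot's version.
    public synchronized boolean commit(int expectedVersion, int newValue) {
        if (version != expectedVersion) return false;  // conflict: revoke
        value = newValue;
        version++;                                     // publish atomically
        return true;
    }

    // Retry loop: compute outside any lock, commit only if still consistent.
    public int transactionalAdd(int delta) {
        while (true) {
            int[] s = snapshot();
            int proposed = s[0] + delta;   // speculative, thread-private work
            if (commit(s[1], proposed)) return proposed;
        }
    }

    public static void main(String[] args) {
        OptimisticCell c = new OptimisticCell();
        System.out.println(c.transactionalAdd(5) + " " + c.transactionalAdd(3));  // 5 8
    }
}
```

As in transactional monitors, the programmer over-specifies the guarded computation and the runtime decides dynamically whether it may complete or must be revoked.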
While revocable monitors and transactional monitors rely on similar mechanisms, and can exist
side-by-side in the same virtual machine, their semantics and intended utility are quite different. We
expect revocable monitors to be used primarily to resolve deadlock as well as to improve throughput for
high-priority threads by transparently averting priority inversion. In contrast, we envision transactional
monitors as an entirely new synchronization framework that addresses the performance impact of
classical mutual exclusion while simplifying concurrent programming.
We examine the performance and scalability of these different approaches in the context of a state-of-
the-art Java compiler and virtual machine, namely the Jikes Research Virtual Machine (RVM) [3] from
IBM. Jikes RVM is an ideal platform to compare our solutions with pure lock-based mutual exclusion,
since it already uses sophisticated strategies to minimize the overhead of traditional mutual-exclusion
locks [4]. A detailed evaluation in this context provides an accurate depiction of the tradeoffs embodied
and benefits obtained using the solutions we propose.
A slightly weaker visibility property is present in Java for updates performed within a synchronized block (or method);
these are guaranteed to be visible to other threads only upon exit from the block.

T_l, T_h:
synchronized(mon) {
    o1.f++;
    o2.f++;
    bar();
}

T_m:
foo();

Figure 1. Priority inversion
2. Revocable monitors: Overview
There are several ways to remedy erroneous or undesirable behavior in concurrent programs. Static techniques can sometimes identify erroneous conditions, allowing programmers to restructure their application appropriately. When static techniques are infeasible, dynamic techniques can be used both to identify problems and to remedy them when possible. Solutions to priority inversion, such as the priority ceiling and priority inheritance protocols [40], are good examples of such dynamic solutions.
Priority ceiling and priority inheritance solve an unbounded priority inversion problem, illustrated using the code fragment in Figure 1 (both T_l and T_h execute the same code, and methods foo() and bar() contain an arbitrary sequence of operations). Let us assume that thread T_l (low priority) is first to acquire the monitor mon, modifies objects o_1 and o_2, and is then preempted by thread T_m (medium priority). Note that thread T_h (high priority) is not permitted to enter monitor mon until it has been released by T_l, but since method foo() executed by T_m may contain an arbitrary sequence of actions (e.g., synchronous communication with another thread), it may take an arbitrarily long time before T_l is allowed to run again (and exit the monitor). Thus thread T_h may be forced to wait for an unbounded amount of time before it is allowed to complete its actions.
The priority ceiling technique raises the priority of any locking thread to the highest priority of any thread that ever uses that lock (i.e., its priority ceiling). This requires the programmer to supply the priority ceiling for each lock used throughout the execution of a program. In contrast, priority inheritance will raise the priority of a thread only when holding a lock causes it to block a higher-priority thread. When this happens, the low-priority thread inherits the priority of the higher-priority thread it is blocking. Both of these solutions prevent a medium-priority thread from blocking the execution of the low-priority thread (and thus also the high-priority thread) indefinitely. However, even in the absence of the medium-priority thread, the high-priority thread is forced to wait until the low-priority thread releases its lock. In the example given, the time to execute method bar() is potentially unbounded, thus high-priority thread T_h may still be delayed indefinitely until low-priority thread T_l finishes executing bar() and releases the monitor. Neither priority ceiling nor priority inheritance offers a solution to this problem.
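For concreteness, priority inheritance can be sketched at the library level (our own simplified illustration; production implementations live in the scheduler or virtual machine): a blocked higher-priority thread donates its priority to the lock holder, and the holder's original priority is restored on release:

```java
// Hypothetical sketch of priority inheritance. A waiting thread with higher
// priority boosts the lock holder so medium-priority threads cannot starve
// it (and, transitively, the waiter); the boost is undone on unlock.
public class InheritanceLock {
    private Thread owner;
    private int ownerOriginalPriority;

    public synchronized void lock() throws InterruptedException {
        Thread me = Thread.currentThread();
        while (owner != null) {
            // Donate priority to the current holder before blocking.
            if (me.getPriority() > owner.getPriority()) {
                owner.setPriority(me.getPriority());
            }
            wait();
        }
        owner = me;
        ownerOriginalPriority = me.getPriority();
    }

    public synchronized void unlock() {
        owner.setPriority(ownerOriginalPriority);  // restore on release
        owner = null;
        notifyAll();
    }

    // Demo: a low-priority holder is boosted while a high-priority thread waits.
    public static boolean demo() throws InterruptedException {
        InheritanceLock lk = new InheritanceLock();
        Thread low = new Thread(() -> {
            try { lk.lock(); Thread.sleep(500); lk.unlock(); }
            catch (InterruptedException e) { }
        });
        low.setPriority(Thread.MIN_PRIORITY);
        low.start();
        Thread.sleep(100);               // let the low-priority thread take the lock
        Thread high = new Thread(() -> {
            try { lk.lock(); lk.unlock(); } catch (InterruptedException e) { }
        });
        high.setPriority(Thread.MAX_PRIORITY);
        high.start();
        Thread.sleep(100);               // high now blocks, donating its priority
        boolean donated = low.getPriority() == Thread.MAX_PRIORITY;
        low.join(); high.join();
        return donated;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());  // true: the holder was boosted while blocking
    }
}
```

Note that, exactly as the text observes, the high-priority thread in this sketch still waits for the full duration of the holder's critical section; only revocation removes that wait.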
Besides priority inversion, deadlock is another potentially unwanted consequence of using mutual-
exclusion abstractions. A typical deadlock situation is illustrated with the code fragment in Figure 2.
T_1:
synchronized(mon1) {
    o1.f++;
    synchronized(mon2) {
        bar();
    }
}

T_2:
synchronized(mon2) {
    o2.f++;
    synchronized(mon1) {
        bar();
    }
}

Figure 2. Deadlock

Let us assume the following sequence of actions: thread T_1 acquires monitor mon1 and updates object o_1; thread T_2 acquires monitor mon2 and updates object o_2; thread T_1 attempts to acquire monitor mon2 (T_1 blocks, since mon2 is already held by thread T_2); and thread T_2 attempts to acquire monitor mon1 (T_2 blocks as well, since mon1 is already held by T_1). The result is that both threads are deadlocked: they will remain blocked indefinitely, and method bar() will never be executed by either thread.
In both of the scenarios illustrated by Figures 1 and 2, one can identify a single offending thread that must be revoked in order to resolve either the priority inversion or the deadlock. For priority inversion the offending thread is the low-priority thread currently executing the monitor. For deadlock, it is either of the threads engaged in the deadlock; various techniques exist for preventing or detecting deadlock [21], but all require that the actions of one of the threads leading to deadlock be revoked.
Revocable monitors can alleviate both these issues. Our approach to revocation combines compiler
techniques with run-time detection and resolution. When the need for revocation is encountered, the
run-time system selectively revokes the offending thread executing the monitor (i.e., synchronized
block) and its effects. All updates to shared data performed within the monitor are logged. Upon
detecting priority inversion or deadlock (either at lock acquisition, or in the background), the run-time
system interrupts the offending thread, uses the logged updates to undo that thread’s shared updates,
and transfers control of the thread back to the beginning of the block for retry. Externally, the effect of
the roll-back is to make it appear that the offending thread never entered the block.
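The logging and roll-back mechanism can be approximated at the library level by an undo log (a simplified sketch of ours; the paper's implementation operates inside the VM on heap writes): each tracked write records how to restore the old value, and revocation replays the records in reverse order:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.IntConsumer;
import java.util.function.IntSupplier;

// Hypothetical sketch of an undo log: every tracked write first records how
// to restore the old value; revoke() replays the records LIFO so the newest
// update is undone first, as in the roll-back described above.
public class UndoLog {
    private final Deque<Runnable> undo = new ArrayDeque<>();

    // Log the current value, then perform the write.
    public void write(IntSupplier getter, IntConsumer setter, int newValue) {
        int old = getter.getAsInt();
        undo.push(() -> setter.accept(old));  // restore action, newest on top
        setter.accept(newValue);
    }

    public void revoke() {                    // roll back all logged writes
        while (!undo.isEmpty()) undo.pop().run();
    }

    public void commit() { undo.clear(); }    // discard the log, keep the writes

    public static void main(String[] args) {
        int[] o1 = {0}, o2 = {0};
        UndoLog log = new UndoLog();
        log.write(() -> o1[0], v -> o1[0] = v, 5);
        log.write(() -> o2[0], v -> o2[0] = v, 7);
        log.revoke();                               // undo both updates
        System.out.println(o1[0] + " " + o2[0]);    // prints 0 0
    }
}
```

For a guarded field update such as o1.f++, the sketch would be invoked as write(() -> o1.f, v -> o1.f = v, o1.f + 1); after revoke(), it appears the thread never performed the write.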
The process of revoking the effects performed by a low-priority thread within a monitor is illustrated in Figure 3, where wavy lines represent threads T_l and T_h, circles represent objects o_1 and o_2, updated objects are marked grey, and the box represents the dynamic scope of a common monitor guarding a synchronized block executed by the threads. This scenario is based on the code from Figure 1 (data access operations performed within method bar() have been omitted for brevity). In Figure 3(a) low-priority thread T_l is about to enter the synchronized block, which it does in Figure 3(b), modifying object o_1. High-priority thread T_h tries to acquire the same monitor, but is blocked by low-priority T_l (Figure 3(c)). Here, a priority inheritance approach [40] would raise the priority of thread T_l to that of T_h, but T_h would still have to wait for T_l to release the lock. If a priority ceiling protocol were used, the priority of T_l would be raised to the ceiling upon its entry to the synchronized block, but the problem of T_h being forced to wait for T_l to release the lock would remain. Instead, our approach preempts T_l, undoing any updates to o_1, and transfers control in T_l back to the point of entry to the synchronized block. Here T_l must wait while T_h enters the monitor, and updates objects o_1 (Figure 3(e))
