Author

Stephen T. Heumann

Bio: Stephen T. Heumann is an academic researcher from University of Illinois at Urbana–Champaign. The author has contributed to research in topics: Concurrency & Concurrent computing. The author has an h-index of 4 and has co-authored 8 publications receiving 415 citations.

Papers
Journal ArticleDOI
25 Oct 2009
TL;DR: It is demonstrated that a practical type and effect system can simplify parallel programming by guaranteeing deterministic semantics with modular, compile-time type checking even in a rich, concurrent object-oriented language such as Java.
Abstract: Today's shared-memory parallel programming models are complex and error-prone. While many parallel programs are intended to be deterministic, unanticipated thread interleavings can lead to subtle bugs and nondeterministic semantics. In this paper, we demonstrate that a practical type and effect system can simplify parallel programming by guaranteeing deterministic semantics with modular, compile-time type checking even in a rich, concurrent object-oriented language such as Java. We describe an object-oriented type and effect system that provides several new capabilities over previous systems for expressing deterministic parallel algorithms. We also describe a language called Deterministic Parallel Java (DPJ) that incorporates the new type system features, and we show that a core subset of DPJ is sound. We describe an experimental validation showing that DPJ can express a wide range of realistic parallel programs; that the new type system features are useful for such programs; and that the parallel programs exhibit good performance gains (coming close to or beating equivalent, nondeterministic multithreaded programs where those are available).

318 citations

Journal ArticleDOI
26 Jan 2011
TL;DR: A language together with a type and effect system that supports nondeterministic computations with a deterministic-by-default guarantee is presented, along with a static semantics, dynamic semantics, and a complete proof of soundness for the language, both with and without the barrier removal feature.
Abstract: A number of deterministic parallel programming models with strong safety guarantees are emerging, but similar support for nondeterministic algorithms, such as branch and bound search, remains an open question. We present a language together with a type and effect system that supports nondeterministic computations with a deterministic-by-default guarantee: nondeterminism must be explicitly requested via special parallel constructs (marked nd), and any deterministic construct that does not execute any nd construct has deterministic input-output behavior. Moreover, deterministic parallel constructs are always equivalent to a sequential composition of their constituent tasks, even if they enclose, or are enclosed by, nd constructs. Finally, in the execution of nd constructs, interference may occur only between pairs of accesses guarded by atomic statements, so there are no data races, either between atomic statements and unguarded accesses (strong isolation) or between pairs of unguarded accesses (stronger than strong isolation alone). We enforce the guarantees at compile time with modular checking using novel extensions to a previously described effect system. Our effect system extensions also enable the compiler to remove unnecessary transactional synchronization. We provide a static semantics, dynamic semantics, and a complete proof of soundness for the language, both with and without the barrier removal feature. An experimental evaluation shows that our language can achieve good scalability for realistic parallel algorithms, and that the barrier removal techniques provide significant performance gains.

76 citations

Proceedings ArticleDOI
23 Feb 2013
TL;DR: A new concurrent programming model based on tasks with effects that offers strong safety guarantees while still providing the flexibility needed to support the many ways that concurrency is used in complex applications is proposed.
Abstract: Today's widely-used concurrent programming models either provide weak safety guarantees, making it easy to write code with subtle errors, or are limited in the class of programs that they can express. We propose a new concurrent programming model based on tasks with effects that offers strong safety guarantees while still providing the flexibility needed to support the many ways that concurrency is used in complex applications. The core unit of work in our model is a dynamically-created task. The model's key feature is that each task has programmer-specified effects, and a run-time scheduler is used to ensure that two tasks are run concurrently only if they have non-interfering effects. Through the combination of statically verifying the declared effects of tasks and using an effect-aware run-time scheduler, our model is able to guarantee strong safety properties, including data race freedom and atomicity. It is also possible to use our model to write programs and computations that can be statically proven to behave deterministically. We describe the tasks with effects programming model and provide a formal dynamic semantics for it. We also describe our implementation of this model in an extended version of Java and evaluate its use in several programs exhibiting various patterns of concurrency.

25 citations
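
The scheduling rule in the entry above (two tasks run concurrently only if their declared effects do not interfere) can be illustrated with a minimal sketch. This is not the authors' implementation, which is an extended version of Java; the `Effect` and `Task` classes, the flat string region names, and the batching strategy in `schedule` are all hypothetical simplifications chosen for illustration. Here two effects are assumed to interfere when they name the same region and at least one of them is a write.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Effect:
    kind: str    # "reads" or "writes" (hypothetical encoding)
    region: str  # flat region name; real region systems are richer


@dataclass(frozen=True)
class Task:
    name: str
    effects: FrozenSet[Effect]


def effects_interfere(a: Effect, b: Effect) -> bool:
    """Assumed rule: same region and at least one of the two is a write."""
    return a.region == b.region and "writes" in (a.kind, b.kind)


def tasks_interfere(t1: Task, t2: Task) -> bool:
    """Two tasks interfere if any pair of their declared effects does."""
    return any(effects_interfere(a, b) for a in t1.effects for b in t2.effects)


def schedule(tasks: List[Task]) -> List[List[Task]]:
    """Group tasks into batches whose members are pairwise non-interfering.

    Each task is placed in the batch right after the last batch that holds
    a task it interferes with, so interfering tasks keep submission order.
    """
    batches: List[List[Task]] = []
    for task in tasks:
        last_conflict = -1
        for i, batch in enumerate(batches):
            if any(tasks_interfere(task, other) for other in batch):
                last_conflict = i
        target = last_conflict + 1
        if target == len(batches):
            batches.append([])
        batches[target].append(task)
    return batches
```

In this sketch, a task that writes region `X` and a task that reads region `X` land in different batches, while two readers of `X` may share a batch. A real effect-aware scheduler would make this decision dynamically as tasks are created rather than over a fixed list.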

Proceedings ArticleDOI
09 Nov 2015
TL;DR: This paper infers annotations inspired by Deterministic Parallel Java (DPJ), which give strong safety guarantees, for a type-safe subset of C++; it expresses the inference as a constraint satisfaction problem and develops, implements, and evaluates an algorithm for solving it.
Abstract: In this paper, we present the first full regions-and-effects inference algorithm for explicitly parallel fork-join programs. We infer annotations inspired by Deterministic Parallel Java (DPJ) for a type-safe subset of C++. We chose the DPJ annotations because they give the strongest safety guarantees of any existing concurrency-checking approach we know of, static or dynamic, and they also form the most expressive static checking system we know of that gives strong safety guarantees. This expressiveness, however, makes manual annotation difficult and tedious, which motivates the need for automatic inference, but it also makes the inference problem very challenging: the code may use region polymorphism, imperative updates with complex aliasing, arbitrary recursion, hierarchical region specifications, and wildcard elements to describe potentially infinite sets of regions. We express the inference as a constraint satisfaction problem and develop, implement, and evaluate an algorithm for solving it. The region and effect annotations inferred by the algorithm constitute a checkable proof of safe parallelism, and they can be recorded both for documentation and for fast and modular safety checking.

4 citations

Proceedings ArticleDOI
07 Jun 2012
TL;DR: This work argues that a concurrent programming model should offer strong safety guarantees, while still providing the flexibility and performance needed to support the many ways that concurrency is used in complex, interactive applications.
Abstract: Concurrent programming has become ubiquitous, but today's widely-used concurrent programming models provide few safety guarantees, making it easy to write code with subtle errors. Models that do give strong guarantees often can only express a relatively limited class of programs. We argue that a concurrent programming model should offer strong safety guarantees, while still providing the flexibility and performance needed to support the many ways that concurrency is used in complex, interactive applications. To achieve this, we propose a new programming model based on tasks with effects. In this model, the core unit of work is a dynamically-created task. The key feature of our model is that each task has programmer-specified, statically-checked effects, and a runtime scheduler is used to ensure that two tasks are run concurrently only if their effects are non-interfering. Our model guarantees strong safety properties, including data race freedom and a form of atomicity. We describe this programming model and its properties, and propose several research questions related to it.

2 citations


Cited by
Proceedings ArticleDOI
10 Nov 2012
TL;DR: A runtime system that dynamically extracts parallelism from Legion programs, using a distributed, parallel scheduling algorithm that identifies both independent tasks and nested parallelism.
Abstract: Modern parallel architectures have both heterogeneous processors and deep, complex memory hierarchies. We present Legion, a programming model and runtime system for achieving high performance on these machines. Legion is organized around logical regions, which express both locality and independence of program data, and tasks, functions that perform computations on regions. We describe a runtime system that dynamically extracts parallelism from Legion programs, using a distributed, parallel scheduling algorithm that identifies both independent tasks and nested parallelism. Legion also enables explicit, programmer controlled movement of data through the memory hierarchy and placement of tasks based on locality information via a novel mapping interface. We evaluate our Legion implementation on three applications: fluid-flow on a regular grid, a three-level AMR code solving a heat diffusion equation, and a circuit simulation.

500 citations

Proceedings ArticleDOI
23 Oct 2011
TL;DR: Experimental results show that Dthreads substantially outperforms a state-of-the-art deterministic runtime system, and for a majority of the benchmarks evaluated here, matches and occasionally exceeds the performance of pthreads.
Abstract: Multithreaded programming is notoriously difficult to get right. A key problem is non-determinism, which complicates debugging, testing, and reproducing errors. One way to simplify multithreaded programming is to enforce deterministic execution, but current deterministic systems for C/C++ are incomplete or impractical. These systems require program modification, do not ensure determinism in the presence of data races, do not work with general-purpose multithreaded programs, or run up to 8.4× slower than pthreads. This paper presents Dthreads, an efficient deterministic multithreading system for unmodified C/C++ applications that replaces the pthreads library. Dthreads enforces determinism in the face of data races and deadlocks. Dthreads works by exploding multithreaded applications into multiple processes, with private, copy-on-write mappings to shared memory. It uses standard virtual memory protection to track writes, and deterministically orders updates by each thread. By separating updates from different threads, Dthreads has the additional benefit of eliminating false sharing. Experimental results show that Dthreads substantially outperforms a state-of-the-art deterministic runtime system, and for a majority of the benchmarks evaluated here, matches and occasionally exceeds the performance of pthreads.

247 citations
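
The idea in the entry above of isolating each thread's writes and then applying them in a deterministic order can be sketched loosely as follows. This is only an analogy under stated assumptions, not how Dthreads itself works: Dthreads operates on unmodified C/C++ programs using processes and virtual memory protection, whereas this sketch uses a Python dictionary as the "shared memory", and the names `run_isolated` and `run_deterministically` are hypothetical. Each worker runs against a private copy, and the recorded writes are committed in worker-index order, so the result does not depend on thread timing.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Memory = Dict[str, int]
Worker = Callable[[Memory], None]


def run_isolated(snapshot: Memory, worker: Worker) -> Memory:
    """Run one worker on a private copy and return only the keys it changed."""
    private = dict(snapshot)
    worker(private)
    return {
        key: value
        for key, value in private.items()
        if key not in snapshot or snapshot[key] != value
    }


def run_deterministically(shared: Memory, workers: List[Worker]) -> Memory:
    """Run workers in parallel, then commit their writes in index order."""
    snapshot = dict(shared)
    with ThreadPoolExecutor() as pool:
        # pool.map yields results in submission order, not completion order.
        write_sets = list(pool.map(lambda w: run_isolated(snapshot, w), workers))
    result = dict(snapshot)
    for writes in write_sets:
        # Later workers win conflicts, regardless of which finished first.
        result.update(writes)
    return result
```

Because no worker ever sees another worker's uncommitted writes, a conflicting update to the same key is resolved by commit order alone. Deletions are ignored in this sketch to keep it short.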

Journal ArticleDOI
25 Oct 2009
TL;DR: This paper presents Grace, a software-only runtime system that eliminates concurrency errors for a class of multithreaded programs: those based on fork-join parallelism. Grace can achieve high scalability and performance while preventing concurrency errors.
Abstract: The shift from single to multiple core architectures means that programmers must write concurrent, multithreaded programs in order to increase application performance. Unfortunately, multithreaded applications are susceptible to numerous errors, including deadlocks, race conditions, atomicity violations, and order violations. These errors are notoriously difficult for programmers to debug. This paper presents Grace, a software-only runtime system that eliminates concurrency errors for a class of multithreaded programs: those based on fork-join parallelism. By turning threads into processes, leveraging virtual memory protection, and imposing a sequential commit protocol, Grace provides programmers with the appearance of deterministic, sequential execution, while taking advantage of available processing cores to run code concurrently and efficiently. Experimental results demonstrate Grace's effectiveness: with modest code changes across a suite of computationally-intensive benchmarks (1-16 lines), Grace can achieve high scalability and performance while preventing concurrency errors.

244 citations

Journal ArticleDOI
13 Mar 2010
TL;DR: This work develops a compiler and runtime system that runs arbitrary multithreaded C/C++ POSIX Threads programs deterministically but resorts to serialization rarely, for handling interthread communication and synchronization.
Abstract: The behavior of a multithreaded program does not depend only on its inputs. Scheduling, memory reordering, timing, and low-level hardware effects all introduce nondeterminism in the execution of multithreaded programs. This severely complicates many tasks, including debugging, testing, and automatic replication. In this work, we avoid these complications by eliminating their root cause: we develop a compiler and runtime system that runs arbitrary multithreaded C/C++ POSIX Threads programs deterministically. A trivial non-performant approach to providing determinism is simply deterministically serializing execution. Instead, we present a compiler and runtime infrastructure that ensures determinism but resorts to serialization rarely, for handling interthread communication and synchronization. We develop two basic approaches, both of which are largely dynamic with performance improved by some static compiler optimizations. First, an ownership-based approach detects interthread communication via an evolving table that tracks ownership of memory regions by threads. Second, a buffering approach uses versioned memory and employs a deterministic commit protocol to make changes visible to other threads. While buffering has larger single-threaded overhead than ownership, it tends to scale better (serializing less often). A hybrid system sometimes performs and scales better than either approach individually. Our implementation is based on the LLVM compiler infrastructure. It needs neither programmer annotations nor special hardware. Our empirical evaluation uses the PARSEC and SPLASH2 benchmarks and shows that our approach scales comparably to nondeterministic execution.

233 citations

Proceedings Article
30 Mar 2009
TL;DR: This work argues for a parallel programming model that is deterministic by default: deterministic behavior is guaranteed unless the programmer explicitly uses nondeterministic constructs. This goal is particularly challenging for modern object-oriented languages with expressive use of reference aliasing and updates to shared mutable state.
Abstract: In today's widely used parallel programming models, subtle programming errors can lead to unintended nondeterministic behavior and hard to catch bugs. In contrast, we argue for a parallel programming model that is deterministic by default: deterministic behavior is guaranteed unless the programmer explicitly uses nondeterministic constructs. This goal is particularly challenging for modern object-oriented languages with expressive use of reference aliasing and updates to shared mutable state. We propose a broad research agenda in support of this goal, and we describe some of our own work to further that agenda.

205 citations