Author

Raghavan Raman

Bio: Raghavan Raman is an academic researcher from Rice University. The author has contributed to research in topics: Graph database & Programming paradigm. The author has an h-index of 12 and has co-authored 20 publications receiving 601 citations. Previous affiliations of Raghavan Raman include IBM & Oracle Corporation.

Papers
Proceedings ArticleDOI
23 May 2009
TL;DR: This paper introduces a new work-stealing scheduler with compiler support for async-finish task parallelism that can accommodate both work-first and help-first scheduling policies, and provides insights on scenarios in which the help-first policy yields better results than the work-first policy and vice versa.
Abstract: Multiple programming models are emerging to address an increased need for dynamic task parallelism in applications for multicore processors and shared-address-space parallel computing. Examples include OpenMP 3.0, Java Concurrency Utilities, Microsoft Task Parallel Library, Intel Thread Building Blocks, Cilk, X10, Chapel, and Fortress. Scheduling algorithms based on work stealing, as embodied in Cilk's implementation of dynamic spawn-sync parallelism, are gaining in popularity but also have inherent limitations. In this paper, we address the problem of efficient and scalable implementation of X10's async-finish task parallelism, which is more general than Cilk's spawn-sync parallelism. We introduce a new work-stealing scheduler with compiler support for async-finish task parallelism that can accommodate both work-first and help-first scheduling policies. Performance results on two different multicore SMP platforms show significant improvements due to our new work-stealing algorithm compared to the existing work-sharing scheduler for X10, and also provide insights on scenarios in which the help-first policy yields better results than the work-first policy and vice versa.
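As a rough illustration of the async-finish style of task parallelism discussed above, the sketch below expresses a divide-and-conquer computation on top of Java's ForkJoin framework, which also uses a work-stealing scheduler. It is a minimal sketch under stated assumptions, not the paper's X10 runtime: the FibTask class and the absence of a sequential cutoff are choices made only for brevity.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative sketch: async-finish style task parallelism expressed with Java's
// ForkJoin framework. fork() plays the role of async (spawn a stealable child task)
// and join() plays the role of finish (wait for the spawned child).
public class FibTask extends RecursiveTask<Long> {
    private final int n;

    FibTask(int n) { this.n = n; }

    @Override
    protected Long compute() {
        if (n < 2) {
            return (long) n;
        }
        FibTask left = new FibTask(n - 1);
        left.fork();                                  // child becomes available for stealing
        long right = new FibTask(n - 2).compute();    // parent keeps executing its own work
        return left.join() + right;                   // wait for the spawned child
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool();       // pool of work-stealing worker threads
        System.out.println(pool.invoke(new FibTask(30)));
    }
}
```

In this idiom the spawned child is made stealable while the parent continues its own work, which is closer in spirit to the help-first policy; under a work-first policy the worker would instead execute the spawned task immediately and leave the continuation to be stolen.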

174 citations

Proceedings ArticleDOI
11 Jun 2012
TL;DR: This work presents a new precise dynamic race detector that leverages structured parallelism to address limitations of existing dynamic race detectors; the algorithm requires constant space per memory location, works in parallel, and is efficient in practice.
Abstract: Existing dynamic race detectors suffer from at least one of the following three limitations: (i) space overhead per memory location grows linearly with the number of parallel threads [13], severely limiting the parallelism that the algorithm can handle; (ii) sequentialization: the parallel program must be processed in a sequential order, usually depth-first [12, 24]. This prevents the analysis from scaling with available hardware parallelism, inherently limiting its performance; (iii) inefficiency: even though race detectors with good theoretical complexity exist, they do not admit efficient implementations and are unsuitable for practical use [4, 18]. We present a new precise dynamic race detector that leverages structured parallelism in order to address these limitations. Our algorithm requires constant space per memory location, works in parallel, and is efficient in practice. We implemented and evaluated our algorithm on a set of 15 benchmarks. Our experimental results indicate an average (geometric mean) slowdown of 2.78x on a 16-core SMP system.
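To make the constant-space claim concrete, the following is a minimal sketch, under stated assumptions, of a shadow-memory check that keeps only a bounded record per monitored location instead of per-thread metadata; it is not the paper's actual algorithm. Task, ShadowRecord, mayHappenInParallel, and report are names invented for this sketch, and the may-happen-in-parallel oracle is a placeholder for the query a structured-parallelism detector would answer from the async/finish task structure.

```java
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: bounded shadow state per memory location (one writer slot and
// one reader slot), so space does not grow with the number of parallel workers.
final class Task {
    final int id;
    Task(int id) { this.id = id; }
}

final class ShadowRecord {
    volatile Task lastWriter;   // most recent writing task
    volatile Task lastReader;   // representative reading task
}

final class RaceChecker {
    private final ConcurrentHashMap<Long, ShadowRecord> shadow = new ConcurrentHashMap<>();

    void onRead(long address, Task reader) {
        ShadowRecord r = shadow.computeIfAbsent(address, a -> new ShadowRecord());
        Task w = r.lastWriter;
        if (w != null && mayHappenInParallel(w, reader)) {
            report("read-write race", address);
        }
        r.lastReader = reader;   // constant space: the single reader slot is overwritten
    }

    void onWrite(long address, Task writer) {
        ShadowRecord r = shadow.computeIfAbsent(address, a -> new ShadowRecord());
        if ((r.lastWriter != null && mayHappenInParallel(r.lastWriter, writer))
                || (r.lastReader != null && mayHappenInParallel(r.lastReader, writer))) {
            report("write race", address);
        }
        r.lastWriter = writer;
    }

    // Placeholder oracle: a real detector for structured parallelism would answer this
    // from the async-finish task tree rather than from vector clocks.
    private boolean mayHappenInParallel(Task a, Task b) {
        return a.id != b.id;
    }

    private void report(String kind, long address) {
        System.err.println("Potential " + kind + " at address " + address);
    }
}
```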

110 citations

Proceedings ArticleDOI
25 Oct 2009
TL;DR: The main components of Rice University's Habanero Multicore Software Research Project are described, which proposes a new approach to multicore software enablement based on a two-level programming model consisting of a higher-level coordination language for domain experts and a lower-level parallel language for programming experts.
Abstract: Multiple programming models are emerging to address an increased need for dynamic task parallelism in multicore shared-memory multiprocessors. This poster describes the main components of Rice University's Habanero Multicore Software Research Project, which proposes a new approach to multicore software enablement based on a two-level programming model consisting of a higher-level coordination language for domain experts and a lower-level parallel language for programming experts.

84 citations

Book ChapterDOI
Raghavan Raman, Jisheng Zhao, Vivek Sarkar, Martin Vechev, Eran Yahav
01 Nov 2010
TL;DR: An efficient dynamic race detector algorithm targeting the async-finish task-parallel programming model, whose async and finish constructs generalize the spawn-sync constructs used in Cilk while still ensuring that all computation graphs are deadlock-free.
Abstract: A major productivity hurdle for parallel programming is the presence of data races. Data races can lead to all kinds of harmful program behaviors, including determinism violations and corrupted memory. However, runtime overheads of current dynamic data race detectors are still prohibitively large (often incurring slowdowns of 10× or larger) for use in mainstream software development. In this paper, we present an efficient dynamic race detector algorithm targeting the async-finish task-parallel programming model. The async and finish constructs are at the core of languages such as X10 and Habanero Java (HJ). These constructs generalize the spawn-sync constructs used in Cilk, while still ensuring that all computation graphs are deadlock-free. We have implemented our algorithm in a tool called TASKCHECKER and evaluated it on a suite of 12 benchmarks. To reduce overhead of the dynamic analysis, we have also implemented various static optimizations in the tool. Our experimental results indicate that our approach performs well in practice, incurring an average slowdown of 3.05× compared to a serial execution in the optimized case.
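For readers unfamiliar with the constructs, the sketch below emulates async and finish on plain Java threads to show what it means for a finish scope to wait for all transitively spawned tasks, which is how these constructs generalize Cilk's spawn-sync. FinishScope, async, and the Phaser-based bookkeeping are assumptions made only for illustration; HJ and X10 provide async and finish as language constructs, not as a library like this.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;

// Illustrative sketch: a finish scope that waits for every task transitively spawned
// inside it, emulated with a Phaser. Each async registers with the enclosing scope
// before it starts and deregisters when it completes.
final class FinishScope {
    private static final ExecutorService POOL = Executors.newWorkStealingPool();
    private final Phaser phaser = new Phaser(1);      // one registration for the scope itself

    // async: spawn a child task that the enclosing finish will wait for; registration
    // happens on the spawning thread, so nested asyncs are always accounted for.
    void async(Runnable body) {
        phaser.register();
        POOL.submit(() -> {
            try {
                body.run();
            } finally {
                phaser.arriveAndDeregister();
            }
        });
    }

    // finish: block until every registered task in this scope has completed.
    void awaitAll() {
        phaser.arriveAndAwaitAdvance();
    }

    public static void main(String[] args) {
        FinishScope finish = new FinishScope();
        finish.async(() -> {
            System.out.println("outer task");
            finish.async(() -> System.out.println("nested task"));   // same finish scope
        });
        finish.awaitAll();                            // returns only after both tasks finish
        System.out.println("finish scope complete");
    }
}
```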

74 citations

Book ChapterDOI
14 Sep 2010
TL;DR: The main idea is to leverage the structure of the program to reduce determinism verification to an independence property that can be proved using a simple sequential analysis.
Abstract: We present a static analysis for automatically verifying determinism of structured parallel programs. The main idea is to leverage the structure of the program to reduce determinism verification to an independence property that can be proved using a simple sequential analysis. Given a task-parallel program, we identify program fragments that may execute in parallel and check that these fragments perform independent memory accesses using a sequential analysis. Since the parts that can execute in parallel are typically only a small fraction of the program, we can employ powerful numerical abstractions to establish that tasks executing in parallel only perform independent memory accesses. We have implemented our analysis in a tool called DICE and successfully applied it to verify determinism on a suite of benchmarks derived from those used in the high-performance computing community.
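A minimal sketch of the independence property such an analysis targets, using Java parallel streams as a stand-in for the task-parallel constructs; the array example and the rejected variant in the comment are invented here for illustration only.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Illustrative sketch: iterations that only touch memory through their own index are
// independent, so a sequential analysis of one iteration suffices to conclude that
// every parallel schedule produces the same result.
public class IndependenceExample {
    public static void main(String[] args) {
        int n = 8;
        double[] a = new double[n];

        // Deterministic: iteration i writes only a[i]; accesses of distinct iterations
        // are disjoint, so the parallel loop is verifiable as deterministic.
        IntStream.range(0, n).parallel().forEach(i -> a[i] = Math.sqrt(i));

        // Not verifiable: the variant below also writes a[(i + 1) % n], so two
        // iterations may write the same element and the result depends on scheduling.
        // IntStream.range(0, n).parallel().forEach(i -> a[(i + 1) % n] = a[i] + 1.0);

        System.out.println(Arrays.toString(a));
    }
}
```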

33 citations


Cited by

Proceedings ArticleDOI
23 Feb 2013
TL;DR: This paper presents a lightweight graph processing framework specific to shared-memory parallel/multicore machines that makes graph traversal algorithms easy to write, and shows that the resulting algorithms are significantly more efficient than previously reported results using graph frameworks on machines with many more cores.
Abstract: There has been significant recent interest in parallel frameworks for processing graphs due to their applicability in studying social networks, the Web graph, networks in biology, and unstructured meshes in scientific simulation. Due to the desire to process large graphs, these systems have emphasized the ability to run on distributed memory machines. Today, however, a single multicore server can support more than a terabyte of memory, which can fit graphs with tens or even hundreds of billions of edges. Furthermore, for graph algorithms, shared-memory multicores are generally significantly more efficient on a per core, per dollar, and per joule basis than distributed memory systems, and shared-memory algorithms tend to be simpler than their distributed counterparts. In this paper, we present a lightweight graph processing framework that is specific for shared-memory parallel/multicore machines, which makes graph traversal algorithms easy to write. The framework has two very simple routines, one for mapping over edges and one for mapping over vertices. Our routines can be applied to any subset of the vertices, which makes the framework useful for many graph traversal algorithms that operate on subsets of the vertices. Based on recent ideas used in a very fast algorithm for breadth-first search (BFS), our routines automatically adapt to the density of vertex sets. We implement several algorithms in this framework, including BFS, graph radii estimation, graph connectivity, betweenness centrality, PageRank and single-source shortest paths. Our algorithms expressed using this framework are very simple and concise, and perform almost as well as highly optimized code. Furthermore, they get good speedups on a 40-core machine and are significantly more efficient than previously reported results using graph frameworks on machines with many more cores.
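A minimal sketch, under stated assumptions, of the frontier-centric style that an edge-mapping routine supports, written against a plain adjacency list. The edgeMap helper, the parent array, and the compare-and-set claim below are simplified stand-ins invented for this sketch rather than the framework's actual API, and the real system additionally switches between sparse and dense frontier representations depending on frontier size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicIntegerArray;

// Illustrative sketch: level-synchronous BFS driven by an edge-map over the frontier.
public class BfsSketch {
    // edgeMap: visit every edge (u, v) with u in the frontier; v joins the next
    // frontier if this is the first time it is claimed.
    static List<Integer> edgeMap(int[][] adj, List<Integer> frontier, AtomicIntegerArray parent) {
        ConcurrentLinkedQueue<Integer> next = new ConcurrentLinkedQueue<>();
        frontier.parallelStream().forEach(u -> {
            for (int v : adj[u]) {
                if (parent.compareAndSet(v, -1, u)) {   // claim v exactly once
                    next.add(v);
                }
            }
        });
        return new ArrayList<>(next);
    }

    public static void main(String[] args) {
        int[][] adj = { {1, 2}, {0, 3}, {0, 3}, {1, 2} };   // small example graph
        AtomicIntegerArray parent = new AtomicIntegerArray(adj.length);
        for (int i = 0; i < adj.length; i++) parent.set(i, -1);

        int source = 0;
        parent.set(source, source);
        List<Integer> frontier = List.of(source);
        while (!frontier.isEmpty()) {
            frontier = edgeMap(adj, frontier, parent);       // expand one BFS level
        }
        System.out.println("BFS parents: " + parent);
    }
}
```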

816 citations

Proceedings ArticleDOI
14 Nov 2009
TL;DR: This work investigates the design and scalability of work stealing on modern distributed memory systems and demonstrates high efficiency and low overhead when scaling to 8,192 processors for three benchmark codes: a producer-consumer benchmark, the unbalanced tree search benchmark, and a multiresolution analysis kernel.
Abstract: Irregular and dynamic parallel applications pose significant challenges to achieving scalable performance on large-scale multicore clusters. These applications often require ongoing, dynamic load balancing in order to maintain efficiency. Scalable dynamic load balancing on large clusters is a challenging problem which can be addressed with distributed dynamic load balancing systems. Work stealing is a popular approach to distributed dynamic load balancing; however its performance on large-scale clusters is not well understood. Prior work on work stealing has largely focused on shared memory machines. In this work we investigate the design and scalability of work stealing on modern distributed memory systems. We demonstrate high efficiency and low overhead when scaling to 8,192 processors for three benchmark codes: a producer-consumer benchmark, the unbalanced tree search benchmark, and a multiresolution analysis kernel.
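To make the stealing mechanism concrete, here is a shared-memory sketch of the basic worker loop; the paper's setting is distributed memory, where a steal becomes an operation on a remote node's queue rather than a local deque access. The deque layout, random victim selection, and idle-round termination below are simplifying assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch: each worker pops local work from one end of its own deque and,
// when idle, steals from the opposite end of a randomly chosen victim's deque.
public class StealLoopSketch {
    static void workerLoop(int myId, List<ConcurrentLinkedDeque<Runnable>> deques) {
        ConcurrentLinkedDeque<Runnable> mine = deques.get(myId);
        int idleRounds = 0;
        while (idleRounds < 1000) {                          // crude termination heuristic
            Runnable task = mine.pollLast();                 // local work, LIFO order
            if (task == null) {
                int victim = ThreadLocalRandom.current().nextInt(deques.size());
                if (victim != myId) {
                    task = deques.get(victim).pollFirst();   // steal from the victim, FIFO order
                }
            }
            if (task != null) {
                task.run();
                idleRounds = 0;
            } else {
                idleRounds++;                                // back off when no work is found
                Thread.onSpinWait();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int workers = 4;
        List<ConcurrentLinkedDeque<Runnable>> deques = new ArrayList<>();
        for (int i = 0; i < workers; i++) deques.add(new ConcurrentLinkedDeque<>());
        for (int i = 0; i < 100; i++) {                      // all work starts on worker 0
            final int taskId = i;
            deques.get(0).addLast(() -> System.out.println("task " + taskId));
        }
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final int id = i;
            threads[i] = new Thread(() -> workerLoop(id, deques));
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
    }
}
```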

286 citations

Proceedings ArticleDOI
02 Jul 2018
TL;DR: An ITiCSE working group conducted a systematic review of the introductory programming literature to explore trends, highlight advances in knowledge over the past 15 years, and indicate possible directions for future research.
Abstract: As computing becomes a mainstream discipline embedded in the school curriculum and acts as an enabler for an increasing range of academic disciplines in higher education, the literature on introductory programming is growing. Although there have been several reviews that focus on specific aspects of introductory programming, there has been no broad overview of the literature exploring recent trends across the breadth of introductory programming. This paper is the report of an ITiCSE working group that conducted a systematic review in order to gain an overview of the introductory programming literature. Partitioning the literature into papers addressing the student, teaching, the curriculum, and assessment, we explore trends, highlight advances in knowledge over the past 15 years, and indicate possible directions for future research.

282 citations

Proceedings ArticleDOI
27 May 2018
TL;DR: This paper surveys a class of approaches, namely program repair techniques, whose key idea is to automatically repair software systems by producing an actual fix that can be validated by testers before it is finally accepted, or that is adapted to properly fit the system.
Abstract: Despite their growing complexity and increasing size, modern software applications must satisfy strict release requirements that impose short bug fixing and maintenance cycles, putting significant pressure on developers who are responsible for producing high-quality software in a timely manner. To reduce developers' workload, repairing and healing techniques have been extensively investigated in recent years as solutions for efficiently repairing and maintaining software. In particular, repairing solutions have been able to automatically produce useful fixes for several classes of bugs that might be present in software programs. A range of algorithms, techniques, and heuristics have been integrated, experimented with, and studied, producing a heterogeneous and articulated research framework where automatic repair techniques are proliferating. This paper organizes the knowledge in the area by surveying a body of 108 papers about automatic software repair techniques, illustrating the algorithms and the approaches, comparing them on representative examples, and discussing the open challenges and the empirical evidence reported so far.
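As a rough illustration of the generate-and-validate approach common to many of the surveyed techniques, here is a minimal sketch of the repair loop; the repair signature, the candidate generator, and the toy integer "program" in main are all invented for this sketch, whereas real tools mutate source code or ASTs and use the project's test suite as the validator.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;

// Illustrative sketch: generate candidate patches for a buggy program and keep the
// first one that passes the validation oracle (typically the project's test suite).
public class GenerateAndValidate {
    static <P> Optional<P> repair(P buggyProgram,
                                  Function<P, List<P>> generateCandidates,
                                  Predicate<P> passesAllTests,
                                  int maxRounds) {
        for (int round = 0; round < maxRounds; round++) {
            // the generator is assumed stochastic (e.g., random mutations of suspicious
            // statements), so each round can explore a different batch of candidates
            for (P candidate : generateCandidates.apply(buggyProgram)) {
                if (passesAllTests.test(candidate)) {
                    return Optional.of(candidate);   // a plausible fix, handed to testers for review
                }
            }
        }
        return Optional.empty();                     // no candidate validated within the budget
    }

    public static void main(String[] args) {
        // toy usage: the "program" is an integer constant with an off-by-two bug, the
        // candidates are +/-2 mutations, and the "test suite" checks the expected value
        Optional<Integer> fix = repair(40, p -> List.of(p - 2, p + 2), p -> p == 42, 5);
        System.out.println(fix.map(v -> "fixed: " + v).orElse("no fix found"));
    }
}
```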

256 citations