Proceedings ArticleDOI

Concerto: A Program Parallelization, Orchestration and Distribution Infrastructure

TLDR
Concerto is a Parallelization, Orchestration and Distribution Framework, and is a component of the larger Program Transformation and Parallelization Solution, where the Distribution and Mapping process is entirely automated, requires no user directives and is based solely on Dependence and Flow analysis of the sequential program.
Abstract
The important step in Program Parallelization is identifying the pieces of the given program that can be run concurrently on separate processing elements. The parallel pieces, once identified, need to be hoisted and executed remotely, and the results combined. This is a complex process, usually referred to as Program Orchestration and Distribution, and the details are closely tied to the target architecture of the parallel machine. Program Distribution on a Shared Memory Parallel Computer is comparatively simpler: it involves structuring the parallel pieces as separate threads, with synchronization provided as needed, and scheduling the threads on the various processors. On the contrary, Program Orchestration on a Distributed Machine, such as a Cluster, is more involved and requires explicit message passing, with the help of Send and Receive primitives, to share variables between the parallel subprograms running on separate machines. Concerto is a Parallelization, Orchestration and Distribution Framework, and is a component of our larger Program Transformation and Parallelization Solution. The parallel architectures targeted include both Shared Memory Multicomputers and Distributed Memory Multicomputers; however, the focus of this paper is mainly on the class of Distributed Memory Parallel Machines. Here we look at the issues involved in Program Distribution and provide a high-level design of Concerto, our solution to the problem, along with the Program Parallelization and Distribution Algorithm. The majority of existing Program Distribution solutions require user annotations to identify parallel pieces of code and data, which can be a cumbersome process from the programmer's perspective. In Concerto, however, the Distribution and Mapping process is entirely automated, requires no user directives, and is based solely on Dependence and Flow analysis of the sequential program.
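The distributed-memory pattern the abstract describes can be illustrated with a minimal sketch. This is a hypothetical illustration, not Concerto's actual implementation: the iteration space of a loop is split into independent pieces, each piece is hoisted into a separate worker process, shared values are exchanged with explicit Send and Receive primitives (here, `multiprocessing.Pipe`), and the partial results are combined at the end.

```python
# Hypothetical sketch of Program Distribution on a distributed-memory model:
# parallel pieces run in separate processes and communicate only via
# explicit send/recv, standing in for a cluster's message-passing primitives.
from multiprocessing import Process, Pipe

def worker(conn):
    # Receive the piece of work (a sub-range of the loop) from the parent.
    chunk = conn.recv()
    # Execute the parallel piece: here, a simple sum of squares.
    partial = sum(x * x for x in chunk)
    # Send the partial result back; combining happens at the parent.
    conn.send(partial)
    conn.close()

def distribute(data, n_workers=2):
    # Split the iteration space into independent pieces.
    size = (len(data) + n_workers - 1) // n_workers
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    procs, parent_conns = [], []
    for chunk in chunks:
        parent, child = Pipe()
        p = Process(target=worker, args=(child,))
        p.start()
        parent.send(chunk)  # explicit Send of the data the piece needs
        procs.append(p)
        parent_conns.append(parent)
    # Receive and combine the partial results.
    total = sum(conn.recv() for conn in parent_conns)
    for p in procs:
        p.join()
    return total

if __name__ == "__main__":
    print(distribute(list(range(10))))  # sum of squares of 0..9
```

On a shared-memory machine the same pieces could instead be run as threads over a common address space, with the explicit send/recv replaced by synchronization primitives; the message-passing form above is what makes the distributed case more involved.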


Citations

Communication-Minimal Partitioning of Parallel Loops and Data Arrays for Cache-Coherent Distributed-Memory Multiprocessors

TL;DR: In this article, the authors describe a working compiler for cache-coherent distributed shared-memory multiprocessors that automatically partitions loops and data arrays to optimize locality of access.
Book ChapterDOI

Efficient Graph Algorithms for Mapping Tasks to Processors

TL;DR: In this paper, the authors present several efficient graph-based algorithms for mapping concurrent tasks to the individual processors of a parallel machine, a problem referred to here as Processor Task Mapping, and for finding effective solutions to it.
Book ChapterDOI

CALIPER: A Coarse Grain Parallel Performance Estimator and Predictor

TL;DR: After surveying the published literature and searching for similar commercial products, the authors did not find a comparable technology against which to assess the contributions of Caliper, and so they claim that, at the time of writing, Caliper is the only product of its kind.
Book ChapterDOI

Evaluation of Graph Algorithms for Mapping Tasks to Processors

TL;DR: The algorithms to solve this mapping problem, together with their complexity analyses, were presented in an earlier paper; here the algorithms are evaluated empirically using cost equations, with results gathered for different permutations of processor and task counts.
Journal ArticleDOI

Inherent Parallelism and Speedup Estimation of Sequential Programs

Sesha Kalyur, +1 more
TL;DR: Sesha Kalyur and Nagaraja G. S., "Inherent Parallelism and Speedup Estimation of Sequential Programs", Annals of Emerging Technologies in Computing (AETiC), Print ISSN: 2516-0281, Online ISSN: 2516-029X, pp. 62-77, Vol.
References
Book

Parallel Computer Architecture: A Hardware/Software Approach

TL;DR: This book explains the forces behind this convergence of shared-memory, message-passing, data parallel, and data-driven computing architectures and provides comprehensive discussions of parallel programming for high performance and of workload-driven evaluation, based on understanding hardware-software interactions.
Proceedings ArticleDOI

Active messages: a mechanism for integrated communication and computation

TL;DR: It is shown that active messages are sufficient to implement the dynamically scheduled languages for which message driven machines were designed and, with this mechanism, latency tolerance becomes a programming/compiling concern.
Book

Advanced Computer Architecture: Parallelism, Scalability, Programmability

Kai Hwang
TL;DR: This book deals with advanced computer architecture and parallel programming techniques and is suitable for use as a textbook in a one-semester graduate or senior course, offered by Computer Science, Computer Engineering, Electrical Engineering, or Industrial Engineering programs.
Proceedings Article

TreadMarks: distributed shared memory on standard workstations and operating systems

TL;DR: A performance evaluation of TreadMarks running on Ultrix using DECstation-5000/240's that are connected by a 100-Mbps switch-based ATM LAN and a 10-Mbps Ethernet supports the contention that, with suitable networking technology, DSM is a viable technique for parallel computation on clusters of workstations.
Journal ArticleDOI

Parallel Computer Architecture: A Hardware/Software Approach

TL;DR: The core section introduces a uniform model of one-way communication protocols and shows that the corresponding uniform one- way communication complexity is strongly related to the size of deterministic finite automata.