Proceedings ArticleDOI

Concerto: A Program Parallelization, Orchestration and Distribution Infrastructure

TLDR
Concerto is a Parallelization, Orchestration and Distribution Framework, and is a component of the larger Program Transformation and Parallelization Solution, where the Distribution and Mapping process is entirely automated, requires no user directives and is based solely on Dependence and Flow analysis of the sequential program.
Abstract
The important step in Program Parallelization is identifying the pieces of the given program that can be run concurrently on separate processing elements. The parallel pieces, once identified, need to be hoisted and executed remotely, and the results combined. This is a complex process, usually referred to as Program Orchestration and Distribution, and the details are closely tied to the target architecture of the parallel machine. Program Distribution on a Shared Memory Parallel Computer is comparatively simpler: it involves structuring the parallel pieces as separate threads, with synchronization provided as needed, and scheduling the threads on the various processors. On the contrary, Program Orchestration on a Distributed Machine, such as a Cluster, is more involved and requires explicit message passing, with the help of Send and Receive primitives, to share variables between the parallel subprograms running on separate machines. Concerto is a Parallelization, Orchestration and Distribution Framework, and is a component of our larger Program Transformation and Parallelization Solution. The parallel architectures targeted include both Shared Memory Multicomputers and Distributed Memory Multicomputers; however, the focus of this paper is mainly on the class of Distributed Memory Parallel Machines. Here we look at the issues involved in Program Distribution and provide a high-level design of Concerto, our solution to the problem, along with the Program Parallelization and Distribution Algorithm. The majority of existing Program Distribution solutions require user annotations to identify parallel pieces of code and data, which can be a cumbersome process from the programmer's perspective. In Concerto, however, the Distribution and Mapping process is entirely automated, requires no user directives, and is based solely on Dependence and Flow analysis of the sequential program.
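The distributed-memory pattern the abstract describes can be illustrated with a minimal sketch. This is a hypothetical illustration, not Concerto's actual implementation: the iteration space of a loop is split into independent pieces, each piece is hoisted into a separate worker process, shared values are exchanged with explicit Send and Receive primitives (here, `multiprocessing.Pipe`), and the partial results are combined at the end.

```python
# Hypothetical sketch of Program Distribution on a distributed-memory model:
# parallel pieces run in separate processes and communicate only via
# explicit send/recv, standing in for a cluster's message-passing primitives.
from multiprocessing import Process, Pipe

def worker(conn):
    # Receive the piece of work (a sub-range of the loop) from the parent.
    chunk = conn.recv()
    # Execute the parallel piece: here, a simple sum of squares.
    partial = sum(x * x for x in chunk)
    # Send the partial result back; combining happens at the parent.
    conn.send(partial)
    conn.close()

def distribute(data, n_workers=2):
    # Split the iteration space into independent pieces.
    size = (len(data) + n_workers - 1) // n_workers
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    procs, parent_conns = [], []
    for chunk in chunks:
        parent, child = Pipe()
        p = Process(target=worker, args=(child,))
        p.start()
        parent.send(chunk)  # explicit Send of the data the piece needs
        procs.append(p)
        parent_conns.append(parent)
    # Receive and combine the partial results.
    total = sum(conn.recv() for conn in parent_conns)
    for p in procs:
        p.join()
    return total

if __name__ == "__main__":
    print(distribute(list(range(10))))  # sum of squares of 0..9
```

On a shared-memory machine the same pieces could instead be run as threads over a common address space, with the explicit send/recv replaced by synchronization primitives; the message-passing form above is what makes the distributed case more involved.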


Citations

Communication-Minimal Partitioning of Parallel Loops and Data Arrays for Cache-Coherent Distributed-Memory Multiprocessors

TL;DR: In this article, the authors describe a working compiler for cache-coherent distributed shared-memory multiprocessors that automatically partitions loops and data arrays to optimize locality of access.
Book ChapterDOI

Efficient Graph Algorithms for Mapping Tasks to Processors

TL;DR: In this paper, the authors present several efficient graph-based algorithms for mapping concurrent tasks to the individual processors of a parallel machine, a problem referred to here as Processor Task Mapping, and for finding effective solutions to it.
Book ChapterDOI

CALIPER: A Coarse Grain Parallel Performance Estimator and Predictor

TL;DR: After surveying the published literature and searching for similar commercial products, the authors did not find a comparable technology against which to assess the contributions of Caliper, and so they claim that, at the time of writing, Caliper is the only product of its kind.
Book ChapterDOI

Evaluation of Graph Algorithms for Mapping Tasks to Processors

TL;DR: The algorithms to solve this mapping problem, together with their complexity analyses, were presented in an earlier paper; here the algorithms are evaluated empirically using cost equations, with results gathered for different permutations of processor and task counts.
Journal ArticleDOI

Inherent Parallelism and Speedup Estimation of Sequential Programs

Sesha Kalyur, +1 more
TL;DR: Sesha Kalyur and Nagaraja G. S., "Inherent Parallelism and Speedup Estimation of Sequential Programs", Annals of Emerging Technologies in Computing (AETiC), Print ISSN: 2516-0281, Online ISSN: 2516-029X, pp. 62-77, Vol.
References
Book

Parallel Computer Architecture: A Hardware/Software Approach

TL;DR: This book explains the forces behind this convergence of shared-memory, message-passing, data parallel, and data-driven computing architectures and provides comprehensive discussions of parallel programming for high performance and of workload-driven evaluation, based on understanding hardware-software interactions.
Proceedings ArticleDOI

Active messages: a mechanism for integrated communication and computation

TL;DR: It is shown that active messages are sufficient to implement the dynamically scheduled languages for which message driven machines were designed and, with this mechanism, latency tolerance becomes a programming/compiling concern.
Book

Advanced Computer Architecture: Parallelism, Scalability, Programmability

Kai Hwang
TL;DR: This book deals with advanced computer architecture and parallel programming techniques and is suitable for use as a textbook in a one-semester graduate or senior course, offered by Computer Science, Computer Engineering, Electrical Engineering, or Industrial Engineering programs.
Proceedings Article

TreadMarks: distributed shared memory on standard workstations and operating systems

TL;DR: A performance evaluation of TreadMarks running on Ultrix using DECstation-5000/240's that are connected by a 100-Mbps switch-based ATM LAN and a 10-Mbps Ethernet supports the contention that, with suitable networking technology, DSM is a viable technique for parallel computation on clusters of workstations.
Journal ArticleDOI

Parallel Computer Architecture: A Hardware/Software Approach

TL;DR: The core section introduces a uniform model of one-way communication protocols and shows that the corresponding uniform one- way communication complexity is strongly related to the size of deterministic finite automata.