Topic

SPMD

About: SPMD (also known as "single program, multiple data") is a research topic. Over its lifetime, 928 publications have been published within this topic, receiving 12,882 citations.


Papers
Proceedings ArticleDOI
18 Feb 2009
TL;DR: It is shown that machine topology has a significant impact on performance for all parallelization strategies and that topology awareness should be built into all applications in the future.
Abstract: Today most systems in high-performance computing (HPC) feature a hierarchical hardware design: Shared memory nodes with several multi-core CPUs are connected via a network infrastructure. Parallel programming must combine distributed memory parallelization on the node interconnect with shared memory parallelization inside each node. We describe potentials and challenges of the dominant programming models on hierarchically structured hardware: Pure MPI (Message Passing Interface), pure OpenMP (with distributed shared memory extensions) and hybrid MPI+OpenMP in several flavors. We pinpoint cases where a hybrid programming model can indeed be the superior solution because of reduced communication needs and memory consumption, or improved load balance. Furthermore we show that machine topology has a significant impact on performance for all parallelization strategies and that topology awareness should be built into all applications in the future. Finally we give an outlook on possible standardization goals and extensions that could make hybrid programming easier to do with performance in mind.
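The hybrid pattern the paper evaluates can be made concrete with a short sketch: MPI handles communication between nodes while OpenMP threads share memory within each node. This is a minimal illustration under assumptions of our own (the sum kernel, the size N, and the block decomposition are invented here, not taken from the paper):

```c
/* Minimal hybrid MPI+OpenMP sketch (illustrative; not code from the paper).
 * Build, e.g.: mpicc -fopenmp hybrid.c -o hybrid */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

#define N 1000000L

int main(int argc, char **argv) {
    int provided, rank, size;

    /* MPI_THREAD_FUNNELED: only the master thread will make MPI calls. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Distributed-memory level: each rank owns one block of the range. */
    long lo = rank * N / size, hi = (rank + 1) * N / size;
    double local = 0.0, global = 0.0;

    /* Shared-memory level: OpenMP threads split the rank's block. */
    #pragma omp parallel for reduction(+:local)
    for (long i = lo; i < hi; i++)
        local += 1.0 / (double)(i + 1);

    /* Inter-node communication happens only at the MPI level. */
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("harmonic sum = %.6f\n", global);

    MPI_Finalize();
    return 0;
}
```

The MPI_THREAD_FUNNELED level matches the common hybrid style in which only the master thread of each process performs MPI calls, one of the flavors the paper compares.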

446 citations

Journal ArticleDOI
TL;DR: This work discusses the main additions to Java (immutable classes, multidimensional arrays, an explicitly parallel SPMD model of computation with a global address space, and zone-based memory management) and reports progress on the development of Titanium.
Abstract: Titanium is a language and system for high-performance parallel scientific computing. Titanium uses Java as its base, thereby leveraging the advantages of that language and allowing us to focus attention on parallel computing issues. The main additions to Java are immutable classes, multidimensional arrays, an explicitly parallel SPMD model of computation with a global address space, and zone-based memory management. We discuss these features and our design approach, and report progress on the development of Titanium, including our current driving application: a three-dimensional adaptive mesh refinement parallel Poisson solver. © 1998 John Wiley & Sons, Ltd.
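Titanium expresses the SPMD model in Java, but the model itself is language-independent: every process executes the same program text and diverges only on its process id. Below is a generic sketch of that style in C with MPI; it is not Titanium code, and the analogy to Titanium's Ti.thisProc()/Ti.numProcs() is ours:

```c
/* Generic SPMD sketch in C/MPI. Every process runs this same program;
 * behavior diverges only on the process id. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int me, np;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &me);   /* analogous to Ti.thisProc() */
    MPI_Comm_size(MPI_COMM_WORLD, &np);   /* analogous to Ti.numProcs() */

    /* All processes reach the same textual barrier, the property that
     * explicitly parallel SPMD languages can analyze statically. */
    MPI_Barrier(MPI_COMM_WORLD);

    if (me == 0)                          /* role chosen by process id */
        printf("running on %d processes\n", np);

    MPI_Finalize();
    return 0;
}
```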

433 citations

Proceedings ArticleDOI
16 Nov 2002
TL;DR: This work presents MPICH-V, an automatic Volatility tolerant MPI environment based on uncoordinated checkpoint/roll-back and distributed message logging, and presents a detailed performance evaluation of every component and its global performance for non-trivial parallel applications.
Abstract: Global Computing platforms, large scale clusters and future TeraGRID systems gather thousands of nodes for computing parallel scientific applications. At this scale, node failures or disconnections are frequent events. This volatility reduces the MTBF of the whole system to the range of hours or minutes. We present MPICH-V, an automatic Volatility tolerant MPI environment based on uncoordinated checkpoint/roll-back and distributed message logging. MPICH-V architecture relies on Channel Memories, Checkpoint servers and theoretically proven protocols to execute existing or new, SPMD and Master-Worker MPI applications on volatile nodes. To evaluate its capabilities, we run MPICH-V within a framework for which the number of nodes, Channel Memories and Checkpoint Servers can be completely configured, as well as the node volatility. We present a detailed performance evaluation of every component of MPICH-V and its global performance for non-trivial parallel applications. Experimental results demonstrate good scalability and high tolerance to node volatility.
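MPICH-V performs checkpointing transparently inside the MPI runtime; the toy program below only illustrates the application-level shape of the uncoordinated checkpoint/roll-back idea. Everything here (the State layout, the ckpt.<rank> file naming, the checkpoint interval) is an invented example, not MPICH-V's mechanism:

```c
/* Toy uncoordinated checkpoint/roll-back, sketched at application level.
 * Each process saves its own state independently; after a failure it
 * restarts from its last checkpoint. All names here are illustrative. */
#include <stdio.h>

typedef struct { long iter; double acc; } State;

/* Each process writes its own checkpoint, with no coordination. */
static void checkpoint(int rank, const State *s) {
    char path[64];
    snprintf(path, sizeof path, "ckpt.%d", rank);
    FILE *f = fopen(path, "wb");
    if (f) { fwrite(s, sizeof *s, 1, f); fclose(f); }
}

/* On restart, roll back to the last saved state if one exists. */
static int restore(int rank, State *s) {
    char path[64];
    snprintf(path, sizeof path, "ckpt.%d", rank);
    FILE *f = fopen(path, "rb");
    if (!f) return 0;
    int ok = fread(s, sizeof *s, 1, f) == 1;
    fclose(f);
    return ok;
}

int main(void) {
    int rank = 0;                        /* stand-in for an MPI rank */
    State s = {0, 0.0};                  /* fresh-start state */
    restore(rank, &s);                   /* roll back if a checkpoint exists */

    for (; s.iter < 1000000; s.iter++) {
        if (s.iter % 100000 == 0)
            checkpoint(rank, &s);        /* save before this chunk's work */
        s.acc += 1.0 / (double)(s.iter + 1);
    }
    printf("acc = %f\n", s.acc);
    return 0;
}
```

In MPICH-V itself, the message logging kept in Channel Memories is what makes such uncoordinated checkpoints consistent: messages in flight at the moment of a failure can be replayed after roll-back, so processes never need to checkpoint in lockstep.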

323 citations

Journal ArticleDOI
TL;DR: An execution model for supporting programs that use pointer-based dynamic data structures is described that uses a simple mechanism for migrating a thread of control based on the layout of heap-allocated data and introduces parallelism using a technique based on futures and lazy task creation.
Abstract: Compiling for distributed-memory machines has been a very active research area in recent years. Much of this work has concentrated on programs that use arrays as their primary data structures. To date, little work has been done to address the problem of supporting programs that use pointer-based dynamic data structures. The techniques developed for supporting SPMD execution of array-based programs rely on the fact that arrays are statically defined and directly addressable. Recursive data structures do not have these properties, so new techniques must be developed. In this article, we describe an execution model for supporting programs that use pointer-based dynamic data structures. This model uses a simple mechanism for migrating a thread of control based on the layout of heap-allocated data and introduces parallelism using a technique based on futures and lazy task creation. We intend to exploit this execution model using compiler analyses and automatic parallelization techniques. We have implemented a prototype system, which we call Olden, that runs on the Intel iPSC/860 and the Thinking Machines CM-5. We discuss our implementation and report on experiments with five benchmarks.
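In Olden the compiler introduces futures automatically; the sketch below hand-codes the same building block in C to show the idea: creating a future defers a computation, and forcing it runs the deferred work at most once. The names (future_t, make_future, force) and the representation are illustrative, not Olden's:

```c
/* Hand-coded sketch of a future. Olden's compiler inserts these
 * automatically; this single-threaded toy only shows the deferral. */
#include <stdio.h>

typedef struct {
    int ready;               /* has the value been computed yet? */
    long value;
    long (*thunk)(long);     /* the deferred computation */
    long arg;
} future_t;

static future_t make_future(long (*fn)(long), long arg) {
    future_t f = {0, 0, fn, arg};
    return f;                /* lazy: nothing is computed yet */
}

/* Forcing a future runs the deferred work at most once. */
static long force(future_t *f) {
    if (!f->ready) { f->value = f->thunk(f->arg); f->ready = 1; }
    return f->value;
}

static long square(long x) { return x * x; }

int main(void) {
    /* The creating thread can continue (e.g., migrate to the processor
     * owning some remote heap data) before the result is needed. */
    future_t f = make_future(square, 21);
    printf("%ld\n", force(f));   /* prints 441 */
    return 0;
}
```

Lazy task creation defers spawning a real task until another processor is idle and steals the deferred work; the single-threaded sketch above shows only the deferral half of that scheme.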

265 citations

Proceedings ArticleDOI
01 Jun 1993
TL;DR: It is shown that the problems of communication code generation, local memory management, message aggregation and redundant data communication elimination can all be solved by projecting polyhedra represented by sets of inequalities onto lower dimensional spaces.
Abstract: This paper presents several algorithms to solve code generation and optimization problems specific to machines with distributed address spaces. Given a description of how the computation is to be partitioned across the processors in a machine, our algorithms produce an SPMD (single program multiple data) program to be run on each processor. Our compiler generates the necessary receive and send instructions, optimizes the communication by eliminating redundant communication and aggregating small messages into large messages, allocates space locally on each processor, and translates global data addresses to local addresses.

Our techniques are based on an exact data-flow analysis on individual array element accesses. Unlike data dependence analysis, this analysis determines if two dynamic instances refer to the same value, and not just to the same location. Using this information, our compiler can handle more flexible data decompositions and find more opportunities for communication optimization than systems based on data dependence analysis.

Our technique is based on a uniform framework, where data decompositions, computation decompositions and the data flow information are all represented as systems of linear inequalities. We show that the problems of communication code generation, local memory management, message aggregation and redundant data communication elimination can all be solved by projecting polyhedra represented by sets of inequalities onto lower dimensional spaces.
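The kind of SPMD output such a compiler produces can be illustrated by hand. For a source loop `for (i = 0; i < N; i++) a[i] = f(i)` with a block distribution of `a` across P processors, each processor derives its local bounds from its id, allocates only its own block, and translates global indices to local ones. The distribution, N, and f below are a made-up example, not taken from the paper:

```c
/* Hand-written illustration of compiler-generated SPMD code for a
 * block-distributed loop. Each processor runs this same program. */
#include <mpi.h>
#include <stdlib.h>

#define N 1024

int main(int argc, char **argv) {
    int p, P;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &p);
    MPI_Comm_size(MPI_COMM_WORLD, &P);

    /* Projecting the iteration polyhedron onto processor p yields the
     * local bounds [lb, ub) of the global loop. */
    int lb = p * N / P, ub = (p + 1) * N / P;

    /* Local memory management: allocate only the owned block and
     * translate global index i to local index i - lb. */
    double *a = malloc((size_t)(ub - lb) * sizeof *a);
    for (int i = lb; i < ub; i++)
        a[i - lb] = 2.0 * i;        /* a[i] = f(i) in global terms */

    free(a);
    MPI_Finalize();
    return 0;
}
```

Communication generation works the same way in the paper's framework: the set of values each processor must send or receive is itself a polyhedron, and projecting it onto lower-dimensional spaces yields the loop bounds for the generated send and receive code.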

241 citations


Network Information
Related Topics (5)

Compiler: 26.3K papers, 578.5K citations (81% related)
Cache: 59.1K papers, 976.6K citations (80% related)
Server: 79.5K papers, 1.4M citations (78% related)
Data structure: 28.1K papers, 608.6K citations (77% related)
Scalability: 50.9K papers, 931.6K citations (77% related)
Performance Metrics
No. of papers in the topic in previous years:

Year  Papers
2023  7
2022  24
2021  7
2020  4
2019  9
2018  20