How to emulate shared memory

doi:10.1109/SFCS.1987.32

Proceedings ArticleDOI

- pp 185-194

TLDR

The authors present a simple algorithm for emulating an N-processor CRCW PRAM on an N-node butterfly, in which each PRAM step is emulated in O(log N) time with high probability, using FIFO queues of size O(1) at each node.

Abstract:

We present a simple algorithm for emulating an N processor CRCW PRAM on an N node butterfly. Each step of the PRAM is emulated in time O(log N) with high probability, using FIFO queues of size O(1) at each node. The only use of randomization is in selecting a hash function to distribute the shared address space of the PRAM onto the nodes of the butterfly. The routing itself is both deterministic and oblivious, and messages are combined without the use of associative memories or explicit sorting. As a corollary we improve the result of Pippenger [8] by routing permutations with bounded queues in logarithmic time, without the possibility of deadlock. Besides being optimal, our algorithm has the advantage of extreme simplicity and is readily suited for use in practice.
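
The two ingredients named in the abstract, a randomly selected hash function that spreads PRAM addresses over the nodes and a deterministic, oblivious route through the butterfly, can be illustrated with a minimal sketch. The abstract does not specify the hash family or the exact routing rule, so everything below is an assumption made for illustration: the linear hash family, the constant `PRIME`, and the helper names `pick_hash` and `bit_fixing_path` are hypothetical, and the sketch leaves out queueing and message combining entirely.

```python
import random

# Hypothetical modulus for the illustrative hash family (2**31 - 1 is prime).
PRIME = 2_147_483_647


def pick_hash(num_nodes, rng):
    """Draw one function h(x) = ((a*x + b) mod PRIME) mod num_nodes at random.

    This is the only place randomness is used in the sketch. The linear
    family is an assumption; the abstract does not name the family used.
    """
    a = rng.randrange(1, PRIME)
    b = rng.randrange(0, PRIME)

    def h(address):
        return ((a * address + b) % PRIME) % num_nodes

    return h


def bit_fixing_path(src_row, dst_row, levels):
    """Rows visited while crossing a butterfly with `levels` levels.

    At level i the message sets bit i of its row (most significant first)
    to the destination's bit. The path depends only on source and
    destination, which is what makes the routing oblivious.
    """
    path = [src_row]
    row = src_row
    for i in range(levels):
        bit = 1 << (levels - 1 - i)
        row = (row & ~bit) | (dst_row & bit)
        path.append(row)
    return path


if __name__ == "__main__":
    rng = random.Random(0)           # fixed seed so the run is repeatable
    h = pick_hash(num_nodes=8, rng=rng)
    # Processor 3 accesses shared address 12345; route it to the owning row.
    print(bit_fixing_path(3, h(12345), levels=3))
```

Once `h` has been drawn, every later step is deterministic: the same request always follows the same path, independent of what other messages are in the network.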

Citations

Journal ArticleDOI

A bridging model for parallel computation

Leslie G. Valiant

- 01 Aug 1990 -

Communications of The ACM

TL;DR: The bulk-synchronous parallel (BSP) model is introduced as a candidate bridging model, and results are given quantifying its efficiency both in implementing high-level language features and algorithms and in being implemented in hardware.

Journal ArticleDOI

Scheduling multithreaded computations by work stealing

Robert D. Blumofe, +1 more

- 01 Sep 1999 -

Journal of the ACM

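TL;DR: This paper gives the first provably good work-stealing scheduler for multithreaded computations with dependencies, and shows that the expected time to execute a fully strict computation on P processors using this scheduler is $T_1/P + O(T_\infty)$, where $T_1$ is the minimum serial execution time and $T_\infty$ is the minimum execution time with an infinite number of processors.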

Book ChapterDOI

Parallel algorithms for shared-memory machines

Richard M. Karp, +1 more

TL;DR: In this paper, the authors discuss parallel algorithms for shared-memory machines, covering the theoretical foundations of parallel algorithms and parallel architectures, and present a theoretical analysis of the appropriate logical organization of a massively parallel computer.

Proceedings ArticleDOI

Scheduling multithreaded computations by work stealing

Robert D. Blumofe, +1 more

TL;DR: This paper gives the first provably good work-stealing scheduler for multithreaded computations with dependencies, and shows that the expected time $T_P$ to execute a fully strict computation on P processors using this work-stealing scheduler is $T_P = O(T_1/P + T_\infty)$, where $T_1$ is the minimum serial execution time of the multithreaded computation and ...

MonographDOI

Introduction to Parallel Computing

Zbigniew J. Czech

TL;DR: This monograph provides a comprehensive introduction to parallel computing, discussing theoretical issues such as the fundamentals of concurrent processes, models of parallel and distributed computing, and metrics for evaluating and comparing parallel algorithms, as well as practical issues, including methods of designing and implementing shared- and distributed-memory programs, and standards for parallel program implementation.

References

Proceedings ArticleDOI

Universal schemes for parallel communication

Leslie G. Valiant, +1 more

TL;DR: This paper shows that there exists an N-processor computer that can simulate arbitrary N-processor parallel computations with only a factor of O(log N) loss of runtime efficiency, and isolates a combinatorial problem that lies at the heart of this question.

Proceedings Article

"Hot Spot" Contention and Combining in Multistage Interconnection Networks.

Gregory Francis Pfister, +1 more

Journal ArticleDOI

“Hot spot” contention and combining in multistage interconnection networks

G. F. Pfister, +1 more

- 01 Oct 1985 -

IEEE Transactions on Computers

TL;DR: Hot spot contention can severely degrade all memory access, not just access to shared lock locations, through an effect the authors call tree saturation. Message combining was found to be an effective means of eliminating this problem when it arises from lock or synchronization contention.

Journal ArticleDOI

Basic Techniques for the Efficient Coordination of Very Large Numbers of Cooperating Sequential Processors

Allan Gottlieb, +2 more

- 01 Apr 1983 -

ACM Transactions on Programming Language...

TL;DR: This paper implements several basic operating system primitives by using a "replace-add" operation, which can supersede the standard "test and set" and which appears to be a universal primitive for efficiently coordinating large numbers of independently acting sequential processors.

Journal ArticleDOI

A logarithmic time sort for linear size networks

John H. Reif, +1 more

- 01 Jan 1987 -

Journal of the ACM

TL;DR: A randomized algorithm is given that sorts on an N-node network with constant valence in O(log N) time, with probability at least $1 - N^{-\alpha}$ for all large enough $\alpha$.
