Proceedings ArticleDOI
How to emulate shared memory
Abhiram Ranade
pp. 185–194
TL;DR: In this paper, the authors present a simple algorithm for emulating an N-processor CRCW PRAM on an N-node butterfly, where each step of the PRAM is emulated in time O(log N) with high probability, using FIFO queues of size O(1) at each node.
Citations
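The emulation summarized above relies on randomly hashing shared-memory addresses across the network's nodes so that no node is overloaded in expectation. A minimal sketch of that idea (the linear hash family and parameters here are assumptions for illustration, not Ranade's exact scheme):

```python
import random

def make_address_hash(num_nodes, prime=2_147_483_647):
    """Random linear hash over a prime field, spreading shared-memory
    addresses across network nodes. Illustrative only: the emulation uses a
    randomly chosen hash function in this spirit, but the exact family and
    parameters here are assumptions."""
    a = random.randrange(1, prime)
    b = random.randrange(prime)
    def node_of(addr):
        # Same address always maps to the same node for a fixed (a, b).
        return ((a * addr + b) % prime) % num_nodes
    return node_of

# With addresses spread nearly uniformly, each of the N nodes receives O(1)
# expected requests per PRAM step, which is what lets constant-size FIFO
# queues keep up with the traffic.
node_of = make_address_hash(num_nodes=64)
buckets = [0] * 64
for addr in range(64 * 100):
    buckets[node_of(addr)] += 1
```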
Journal ArticleDOI
A bridging model for parallel computation
TL;DR: The bulk-synchronous parallel (BSP) model is introduced as a candidate for this bridging role, and results are presented quantifying its efficiency both in implementing high-level language features and algorithms and in being implemented in hardware.
Journal ArticleDOI
Scheduling multithreaded computations by work stealing
TL;DR: This paper gives the first provably good work-stealing scheduler for multithreaded computations with dependencies, and shows that the expected time to execute a fully strict computation on P processors using this scheduler is O(T_1/P + T_∞).
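The deque discipline behind such a scheduler (owners work LIFO at one end, idle workers steal FIFO from the other end of a random victim) can be sketched as a sequential simulation; the names and structure below are illustrative, not the paper's pseudocode:

```python
import collections
import random

class Worker:
    def __init__(self):
        self.deque = collections.deque()   # bottom = right end

    def push(self, task):
        self.deque.append(task)            # owner pushes at the bottom

    def pop(self):
        # Owner works LIFO from the bottom of its own deque.
        return self.deque.pop() if self.deque else None

    def steal_from(self, victim):
        # Thieves take FIFO from the top, so they grab the oldest
        # (typically largest) pieces of work.
        return victim.deque.popleft() if victim.deque else None

def run(tasks, num_workers=4):
    """Round-robin simulation: all work starts on one worker and spreads
    by stealing; returns the tasks in the order they were executed."""
    workers = [Worker() for _ in range(num_workers)]
    workers[0].deque.extend(tasks)
    done = []
    while any(w.deque for w in workers):
        for w in workers:
            task = w.pop()
            if task is None:               # empty deque: try a random victim
                task = w.steal_from(random.choice(workers))
            if task is not None:
                done.append(task)
    return done
```

A real scheduler runs the workers concurrently with non-blocking deques; this sketch only shows the LIFO/FIFO asymmetry that the analysis depends on.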
Book ChapterDOI
Parallel algorithms for shared-memory machines
TL;DR: In this paper, the authors discuss parallel algorithms for shared-memory machines, the theoretical foundations of parallel algorithms and parallel architectures, and present a theoretical analysis of the appropriate logical organization of a massively parallel computer.
Proceedings ArticleDOI
Scheduling multithreaded computations by work stealing
TL;DR: This paper gives the first provably good work-stealing scheduler for multithreaded computations with dependencies, and shows that the expected time T_P to execute a fully strict computation on P processors using this work-stealing scheduler is T_P = O(T_1/P + T_∞), where T_1 is the minimum serial execution time of the multithreaded computation and T_∞ is the minimum execution time with an infinite number of processors.
MonographDOI
Introduction to Parallel Computing
TL;DR: In this article, a comprehensive introduction to parallel computing is provided, discussing theoretical issues such as the fundamentals of concurrent processes, models of parallel and distributed computing, and metrics for evaluating and comparing parallel algorithms, as well as practical issues, including methods of designing and implementing shared-and distributed-memory programs, and standards for parallel program implementation.
References
Proceedings ArticleDOI
Universal schemes for parallel communication
Leslie G. Valiant, G. J. Brebner +1 more
TL;DR: This paper shows that there exists an N-processor computer that can simulate arbitrary N-processor parallel computations with only a factor of O(log N) loss of runtime efficiency, and isolates a combinatorial problem that lies at the heart of this question.
Journal ArticleDOI
“Hot spot” contention and combining in multistage interconnection networks
G. F. Pfister, V. A. Norton +1 more
TL;DR: Hot-spot contention is shown to severely degrade all memory access, not just access to shared lock locations, due to an effect the authors call tree saturation; the technique of message combining is found to be an effective means of eliminating the problem when it arises from lock or synchronization contention.
Journal ArticleDOI
Basic Techniques for the Efficient Coordination of Very Large Numbers of Cooperating Sequential Processors
TL;DR: This paper implements several basic operating system primitives using a "replace-add" operation, which can supersede the standard "test-and-set" and which appears to be a universal primitive for efficiently coordinating large numbers of independently acting sequential processors.
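The behavior of such an atomic add-and-return-old-value primitive can be emulated in software; in the hedged sketch below a lock stands in for hardware atomicity (the paper's point is precisely that combining hardware avoids this serialization), and the ticket-dispensing example is an assumed illustration, not one of the paper's primitives:

```python
import threading

class FetchAndAdd:
    """Software emulation of an atomic replace-add/fetch-and-add cell.
    The lock merely models atomicity; real implementations combine
    concurrent adds in the network instead of serializing them."""
    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def fetch_and_add(self, delta):
        with self._lock:
            old = self._value           # return the value seen before the add
            self._value += delta
            return old

# Coordination example: each caller atomically receives a unique ticket,
# so 8 concurrent increments hand out exactly the values 0..7.
counter = FetchAndAdd()
tickets = []

def worker():
    tickets.append(counter.fetch_and_add(1))  # list.append is GIL-safe in CPython

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```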
Journal ArticleDOI
A logarithmic time sort for linear size networks
John H. Reif, Leslie G. Valiant +1 more
TL;DR: A randomized algorithm is given that sorts on an N-node network with constant valence in O(log N) time with probability at least 1 - N^(-α) for all large enough N.