Journal Article
The directory-based cache coherence protocol for the DASH multiprocessor
Daniel E. Lenoski, James Laudon, Kourosh Gharachorloo, Anoop Gupta, John L. Hennessy
Vol. 18, pp. 148-159
TL;DR: The design of the DASH coherence protocol is presented, its handling of correctness, performance, and protocol complexity is discussed, and the protocol is briefly compared to the IEEE Scalable Coherent Interface protocol.
Abstract:
DASH is a scalable shared-memory multiprocessor currently being developed at Stanford's Computer Systems Laboratory. The architecture consists of powerful processing nodes, each with a portion of the shared-memory, connected to a scalable interconnection network. A key feature of DASH is its distributed directory-based cache coherence protocol. Unlike traditional snoopy coherence protocols, the DASH protocol does not rely on broadcast; instead it uses point-to-point messages sent between the processors and memories to keep caches consistent. Furthermore, the DASH system does not contain any single serialization or control point. While these features provide the basis for scalability, they also force a reevaluation of many fundamental issues involved in the design of a protocol. These include the issues of correctness, performance and protocol complexity. In this paper, we present the design of the DASH coherence protocol and discuss how it addresses the above issues. We also discuss our strategy for verifying the correctness of the protocol and briefly compare our protocol to the IEEE Scalable Coherent Interface protocol.
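To make the contrast with broadcast-based snooping concrete, here is a minimal sketch of a directory that tracks which nodes hold a copy of each block and sends invalidations only to those nodes. It is an illustration under simplifying assumptions, not the DASH protocol itself: the names `State`, `DirectoryEntry`, and `Directory` are hypothetical, a single in-process object stands in for the distributed directory, and request forwarding, acknowledgements, and race handling are all omitted.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    UNCACHED = "uncached"  # no cache holds the block
    SHARED = "shared"      # one or more caches hold a read-only copy
    DIRTY = "dirty"        # exactly one cache holds a modified copy


@dataclass
class DirectoryEntry:
    state: State = State.UNCACHED
    sharers: set = field(default_factory=set)  # nodes currently holding a copy


class Directory:
    """Hypothetical directory: one entry per memory block (illustrative only)."""

    def __init__(self):
        self.entries = {}
        self.log = []  # point-to-point messages as (kind, destination, block)

    def _entry(self, block):
        return self.entries.setdefault(block, DirectoryEntry())

    def _send(self, kind, dst, block):
        self.log.append((kind, dst, block))

    def read(self, node, block):
        entry = self._entry(block)
        if entry.state is State.DIRTY:
            owner = next(iter(entry.sharers))
            if owner == node:
                return  # the requester already owns the block
            # Ask only the current owner to write the block back.
            self._send("writeback_request", owner, block)
        entry.state = State.SHARED
        entry.sharers.add(node)
        self._send("data_reply", node, block)

    def write(self, node, block):
        entry = self._entry(block)
        if entry.state is State.DIRTY and entry.sharers == {node}:
            return  # the requester already has exclusive ownership
        # Invalidate only the nodes recorded as sharers; nothing is broadcast.
        for other in sorted(entry.sharers - {node}):
            self._send("invalidate", other, block)
        entry.state = State.DIRTY
        entry.sharers = {node}
        self._send("ownership_reply", node, block)


directory = Directory()
directory.read(0, "A")
directory.read(1, "A")
directory.write(2, "A")
invalidated = [dst for kind, dst, _ in directory.log if kind == "invalidate"]
print(invalidated)  # [0, 1]
```

In this sketch the write by node 2 generates messages only for nodes 0 and 1, the recorded sharers; any other node in the system sees no traffic for the block, which is the property that lets a directory scheme avoid broadcast.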
Citations
Book
Parallel Computer Architecture: A Hardware/Software Approach
TL;DR: This book explains the forces behind this convergence of shared-memory, message-passing, data parallel, and data-driven computing architectures and provides comprehensive discussions of parallel programming for high performance and of workload-driven evaluation, based on understanding hardware-software interactions.
Journal Article
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
Milo M. K. Martin, Daniel J. Sorin, Bradford M. Beckmann, Michael R. Marty, Min Xu, Alaa R. Alameldeen, Kevin E. Moore, Mark D. Hill, Darien Wood
TL;DR: The Wisconsin Multifacet Project has created a simulation toolset to characterize and evaluate the performance of multiprocessor hardware systems commonly used as database and web servers; it includes a set of timing simulator modules for modeling the timing of the memory system and microprocessors.
Proceedings Article
Memory consistency and event ordering in scalable shared-memory multiprocessors
Kourosh Gharachorloo, Daniel E. Lenoski, James Laudon, Phillip B. Gibbons, Anoop Gupta, John L. Hennessy
TL;DR: A new model of memory consistency, called release consistency, that allows for more buffering and pipelining than previously proposed models is introduced and is shown to be equivalent to the sequential consistency model for parallel programs with sufficient synchronization.
Journal Article
The Stanford Dash multiprocessor
Daniel E. Lenoski, James Laudon, Kourosh Gharachorloo, Wolf-Dietrich Weber, Anoop Gupta, John L. Hennessy, Mark Horowitz, Monica S. Lam
TL;DR: The directory architecture for shared memory (Dash) allows shared data to be cached, significantly reducing the latency of memory accesses and yielding higher processor utilization and higher overall performance; a distributed directory-based protocol provides cache coherence without compromising scalability.
Journal Article
TreadMarks: shared memory computing on networks of workstations
Cristiana Amza, Alan L. Cox, Sandhya Dwarkadas, P. Keleher, Honghui Lu, Ramakrishnan Rajamony, Weimin Yu, Willy Zwaenepoel
TL;DR: This work discusses the experience with parallel computing on networks of workstations using the TreadMarks distributed shared memory system, which allows processes to assume a globally shared virtual memory even though they execute on nodes that do not physically share memory.
References
Journal Article
How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs
TL;DR: Many large sequential computers execute operations in a different order than is specified by the program, and a correct execution by each processor does not guarantee the correct execution of the entire program.
Proceedings Article
Memory consistency and event ordering in scalable shared-memory multiprocessors
Kourosh Gharachorloo, Daniel E. Lenoski, James Laudon, Phillip B. Gibbons, Anoop Gupta, John L. Hennessy
TL;DR: A new model of memory consistency, called release consistency, that allows for more buffering and pipelining than previously proposed models is introduced and is shown to be equivalent to the sequential consistency model for parallel programs with sufficient synchronization.
Journal Article
A New Solution to Coherence Problems in Multicache Systems
TL;DR: A memory hierarchy has coherence problems as soon as one of its levels is split in several independent units which are not equally accessible from faster levels or processors.
Journal Article
Cache coherence protocols: evaluation using a multiprocessor simulation model
James Archibald, Jean-Loup Baer
TL;DR: The magnitude of the potential performance difference between the various approaches indicates that the choice of coherence solution is very important in the design of an efficient shared-bus multiprocessor, since it may limit the number of processors in the system.
Proceedings Article
A low-overhead coherence solution for multiprocessors with private cache memories
TL;DR: This paper presents a cache coherence solution for multiprocessors organized around a single time-shared bus; it aims to reduce bus traffic and hence bus wait time, which in turn increases overall processor utilization.