
Dynamic decentralized cache schemes for mimd parallel processors

Larry Rudolph, Zary Segall
Vol. 12, Iss. 3, pp. 340-347

NOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS:
The copyright law of the United States (title 17, U.S. Code) governs the making
of photocopies or other reproductions of copyrighted material. Any copying of this
document without permission of its author may be prohibited by law.

CMU-CS-84-139
Dynamic Decentralized Cache Schemes for MIMD Parallel Processors
by
Larry Rudolph
Zary Segall
Computer Science Department
Carnegie-Mellon University
Abstract
This paper presents two cache schemes for a shared-memory shared bus multiprocessor. Both
schemes feature decentralized consistency control and dynamic type classification of the datum
cached (i.e. read-only, local, or shared). It is shown how to exploit these features to minimize the
shared bus traffic. The broadcasting ability of the shared bus is used not only to signal an event but
also to distribute data. In addition, by introducing a new synchronization construct, i.e. the Test-and-
Test-and-Set instruction, many of the traditional parallel processing "hot spots" or bottlenecks are
eliminated. Sketches of formal correctness proofs for the proposed schemes are also presented. It
appears that moderately large parallel processors can be designed by employing the principles
presented in this paper.
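The dynamic type classification described in the abstract can be pictured as a small per-entry tag. The sketch below is a hypothetical illustration in C, assuming only the three types the abstract names (read-only, local, or shared); the paper's actual transition rules are given in its Figures 3-1 and 5-1, and the `classify` function here is an invented placeholder, not the authors' algorithm.

```c
#include <stdbool.h>

/* Hypothetical per-entry type tag, after the abstract's classification:
 * read-only, local (private to one processor), or shared. */
typedef enum { ENTRY_READ_ONLY, ENTRY_LOCAL, ENTRY_SHARED } entry_type_t;

typedef struct {
    unsigned long tag;   /* address tag of the cached datum */
    entry_type_t  type;  /* current dynamic classification */
    int           data;  /* cached value */
} cache_entry_t;

/* Invented classification rule, for illustration only: a non-writable
 * datum is read-only; a writable datum is local until another cache is
 * observed (over the shared bus) to hold a copy, at which point it is
 * shared and subject to consistency control. */
entry_type_t classify(bool writable, bool another_cache_holds_copy)
{
    if (!writable)
        return ENTRY_READ_ONLY;
    return another_cache_holds_copy ? ENTRY_SHARED : ENTRY_LOCAL;
}
```

The point of such a tag is the one the abstract makes: bus traffic is spent only on entries classified as shared, while read-only and local entries are served entirely from the local cache.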
This research has been supported in part by National Science Foundation Grant MCS-8120270. The
views and conclusions contained in this paper are those of the authors and should not be interpreted
as representing the official policies, either expressed or implied, of NSF or Carnegie-Mellon
University.
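The Test-and-Test-and-Set construct named in the abstract reduces shared-bus traffic by spinning on an ordinary (cached) read and attempting the atomic test-and-set only when the lock appears free. A minimal sketch using C11 atomics, assuming a single boolean lock word (the names here are illustrative, not taken from the paper):

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_bool locked; } spinlock_t;

void spin_lock(spinlock_t *l)
{
    for (;;) {
        /* "Test": spin on a plain read, which is satisfied from the
         * local cache and generates no shared-bus traffic. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ; /* busy-wait */
        /* "Test-and-Set": attempt the atomic exchange only when the
         * lock looked free; failure means another processor won. */
        if (!atomic_exchange_explicit(&l->locked, true,
                                      memory_order_acquire))
            return;
    }
}

void spin_unlock(spinlock_t *l)
{
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

Contrast this with a plain test-and-set loop, where every failed atomic attempt forces a bus transaction; here waiters converge on cached copies of the lock word and contend on the bus only when the lock is released.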

Table of Contents
1. Introduction
2. Assumptions
3. The RB Cache Scheme
4. Proof of Consistency - Sketch
5. The RWB Cache Scheme
6. Synchronization Using Caches
6.1. Synchronization Using RB Scheme
6.2. Synchronization Using RWB Scheme
7. Shared Bus Bandwidth
8. Conclusion

List of Figures
Figure 3-1: State Transition Diagram for each Cache Entry for the RB Scheme
Figure 5-1: State Transition Diagram for each Cache Entry for the RWB Scheme
Figure 6-1: Synchronization with Test-and-Set for RB Scheme
Figure 6-2: Synchronization with Test-and-Test-and-Set for RB Scheme
Figure 6-3: Synchronization with Test-and-Test-and-Set for RWB Scheme
Figure 7-1: Multiple Shared Bus Cache Based Parallel Processor

List of Tables
Table 1-1: Cm* Emulated Cache Results

Citations
Proceedings Article

Transactional memory: architectural support for lock-free data structures

TL;DR: Simulation results show that transactional memory matches or outperforms the best known locking techniques for simple benchmarks, even in the absence of priority inversion, convoying, and deadlock.
Journal Article

Algorithms for scalable synchronization on shared-memory multiprocessors

TL;DR: The principal conclusion is that contention due to synchronization need not be a problem in large-scale shared-memory multiprocessors; the existence of scalable algorithms greatly weakens the case for costly special-purpose hardware support for synchronization, and provides protection against so-called "dance hall" architectures.
Proceedings Article

Ligra: a lightweight graph processing framework for shared memory

TL;DR: This paper presents a lightweight graph processing framework specific to shared-memory parallel/multicore machines, which makes graph traversal algorithms easy to write and significantly more efficient than previously reported results using graph frameworks on machines with many more cores.
Journal Article

The performance of spin lock alternatives for shared-memory multiprocessors

TL;DR: The author examines the questions of whether there are efficient algorithms for software spin-waiting given hardware support for atomic instructions, or whether more complex kinds of hardware support are needed for performance.
Journal Article

Cache coherence protocols: evaluation using a multiprocessor simulation model

TL;DR: The magnitude of the potential performance difference between the various approaches indicates that the choice of coherence solution is very important in the design of an efficient shared-bus multiprocessor, since it may limit the number of processors in the system.
References
Journal Article

Cache Memories

TL;DR: Specific aspects of cache memories investigated include: the cache fetch algorithm (demand versus prefetch), the placement and replacement algorithms, line size, store-through versus copy-back updating of main memory, cold-start versus warm-start miss ratios, multicache consistency, the effect of input/output through the cache, the behavior of split data/instruction caches, and cache size.
Journal Article

The NYU Ultracomputer—Designing an MIMD Shared Memory Parallel Computer

TL;DR: The design for the NYU Ultracomputer is presented, a shared-memory MIMD parallel machine composed of thousands of autonomous processing elements that uses an enhanced message switching network with the geometry of an Omega-network to approximate the ideal behavior of Schwartz's paracomputers model of computation.
Journal Article

A New Solution to Coherence Problems in Multicache Systems

TL;DR: A memory hierarchy has coherence problems as soon as one of its levels is split into several independent units that are not equally accessible from faster levels or processors.
Proceedings Article

Using cache memory to reduce processor-memory traffic

TL;DR: It is demonstrated that a cache exploiting primarily temporal locality (look-behind) can indeed reduce traffic to memory greatly, and introduce an elegant solution to the cache coherency problem.
Proceedings Article

Cache system design in the tightly coupled multiprocessor system

C. K. Tang
TL;DR: System requirements in the multiprocessor environment as well as the cost-performance trade-offs of the cache system design are given in detail, and the possibility of sharing the cache system hardware with other multiprocessing facilities (such as dynamic address translation, storage protection, locks, serialization, and the system clocks) is discussed.