Journal ArticleDOI

Experimental evaluation of on-chip microprocessor cache memories

Mark D. Hill, +1 more
Vol. 12, Iss. 3, pp. 158-166
TLDR
This paper uses trace-driven simulation to study design tradeoffs for small (on-chip) caches, and finds that general-purpose caches of 64 bytes (net size) are marginally useful in some cases, while 1024-byte caches perform fairly well.
Abstract
Advances in integrated circuit density are permitting the implementation on a single chip of functions and performance enhancements beyond those of a basic processor. One performance enhancement of proven value is a cache memory; placing a cache on the processor chip can reduce both mean memory access time and bus traffic. In this paper we use trace-driven simulation to study design tradeoffs for small (on-chip) caches. Miss ratio and traffic ratio (bus traffic) are the metrics for cache performance. Particular attention is paid to sub-block caches (also known as sector caches), in which address tags are associated with blocks, each of which contains multiple sub-blocks; sub-blocks are the transfer unit. Using traces from two 16-bit architectures (Z8000, PDP-11) and two 32-bit architectures (VAX-11, System/370), we find that general-purpose caches of 64 bytes (net size) are marginally useful in some cases, while 1024-byte caches perform fairly well; typical miss and traffic ratios for a 1024-byte (net size) cache, 4-way set-associative with 8-byte blocks, are: PDP-11: .039, .156; Z8000: .015, .060; VAX-11: .080, .160; Sys/370: .244, .489. (These figures are based on traces of user programs, and the performance obtained in practice is likely to be worse.) The use of sub-blocks allows tradeoffs between miss ratio and traffic ratio for a given cache size. Load forward is quite useful. Extensive simulation results are presented.
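The sub-block organization described in the abstract lends itself to a compact trace-driven simulation. The sketch below is illustrative rather than the paper's simulator: it assumes LRU replacement and a read-only address trace, uses the abstract's 1024-byte, 4-way, 8-byte-block configuration with an assumed 4-byte sub-block, and omits writes and the load-forward policy. The class and variable names are invented for this example.

```python
# Minimal sketch of a sub-block (sector) cache simulator, assuming LRU
# replacement and a read-only trace. One address tag per block; one valid
# bit per sub-block; the sub-block is the transfer unit.
from collections import OrderedDict

class SubBlockCache:
    def __init__(self, sets=32, ways=4, block=8, subblock=4):
        # 32 sets x 4 ways x 8-byte blocks = 1024 bytes net (as in the abstract);
        # the 4-byte sub-block size is an assumption for this example.
        self.sets, self.ways = sets, ways
        self.block, self.subblock = block, subblock
        # per set: LRU-ordered dict of {tag: set of valid sub-block indices}
        self.lines = [OrderedDict() for _ in range(sets)]
        self.refs = self.misses = self.fetched = 0  # fetched counts sub-blocks

    def access(self, addr):
        self.refs += 1
        block_no = addr // self.block
        index = block_no % self.sets
        tag = block_no // self.sets
        sub = (addr % self.block) // self.subblock
        line = self.lines[index]
        if tag in line:
            line.move_to_end(tag)            # refresh LRU position
            if sub in line[tag]:
                return                       # hit: tag match, sub-block valid
            line[tag].add(sub)               # sub-block miss: fetch one sub-block
        else:
            if len(line) >= self.ways:
                line.popitem(last=False)     # evict the LRU block
            line[tag] = {sub}                # allocate tag; only this sub-block valid
        self.misses += 1
        self.fetched += 1

cache = SubBlockCache()
for a in [0, 4, 8, 0, 4, 1024, 0]:           # toy address trace
    cache.access(a)
miss_ratio = cache.misses / cache.refs
# Traffic ratio approximated as sub-blocks fetched per reference, i.e. bytes
# fetched by the cache over the bytes a cacheless processor would fetch
# (assumed: one sub-block-sized word per reference).
traffic_ratio = cache.fetched / cache.refs
print(miss_ratio, traffic_ratio)
```

A sub-block miss fetches only the missing sub-block rather than the whole block, which is what lets a designer trade miss ratio against traffic ratio at a fixed cache size.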



Citations
Proceedings ArticleDOI

Memory Bandwidth Limitations of Future Microprocessors

TL;DR: It is predicted that off-chip accesses will be so expensive that all system memory will reside on one or more processor chips, and pin bandwidth limitations will make more complex on-chip caches cost-effective.
Journal ArticleDOI

An analytical cache model

TL;DR: An analytical cache model is developed that gives miss rates for a given trace as a function of cache size, degree of associativity, block size, subblock size, multiprogramming level, task switch interval, and observation interval.
Journal ArticleDOI

Trace-driven memory simulation: a survey

TL;DR: A survey and analysis of trace-driven memory simulation tools can be found in this article, where the authors discuss the strengths and weaknesses of different approaches and show that no single method is best when all criteria, including accuracy, speed, memory, flexibility, portability, expense, and ease of use, are considered.
Journal ArticleDOI

A class of compatible cache consistency protocols and their support by the IEEE futurebus

TL;DR: This paper defines a class of compatible consistency protocols supported by the current IEEE Futurebus design, referred to as the MOESI class of protocols, which has the property that any system component can select (dynamically) any action permitted by any protocol in the class, and be assured that consistency is maintained throughout the system.
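Since the summary names the MOESI class, a minimal sketch of the five states (Modified, Owned, Exclusive, Shared, Invalid) and one typical transition set may help. The transitions below are a common textbook formulation, assumed here for illustration; they are not the paper's exact protocol definitions, which cover a whole family of compatible variants.

```python
# One common MOESI state machine for a single cache line, as a transition
# table. Events seen by one cache: local 'read'/'write' and snooped
# 'bus_read'/'bus_write'. Assumed formulation, not taken from the paper.
TRANSITIONS = {
    ("I", "read"): "S",        # fetch; a protocol may enter "E" if no sharers
    ("I", "write"): "M",       # read-for-ownership, invalidate other copies
    ("S", "read"): "S",
    ("S", "write"): "M",       # upgrade, invalidate other copies
    ("E", "read"): "E",
    ("E", "write"): "M",       # silent upgrade, no bus traffic needed
    ("M", "read"): "M",
    ("M", "write"): "M",
    ("O", "read"): "O",
    ("O", "write"): "M",       # invalidate other copies
    ("M", "bus_read"): "O",    # supply dirty data, retain ownership
    ("O", "bus_read"): "O",    # owner keeps supplying data to requesters
    ("E", "bus_read"): "S",
    ("S", "bus_read"): "S",
    ("I", "bus_read"): "I",
    # any snooped write/invalidation forces the local copy to Invalid
    **{(s, "bus_write"): "I" for s in "MOESI"},
}

def step(state: str, event: str) -> str:
    return TRANSITIONS[(state, event)]

# Example: a line written locally, then read by another processor, then rewritten
s = "I"
for e in ("write", "bus_read", "write"):
    s = step(s, e)
    print(e, "->", s)   # write -> M, bus_read -> O, write -> M
```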
Proceedings ArticleDOI

Efficiently enabling conventional block sizes for very large die-stacked DRAM caches

TL;DR: Die-stacking technology enables multiple layers of DRAM to be integrated with multicore processors; because the stacked DRAM's capacity is insufficient to serve as all of main memory, a promising use for it is as a large cache.
References
Journal ArticleDOI

Cache Memories

TL;DR: Specific aspects of cache memories investigated include: the cache fetch algorithm (demand versus prefetch), the placement and replacement algorithms, line size, store-through versus copy-back updating of main memory, cold-start versus warm-start miss ratios, multicache consistency, the effect of input/output through the cache, the behavior of split data/instruction caches, and cache size.
Journal ArticleDOI

Evaluation techniques for storage hierarchies

TL;DR: A new and efficient method is presented for determining, in one pass of an address trace, performance measures for a large class of demand-paged, multilevel storage systems utilizing a variety of mapping schemes and replacement algorithms.
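The one-pass technique summarized here is the classic stack-simulation idea: for LRU, a single pass over the trace yields hit counts for every fully-associative cache size at once, via stack distances. The sketch below uses the naive linear stack search for clarity (the paper's methods are more general and more efficient); the function name is illustrative.

```python
# Minimal sketch of one-pass LRU stack simulation: record the stack distance
# of each reference, then read off the miss ratio for any capacity.
def stack_distances(trace):
    stack = []                           # LRU stack: most recent block first
    hist = {}                            # stack distance -> number of hits
    for block in trace:
        if block in stack:
            d = stack.index(block)       # distance from the top of the stack
            hist[d] = hist.get(d, 0) + 1
            stack.pop(d)
        stack.insert(0, block)           # referenced block moves to the top
    return hist, len(trace)

# Miss ratio for a cache of c blocks = 1 - (hits at distance < c) / references
hist, refs = stack_distances([1, 2, 3, 1, 2, 4, 1])   # toy block trace
for c in (1, 2, 4):
    hits = sum(n for d, n in hist.items() if d < c)
    print(c, 1 - hits / refs)
```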
Proceedings ArticleDOI

Using cache memory to reduce processor-memory traffic

TL;DR: It is demonstrated that a cache exploiting primarily temporal locality (look-behind) can indeed greatly reduce traffic to memory, and an elegant solution to the cache coherency problem is introduced.