scispace - formally typeset
Search or ask a question

Showing papers on "Smart Cache published in 1984"


Proceedings ArticleDOI
01 Jan 1984
TL;DR: This paper presents a cache coherence solution for multiprocessors organized around a single time-shared bus that aims at reducing bus traffic and hence bus wait time and increases the overall processor utilization.
Abstract: This paper presents a cache coherence solution for multiprocessors organized around a single time-shared bus. The solution aims at reducing bus traffic and hence bus wait time. This in turn increases the overall processor utilization. Unlike most traditional high-performance coherence solutions, this solution does not use any global tables. Furthermore, this coherence scheme is modular and easily extensible, requiring no modification of cache modules to add more processors to a system. The performance of this scheme is evaluated by using an approximate analysis method. It is shown that the performance of this scheme is closely tied with the miss ratio and the amount of sharing between processors.

531 citations


Journal ArticleDOI
01 Jan 1984
TL;DR: This paper uses trace driven simulation to study design tradeoffs for small (on-chip) caches, and finds that general purpose caches of 64 bytes (net size) are marginally useful in some cases, while 1024-byte caches perform fairly well.
Abstract: Advances in integrated circuit density are permitting the implementation on a single chip of functions and performance enhancements beyond those of a basic processors. One performance enhancement of proven value is a cache memory; placing a cache on the processor chip can reduce both mean memory access time and bus traffic. In this paper we use trace driven simulation to study design tradeoffs for small (on-chip) caches. Miss ratio and traffic ratio (bus traffic) are the metrics for cache performance. Particular attention is paid to sub-block caches (also known as sector caches), in which address tags are associated with blocks, each of which contains multiple sub-blocks; sub-blocks are the transfer unit. Using traces from two 16-bit architectures (Z8000, PDP-11) and two 32-bit architectures (VAX-11, System/370), we find that general purpose caches of 64 bytes (net size) are marginally useful in some cases, while 1024-byte caches perform fairly well; typical miss and traffic ratios for a 1024 byte (net size) cache, 4-way set associative with 8 byte blocks are: PDP-11: .039, .156, Z8000: .015, .060, VAX 11: .080, .160, Sys/370: .244, .489. (These figures are based on traces of user programs and the performance obtained in practice is likely to be less good.) The use of sub-blocks allows tradeoffs between miss ratio and traffic ratio for a given cache size. Load forward is quite useful. Extensive simulation results are presented.

99 citations


Journal ArticleDOI
01 Jan 1984
TL;DR: The Static Column RAM devices recently introduced offer the potential for implementing a direct-mapped cache on-chip with only a small increase in complexity over that needed for a conventional dynamic RAM memory system.
Abstract: The Static Column RAM devices recently introduced offer the potential for implementing a direct-mapped cache on-chip with only a small increase in complexity over that needed for a conventional dynamic RAM memory system. Trace-driven simulation shows that such a cache can only be marginally effective if used in the obvious way. However it can be effective in satisfying the requests from a processor containing an on-chip cache. The SCRAM cache is more effective if the processor cache handles both instructions and data.

25 citations


Proceedings Article
19 Dec 1984

8 citations


01 Aug 1984
TL;DR: A cache coherence solution is proposed for a two-level cache organization for multiprocessors and the performance of the proposed multi-processor is evaluated with analytical methods.
Abstract: : This thesis proposes a two-level cache organization for multiprocessors. The first level of cache consists of a private cache per processor. The second level of caches is shared by all processors. The main memory is also similarly shared. A cache coherence solution is proposed for such an organization. The performance of the proposed multi-processor is evaluated with analytical methods. The factors that affect the performance are quantitatively discussed. A variation of the proposed coherence algorithm is presented to improve the performance. Keywords: High reliability; Cache memories; Mathematical analysis. (Author)

1 citations