
Showing papers on "Cache algorithms published in 1983"


Proceedings ArticleDOI
13 Jun 1983
TL;DR: It is demonstrated that a cache exploiting primarily temporal locality (look-behind) can indeed reduce traffic to memory greatly, and an elegant solution to the cache coherency problem is introduced.
Abstract: The importance of reducing processor-memory bandwidth is recognized in two distinct situations: single board computer systems and microprocessors of the future. Cache memory is investigated as a way to reduce the memory-processor traffic. We show that traditional caches which depend heavily on spatial locality (look-ahead) for their performance are inappropriate in these environments because they generate large bursts of bus traffic. A cache exploiting primarily temporal locality (look-behind) is then proposed and demonstrated to be effective in an environment where process switches are infrequent. We argue that such an environment is possible if the traffic to backing store is small enough that many processors can share a common memory and if the cache data consistency problem is solved. We demonstrate that such a cache can indeed reduce traffic to memory greatly, and introduce an elegant solution to the cache coherency problem.

431 citations
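As a rough illustration of why a look-behind cache keeps bus traffic low and unbursty, here is a minimal Python sketch (my own, not the authors' simulator): it counts words moved over the bus by a block-fetching (look-ahead) LRU cache versus a word-at-a-time LRU cache on the same toy trace. The trace, block size, and cache sizes are invented for the example.

```python
# Hypothetical comparison: words transferred by a block-fetching (look-ahead)
# cache versus a word-at-a-time (look-behind) cache on one toy address trace.
from collections import OrderedDict

def block_fetch_traffic(trace, num_blocks, block_words=4):
    """LRU cache of whole blocks; each miss moves block_words over the bus in a burst."""
    cache, words = OrderedDict(), 0
    for addr in trace:
        blk = addr // block_words
        if blk in cache:
            cache.move_to_end(blk)
        else:
            words += block_words            # burst of block_words bus transfers
            cache[blk] = True
            if len(cache) > num_blocks:
                cache.popitem(last=False)
    return words

def word_fetch_traffic(trace, num_words):
    """LRU cache of single words; each miss moves exactly one word."""
    cache, words = OrderedDict(), 0
    for addr in trace:
        if addr in cache:
            cache.move_to_end(addr)
        else:
            words += 1
            cache[addr] = True
            if len(cache) > num_words:
                cache.popitem(last=False)
    return words

trace = [0, 1, 2, 0, 1, 2, 100, 0, 1, 2, 100, 101] * 50   # toy loop-like trace
print(block_fetch_traffic(trace, num_blocks=4),
      word_fetch_traffic(trace, num_words=16))
```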


Patent
21 Oct 1983
TL;DR: In this paper, short traces of consecutive CPU references to storage are accumulated and processed to ascertain hit ratio as a function of cache size. From this determination, an allocation of cache can be made.
Abstract: Short traces of consecutive CPU references to storage are accumulated and processed to ascertain hit ratio as a function of cache size. From this determination, an allocation of cache can be made. Because this determination requires minimal processing time, LRU-referenceable memory space among concurrently executing sequential processes is used dynamically by a CPU cache manager.

117 citations
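A minimal sketch of the underlying idea, assuming nothing about the patent's actual mechanism: given a short reference trace, the hit ratio as a function of cache size can be estimated by simulating an LRU cache at several candidate sizes. The toy trace and sizes below are illustrative only.

```python
# Illustrative only: hit ratio versus cache size from a short reference trace,
# obtained by simulating an LRU cache at several candidate sizes.
from collections import OrderedDict

def lru_hit_ratio(trace, size):
    cache, hits = OrderedDict(), 0
    for ref in trace:
        if ref in cache:
            hits += 1
            cache.move_to_end(ref)
        else:
            cache[ref] = True
            if len(cache) > size:
                cache.popitem(last=False)   # evict the least recently used entry
    return hits / len(trace)

trace = [1, 2, 3, 1, 2, 4, 1, 2, 3, 5, 1, 2] * 20          # toy short trace
for size in (2, 3, 4, 5):
    print(size, round(lru_hit_ratio(trace, size), 3))
```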


Patent
21 Dec 1983
TL;DR: In this paper, a hierarchical memory system for use with a high speed data processor is described, characterized by separate dedicated cache memories for data and instructions, each cache having a unique cache directory containing a plurality of control bits for assisting line replacement within the individual cache memories and for ensuring that unnecessary or incorrect data is never stored back into main memory.
Abstract: A hierarchical memory system for use with a high speed data processor characterized by having separate dedicated cache memories for storing data and instructions and further characterized by each cache having a unique cache directory containing a plurality of control bits for assisting line replacement within the individual cache memories and for eliminating many accesses to main memory and to insure that unnecessary or incorrect data is never stored back into said main memory. The present cache architecture and control features render broadcasting between the data cache and instruction cache unnecessary. Modification of the instruction cache is not permitted. Accordingly, control bits indicating a modification in the cache directory for the instruction cache are not necessary and similarly it is never necessary to store instruction cache lines back into main memory since their modification is not permitted. The cache architecture and controls permit normal instruction and data cache fetches and data cache stores. Additionally, special instructions are provided for setting the special control bits provided in both the instruction and data cache directories, independently of actual memory accessing OPS by the CPU and for storing and loading cache lines independently of memory OPS by the CPU.

107 citations


Journal ArticleDOI
TL;DR: Measurements are reported including the hit ratios of data and instruction references, the rate of cache invalidations by I/O, and the amount of waiting time due to cache misses.

96 citations


Journal ArticleDOI
TL;DR: Efficient methods are discussed for calculating the success function of replacement policies used to manage very large fixed-size caches, including how to modify Bennett and Kruskal's algorithm to run in bounded space.

88 citations
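For context, a success function of this kind can be computed in a single pass over a trace using LRU stack distances. The sketch below is my own illustration in that spirit, not the paper's algorithm: it uses a Fenwick (binary indexed) tree to count the distinct blocks touched since each block's previous reference, then accumulates the hit ratio for every cache size at once.

```python
# One-pass LRU stack-distance computation (illustrative, not the paper's
# exact method): returns hit_ratio(C) for every fully associative LRU cache
# size C over the given trace.
def success_function(trace):
    n = len(trace)
    bit = [0] * (n + 1)                      # Fenwick tree over positions 1..n

    def update(i, delta):
        while i <= n:
            bit[i] += delta
            i += i & -i

    def prefix(i):
        s = 0
        while i > 0:
            s += bit[i]
            i -= i & -i
        return s

    last = {}                                # block -> position of last reference
    dist_hist = {}                           # stack distance -> count
    for i, block in enumerate(trace, start=1):
        if block in last:
            p = last[block]
            d = prefix(i - 1) - prefix(p) + 1   # distinct blocks since last use
            dist_hist[d] = dist_hist.get(d, 0) + 1
            update(p, -1)                    # block's "live" position moves to i
        update(i, 1)
        last[block] = i

    hits, out = 0, {}
    for c in range(1, len(last) + 1):        # cumulative: hit ratio vs cache size
        hits += dist_hist.get(c, 0)
        out[c] = hits / n
    return out

print(success_function([1, 2, 3, 1, 2, 4, 1, 2, 3, 5] * 10))
```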


Proceedings ArticleDOI
13 Jun 1983
TL;DR: It is concluded theoretically that random replacement is better than LRU and FIFO, and that under certain circumstances a direct-mapped or set-associative cache may perform better than a fully associative cache organization.
Abstract: Instruction caches are analyzed both theoretically and experimentally. The theoretical analysis begins with a new model for cache referencing behavior—the loop model. This model is used to study cache organizations and replacement policies. It is concluded theoretically that random replacement is better than LRU and FIFO, and that under certain circumstances, a direct-mapped or set-associative cache may perform better than a fully associative cache organization. Experimental results using instruction trace data are then given. The experimental results are shown to support the theoretical conclusions.

79 citations
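The loop-model conclusion is easy to reproduce with a toy simulation (mine, not the paper's): when a loop of N blocks is replayed through a fully associative cache of C < N frames, LRU and FIFO miss on every reference, while random replacement retains some useful blocks. The loop length, cache size, and trace below are assumptions for illustration.

```python
import random
from collections import OrderedDict

def run(trace, cache_size, policy, seed=0):
    random.seed(seed)
    cache = OrderedDict()            # insertion order doubles as FIFO/LRU order
    hits = 0
    for b in trace:
        if b in cache:
            hits += 1
            if policy == "LRU":
                cache.move_to_end(b) # only LRU reorders on a hit
        else:
            if len(cache) == cache_size:
                if policy == "RANDOM":
                    cache.pop(random.choice(list(cache)))
                else:                # LRU and FIFO both evict the oldest entry
                    cache.popitem(last=False)
            cache[b] = True
    return hits / len(trace)

trace = list(range(12)) * 200        # a loop of 12 blocks, replayed many times
for policy in ("LRU", "FIFO", "RANDOM"):
    print(policy, round(run(trace, cache_size=8, policy=policy), 3))
```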


Patent
01 Jul 1983
TL;DR: In this article, the authors propose a method for Direct (DASD) cache management that reduces the volume of data transfer between DASD and cache while avoiding the complexity of managing variable length records in the cache.
Abstract: A method for direct access storage device (DASD) cache management that reduces the volume of data transfer between DASD and cache while avoiding the complexity of managing variable length records in the cache. This is achieved by always choosing the starting point for staging a record to be at the start of the missing record and, at the same time, allocating and managing cache space in fixed length blocks. The method steps require staging records, starting with the requested record and continuing until either the cache block is full, the end of track is reached, or a record already in the cache is encountered.

70 citations
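A hedged sketch of that staging rule: stage records into a fixed-length cache block starting at the missed record, stopping when the block is full, the end of the track is reached, or an already-cached record is met. Record identifiers, lengths, and the cache interface here are invented for the example.

```python
def stage(track_records, miss_index, cache, block_capacity):
    """track_records: list of (record_id, length); cache: set of cached record_ids."""
    staged, used = [], 0
    for record_id, length in track_records[miss_index:]:
        if record_id in cache:              # record already cached: stop staging
            break
        if used + length > block_capacity:  # fixed-length cache block is full
            break
        staged.append(record_id)
        used += length
        cache.add(record_id)
    return staged                           # end of track ends the loop naturally

cache = {"R5"}
track = [("R1", 2), ("R2", 3), ("R3", 1), ("R4", 4), ("R5", 2)]
print(stage(track, miss_index=1, cache=cache, block_capacity=8))  # ['R2', 'R3', 'R4']
```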


Proceedings ArticleDOI
13 Jun 1983
TL;DR: In designing a VLSI instruction cache for a RISC microprocessor, the authors have uncovered four ideas potentially applicable to other VLSI machines that provide expansible cache memory, increased cache speed, reduced program code size, and decreased manufacturing costs.
Abstract: A cache was first used in a commercial computer in 1968, and researchers have spent the last 15 years analyzing caches and suggesting improvements. In designing a VLSI instruction cache for a RISC microprocessor we have uncovered four ideas potentially applicable to other VLSI machines. These ideas provide expansible cache memory, increased cache speed, reduced program code size, and decreased manufacturing costs. These improvements blur the habitual distinction between an instruction cache and an instruction fetch unit. The next four sections present the four architectural ideas, followed by a section on performance evaluation of each idea. We then describe the implementation of the cache and finally summarize the results.

66 citations


Patent
17 Oct 1983
TL;DR: In this paper, the cache operating cycle is divided into two subcycles dedicated to mutually exclusive operations: the first subcycle is dedicated to receiving a central processor memory read request, with its address.
Abstract: A data processing machine in which the cache operating cycle is divided into two subcycles dedicated to mutually exclusive operations. The first subcycle is dedicated to receiving a central processor memory read request, with its address. The second subcycle is dedicated to every other kind of cache operation, in particular either (a) receiving an address from a peripheral processor for checking the cache contents after a peripheral processor write to main memory, or (b) writing anything to the cache, including an invalid bit after a cache check match condition, or data after either a cache miss or a central processor write to main memory. The central processor can continue uninterruptedly to read the cache on successive central processor microinstruction cycles, regardless of the fact that the cache contents are being "simultaneously" checked, invalidated or updated after central processor writes. After a cache miss, although the central processor must be stopped to permit updating, it can resume operations a cycle earlier than is possible without the divided cache cycle.

62 citations
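A rough behavioural sketch of the divided cycle (my own simplification, not the patent's circuitry): CPU read lookups occupy subcycle 1 of every cycle, while queued invalidations and updates drain one at a time in subcycle 2, so reads never wait for them. Addresses and operations below are invented.

```python
# Each cache cycle has two subcycles: CPU reads in subcycle 1, everything
# else (snoop invalidations, miss fills, CPU-write updates) in subcycle 2.
from collections import deque

cache = {0x10: "A", 0x20: "B", 0x30: "C"}
cpu_reads = deque([0x10, 0x20, 0x30, 0x10])        # one CPU lookup per cycle
other_ops = deque([("invalidate", 0x20, None),     # snooped peripheral write
                   ("update", 0x40, "D")])         # fill after an earlier miss

cycle = 0
while cpu_reads or other_ops:
    cycle += 1
    if cpu_reads:                                   # subcycle 1: CPU read lookup
        addr = cpu_reads.popleft()
        print(f"cycle {cycle}.1 read {addr:#x} ->", cache.get(addr, "MISS"))
    if other_ops:                                   # subcycle 2: everything else
        kind, addr, value = other_ops.popleft()
        if kind == "invalidate":
            cache.pop(addr, None)
        else:
            cache[addr] = value
        print(f"cycle {cycle}.2 {kind} {addr:#x}")
```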


Journal ArticleDOI
J. Voldman, Benoit B. Mandelbrot, Lee Windsor Hoevel, J. Knight, P. L. Rosenfeld
TL;DR: This paper uses fractals to model the clustering of cache misses as a discriminant between interactive and batch environments and finds that the cluster dimension provides a measure of the intrinsic differences between workloads.
Abstract: This paper uses fractals to model the clustering of cache misses. The clustering of cache misses can be quantified by a single number analogous to a fractional dimension, and we are intrigued by the possibility that this number can be used as a measure of software complexity. The essential intuition is that cache misses are a direct reflection of changes in locality of reference, and that complex software requires more frequent (and larger) changes in this locality than simple software. The cluster dimension provides a measure (and perhaps the basis for a model) of the intrinsic differences between workloads. In this paper, we focus on cache miss activity as a discriminant between interactive and batch environments.

52 citations
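A hedged illustration of the clustering idea, not the authors' fractal model: one crude way to assign a "cluster dimension" to a miss record is box counting over the miss timestamps, fitting the slope of log N(s) against log(1/s). The synthetic burst pattern and scales below are assumptions.

```python
import math, random

random.seed(0)
# Synthetic miss times: dense bursts (phase changes) separated by long gaps.
miss_times = [burst * 10_000 + random.randrange(500)
              for burst in range(20) for _ in range(50)]

def box_count(times, scale):
    return len({t // scale for t in times})          # boxes containing >= 1 miss

scales = [10, 100, 1_000, 10_000]
counts = [box_count(miss_times, s) for s in scales]

# Least-squares slope of log N(s) vs log(1/s): a crude cluster-dimension estimate.
xs = [math.log(1 / s) for s in scales]
ys = [math.log(n) for n in counts]
k = len(xs)
slope = (k * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)) / \
        (k * sum(x * x for x in xs) - sum(xs) ** 2)
print(list(zip(scales, counts)), "estimated dimension ~", round(slope, 2))
```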


Patent
22 Feb 1983
TL;DR: In this article, an instruction is transferred to a region from the main data memory in response to a program address and may be executed without waiting for simultaneous transfer of a large block or number of instructions.
Abstract: A memory system includes a high-speed, multi-region instruction cache, each region of which stores a variable number of instructions received from a main data memory said instructions forming part of a program. An instruction is transferred to a region from the main data memory in response to a program address and may be executed without waiting for simultaneous transfer of a large block or number of instructions. Meanwhile, instructions at consecutively subsequent addresses in the main data memory are transferred to the same region for building an expanding cache of rapidly accessible instructions. The expansion of a given region is brought about as a result of the addressing of that region, such that a cache region receiving a main line of the aforementioned program will be expanded in preference to a region receiving an occasionally used sub-routine. When a new program address is presented, a simultaneous comparison is made with pointers which are provided to be indicative of addresses of instructions currently stored in the various cache regions, and stored information is gated from a region which produces a favorable comparison. When a new address is presented to which no cache region is responsive, the least recently used region, that is the region that has been accessed least recently, is immediately invalidated and reused by writing thereover, starting with the new address to which no cache region was responsive, for accumulating a substituted cache of information from the main data memory.
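An illustrative software model of the multi-region scheme described above (the data structures, region count, and trace are my assumptions, not the patent's design): each region holds a base pointer and a growing run of consecutive instructions; an address matching no region reuses the least recently used region starting at the new address.

```python
class RegionCache:
    def __init__(self, num_regions=4):
        self.regions = [{"base": None, "instrs": []} for _ in range(num_regions)]
        self.lru = list(range(num_regions))          # front = least recently used

    def _touch(self, idx):
        self.lru.remove(idx)
        self.lru.append(idx)

    def fetch(self, addr, memory):
        for idx, r in enumerate(self.regions):       # compare addr with all region pointers
            if r["base"] is not None and r["base"] <= addr < r["base"] + len(r["instrs"]):
                self._touch(idx)
                return r["instrs"][addr - r["base"]]           # hit in an existing region
            if r["base"] is not None and addr == r["base"] + len(r["instrs"]):
                r["instrs"].append(memory[addr])               # expand this region
                self._touch(idx)
                return r["instrs"][-1]
        victim = self.lru[0]                          # total miss: reuse LRU region
        self.regions[victim] = {"base": addr, "instrs": [memory[addr]]}
        self._touch(victim)
        return memory[addr]

memory = {a: f"insn@{a}" for a in range(64)}
ic = RegionCache()
for a in [0, 1, 2, 32, 33, 0, 1, 3, 34]:              # main line plus a subroutine
    ic.fetch(a, memory)
print([r["base"] for r in ic.regions])                # [0, 32, None, None]
```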

Patent
30 Jun 1983
TL;DR: A multilevel set associative cache system, whose directory and cache store are organized into levels of memory locations, includes control apparatus which, in response to error signals from directory error checking circuits, selectively degrades cache operation to those levels detected to be free from errors.
Abstract: A multilevel set associative cache system whose directory and cache store are organized into levels of memory locations includes control apparatus which selectively degrades cache operation in response to error signals from directory error checking circuits to those levels detected to be free from errors. Test control apparatus which couples to the directory error checking apparatus operates to selectively enable and disable the directory error checking circuits in response to commands received from a central processing unit so as to enable the testing of the cache directory and other portions of the cache system using common test routines.

Patent
30 Jun 1983
TL;DR: In this paper, a multilevel set associative cache system whose directory and cache store are organized into levels of memory locations includes control apparatus which selectively degrades cache operation in response to error signals from directory error checking circuits to those levels detected to be free from errors.
Abstract: A multilevel set associative cache system whose directory and cache store are organized into levels of memory locations includes control apparatus which selectively degrades cache operation in response to error signals from directory error checking circuits to those levels detected to be free from errors. Test apparatus is coupled to the control apparatus and operates to selectively alter the operational states of the cache levels in response to commands received from a central processing unit for enabling testing of such control apparatus in addition to the other cache control areas.

Patent
28 Feb 1983
TL;DR: Cache memory includes a dual or two-part cache with one part of the cache being primarily designated for instruction data while the other part is designated for operand data, but not exclusively as discussed by the authors.
Abstract: Cache memory includes a dual or two part cache with one part of the cache being primarily designated for instruction data while the other is primarily designated for operand data, but not exclusively. For a maximum speed of operation, the two parts of the cache are equal in capacity. The two parts of the cache, designated I-Cache and O-Cache, are semi-independent in their operation and include arrangements for effecting synchronized searches; they can accommodate up to three separate operations substantially simultaneously. Each cache unit has a directory and a data array with the directory and data array being separately addressable. Each cache unit may be subjected to a primary and to one or more secondary concurrent uses with the secondary uses prioritized. Data is stored in the cache unit on a so-called store-into basis wherein data obtained from the main memory is operated upon and stored in the cache without returning the operated upon data to the main memory unit until subsequent transactions require such return.
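A minimal sketch of the "store-into" (write-back) policy described above, with the line granularity, LRU replacement, and sizes assumed for illustration: stores update only the cache, and a line is written back to main memory only when a later transaction forces its eviction.

```python
from collections import OrderedDict

class StoreIntoCache:
    def __init__(self, memory, size=4):
        self.memory, self.size = memory, size
        self.lines = OrderedDict()                  # addr -> (value, dirty)

    def _evict_if_needed(self):
        if len(self.lines) > self.size:
            addr, (value, dirty) = self.lines.popitem(last=False)
            if dirty:                               # written back only on eviction
                self.memory[addr] = value

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = (self.memory[addr], False)
            self._evict_if_needed()
        self.lines.move_to_end(addr)
        return self.lines[addr][0]

    def write(self, addr, value):
        self.lines[addr] = (value, True)            # main memory left stale for now
        self.lines.move_to_end(addr)
        self._evict_if_needed()

memory = {a: 0 for a in range(16)}
c = StoreIntoCache(memory)
c.write(1, 99)
print(memory[1], c.read(1))                         # 0 99: memory not yet updated
for a in range(2, 8):                               # later traffic evicts line 1
    c.read(a)
print(memory[1])                                    # 99 after the write-back
```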

Patent
28 Feb 1983
TL;DR: A cache memory includes a dual or two part cache with one part of the cache being primarily designated for instruction data while the other part is primarily reserved for operand data, but not exclusively as discussed by the authors.
Abstract: A cache memory includes a dual or two part cache with one part of the cache being primarily designated for instruction data while the other is primarily designated for operand data, but not exclusively. For a maximum speed of operation, the two parts of the cache are equal in capacity. The two parts of the cache, designated I-Cache and O-Cache, are semi-independent in their operation and include arrangements for effecting synchronized searches; they can accommodate up to three separate operations substantially simultaneously. Each cache unit has a directory and a data array with the directory and data array being separately addressable. Each cache unit may be subjected to a primary and to one or more secondary concurrent uses with the secondary uses prioritized. Data is stored in the cache unit on a so-called store-into basis wherein data obtained from the main memory is operated upon and stored in the cache without returning the operated upon data to the main memory unit until subsequent transactions require such return. An arrangement is included whereby the real page number of a delayed transaction may be verified.

Journal ArticleDOI
TL;DR: In this paper, a mathematical model is developed which predicts the effect on the miss ratio of running a program in a sequence of interrupted execution intervals, and results are compared to measured miss ratios of real programs executing in an interrupted execution environment.
Abstract: A cache is a small, fast associative memory located between a central processor and primary memory and used to hold copies of the contents of primary memory locations. A key performance measure of a cache is the miss ratio: the fraction of processor references which are not satisfied by the cache and result in primary memory references. The miss ratio is sometimes measured by running a single program to completion; however, in real systems rarely does such uninterrupted execution occur. In this paper a mathematical model is developed which predicts the effect on the miss ratio of running a program in a sequence of interrupted execution intervals. Results from the model are compared to measured miss ratios of real programs executing in an interrupted execution environment.
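A simple experiment in the spirit of that model, not the paper's actual mathematics: measure how the LRU miss ratio of a looping program degrades as execution is chopped into intervals of Q references, with the cache assumed cold at the start of each interval (as if purged by intervening tasks). Trace, cache size, and interval lengths are illustrative.

```python
from collections import OrderedDict

def miss_ratio(trace, cache_size, interval):
    cache, misses = OrderedDict(), 0
    for i, ref in enumerate(trace):
        if interval and i % interval == 0:
            cache.clear()                    # cold start after each interruption
        if ref in cache:
            cache.move_to_end(ref)
        else:
            misses += 1
            cache[ref] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)
    return misses / len(trace)

trace = [x % 8 for x in range(4000)]         # a program re-touching 8 blocks
for q in (0, 1000, 100, 20):                 # 0 means uninterrupted execution
    print(q, round(miss_ratio(trace, cache_size=16, interval=q), 3))
```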

Patent
13 May 1983
TL;DR: In this article, a circuit and method are provided for implementing a predetermined data replacement algorithm, such as least recently used (LRU), for a fast, low capacity cache, in a way that is fast and minimizes circuitry.
Abstract: A circuit and method for implementing a predetermined data replacement algorithm associated with a fast, low capacity cache, such as least recently used (LRU), which is fast and which minimizes circuitry is provided. A latch stores the present status of the replacement algorithm, and an address control signal indicates which one of n sets of stored information in the cache has been most recently accessed, where n is an integer. The predetermined algorithm is implemented by a predetermined permutation table stored in a translator which provides an output signal in response to both the present status of the replacement algorithm and the address control signal. The output signal indicates which one of the n sets of stored information in the cache may be replaced with new information.
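A hedged sketch of the table-driven idea: precompute, for a 4-way cache, a "translator" mapping (current LRU state, accessed way) to the next LRU state, plus a small table giving the way to replace in each state. Encoding the states as permutation indices is my own choice, not the patent's.

```python
from itertools import permutations

N = 4
states = list(permutations(range(N)))                # LRU order, leftmost = victim
index = {p: i for i, p in enumerate(states)}

# next_state[s][way]: state after 'way' becomes the most recently used set
next_state = [[index[tuple([w for w in p if w != way] + [way])]
               for way in range(N)] for p in states]
victim = [p[0] for p in states]                      # least recently used way per state

# Walk a few accesses through the tables, as the latch/translator pair would.
s = index[(0, 1, 2, 3)]
for way in (2, 0, 3, 2):
    s = next_state[s][way]
print(states[s], "-> replace way", victim[s])        # (1, 0, 3, 2) -> replace way 1
```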

Journal ArticleDOI
Yeh, Patel, Davidson
TL;DR: Cache memory organization for parallel-pipelined multiprocessor systems is evaluated: a shared cache avoids the coherence problem of private caches and can attain a higher hit ratio, but suffers performance degradation due to access conflicts.
Abstract: Cache memory organization for parallel-pipelined multiprocessor systems is evaluated. Private caches have a cache coherence problem. A shared cache avoids this problem and can attain a higher hit ratio due to sharing of single copies of common blocks and dynamic allocation of cache space among the processes. However, a shared cache suffers performance degradation due to access conflicts.

01 Jan 1983
TL;DR: An organisation for a cache memory system for use in a microprocessor-based system structured around the multibus or some similar bus is presented and standard dynamic random access memory (DRAM) is used to store the data in the cache.
Abstract: An organisation for a cache memory system for use in a microprocessor-based system structured around the multibus or some similar bus is presented. Standard dynamic random access memory (DRAM) is used to store the data in the cache. Information necessary for control of and access to the cache is held in a specially designed NMOS VLSI chip. The feasibility of this approach has been demonstrated by designing and fabricating the VLSI chip and a test facility. The critical parameters and implementation details are discussed. This implementation supports multiple cards, each containing a processor and a cache. The technique involves monitoring the bus for references to main storage. The contention for cache cycles between the processor and the bus is resolved by using two identical copies of the tag memory.
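A behavioural sketch of the duplicate-tag idea, with all details assumed rather than taken from the paper: the processor probes one copy of the tag memory while a bus monitor checks the second, identical copy against main-storage writes, so the two sides do not contend for the same tag array.

```python
class SnoopingCache:
    def __init__(self):
        self.data = {}                       # tag -> cached value
        self.cpu_tags = set()                # copy 1: used by the processor
        self.bus_tags = set()                # copy 2: used by the bus monitor

    def cpu_read(self, addr, main_memory):
        if addr in self.cpu_tags:            # processor side checks copy 1 only
            return self.data[addr]
        value = main_memory[addr]
        self.data[addr] = value
        self.cpu_tags.add(addr)
        self.bus_tags.add(addr)              # the two copies are kept identical
        return value

    def bus_write_seen(self, addr):
        if addr in self.bus_tags:            # monitor checks copy 2 only
            self.bus_tags.discard(addr)
            self.cpu_tags.discard(addr)      # invalidate the now-stale line
            self.data.pop(addr, None)

main_memory = {0: 10, 1: 11}
c = SnoopingCache()
print(c.cpu_read(0, main_memory))            # 10, now cached
main_memory[0] = 99
c.bus_write_seen(0)                          # another card wrote address 0
print(c.cpu_read(0, main_memory))            # 99, refetched after invalidation
```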

Proceedings ArticleDOI
13 Jun 1983
TL;DR: Effective shared cache organizations are proposed which retain the cache coherency advantage and which have very low access conflict even with very high request rates.
Abstract: Shared-cache memory organizations for parallel-pipelined multiple instruction stream processors avoid the cache coherence problem of private caches by sharing single copies of common blocks. A shared cache may have a higher hit ratio, but suffers performance degradation due to access conflicts. Effective shared cache organizations are proposed which retain the cache coherency advantage and which have very low access conflict even with very high request rates. Analytic expressions for performance based on a Markov model have been found for several important cases. Performance of shared cache organizations and design tradeoffs are discussed.
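For a back-of-the-envelope feel for access conflicts (a standard memory-interleaving estimate, hedged: this is not the Markov model used in the paper): if p processors issue independent, uniformly distributed requests to b cache banks each cycle, the expected number of distinct banks serviced is b(1 - (1 - 1/b)^p).

```python
# Expected banks serviced per cycle under uniform random requests (illustrative).
def expected_banks_busy(processors, banks):
    return banks * (1 - (1 - 1 / banks) ** processors)

for banks in (4, 8, 16):
    accepted = expected_banks_busy(processors=8, banks=banks)
    print(banks, "banks:", round(accepted, 2), "requests accepted,",
          round(accepted / 8, 2), "of the request rate")
```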

Patent
08 Nov 1983
TL;DR: A data storage system includes a host computer (10) and magnetic disk units (26) of diverse types as mentioned in this paper, which store data records at addresses which are generated by a microprocessor (40) in the cache manager.
Abstract: A data storage system includes a host computer (10) and magnetic disk units (26) of diverse types. A solid state cache memory (30) stores data records at addresses which are generated by a microprocessor (40) in the cache manager (32). These addresses include a beginning of track address and an end of track address which span a frame having enough memory locations to store an entire track for a particular type of disk unit (26).

01 Apr 1983
TL;DR: This paper describes a shared cache management scheme to maintain data integrity in a multiprocessing system and describes how this scheme can be implemented in a distributed system.
Abstract: This paper describes a shared cache management scheme to maintain data integrity in a multiprocessing system.

01 Mar 1983
TL;DR: The authors describe a new shared cache scheme for multiprocessors (MPS) that permits each processor to execute directly out of the shared cache.
Abstract: The authors describe a new shared cache scheme for multiprocessors (MPS). The scheme permits each processor to execute directly out of shared cache. The access time is longer than that to private cache. Store-through local caches are used.

Patent
Edward George Drimak
15 Dec 1983
TL;DR: In this article, a cache memory between a CPU and a main memory is employed to store vectors in a cache vector space, and three vector operand address registers (48-50) are employed for reading vector operands from said cache memory and for writing results of vector operations back into cache memory.
Abstract: A cache memory (10), between a CPU and a main memory, is employed to store vectors in a cache vector space (11). Three vector operand address registers (48-50) are employed for reading vector operand elements from said cache memory and for writing results of vector operations back into cache memory. A data path from the cache memory allows vector operand elements to be written into selected local storage registers of the CPU, and a path from the local storage registers to the cache memory includes a buffer. This apparatus allows overlapped reading and writing of vector elements to minimize the time required for vector processing.
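A loose software analogue of the apparatus above (the register names, element size, and the element-wise operation are illustrative, not the patent's): three address registers step through vector operands held in the cache vector space, reading two elements and writing the result back on each iteration.

```python
def vector_add(cache, addr_a, addr_b, addr_result, length):
    a_reg, b_reg, r_reg = addr_a, addr_b, addr_result   # three operand address registers
    for _ in range(length):
        result = cache[a_reg] + cache[b_reg]             # read two operand elements
        cache[r_reg] = result                            # write the result back to cache
        a_reg += 1; b_reg += 1; r_reg += 1               # registers step to the next element
    return cache

cache = list(range(10)) + [0] * 10 + [0] * 10            # toy vector space in the cache
print(vector_add(cache, addr_a=0, addr_b=10, addr_result=20, length=5)[20:25])
```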

Journal ArticleDOI
Dae-Wha Seo, Jung Wan Cho
TL;DR: A new directory-based scheme (BIND) based on a number-balanced binary tree can significantly reduce invalidation latency, directory memory requirements, and network traffic as compared to the existing directory- based schemes.