
Showing papers on "Cache invalidation published in 1983"


Proceedings ArticleDOI
13 Jun 1983
TL;DR: It is demonstrated that a cache exploiting primarily temporal locality (look-behind) can indeed greatly reduce traffic to memory, and an elegant solution to the cache coherency problem is introduced.
Abstract: The importance of reducing processor-memory bandwidth is recognized in two distinct situations: single board computer systems and microprocessors of the future. Cache memory is investigated as a way to reduce the memory-processor traffic. We show that traditional caches which depend heavily on spatial locality (look-ahead) for their performance are inappropriate in these environments because they generate large bursts of bus traffic. A cache exploiting primarily temporal locality (look-behind) is then proposed and demonstrated to be effective in an environment where process switches are infrequent. We argue that such an environment is possible if the traffic to backing store is small enough that many processors can share a common memory and if the cache data consistency problem is solved. We demonstrate that such a cache can indeed reduce traffic to memory greatly, and introduce an elegant solution to the cache coherency problem.
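
The contrast between look-ahead (spatial locality) and look-behind (temporal locality) caching can be made concrete with a small traffic count. Below is a minimal Python sketch with an illustrative synthetic trace and cache sizes (assumptions for illustration, not the paper's measurements): a trace with temporal but little spatial locality makes a multi-word-line cache move several times more words over the bus than a one-word-line cache.

    import random

    def bus_traffic(trace, num_lines, words_per_line):
        tags = [None] * num_lines                    # direct-mapped tag store
        traffic = 0
        for addr in trace:
            line_addr = addr // words_per_line
            index = line_addr % num_lines
            if tags[index] != line_addr:             # miss: fetch the whole line
                tags[index] = line_addr
                traffic += words_per_line
        return traffic

    random.seed(0)
    hot_words = random.sample(range(1 << 20), 32)    # small working set, scattered addresses
    trace = [random.choice(hot_words) for _ in range(10_000)]
    print("4-word lines:", bus_traffic(trace, num_lines=64,  words_per_line=4))
    print("1-word lines:", bus_traffic(trace, num_lines=256, words_per_line=1))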

431 citations


Patent
21 Oct 1983
TL;DR: In this paper, short traces of consecutive CPU references to storage are accumulated and processed to ascertain hit ratio as a function of cache size. From this determination, an allocation of cache can be made.
Abstract: Short traces of consecutive CPU references to storage are accumulated and processed to ascertain hit ratio as a function of cache size. From this determination, an allocation of cache can be made. Because this determination requires minimal processing time, a CPU cache manager can dynamically apportion LRU-referenceable memory space among concurrently executing sequential processes.
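
One way to obtain hit ratio as a function of cache size from a short trace is Mattson-style LRU stack-distance analysis, sketched below; this is an assumed approach shown for illustration, not necessarily the patented procedure. A single pass over the trace yields the hit-ratio curve for every cache size at once.

    from collections import Counter

    def lru_hit_ratios(trace, sizes):
        stack, distances = [], Counter()            # stack[0] = most recently used
        for ref in trace:
            if ref in stack:
                d = stack.index(ref)                # LRU stack distance of this hit
                stack.remove(ref)
                distances[d] += 1
            stack.insert(0, ref)
        # A cache of size s hits every reference whose stack distance is < s.
        return {s: sum(n for d, n in distances.items() if d < s) / len(trace)
                for s in sizes}

    trace = [0, 1, 2, 0, 1, 3, 0, 1, 2, 3] * 100
    print(lru_hit_ratios(trace, sizes=(1, 2, 3, 4)))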

117 citations


Patent
21 Dec 1983
TL;DR: In this paper, a hierarchical memory system for a high-speed data processor has separate dedicated cache memories for storing data and instructions, each cache having a unique cache directory containing a plurality of control bits for assisting line replacement within the individual cache memories and for ensuring that unnecessary or incorrect data is never stored back into main memory.
Abstract: A hierarchical memory system for use with a high-speed data processor, characterized by separate dedicated cache memories for storing data and instructions, and further characterized by each cache having a unique cache directory containing a plurality of control bits for assisting line replacement within the individual cache memories, for eliminating many accesses to main memory, and for ensuring that unnecessary or incorrect data is never stored back into said main memory. The present cache architecture and control features render broadcasting between the data cache and instruction cache unnecessary. Modification of the instruction cache is not permitted; accordingly, control bits indicating a modification in the instruction cache directory are not necessary, and it is likewise never necessary to store instruction cache lines back into main memory. The cache architecture and controls permit normal instruction and data cache fetches and data cache stores. Additionally, special instructions are provided for setting the special control bits in both the instruction and data cache directories, and for storing and loading cache lines, independently of actual memory-accessing operations by the CPU.
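
A minimal sketch of the asymmetry these control bits create, with assumed class and bit names rather than the patent's directory layout: the data cache tracks modification so only changed lines are ever stored back, while the instruction cache refuses stores outright and therefore never needs a modified bit or a write-back path.

    class DirectoryEntry:
        def __init__(self, tag):
            self.tag = tag
            self.valid = True
            self.modified = False              # only meaningful in the data cache
            self.data = None

    class DataCache:
        def __init__(self):
            self.directory = {}
        def store(self, addr, value):
            entry = self.directory.setdefault(addr, DirectoryEntry(addr))
            entry.data = value
            entry.modified = True              # this line must be cast out before reuse
        def needs_castout(self, addr):
            e = self.directory.get(addr)
            return bool(e and e.valid and e.modified)

    class InstructionCache:
        def __init__(self):
            self.directory = {}                # no modified bits are needed here
        def store(self, addr, value):
            raise PermissionError("instruction cache lines may not be modified")

    d = DataCache()
    d.store(0x40, 123)
    print(d.needs_castout(0x40))               # True: only changed data lines go back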

107 citations


Proceedings ArticleDOI
13 Jun 1983
TL;DR: It is concluded theoretically that random replacement is better than LRU and FIFO, and that under certain circumstances a direct-mapped or set-associative cache may perform better than a fully associative cache organization.
Abstract: Instruction caches are analyzed both theoretically and experimentally. The theoretical analysis begins with a new model for cache referencing behavior, the loop model. This model is used to study cache organizations and replacement policies. It is concluded theoretically that random replacement is better than LRU and FIFO, and that under certain circumstances a direct-mapped or set-associative cache may perform better than a fully associative cache organization. Experimental results using instruction trace data are then given and are shown to support the theoretical conclusions.
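
The loop-model conclusion about replacement policies is easy to reproduce with a tiny fully associative cache simulator; the sizes and trace below are illustrative assumptions, not the paper's data. On a loop one line larger than the cache, LRU and FIFO miss on every reference, while random replacement keeps part of the loop resident and gets hits.

    import random
    from collections import deque

    def miss_count(policy, trace, cache_size):
        random.seed(1)                              # reproducible RAND choices
        cache, order, misses = set(), deque(), 0
        for ref in trace:
            if ref in cache:
                if policy == 'LRU':                 # refresh recency on a hit
                    order.remove(ref)
                    order.append(ref)
                continue
            misses += 1
            if len(cache) == cache_size:            # evict one line
                victim = (random.choice(tuple(cache)) if policy == 'RAND'
                          else order[0])            # oldest (FIFO) / least recent (LRU)
                cache.remove(victim)
                order.remove(victim)
            cache.add(ref)
            order.append(ref)
        return misses

    loop = list(range(17)) * 500                    # a 17-line loop on a 16-line cache
    for policy in ('LRU', 'FIFO', 'RAND'):
        print(policy, miss_count(policy, loop, cache_size=16))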

79 citations


Patent
01 Jul 1983
TL;DR: In this article, the authors propose a method for Direct (DASD) cache management that reduces the volume of data transfer between DASD and cache while avoiding the complexity of managing variable length records in the cache.
Abstract: A method for direct access storage device (DASD) cache management that reduces the volume of data transfer between DASD and cache while avoiding the complexity of managing variable-length records in the cache. This is achieved by always choosing the starting point for staging a record to be the start of the missing record and, at the same time, allocating and managing cache space in fixed-length blocks. The method steps require staging records, starting with the requested record and continuing until either the cache block is full, the end of track is reached, or a record already in the cache is encountered.
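
The staging rule can be sketched directly from the three stopping conditions; the record layout, names, and capacities below are illustrative assumptions, not the patent's format.

    def stage(track, miss_index, cached_ids, block_capacity):
        """track: list of (record_id, length); cached_ids: set of record ids already in cache."""
        staged, used = [], 0
        for record_id, length in track[miss_index:]:
            if record_id in cached_ids:              # stop: record already in the cache
                break
            if used + length > block_capacity:       # stop: fixed-length cache block is full
                break
            staged.append(record_id)
            used += length
            cached_ids.add(record_id)
        return staged                                 # the loop ends naturally at end of track

    track = [('R1', 2), ('R2', 3), ('R3', 2), ('R4', 4), ('R5', 1)]
    cached = {'R4'}
    print(stage(track, 1, cached, block_capacity=8))  # stages R2 and R3, then stops at R4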

70 citations


Proceedings ArticleDOI
13 Jun 1983
TL;DR: In designing a VLSI instruction cache for a RISC microprocessor, the authors have uncovered four ideas potentially applicable to other VLSI machines; these ideas provide expansible cache memory, increased cache speed, reduced program code size, and decreased manufacturing costs.
Abstract: A cache was first used in a commercial computer in 1968 [1], and researchers have spent the last 15 years analyzing caches and suggesting improvements. In designing a VLSI instruction cache for a RISC microprocessor we have uncovered four ideas potentially applicable to other VLSI machines. These ideas provide expansible cache memory, increased cache speed, reduced program code size, and decreased manufacturing costs. These improvements blur the habitual distinction between an instruction cache and an instruction fetch unit. The next four sections present the four architectural ideas, followed by a section on performance evaluation of each idea. We then describe the implementation of the cache and finally summarize the results.

66 citations


Patent
17 Oct 1983
TL;DR: In this paper, the cache operating cycle is divided into two subcycles dedicated to mutually exclusive operations: the first subcycle is dedicated to receiving a central processor memory read request, with its address.
Abstract: A data processing machine in which the cache operating cycle is divided into two subcycles dedicated to mutually exclusive operations. The first subcycle is dedicated to receiving a central processor memory read request, with its address. The second subcycle is dedicated to every other kind of cache operation, in particular either (a) receiving an address from a peripheral processor for checking the cache contents after a peripheral processor write to main memory, or (b) writing anything to the cache, including an invalid bit after a cache check match condition, or data after either a cache miss or a central processor write to main memory. The central processor can continue uninterruptedly to read the cache on successive central processor microinstruction cycles, regardless of the fact that the cache contents are being "simultaneously" checked, invalidated or updated after central processor writes. After a cache miss, although the central processor must be stopped to permit updating, it can resume operations a cycle earlier than is possible without the divided cache cycle.
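
A minimal behavioral sketch of the divided cycle (an illustrative model, not the patent's circuitry): subcycle 1 accepts only a CPU read, and subcycle 2 absorbs every other operation, so reads never wait behind coherence checks or updates. Names and operation codes are assumptions.

    class DividedCycleCache:
        def __init__(self):
            self.lines = {}                          # addr -> (data, valid)

        def subcycle1_cpu_read(self, addr):
            data, valid = self.lines.get(addr, (None, False))
            return data if valid else None           # None signals a miss

        def subcycle2(self, op, addr, data=None):
            if op == 'check_invalidate':             # peripheral processor wrote main memory
                if addr in self.lines:
                    old, _ = self.lines[addr]
                    self.lines[addr] = (old, False)  # mark the cached copy invalid
            elif op == 'write':                      # miss fill or CPU write update
                self.lines[addr] = (data, True)

    cache = DividedCycleCache()
    cache.subcycle2('write', 0x100, 'A')             # updates never collide with reads
    print(cache.subcycle1_cpu_read(0x100))           # 'A'
    cache.subcycle2('check_invalidate', 0x100)
    print(cache.subcycle1_cpu_read(0x100))           # None: invalidated, must refetch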

62 citations


Patent
22 Feb 1983
TL;DR: In this article, an instruction is transferred to a region from the main data memory in response to a program address and may be executed without waiting for simultaneous transfer of a large block or number of instructions.
Abstract: A memory system includes a high-speed, multi-region instruction cache, each region of which stores a variable number of instructions received from a main data memory, said instructions forming part of a program. An instruction is transferred to a region from the main data memory in response to a program address and may be executed without waiting for simultaneous transfer of a large block or number of instructions. Meanwhile, instructions at consecutively subsequent addresses in the main data memory are transferred to the same region for building an expanding cache of rapidly accessible instructions. The expansion of a given region is brought about as a result of the addressing of that region, such that a cache region receiving a main line of the aforementioned program will be expanded in preference to a region receiving an occasionally used sub-routine. When a new program address is presented, a simultaneous comparison is made with pointers which are provided to be indicative of addresses of instructions currently stored in the various cache regions, and stored information is gated from a region which produces a favorable comparison. When a new address is presented to which no cache region is responsive, the least recently used region, that is, the region that has been accessed least recently, is immediately invalidated and reused by writing thereover, starting with the new address to which no cache region was responsive, for accumulating a substituted cache of information from the main data memory.
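
A minimal sketch of the expanding-region behavior, using assumed data structures rather than the patent's hardware: each region holds a base address plus a contiguous, growable run of instructions; addressing a region grows it by one more sequential word, and a miss in every region invalidates and reuses the least recently used region starting at the new address.

    class RegionCache:
        def __init__(self, num_regions, memory):
            self.memory = memory                               # addr -> instruction
            self.regions = [{'base': None, 'instrs': []} for _ in range(num_regions)]
            self.lru = list(range(num_regions))                # front = least recently used

        def _expand(self, region):
            nxt = region['base'] + len(region['instrs'])
            if nxt in self.memory:                             # stage the next sequential word
                region['instrs'].append(self.memory[nxt])

        def fetch(self, addr):
            for i, r in enumerate(self.regions):
                if r['base'] is not None and r['base'] <= addr < r['base'] + len(r['instrs']):
                    self.lru.remove(i)
                    self.lru.append(i)                         # mark most recently used
                    self._expand(r)                            # the addressed region keeps growing
                    return r['instrs'][addr - r['base']]
            i = self.lru.pop(0)                                # miss: invalidate and reuse LRU region
            self.lru.append(i)
            region = {'base': addr, 'instrs': [self.memory[addr]]}
            self._expand(region)
            self.regions[i] = region
            return self.memory[addr]

    memory = {a: f'op{a}' for a in range(64)}
    cache = RegionCache(num_regions=4, memory=memory)
    print(cache.fetch(0), cache.fetch(1), cache.fetch(40))     # the main line keeps expanding region 0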

51 citations


Patent
30 Jun 1983
TL;DR: A multilevel set associative cache system, whose directory and cache store are organized into levels of memory locations, includes control apparatus which, in response to error signals from directory error checking circuits, selectively degrades cache operation to those levels detected to be free from errors.
Abstract: A multilevel set associative cache system, whose directory and cache store are organized into levels of memory locations, includes control apparatus which, in response to error signals from directory error checking circuits, selectively degrades cache operation to those levels detected to be free from errors. Test control apparatus, which couples to the directory error checking apparatus, operates to selectively enable and disable the directory error checking circuits in response to commands received from a central processing unit, so as to enable testing of the cache directory and other portions of the cache system using common test routines.

42 citations


Patent
30 Jun 1983
TL;DR: In this paper, a multilevel set associative cache system, whose directory and cache store are organized into levels of memory locations, includes control apparatus which, in response to error signals from directory error checking circuits, selectively degrades cache operation to those levels detected to be free from errors.
Abstract: A multilevel set associative cache system, whose directory and cache store are organized into levels of memory locations, includes control apparatus which, in response to error signals from directory error checking circuits, selectively degrades cache operation to those levels detected to be free from errors. Test apparatus is coupled to the control apparatus and operates to selectively alter the operational states of the cache levels in response to commands received from a central processing unit, enabling testing of such control apparatus in addition to the other cache control areas.

40 citations


Patent
28 Feb 1983
TL;DR: Cache memory includes a dual or two-part cache with one part of the cache being primarily designated for instruction data while the other part is primarily designated for operand data, though not exclusively.
Abstract: Cache memory includes a dual or two-part cache with one part of the cache being primarily designated for instruction data while the other is primarily designated for operand data, but not exclusively. For a maximum speed of operation, the two parts of the cache are equal in capacity. The two parts of the cache, designated I-Cache and O-Cache, are semi-independent in their operation and include arrangements for effecting synchronized searches; they can accommodate up to three separate operations substantially simultaneously. Each cache unit has a directory and a data array, with the directory and data array being separately addressable. Each cache unit may be subjected to a primary and to one or more secondary concurrent uses, with the secondary uses prioritized. Data is stored in the cache unit on a so-called store-into basis, wherein data obtained from the main memory is operated upon and stored in the cache without returning the operated-upon data to the main memory unit until subsequent transactions require such return.
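
The store-into (write-back) discipline described in the last sentence can be sketched in a few lines; class and field names are illustrative assumptions, not the patent's terms. A store updates only the cached line and sets a dirty indication; main memory is written only when a later transaction forces the modified line out.

    class StoreIntoCache:
        def __init__(self, main_memory):
            self.mem = main_memory            # addr -> value
            self.lines = {}                   # addr -> [value, dirty]

        def load(self, addr):
            if addr not in self.lines:
                self.lines[addr] = [self.mem[addr], False]
            return self.lines[addr][0]

        def store(self, addr, value):
            self.load(addr)
            self.lines[addr] = [value, True]  # modify the cache copy only

        def evict(self, addr):
            value, dirty = self.lines.pop(addr)
            if dirty:                         # write back only modified lines
                self.mem[addr] = value

    mem = {0: 10}
    cache = StoreIntoCache(mem)
    cache.store(0, 99)
    print(mem[0])      # still 10: main memory is not yet updated
    cache.evict(0)
    print(mem[0])      # 99 after the dirty line is written back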

Patent
28 Feb 1983
TL;DR: A cache memory includes a dual or two-part cache with one part of the cache being primarily designated for instruction data while the other part is primarily reserved for operand data, though not exclusively.
Abstract: A cache memory includes a dual or two-part cache with one part of the cache being primarily designated for instruction data while the other is primarily designated for operand data, but not exclusively. For a maximum speed of operation, the two parts of the cache are equal in capacity. The two parts of the cache, designated I-Cache and O-Cache, are semi-independent in their operation and include arrangements for effecting synchronized searches; they can accommodate up to three separate operations substantially simultaneously. Each cache unit has a directory and a data array, with the directory and data array being separately addressable. Each cache unit may be subjected to a primary and to one or more secondary concurrent uses, with the secondary uses prioritized. Data is stored in the cache unit on a so-called store-into basis, wherein data obtained from the main memory is operated upon and stored in the cache without returning the operated-upon data to the main memory unit until subsequent transactions require such return. An arrangement is included whereby the real page number of a delayed transaction may be verified.

Journal ArticleDOI
Yeh, Patel, Davidson
TL;DR: Cache memory organization for parallel-pipelined multiprocessor systems is evaluated; a shared cache avoids the coherence problem of private caches and can attain a higher hit ratio, but suffers performance degradation due to access conflicts.
Abstract: Cache memory organization for parallel-pipelined multiprocessor systems is evaluated. Private caches have a cache coherence problem. A shared cache avoids this problem and can attain a higher hit ratio due to sharing of single copies of common blocks and dynamic allocation of cache space among the processes. However, a shared cache suffers performance degradation due to access conflicts.

Patent
21 Mar 1983
TL;DR: In this paper, an indicator is set by the storage control unit for a segment that cannot be transferred from the cache store to disk, preventing further attempts to transfer the segment; the segment remains in the cache store until the host processor issues an initialize or reset segment command.
Abstract: In a system having a host processor connected through a storage control unit to a cache store and a plurality of disk devices, segments of data which have been written to, while resident in the cache store, are transferred to the disks at some later time. If an abnormal condition, such as a bad spot on the disk, prevents the transfer of a segment from the cache store to the disk, an indicator is set for that segment by the storage control unit to prevent further attempts to transfer the segment. A segment whose indicator is set remains in the cache store until the host processor issues an initialize or a reset segment command.
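
A minimal sketch of the error-indicator behavior, with assumed flag names rather than the patent's control fields: written-to segments are destaged later, a failed transfer pins the segment and suppresses further attempts, and only a host initialize/reset-segment command clears the indicator.

    def destage(segments, write_to_disk):
        """Attempt to move written-to segments from the cache store to disk."""
        for seg in segments:
            if seg['modified'] and not seg['transfer_error']:
                try:
                    write_to_disk(seg)
                    seg['modified'] = False
                except IOError:                     # e.g. a bad spot on the disk
                    seg['transfer_error'] = True    # stop retrying; keep it cached

    def reset_segment(seg):
        """Host INITIALIZE / RESET SEGMENT command clears the indicator."""
        seg['transfer_error'] = False

    seg = {'modified': True, 'transfer_error': False}
    def bad_disk(_seg):
        raise IOError("bad spot on disk")
    destage([seg], bad_disk)
    print(seg)    # {'modified': True, 'transfer_error': True}: no further attempts are made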

Proceedings ArticleDOI
13 Jun 1983
TL;DR: Effective shared cache organizations are proposed which retain the cache coherency advantage and which have very low access conflict even with very high request rates.
Abstract: Shared-cache memory organizations for parallel-pipelined multiple instruction stream processors avoid the cache coherence problem of private caches by sharing single copies of common blocks. A shared cache may have a higher hit ratio, but suffers performance degradation due to access conflicts. Effective shared cache organizations are proposed which retain the cache coherency advantage and which have very low access conflict even with very high request rates. Analytic expressions for performance based on a Markov model have been found for several important cases. Performance of shared cache organizations and design tradeoffs are discussed.

Patent
07 Oct 1983
TL;DR: In this paper, the authors propose to improve system reliability by disconnecting a cache memory in which an abnormality occurs, or by updating its contents, when an abnormality occurs in the cache memory or in one of the disk controllers.
Abstract: PURPOSE: To improve system reliability by disconnecting a cache memory in which an abnormality occurs, or by updating its contents, when an abnormality occurs in the cache memory or in one of the disk controllers. CONSTITUTION: The disk controller 1 or 2 that detects an abnormality of the cache memory 3 through its cache-memory abnormality detection part sends a disconnection indication signal to the cache memory 3. Consequently, the cache memory 3 resets its ready-to-use state flag, inhibiting access to the cache memory 3 from the other disk controller. Further, after the disk controller 1 or 2 in which the abnormality occurred recovers, it checks the three state display flags of the cache memory 3; a self-diagnosis of the cache memory 3 is not performed when the connection state flag is not set and when the ready-to-use state flag is set.

Patent
01 Aug 1983
TL;DR: In this paper, the authors propose to prevent degradation of system performance by limiting the blocks fixed in the cache to the area delimited by the address of the current access and the address of the preceding access.
Abstract: PURPOSE: To prevent degradation of system performance by limiting the blocks fixed in the cache to the area delimited by the address of the current access and the address of the preceding access. CONSTITUTION: A sequential access has already read the copy of a block A in a main storage buffer 30, i.e. block (a) in the cache 2, and is now accessing the copy of a block B, i.e. block (b). The copy of a block C is stored in the cache 2 as block (c) by a block transfer from the main storage 3 to the cache 2, triggered by the preceding access. In this state, the fixed flag of block (a) is set to '0' because the sequential access has passed through the whole area of block (a), indicating that its fixing is released and it becomes a candidate for replacement.
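
One reading of the fixed-flag rule, sketched with assumed block structures (an interpretation of the machine-translated abstract, not the patent's exact logic): blocks staged ahead of a sequential access stay pinned so they cannot be replaced before use, and a block's flag is cleared as soon as the access has passed through its whole area.

    def update_fixed_flags(blocks, curr_addr):
        """blocks: list of {'start': int, 'end': int, 'fixed': bool}; 'end' is exclusive."""
        for b in blocks:
            if b['fixed'] and b['end'] <= curr_addr:   # access has passed the whole block
                b['fixed'] = False                     # release it as a replacement candidate

    blocks = [{'start': 0,  'end': 16, 'fixed': True},   # block (a): already read
              {'start': 16, 'end': 32, 'fixed': True},   # block (b): being read
              {'start': 32, 'end': 48, 'fixed': True}]   # block (c): prefetched
    update_fixed_flags(blocks, curr_addr=20)             # sequential access is inside (b)
    print([b['fixed'] for b in blocks])                  # [False, True, True]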

01 Mar 1983
TL;DR: The authors describe a new shared cache scheme for multiprocessors (MPs) that permits each processor to execute directly out of the shared cache.
Abstract: The authors describe a new shared cache scheme for multiprocessors (MPs). The scheme permits each processor to execute directly out of the shared cache. The access time is longer than that to a private cache. Store-through local caches are used.

Journal ArticleDOI
Dae-Wha Seo, Jung Wan Cho
TL;DR: A new directory-based scheme (BIND) based on a number-balanced binary tree can significantly reduce invalidation latency, directory memory requirements, and network traffic as compared to the existing directory-based schemes.