
Showing papers on "Cache pollution published in 1983"


Proceedings ArticleDOI
13 Jun 1983
TL;DR: It is demonstrated that a cache exploiting primarily temporal locality (look-behind) can indeed greatly reduce traffic to memory, and an elegant solution to the cache coherency problem is introduced.
Abstract: The importance of reducing processor-memory bandwidth is recognized in two distinct situations: single board computer systems and microprocessors of the future. Cache memory is investigated as a way to reduce the memory-processor traffic. We show that traditional caches which depend heavily on spatial locality (look-ahead) for their performance are inappropriate in these environments because they generate large bursts of bus traffic. A cache exploiting primarily temporal locality (look-behind) is then proposed and demonstrated to be effective in an environment where process switches are infrequent. We argue that such an environment is possible if the traffic to backing store is small enough that many processors can share a common memory and if the cache data consistency problem is solved. We demonstrate that such a cache can indeed reduce traffic to memory greatly, and introduce an elegant solution to the cache coherency problem.

431 citations
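
A minimal sketch of the traffic argument above, not the paper's simulator: with scattered but heavily reused addresses, a look-ahead cache pays for whole-line fills while a look-behind cache with one-word lines moves only the words actually touched. The trace, sizes, and function names below are assumptions.

import random

def cold_bus_words(trace, words_per_line):
    # Words moved over the bus by an idealized, never-evicting cache:
    # every first touch of a line fetches the entire line.
    seen, words = set(), 0
    for addr in trace:
        line = addr // words_per_line
        if line not in seen:
            seen.add(line)
            words += words_per_line
    return words

random.seed(0)
hot = random.sample(range(4096), 32)               # scattered hot words
trace = [random.choice(hot) for _ in range(2000)]  # strong temporal reuse
print("look-ahead, 4-word lines:", cold_bus_words(trace, 4), "words")
print("look-behind, 1-word lines:", cold_bus_words(trace, 1), "words")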


Patent
21 Oct 1983
TL;DR: In this paper, short traces of consecutive CPU references to storage are accumulated and processed to ascertain hit ratio as a function of cache size. From this determination, an allocation of cache can be made.
Abstract: Short traces of consecutive CPU references to storage are accumulated and processed to ascertain hit ratio as a function of cache size. From this determination, an allocation of cache can be made. Because this determination requires minimal processing time, LRU-referenceable memory space among concurrently executing sequential processes is used dynamically by a CPU cache manager.

117 citations
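
A hedged note on mechanism: the one-pass measurement this patent relies on matches the classic LRU stack-distance technique (naming it is an assumption; the abstract does not). One scan of a short trace yields the hit ratio for every cache size at once, which is why the processing time is minimal.

from collections import Counter

def hit_ratio_vs_size(trace, sizes):
    # One pass over a reference trace: the LRU stack distance of each hit
    # gives the hit ratio for all cache sizes simultaneously.
    stack, dist = [], Counter()        # stack holds blocks, most recent first
    for block in trace:
        if block in stack:
            d = stack.index(block)     # depth: any cache larger than d hits
            stack.pop(d)
            dist[d] += 1
        stack.insert(0, block)
    n = len(trace)
    return {s: sum(c for d, c in dist.items() if d < s) / n for s in sizes}

trace = [0, 1, 2, 0, 1, 3, 0, 1, 2, 3] * 50        # assumed toy trace
print(hit_ratio_vs_size(trace, sizes=(1, 2, 4, 8)))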


Patent
21 Dec 1983
TL;DR: In this paper, a hierarchical memory system for use with a high-speed data processor has separate dedicated cache memories for storing data and instructions, each cache having its own directory containing a plurality of control bits for assisting line replacement within the individual cache memories and for ensuring that unnecessary or incorrect data is never stored back into main memory.
Abstract: A hierarchical memory system for use with a high-speed data processor, characterized by having separate dedicated cache memories for storing data and instructions, and further characterized by each cache having a unique cache directory containing a plurality of control bits for assisting line replacement within the individual cache memories, for eliminating many accesses to main memory, and for ensuring that unnecessary or incorrect data is never stored back into said main memory. The present cache architecture and control features render broadcasting between the data cache and instruction cache unnecessary. Modification of the instruction cache is not permitted. Accordingly, control bits indicating a modification in the cache directory for the instruction cache are not necessary, and similarly it is never necessary to store instruction cache lines back into main memory since their modification is not permitted. The cache architecture and controls permit normal instruction and data cache fetches and data cache stores. Additionally, special instructions are provided for setting the special control bits in both the instruction and data cache directories, independently of actual memory-accessing OPs by the CPU, and for storing and loading cache lines independently of memory OPs by the CPU.

107 citations
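
A sketch, with invented structure, of the asymmetry this patent turns into control bits: instruction lines can never be modified, so the instruction-cache directory needs no modified bit, and a replaced instruction line is simply dropped rather than stored back to main memory.

class CacheDirectory:
    def __init__(self, writable):
        self.writable = writable       # False for the instruction cache
        self.modified = {}             # line tag -> modified control bit

    def store(self, tag):
        if not self.writable:          # modification of the I-cache is not permitted
            raise PermissionError("instruction cache is read-only")
        self.modified[tag] = True

    def replace(self, tag, write_back):
        if self.modified.pop(tag, False):
            write_back(tag)            # only dirty data-cache lines reach memory

dcache = CacheDirectory(writable=True)
dcache.store(0x2A)
dcache.replace(0x2A, write_back=lambda t: print(f"line {t:#x} stored to memory"))
icache = CacheDirectory(writable=False)
icache.replace(0x2A, write_back=lambda t: print("never happens"))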


Journal ArticleDOI
TL;DR: Measurements are reported, including the hit ratios of data and instruction references, the rate of cache invalidations by I/O, and the amount of waiting time due to cache misses.

96 citations


Proceedings ArticleDOI
13 Jun 1983
TL;DR: It is concluded theoretically that random replacement is better than LRU and FIFO, and that under certain circumstances a direct-mapped or set-associative cache may perform better than a fully associative cache organization.
Abstract: Instruction caches are analyzed both theoretically and experimentally. The theoretical analysis begins with a new model for cache referencing behavior: the loop model. This model is used to study cache organizations and replacement policies. It is concluded theoretically that random replacement is better than LRU and FIFO, and that under certain circumstances a direct-mapped or set-associative cache may perform better than a fully associative cache organization. Experimental results using instruction trace data are then given. The experimental results are shown to support the theoretical conclusions.

79 citations
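
The loop-model conclusion is easy to reproduce. Below is a sketch under assumed conditions (a cyclic trace one block larger than a fully associative cache): LRU and FIFO then miss on every reference, while random replacement keeps most of the loop resident.

import random
random.seed(1)

def misses(trace, size, policy):
    cache, order, count = set(), [], 0
    for b in trace:
        if b in cache:
            if policy == "LRU":            # refresh recency on a hit
                order.remove(b)
                order.append(b)
            continue
        count += 1
        if len(cache) == size:             # cache full: evict one block
            v = random.choice(sorted(cache)) if policy == "RAND" else order.pop(0)
            cache.remove(v)
        cache.add(b)
        if policy != "RAND":
            order.append(b)
    return count

loop = list(range(9)) * 300                # 9-block loop, 8-block cache
for p in ("LRU", "FIFO", "RAND"):
    print(p, misses(loop, size=8, policy=p), "misses")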


Patent
01 Jul 1983
TL;DR: In this article, the authors propose a method for Direct Access Storage Device (DASD) cache management that reduces the volume of data transfer between DASD and cache while avoiding the complexity of managing variable-length records in the cache.
Abstract: A method for Direct Access Storage Device (DASD) cache management that reduces the volume of data transfer between DASD and cache while avoiding the complexity of managing variable-length records in the cache. This is achieved by always choosing the starting point for staging a record to be at the start of the missing record and, at the same time, allocating and managing cache space in fixed-length blocks. The method steps require staging records, starting with the requested record and continuing until either the cache block is full, the end of track is reached, or a record already in the cache is encountered.

70 citations
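
The staging rule reads almost like pseudocode, so here is a hedged Python rendering; the record layout, sizes, and names are invented for illustration.

from collections import namedtuple

Record = namedtuple("Record", "id length")

def stage_on_miss(track, start, cached_ids, block_size):
    # Stage records beginning with the missed one, stopping when the fixed
    # cache block is full, the end of track is reached, or a record
    # already in the cache is encountered.
    staged, used = [], 0
    for rec in track[start:]:               # at most to the end of the track
        if rec.id in cached_ids:            # record already resident: stop
            break
        if used + rec.length > block_size:  # fixed-length block is full: stop
            break
        staged.append(rec)
        used += rec.length
    return staged

track = [Record(i, 1024) for i in range(12)]          # one toy track
print(stage_on_miss(track, start=5, cached_ids={11}, block_size=4096))
# stages records 5..8: the 4 KB block fills before cached record 11 is reached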


Proceedings ArticleDOI
13 Jun 1983
TL;DR: In designing a VLSI instruction cache for a RISC microprocessor, the authors uncovered four ideas potentially applicable to other VLSI machines; these provide expansible cache memory, increased cache speed, reduced program code size, and decreased manufacturing costs.
Abstract: A cache was first used in a commercial computer in 1968, and researchers have spent the last 15 years analyzing caches and suggesting improvements. In designing a VLSI instruction cache for a RISC microprocessor, we have uncovered four ideas potentially applicable to other VLSI machines. These ideas provide expansible cache memory, increased cache speed, reduced program code size, and decreased manufacturing costs. These improvements blur the habitual distinction between an instruction cache and an instruction fetch unit. The next four sections present the four architectural ideas, followed by a section on performance evaluation of each idea. We then describe the implementation of the cache and finally summarize the results.

66 citations


Patent
17 Oct 1983
TL;DR: In this paper, the cache operating cycle is divided into two subcycles dedicated to mutually exclusive operations: the first subcycle is dedicated to receiving a central processor memory read request, with its address.
Abstract: A data processing machine in which the cache operating cycle is divided into two subcycles dedicated to mutually exclusive operations. The first subcycle is dedicated to receiving a central processor memory read request, with its address. The second subcycle is dedicated to every other kind of cache operation, in particular either (a) receiving an address from a peripheral processor for checking the cache contents after a peripheral processor write to main memory, or (b) writing anything to the cache, including an invalid bit after a cache check match condition, or data after either a cache miss or a central processor write to main memory. The central processor can continue uninterruptedly to read the cache on successive central processor microinstruction cycles, regardless of the fact that the cache contents are being "simultaneously" checked, invalidated or updated after central processor writes. After a cache miss, although the central processor must be stopped to permit updating, it can resume operations a cycle earlier than is possible without the divided cache cycle.

62 citations
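
A toy model of the divided cycle (the class and its interface are assumptions, not the patent's hardware): subcycle 1 always serves the CPU read, and subcycle 2 absorbs one queued check, invalidation, or update, so reads proceed on every cycle.

class SplitCycleCache:
    def __init__(self):
        self.data = {}
        self.pending = []                  # queued non-read operations

    def cycle(self, read_addr):
        hit = self.data.get(read_addr)     # subcycle 1: CPU read, never delayed
        if self.pending:                   # subcycle 2: one other operation
            op, addr, value = self.pending.pop(0)
            if op == "invalidate":         # e.g. after a peripheral processor write
                self.data.pop(addr, None)
            elif op == "update":           # e.g. fill after a cache miss
                self.data[addr] = value
        return hit

c = SplitCycleCache()
c.pending.append(("update", 0x10, 42))
print(c.cycle(0x10))   # None: the read subcycle runs before the update lands
print(c.cycle(0x10))   # 42: the CPU read stream was never interrupted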


Patent
22 Feb 1983
TL;DR: In this article, an instruction is transferred to a region from the main data memory in response to a program address and may be executed without waiting for simultaneous transfer of a large block or number of instructions.
Abstract: A memory system includes a high-speed, multi-region instruction cache, each region of which stores a variable number of instructions received from a main data memory, said instructions forming part of a program. An instruction is transferred to a region from the main data memory in response to a program address and may be executed without waiting for simultaneous transfer of a large block or number of instructions. Meanwhile, instructions at consecutively subsequent addresses in the main data memory are transferred to the same region, building an expanding cache of rapidly accessible instructions. The expansion of a given region is brought about as a result of the addressing of that region, such that a cache region receiving a main line of the aforementioned program will be expanded in preference to a region receiving an occasionally used subroutine. When a new program address is presented, a simultaneous comparison is made with pointers which indicate the addresses of instructions currently stored in the various cache regions, and stored information is gated from a region which produces a favorable comparison. When a new address is presented to which no cache region is responsive, the least recently used region, that is, the region that has been accessed least recently, is immediately invalidated and reused by writing over it, starting with the new address to which no cache region was responsive, for accumulating a substituted cache of information from the main data memory.

51 citations
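
A sketch of the region mechanism with invented field names: each region pointer covers a growing run of consecutive addresses, a fetch at the end of a run extends it, and an address no region responds to reuses the least recently used region.

class RegionCache:
    def __init__(self, num_regions):
        self.regions = [None] * num_regions    # each entry: [base, count]
        self.lru = list(range(num_regions))    # least recently used first

    def fetch(self, addr, memory):
        for i, r in enumerate(self.regions):   # compare addr with all pointers
            if r and r[0] <= addr <= r[0] + r[1]:
                if addr == r[0] + r[1]:
                    r[1] += 1                  # region expands along the main line
                self.lru.remove(i)
                self.lru.append(i)
                return memory[addr]
        i = self.lru.pop(0)                    # no region responded:
        self.regions[i] = [addr, 1]            # invalidate LRU region, write over it
        self.lru.append(i)
        return memory[addr]

memory = {a: f"insn_{a}" for a in range(100)}
rc = RegionCache(num_regions=2)
for a in (0, 1, 2, 40, 3, 4):                  # main line at 0.. keeps expanding
    rc.fetch(a, memory)
print(rc.regions)                              # [[0, 5], [40, 1]]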


Patent
30 Jun 1983
TL;DR: A multilevel set associative cache system, whose directory and cache store are organized into levels of memory locations, includes control apparatus which selectively degrades cache operation, in response to error signals from directory error checking circuits, to those levels detected to be free from errors.
Abstract: A multilevel set associative cache system, whose directory and cache store are organized into levels of memory locations, includes control apparatus which selectively degrades cache operation, in response to error signals from directory error checking circuits, to those levels detected to be free from errors. Test control apparatus, which is coupled to the directory error checking apparatus, operates to selectively enable and disable the directory error checking circuits in response to commands received from a central processing unit so as to enable the testing of the cache directory and other portions of the cache system using common test routines.

42 citations


Patent
30 Jun 1983
TL;DR: In this paper, a multilevel set associative cache system, whose directory and cache store are organized into levels of memory locations, includes control apparatus which selectively degrades cache operation, in response to error signals from directory error checking circuits, to those levels detected to be free from errors.
Abstract: A multilevel set associative cache system, whose directory and cache store are organized into levels of memory locations, includes control apparatus which selectively degrades cache operation, in response to error signals from directory error checking circuits, to those levels detected to be free from errors. Test apparatus is coupled to the control apparatus and operates to selectively alter the operational states of the cache levels in response to commands received from a central processing unit, enabling the testing of such control apparatus in addition to the other cache control areas.

Patent
28 Feb 1983
TL;DR: Cache memory includes a dual or two-part cache, with one part primarily, but not exclusively, designated for instruction data while the other part is primarily designated for operand data.
Abstract: Cache memory includes a dual or two-part cache, with one part being primarily, but not exclusively, designated for instruction data while the other is primarily designated for operand data. For maximum speed of operation, the two parts of the cache are equal in capacity. The two parts of the cache, designated I-Cache and O-Cache, are semi-independent in their operation and include arrangements for effecting synchronized searches; they can accommodate up to three separate operations substantially simultaneously. Each cache unit has a directory and a data array, with the directory and data array being separately addressable. Each cache unit may be subjected to a primary use and to one or more secondary concurrent uses, with the secondary uses prioritized. Data is stored in the cache unit on a so-called store-into basis, wherein data obtained from the main memory is operated upon and stored in the cache without returning the operated-upon data to the main memory unit until subsequent transactions require such return.

Patent
28 Feb 1983
TL;DR: A cache memory includes a dual or two-part cache, with one part primarily, but not exclusively, designated for instruction data while the other part is primarily reserved for operand data.
Abstract: A cache memory includes a dual or two-part cache, with one part being primarily, but not exclusively, designated for instruction data while the other is primarily designated for operand data. For maximum speed of operation, the two parts of the cache are equal in capacity. The two parts of the cache, designated I-Cache and O-Cache, are semi-independent in their operation and include arrangements for effecting synchronized searches; they can accommodate up to three separate operations substantially simultaneously. Each cache unit has a directory and a data array, with the directory and data array being separately addressable. Each cache unit may be subjected to a primary use and to one or more secondary concurrent uses, with the secondary uses prioritized. Data is stored in the cache unit on a so-called store-into basis, wherein data obtained from the main memory is operated upon and stored in the cache without returning the operated-upon data to the main memory unit until subsequent transactions require such return. An arrangement is included whereby the real page number of a delayed transaction may be verified.

Journal ArticleDOI
TL;DR: In this paper, a mathematical model is developed which predicts the effect on the miss ratio of running a program in a sequence of interrupted execution intervals, and results from the model are compared to measured miss ratios of real programs executing in an interrupted execution environment.
Abstract: A cache is a small, fast associative memory located between a central processor and primary memory and used to hold copies of the contents of primary memory locations. A key performance measure of a cache is the miss ratio: the fraction of processor references which are not satisfied by the cache and result in primary memory references. The miss ratio is sometimes measured by running a single program to completion; however, in real systems rarely does such uninterrupted execution occur. In this paper a mathematical model is developed which predicts the effect on the miss ratio of running a program in a sequence of interrupted execution intervals. Results from the model are compared to measured miss ratios of real programs executing in an interrupted execution environment.
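
A hedged illustration of the effect being modeled (a toy simulator, not the paper's mathematical model): flushing an LRU cache every q references stands in for interrupted execution, and short intervals inflate the miss ratio even when the cache comfortably holds the program's working set.

from collections import OrderedDict

def miss_ratio(trace, cache_size, interval):
    # LRU miss ratio when the cache goes cold every `interval` references.
    cache, misses = OrderedDict(), 0
    for i, b in enumerate(trace):
        if i and i % interval == 0:
            cache.clear()                  # interrupt: resume with a cold cache
        if b in cache:
            cache.move_to_end(b)
        else:
            misses += 1
            cache[b] = None
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used block
    return misses / len(trace)

trace = [i % 50 for i in range(10_000)]    # assumed cyclic working set
for q in (10_000, 1_000, 100):
    print(f"interval {q:>6}: miss ratio {miss_ratio(trace, 64, q):.3f}")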

Journal ArticleDOI
Yeh, Patel, Davidson
TL;DR: Cache memory organization for parallel-pipelined multiprocessor systems is evaluated; a shared cache avoids the coherence problem of private caches and can attain a higher hit ratio, but suffers performance degradation due to access conflicts.
Abstract: Cache memory organization for parallel-pipelined multiprocessor systems is evaluated. Private caches have a cache coherence problem. A shared cache avoids this problem and can attain a higher hit ratio due to sharing of single copies of common blocks and dynamic allocation of cache space among the processes. However, a shared cache suffers performance degradation due to access conflicts.

Patent
Jerrold L. Allen
05 Dec 1983
TL;DR: In this paper, a data transfer bus is provided between each group of cache memory portions and the corresponding portion of main memory, such that the rate of data transfers for the system as a whole is increased.
Abstract: In a data handling system having one or more processors, a cache memory associated with each processor, and a main memory unit, each cache memory is divided into an equal number of portions, and the main memory is divided into a corresponding number of portions. A data transfer bus is provided between each group of cache memory portions and the corresponding portion of main memory such that each group of cache memory portions corresponds to only a portion of main memory. Each data transfer bus is independently controlled such that the rate of data transfers for the system as a whole is increased.

01 Jan 1983
TL;DR: An organisation for a cache memory system for use in a microprocessor-based system structured around the Multibus or some similar bus is presented, and standard dynamic random access memory (DRAM) is used to store the data in the cache.
Abstract: An organisation for a cache memory system for use in a microprocessor-based system structured around the Multibus or some similar bus is presented. Standard dynamic random access memory (DRAM) is used to store the data in the cache. Information necessary for control of and access to the cache is held in a specially designed NMOS VLSI chip. The feasibility of this approach has been demonstrated by designing and fabricating the VLSI chip and a test facility. The critical parameters and implementation details are discussed. This implementation supports multiple cards, each containing a processor and a cache. The technique involves monitoring the bus for references to main storage. The contention for cache cycles between the processor and the bus is resolved by using two identical copies of the tag memory.
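
The two-tag-copy idea in sketch form (interface and names are assumptions): the bus monitor probes its own copy of the tags, so only a genuine match on a monitored main-storage write has to steal a processor-side cache cycle.

class DualTagCache:
    def __init__(self):
        self.cpu_tags = set()          # probed by the processor
        self.bus_tags = set()          # identical copy, probed by the bus monitor

    def cpu_hit(self, tag):
        return tag in self.cpu_tags    # undisturbed by bus monitoring

    def fill(self, tag):
        self.cpu_tags.add(tag)         # both copies are updated together
        self.bus_tags.add(tag)

    def bus_write_seen(self, tag):
        if tag in self.bus_tags:       # only a real match costs a cache cycle
            self.bus_tags.discard(tag)
            self.cpu_tags.discard(tag) # invalidate the now-stale copy

cache = DualTagCache()
cache.fill(0x3F)
cache.bus_write_seen(0x3F)             # another card wrote this line
print(cache.cpu_hit(0x3F))             # False: the copy was invalidated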

Patent
09 Dec 1983
TL;DR: A non-write-through cache memory associated with each of the system's processing elements stores computations generated by that processing element; at a context switch, the stored information is sequentially written to two separate main memory units.
Abstract: Apparatus for maintaining duplicate copies of information stored in fault-tolerant computer main memories. A non-write-through cache memory associated with each of the system's processing elements stores computations generated by that processing element. At a context switch, the stored information is sequentially written to two separate main memory units. A separate status area in main memory is updated by the processing element both before and after each writing operation so that a fault occurring during data processing or during any storage operation leaves the system with sufficient information to be able to reconstruct the data without loss of integrity. To efficiently transfer information between the cache memory and the system main memories without consuming a large amount of processing time at context switches, a block status memory associated with the cache memory contains an entry for each data block in the cache memory. The entry indicates whether the corresponding data block has been modified during data processing or written with computational data from the processing element. The storage operations are carried out by high-speed hardware which stores only the modified data blocks. Additional special-purpose hardware simultaneously invalidates all cache memory entries so that a new task can be loaded and started.
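
A sketch of the context-switch flush described above, with assumed names and status encoding: only modified blocks are stored, each block goes to both memories, and the status area is updated before and after every write so that a fault mid-flush leaves recoverable state.

def context_switch_flush(cache, block_status, mem_a, mem_b, status_area):
    for blk, modified in list(block_status.items()):
        if not modified:
            continue                                       # skip unmodified blocks
        for name, mem in (("A", mem_a), ("B", mem_b)):
            status_area["phase"] = ("writing", blk, name)  # update before the store
            mem[blk] = cache[blk]                          # duplicate copy of the block
            status_area["phase"] = ("done", blk, name)     # update after the store
    cache.clear()                                          # invalidate all entries at once
    block_status.clear()                                   # ready for the new task

cache = {1: "dirty data", 2: "clean data"}
status = {1: True, 2: False}                               # block 2 was never modified
mem_a, mem_b, area = {}, {}, {}
context_switch_flush(cache, status, mem_a, mem_b, area)
print(mem_a, mem_b)                                        # block 1 lands in both memories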

Patent
25 Nov 1983
TL;DR: Duplicate writing of data is attained entirely transparently to the software by providing, in the directory managed by a cache memory, a flag representing whether or not unwritten data exists.
Abstract: PURPOSE: To attain duplicate writing of data entirely transparently to the software by providing, in the directory managed by a cache memory, a flag representing whether or not unwritten data exists. CONSTITUTION: The directory 80 that manages a cache memory 60 consists of a sector address, a pointer, and a buffer-busy flag. When the CPU finishes writing data to a block, the buffer-busy flag is set to logical 1; it is reset to logical 0 when the block's data has been written to both magnetic disc devices. When a microprocessor 110 checks the directory 80 and finds a block whose buffer-busy flag is logical 0, data transfer with the CPU is handled by the direct memory access controller.

Proceedings ArticleDOI
13 Jun 1983
TL;DR: Effective shared cache organizations are proposed which retain the cache coherency advantage and which have very low access conflict even with very high request rates.
Abstract: Shared-cache memory organizations for parallel-pipelined multiple instruction stream processors avoid the cache coherence problem of private caches by sharing single copies of common blocks. A shared cache may have a higher hit ratio, but suffers performance degradation due to access conflicts. Effective shared cache organizations are proposed which retain the cache coherency advantage and which have very low access conflict even with very high request rates. Analytic expressions for performance based on a Markov model have been found for several important cases. Performance of shared cache organizations and design tradeoffs are discussed.

Patent
06 Oct 1983
TL;DR: A cache memory is accessed with logical addresses, and task-switching overhead is reduced, by providing a recognizer that identifies which task the data in the cache memory belongs to.
Abstract: PURPOSE: To access a cache memory with logical addresses and to reduce task-switching overhead by providing a recognizer that identifies which task the data in the cache memory belongs to. CONSTITUTION: The cache memory 3A consists of a data memory part 3A3 that holds a copy of part of the data in a main memory, a directory part 3A3 that checks whether the data at an address requested by a basic processing part 3B is present in the data memory part 3A3, and a control part 3A1 that controls the operation of the cache memory 3A. On a read, when the data is present in the data memory part 3A3, the cache memory 3A transfers it to the basic processing part 3B through a signal line 3C6. On a write, data is received from the basic processing part 3B, and when the write address is present in the data memory part 3A3, the received data is written there.
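
The recognizer amounts to keeping a task identifier beside each tag, so the logically addressed cache need not be flushed on a task switch; a sketch with assumed names, not the patent's circuit:

class TaskTaggedCache:
    def __init__(self):
        self.entries = {}                    # logical address -> (task id, data)

    def read(self, task_id, addr, fetch):
        entry = self.entries.get(addr)
        if entry and entry[0] == task_id:    # recognizer: current task owns the line
            return entry[1]                  # hit for the running task
        data = fetch(addr)                   # miss, or the line belongs to another task
        self.entries[addr] = (task_id, data)
        return data

main_memory = {0x40: 7}
cache = TaskTaggedCache()
print(cache.read(1, 0x40, main_memory.get))  # task 1: miss, line filled
print(cache.read(1, 0x40, main_memory.get))  # task 1: hit
print(cache.read(2, 0x40, main_memory.get))  # task 2 after a switch: a miss, no flush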

Patent
07 Oct 1983
TL;DR: System reliability is improved by disconnecting a cache memory in which an abnormality occurs, or by updating its contents, when an abnormality occurs in the cache memory or in one of the disk controllers.
Abstract: PURPOSE: To improve the reliability of a system by disconnecting a cache memory in which an abnormality occurs, or by updating its contents, when an abnormality occurs in the cache memory or in one of the disk controllers. CONSTITUTION: The disk controller 1 or 2 that detects an abnormality of the cache memory 3 through its cache-memory-abnormality detection part sends a disconnection indication signal to the cache memory 3. Consequently, the cache memory 3 resets its ready-to-use state flag, inhibiting access to the cache memory 3 from the other disk controller. Further, after the disk controller 1 or 2 in which the abnormality occurred has recovered, it checks the three state display flags of the cache memory 3, and a self-diagnosis of the cache memory 3 is not performed when the connection state flag is not set and when the ready-to-use state flag is set.

01 Apr 1983
TL;DR: This paper describes a shared cache management scheme that maintains data integrity in a multiprocessing system and shows how the scheme can be implemented in a distributed system.
Abstract: This paper describes a shared cache management scheme to maintain data integrity in a multiprocessing system.

01 Mar 1983
TL;DR: The authors describe a new shared cache scheme for multiprocessors (MPs) that permits each processor to execute directly out of the shared cache.
Abstract: The authors describe a new shared cache scheme for multiprocessors (MPs). The scheme permits each processor to execute directly out of the shared cache. The access time is longer than that to a private cache. Store-through local caches are used.

Patent
Edward George Drimak
15 Dec 1983
TL;DR: In this article, a cache memory between a CPU and a main memory is employed to store vectors in a cache vector space, and three vector operand address registers (48-50) are employed for reading vector operands from said cache memory and for writing results of vector operations back into cache memory.
Abstract: A cache memory (10), between a CPU and a main memory, is employed to store vectors in a cache vector space (11). Three vector operand address registers (48-50) are employed for reading vector operand elements from said cache memory and for writing results of vector operations back into cache memory. A data path from the cache memory allows vector operand elements to be written into selected local storage registers of the CPU, and a path from the local storage registers to the cache memory includes a buffer. This apparatus allows overlapped reading and writing of vector elements to minimize the time required for vector processing.

Journal ArticleDOI
Dae-Wha Seo, Jung Wan Cho
TL;DR: A new directory-based scheme (BIND) based on a number-balanced binary tree can significantly reduce invalidation latency, directory memory requirements, and network traffic as compared to the existing directory-based schemes.