
Showing papers on "Cache pollution" published in 1981


Proceedings ArticleDOI
12 May 1981
TL;DR: A cache organization is presented that essentially eliminates the penalty on subsequent cache references following a cache miss; the organization has been incorporated in a cache/memory interface subsystem design, and the design has been implemented and prototyped.
Abstract: In the past decade, there has been much literature describing various cache organizations that exploit general programming idiosyncrasies to obtain maximum hit rate (the probability that a requested datum is now resident in the cache). Little, if any, has been presented to exploit: (1) the inherent dual input nature of the cache and (2) the many-datum reference type central processor instructions. No matter how high the cache hit rate is, a cache miss may impose a penalty on subsequent cache references. This penalty is the necessity of waiting until the missed requested datum is received from central memory and, possibly, for cache update. For the two cases above, the cache references following a miss do not require the information of the datum not resident in the cache, and are therefore penalized in this fashion. In this paper, a cache organization is presented that essentially eliminates this penalty. This cache organizational feature has been incorporated in a cache/memory interface subsystem design, and the design has been implemented and prototyped. An existing simple instruction set machine has verified the advantage of this feature; future, more extensive and sophisticated instruction set machines can be expected to take even greater advantage of it. Prior to prototyping, simulations verified the advantage.
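
The mechanism is easier to see in a toy model. The sketch below (Python, with invented names; the paper describes a hardware organization, not software) captures only the key point: while one miss is outstanding, later references that hit are served immediately, and only a reference that actually needs the missing datum has to wait.

```python
class HitUnderMissCache:
    """Toy model of the idea above: a miss does not penalize subsequent
    references that hit; only a reference needing the missing datum waits.
    Purely illustrative, not the paper's hardware design."""

    def __init__(self):
        self.lines = {}               # address -> datum currently resident
        self.outstanding_miss = None  # address being fetched from central memory

    def reference(self, addr):
        if addr in self.lines:
            return ("hit", self.lines[addr])  # served despite any pending miss
        self.outstanding_miss = addr          # toy: track one outstanding miss
        return ("miss", None)                 # this reference must wait for memory

    def fill(self, addr, datum):
        """Called when central memory returns the missed datum."""
        self.lines[addr] = datum
        if self.outstanding_miss == addr:
            self.outstanding_miss = None
```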

504 citations


Patent
31 Dec 1981
TL;DR: In this paper, a multiprocessing three level memory hierarchy implementation is described which uses a "write" flag and a "share" flag per page of information stored in a level three main memory.
Abstract: A multiprocessing three level memory hierarchy implementation is described which uses a "write" flag and a "share" flag per page of information stored in a level three main memory. These two flag bits are utilized to communicate from main memory at level three to private and shared caches at memory levels one and two how a given page of information is to be used. Essentially, pages which can be both written and shared are moved from main memory to the shared level two cache and then to the shared level one cache, with the processors executing from the shared level one cache. All other pages are moved from main memory to the private level two and level one caches of the requesting processor. Thus, a processor executes either from its private or shared level one cache. This allows several processors to share a level three common main memory without encountering cross interrogation overhead.
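
The page-routing rule the abstract describes reduces to a small decision on the two flag bits. A minimal sketch, with a hypothetical function name not taken from the patent:

```python
def route_page(write_flag: bool, share_flag: bool) -> str:
    """Decide which cache hierarchy receives a page fetched from the
    level-three main memory, per the write/share flag scheme above."""
    if write_flag and share_flag:
        # Writable and shared: move to the shared L2 and then the shared L1,
        # so every processor executes such pages from the shared L1 cache.
        return "shared"
    # All other pages go to the requesting processor's private L2/L1 caches.
    return "private"

assert route_page(write_flag=True, share_flag=True) == "shared"
assert route_page(write_flag=True, share_flag=False) == "private"
assert route_page(write_flag=False, share_flag=True) == "private"
```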

142 citations


Patent
27 Nov 1981
TL;DR: In this article, a buffered cache memory subsystem is described which features a solid-state cache memory connected to a storage director which interfaces a host channel with a control module controlling operation of a long-term data storage device such as a disk drive.
Abstract: A buffered cache memory subsystem is disclosed which features a solid-state cache memory connected to a storage director which interfaces a host channel with a control module controlling operation of a long-term data storage device such as a disk drive. The solid-state cache memory is connected to plural directors which in turn may be connected to differing types of control modules, whereby the cache is usable with more than one type of long-term data storage means within a given system. The cache memory may be field-installed in a preexisting disk drive storage system and is software transparent to the host computer, while providing improvements in overall operating efficiency. In a preferred embodiment, data is only cached when it is expected to be the subject of a future host request.

128 citations


Patent
Robert Percy Fletcher
31 Mar 1981
TL;DR: In this article, the authors propose a control system for interlocking processors in a multiprocessing organization where each processor has its own high speed store in buffer (SIB) cache and each processor shares a common cache with the other processors.
Abstract: A control system for interlocking processors in a multiprocessing organization. Each processor has its own high speed store in buffer (SIB) cache and each processor shares a common cache with the other processors. The control system insures that all processors access the most up-to-date copy of memory information with a minimal performance impact. The design allows read only copies of the same shared memory block (line) to exist simultaneously in all private caches. Lines that are both shared and changed are stored in the common shared cache, which each processor can directly fetch from and store into. The shared cache system dynamically detects and moves lines, which are both shared and changed, to the common shared cache and moves lines from the shared cache once sharing has ceased.

124 citations


Patent
28 Jan 1981
TL;DR: In this article, the authors describe a multiprocessor data processing system including a main memory system, the processors (30) of which share a common control unit (CCU 10) that includes a write-through cache memory (20) for accessing copies of memory data without undue delay in retrieving data from the main memory system.
Abstract: A multiprocessor data processing system including a main memory system, the processors (30) of which share a common control unit (CCU 10) that includes a write-through cache memory (20), for accessing copies of memory data therein without undue delay in retrieving data from the main memory system. A synchronous processor bus (76) having conductors (104) couples the processors (30) to the CCU. An asynchronous input/output bus (60) couples input/output devices (32) to an interface circuit (64) which, in turn, couples the information signals thereof to the synchronous processor bus (76) of the CCU so that both the processors (30) and the I/O devices (32) can gain quick access to memory data in the cache memory (20). When a read command "misses" the cache memory (20), the CCU accesses the memory modules (28) for allocating its cache memory (20) and for returning read data to the processors (30) or input/output devices (32). To inhibit reads to locations in the cache for which there is a write-in-progress, the CCU includes a Processor Index random-access-memory (PIR 20) that temporarily stores memory addresses for which there is a write-in-progress. The PIR is used by the cache memory to force a "miss" for all references to the memory address contained therein until the CCU updates the cache memory. The CCU also includes a duplicate tag store (67) that maintains a copy of the cache memory address tag store (20A), thereby enabling the CCU to update its cache memory when data is written into a main memory location that is to be maintained in the cache memory.
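
The write-in-progress interlock is the most self-contained part of this design. A rough sketch of that idea follows (illustrative Python with invented names; the PIR in the patent is a hardware RAM, not a set):

```python
class WriteInProgressFilter:
    """Sketch of the PIR idea: addresses with a write in progress force a
    cache miss until the CCU has updated the cache."""

    def __init__(self):
        self.pending_writes = set()   # addresses with a write in progress
        self.cache = {}               # address -> cached copy of memory data

    def begin_write(self, addr):
        self.pending_writes.add(addr)     # the store itself proceeds to main memory

    def complete_write(self, addr, data):
        self.cache[addr] = data           # CCU updates the cache copy
        self.pending_writes.discard(addr)

    def read(self, addr):
        if addr in self.pending_writes or addr not in self.cache:
            return None                   # forced miss: read from main memory
        return self.cache[addr]
```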

112 citations


Patent
05 Jun 1981
TL;DR: In this paper, a cache memory is provided for storing blocks of data which are most likely to be needed by the host processor in the near future, and a directory table is maintained wherein all data in cache is listed at a "home" position.
Abstract: In a data processing system of the type wherein a host processor transfers data to or from a plurality of attachment devices, a cache memory is provided for storing blocks of data which are most likely to be needed by the host processor in the near future. The host processor can then merely retrieve the necessary information from the cache memory without the necessity of accessing the attachment devices. When transferring data to cache from an attachment disk, additional unrequested information can be transferred at the same time if it is likely that this additional data will soon be requested. Further, a directory table is maintained wherein all data in cache is listed at a "home" position and, if more than one block of data in cache has the same home position, a conflict chain is set up so that checking the contents of the cache can be done simply and quickly.
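
The directory table amounts to a hash table whose collisions are kept on a per-home conflict chain. A minimal sketch with invented names (the patent does not specify this exact data layout):

```python
class CacheDirectory:
    """Each cached block is listed at a 'home' slot; blocks whose home slots
    collide are linked on a conflict chain, so a lookup is a short walk."""

    def __init__(self, slots=256):
        self.slots = slots
        self.table = [[] for _ in range(slots)]   # slot -> conflict chain

    def home(self, block_id):
        return hash(block_id) % self.slots

    def insert(self, block_id, cache_frame):
        self.table[self.home(block_id)].append((block_id, cache_frame))

    def lookup(self, block_id):
        for bid, frame in self.table[self.home(block_id)]:
            if bid == block_id:
                return frame              # block is in the cache
        return None                       # miss: go to the attachment device
```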

80 citations


Patent
Robert Percy Fletcher
06 Jul 1981
TL;DR: In this paper, the replacement selection of entries in a second level (L2) cache directory of a storage hierarchy is controlled using the replaced and hit addresses of a dynamic look-aside translation buffer (DLAT) at the first level (L1) in the hierarchy, which receives CPU storage requests along with the CPU cache and its directory.
Abstract: The disclosure controls the replacement selection of entries in a second level (L2) cache directory of a storage hierarchy using replaced and hit addresses of a dynamic look-aside translation buffer (DLAT) at the first level (L1) in the hierarchy which receives CPU storage requests along with the CPU cache and its directory. The DLAT entries address page size blocks in main storage (MS). The disclosure provides a replacement (R) flag for each entry in the L2 directory, which represents a page size block in the L2 cache. An R bit is selected and turned on by the address of a DLAT replaced page which is caused by a DLAT miss to indicate its associated page is a candidate for replacement in the L2 cache. However, the page may continue to be accessed in the L2 cache until it is actually replaced. An R bit is selected and turned off by a CPU request address causing a DLAT hit and a L1 cache miss to indicate its associated L2 page is not a candidate for replacement.
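
The R-flag protocol can be summarized as three events on a per-page flag. A rough sketch under that reading (Python, invented names; the victim-selection fallback is an assumption, not from the patent):

```python
class L2ReplacementFlags:
    """R flag per L2 page: set when the DLAT replaces the page's entry,
    cleared when a DLAT hit plus L1 miss shows the page is still in use."""

    def __init__(self):
        self.replace_flag = {}            # L2 page address -> bool

    def on_dlat_replacement(self, page_addr):
        self.replace_flag[page_addr] = True    # candidate for L2 replacement

    def on_dlat_hit_l1_miss(self, page_addr):
        self.replace_flag[page_addr] = False   # page still actively referenced

    def pick_victim(self, candidate_pages):
        for page in candidate_pages:
            if self.replace_flag.get(page, False):
                return page                    # prefer a flagged page
        return candidate_pages[0]              # assumed fallback policy
```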

78 citations


Patent
24 Mar 1981
TL;DR: In this paper, the integrity of data in each cache with respect to the shared memory modules is maintained by providing each shared memory with a cache monitoring and control capability which monitors processor reading and writing requests and, in response to this monitoring, maintains an accurate, updatable record of the data addresses in cache while also providing for invalidating data in a cache when it is no longer valid.
Abstract: A data processing system having a plurality of processors and a plurality of dedicated and shared memory modules. Each processor includes a cache for speeding up data transfers between the processor and its dedicated memory and also between the processor and one or more shared memories. The integrity of the data in each cache with respect to the shared memory modules is maintained by providing each shared memory with a cache monitoring and control capability which monitors processor reading and writing requests and, in response to this monitoring, maintains an accurate, updatable record of the data addresses in each cache while also providing for invalidating data in a cache when it is no longer valid.
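
In modern terms this is a directory kept at the shared memory. A simplified sketch of the monitoring and invalidation bookkeeping (illustrative names; the patent describes hardware in the memory module, not software):

```python
class SharedMemoryMonitor:
    """Tracks which processor caches hold each address and invalidates
    stale copies when another processor writes that address."""

    def __init__(self):
        self.holders = {}                 # address -> set of processor ids

    def on_read(self, proc_id, addr):
        self.holders.setdefault(addr, set()).add(proc_id)

    def on_write(self, proc_id, addr):
        stale = self.holders.get(addr, set()) - {proc_id}
        for other in stale:
            self.invalidate(other, addr)  # data in that cache is no longer valid
        self.holders[addr] = {proc_id}

    def invalidate(self, proc_id, addr):
        # Stand-in for the invalidation signal sent to the processor's cache.
        print(f"invalidate address {addr:#x} in cache of processor {proc_id}")
```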

72 citations


Patent
22 Jan 1981
TL;DR: In this article, a data processing system includes at least two processors, each having a cache memory containing an index section and a memory section, each of which can respond to an external request derived from the other processor which is simultaneously processing a task.
Abstract: A data processing system includes at least two processors, each having a cache memory containing an index section and a memory section. A first processor performs a task by deriving internal requests for its cache memory which also may respond to an external request derived from the other processor which is simultaneously processing a task. To avoid a conflict between the simultaneous processing of an internal request and of an external request by the same cache memory, one request may act on the other by delaying its enabling or by suspending its processing from the instant at which these requests are required to operate simultaneously on the index section or the memory section of the cache memory of the processor affected by these requests. Thereby, the tasks are performed by the system at an increased speed.

71 citations


Patent
22 May 1981
TL;DR: A fast cache flush mechanism includes, associated with the cache, an auxiliary portion (termed a flush count memory) that references a flush counter during the addressing of the cache.
Abstract: A fast cache flush mechanism includes, associated with the cache, an auxiliary portion (termed a flush count memory) that references a flush counter during the addressing of the cache. This flush counter preferably has a count capacity of the same size as the number of memory locations in the cache. Whenever the cache is updated, the current value of the flush counter is written into the location in the flush count memory associated with the memory location in the cache pointed to by an accessing address, and the valid bit is set. Whenever it is desired to flush the cache, the contents of the flush counter are changed (e.g. incremented) to a new value which is then written as the new cache index into the location of the flush count memory associated with that flush count, and the associated valid bit is cleared or reset. Any access to the cache by the address requires that the cache index in the associated flush count memory location match the current contents of the flush counter and that the valid bit be set. When the cache is flushed by the above procedure, these conditions cannot be fulfilled, since the current contents of the flush counter do not match any cache index or the valid bit has been reset. As a result, for each addressed memory location that has not been accessed since the last cache flush command (corresponding to the latest incrementing of the flush counter), the total contents of that memory location (i.e. data, cache index and validity bit) are updated in the manner described above. Through this procedure, once the contents of the flush counter have recycled back to a previous value, it is guaranteed that each memory location in the cache will have been flushed and, in many instances, updated with valid data.
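
The essential trick is a generation counter: each slot remembers the counter value at its last update, and a lookup is valid only if that value still matches. A condensed sketch (Python, invented names; it ignores the counter wrap-around bookkeeping that the abstract's recycling discussion covers):

```python
class FlushCountCache:
    """Flush-by-counter: incrementing one counter logically invalidates
    every slot written under an older counter value."""

    def __init__(self, size=1024):
        self.size = size
        self.flush_counter = 0
        self.data = [None] * size
        self.flush_count = [0] * size     # the "flush count memory"
        self.valid = [False] * size

    def write(self, addr, value):
        slot = addr % self.size
        self.data[slot] = value
        self.flush_count[slot] = self.flush_counter   # stamp current generation
        self.valid[slot] = True

    def read(self, addr):
        slot = addr % self.size
        if self.valid[slot] and self.flush_count[slot] == self.flush_counter:
            return self.data[slot]
        return None                        # stale or never written: a miss

    def flush(self):
        self.flush_counter += 1            # one step flushes the whole cache
```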

52 citations


Patent
03 Aug 1981
TL;DR: In this paper, data is promoted from a backing store (disk storage apparatus termed DASD) to a random access cache in a peripheral data storage system by sending a sequential access bit to the storage system.
Abstract: Data is promoted from a backing store (disk storage apparatus termed DASD) to a random access cache in a peripheral data storage system. When a sequential access bit is sent to the storage system, all data specified in a read command is fetched to the cache from DASD. If such prefetched data is replaced from cache and the sequential bit is on, a subsequent host access request for such data causes all related data not yet read, up to a predetermined maximum, to be promoted to cache.
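
The re-promotion rule lends itself to a one-function sketch (illustrative names and cap; the patent's command interface is more involved):

```python
def repromote_after_replacement(related_blocks, already_read, max_promote):
    """When prefetched sequential data has been replaced from cache and the
    host asks for it again, promote the related blocks not yet read, up to
    a predetermined maximum."""
    not_yet_read = [b for b in related_blocks if b not in already_read]
    return not_yet_read[:max_promote]

# Blocks 0-11 are related, 0-3 were already read, cap of 6: promote 4..9.
print(repromote_after_replacement(list(range(12)), set(range(4)), 6))
```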

Patent
03 Aug 1981
TL;DR: In this article, the cache directory is used to determine if an invalid signal group is stored in the associated cache storage unit, and when an invalid group is found in the cache storage, this group is rendered unavailable to the data processing unit during the present cache memory cycle without interrupting the normal cache memory operation during succeeding cache memory cycles.
Abstract: In a cache memory unit including a cache directory identifying signal groups stored in an associated cache storage unit, apparatus and method are disclosed for searching the cache directory during a second portion of the cache memory cycle when the cache directory is not needed for normal operation, to determine if an invalid signal group is stored in the associated cache storage. When an invalid signal group is found in the cache storage, this signal group is rendered unavailable to the data processing unit during the present cache memory cycle without interrupting the normal cache memory operation during succeeding cache memory cycles.

Patent
19 Aug 1981
TL;DR: In this paper, first and second coincidence circuits coupled to the cache buffer circuit compare all of the memorized store address data with the accompanying readout address data and make the first and second cache control circuits preferentially process the buffer store request prior to each of the readout requests.
Abstract: In a cache memory arrangement used between a central processor (21) and a main memory (22) and comprising operand and instruction cache memories (31, 32), a cache buffer circuit (40) is responsive to storage requests from the central processor to individually memorize the accompanying storage data and store address data and to produce the memorized storage data and store address data as buffer output data and buffer output address data together with a buffer store request. Responsive to the buffer store request, first and second cache control circuits (36, 37) transfer the buffer output data and the accompanying buffer output address data to the operand and the instruction cache memories, if each of the operand and the instruction cache memories is not supplied with any readout requests. Preferably, first and second coincidence circuits (51, 52) are coupled to the cache buffer circuit and responsive to the readout requests to compare all of the memorized store address data with the accompanying readout address data and to make the first and the second cache control circuits preferentially process the buffer store request prior to each of the readout requests. The buffer circuit may comprise two pairs of buffers (41, 42; 63, 64), each pair being for memorizing each of the store address data and the storage data. An address converter (70) may be attached to the arrangement to convert a logical address represented by each address data into a physical address.
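
Functionally, the buffer circuit behaves like a store buffer that drains into whichever cache is idle, with an address-match check that forces a pending store to complete before a conflicting read. A rough sketch under that reading (invented names, single cache for brevity):

```python
class StoreBuffer:
    """Buffers stores and drains them when the cache is idle; a readout whose
    address matches a buffered store is served only after that store."""

    def __init__(self):
        self.pending = []                 # queue of (address, data)

    def buffer_store(self, addr, data):
        self.pending.append((addr, data))

    def drain_if_idle(self, cache, cache_busy_with_readout):
        if not cache_busy_with_readout and self.pending:
            addr, data = self.pending.pop(0)
            cache[addr] = data

    def readout(self, cache, addr):
        # Coincidence check: matching buffered stores are processed first.
        for pend_addr, data in list(self.pending):
            if pend_addr == addr:
                cache[pend_addr] = data
                self.pending.remove((pend_addr, data))
        return cache.get(addr)
```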

Patent
Wing N. Toy
02 Feb 1981
TL;DR: In this paper, the cache memory comprises a cache address unit which stores the subset of the real address bits from the address translation buffer (ATB) in order to increase cache memory performance.
Abstract: In a computer system having a cache memory and using virtual addressing, effectiveness of the cache is improved by storing a subset of the least significant real address bits obtained by translation of a previous virtual address and by using this subset in subsequent cache addressing operations. The system functions in the following manner. In order to access a memory location in either the main memory or cache memory, a processor generates and transmits virtual address bits to the memories. The virtual address bits comprise segment, page and word address bits. The word address bits do not have to be translated, but an address translation buffer (ATB) translates the segment and page address into real address bits. A subset of the least significant bits of the latter, together with the word address bits, represents the address needed for accessing the cache. In order to increase cache memory performance, the cache memory comprises a cache address unit which stores the subset of the real address bits from the ATB. These stored address bits are used in subsequent operations along with the word address bits for accessing the cache memory until the stored address bits no longer equal the current subset of least significant real address bits transmitted from the ATB. When the stored address bits no longer equal the current subset, the cache address unit then stores the current subset; and the cache memory is reaccessed utilizing the word address bits and current subset.
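
One way to read the scheme: the cache is indexed speculatively with the word bits plus the last-seen subset of translated real-address bits, and re-accessed only when the new translation disagrees. A sketch with made-up field widths (not from the patent):

```python
class CacheAddressUnit:
    """Keeps the least significant real address bits from the previous ATB
    translation and reuses them to address the cache."""

    WORD_FIELD_BITS = 10                  # hypothetical width of the word field

    def __init__(self):
        self.stored_real_bits = 0         # subset saved from the last translation

    def cache_index(self, word_bits, real_bits_from_atb):
        # Speculative access with the stored subset (available immediately).
        index = (self.stored_real_bits << self.WORD_FIELD_BITS) | word_bits
        if real_bits_from_atb != self.stored_real_bits:
            # Translation disagrees: refresh the subset and re-access the cache.
            self.stored_real_bits = real_bits_from_atb
            index = (real_bits_from_atb << self.WORD_FIELD_BITS) | word_bits
        return index
```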

Journal ArticleDOI
TL;DR: The memory system of the Dorado, a compact high-performance personal computer, has very high I/O bandwidth, a large paged virtual memory, a cache, and heavily pipelined control; this paper discusses all of these in detail.
Abstract: The memory system of the Dorado, a compact high-performance personal computer, has very high I/O bandwidth, a large paged virtual memory, a cache, and heavily pipelined control; this paper discusses all of these in detail. Relatively low-speed I/O devices transfer single words to or from the cache; fast devices, such as a color video display, transfer directly to or from main storage while the processor uses the cache. Virtual addresses are used in the cache and for all I/O transfers. The memory is controlled by a seven-stage pipeline, which can deliver a peak main-storage bandwidth of 533 million bits/s to service fast I/O devices and cache misses. Interesting problems of synchronization and scheduling in this pipeline are discussed. The paper concludes with some performance measurements that show, among other things, that the cache hit rate is over 99 percent.

Journal ArticleDOI
TL;DR: This paper summarizes research by the author on two topics: cache disks and file migration, by which files are migrated between disk and mass storage as needed in order to effectively maintain on-line a much larger amount of information than the disks can hold.

Patent
Dana R Spencer
15 Jun 1981
TL;DR: In this article, the cache allocates a line for LS use by sending a special signal with an address for a line in a special area in main storage which is non-program addressable (i.e. not addressable by any of the architected instructions of the processor).
Abstract: The disclosure pertains to a relatively small local storage (LS) in a processor's IE which can be effectively expanded by utilizing a portion of a processor's store-in-cache. The cache allocates a line (i.e. block) for LS use by the instruction unit sending a special signal with an address for a line in a special area in main storage which is non-program addressable (i.e. not addressable by any of the architected instructions of the processor). The special signal suppresses the normal line fetch operation of the cache from main storage caused when the cache does not have a requested line. After the initial allocation of the line space in the cache to LS use, the normal cache operation is again enabled, and the LS line can be castout to the special area in main storage and be retrieved therefrom to the cache for LS use.

Patent
16 Sep 1981
TL;DR: In this article, a logic control system comprised of a cache memory system and a transfer control logic unit is disclosed for accommodating the flow of both procedural information and CPU (central processing unit) instructions from a central memory system on a common communication bus to a CPU.
Abstract: A logic control system comprised of a cache memory system and a transfer control logic unit is disclosed for accommodating the flow of both procedural information and CPU (central processing unit) instructions from a central memory system on a common communication bus to a CPU. The CPU and the transfer control logic unit communicate by way of the cache memory system with the common communication bus. In response to a CPU request to the central memory system, procedural information and instructions are requested by the transfer control logic unit from the cache memory system and presented to the CPU in such a manner as to avoid interruptions in CPU activity caused by information transfer delays.

Patent
11 Sep 1981
TL;DR: In this paper, a digital computer system for encaching data stored in the computer system's memory in a cache internal to the processor includes a cache for storing the encacheable items.
Abstract: Apparatus in a digital computer system for encaching data stored in the computer system's memory in a cache internal to the computer system's processor. The processor executes procedures (sequences of instructions). Data processed during an execution of a procedure is represented by operands associated with the procedure. Certain of the operands designate encacheable data items associated with each execution of the procedure. The values of the encacheable data items do not change for the duration of the execution. The operands designating encacheable data do so by means of codes specifying the designated encacheable items. The processor includes a cache for storing the encacheable items. The cache responds to a code specifying an encacheable item by outputting the value of the encacheable item specified by the code. The processor further includes cache loading apparatus for loading the encacheable items into the cache. The operations executed by the processor include a call operation which commences an execution of a procedure. The cache loading apparatus responds to the call operation by loading the encacheable items for the execution being commenced into the cache. The encacheable items include pointers to arguments for each execution of a procedure. The pointers are stored in the frame corresponding to the execution at negative offsets from the frame pointer. The codes in the operands specify the negative offsets.

Patent
05 Oct 1981
TL;DR: In this article, the cache directories (PD) are addressed by the non-translatable address part and by some bits of the translatable address part so that cache synonyms occur.
Abstract: In a multiprocessor (MP) system, each processor (CP) has an associated store-in-buffer cache (BCE) whose set-associative classes (lines) are designated exclusive (EX) or shareable by all processors (RO). The cache directories (PD) are addressed by the non-translatable address part and by some bits of the translatable address part so that cache synonyms occur. Accesses of a processor to its cache are either flagged exclusive (EX) for store instructions and operand fetches or shareable (RO) for instruction fetches. To reduce the number of castouts in case of cache conflicts, the shareability of a new cache line brought in after a cache miss is determined from its original shareability, the type of access to be performed and the change status of the line. An exclusive (operand) fetch to a cache line a copy of which is in a remote processor and designated exclusive is changed to RO if no changes have been made to that line. A synonym designated RO in the processor's own cache is duplicated to speed up further accesses. Synonym detection in all caches is performed in parallel by permuting the translatable address bits. For faster operation, each processor has a further cache directory (CD) which is used for synonym detection and cross-interrogation.
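
One reading of the shareability rule for a line brought in after a miss, reduced to a small decision function (a loose interpretation of the abstract with invented names, not the patent's exact logic):

```python
def shareability_after_miss(access_type, remote_state, remote_line_changed):
    """Instruction fetches arrive read-only (RO); stores arrive exclusive
    (EX); an exclusive operand fetch is demoted to RO when a remote cache
    holds the line EX but unchanged, avoiding a castout."""
    if access_type == "instruction_fetch":
        return "RO"
    if (access_type == "operand_fetch"
            and remote_state == "EX"
            and not remote_line_changed):
        return "RO"
    return "EX"

assert shareability_after_miss("store", "EX", True) == "EX"
assert shareability_after_miss("operand_fetch", "EX", False) == "RO"
```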

Proceedings ArticleDOI
12 May 1981
TL;DR: Queuing models were developed to analyze alternative main memory update policies in a multiprocessor system and results predicted by the models were validated by a set of simulations.
Abstract: Cache memory has played a significant role in the memory hierarchy and has been used extensively in large systems and minisystems. The effectiveness of cache memories with alternative main memory update policies in a multiprocessor system is a major concern in this paper. The performances of write-through with write-allocation or no-write allocation, buffered write-through, flag-swap, and buffered flag-swap policies have been analyzed. Because of the dominating cost of the interface between processors and main memory modules in the multiprocessor system, the effect of varying the bus width or block size has also been considered. Queuing models were developed to analyze these alternative organizations, and results predicted by the models were validated by a set of simulations.
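
The paper's queuing models are not reproduced here, but the traffic difference that drives them can be illustrated with a back-of-the-envelope count of main-memory words moved per processor reference (all parameter values below are made up for illustration):

```python
def bus_traffic_per_reference(miss_rate, write_fraction, block_words,
                              policy, dirty_fraction=0.5):
    """Crude words-per-reference estimate of main-memory traffic for two
    update policies; a stand-in for intuition, not the paper's models."""
    if policy == "write-through":
        # Every write goes to memory; every miss fetches a block.
        return miss_rate * block_words + write_fraction
    if policy == "write-back":
        # Misses fetch a block and may first cast out a dirty block.
        return miss_rate * block_words * (1 + dirty_fraction)
    raise ValueError(policy)

print(bus_traffic_per_reference(0.05, 0.2, 4, "write-through"))  # 0.4
print(bus_traffic_per_reference(0.05, 0.2, 4, "write-back"))     # ~0.3
```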

01 Jan 1981
TL;DR: The study shows that the set associative mapping mechanism, the write through with buffering updating scheme and the no write allocation block fetch strategy are suitable for shared cache systems; however, for private cache systems, the write back with buffering updating scheme and the write allocation block fetch strategy are considered in this thesis.
Abstract: Organizations of shared two-level memory hierarchies for parallel-pipelined multiple instruction stream processors are studied. The multiple-copy data problems are totally eliminated by sharing the caches. All memory modules are assumed to be identical and cache addresses are interleaved by sets. For a parallel-pipelined processor of order (s,p), which consists of p parallel processors each of which is a pipelined processor with degree of multiprogramming s, there can be up to sp cache requests from distinct instruction streams in each instruction cycle. The cache memory interference and shared cache hit ratio in such systems are investigated. The study shows that the set associative mapping mechanism, the write through with buffering updating scheme and the no write allocation block fetch strategy are suitable for shared cache systems. However, for private cache systems, the write back with buffering updating scheme and the write allocation block fetch strategy are considered in this thesis. Performance analysis is carried out by using discrete Markov chains and probability-based theorems. Some design tradeoffs are discussed and examples are given to illustrate a wide variety of design options that can be obtained. Performance differences due to alternative architectures are also shown by a performance comparison between shared cache and private cache for a wide range of parameters.

Journal ArticleDOI
TL;DR: A growth path for current microprocessors is suggested which includes bus enhancements and cache memories, and several differences from the mainframe world are pointed out.
Abstract: A growth path for current microprocessors is suggested which includes bus enhancements and cache memories. The implications are examined, and several differences from the mainframe world are pointed out.

Patent
02 Apr 1981
TL;DR: In this paper, the addressable cache memory feature overcomes the latency delay which inherently occurs in seeking the beginning of a region to be accessed on the disk drive mass storage by all of the processors.
Abstract: In a multiprocessor system (2), a controllable cache store (18) interfaced to a shared disk memory (24) employs a plurality of storage partitions whose access is interleaved in a time domain multiplexed manner on a common bus (16) with the shared disk to enable high speed sharing of the disk storage by all of the processors (4, 6, 8). Communications between each processor and its corresponding cache memory partition can be overlapped with one another and with accesses between the cache memory (18) and the commonly shared disk memory (24). The addressable cache memory feature overcomes the latency delay which inherently occurs in seeking the beginning of a region to be accessed on the disk drive mass storage (24).

DOI
01 Nov 1981
TL;DR: The paper reviews the potential of extending cache techniques to the control memory level of a computer and discusses control-store cache applications and the advantages that should be made possible by the use of a hierarchical writeable control-store technique.
Abstract: The high speed buffer or cache type of memory hierarchy is a widely used technique to provide an effective memory access speed-up in the main memory of a computer system. The paper reviews the potential of extending cache techniques to the control memory level of a computer. A brief introductory review is made of microprogramming and cache techniques before discussing control-store cache applications and the advantages that should be made possible by the use of a hierarchical writeable control-store technique. So far, little quantitative measurement has been made of this type of memory hierarchy, and mention is made of work currently in progress to provide some of the information necessary to design such systems.