
Showing papers on "Cache invalidation published in 1981"


Proceedings ArticleDOI
12 May 1981
TL;DR: A cache organization is presented that essentially eliminates the penalty imposed on subsequent cache references following a cache miss. The organization has been incorporated in a cache/memory interface subsystem design, and the design has been implemented and prototyped.
Abstract: In the past decade, there has been much literature describing various cache organizations that exploit general programming idiosyncrasies to obtain maximum hit rate (the probability that a requested datum is resident in the cache). Little, if any, has been presented that exploits: (1) the inherent dual-input nature of the cache and (2) the many-datum-reference type of central processor instruction. No matter how high the cache hit rate is, a cache miss may impose a penalty on subsequent cache references. This penalty is the necessity of waiting until the missed datum is received from central memory and, possibly, until the cache is updated. In the two cases above, the cache references following a miss do not require the datum that is not resident in the cache, yet they are penalized in this fashion. In this paper, a cache organization is presented that essentially eliminates this penalty. This organizational feature has been incorporated in a cache/memory interface subsystem design, and the design has been implemented and prototyped. An existing machine with a simple instruction set has verified the advantage of this feature; future machines with more extensive and sophisticated instruction sets can obviously take greater advantage. Prior to prototyping, simulations verified the advantage.
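The central idea, that references issued after a miss need not wait for the missing datum, can be illustrated with a small sketch. Nothing below is taken from the paper: the direct-mapped organization, the single outstanding-miss register, and all identifiers are assumptions made purely for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_LINES 256                  /* assumed direct-mapped cache size */

typedef struct {
    bool     valid;
    uint32_t tag;
    uint32_t data;
} line_t;

typedef struct {
    line_t   lines[NUM_LINES];
    bool     miss_pending;             /* one fill from central memory in flight */
    uint32_t pending_addr;             /* address that fill will satisfy         */
} cache_t;

/* Returns true if the reference is serviced immediately (a hit); false if it
 * must wait, either for its own fill or because the single fill slot is busy.
 * The point of the organization: a hit is serviced at once even while an
 * earlier miss is still outstanding. */
bool reference(cache_t *c, uint32_t addr, uint32_t *out)
{
    uint32_t idx = addr % NUM_LINES;
    uint32_t tag = addr / NUM_LINES;
    line_t *l = &c->lines[idx];

    if (l->valid && l->tag == tag) {   /* hit: no penalty from the pending miss */
        *out = l->data;
        return true;
    }
    if (!c->miss_pending) {            /* start a fill; later references proceed */
        c->miss_pending = true;
        c->pending_addr = addr;
    }
    return false;                      /* only misses wait on central memory */
}
```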

504 citations


Patent
27 Nov 1981
TL;DR: In this article, a buffered cache memory subsystem is described which features a solid-state cache memory connected to a storage director which interfaces a host channel with a control module controlling operation of a long-term data storage device such as a disk drive.
Abstract: A buffered cache memory subsystem is disclosed which features a solid-state cache memory connected to a storage director which interfaces a host channel with a control module controlling operation of a long-term data storage device such as a disk drive. The solid-state cache memory is connected to plural directors which in turn may be connected to differing types of control modules, whereby the cache is usable with more than one type of long-term data storage means within a given system. The cache memory may be field-installed in a preexisting disk drive storage system and is software transparent to the host computer, while providing improvements in overall operating efficiency. In a preferred embodiment, data is only cached when it is expected to be the subject of a future host request.

128 citations


Patent
Robert Percy Fletcher1
31 Mar 1981
TL;DR: In this article, the authors propose a control system for interlocking processors in a multiprocessing organization where each processor has its own high-speed store-in-buffer (SIB) cache and shares a common cache with the other processors.
Abstract: A control system for interlocking processors in a multiprocessing organization. Each processor has its own high-speed store-in-buffer (SIB) cache, and each processor shares a common cache with the other processors. The control system ensures that all processors access the most up-to-date copy of memory information with minimal performance impact. The design allows read-only copies of the same shared memory block (line) to exist simultaneously in all private caches. Lines that are both shared and changed are stored in the common shared cache, which each processor can directly fetch from and store into. The shared cache system dynamically detects lines that are both shared and changed, moves them to the common shared cache, and moves lines out of the shared cache once sharing has ceased.
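The placement rule described above reduces to a very small decision: a line lives in the common shared cache exactly when it is both shared and changed. The sketch below is not from the patent; the state fields and names are assumptions for illustration.

```c
#include <stdbool.h>

/* Hypothetical per-line state; field names are illustrative, not the patent's. */
typedef struct {
    bool shared;    /* copies exist in more than one private cache              */
    bool changed;   /* the line has been stored into since it left main memory  */
} line_state_t;

typedef enum { PRIVATE_CACHE, COMMON_SHARED_CACHE } placement_t;

/* Lines that are both shared and changed migrate to the common shared cache,
 * which every processor can fetch from and store into directly; all other
 * lines stay in the requesting processor's private store-in-buffer cache. */
placement_t place_line(line_state_t s)
{
    return (s.shared && s.changed) ? COMMON_SHARED_CACHE : PRIVATE_CACHE;
}
```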

124 citations


Patent
05 Jun 1981
TL;DR: In this paper, a cache memory is provided for storing blocks of data which are most likely to be needed by the host processor in the near future, and a directory table is maintained wherein all data in cache is listed at a "home" position.
Abstract: In a data processing system of the type wherein a host processor transfers data to or from a plurality of attachment devices, a cache memory is provided for storing blocks of data which are most likely to be needed by the host processor in the near future. The host processor can then merely retrieve the necessary information from the cache memory without the necessity of accessing the attachment devices. When transferring data to cache from an attachment disk, additional unrequested information can be transferred at the same time if it is likely that this additional data will soon be requested. Further, a directory table is maintained wherein all data in cache is listed at a "home" position and, if more than one block of data in cache has the same home position, a conflict chain is set up so that checking the contents of the cache can be done simply and quickly.
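The home-position directory with conflict chains is essentially a hash table with separate chaining. The sketch below is an illustration only, not the patent's design: the slot count, hash function, and identifiers are all assumed.

```c
#include <stdint.h>
#include <stdlib.h>

#define DIR_SLOTS 1024               /* assumed number of home positions */

/* One directory entry: the disk block it describes plus a conflict-chain link. */
typedef struct dir_entry {
    uint32_t block_id;               /* identifies the block held in cache      */
    int      cache_slot;             /* where that block sits in the cache      */
    struct dir_entry *next;          /* next entry sharing the same home        */
} dir_entry_t;

static dir_entry_t *directory[DIR_SLOTS];

/* The "home" position is simply a hash of the block identifier. */
static unsigned home_of(uint32_t block_id) { return block_id % DIR_SLOTS; }

/* Look up a block: start at its home position and walk the conflict chain. */
dir_entry_t *dir_lookup(uint32_t block_id)
{
    for (dir_entry_t *e = directory[home_of(block_id)]; e; e = e->next)
        if (e->block_id == block_id)
            return e;
    return NULL;                     /* not in cache */
}

/* Insert a newly cached block at its home; collisions chain off the head. */
void dir_insert(uint32_t block_id, int cache_slot)
{
    dir_entry_t *e = malloc(sizeof *e);
    e->block_id = block_id;
    e->cache_slot = cache_slot;
    e->next = directory[home_of(block_id)];
    directory[home_of(block_id)] = e;
}
```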

80 citations


Patent
Robert Percy Fletcher1
06 Jul 1981
TL;DR: In this paper, the replacement selection of entries in a second-level (L2) cache directory of a storage hierarchy is controlled using replaced and hit addresses of a dynamic look-aside translation buffer (DLAT) at the first level (L1) in the hierarchy, which receives CPU storage requests along with the CPU cache and its directory.
Abstract: The disclosure controls the replacement selection of entries in a second level (L2) cache directory of a storage hierarchy using replaced and hit addresses of a dynamic look-aside translation buffer (DLAT) at the first level (L1) in the hierarchy, which receives CPU storage requests along with the CPU cache and its directory. The DLAT entries address page-size blocks in main storage (MS). The disclosure provides a replacement (R) flag for each entry in the L2 directory, which represents a page-size block in the L2 cache. An R bit is selected and turned on by the address of a DLAT-replaced page (resulting from a DLAT miss), to indicate that its associated page is a candidate for replacement in the L2 cache. However, the page may continue to be accessed in the L2 cache until it is actually replaced. An R bit is selected and turned off by a CPU request address causing a DLAT hit and an L1 cache miss, to indicate that its associated L2 page is not a candidate for replacement.
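The two events that drive the R flag can be sketched as two handlers over a hypothetical L2 directory. None of the names, sizes, or the direct-mapped lookup below come from the patent; they are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define L2_ENTRIES 4096                 /* assumed L2 directory size */

typedef struct {
    bool     valid;
    uint32_t page_addr;                 /* page-size block this entry maps */
    bool     r_flag;                    /* replacement-candidate flag      */
} l2_entry_t;

static l2_entry_t l2_dir[L2_ENTRIES];

static l2_entry_t *l2_find(uint32_t page_addr)
{
    l2_entry_t *e = &l2_dir[page_addr % L2_ENTRIES];
    return (e->valid && e->page_addr == page_addr) ? e : NULL;
}

/* A DLAT miss replaces a DLAT entry; the replaced page's L2 entry becomes
 * a replacement candidate (its R flag is turned on). */
void on_dlat_replacement(uint32_t replaced_page)
{
    l2_entry_t *e = l2_find(replaced_page);
    if (e) e->r_flag = true;
}

/* A CPU request that hits the DLAT but misses the L1 cache shows the page
 * is still live, so its L2 entry is no longer a replacement candidate. */
void on_dlat_hit_l1_miss(uint32_t requested_page)
{
    l2_entry_t *e = l2_find(requested_page);
    if (e) e->r_flag = false;
}
```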

78 citations


Patent
22 May 1981
TL;DR: As described in this patent, a fast cache flush mechanism includes, associated with the cache, an auxiliary portion (termed a flush count memory) that references a flush counter during the addressing of the cache.
Abstract: A fast cache flush mechanism includes, associated with the cache, an auxiliary portion (termed a flush count memory) that references a flush counter during the addressing of the cache. This flush counter preferably has a count capacity of the same size as the number of memory locations in the cache. Whenever the cache is updated, the current value of the flush counter is written into the location in the flush count memory associated with the memory location in the cache pointed to by an accessing address, and the valid bit is set. Whenever it is desired to flush the cache, the contents of the flush counter are changed (e.g. incremented) to a new value which is then written as the new cache index into the location of the flush count memory associated with that flush count, and the associated valid bit is cleared or reset. Any access to the cache by the address requires that the cache index in the associated flush count memory location match the current contents of the flush counter and that the valid bit be set. When the cache is flushed by the above procedure, these conditions cannot be fulfilled, since the current contents of the flush counter do not match any cache index or the valid bit has been reset. As a result, for each addressed memory location that has not been accessed since the last cache flush command (corresponding to the latest incrementing of the flush counter), the total contents of that memory location (i.e. data, cache index and validity bit) are updated in the manner described above. Through this procedure, once the contents of the flush counter have recycled back to a previous value, it is guaranteed that each memory location in the cache will have been flushed and, in many instances, updated with valid data.
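The mechanism amounts to stamping each cache location with the flush-counter value current at its last update, and treating any location whose stamp no longer matches as invalid. The sketch below is a simplification, not the patent's design: address tags and the tag-match check are omitted, and the sizes and names are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINES 1024                /* assumed cache size */

typedef struct {
    uint32_t data;
    uint32_t flush_index;               /* flush-counter value at last update */
    bool     valid;
} cache_line_t;

typedef struct {
    cache_line_t lines[CACHE_LINES];
    uint32_t     flush_counter;         /* bumped on every flush command */
} flush_cache_t;

/* On update, stamp the line with the current flush-counter value. */
void cache_write(flush_cache_t *c, uint32_t addr, uint32_t data)
{
    cache_line_t *l = &c->lines[addr % CACHE_LINES];
    l->data = data;
    l->flush_index = c->flush_counter;
    l->valid = true;
}

/* A "flush" is just a counter increment: no individual line is touched. */
void cache_flush(flush_cache_t *c)
{
    c->flush_counter++;
}

/* A hit requires both the valid bit and a flush index that matches the
 * current counter; anything stamped before the last flush misses. */
bool cache_read(flush_cache_t *c, uint32_t addr, uint32_t *out)
{
    cache_line_t *l = &c->lines[addr % CACHE_LINES];
    if (!l->valid || l->flush_index != c->flush_counter)
        return false;                   /* treated as flushed */
    *out = l->data;
    return true;
}
```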

52 citations


Patent
03 Aug 1981
TL;DR: In this paper, data is promoted from a backing store (disk storage apparatus termed DASD) to a random access cache in a peripheral data storage system by sending a sequential access bit to the storage system.
Abstract: Data is promoted from a backing store (disk storage apparatus termed DASD) to a random access cache in a peripheral data storage system. When a sequential access bit is sent to the storage system, all data specified in a read command is fetched to the cache from DASD. If such prefetched data is replaced from cache and the sequential bit is on, a subsequent host access request for such data causes all related data, up to a predetermined maximum, not yet read to be promoted to cache.
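A rough sketch of the two behaviours described above follows. It is not the patent's implementation: the block-granular interfaces, the stub cache, and the fixed promotion maximum are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_PROMOTE 8                  /* assumed maximum blocks promoted at once */

/* Stub cache standing in for the real subsystem; for illustration only. */
static bool cached[1u << 16];
static void promote_to_cache(uint32_t block) { cached[block & 0xFFFFu] = true; }
static bool in_cache(uint32_t block)         { return cached[block & 0xFFFFu]; }

/* A host read of blocks first..last: when the sequential access bit is on,
 * every block named by the command is fetched from DASD into the cache. */
void on_read(uint32_t first, uint32_t last, bool sequential_bit)
{
    if (!sequential_bit) { promote_to_cache(first); return; }
    for (uint32_t b = first; b <= last; b++)
        promote_to_cache(b);
}

/* If prefetched data was replaced and the host asks for it again while the
 * sequential bit is on, the related blocks not yet read are promoted again,
 * up to a predetermined maximum. */
void on_rerequest(uint32_t block, uint32_t last_unread, bool sequential_bit)
{
    if (!sequential_bit) { promote_to_cache(block); return; }
    uint32_t end = block + MAX_PROMOTE - 1;
    if (end > last_unread) end = last_unread;
    for (uint32_t b = block; b <= end; b++)
        if (!in_cache(b)) promote_to_cache(b);
}
```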

52 citations


Patent
03 Aug 1981
TL;DR: In this article, the cache directory is used to determine if an invalid signal group is stored in the associated cache storage unit, and when an invalid group is found in the cache storage, this group is rendered unavailable to the data processing unit during the present cache memory cycle without interrupting the normal cache memory operation during succeeding cache memory cycles.
Abstract: In a cache memory unit including a cache directory identifying signal groups stored in an associated cache storage unit, apparatus and method are disclosed for searching the cache directory during a second portion of the cache memory cycle, when the cache directory is not needed for normal operation, to determine if an invalid signal group is stored in the associated cache storage. When an invalid signal group is found in the cache storage, this signal group is rendered unavailable to the data processing unit during the present cache memory cycle without interrupting the normal cache memory operation during succeeding cache memory cycles.

50 citations


Patent
Robert Percy Fletcher1
30 Dec 1981
TL;DR: The hybrid cache control described in this patent provides a sharing (SH) flag with each line representation in each private CP cache directory in a multiprocessor (MP), to indicate for each line in the associated cache whether it is to be handled as a store-in-cache (SIC) line, when its SH flag is in the non-sharing state, or as a store-through (ST) line, when its SH flag is in the sharing state.
Abstract: The hybrid cache control provides a sharing (SH) flag with each line representation in each private CP cache directory in a multiprocessor (MP) to uniquely indicate for each line in the associated cache whether it is to be handled as a store-in-cache (SIC) line, when its SH flag is in non-sharing state, or as a store-through (ST) cache line, when its SH flag is in sharing state. At any time the hybrid cache can have some lines operating as ST lines and other lines as SIC lines. A newly fetched line (resulting from a cache miss) has its SH flag set to non-sharing (SIC) state in the location determined by the cache replacement selection circuits, unless a cross-interrogation (XI) hit in another cache is found by the cross-interrogation (XI) controls, in which case the SH flag for the requested line is dynamically set to sharing (ST) state. The XI controls cross-interrogate all other cache directories in the MP for every store or fetch cache miss and for every store cache hit to a ST line (having SH = 1). An XI hit signals that a conflicting copy of the line has been found in another cache. If the conflicting cache line is changed from its corresponding MS line, the cache line is cast out to MS. The sharing (SH) flag for the conflicting line is set to sharing state for a fetch miss, but the conflicting line is invalidated for a store miss.
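The per-line choice between store-in and store-through behaviour, and the cross-interrogation on misses, can be sketched roughly as below. This is not the patent's logic: the stubbed XI and main-storage interfaces, the remote_changed parameter, and all names are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative line state in one private CP cache; names are not the patent's. */
typedef struct {
    bool     valid;
    bool     sh;       /* sharing flag: false = store-in (SIC), true = store-through (ST) */
    bool     changed;  /* line differs from its main-storage (MS) copy */
    uint32_t data;
} cp_line_t;

/* Stubs standing in for main storage and the cross-interrogate (XI) controls. */
static void store_through_to_ms(uint32_t addr, uint32_t data) { (void)addr; (void)data; }
static void castout_to_ms(uint32_t addr)                      { (void)addr; }
static bool xi_hit_in_other_cache(uint32_t addr)              { (void)addr; return false; }
static void xi_invalidate_others(uint32_t addr)               { (void)addr; }
static void xi_set_others_shared(uint32_t addr)               { (void)addr; }

/* Store hit: an SIC line is updated in the cache only; an ST line is also
 * stored through to MS and the other directories are cross-interrogated. */
void store_hit(cp_line_t *l, uint32_t addr, uint32_t data)
{
    l->data = data;
    if (!l->sh) {
        l->changed = true;                /* store-in: the change stays local */
    } else {
        store_through_to_ms(addr, data);  /* store-through: MS kept current   */
        xi_invalidate_others(addr);
    }
}

/* Miss: XI every other directory. A conflict on a fetch miss marks the new
 * line shared (ST); a conflict on a store miss invalidates the remote copy.
 * A changed remote copy is cast out to MS first. */
void handle_miss(cp_line_t *newline, uint32_t addr, bool is_store, bool remote_changed)
{
    newline->sh = false;                  /* default: non-sharing, store-in */
    if (xi_hit_in_other_cache(addr)) {
        if (remote_changed)
            castout_to_ms(addr);
        if (is_store) {
            xi_invalidate_others(addr);
        } else {
            newline->sh = true;           /* fetch miss with a conflict: run as ST */
            xi_set_others_shared(addr);
        }
    }
    newline->valid = true;
    newline->changed = false;
}
```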

49 citations


Journal ArticleDOI
TL;DR: The memory system of the Dorado, a compact high-performance personal computer, has very high I/O bandwidth, a large paged virtual memory, a cache, and heavily pipelined control; this paper discusses all of these in detail.
Abstract: The memory system of the Dorado, a compact high-performance personal computer, has very high I/O bandwidth, a large paged virtual memory, a cache, and heavily pipelined control; this paper discusses all of these in detail. Relatively low-speed I/O devices transfer single words to or from the cache; fast devices, such as a color video display, transfer directly to or from main storage while the processor uses the cache. Virtual addresses are used in the cache and for all I/O transfers. The memory is controlled by a seven-stage pipeline, which can deliver a peak main-storage bandwidth of 533 million bits/s to service fast I/O devices and cache misses. Interesting problems of synchronization and scheduling in this pipeline are discussed. The paper concludes with some performance measurements that show, among other things, that the cache hit rate is over 99 percent.

36 citations


Journal ArticleDOI
TL;DR: This paper summarizes research by the author on two topics: cache disks and file migration, by which files are migrated between disk and mass storage as needed in order to effectively maintain on-line a much larger amount of information than the disks can hold.

Patent
Dana R Spencer1
15 Jun 1981
TL;DR: In this article, the cache allocates a line for LS use when the instruction unit sends a special signal with an address for a line in a special area in main storage which is non-program addressable (i.e. not addressable by any of the architected instructions of the processor).
Abstract: The disclosure pertains to a relatively small local storage (LS) in a processor's IE which can be effectively expanded by utilizing a portion of a processor's store-in-cache. The cache allocates a line (i.e. block) for LS use upon the instruction unit sending a special signal with an address for a line in a special area in main storage which is non-program addressable (i.e. not addressable by any of the architected instructions of the processor). The special signal suppresses the normal line fetch operation of the cache from main storage caused when the cache does not have a requested line. After the initial allocation of the line space in the cache to LS use, the normal cache operation is again enabled, and the LS line can be cast out to the special area in main storage and be retrieved therefrom to the cache for LS use.

Patent
16 Sep 1981
TL;DR: In this article, a data processing system includes a plurality of central processing units (CP0 to CP3) each including an instruction execution unit (IE) and a buffer control unit (BCE).
Abstract: A data processing system includes a plurality of central processing units (CP0 to CP3) each including an instruction execution unit (IE) and a buffer control unit (BCE). Each buffer control unit includes a store in cache, a cache directory and cache controls. The processing units are coupled to a shared main storage system (MS) through system control means (SC0, SC1). The system control means includes a plurality of copy directories, each corresponding to an associated one of the cache directories. When a processing unit issues a clear storage command, the address thereof is directed to all of the copy directories. For each one found to refer to the address, an invalidate signal is sent to the associated cache directory to invalidate its corresponding entry, if permitted by its processing unit, and a resend signal is sent to the processor which issued the clear storage command. The entries in the copy directories corresponding to invalidated entries in the cache directories are then also invalidated. The procedure continues until all corresponding entries in the directories are invalidated, at which time the directory search results in an accept signal which enables the main storage area defined by the command to be cleared.
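The resend-until-clear handshake can be sketched as a loop over per-CP directory state. This is an illustration only, not the patent's design: the permission stub, the flat state structure, and all identifiers are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_CPS 4    /* CP0..CP3 as in the disclosure */

/* Per-CP state: the SC's copy directory mirrors that CP's cache directory. */
typedef struct {
    bool copy_dir_has_line;   /* copy directory in the SC still refers to the address */
    bool cache_dir_has_line;  /* the BCE's own cache directory still has the entry    */
} cp_state_t;

/* In the real system the IE may defer the invalidation, forcing the clear
 * command to be resent; this stub assumes permission is always granted. */
static bool invalidate_permitted(const cp_state_t *cp)
{
    (void)cp;
    return true;
}

/* Clear-storage handling: every copy directory that refers to the address
 * causes an invalidate to be sent to its cache directory; the command is
 * resent until no copy directory entry remains, then MS may be cleared. */
bool clear_storage(cp_state_t cps[], int n, uint32_t addr)
{
    (void)addr;
    for (;;) {
        bool resend = false;
        for (int i = 0; i < n; i++) {
            if (!cps[i].copy_dir_has_line)
                continue;
            if (invalidate_permitted(&cps[i])) {
                cps[i].cache_dir_has_line = false;  /* entry invalidated in the cache    */
                cps[i].copy_dir_has_line  = false;  /* ...and then in the copy directory */
            } else {
                resend = true;                      /* issuing CP must resend the command */
            }
        }
        if (!resend)
            return true;   /* accept signal: the MS area may now be cleared */
    }
}
```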

Patent
11 Sep 1981
TL;DR: This patent describes apparatus in a digital computer system for encaching, in a cache internal to the processor, data stored in the computer system's memory; the processor includes a cache for storing the encacheable items.
Abstract: Apparatus in a digital computer system for encaching data stored in the computer system's memory in a cache internal to the computer system's processor. The processor executes procedures (sequences of instructions). Data processed during an execution of a procedure is represented by operands associated with the procedure. Certain of the operands designate encacheable data items associated with each execution of the procedure. The values of the encacheable data items do not change for the duration of the execution. The operands designating encacheable data do so by means of codes specifying the designated encacheable items. The processor includes a cache for storing the encacheable items. The cache responds to a code specifying an encacheable item by outputting the value of the encacheable item specified by the code. The processor further includes cache loading apparatus for loading the encacheable items into the cache. The operations executed by the processor include a call operation which commences an execution of a procedure. The cache loading apparatus responds to the call operation by loading the encacheable items for the execution being commenced into the cache. The encacheable items include pointers to arguments for each execution of a procedure. The pointers are stored in the frame corresponding to the execution at negative offsets from the frame pointer. The codes in the operands specify the negative offsets.

Patent
05 Oct 1981
TL;DR: In this article, the cache directories (PD) are addressed by the non-translatable address part and by some bits of the translatable address part so that cache synonyms occur.
Abstract: In a multiprocessor (MP) system each processor (CP) has an associated store-in-buffer cache (BCE) whose set-associative classes (lines) are designated exclusive (EX) or shareable by all processors (RO). The cache directories (PD) are addressed by the non-translatable address part and by some bits of the translatable address part, so that cache synonyms occur. Accesses of a processor to its cache are flagged either exclusive (EX), for store instructions and operand fetches, or shareable (RO), for instruction fetches. To reduce the number of castouts in case of cache conflicts, the shareability of a new cache line brought in after a cache miss is determined from its original shareability, the type of access to be performed and the change status of the line. An exclusive (operand) fetch to a cache line, a copy of which is in a remote processor and designated exclusive, is changed to RO if no changes have been made to that line. A synonym designated RO in the processor's own cache is duplicated to speed up further accesses. Synonym detection in all caches is performed in parallel by permuting the translatable address bits. For faster operation each processor has a further cache directory (CD) which is used for synonym detection and cross-interrogation.

01 Jan 1981
TL;DR: The study shows that the set associative mapping mechanism, the write through with buffering updating scheme and the no write allocation block fetch strategy are suitable for shared cache systems; for private cache systems, however, the write back with buffering updating scheme and the write allocation block fetch strategy are considered in this thesis.
Abstract: Organizations of shared two-level memory hierarchies for parallel-pipelined multiple instruction stream processors are studied. The multicopy data problems are totally eliminated by sharing the caches. All memory modules are assumed to be identical and cache addresses are interleaved by sets. For a parallel-pipelined processor of order (s,p), which consists of p parallel processors each of which is a pipelined processor with degree of multiprogramming s, there can be up to sp cache requests from distinct instruction streams in each instruction cycle. The cache memory interference and shared cache hit ratio in such systems are investigated. The study shows that the set associative mapping mechanism, the write through with buffering updating scheme and the no write allocation block fetch strategy are suitable for shared cache systems. However, for private cache systems, the write back with buffering updating scheme and the write allocation block fetch strategy are considered in this thesis. Performance analysis is carried out by using discrete Markov chains and probability-based theorems. Some design tradeoffs are discussed and examples are given to illustrate a wide variety of design options that can be obtained. Performance differences due to alternative architectures are also shown by a performance comparison between shared cache and private cache for a wide range of parameters.

Patent
25 Feb 1981
TL;DR: In this article, the authors describe a fast synonym detection and handling mechanism for a cache utilizing virtual addressing in data processing systems, where the cache directory is divided into 2^N groups of classes, N being the number of cache address bits derived from the translatable part (PX) of a requested logical address held in a register.
Abstract: The specification describes a fast synonym detection and handling mechanism for a cache (311) utilizing virtual addressing in data processing systems. The cache directory (309) is divided into 2^N groups of classes, in which N is the number of cache address bits derived from the translatable part (PX) of a requested logical address in register (301). The cache address part derived from the non-translatable part of the logical address, i.e. the real part (D), is used to simultaneously access 2^N classes, each in a different group. All class entries are simultaneously compared with one or more DLAT (307) translated absolute addresses. Compare signals, one for each class entry per DLAT absolute address, are routed to a synonym detection circuit (317). The detection circuit simultaneously interprets all directory compare signals and determines if a principal hit, synonym hit or a miss occurred in the cache (309) for each request. If a synonym hit is detected, group identifier (GID) bits are generated to select the data in the cache at the synonym class location. To generate the synonym cache address, the group identifier bits are substituted for the translatable bits in the cache address for locating the required synonym class. For a set-associative cache, set identifier (SID) bits are simultaneously generated for cache addressing.
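The parallel probe of the 2^N groups can be modelled with a small lookup routine. Nothing below is the patent's circuit: the group and class counts, the sequential loop standing in for the simultaneous compares, and all names are assumptions for illustration; set-associativity (the SID bits) is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

#define N_BITS     2                    /* assumed translatable bits in the cache address */
#define NUM_GROUPS (1u << N_BITS)       /* 2^N directory groups                           */
#define CLASSES    256                  /* classes per group, indexed by the real part D  */

typedef struct {
    bool     valid;
    uint32_t abs_addr;                  /* absolute (translated) address of the line */
} dir_entry_t;

static dir_entry_t directory[NUM_GROUPS][CLASSES];

typedef enum { MISS, PRINCIPAL_HIT, SYNONYM_HIT } result_t;

/* d:        index derived from the non-translatable address part;
 * px:       the N translatable bits of the requested logical address;
 * abs_addr: the DLAT-translated absolute address.
 * All 2^N groups are probed "simultaneously"; the loop models that. On a
 * synonym hit, *gid_out holds the GID bits that replace px to form the
 * synonym's cache address. */
result_t lookup(uint32_t d, uint32_t px, uint32_t abs_addr, uint32_t *gid_out)
{
    for (uint32_t g = 0; g < NUM_GROUPS; g++) {
        dir_entry_t *e = &directory[g][d % CLASSES];
        if (e->valid && e->abs_addr == abs_addr) {
            *gid_out = g;               /* GID bits select the data in the cache */
            return (g == px) ? PRINCIPAL_HIT : SYNONYM_HIT;
        }
    }
    return MISS;
}
```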

Journal ArticleDOI
TL;DR: A growth path for current microprocessors is suggested which includes bus enhancements and cache memories, and several differences from the mainframe world are pointed out.
Abstract: A growth path for current microprocessors is suggested which includes bus enhancements and cache memories. The implications are examined, and several differences from the mainframe world are pointed out.