
Showing papers on "Smart Cache published in 1982"


Journal ArticleDOI
TL;DR: In this article, an analytical model for the program behavior of a multitasked system is introduced, including the behavior of each process and the interactions between processes with regard to the sharing of data blocks.
Abstract: In many commercial multiprocessor systems, each processor accesses the memory through a private cache. One problem that could limit the extensibility of the system and its performance is the enforcement of cache coherence. A mechanism must exist which prevents the existence of several different copies of the same data block in different private caches. In this paper, we present an in-depth analysis of the effects of cache coherency in multiprocessors. A novel analytical model for the program behavior of a multitasked system is introduced. The model includes the behavior of each process and the interactions between processes with regard to the sharing of data blocks. An approximation is developed to derive the main effects of the cache coherency contributing to degradations in system performance.
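
The abstract stops short of the model's equations; the sketch below only illustrates the coherence requirement the paper starts from, using a write-invalidate step over private caches (the protocol choice, states, and names are assumptions for illustration, not the paper's analytical model).

```c
/* Minimal sketch (not the paper's model): write-invalidate coherence
 * for one data block shared by several private caches. */
#include <stdio.h>

#define NPROC 4

typedef enum { INVALID, SHARED, MODIFIED } LineState;

static LineState cache[NPROC];   /* state of the same block in each private cache */

/* A processor writes the block: all other copies must be invalidated
 * so that several different copies of the block never coexist. */
static void write_block(int p)
{
    for (int q = 0; q < NPROC; q++)
        if (q != p)
            cache[q] = INVALID;          /* coherence-enforcement traffic */
    cache[p] = MODIFIED;
}

static void read_block(int p)
{
    if (cache[p] == INVALID)
        cache[p] = SHARED;               /* miss: fetch a shared copy */
}

int main(void)
{
    read_block(0);
    read_block(1);                       /* two shared copies */
    write_block(2);                      /* the write invalidates them */
    for (int p = 0; p < NPROC; p++)
        printf("P%d state: %d\n", p, cache[p]);
    return 0;
}
```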

133 citations


Journal ArticleDOI
01 Apr 1982
TL;DR: An in-depth analysis of the effects of cache coherency in multiprocessors is presented and a novel analytical model for the program behavior of a multitasked system is introduced.
Abstract: In many commercial multiprocessor systems, each processor accesses the memory through a private cache. One problem that could limit the extensibility of the system and its performance is the enforcement of cache coherence. A mechanism must exist which prevents the existence of several different copies of the same data block in different private caches. In this paper, we present an in-depth analysis of the effects of cache coherency in multiprocessors. A novel analytical model for the program behavior of a multitasked system is introduced. The model includes the behavior of each process and the interactions between processes with regard to the sharing of data blocks. An approximation is developed to derive the main effects of the cache coherency contributing to degradations in system performance.

109 citations


Journal ArticleDOI
TL;DR: An approximate analytical model for the performance of multiprocessors with private cache memories and a single shared main memory is presented and is found to be very good over a broad range of parameters.
Abstract: This paper presents an approximate analytical model for the performance of multiprocessors with private cache memories and a single shared main memory. The accuracy of the model is compared with simulation results and is found to be very good over a broad range of parameters. The parameters of the model are the size of the multiprocessor, the size and type of the interconnection network, the cache miss-ratio, and the cache block transfer time. The analysis is extended to include several different read/write policies such as write-through, load-through, and buffered write-back. The analytical technique presented is also applicable to the performance of interconnection networks under block transfer mode.
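
The abstract does not reproduce the model's formulas; as a rough illustration of how such an approximation is typically shaped, the sketch below estimates per-processor utilization from the cache miss ratio, block transfer time, and a network contention factor (the formula and the parameter values are assumptions, not the published model).

```c
/* Illustrative approximation (not the published model): per-processor
 * utilization when every cache miss costs a block transfer over a
 * shared interconnection network with some contention stretch factor. */
#include <stdio.h>

static double utilization(double miss_ratio, double block_time,
                          double contention)
{
    /* Each memory reference takes one cycle in the cache plus, on a
     * miss, block_time cycles stretched by network contention. */
    double avg_ref_time = 1.0 + miss_ratio * block_time * contention;
    return 1.0 / avg_ref_time;
}

int main(void)
{
    /* e.g. a 5% miss ratio and a 10-cycle block transfer */
    for (double contention = 1.0; contention <= 2.0; contention += 0.5)
        printf("contention %.1f -> utilization %.3f\n",
               contention, utilization(0.05, 10.0, contention));
    return 0;
}
```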

78 citations


Patent
18 Oct 1982
TL;DR: In this paper, the authors modify cache addressing in order to decrease the cache miss rate based on a statistical observation that the lowest and highest locations in pages in main storage page frames are usually accessed at a higher frequency than intermediate locations in the pages.
Abstract: The described embodiment modifies cache addressing in order to decrease the cache miss rate, based on a statistical observation that the lowest and highest locations in pages in main storage page frames are usually accessed at a higher frequency than intermediate locations in the pages. Cache class addressing controls are modified to distribute cache-contained data more uniformly among the congruence classes in the cache (by comparison with conventional cache class distribution). The cache addressing controls change the congruence class address as a function of the state of a higher-order bit or field in any CPU-requested address.
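
A minimal sketch of the kind of index modification described, assuming the higher-order bit is simply folded (XORed) into the congruence-class index so that page-start lines of different pages stop colliding on the same classes; the field widths and the chosen bit position are illustrative, not the patent's layout.

```c
/* Sketch of modified congruence-class selection: a higher-order bit of
 * the requested address is folded into the class index.  Line size,
 * class count, and the bit position are assumptions. */
#include <stdio.h>
#include <stdint.h>

#define LINE_BITS   6                       /* 64-byte lines          */
#define CLASS_BITS  7                       /* 128 congruence classes */
#define CLASS_MASK  ((1u << CLASS_BITS) - 1)

static unsigned conventional_class(uint32_t addr)
{
    return (addr >> LINE_BITS) & CLASS_MASK;
}

static unsigned modified_class(uint32_t addr)
{
    unsigned hi_bit = (addr >> 13) & 1;     /* first bit above index+offset */
    return ((addr >> LINE_BITS) ^ (hi_bit << (CLASS_BITS - 1))) & CLASS_MASK;
}

int main(void)
{
    uint32_t page_a = 0x0000, page_b = 0x2000;   /* two page-start addresses */
    printf("conventional classes: %u %u\n",
           conventional_class(page_a), conventional_class(page_b));
    printf("modified classes:     %u %u\n",
           modified_class(page_a), modified_class(page_b));
    return 0;
}
```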

54 citations


Patent
31 Mar 1982
TL;DR: In this paper, the directory and cache store of a multilevel set associative cache system are organized in levels of memory locations, and a round robin replacement apparatus is used to identify in which one of the multi-levels information is to be replaced.
Abstract: The directory and cache store of a multilevel set associative cache system are organized in levels of memory locations. Round robin replacement apparatus is used to identify in which one of the multiple levels information is to be replaced. The directory includes parity detection apparatus for detecting errors in the addresses being written into the directory during a cache memory cycle of operation. Control apparatus combines such parity errors with signals indicative of directory hits to produce invalid hit detection signals. In response to the occurrence of a first invalid hit detection signal, the control apparatus conditions the round robin apparatus, as well as other portions of the cache system, to limit cache operation to those sections whose levels are error free, thereby gracefully degrading cache operation.
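
A sketch of how round-robin replacement can be limited to error-free levels, as described above; the number of levels and the per-level error flag are assumptions.

```c
/* Sketch: round-robin level selection that skips levels flagged after an
 * invalid hit (directory parity error), so the cache degrades gracefully
 * to its error-free levels rather than failing outright. */
#include <stdio.h>
#include <stdbool.h>

#define LEVELS 4

static bool level_bad[LEVELS];   /* set when an invalid hit detection signal occurs */
static int  rr_pointer;          /* round-robin replacement pointer */

/* Returns the level whose line should be replaced, or -1 if every level
 * has been degraded. */
static int select_replacement_level(void)
{
    for (int tried = 0; tried < LEVELS; tried++) {
        int level = rr_pointer;
        rr_pointer = (rr_pointer + 1) % LEVELS;
        if (!level_bad[level])
            return level;
    }
    return -1;
}

int main(void)
{
    level_bad[1] = true;         /* a parity error was detected in level 1 */
    for (int i = 0; i < 6; i++)
        printf("replace level %d\n", select_replacement_level());
    return 0;
}
```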

52 citations


Patent
03 Mar 1982
TL;DR: In this article, a command queue is maintained for storing commands waiting to be executed, and each command is assigned a priority level for execution vis-a-vis other commands in the queue.
Abstract: In a system having a cache memory and a bulk memory, and wherein a command queue is maintained for storing commands waiting to be executed, each command is assigned a priority level for execution vis-a-vis other commands in the queue. Commands are generated for transferring from the cache memory to the bulk memory segments of data which have been written to while resident in the cache memory. Each generated command may be assigned a generated priority level which is dependent upon the number of segments in the cache memory which have been written to but not yet copied into the bulk memory. A second priority level may be generated which is dependent on the time which has elapsed since the first write into any of the cache memory segments, and the priority level assigned to any given generated command is the higher of the two generated priority levels.
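
A sketch of the two-part priority rule described above, taking the higher of a count-based and an age-based level; the thresholds and the number of priority levels are assumptions.

```c
/* Sketch: priority of a generated cache-to-bulk transfer command is the
 * higher of (a) a level derived from how many segments have been written
 * to but not yet copied back, and (b) a level derived from the elapsed
 * time since the first write.  Threshold values are assumptions. */
#include <stdio.h>

static int priority_from_dirty_count(int dirty_segments)
{
    if (dirty_segments > 32) return 3;
    if (dirty_segments > 16) return 2;
    if (dirty_segments > 4)  return 1;
    return 0;
}

static int priority_from_age(int seconds_since_first_write)
{
    if (seconds_since_first_write > 60) return 3;
    if (seconds_since_first_write > 10) return 2;
    if (seconds_since_first_write > 1)  return 1;
    return 0;
}

static int destage_command_priority(int dirty_segments, int age_seconds)
{
    int a = priority_from_dirty_count(dirty_segments);
    int b = priority_from_age(age_seconds);
    return a > b ? a : b;            /* the higher of the two levels is assigned */
}

int main(void)
{
    printf("priority = %d\n", destage_command_priority(20, 5));   /* count wins */
    printf("priority = %d\n", destage_command_priority(3, 90));   /* age wins   */
    return 0;
}
```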

50 citations


Patent
25 Jan 1982
TL;DR: In this paper, a cache clearing apparatus is described for a multiprocessor data processing system having a cache unit and a duplicate directory associated with each processor; commands affecting information segments within the main memory are transferred by the system controller unit to each of the duplicate directories to determine whether the affected information segment is stored in its associated cache unit.
Abstract: A cache clearing apparatus for a multiprocessor data processing system having a cache unit and a duplicate directory associated with each processor. The duplicate directory, which reflects the contents of the cache directory within its associated cache unit, and the cache directory are connected through a system controller unit. Commands affecting information segments within the main memory are transferred by the system controller unit to each of the duplicate directories to determine if the affected information segment is stored in the cache memory of its associated cache unit. If the information segment is stored therein, the duplicate directory issues a clear command through the system controller to clear the information segment from the associated cache unit.
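
A sketch of the duplicate-directory probe described above; the direct-mapped directory layout, sizes, and function names are assumptions.

```c
/* Sketch: for every command that modifies a main-memory segment, each
 * processor's duplicate directory is searched, and a clear command is
 * issued only to the cache units that actually hold the segment. */
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

#define NPROC   4
#define ENTRIES 64

typedef struct {
    bool     valid;
    uint32_t segment;              /* main-storage segment address */
} DirEntry;

static DirEntry dup_dir[NPROC][ENTRIES];   /* duplicate of each cache directory */

static void issue_clear(int proc, uint32_t segment)
{
    printf("clear segment %#x in cache of processor %d\n",
           (unsigned)segment, proc);
}

/* Called by the system controller for a command affecting 'segment'. */
static void broadcast_segment_update(uint32_t segment)
{
    for (int p = 0; p < NPROC; p++) {
        DirEntry *e = &dup_dir[p][segment % ENTRIES];
        if (e->valid && e->segment == segment) {
            e->valid = false;
            issue_clear(p, segment);   /* caches without the segment are not disturbed */
        }
    }
}

int main(void)
{
    dup_dir[2][0x40 % ENTRIES] = (DirEntry){ true, 0x40 };
    broadcast_segment_update(0x40);
    return 0;
}
```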

44 citations


Dissertation
01 Jan 1982
TL;DR: Two cache management models are developed, the prompting model and the explicit management model, both relying on software-based enhancement methods that have proved successful in boosting main memory performance; optimal data packing is found to be a hard problem.
Abstract: An ideal high performance computer includes a fast processor and a multi-million byte memory of comparable speed. Since it is currently economically infeasible to have large memories with speeds matching the processor, hardware designers have included the cache. Because of its small size and its effectiveness in eliminating the speed mismatch, the cache has become a common feature of high performance computers. Enhancing cache performance proved to be instrumental in the speedup of cache-based computers. Enhancement methods can generally be classified as either software based or hardware controlled. Software-based improvement methods that proved very effective in main memory were usually considered inapplicable to the cache; a main reason has been the cache's transparency to programs and the fast response time of main memory. This resulted in only hardware enhancement features being considered and implemented for the cache.

Developments in compiler-based program optimization were successful in improving program performance and the understanding of program behavior. Coupling the information about a program's behavior with knowledge of the hardware structure became a good approach to optimization. With this premise we developed two cache management models: the prompting model and the explicit management model. Both models rely on the underlying concepts of prefetching, clustering (packing), and loop transformations, all three of which are software-based enhancement methods that proved successful in boosting main memory performance. In analyzing these methods for possible implementation in the cache we found that optimal data packing is a hard problem; nevertheless, we suggested various heuristic methods for effective packing. We then set forth a number of conditions for loop transformations, whose aim is to facilitate prefetching (preloading) of cache blocks during loop execution.

In both models the compiler places preload requests within the program's code; these requests are serviced in parallel with program execution. Replacement decisions are determined at compile time in the explicit model, but are fully controlled by the hardware in the prompting model, where special tag bits are introduced in each cache block to facilitate replacement decisions. The handling of aggregate data elements (arrays) is also discussed in the thesis. In the explicit model a special indexing scheme is introduced for controlling array access in the cache; in addition, main memory addresses are generated only for block load requests, while all other addresses are for the cache.
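
The thesis' preload requests are specific to its two models, but a modern analogue of compiler-placed preloading is a software prefetch hint inserted a fixed distance ahead of the loop iterations that will use the data; the sketch below uses the GCC/Clang __builtin_prefetch intrinsic, and the prefetch distance is an assumed tuning value.

```c
/* Modern analogue (an assumption, not the thesis' mechanism): preload
 * requests placed ahead of the iterations that need the data, so the
 * block loads overlap with loop execution. */
#include <stdio.h>

#define N 1024
#define PREFETCH_DISTANCE 16        /* iterations of lead time; assumed */

static double a[N], b[N];

static double dot(void)
{
    double sum = 0.0;
    for (int i = 0; i < N; i++) {
        if (i + PREFETCH_DISTANCE < N) {
            /* GCC/Clang builtin: hint the hardware to load future
             * elements while the current iterations execute. */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE]);
            __builtin_prefetch(&b[i + PREFETCH_DISTANCE]);
        }
        sum += a[i] * b[i];
    }
    return sum;
}

int main(void)
{
    for (int i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; }
    printf("dot = %f\n", dot());
    return 0;
}
```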

38 citations


Patent
Manfred Gerner, Dipl.-Ing.
17 Aug 1982
TL;DR: In this article, a cache store comprising an associative store (CAM) and a write/read store (RAM) is integrated in a microprocessor chip and can be divided at the logic level into a program cache store, a microprogram cache store and a data cache store of variable size.
Abstract: 1. A cache store comprising an associative store (CAM) and a write/read store (RAM), characterized by the following features: the cache store is integrated in a microprocessor chip, and it can be divided at the logic level into a program cache store, a microprogram cache store and a data cache store of variable size.

13 citations


Patent
Robert Percy Fletcher
16 Nov 1982
TL;DR: In this article, the authors propose a hybrid cache storage system, where a sharing flag is provided with each line representation in the cache directory to indicate for each cache line whether it is to be handled as a store-in-cache (SIC) line, when its SH flag is in non-sharing state, or as a store-through (ST) line, when its SH flag is in sharing state.
Abstract: In a cache storage system, a sharing (SH) flag is provided with each line representation in the cache directory to uniquely indicate for each cache line whether it is to be handled as a store-in-cache (SIC) line, when its SH flag is in non-sharing state, or as a store-through (ST) cache line, when its SH flag is in sharing state. At any time the hybrid cache can have some lines operating as ST lines and other lines as SIC lines. Such cache storage systems are used as private caches in a multiprocessor (MP) system. A newly fetched line (resulting from a cache miss) has its SH flag set to non-sharing (SIC) state in the location determined by the cache replacement selection circuits, unless a cross-interrogation (XI) hit in another cache is found by the cross-interrogation (XI) controls, in which case the SH flag for the requested line is dynamically set to sharing (ST) state. The XI controls interrogate all other cache directories in the MP for every store or fetch cache miss and for every store cache hit of an ST line (having SH=1). An XI hit signals that a conflicting copy of the line has been found in another cache. If the conflicting cache line is changed from its corresponding main storage (MS) line, the cache line is cast out to MS. The sharing (SH) flag for the conflicting line is set to sharing state for a fetch miss, but the conflicting line is invalidated for a store miss.
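
A sketch of the per-line SH flag logic described above, with cross-interrogation reduced to a boolean result; the structure and names are assumptions, not the patent's circuitry.

```c
/* Sketch of the hybrid SIC/ST behaviour: SH = false -> store-in (SIC),
 * SH = true -> store-through (ST).  XI results are passed in as flags. */
#include <stdio.h>
#include <stdbool.h>

typedef struct {
    bool valid;
    bool sh;        /* sharing flag: false = SIC, true = ST */
    bool changed;   /* line differs from main storage (MS)  */
} Line;

/* Miss handling: every miss cross-interrogates (XI) the other caches. */
static void handle_miss(Line *line, bool is_store, bool xi_hit,
                        bool xi_line_changed)
{
    if (xi_hit) {
        if (xi_line_changed)
            printf("cast out conflicting line to MS\n");
        if (is_store)
            printf("invalidate conflicting line (store miss)\n");
        else
            printf("set conflicting line's SH flag to sharing (fetch miss)\n");
    }
    line->valid   = true;
    line->changed = false;
    line->sh      = xi_hit && !is_store;   /* requested line becomes ST on a fetch-miss conflict */
}

/* Store hit: an ST line stores through to MS and cross-interrogates;
 * a SIC line just keeps the change locally until it is cast out. */
static void handle_store_hit(Line *line, bool xi_hit)
{
    if (line->sh) {
        if (xi_hit)
            printf("invalidate copy found in another cache\n");
        printf("store through to MS\n");
    } else {
        line->changed = true;
    }
}

int main(void)
{
    Line l = { false, false, false };
    handle_miss(&l, false, true, false);   /* fetch miss with an XI hit -> line handled as ST */
    handle_store_hit(&l, true);
    return 0;
}
```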

6 citations


Patent
Dana R. Spencer
28 May 1982
TL;DR: In this paper, the cache allocates a line (i.e. block) for LS use by the instruction unit (IE) sending a special signal with an address for a line in a special area in main storage which is non-program addressable.
Abstract: The processor contains a relatively small local storage (LS 12) which can be effectively expanded by utilizing a portion of a processor's store-in-cache (63). The cache allocates a line (i.e. block) for LS use by the instruction unit (IE) sending a special signal with an address for a line in a special area in main storage which is non-program addressable (i.e. not addressable by any of the architected instructions of the processor). The special signal suppresses the normal line fetch operation of the cache from main storage caused when the cache does not have a requested line. After the initial allocation of the line space in the cache to LS use, the normal cache operation is again enabled, and the LS line can be castout to the special area in main storage and be retrieved therefrom to the cache for LS use.
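
A sketch of the allocate-without-fetch path described above: a special local-storage request claims a cache line but suppresses the usual miss-time line fetch; the line size, mapping, and addresses are assumptions.

```c
/* Sketch: a special LS signal allocates a cache line backed by a
 * non-program-addressable main-storage area without fetching it. */
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

#define LINES 8

typedef struct {
    bool     valid;
    uint32_t addr;
} CacheLine;

static CacheLine cache[LINES];

static void fetch_line_from_main_storage(uint32_t addr)
{
    printf("fetch line %#x from main storage\n", (unsigned)addr);
}

/* special_ls_signal = true means: allocate the slot but do not fetch,
 * because the line only exists to back the processor's local storage. */
static CacheLine *access_line(uint32_t addr, bool special_ls_signal)
{
    CacheLine *slot = &cache[(addr / 64) % LINES];
    if (!(slot->valid && slot->addr == addr)) {
        if (!special_ls_signal)
            fetch_line_from_main_storage(addr);   /* normal miss handling */
        slot->valid = true;
        slot->addr  = addr;
    }
    return slot;
}

int main(void)
{
    access_line(0x00010000, false);   /* ordinary miss: line is fetched  */
    access_line(0x7fff0000, true);    /* LS allocation: fetch suppressed */
    return 0;
}
```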

Patent
26 Nov 1982
TL;DR: In this paper, a host computer (10) is backed up by long term secondary magnetic disk storage means (14) coupled to the computer by channels (12), a storage director (16), and a control module (18).
Abstract: A host computer (10) is backed up by long-term secondary magnetic disk storage means (14) coupled to the computer by channels (12), a storage director (16) and a control module (18). A cache memory (22) and an associated cache manager (24) are also connected to the storage director (16) for storing data which the host computer (10) is likely to require. In order to allow automatic transfer to the cache memory (22) of only that data which is likely to be required, the storage director (16) and cache manager (24) determine when accessed data from the disk storage means (14) appears to be part of sequential data because it lacks indications to the contrary, such as embedded SEEK instructions. When data lacks such counter-indications, automatic transfers to the cache memory (22) occur a track at a time.
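
A sketch of the sequentiality heuristic described above: a command chain with no embedded SEEKs (no counter-indication) is treated as sequential and staged a track at a time; the command codes and structure are assumptions.

```c
/* Sketch: decide between full-track staging and direct record access
 * based on the absence of embedded SEEK commands in the chain. */
#include <stdio.h>
#include <stdbool.h>

typedef enum { CMD_READ, CMD_WRITE, CMD_SEEK } CmdType;

static bool looks_sequential(const CmdType *chain, int n)
{
    for (int i = 0; i < n; i++)
        if (chain[i] == CMD_SEEK)      /* embedded SEEK: counter-indication */
            return false;
    return true;
}

static void service_read(const CmdType *chain, int n, int track)
{
    if (looks_sequential(chain, n))
        printf("stage full track %d into cache\n", track);
    else
        printf("read requested records of track %d directly\n", track);
}

int main(void)
{
    CmdType seq_chain[]    = { CMD_READ, CMD_READ, CMD_READ };
    CmdType random_chain[] = { CMD_READ, CMD_SEEK, CMD_READ };
    service_read(seq_chain, 3, 17);
    service_read(random_chain, 3, 17);
    return 0;
}
```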

Patent
14 Jul 1982
TL;DR: In this paper, the authors propose the efficient promotion of data from a backing store (disk storage apparatus 16-18, termed DASD) to a random access cache 40 in a storage system such as is used for swap and paging data transfers.
Abstract: The efficient promotion of data from a backing store (disk storage apparatus 16-18, termed DASD) to a random access cache 40 in a storage system such as is used for swap and paging data transfers. When a sequential access indicator (SEQ in 22) is sent to and retained in the storage system, all data specified in a subsequent read "paging mode" command is fetched to the cache from DASD. If such prefetched data is replaced from the cache and the sequential bit is on, a subsequent host access request for such data causes all related data not yet read to be promoted to cache. Only a maximal amount of related data may be promoted; this maximal amount is determined by cache addressing characteristics and DASD access delay boundaries. Without the sequential bit on, only the addressed data block is promoted to cache.
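
A sketch of the promotion rule described above, with the "maximal amount" reduced to a fixed block cap; the cap value, names, and block granularity are assumptions.

```c
/* Sketch: with the sequential (SEQ) indicator on, related data not yet
 * read is promoted to cache up to a maximal amount; without it, only the
 * addressed block is promoted. */
#include <stdio.h>
#include <stdbool.h>

#define MAX_PROMOTE_BLOCKS 8      /* cap from cache/DASD boundaries (assumed) */

static bool sequential_bit;       /* SEQ indicator retained in the subsystem */

static void promote_to_cache(int first_block, int nblocks)
{
    printf("promote blocks %d..%d to cache\n",
           first_block, first_block + nblocks - 1);
}

static void handle_read(int block, int related_blocks_not_yet_read)
{
    if (!sequential_bit) {
        promote_to_cache(block, 1);            /* only the addressed block */
        return;
    }
    int n = related_blocks_not_yet_read;
    if (n > MAX_PROMOTE_BLOCKS)
        n = MAX_PROMOTE_BLOCKS;                /* never exceed the cap */
    promote_to_cache(block, n);
}

int main(void)
{
    sequential_bit = false;
    handle_read(100, 20);         /* -> promote blocks 100..100 */
    sequential_bit = true;
    handle_read(100, 20);         /* -> promote blocks 100..107 */
    return 0;
}
```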

Patent
25 Aug 1982
Abstract: A storage hierarchy has a backing store (14) and a cache (15). During a series of accesses to the hierarchy by a user (10), write commands are monitored and analysed. Writing data to the hierarchy results in data being selectively removed from the cache. If no cache space is allocated to the data being written, that data is written to the backing store to the exclusion of the cache. Writing as part of a chain or sequential set of commands causes further removal of the data from the cache at the end of the chain or sequence. Removing data in this way increases the probability that data is written directly to the backing store while reads continue to be satisfied from the cache.
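
A sketch of the write handling described above: writes without allocated cache space bypass the cache, and the end of a chained or sequential set of writes purges the written data from the cache; the structure and names are assumptions.

```c
/* Sketch: writes bypass the cache when no space is allocated to them,
 * and a chain of sequential writes is purged from the cache at its end. */
#include <stdio.h>
#include <stdbool.h>

static bool in_cache[16];                 /* presence of each data block */

static void write_block(int block, bool cache_space_allocated)
{
    if (cache_space_allocated) {
        in_cache[block] = true;
        printf("block %d written via cache\n", block);
    } else {
        printf("block %d written directly to backing store\n", block);
    }
}

static void end_of_chain(const int *blocks, int n)
{
    for (int i = 0; i < n; i++)
        in_cache[blocks[i]] = false;      /* sequential writes: remove from cache */
}

int main(void)
{
    int chain[] = { 3, 4, 5 };
    for (int i = 0; i < 3; i++)
        write_block(chain[i], i == 0);    /* only the first gets cache space */
    end_of_chain(chain, 3);
    for (int b = 0; b < 16; b++)
        if (in_cache[b])
            printf("block %d still cached\n", b);
    return 0;
}
```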