Showing papers on "Cache" published in 1976


Proceedings ArticleDOI
C. K. Tang1
07 Jun 1976
TL;DR: System requirements in the multiprocessor environment, as well as the cost-performance trade-offs of the cache system design, are given in detail, and the possibility of sharing the cache system hardware with other multiprocessing facilities (such as dynamic address translation, storage protection, locks, serialization, and the system clocks) is discussed.
Abstract: Cache is a fast buffer memory between the processor and the main memory and has been extensively used in the larger computer systems. The principle of operation and the various designs of the cache in the uniprocessor system are well documented. The memory system of multiprocessors has also received much attention recently; however, these studies are limited to systems without a cache. Little if any information exists in the literature addressing the principle and design considerations of the cache system in the tightly coupled multiprocessor environment. This paper describes such a cache design. System requirements in the multiprocessor environment as well as the cost-performance trade-offs of the cache system design are given in detail. The possibility of sharing the cache system hardware with other multiprocessing facilities (such as dynamic address translation, storage protection, locks, serialization, and the system clocks) is also discussed.

180 citations
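
The abstract does not spell out a specific consistency mechanism, so the following is only a minimal sketch of one classic way to keep per-processor caches consistent in a tightly coupled multiprocessor: broadcasting an invalidation to the other caches on every store. The write-through policy and all names are assumptions for illustration, not details of Tang's design.

```python
# Minimal sketch of broadcast-invalidate coherence for per-processor caches.
# Assumptions (not from the paper): write-through caches, one word per block.

class Cache:
    def __init__(self):
        self.lines = {}          # address -> value

    def read(self, memory, addr):
        if addr in self.lines:   # hit
            return self.lines[addr]
        value = memory[addr]     # miss: fetch from main memory
        self.lines[addr] = value
        return value

    def invalidate(self, addr):
        self.lines.pop(addr, None)

class Multiprocessor:
    def __init__(self, n_cpus, memory):
        self.memory = memory
        self.caches = [Cache() for _ in range(n_cpus)]

    def load(self, cpu, addr):
        return self.caches[cpu].read(self.memory, addr)

    def store(self, cpu, addr, value):
        self.memory[addr] = value              # write through to main memory
        self.caches[cpu].lines[addr] = value   # update the local copy
        for i, cache in enumerate(self.caches):
            if i != cpu:
                cache.invalidate(addr)         # keep the other caches consistent

mp = Multiprocessor(2, memory={0x10: 1})
mp.load(0, 0x10); mp.load(1, 0x10)
mp.store(0, 0x10, 2)
assert mp.load(1, 0x10) == 2   # CPU 1 re-fetches the updated value
```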


Proceedings ArticleDOI
13 Oct 1976
TL;DR: A brief account is presented of the recovery block scheme, together with a description of a new implementation of the underlying cache mechanism and of a proposed computer architecture that incorporates this implementation and also provides a high level of detection for errors such as the corruption of code and data.
Abstract: The need for reliable complex systems motivates the development of techniques by which acceptable service can be maintained, even in the presence of residual errors. Recovery blocks allow a software designer to include tests on the acceptability of the various phases of a system's operation, and to specify alternative actions should the acceptance tests fail. This approach relies on certain architectural features, ideally implemented in hardware, by which control and data structures can be retrieved after errors. A brief account is presented of the recovery block scheme, together with a description of a new implementation of the underlying cache mechanism. The salient features of a proposed computer architecture are described, which incorporates this implementation and also provides a high level of detection for errors such as the corruption of code and data. A prototype system has been constructed to test the viability of these techniques by executing programs containing recovery blocks on an emulator for the proposed architecture. Experiences in running this system are recounted with respect to the execution of programs based on erroneous algorithms and also with respect to errors introduced by deliberate attempts to corrupt the system.

95 citations
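
As a rough illustration of the recovery block control structure summarized above, the sketch below saves the prior state before each alternate runs (the role the underlying cache mechanism plays in hardware) and restores it when the acceptance test fails. The function names and the use of a deep copy are assumptions for illustration, not the paper's implementation.

```python
import copy

# Illustrative recovery block semantics: the "recovery cache" role is played
# here by a deep copy of the state taken before each alternate runs.

def recovery_block(state, alternates, acceptance_test):
    for alternate in alternates:
        checkpoint = copy.deepcopy(state)        # save prior values (recovery cache)
        try:
            alternate(state)
            if acceptance_test(state):
                return state                     # acceptable result: commit
        except Exception:
            pass                                 # treat exceptions as failures
        state.clear(); state.update(checkpoint)  # restore and try the next alternate
    raise RuntimeError("all alternates failed the acceptance test")

# Usage: the primary fails the acceptance test, the alternate is accepted.
def primary(s): s["x"] = -1
def backup(s):  s["x"] = 42
result = recovery_block({"x": 0}, [primary, backup], lambda s: s["x"] > 0)
assert result["x"] == 42
```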


Patent
30 Dec 1976
TL;DR: In this paper, a local memory of an input/output system includes a cache store and a backing store, and each memory command applied to the memory unit includes a predetermined bit which is coded to designate when the information requested from the local memory unit is to be written into the cache store.
Abstract: A local memory of an input/output system includes a cache store and a backing store. The system includes a plurality of command modules. The cache store provides fast access to blocks of information previously fetched from the backing store in response to memory commands generated by any one of a plurality of command modules during both data transfer and data processing operations. Each memory command applied to the memory unit includes a predetermined bit which is coded to designate when the information requested from the local memory unit is to be written into the cache store. The local memory unit includes apparatus operative in response to each memory command to enable the command module to bypass selectively the cache store in accordance with the coding of the predetermined bit, thereby enabling the command modules to execute operations more expeditiously during the performance of input/output data transfer operations.

80 citations
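
A minimal sketch of the bypass behaviour claimed above: each memory command carries a bit that controls whether data fetched from the backing store is also placed in the cache store. The command format and field names are illustrative, not taken from the patent.

```python
from dataclasses import dataclass

# Sketch only: the "predetermined bit" is modelled as a boolean field.

@dataclass
class MemoryCommand:
    address: int
    cache_fill: bool          # bit controlling whether fetched data is cached

class LocalMemory:
    def __init__(self, backing_store):
        self.backing_store = backing_store    # address -> data block
        self.cache_store = {}

    def fetch(self, cmd: MemoryCommand):
        if cmd.address in self.cache_store:
            return self.cache_store[cmd.address]
        data = self.backing_store[cmd.address]
        if cmd.cache_fill:                    # bit set: keep a copy in the cache
            self.cache_store[cmd.address] = data
        return data                           # bit clear: bypass the cache store

mem = LocalMemory({100: "block-A", 200: "block-B"})
mem.fetch(MemoryCommand(100, cache_fill=True))    # data processing: cache it
mem.fetch(MemoryCommand(200, cache_fill=False))   # streamed I/O transfer: bypass
assert 100 in mem.cache_store and 200 not in mem.cache_store
```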


Journal ArticleDOI
17 Jan 1976
TL;DR: The concept of cache memory is introduced together with its major organizational parameters: size, associativity, block size, replacement algorithm, and write strategy; simulation results are given showing how the performance of the cache varies with changes in these parameters.
Abstract: This paper gives a summary of the research which led to the design of the cache memory in the DEC PDP-11/70. The concept of cache memory is introduced together with its major organizational parameters: size, associativity, block size, replacement algorithm, and write strategy. Simulation results are given showing how the performance of the cache varies with changes in these parameters. Based on these simulation results the design of the 11/70 cache is justified.

73 citations
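
To make the organizational parameters concrete, here is a small trace-driven sketch of a set-associative cache with LRU replacement. The parameter values and the address trace are illustrative only and are not the PDP-11/70 measurements.

```python
from collections import OrderedDict

# Trace-driven miss-ratio sketch parameterized by cache size, associativity,
# and block size, with LRU replacement within each set.

def miss_ratio(trace, cache_bytes=1024, assoc=2, block_bytes=4):
    n_sets = cache_bytes // (assoc * block_bytes)
    sets = [OrderedDict() for _ in range(n_sets)]   # per-set LRU order
    misses = 0
    for addr in trace:
        block = addr // block_bytes
        s = sets[block % n_sets]
        if block in s:
            s.move_to_end(block)        # hit: mark most recently used
        else:
            misses += 1
            if len(s) >= assoc:
                s.popitem(last=False)   # evict the least recently used block
            s[block] = True
    return misses / len(trace)

trace = [i * 4 for i in range(256)] * 4             # simple looping reference pattern
print(miss_ratio(trace, cache_bytes=512, assoc=1))  # small direct-mapped cache
print(miss_ratio(trace, cache_bytes=4096, assoc=2)) # larger 2-way cache
```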


Journal ArticleDOI
C. K. Chow1
TL;DR: The optimum capacity of a cache memory with given access time is determined analytically based upon a model of linear storage hierarchies wherein both the hit ratio function and the device technology-cost function are assumed to be power functions.
Abstract: The optimum capacity of a cache memory with given access time is determined analytically based upon a model of linear storage hierarchies wherein both the hit ratio function and the device technology-cost function are assumed to be power functions. Explicit formulas for the capacities and access times of the storage levels in the matching hierarchy of required capacity and allowable cost are derived. The optimal number of storage levels in a hierarchy is shown to increase linearly with the logarithm of the ratio of the required hierarchy capacity and the cache capacity.

49 citations
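
The abstract does not reproduce the model, but a two-level instance of the kind of power-law formulation it describes might look as follows; the symbols and exponents are assumed for illustration and are not Chow's notation.

```latex
% Illustrative two-level power-law hierarchy model (notation assumed).
% Miss ratio of a cache of capacity C:          m(C) = a\,C^{-\alpha}
% Cost per bit of a device with access time t:  c(t) = k\,t^{-\beta}
\begin{align*}
  T_{\text{eff}} &= t_1 + m(C_1)\, t_2 = t_1 + a\,C_1^{-\alpha}\, t_2, \\
  B              &= c(t_1)\,C_1 + c(t_2)\,C_2 = k\,t_1^{-\beta} C_1 + k\,t_2^{-\beta} C_2 .
\end{align*}
% Minimizing T_eff subject to the cost budget B (e.g. with a Lagrange multiplier)
% gives closed-form capacities and access times for each level; extending the same
% argument to n levels leads to the result quoted above, that the optimal number
% of levels grows with the logarithm of the ratio of total capacity to cache capacity.
```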


Patent
20 Sep 1976
TL;DR: A method and apparatus are presented for performing, in a Cache memory system, the priority determination of which Requestor, of R Requestors, is to be granted priority by the Priority Network while simultaneously comparing, in parallel, all of the Requestors' addresses for a Match condition in R Cache memories.
Abstract: A method of and an apparatus for performing, in a Cache memory system, the Priority determination of which Requestor, of R Requestors, is to be granted priority by the Priority Network while simultaneously comparing, in parallel, all of the R Requestors' addresses for a Match condition in R Cache memories. The Cache memory system incorporates a separate Cache memory or associative memory for each Requestor, each of which Cache memories is comprised of an Address Buffer or Search memory, in which the associated Requestor's addresses are stored, and a Data Buffer or Associated memory, in which the data that are associated with each of the Requestor's addresses are stored. Thus, while the Priority Request signals from all of the requesting Requestors are being coupled to the single Priority Network, each requesting Requestor's addresses are coupled to that Requestor's separately associated Cache memory. As the Priority determination by the Priority Network and the Match determination by the Cache memories require approximately the same time to complete, the parallel operation thereof substantially reduces memory access time to either the Main memory or the Cache memory.

33 citations
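
A toy sketch of the overlap described above: the priority decision among the requestors and each requestor's cache match are independent computations, so hardware can carry them out in the same cycle and combine the results afterwards. The fixed-priority rule and data structures here are assumptions for illustration.

```python
# requests: {requestor_id: address}; caches: {requestor_id: {address: data}}

def grant_and_lookup(requests, caches):
    granted = min(requests)                      # e.g. lowest id wins priority
    matches = {rid: addr in caches[rid]          # done "in parallel" with the above
               for rid, addr in requests.items()}
    if matches[granted]:
        return granted, caches[granted][requests[granted]]   # cache hit
    return granted, None                                     # go to Main memory

caches = {0: {0x40: "A"}, 1: {}}
print(grant_and_lookup({0: 0x40, 1: 0x80}, caches))   # (0, 'A')
print(grant_and_lookup({1: 0x80}, caches))            # (1, None)
```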


Proceedings ArticleDOI
13 Oct 1976
TL;DR: This paper shows how to calculate analytically the effectiveness of set associative paging relative to fully associative (unconstrained mapping) paging, and suggests that as electronically accessed third level memories become available, algorithms currently used only for cache paging will be applied to main memory, for the same reasons of efficiency, implementation ease and cost.
Abstract: Set associative page mapping algorithms have become widespread for the operation of cache memories for reasons of cost and efficiency. In this paper we show how to calculate analytically the effectiveness of set associative paging relative to fully associative (unconstrained mapping) paging. For two miss ratio models, Saltzer's linear model and a mixed geometric model, we are able to obtain simple, closed form expressions for the relative LRU fault rates. Trace driven simulations are used to verify the accuracy of our results. We suggest that as electronically accessed third level memories, such as electron beam memories, magnetic bubbles or charge coupled devices, become available, algorithms currently used only for cache paging will be applied to main memory, for the same reasons of efficiency, implementation ease and cost.

31 citations
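
In the spirit of the trace-driven checks mentioned above, the sketch below compares the LRU fault rate of set-associative page mapping against fully associative mapping with the same total number of page frames; the trace and sizes are illustrative only, not those of the paper.

```python
from collections import OrderedDict
import random

# LRU fault rate for a page store split into n_sets sets of frames_per_set frames;
# n_sets = 1 corresponds to fully associative (unconstrained) mapping.

def lru_faults(trace, n_sets, frames_per_set):
    sets = [OrderedDict() for _ in range(n_sets)]
    faults = 0
    for page in trace:
        s = sets[page % n_sets]
        if page in s:
            s.move_to_end(page)          # hit: mark most recently used
        else:
            faults += 1
            if len(s) >= frames_per_set:
                s.popitem(last=False)    # evict least recently used in the set
            s[page] = True
    return faults / len(trace)

random.seed(1)
trace = [random.randrange(64) for _ in range(20000)]
full  = lru_faults(trace, n_sets=1, frames_per_set=32)   # fully associative, 32 frames
setas = lru_faults(trace, n_sets=8, frames_per_set=4)    # 8 sets x 4 ways, 32 frames
print(full, setas)   # compare the relative LRU fault rates of the two mappings
```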


Proceedings ArticleDOI
Peter Schneider1
07 Jun 1976
TL;DR: A comparison between a three-level storage system, consisting of cache, page buffer and CCD main memory, and a conventional two-level main memory system will show that the three-level hierarchy using the transfer on demand strategy has an effective access time which is higher by about a factor of 2.
Abstract: The emergence of new storage technologies such as Charge Coupled Devices (CCD) and Bubbles with access times which lie in the access gap between semiconductor memories and rotating magnetic storage media is another important step toward implementing multilevel storage hierarchies. However, a comparison between a three-level storage system, consisting of cache, page buffer and CCD main memory, and a conventional two-level main memory system will show that the three-level hierarchy using the transfer on demand strategy has an effective access time which is higher by about a factor of 2. Yet, through better use of program locality, the access time of a three-level system can be reduced to that of the two-level cache/page buffer system. Using this method, the so-called working set restoration, the working set of pages of the next program to be run is loaded into the page buffer during execution of the active program. The required page transfer operations are executed in a concealed manner and are thus not time-critical for the processor. This means that for program processing only the access time to the two-level system becomes apparent. The advantage of a three-level system of this type lies not so much in the improved performance but rather in the lower costs, since it permits the use of a large-capacity main memory on a technology level which is cheaper by a factor of 2 to 4 as compared with MOS RAM.

6 citations
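
One way to see the effect the abstract refers to is to write down the effective access times of the two organizations under demand transfer; the symbols below are assumed for illustration, not Schneider's notation.

```latex
% h_1, h_2 = hit ratios of the cache and of the page buffer,
% t_1, t_2, t_3 = access times of cache, page buffer, and CCD main memory.
\begin{align*}
  T_{\text{2-level}}          &= t_1 + (1 - h_1)\, t_2, \\
  T_{\text{3-level, demand}}  &= t_1 + (1 - h_1)\bigl[\, t_2 + (1 - h_2)\, t_3 \,\bigr].
\end{align*}
% Under demand transfer the extra (1-h_1)(1-h_2) t_3 term is what makes the
% three-level system slower; working set restoration hides these page transfers
% behind execution of the active program, so the processor effectively sees only
% the two-level access time.
```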


01 Feb 1976
TL;DR: In this article, the authors present a technique for constructing a detailed, deterministic model of the system, in which a control stream replaces the instruction and data streams of the real system.
Abstract: Simulation is presented as a practical technique for performance evaluation of alternative configurations of a complex, highly concurrent computer system. A technique is described for constructing a detailed, deterministic model of the system. In the model, a control stream replaces the instruction and data streams of the real system. Simulation of the system model yields the timing and resource usage statistics needed for performance evaluation, without the necessity of performing the actual system computation. As an example, the implementation of a simulator of a model of the CPU-memory subsystem of the IBM 360/91 is described. The results of evaluating some alternative system designs are discussed. It appears that many of the sophisticated architectural features of the IBM 360/91 CPU would be of little value if high-speed (cache) memory is used, as in the 360/195.
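
A very small sketch of the modelling idea described above: a control stream of (resource, busy-time) records replaces the instruction and data streams, and replaying it yields run time and utilization figures without performing the real computation. The record format and the in-order issue rule are assumptions for illustration, not the simulator described in the report.

```python
# Deterministic replay of a control stream: records issue in order, each waits
# for its resource, and timing/usage statistics are accumulated as a side effect.

def simulate(control_stream):
    free_at = {}                 # resource -> time it becomes free
    busy = {}                    # resource -> accumulated busy time
    clock = 0.0
    for resource, busy_time in control_stream:
        start = max(clock, free_at.get(resource, 0.0))   # wait if the resource is busy
        free_at[resource] = start + busy_time
        busy[resource] = busy.get(resource, 0.0) + busy_time
        clock = start            # later records cannot issue before this one
    total = max(free_at.values())
    return total, {r: busy[r] / total for r in busy}     # run time, utilizations

stream = [("decode", 1), ("adder", 3), ("decode", 1), ("memory", 8), ("adder", 3)]
print(simulate(stream))
```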

ReportDOI
01 Sep 1976
TL;DR: This specification defines the technical requirements for a Security Protection Module (SPM), the hardware portion of a security kernel whose function is to mediate, through a descriptor structure, all interactions between elements of a protected minicomputer.
Abstract: This specification defines the technical requirements for a Security Protection Module (SPM). An SPM is the hardware portion of a security kernel whose function is to mediate, through a descriptor structure, all interactions between elements of a protected minicomputer. The SPM evaluates the propriety of all requests, and performs address translation between virtual requests and physical resources. The SPM contains a fast-access cache for storing copies of descriptors in an effort to minimize the performance overhead associated with security.
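
As a rough sketch of descriptor-mediated access with a fast descriptor cache, the code below checks rights and bounds on every request and translates virtual offsets to physical addresses. The descriptor format and the behaviour on a cache miss are assumptions for illustration, not the SPM specification.

```python
descriptor_table = {              # segment -> (physical base, limit, allowed modes)
    0: (0x1000, 0x400, {"r"}),
    1: (0x8000, 0x100, {"r", "w"}),
}
descriptor_cache = {}             # fast-access copies of recently used descriptors

def translate(segment, offset, mode):
    if segment not in descriptor_cache:                  # miss: fetch the descriptor
        descriptor_cache[segment] = descriptor_table[segment]
    base, limit, rights = descriptor_cache[segment]
    if mode not in rights or offset >= limit:            # mediate every request
        raise PermissionError(f"access ({segment}, {offset:#x}, {mode}) denied")
    return base + offset                                 # virtual -> physical

print(hex(translate(1, 0x20, "w")))   # permitted write, translated address
try:
    translate(0, 0x20, "w")           # write to a read-only segment is refused
except PermissionError as e:
    print(e)
```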

Proceedings ArticleDOI
20 Oct 1976
TL;DR: It is shown that all three organizations have a place in a memory system design depending on the user requirements and that a three-level memory system using the different CCD chip organizations has a considerable advantage in performance over a two-level (Bipolar, MOS) organization for the same total cost.
Abstract: This paper discusses the computer memory system design implications of different CCD devices, such as Circulating Shift Register, Serial Parallel Serial, and Line Addressable Organizations. The performance of the memory using these devices is evaluated in a stand alone mode using a single server queuing model, and as a buffer between the disk and main memory in a computer system by using a two server cyclic queuing model. The performance of the stand alone mode for the CCD memory system is defined as the time elapsed between the request for a record and the completion of that request. The performance comparison of the stand alone mode shows that the Serial Parallel Serial Organization has the worst performance, as one should expect. It is shown that the Circulating Shift Register with burst mode for refreshing has better performance than the Line Addressable Organization. The Circulating Shift Register Organization with cache has the best performance of all but would require a considerable amount of extra cost. In evaluating the CCD chip organizations as a buffer between the disk and the main memory, it is shown that all three organizations (Serial Parallel Serial, Circulating Shift Register, and Line Addressable) have a place in a memory system design depending on the user requirements. It is also shown that a three-level (Bipolar, MOS and CCD) memory system using the different CCD chip organizations has a considerable advantage in performance over a two-level (Bipolar, MOS) organization for the same total cost.
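
For the stand-alone evaluation, a single-server queuing estimate of the kind mentioned above can be written down directly; the exponential (M/M/1) assumptions and symbols below are illustrative and are not claimed to match the paper's model.

```latex
% lambda = request arrival rate, 1/mu = mean time to deliver a record.
\begin{align*}
  \rho &= \lambda / \mu, \qquad \rho < 1, \\
  T_{\text{response}} &= \frac{1/\mu}{1 - \rho} = \frac{1}{\mu - \lambda},
\end{align*}
% i.e. the time between the request for a record and its completion grows as the
% offered load approaches the rate at which the chip organization can deliver
% records; organizations with a longer effective record delivery time (larger
% 1/mu) saturate earlier.
```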