
Showing papers on "Cache published in 1977"


Patent
06 Oct 1977
TL;DR: In this paper, a queueing mechanism is used to ensure that all units that wish to interrogate a memory line are permitted to do so; access requests that pertain to particular data are issued from the queue on a first-in, first-out basis.
Abstract: In a multi-processor system which utilizes store-in-cache techniques, a queueing mechanism ensures that all units that wish to interrogate a memory line will be permitted to interrogate the line. Backing-store access requests for data that is contained in the cache of another processor are queued until they can be serviced. The access requests that pertain to particular data are issued from the queue on a first-in, first-out basis. No group of interrogating units can lock out another unit that wishes to interrogate the line.
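
As a rough illustration of the discipline described, the Python sketch below queues interrogation requests per memory line and services each line's queue strictly first-in, first-out. The class, method names, and data structures are illustrative assumptions, not the patent's hardware organization.

from collections import deque

class LineInterrogationQueue:
    """FIFO queue of pending interrogation requests, keyed by memory line."""

    def __init__(self):
        self.pending = {}  # line address -> deque of waiting unit ids

    def request(self, unit, line):
        # Queue the unit's request; the line is held in another cache.
        self.pending.setdefault(line, deque()).append(unit)

    def line_released(self, line):
        # The holding cache gave up the line: service the oldest waiter.
        queue = self.pending.get(line)
        if not queue:
            return None
        unit = queue.popleft()  # strict first-in, first-out
        if not queue:
            del self.pending[line]
        return unit

Because each line's queue drains in arrival order, a stream of newly arriving requesters can never starve an earlier one, which is the lock-out property the abstract claims.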

75 citations


Patent
28 Nov 1977
TL;DR: In this article, a filter memory is provided with each buffer invalidation address stack (BIAS) in a multiprocessor (MP) system to reduce unnecessary interrogations of the cache directories of processors.
Abstract: The disclosed embodiments filter out many unnecessary interrogations of the cache directories of processors in a multiprocessor (MP) system, thereby reducing the required size of the buffer invalidation address stack (BIAS) associated with each processor, and increasing the efficiency of each processor by allowing it to access its cache during the machine cycles which in prior MPs had been required for invalidation interrogation. Invalidation interrogation of each remote processor cache directory may be done when each channel or processor generates a store request to a shared main storage. A filter memory is provided with each BIAS in the MP. The filter memory records the cache block address in each invalidation request transferred to its associated BIAS. The filter memory deletes an address when it is deleted from the cache directory and retains the most recent cache access requests. The filter memory may have one or more registers, or be an array. Invalidation interrogation addresses from each remote processor and from local and/or remote channels are received and compared against each valid address recorded in the filter memory. If the comparison is unequal, the received address is recorded in the filter memory as a valid address and is gated into the BIAS to perform a cache interrogation. If equal, the received address is prevented from entering the filter memory or the BIAS, so that it cannot cause any cache interrogation. Deletion from the filter memory is done when the associated processor fetches a block of data into its cache. Deletion may be of all entries in the filter memory, or of only a valid entry having an address equal to the block fetch address in a fetch address register (FAR). Deletion may be done by resetting a valid bit with each entry.
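
A minimal sketch of the filtering behavior, under the assumption that the BIAS can be modeled as a simple list and the filter memory as a set; the names here are illustrative, not the patent's:

class InvalidationFilter:
    def __init__(self, bias):
        self.valid = set()  # block addresses currently recorded as valid
        self.bias = bias    # downstream buffer invalidation address stack

    def invalidation_request(self, block_addr):
        # Compare against every valid recorded address.
        if block_addr in self.valid:
            return False              # equal: filtered, no interrogation
        self.valid.add(block_addr)    # unequal: record as a valid entry
        self.bias.append(block_addr)  # and gate it into the BIAS
        return True

    def block_fetched(self, far_addr, flush_all=False):
        # Deletion on a local block fetch: either clear every entry or
        # only the entry matching the fetch address register (FAR).
        if flush_all:
            self.valid.clear()
        else:
            self.valid.discard(far_addr)

Repeated invalidations of the same block thus cost one BIAS entry rather than many, which is why the BIAS can be smaller.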

59 citations


Patent
22 Dec 1977
TL;DR: In this paper, a data processing system is described having a system bus and a plurality of system units, including a main memory, a cache memory, a central processing unit (CPU), and a communications controller, all connected in parallel to the system bus.
Abstract: A data processing system has a system bus and a plurality of system units, including a main memory, a cache memory, a central processing unit (CPU), and a communications controller, all connected in parallel to the system bus. The controller operates to supervise interconnection between the units via the system bus to transfer data therebetween, and the CPU includes a memory request device for generating data requests in response to the CPU. The cache memory includes a private interface connecting the CPU to the cache memory for permitting direct transmission of data requests from the CPU to the cache memory and direct transmission of requested data from the cache memory to the CPU; a cache directory and data buffer for evaluating the data requests to determine when the requested data is not present in the cache memory; and a system bus interface connecting the cache memory to the system bus for obtaining CPU-requested data not found in the cache memory from the main memory via the system bus, in response to the cache directory and data buffer. The cache memory may also include replacement and update apparatus for determining when the system bus is transmitting data to be written into a specific address in main memory, and for replacing the data in a corresponding specific address in the cache memory with the data then on the system bus.

53 citations


Patent
17 Feb 1977
TL;DR: In this paper, the cache store provides fast access to blocks of information previously fetched from the backing store in response to memory commands generated by any one of a plurality of command modules during both data transfer and data processing operations.
Abstract: An input/output system includes a local memory module comprising a cache store and a backing store. The system includes a plurality of command modules and a system interface unit having a plurality of ports, each connected to a different one of the command modules and to the local memory module. The cache store provides fast access to blocks of information previously fetched from the backing store in response to memory commands generated by any one of the plurality of command modules during both data transfer and data processing operations. The local memory module includes apparatus operative, in response to each memory command, to enable the command module to write into the cache store data that is requested to be written into the backing store, whenever it is established that such data has previously been stored in the cache store.

50 citations


Patent
18 Feb 1977
TL;DR: In this article, the cache store includes parity generation circuits which generate check bits for the addresses to be written into a directory associated therewith, and parity check circuits for detecting errors in the addresses and information read from the cache stores during a read cycle of operation.
Abstract: A memory system includes a cache store and a backing store. The cache store provides fast access to blocks of information previously fetched from the backing store in response to commands. The backing store includes error detection and correction apparatus for detecting and correcting errors in the information read from the backing store during a backing store cycle of operation. The cache store includes parity generation circuits which generate check bits for the addresses to be written into a directory associated therewith. Additionally, the cache store includes parity check circuits for detecting errors in the addresses and information read from the cache store during a read cycle of operation. The memory system further includes control apparatus for enabling the backing store and cache store for operation in response to the commands. The control apparatus includes circuits which couple to the parity check circuits. Such circuits are operative, upon detecting an error in either an address or information read from the cache store, to simulate a condition that the requested information was not stored in the cache store. This causes the control apparatus to initiate a backing store cycle of operation to read out a correct version of the requested information, thereby eliminating the need to include more complex detection and correction circuits in the cache store.
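
The policy is easy to state in code: a parity failure is handled exactly like a miss. The sketch below assumes a cache modeled as a dict of (word, parity-bit) tuples and a single even-parity check bit per word; these are simplifying assumptions, not the patent's circuits.

def parity_bit(word):
    return bin(word).count("1") & 1    # even parity over the data bits

def cache_read(cache, backing_store, addr):
    entry = cache.get(addr)
    if entry is None or parity_bit(entry[0]) != entry[1]:
        # Miss, or a parity error treated exactly like a miss: start a
        # backing-store cycle to fetch a correct copy (the backing store
        # has full error detection and correction).
        word = backing_store[addr]
        cache[addr] = (word, parity_bit(word))
        return word
    return entry[0]                    # clean hit

The design choice is that the cache only needs cheap error detection; correction is delegated to the ECC-protected backing store.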

47 citations


Patent
Charles P. Ryan
22 Nov 1977
TL;DR: In this paper, a data processing system comprises a cache unit coupled with a data processor, which couples to a main store, and control logic circuits include decoder circuits operative to set to a predetermined state the contents of a predetermined bit of the control directory multibit locations identified by the memory command when the data directory indicates that information does not reside in the cache unit store.
Abstract: A data processing system comprises a data processing unit coupled to a cache unit which couples to a main store. The cache unit includes a store having a plurality of word locations arranged into a number of groups or sets of blocks of word locations, a data directory for storing addresses within a plurality of locations corresponding in number to the number of groups, and a control directory including a plurality of multibit locations corresponding in number to the number of groups of blocks. The cache unit further includes an input command buffer for storing commands received from the data processing unit, and control logic circuits. The control logic circuits include decoder circuits operative to set to a predetermined state the contents of a predetermined bit of the control directory multibit location identified by the memory command when the data directory indicates that the information does not reside in the cache unit store. The control logic circuits include circuits for forwarding the command to the main store. In response to each subsequent read command generated by the data processing unit, the control and data directories are accessed, and upon detecting that the predetermined bit of the control directory multibit location associated with the block specified by such command is set, the control logic circuits generate signals to inhibit the processing unit from performing further operations and the cache unit from transferring a duplicate command to the main store.
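
In modern terms this is a miss-pending bit that suppresses duplicate fetch commands. A minimal sketch, assuming the control directory can be modeled as a set of pending block addresses (the names are illustrative):

class MissPendingCache:
    def __init__(self):
        self.blocks = {}      # block address -> data (the cache store)
        self.pending = set()  # blocks whose "fetch in progress" bit is set

    def read(self, block):
        if block in self.blocks:
            return "hit", self.blocks[block]
        if block in self.pending:
            # The bit is already set: stall the processor instead of
            # sending a duplicate command to the main store.
            return "stall", None
        self.pending.add(block)   # set the control-directory bit
        return "fetch", None      # forward exactly one command to main store

    def fill(self, block, data):
        # Main store response: install the block and clear the bit.
        self.blocks[block] = data
        self.pending.discard(block)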

43 citations


Patent
Thomas F. Joyce
22 Dec 1977
TL;DR: In this paper, a first-in, first-out buffer memory coupled to a system bus receives all information transferred over the bus and checks whether the information received is intended to update main memory or is in response to a cache request.
Abstract: A first-in, first-out buffer memory coupled to a system bus receives all information transferred over the bus. Logic associated with the buffer memory tests whether the information received is intended to update main memory or is in response to a cache request. The information is written into the cache if the main memory address location is stored in the cache directory. Information received in response to a cache request is stored in a cache data buffer. Other information is discarded.
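
A sketch of that three-way classification, assuming each captured bus transfer can be tagged with a kind, an address, and a data word; the tags and containers are illustrative assumptions:

from collections import deque

def drain_buffer(fifo, directory, cache_data, reply_buffer):
    while fifo:
        kind, addr, data = fifo.popleft()
        if kind == "memory_update":
            if addr in directory:
                cache_data[addr] = data    # keep the cached copy current
            # updates to uncached addresses are discarded
        elif kind == "cache_reply":
            reply_buffer.append((addr, data))  # data this cache requested
        # all other bus traffic is discarded

# Tiny demo: an update to a cached address and a reply to a cache request.
directory = {0x40: 0}
cache_data = {0x40: 0}
replies = []
drain_buffer(deque([("memory_update", 0x40, 7),
                    ("cache_reply", 0x80, 9),
                    ("other", 0x00, 0)]),
             directory, cache_data, replies)
# cache_data[0x40] == 7; replies == [(0x80, 9)]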

40 citations


Patent
22 Dec 1977
TL;DR: In this paper, a data processing system is described which includes a plurality of system units, such as a central processing unit (CPU), main memory, and cache memory, all connected in common to a system bus and communicating with each other via the system bus.
Abstract: In a data processing system which includes a plurality of system units, such as a central processing unit (CPU), main memory, and cache memory, all connected in common to a system bus and communicating with each other via the system bus, and also having a private CPU-cache memory interface for permitting direct cache memory read access by the CPU, a multi-configurable cache store control unit permits the cache memory to operate in any of the following word modes:

1. Single pull banked;
2. Double pull banked;
3. Single pull interleaved;
4. Double pull interleaved.

The number of words read is a function of the main store configuration and the amount of memory interference from I/O controllers and other subsystems. The number ranges from one to four under the various conditions.

36 citations


Patent
22 Dec 1977
TL;DR: In this article, the cache store monitors each communication between system units to determine if it is a communication from a system unit to main memory which will update a word location in main memory.
Abstract: A data processing system includes a plurality of system units all connected in common to a system bus. Included are a main memory system and a high speed buffer or cache store. System units communicate with each other over the system bus. Apparatus in the cache store monitors each communication between system units to determine if it is a communication from a system unit to main memory which will update a word location in main memory. If that word location is also stored in cache then the word location in cache will be updated in addition to the word location in main memory.

35 citations


Patent
13 Jun 1977
TL;DR: In this article, error correction circuitry (ECC) is provided at the output of the address buffer (CAB) portion of the cache memory system, so that the address word specifying the main storage unit (MSU) location into which a block of data words (held in the CDB portion of the cache memory) is to be written back is error corrected upon readout.
Abstract: An apparatus for and a method of providing error correction of the address word of a cache memory system (CMS) utilizing post-write storage of the least recently used (LRU) block of data words. Error correction circuitry (ECC) is provided at the output of the address buffer (CAB) portion of the cache memory system, so that the address word specifying the addressable location in the main storage unit (MSU) into which a block of data words, held in the data buffer (CDB) portion of the cache memory, is to be stored or written back is error corrected upon readout. This error correction of the address word ensures that correctable errors in the address words supplied by the address buffer do not cause the storage interface unit (SIU) to generate a Miss signal, which would in turn require an MSU reference even though the desired address word and the associated data word are available in the cache memory system.

34 citations


Patent
22 Dec 1977
TL;DR: In this paper, the cache system is word oriented and comprises a directory, a data buffer and associated control logic, and the CPU requests data words by sending a main memory address of the requested data word to the cache.
Abstract: A data processing system includes a plurality of system units all connected in common to a system bus. The system units include a central processor (CPU), a memory system and a high speed buffer or cache system. The cache system is word oriented and comprises a directory, a data buffer and associated control logic. The CPU requests data words by sending a main memory address of the requested data word to the cache system. If the cache does not have the information, apparatus in the cache requests the information from main memory, and in addition, the apparatus requests additional information from consecutively higher addresses. If main memory is busy, the cache has apparatus to request fewer words.
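
A sketch of the miss-service policy, assuming a dict-backed cache and a main memory with a busy flag; the burst length of four and all names are illustrative assumptions, not values from the patent:

class MainMemory:
    def __init__(self, words, busy=False):
        self.words = words   # address -> word
        self.busy = busy     # True when other traffic has memory tied up

def service_request(cache, memory, addr, burst=4):
    if addr in cache:
        return cache[addr]
    # Miss: fetch the requested word and prefetch from consecutively
    # higher addresses; request fewer words when main memory is busy.
    count = 1 if memory.busy else burst
    for a in range(addr, addr + count):
        if a in memory.words:
            cache[a] = memory.words[a]
    return cache.get(addr)

Shortening the burst under load trades prefetch coverage for lower memory interference, which is the tradeoff the abstract describes.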

Patent
Thomas F. Joyce
22 Dec 1977
TL;DR: In this article, the directory and data buffer are organized in levels of memory locations, each level being loaded in turn from main memory; the round-robin count for each address location of the cache, identifying the next level to be written, is stored in a random access memory.
Abstract: During system initialization, a cache is completely loaded with valid information from main memory. The directory and data buffer are organized in levels of memory locations. Each level of the directory and data buffer is loaded in turn from main memory. Round-robin apparatus, which is preset during system initialization, identifies the next level into which a replacement data word is written on a first-in, first-out basis. The round-robin count for each address location of the cache, identifying the next level to be written, is stored in a random access memory (RAM). The contents of a particular address location of the RAM are incremented each time replacement information is written into that address location in the cache.
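
The counter RAM reduces to a few lines of code. A minimal sketch, with the counter array standing in for the RAM and the names chosen for illustration:

class RoundRobinReplacer:
    def __init__(self, n_locations, n_levels):
        self.n_levels = n_levels
        self.ram = [0] * n_locations  # preset during system initialization

    def next_level(self, location):
        # The stored count names the level to replace next (FIFO order);
        # it is incremented each time a replacement is written there.
        level = self.ram[location]
        self.ram[location] = (level + 1) % self.n_levels
        return level

Because every location's counter starts at zero and advances once per replacement, the levels at each location are reused strictly first-in, first-out.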

Patent
22 Dec 1977
TL;DR: A data processing system comprises a central processor unit, a main memory and a cache, all coupled in common to a system bus, as discussed by the authors; the central processor unit is also separately coupled to the cache.
Abstract: A data processing system comprises a central processor unit, a main memory and a cache, all coupled in common to a system bus. The central processor unit is also separately coupled to the cache. Apparatus in the cache is responsive to signals received from the central processor unit to initiate a test and verification mode of operation in the cache. This mode enables the cache to exercise various logic areas of the cache and to indicate hardware faults to the central processor unit.

Patent
22 Dec 1977
TL;DR: In this article, a word oriented data processing system includes a plurality of system units all connected in common to a system bus, including a central processor unit (CPU), a memory system and a high speed buffer or cache system.
Abstract: A word oriented data processing system includes a plurality of system units all connected in common to a system bus. Included are a central processor unit (CPU), a memory system, and a high speed buffer or cache system. The cache system is also coupled to the CPU. The cache includes an address directory and a data store, with each address location of the directory addressing its respective word in the data store. The CPU requests a word from the cache by sending a memory request which includes a memory address location. If the requested word is stored in the data store, it is sent to the CPU. If the word is not stored in the cache, the cache requests the word from memory. When the cache receives the word from memory, the word is sent to the CPU and also stored in the data store.
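
The request flow above is the canonical hit/miss path. A minimal sketch, assuming a dict directory mapping addresses to data-store slots and a naive free-slot choice (both illustrative simplifications):

def cpu_read(addr, directory, data_store, memory):
    if addr in directory:
        return data_store[directory[addr]]   # hit: word goes to the CPU
    word = memory[addr]          # miss: request the word from memory
    slot = len(data_store)       # naive slot allocation, for illustration
    data_store.append(word)      # the word is also kept in the data store
    directory[addr] = slot
    return word                  # and sent on to the CPU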

Patent
22 Dec 1977
TL;DR: In this article, the cache subsystem is coupled with a central processor, a main memory subsystem and a cache subsystem, all coupled in common to a system bus, and the transfer of information from the main memory to the cache starts from the lowest order address locations in main memory and continues from successive address locations until the cache is fully loaded.
Abstract: A data processing system includes a central processor subsystem, a main memory subsystem and a cache subsystem, all coupled in common to a system bus. During the overall system initialization process, apparatus in the cache subsystem effects the transfer of information from the main memory subsystem to the cache subsystem to load all address locations of the cache subsystem. The transfer of information from the main memory subsystem to the cache subsystem starts from the lowest order address locations in main memory and continues from successive address locations until the cache subsystem is fully loaded. This assures that the cache subsystem contains valid information during normal data processing.
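
The initialization pass is a single loop. A sketch under the assumption that one cache location maps to one word (names are illustrative):

def preload_cache(cache_locations, memory):
    # Fill every cache location from the lowest main-memory addresses
    # upward, so the cache holds only valid data before normal processing.
    directory = {}
    data = []
    for addr in range(cache_locations):   # lowest-order addresses first
        directory[addr] = len(data)
        data.append(memory[addr])
    return directory, data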

Journal ArticleDOI
Bhandarkar
TL;DR: This correspondence applies existing analytic and simulation models to the multiprocessor design space and presents guidelines for the multiprocessor system architect on preferred design alternatives and tradeoffs.
Abstract: Analytic and simulation models of memory interference have been reported in the literature. These models provide tools for analyzing various system architecture alternatives. Some of the design parameters are processor speed, memory speed, number of processors, number of memories, use of cache memories, high-order versus low-order interleaving, and memory allocation. This correspondence applies existing analytic and simulation models to the multiprocessor design space and presents guidelines for the multiprocessor system architect. Preferred design alternatives and tradeoffs are outlined.

01 May 1977
TL;DR: The concept of the Stack Working Set is introduced and expressions are derived for the forward recurrence time to the next reference to a page, for the time that a page spends in a cache of a given size, and for the time from last reference to the page being replaced.
Abstract: The Least-Recently-Used Stack Model (LRUSM) is known to be a good model of temporal locality. Yet, little analysis of this model has been performed and documented. Certain properties of the LRUSM are developed here. In particular, the concept of the Stack Working Set is introduced and expressions are derived for the forward recurrence time to the next reference to a page, for the time that a page spends in a cache of a given size and for the time from last reference to the page being replaced. The fault stream out of a cache memory is modelled and it is shown how this can be used to partially analyze a multilevel memory hierarchy. In addition, the Set Associative Buffer is analyzed and a necessary and sufficient condition for the optimality of the LRU replacement algorithm is advanced.
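
A standard concrete companion to the LRUSM analysis (not code from the paper) is the stack-distance computation: under LRU, a reference hits in a cache of size C exactly when its stack distance is at most C, so one pass over a trace characterizes every cache size at once.

def stack_distances(trace):
    stack, dists = [], []
    for page in trace:
        if page in stack:
            d = stack.index(page) + 1   # depth in the LRU stack
            stack.remove(page)
        else:
            d = float("inf")            # first reference: infinite distance
        dists.append(d)
        stack.insert(0, page)           # referenced page moves to the top
    return dists

# Example: stack_distances("ABCAB") -> [inf, inf, inf, 3, 3],
# so both re-references hit in any cache of size 3 or more.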


Proceedings ArticleDOI
13 Jun 1977
TL;DR: By appropriate cache system design, adequate memory system speed can be achieved to keep the processors busy and smaller cache memories are required for dedicated processors than for standard processors.
Abstract: The performances of two types of multiprocessor systems with cache memories dedicated to each processor are analyzed. It is demonstrated that by appropriate cache system design, adequate memory system speed can be achieved to keep the processors busy. A write through algorithm is used for each cache to minimize directory searching and several main memory modules are used to provide interleaved write. In large memories a cost performance analysis shows that with an increase in per bit costs of 5 to 20 percent, the memory throughput can be enhanced by a factor of 10 and by a factor of 3 or more over simple interleaving of the modules for random memory requests. Experimental evidence indicates smaller cache memories are required for dedicated processors than for standard processors. All memories and buses can be of modest speed.

Journal ArticleDOI
Guy Mazare
01 Mar 1977
TL;DR: An overview of a multi-micro-processor architecture in which all processors are equivalent is presented; the structure is characterized by a central memory and several caches, and is designed to avoid incoherent data.
Abstract: This paper presents an overview of a multi-micro-processor architecture in which all processors are equivalent. The structure is characterized by a central memory and several caches, and is designed to avoid incoherent data. A fast mechanism of “subcontracting” between one processor and another is described. The execution of two simple programs under this architecture is studied. The parallel algorithms are described and compared with the original (sequential) algorithms; it is shown that the overhead remains small, few extra memories are necessary, and synchronization does not slow down the execution unduly. Furthermore, the locality of the parallel algorithm is compared to that of the sequential algorithm and is found to be somewhat worse, but within reasonable bounds. In addition, an estimate of the cost of the coherence-keeping mechanism is given, using the difference between the miss ratios of the classical cache and the “coherence keeping cache”.

Proceedings ArticleDOI
01 May 1977
TL;DR: A novel structure derived from threaded code -- knotted code -- is proposed and shown to incorporate an excellent time/space tradeoff on both conventional machines and those having cache memories.
Abstract: It has previously been shown that automatically generated digital signal processing software can incorporate precomputed, data-independent program control and data access parameters. Such software is "time-efficient"; that is, at run time, only data-dependent computation occurs. In this paper, alternative program structures, all of which accommodate the autogen feature, are examined for time/space (memory) efficiency on both conventional machines and those having cache memories. A novel structure derived from threaded code -- knotted code -- is proposed and shown to incorporate an excellent time/space tradeoff.

Journal ArticleDOI
01 Mar 1977
TL;DR: Hardware costs and timing considerations are given, and show that the multiple-buffer approach to variable word length buffering provides a reasonable cost-performance solution.
Abstract: Variable word length processing is valuable for data base manipulations, editing functions in time-sharing systems, input-output data formatting, and vector operations, but current computer architectures seldom provide efficient means for manipulating variable length operands. A specialized computer architecture has been proposed to deal with the problems of variable length byte string processing. Operand buffering is a key part of the proposed architecture because the buffer: (1) replaces registers for variable length operands, (2) resolves address precision and operand boundary problems between the main memory and the ALU, and (3) can provide simultaneous access to several operands for faster execution. General observations are made about requirements which constrain memory buffer design and the merits of four memory buffering schemes are discussed. The most suitable of the buffering schemes is discussed in some detail. This scheme provides a separate cache for each of the operands. Hardware costs and timing considerations are given, and show that the multiple-buffer approach to variable word length buffering provides a reasonable cost-performance solution.

01 Jan 1977
TL;DR: The Cache County Snowmobiler: An Empirical Study, by Michael William Dierker, Master of Science, Utah State University, 1977; Major Professor: Dr. Gary Madsen; Department: Sociology.
Abstract: Snowmobiling is one of the major outdoor winter sports in Cache County, Utah. Despite its popularity, it has run into several problems, among which the most noticeable is its conflict with other winter recreationists, namely cross-country skiers and snowshoers. In order to resolve this conflict, one must first understand more about each group involved. As such, the purpose of this research was to obtain information on the snowmobiler in Cache County, Utah. Specifically, the objectives of the study were: (1) to identify the attitudes of the snowmobiler toward leisure and the environment; (2) to identify and compare occupations, SES, and social characteristics with studies in other regions; (3) to identify and compare aspects such as when, where, and why they go snowmobiling and the areas preferred by them with studies in other regions; and (4) to identify their other leisure-time activities. To collect the data, the names of the Cache County snowmobilers were obtained from tax assessment receipts at the Cache County Courthouse. From a total list of 501 names, a sample of 250 was selected by simple random sampling.


Book ChapterDOI
31 Mar 1977
TL;DR: It is concluded that very simple models can quite accurately predict such performance improvements if properly abstracted from the actual or proposed system architecture and can do so with a small expenditure of computer time and human effort.
Abstract: It is reasonable to attempt to improve the performance of an existing computer system by incorporating a cache or buffer memory. Furthermore, it is also reasonable to attempt to predict the effect of that inclusion by system models. This paper reports on such an effort. We begin by describing the system and devising a methodology for using a processor-dedicated cache in the multi-processor system, and conclude by examining a series of modeling efforts germane to predicting the performance effects of the cache. We conclude that very simple models, if properly abstracted from the actual or proposed system architecture, can quite accurately predict such performance improvements, and can do so with a small expenditure of computer time and human effort.