
Showing papers on "Memory management published in 1986"


Proceedings ArticleDOI
Kai Li1, Paul Hudak1
01 Nov 1986
TL;DR: Both theoretical and practical results show that the memory coherence problem can indeed be solved efficiently on a loosely-coupled multiprocessor.
Abstract: This paper studies the memory coherence problem in designing and implementing a shared virtual memory on loosely-coupled multiprocessors. Two classes of algorithms for solving the problem are presented. A prototype shared virtual memory on an Apollo ring has been implemented based on these algorithms. Both theoretical and practical results show that the memory coherence problem can indeed be solved efficiently on a loosely-coupled multiprocessor.
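
As a rough illustration of the kind of bookkeeping the paper's algorithms maintain, the sketch below models page ownership with an owner field and a copy set, invalidating read copies on a write fault. It is a single-process simulation under assumed names (PageEntry, read_fault, write_fault), not the paper's implementation.

```c
/* Minimal sketch of page-based coherence bookkeeping in the spirit of
 * ownership/invalidation algorithms. Names and the single-node
 * simulation are illustrative, not the paper's code. */
#include <stdio.h>

#define NPAGES 4
#define NPROCS 3

typedef enum { INVALID, READ_ONLY, WRITABLE } Access;

typedef struct {
    int owner;                 /* processor holding the writable copy */
    int copyset[NPROCS];       /* 1 if processor i holds a read copy  */
    Access access[NPROCS];     /* local access right per processor    */
} PageEntry;

static PageEntry table[NPAGES];

/* Read fault: fetch a copy from the owner and join the copy set. */
static void read_fault(int proc, int page) {
    PageEntry *e = &table[page];
    e->copyset[proc] = 1;
    e->access[proc] = READ_ONLY;
    e->access[e->owner] = READ_ONLY;  /* owner's copy degrades to read-only */
}

/* Write fault: invalidate all other copies and take ownership. */
static void write_fault(int proc, int page) {
    PageEntry *e = &table[page];
    for (int p = 0; p < NPROCS; p++) {
        if (p != proc) {
            e->copyset[p] = 0;
            e->access[p] = INVALID;
        }
    }
    e->owner = proc;
    e->copyset[proc] = 1;
    e->access[proc] = WRITABLE;
}

int main(void) {
    read_fault(1, 0);
    write_fault(2, 0);
    printf("page 0: owner=%d, proc 1 access=%d\n",
           table[0].owner, table[0].access[1]);  /* owner=2, access=0 (INVALID) */
    return 0;
}
```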

580 citations


Journal ArticleDOI
15 Jun 1986
TL;DR: This paper presents an architecture for a main memory DBMS, discussing the ways in which a memory resident database differs from a disk-based database, and considers alternative algorithms for selection, projection, and join operations, studying their performance.
Abstract: Most previous work in the area of main memory database systems has focused on the problem of developing query processing techniques that work well with a very large buffer pool. In this paper, we address query processing issues for memory resident relational databases, an environment with a very different set of costs and priorities. We present an architecture for a main memory DBMS, discussing the ways in which a memory resident database differs from a disk-based database. We then address the problem of processing relational queries in this architecture, considering alternative algorithms for selection, projection, and join operations and studying their performance. We show that a new index structure, the T Tree, works well for selection and join processing in memory resident databases. We also show that hashing methods work well for processing projections and joins, and that an old join method, sort-merge, still has a place in main memory.
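
The T Tree's search logic is easy to sketch: navigate by comparing the key against each node's bounding values, then binary-search inside the node's sorted array. The node layout and names below are illustrative, not the paper's code.

```c
/* Hedged sketch of T Tree search: each node holds a sorted array of
 * keys; navigate by the node's min/max, then binary-search within. */
#include <stdio.h>

#define NODE_CAP 4

typedef struct TNode {
    int keys[NODE_CAP];
    int nkeys;
    struct TNode *left, *right;
} TNode;

static int ttree_search(const TNode *n, int key) {
    while (n) {
        if (key < n->keys[0]) { n = n->left; continue; }
        if (key > n->keys[n->nkeys - 1]) { n = n->right; continue; }
        /* key is bounded by this node: binary-search its array */
        int lo = 0, hi = n->nkeys - 1;
        while (lo <= hi) {
            int mid = (lo + hi) / 2;
            if (n->keys[mid] == key) return 1;
            if (n->keys[mid] < key) lo = mid + 1; else hi = mid - 1;
        }
        return 0;   /* bounded but absent: search ends here */
    }
    return 0;
}

int main(void) {
    TNode leaf = {{1, 3, 5, 7}, 4, NULL, NULL};
    TNode root = {{10, 20, 30, 40}, 4, &leaf, NULL};
    printf("%d %d\n", ttree_search(&root, 5), ttree_search(&root, 8)); /* 1 0 */
    return 0;
}
```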

143 citations


Journal ArticleDOI
TL;DR: The integration of virtual memory management and interprocess communication in the Accent network operating system kernel is examined and its performance, both on a series of message-oriented benchmarks and in normal operation, is analyzed in detail.
Abstract: The integration of virtual memory management and interprocess communication in the Accent network operating system kernel is examined. The design and implementation of the Accent memory management system is discussed and its performance, both on a series of message-oriented benchmarks and in normal operation, is analyzed in detail.

122 citations


Patent
Ann Marie Rudy1
22 Apr 1986
TL;DR: In this article, a method and apparatus for simulating memory devices in a logic simulation machine include a finite state machine (FSM) having input/output (I/O) resources, instruction storage resources, a real memory resource, and instruction execution resources.
Abstract: A method and apparatus for simulating memory devices in a logic simulation machine include a finite state machine (FSM) having input/output (I/O) resources, instruction storage resources, a real memory resource, and instruction execution resources. A plurality of memory device ports to be simulated are defined and associated with corresponding respective subsets of the I/O resources of the FSM. Permuted sets of simulated memory array access signals, such as data, address, and control, are bound to selectable ones of the simulated memory ports and stored in the FSM, with the parameters of the memory operation established by the simulated signals. Stored sets of access instructions, representative of memory access operations, are augmented by the simulated signals and executed by the FSM against the real memory resource. All array instructions representing the same memory array share the same address space in the real memory resource.

111 citations


Journal ArticleDOI
TL;DR: A simulation study of the CRAY X-MP interleaved memory system with attention focused on steady-state performance for sequences of vector operations, identifying the occurrence of linked conflicts: repeating sequences of conflicts between two or more vector streams that result in reduced steady-state performance.
Abstract: One of the significant differences between the CRAY X-MP and its predecessor, the CRAY-1S, is a considerably increased memory bandwidth for vector operations. Up to three vector streams in each of the two processors may be active simultaneously. These streams contend for memory banks as well as data paths. All memory conflicts are resolved dynamically by the memory system. This paper describes a simulation study of the CRAY X-MP interleaved memory system with attention focused on steady state performance for sequences of vector operations. Because it is more amenable to analysis, we first study the interaction of vector streams issued from a single processor. We identify the occurrence of linked conflicts, repeating sequences of conflicts between two or more vector streams that result in reduced steady state performance. Both worst case and average case performance measures are given. The discussion then turns to dual processor interactions. Finally, based on our simulations, possible modifications to the CRAY X-MP memory system are proposed and compared. These modifications are intended to eliminate or reduce the effects of linked conflicts.
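
The flavor of a bank-conflict simulation like this can be conveyed in a few lines: streams issue one access per cycle, and a bank stays busy for a fixed reservation time after each access. The parameters (16 banks, 4-cycle busy time, two streams) are illustrative assumptions, not the paper's configuration.

```c
/* Toy simulation of two vector streams contending for interleaved
 * memory banks, illustrating how repeating ("linked") conflict
 * patterns can arise. All parameters are illustrative. */
#include <stdio.h>

#define NBANKS 16
#define BANK_BUSY 4   /* cycles a bank stays busy after an access */

int main(void) {
    int busy_until[NBANKS] = {0};
    long addr[2] = {0, 1};   /* current element address per stream */
    int stride[2] = {1, 8};  /* stride 8 on 16 banks reuses few banks */
    int stalls[2] = {0, 0};

    for (int cycle = 0; cycle < 1000; cycle++) {
        for (int s = 0; s < 2; s++) {
            int bank = (int)(addr[s] % NBANKS);
            if (busy_until[bank] <= cycle) {   /* bank free: issue */
                busy_until[bank] = cycle + BANK_BUSY;
                addr[s] += stride[s];
            } else {
                stalls[s]++;                   /* conflict: stall and retry */
            }
        }
    }
    printf("stalls: stream0=%d stream1=%d\n", stalls[0], stalls[1]);
    return 0;
}
```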

81 citations


01 Sep 1986
TL;DR: The persistent memory system is based on a uniform memory abstraction, which eliminates the distinction between transient objects (data structures) and persistent objects (files and databases), and therefore allows the same set of powerful and flexible operations to be applied with equal efficiency on both transient and persistent objects from a programming language such as Lisp or Prolog.
Abstract: Object-oriented databases are needed to support database objects with a wide variety of types and structures. A persistent memory system provides a storage architecture for long-term, reliable retention of objects with rich types and structures in the virtual memory itself. It is based on a uniform memory abstraction, which eliminates the distinction between transient objects (data structures) and persistent objects (files and databases), and therefore, allows the same set of powerful and flexible operations to be applied with equal efficiency on both transient and persistent objects from a programming language such as Lisp or Prolog. Because no separate file system is assumed for long-term, reliable storage of objects, the system requires a crash recovery scheme at the level of the virtual memory, which is a major contribution of the paper. It is expected that the persistent memory system will lead to significant simplifications in implementing applications such as object-oriented databases.

68 citations


Patent
22 Dec 1986
TL;DR: In this article, a multiprocessor system is described where the process manager assigns processes to processors and satisfies their initial memory requirements through global memory allocations, and each processor's kernel satisfies processes' dynamic memory allocation requests from uncommitted memory, deallocating to uncommitted memory both memory that is dynamically released and the memory of terminating processes.
Abstract: In a multiprocessor system (FIG. 1), memory (22) of each adjunct processor (11-12) comprises global memory (42) and local memory (41). All global memory is managed by a process manager (30) of host processor (10). Each processor's local memory is managed by its operating system kernel (31). Local memory comprises uncommitted memory (45) not allocated to any process and committed memory (46) allocated to processes. The process manager assigns processes to processors and satisfies their initial memory requirements through global memory allocations. Each kernel satisfies processes' dynamic memory allocation requests from uncommitted memory, and deallocates to uncommitted memory both memory whose deallocation is dynamically requested and the memory of terminating processes. Each processor's kernel and the process manager cooperate to transfer memory between global memory and uncommitted memory to keep the amount of uncommitted memory within a predetermined range.
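
The watermark policy in the last sentence can be sketched directly; the thresholds and names below are hypothetical, not taken from the patent.

```c
/* Sketch of keeping a pool of uncommitted memory within a range:
 * below the low watermark, request pages from global memory; above
 * the high watermark, return pages. Values are hypothetical. */
#include <stdio.h>

#define LOW_WATERMARK   64    /* pages */
#define HIGH_WATERMARK 256

static int uncommitted = 100; /* pages not allocated to any process */
static int global_pool = 1000;

/* Called after every allocation or deallocation to rebalance. */
static void rebalance(void) {
    while (uncommitted < LOW_WATERMARK && global_pool > 0) {
        global_pool--; uncommitted++;   /* request a page from global */
    }
    while (uncommitted > HIGH_WATERMARK) {
        uncommitted--; global_pool++;   /* return a page to global */
    }
}

int main(void) {
    uncommitted -= 80;   /* a process's dynamic request drains the pool */
    rebalance();
    printf("uncommitted=%d global=%d\n", uncommitted, global_pool);
    return 0;
}
```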

58 citations


Journal ArticleDOI
Ted Kaehler1
01 Jun 1986
TL;DR: The paper explains why the unusual design choices in LOOM were made, and provides an interesting example of the process of designing an integrated virtual memory and storage management system.
Abstract: LOOM (Large Object-Oriented Memory) is a virtual memory implemented in software that supports the Smalltalk-80(™) programming language and environment on the Xerox Dorado computer. LOOM provides 8 billion bytes of secondary memory address space and is specifically designed to run on computers with a narrow word size (16-bit wide words). All storage is viewed as objects that contain fields. Objects may have an average size as small as 10 fields. LOOM swaps objects between primary and secondary memory, and addresses each of the two memories with a different sized object pointer. When objects are cached in primary memory, they are known only by their short pointers. On a narrow word size machine, the narrow object pointers in primary memory allow a program such as the Smalltalk-80 interpreter to enjoy a substantial speed advantage. Interesting design problems and solutions arise from the mapping between the two address spaces and the temporary nature of an object's short address. The paper explains why the unusual design choices in LOOM were made, and provides an interesting example of the process of designing an integrated virtual memory and storage management system.
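
A hedged sketch of the short/long pointer mapping: resident objects are named by narrow indices into a resident-object table, while secondary memory uses wide addresses. The table layout and names are illustrative; LOOM's actual mechanism (including eviction) is more elaborate.

```c
/* Illustrative LOOM-style pointer narrowing: objects cached in primary
 * memory get short 16-bit names (indices into a resident object table),
 * while secondary memory is addressed with wide pointers. */
#include <stdint.h>
#include <stdio.h>

#define TABLE_SIZE 1024   /* max resident objects (short-pointer space) */

typedef struct {
    uint32_t long_addr;   /* wide secondary-memory address */
    int in_use;
} ResidentEntry;

static ResidentEntry rot[TABLE_SIZE];

/* Fault an object in: find (or allocate) a short pointer for it. */
static int16_t fault_in(uint32_t long_addr) {
    for (int16_t i = 0; i < TABLE_SIZE; i++)   /* already resident? */
        if (rot[i].in_use && rot[i].long_addr == long_addr) return i;
    for (int16_t i = 0; i < TABLE_SIZE; i++) { /* allocate a new slot */
        if (!rot[i].in_use) {
            rot[i].in_use = 1;
            rot[i].long_addr = long_addr;
            return i;                          /* temporary short name */
        }
    }
    return -1;  /* table full: a real system would evict an object */
}

int main(void) {
    int16_t a = fault_in(0x00ABCDEF);
    int16_t b = fault_in(0x00ABCDEF);          /* same object, same name */
    printf("short ptr %d == %d\n", a, b);
    return 0;
}
```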

58 citations


Patent
Alan Jay Smith1
12 Dec 1986
TL;DR: In this article, the authors propose an instruction execution accelerator for a pipelined digital machine with virtual memory, which includes a small associative memory against which the page number of a store operation's target address is compared; if there is a match, it is known that the target address relates to a page within the real memory and the instruction can complete asynchronously.
Abstract: An instruction execution accelerator for a pipelined digital machine with virtual memory. The digital machine includes a pipelined processor which on memory accesses outputs a virtual address to a data cache unit (DCU). On particular memory accesses, such as store or similar operations, the pipelined processor can be advanced or accelerated to the next instruction once the memory access is known not to cause a page fault. The pipeline accelerator includes a small associative memory against which the page number of a store operation's target address is compared. If there is a match, it is known that the target address relates to a page within the real memory and the instruction can complete asynchronously. Otherwise, if there is no match, the page address is inserted in the associative memory to become the most recent addition. On the recognition of a page fault by the DCU, the associative memory will be cleared to make room for new entries. The instruction execution accelerator can also be used for load instructions. If an address match is found on a load instruction, then the pipeline can be advanced to the next instruction, and must wait for the completion of the present load instruction only when another instruction attempts to reference the data prior to its being loaded.
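
A minimal sketch of the check described above, with an assumed FIFO replacement order and illustrative sizes; a hit means the page was recently confirmed resident, so the store cannot fault.

```c
/* Sketch of the accelerator's check: a tiny associative memory of page
 * numbers recently confirmed resident. A hit means the store cannot
 * page-fault, so the pipeline may advance past it; a page fault clears
 * the memory. Sizes and names are illustrative. */
#include <stdio.h>

#define ENTRIES 8

static unsigned pages[ENTRIES];
static int nvalid = 0, next = 0;

/* Returns 1 if the page is known resident (pipeline may advance). */
static int check_and_insert(unsigned page) {
    for (int i = 0; i < nvalid; i++)
        if (pages[i] == page) return 1;   /* match: no fault possible */
    pages[next] = page;                   /* miss: remember as newest */
    next = (next + 1) % ENTRIES;
    if (nvalid < ENTRIES) nvalid++;
    return 0;                             /* must wait for translation */
}

static void on_page_fault(void) { nvalid = 0; next = 0; } /* DCU signal */

int main(void) {
    printf("%d %d\n", check_and_insert(42), check_and_insert(42)); /* 0 1 */
    on_page_fault();
    printf("%d\n", check_and_insert(42));                          /* 0 */
    return 0;
}
```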

57 citations


Patent
02 Apr 1986
TL;DR: In this paper, the authors describe a computer system in which two independent processors, each with its own operating system software, communicate via a bus system, operate substantially concurrently, and share a common memory.
Abstract: A computer system is described wherein two independent processors, each with its own operating system software, communicate via a bus system, operate substantially concurrently, and share a common memory. The architecture of the computer system is such that one of the processors is allocated the bulk of the memory bandwidth, with the other processor taking the remainder. Arbitration for memory allocation is accomplished via a combination of a new firmware instruction and a semaphore.

43 citations


Patent
17 Jan 1986
TL;DR: In this article, a virtual resource manager (VRM) is used to manage the operation of a coprocessor in a virtual memory type data processing system, in which an input/output channel and an input/output channel controller interconnect the coprocessor to the main processor and system memory.
Abstract: A method to manage the operation of a coprocessor in a virtual memory type data processing system in which an input/output channel and an input/output channel controller interconnect the coprocessor to the main processor and system memory. A Virtual Resource Manager (VRM), comprising a group of interrelated software programs, functions in the system to establish virtual machines that execute application programs concurrently. A Coprocessor Manager component of the VRM establishes a virtual machine in which the coprocessor executes application programs that cannot be executed on the main processor. The coprocessor is mounted on an integrated circuit card which is inserted into a `mother board` socket which contains the main processor and related components.

Proceedings ArticleDOI
01 May 1986
TL;DR: It is shown that the logical problem of buffering is directly related to the problem of synchronization, and a simple model is presented to evaluate the performance improvement resulting from buffering.
Abstract: In highly-pipelined machines, instructions and data are prefetched and buffered in both the processor and the cache. This is done to reduce the average memory access latency and to take advantage of memory interleaving. Lockup-free caches are designed to avoid processor blocking on a cache miss. Write buffers are often included in a pipelined machine to avoid processor waiting on writes. In a shared memory multiprocessor, there are more advantages in buffering memory requests, since each memory access has to traverse the memory-processor interconnection and has to compete with memory requests issued by different processors. Buffering, however, can cause logical problems in multiprocessors. These problems are aggravated if each processor has a private memory in which shared writable data may be present, such as in a cache-based system or in a system with a distributed global memory. In this paper, we analyze the benefits and problems associated with the buffering of memory requests in shared memory multiprocessors. We show that the logical problem of buffering is directly related to the problem of synchronization. A simple model is presented to evaluate the performance improvement resulting from buffering.
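
The link between buffering and synchronization can be made concrete with a single-threaded toy: a write parked in the buffer leaves memory stale until the buffer drains, so the drain must complete before a ready flag is set. This is an illustration of the ordering issue, not the paper's model.

```c
/* Single-threaded illustration of the buffering/synchronization link:
 * a write sits in a FIFO buffer while "memory" still holds the old
 * value; draining the buffer before setting a ready flag restores the
 * ordering a consumer expects. All names are illustrative. */
#include <stdio.h>

#define BUF_SZ 4
static int memory[8];                      /* toy shared memory */
static struct { int addr, val; } wbuf[BUF_SZ];
static int wcount = 0;

static void buffered_write(int addr, int val) {
    wbuf[wcount].addr = addr;              /* enqueue; memory not yet updated */
    wbuf[wcount].val = val;
    wcount++;
}

static void drain(void) {                  /* complete all pending writes */
    for (int i = 0; i < wcount; i++)
        memory[wbuf[i].addr] = wbuf[i].val;
    wcount = 0;
}

int main(void) {
    buffered_write(0, 99);                 /* the data */
    printf("before drain: data=%d\n", memory[0]);   /* stale: 0 */
    drain();                               /* ...must happen before... */
    memory[1] = 1;                         /* flag = ready */
    printf("after drain:  data=%d flag=%d\n", memory[0], memory[1]);
    return 0;
}
```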

Book
01 Jan 1986
TL;DR: Applications of Data Structures to Sorting and Recursion, and some Miscellaneous Applications of Trees in Modula-2.
Abstract: Abstraction and Problem Solving. The Stack and Queue Abstractions and Some Implementations. Applications of Stack and Queue Abstractions in Modula-2. The List (Search Table) Abstraction and Some Implementations. Applications of the List Abstraction in Modula-2. Review of Recursion. The Tree and Some Implementations. Some Miscellaneous Applications of Trees in Modula-2. The Search Table Implemented with Binary Trees in Modula-2. The Search Table Implemented Using Hash Tables. Applications of Data Structures to Sorting. Memory Management. Appendixes. Index.

Patent
24 Nov 1986
TL;DR: A method is presented for determining the maximum amount of physical memory present in a data processing system that can be configured with one or more memory modules, where the modules may be one of several types having different amounts of memory locations.
Abstract: A method for determining the maximum amount of physical memory present in a data processing system that can be configured to have one or more memory modules where the memory modules may be one of several types having different amounts of memory locations. By having signals indicating the presence of a memory module and the module type directly available with minimal intervening logic, a diagnostic process can accurately determine the amount of memory present in the system and reduce the possibility of a failed memory module going undetected. A method is also described using these memory module present and module type signals for detecting an attempt by either the central processor or an input/output controller to access a memory location that is not physically present within the data processing system.

Journal ArticleDOI
01 Jan 1986
TL;DR: As the cost of computer memory continues to decline faster than that of processors, it may be realistic to effectively apply pattern recognition methodology to security evaluation of an electric power system with a modest level of memory requirement.
Abstract: As the cost of computer memory continues to decline faster than that of processors, it may be realistic to effectively apply pattern recognition methodology to security evaluation of an electric power system. Efficient implementation techniques are developed to achieve assessment in real time with a modest level of memory requirement. The basic idea is to recognize the unknown security of a particular system state operation from stored knowledge about similar operating patterns. Two efficient data structures are proposed here for its implementation. First, a distributed memory device, an associative memory is developed for recognition. This particular memory is found to be capable of parallel pattern matching along with reduced computer storage. Second, for an efficient implementation of the memory structure, these associative memories are configured in a hierarchical structure which not only expands storage capacity but also utilizes the speed of tree search. This structure provides a basis of an error-free, rapid, and memory-saving recognition algorithm.

Patent
Teru Shinohara1, Hideki Osone1
26 Mar 1986
TL;DR: In this paper, a buffer memory control system for executing an immediate instruction is described, including a block fetch control unit that generates a first move-in complete signal indicating that the move-in of the heading subblock from a main memory to the buffer memory is completed.
Abstract: A buffer memory control system for executing an immediate instruction, including a block fetch control unit for generating a first move-in complete signal indicating that the move-in of the heading subblock from a main memory to the buffer memory is completed. In response to the first move-in complete signal, the fetch and store operation starts without waiting for the completion of the move-in of a full block.

Patent
20 May 1986
TL;DR: In a production system for performing predetermined processes on products that each have a memory incorporated therein, information necessary for those processes is sequentially written by an external computer into the memory incorporated in each product, and such information is read out from the memory and used by the production system to carry out the respective processes on the products in a low-cost and efficient way.
Abstract: In a production system for performing predetermined processes on products which each have a memory incorporated therein, information necessary for those processes is sequentially written by an external computer in the memory incorporated in each product, and such information is read out from the memory and used by the production system to carry out the respective processes on the products in a low-cost and efficient way.

Journal ArticleDOI
Kai Li1, Paul Hudak1
TL;DR: A new list compaction method is presented that performs well during both sequential and ‘parallel’ list generation and performance figures in a simulated environment suggest that the strategy consistently performs better than conventional cdr‐coding, with essentially the same complexity.
Abstract: List compaction, or so-called ‘cdr-coding’, can greatly reduce the storage needs of list processing languages. However, existing methods do not perform well when several lists are being constructed simultaneously from the same heap, since the non-contiguous nature of the cells being allocated eliminates the opportunity for compaction. This situation arises not only in true parallel systems sharing a common memory, and sequential systems supporting multiple processes, but also quite often in purely sequential systems, where it is not uncommon to build several different lists simultaneously within a single loop. In this paper, a new list compaction method is presented that performs well during both sequential and ‘parallel’ list generation. The method is essentially a generalization of cdr-coding, in which lists are represented explicitly as linked vectors rather than implicitly as compacted memory. In addition, an encoding scheme is used that is as simple or simpler than all known encodings, and destructive operations are supported with no greater overhead than existing schemes. Performance figures in a simulated environment suggest that the strategy consistently performs better than conventional cdr-coding, with essentially the same complexity.
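
A simplified sketch of the linked-vector idea: each list grows inside its own small vector of cells, and a full vector links to a fresh one, so several lists built simultaneously each stay locally contiguous. The chunk size and names are illustrative, and the paper's encoding details are omitted.

```c
/* Illustrative linked-vector list: cells live in per-list vectors
 * ("chunks"), and a full chunk links to a fresh one. Interleaved
 * construction of many lists keeps each list's cells contiguous
 * within its chunks. Not the paper's encoding. */
#include <stdio.h>
#include <stdlib.h>

#define CHUNK 8

typedef struct Chunk {
    int cells[CHUNK];
    int used;
    struct Chunk *next;          /* link to the overflow vector */
} Chunk;

typedef struct { Chunk *head, *tail; } List;

static void append(List *l, int v) {
    if (!l->tail || l->tail->used == CHUNK) {      /* need a fresh vector */
        Chunk *c = calloc(1, sizeof *c);
        if (l->tail) l->tail->next = c; else l->head = c;
        l->tail = c;
    }
    l->tail->cells[l->tail->used++] = v;
}

int main(void) {
    List a = {0}, b = {0};
    for (int i = 0; i < 20; i++) { append(&a, i); append(&b, -i); } /* interleaved */
    printf("a's head chunk holds %d contiguous cells\n", a.head->used); /* 8 */
    return 0;
}
```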

Journal ArticleDOI
W Oed, O Lange1
01 Oct 1986
TL;DR: Some analytical results regarding the access in vector mode to an interleaved memory system and the number and type of memory conflicts that were encountered are presented.
Abstract: Memory interleaving and multiple access ports are the key to a high memory bandwidth in vector processing systems. Each of the active ports supports an independent access stream to memory among which access conflicts may arise. Such conflicts lead to a decrease in memory bandwidth and consequently to longer execution times. We present some analytical results regarding the access in vector mode to an interleaved memory system. In order to demonstrate the practical effects of our analytical results we have done time measurements of some simple vector loops on a 2-CPU, 16-bank CRAY X-MP. By corresponding simulations we obtained the number and type of memory conflicts that were encountered.
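
One standard analytical result for this setting: with B banks and stride s, a single stream touches only B/gcd(s, B) distinct banks, which bounds its bandwidth once bank busy time dominates. The snippet below merely tabulates that rule of thumb; it is not the authors' model.

```c
/* Tabulate the classic interleaving rule of thumb: a stride-s stream
 * over B banks cycles through B / gcd(s, B) distinct banks. */
#include <stdio.h>

static int gcd(int a, int b) { return b ? gcd(b, a % b) : a; }

int main(void) {
    int banks = 16;
    for (int stride = 1; stride <= 8; stride++)
        printf("stride %d: %2d of %d banks used\n",
               stride, banks / gcd(stride, banks), banks);
    return 0;
}
```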

Patent
11 Mar 1986
TL;DR: An electronic postage meter with a non-volatile memory security circuit apparatus is shown in this paper, where the security circuit comprises means for limiting the amount of time the memories may be continuously enabled, means for preventing simultaneous enabling of both memories, and means for preventing the write enabling of a memory if the write enable signal is active before a memory select signal is active.
Abstract: An electronic postage meter with a non-volatile memory security circuit apparatus is disclosed. The security circuit comprises means for limiting the amount of time the memories may be continuously enabled, means for preventing simultaneous enabling of both memories, and means for preventing the write enabling of a memory if the write enable signal is active before a memory select signal is active. The circuit prevents memory access when a conflict is sensed in an output related to the non-volatile memories. The security circuit provides additional protection to the non-volatile memory so that valuable critical accounting information located therein cannot be modified or destroyed.

Journal ArticleDOI
01 May 1986
TL;DR: A heuristic algorithm for register spilling within basic blocks is introduced, trace optimization techniques can extend the use of the algorithm to global allocation, and it is shown that the use of registers can be more effective in reducing the bus traffic than cache memory of the same size.
Abstract: Single-chip computers are becoming increasingly limited by the access constraints to off-chip memory. To achieve high performance, the structure of on-chip memory must be appropriate, and it must be allocated effectively to minimize off-chip communication. We report experiments that demonstrate that on-chip memory can be effective for local variable accesses. For best use of the limited on-chip area, we suggest organizing memory as registers and argue that an effective register spilling scheme is required. We introduce a heuristic algorithm for register spilling within basic blocks and demonstrate that trace optimization techniques can extend the use of the algorithm to global allocation. Through trace simulation, we show that the use of registers can be more effective in reducing the bus traffic than cache memory of the same size.
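
One classical heuristic consistent with this setting is to spill, within a basic block, the value whose next use is furthest away (Belady's rule). The sketch below applies it to a toy reference string; the representation and names are illustrative, not necessarily the authors' algorithm.

```c
/* Furthest-next-use spilling over a toy variable reference string,
 * as a stand-in for a basic-block spill heuristic. Illustrative only. */
#include <stdio.h>
#include <string.h>

#define NREGS 2

int main(void) {
    const char refs[] = "abcabdca";   /* variable references in a block */
    int n = (int)strlen(refs);
    char regs[NREGS] = {0};
    int spills = 0;

    for (int i = 0; i < n; i++) {
        int hit = 0, free_slot = -1;
        for (int r = 0; r < NREGS; r++) {
            if (regs[r] == refs[i]) hit = 1;
            if (!regs[r]) free_slot = r;
        }
        if (hit) continue;
        if (free_slot >= 0) { regs[free_slot] = refs[i]; continue; }
        /* all registers busy: spill the one with the furthest next use */
        int victim = 0, best = -1;
        for (int r = 0; r < NREGS; r++) {
            int next = n;                     /* n means never used again */
            for (int j = i + 1; j < n; j++)
                if (refs[j] == regs[r]) { next = j; break; }
            if (next > best) { best = next; victim = r; }
        }
        regs[victim] = refs[i];
        spills++;
    }
    printf("spills with %d registers: %d\n", NREGS, spills);
    return 0;
}
```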

Patent
07 May 1986
TL;DR: In this paper, a high-speed, intelligent, distributed control memory system is described, which is comprised of an array of modular, cascadable, integrated circuit devices, referred to as "memory elements".
Abstract: A high-speed, intelligent, distributed control memory system is comprised of an array of modular, cascadable, integrated circuit devices, hereinafter referred to as "memory elements." Each memory element is further comprised of storage means, programmable on-board processing ("distributed control") means, and means for interfacing with both the host system and the other memory elements in the array utilizing a single shared bus. Each memory element of the array is capable of transferring (reading or writing) data between adjacent memory elements once per clock cycle. In addition, each memory element is capable of broadcasting data to all memory elements of the array once per clock cycle. This ability to asynchronously transfer data between the memory elements at the clock rate, using the distributed control, facilitates unburdening host system hardware and software from tasks more efficiently performed by the distributed control. As a result, the memory itself can, for example, perform such tasks as sorting and searching, even across memory element boundaries, in a manner that conserves host system resources and is faster and more efficient than using them.


Patent
Hartmut Schrenk1
07 Jul 1986
TL;DR: In this article, a method for controlling memory access to a user area and an initial code area of a main memory of a chip card includes carrying out an internal release procedure with a data comparison of an initial code from the initial code area and a data word from a terminal; firmly coupling addresses of the main memory and of a control memory to each other; marking several storage locations of the main memory as the initial code area with one control bit at a time in the control memory; and marking an initial code deposited in the associated storage location of the initial code area as activated or deactivated.
Abstract: A method for controlling memory access to a user area and an initial code area of a main memory of a chip card includes carrying out an internal release procedure with a data comparison of an initial code from the initial code area and a data word from a terminal; firmly coupling addresses of the main memory and of a control memory to each other; marking several storage locations of the main memory as the initial code area with one control bit at a time in the control memory; marking an initial code deposited in the associated storage location of the initial code area as activated or deactivated with one further control bit at a time in the control memory; generating an initial release signal in a release procedure only if a storage location is addressed by an activated initial code and if agreement with the data word entered by the terminal prevails; and preventing generation of the initial release signal if a deactivated code word is addressed and/or if the respective initial code does not agree with the data word, and an apparatus for carrying out the method.


Journal ArticleDOI
Brad Cohen1, Ralph Mcgarity1
TL;DR: Pipelining, microsequencer start-up in parallel with bus arbitration, and a fully associative translation cache enhanced the performance of this 32-bit memory management device.
Abstract: Pipelining, microsequencer start-up in parallel with bus arbitration, and a fully associative translation cache enhanced the performance of this 32-bit memory management device.

Proceedings ArticleDOI
01 Jan 1986
TL;DR: The integrated memory management unit (MMU) has a 16-entry fully associative Translation Look-aside Buffer (TLB) and protection check circuitry, and can translate a virtual address to a real address in 36ns in the worst case.
Abstract: The processor has been implemented using a double-metal layer CMOS process technology with a 1.5μm design rule to integrate 375,000 transistors on a single chip. It operates at 16MHz and consumes 1.5W. The processor has six independently operating function units that form a pipeline structure, as shown in Figures 2 and 3. The PFU (Prefetch Unit) prefetches instructions into a 16-byte prefetch queue. The IDU (Instruction Decode Unit) decodes the instructions and sets commands into a two-word by 53b decoded instruction queue (IDQ). The EAG (Effective Address Generator) calculates the operand address, while the MMU (Memory Management Unit) translates virtual addresses into real addresses. A BCU (Bus Control Unit) initiates memory access for instruction/data fetch. The EXL (Execution Unit) carries out the instruction-set functions. The integrated memory management unit (MMU) has a 16-entry fully associative Translation Look-aside Buffer (TLB) and protection check circuitry. The TLB holds sixteen virtual-to-real address pairs in a fully associative manner, each consisting of a 21b content-addressable memory (CAM) for the virtual address tag and a 28b data memory for the real address. The TLB can translate a virtual address to a real address in 36ns in the worst case. The chip microphotograph is shown in Figure 1.
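
A software sketch of a fully associative TLB lookup like the 16-entry one described: every valid tag is compared against the virtual page number (sequentially here; the hardware CAM compares all entries in parallel). Field widths and names are illustrative.

```c
/* Sketch of a 16-entry fully associative TLB lookup. The loop stands
 * in for the parallel CAM comparison done in hardware. */
#include <stdint.h>
#include <stdio.h>

#define TLB_ENTRIES 16

typedef struct {
    uint32_t vtag;     /* virtual page tag (CAM side) */
    uint32_t rpage;    /* real page number (data side) */
    int valid;
} TlbEntry;

static TlbEntry tlb[TLB_ENTRIES];

/* Returns 1 on a hit and writes the real page; 0 means a table walk. */
static int tlb_lookup(uint32_t vpage, uint32_t *rpage) {
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vtag == vpage) {
            *rpage = tlb[i].rpage;
            return 1;
        }
    }
    return 0;
}

int main(void) {
    tlb[3] = (TlbEntry){ .vtag = 0x1234, .rpage = 0x0042, .valid = 1 };
    uint32_t r = 0;
    printf("hit=%d real=0x%04x\n", tlb_lookup(0x1234, &r), (unsigned)r);
    return 0;
}
```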

Proceedings ArticleDOI
01 Jan 1986
TL;DR: The development of a cache memory to support 32b microprocessors will be offered and the circuit operates at 33MHz, delivers data to the CPU in 2/4 clock cycles and is fabricated in 2μm CMOS.
Abstract: The development of a cache memory to support 32b microprocessors will be offered. Including an on-chip memory unit, the circuit operates at 33MHz, delivers data to the CPU in 2/4 clock cycles, and is fabricated in 2μm CMOS.

Patent
06 Oct 1986
TL;DR: In this article, a system for sequentially providing external circuit data for an external circuit comprising a memory having a plurality of memory locations identified by addresses, each memory location containing an instruction comprising external circuit data and memory location data, the memory further comprising an address input terminal for currently accessing one of the memory locations in response to receipt of the address for that memory location at the address input terminal, and a sequencer coupled to the memory to receive memory location data from the currently addressed one of these memory locations, for selecting the address of another of the memory locations in response to that memory location data.
Abstract: A system for sequentially providing external circuit data for an external circuit comprising: a memory having a plurality of memory locations identified by addresses, each memory location containing an instruction comprising external circuit data and memory location data, the memory further comprising an address input terminal for currently accessing one of the memory locations in response to receipt of the address for that memory location at the address input terminal; a sequencer, coupled to the memory to receive memory location data from the currently addressed one of the memory locations, for selecting the address of another of the memory locations in response to the next memory location data; a first register, coupled between the sequencer and the address input terminal of the memory to receive the address of the other of the memory locations from the sequencer for storing the address selected by the sequencer to make the address available at the address input terminal of the memory to access the other of the memory locations, after the storing; and a second register, coupled between the memory and the external circuit, for storing the external circuit data of the currently accessed memory location for use, after the storing, by the external circuit. A related method is also provided.

Journal ArticleDOI
R. E. Matick1
TL;DR: This paper demonstrates how available memory systems have influenced computer architecture, in both the central processing unit and the overall system, taking a direct look at how specific architectures attempt to circumvent the limitations of the associated memory system.
Abstract: The largest part of computer architecture, in both the central processing unit and the overall system, has been and continues to be directly influenced in one way or another by the types of memory systems available. This is readily apparent in certain areas such as I/O architecture and memory hierarchies. However, the pervasiveness of this influence throughout the entire system is not so obvious. This paper demonstrates this relationship and shows how it has affected computer architecture over the years. Two approaches are used, the first being a direct look at how specific architectures attempt to circumvent the limitations of the associated memory system. This includes such topics as the internal architecture of CPUs: memory hierarchies and virtual memory, I/O architecture, file structuring, and data base architecture. Second, a gedanken (thought) experiment is used to predict future trends. It is assumed that very large-scale integration will evolve to the point at which we can have nearly any main memory system we desire with some reasonable constraints. The architectural changes that might take place will be seen to be precisely related to the weaknesses in current memory systems which various architectures currently attempt to circumvent.