
Showing papers on "Memory management" published in 1999


Patent
Walter A. Hubis1, William G. Deitz1
13 Sep 1999
TL;DR: In this paper, the authors propose a method for controlling access to a hardware device in a computer system having a plurality of host computers and at least one hardware device connected to them.
Abstract: The invention provides structure and method for controlling access to a shared storage device, such as a disk drive storage array, in computer systems and networks having a plurality of host computers. A method is provided for controlling access to a hardware device in a computer system having a plurality of computers and at least one hardware device connected to them. The method includes the steps of associating a locally unique identifier with each of the plurality of computers; defining a data structure in a memory identifying, based on the locally unique identifier, which particular ones of the computers may be granted access to the device; and querying the data structure to determine if a requesting one of the computers should be granted access to the hardware device. In one embodiment, the procedure for defining the data structure in memory includes defining a host computer ID map data structure in the memory; defining a port mapping table data structure comprising a plurality of port mapping table entries in the memory; defining a host identifier list data structure in the memory; defining a volume permission table data structure in the memory; and defining a volume number table data structure in the memory. In one particular embodiment, the memory is a memory of a memory controller controlling the hardware device, and the hardware device is a logical volume of a storage subsystem. The invention also provides an inventive controller structure, and a computer program product implementing the inventive method.
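
The query step in this claim reduces to a table lookup. Below is a minimal Python sketch of a volume permission table keyed by each host's locally unique identifier; the table layout and names are illustrative assumptions, not the patent's actual structures.

```python
# Minimal sketch of the access-control query described above.
# The layout and names are hypothetical illustrations.

# Volume permission table: locally unique host ID -> set of
# logical volume numbers that host may access.
volume_permissions = {
    "host-wwn-0001": {0, 1, 2},
    "host-wwn-0002": {2},
}

def may_access(host_id: str, volume: int) -> bool:
    """Query the data structure to decide whether a requesting
    host should be granted access to the logical volume."""
    return volume in volume_permissions.get(host_id, set())

assert may_access("host-wwn-0001", 1)
assert not may_access("host-wwn-0002", 0)   # not in its permission set
assert not may_access("host-wwn-9999", 2)   # unknown host: deny by default
```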

644 citations


Journal ArticleDOI
TL;DR: This paper considers how computational RAM integrates processing power with memory by using an architecture that preserves and exploits the features of memory.
Abstract: Computational RAM is a processor-in-memory architecture that makes highly effective use of internal memory bandwidth by pitch-matching simple processing elements to memory columns. Computational RAM can function either as a conventional memory chip or as a SIMD (single-instruction stream, multiple-data stream) computer. When used as a memory, computational RAM is competitive with conventional DRAM in terms of access time, packaging and cost. Adding logic to memory is not a simple question of bolting together two existing designs. The paper considers how computational RAM integrates processing power with memory by using an architecture that preserves and exploits the features of memory.

235 citations


Proceedings ArticleDOI
01 Jan 1999
TL;DR: The Capability Calculus as discussed by the authors is a compiler intermediate language that supports region-based memory management, enjoys a provably safe type system, and is straightforward to compile to a typed assembly language.
Abstract: An increasing number of systems rely on programming language technology to ensure safety and security of low-level code. Unfortunately, these systems typically rely on a complex, trusted garbage collector. Region-based type systems provide an alternative to garbage collection by making memory management explicit but verifiably safe. However, it has not been clear how to use regions in low-level, type-safe code. We present a compiler intermediate language, called the Capability Calculus, that supports region-based memory management, enjoys a provably safe type system, and is straightforward to compile to a typed assembly language. Source languages may be compiled to our language using known region inference algorithms. Furthermore, region lifetimes need not be lexically scoped in our language, yet the language may be checked for safety without complex analyses. Finally, our soundness proof is relatively simple, employing only standard techniques. The central novelty is the use of static capabilities to specify the permissibility of various operations, such as memory access and deallocation. In order to ensure capabilities are relinquished properly, the type system tracks aliasing information using a form of bounded quantification.
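
At the runtime level, region-based memory management is arena allocation: every value lives in a region, and freeing the region reclaims all of its values at once. A minimal Python sketch of that discipline follows; the Capability Calculus enforces the "freed regions are never touched" rule statically through capabilities, which this sketch can only approximate with a dynamic check.

```python
# Runtime sketch of region-based memory management: objects are
# allocated into a region and freed all at once when the region
# is deallocated. The type system in the paper proves statically
# that no freed region is accessed; here the check is dynamic.

class Region:
    def __init__(self):
        self.objects = []
        self.live = True          # stands in for the region's capability

    def alloc(self, value):
        if not self.live:
            raise RuntimeError("no capability: region was freed")
        self.objects.append(value)
        return len(self.objects) - 1   # handle into the region

    def read(self, handle):
        if not self.live:
            raise RuntimeError("no capability: region was freed")
        return self.objects[handle]

    def free(self):
        self.objects.clear()      # bulk deallocation of the whole region
        self.live = False         # capability is relinquished

r = Region()
h = r.alloc("node")
assert r.read(h) == "node"
r.free()
# r.read(h) would now raise; the calculus rejects it at compile time.
```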

234 citations


Journal ArticleDOI
TL;DR: This paper uses the non-update-in-place scheme to implement a flash memory server and proposes a new cleaning policy that uses a fine-grained method to effectively cluster hot data and cold data in order to reduce cleaning overhead.
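
A toy model of why hot/cold clustering lowers cleaning overhead: with non-update-in-place flash, a cleaner must copy the still-valid pages out of a victim segment before erasing it, so segments full of hot (frequently rewritten, hence mostly invalidated) data are cheap to clean. Sizes and layout below are invented for illustration.

```python
# Toy model of flash cleaning cost. Overwrites invalidate old page
# copies; the cleaner copies surviving valid pages out of a victim
# segment before erasing it. Clustering hot data keeps hot segments
# mostly invalid, so cleaning them copies little.

SEGMENT_PAGES = 8

class Segment:
    def __init__(self, valid):
        self.valid = valid                    # validity bit per page

    def utilization(self):
        return sum(self.valid) / SEGMENT_PAGES

def cleaning_cost(segment):
    # pages that must be copied elsewhere before the erase
    return sum(segment.valid)

# Hot segment: most pages already invalidated by rewrites.
hot = Segment([True, False, False, False] * 2)
# Cold segment: rarely updated, nearly all pages still valid.
cold = Segment([True, True, True, False] * 2)

# A greedy cleaner picks the least-utilized segment, which is
# exactly why separating hot and cold data pays off.
victim = min([hot, cold], key=Segment.utilization)
assert victim is hot
print("copy cost:", cleaning_cost(victim), "of", SEGMENT_PAGES, "pages")
```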

229 citations


Patent
26 Apr 1999
TL;DR: The Compression Enhanced Flash Memory Controller (CEFMC) as discussed by the authors uses parallel lossless compression and decompression engines embedded into the flash memory controller unit for improved memory density and data bandwidth.
Abstract: A flash memory controller and/or embedded memory controller including MemoryF/X Technology that uses data compression and decompression for improved system cost and performance. The Compression Enhanced Flash Memory Controller (CEFMC) of the present invention preferably uses parallel lossless compression and decompression engines embedded into the flash memory controller unit for improved memory density and data bandwidth. In addition, the invention includes a Compression Enhanced Memory Controller (CEMC) where the parallel compression and decompression engines are introduced into the memory controller of the microprocessor unit. The Compression Enhanced Memory Controller (CEMC) invention improves system-wide memory density and data bandwidth. The disclosure also indicates preferred methods for specific applications such as usage of the invention for solid-state disks, embedded memory and Systems on Chip (SOC) environments. The disclosure also indicates a novel memory control method for the execute in place (XIP) architectural model. The integrated parallel data compression and decompression capabilities of the CEFMC and CEMC inventions remove system bottlenecks and increase performance by matching the data access speeds of the memory subsystem to that of the microprocessor. Thus, the invention allows lower cost systems due to smaller data storage, reduced bandwidth requirements, reduced power and noise.

225 citations


Patent
09 Aug 1999
TL;DR: A hardware assisted memory module (HAMM) is coupled to a conventional computer system as discussed by the authors; the HAMM can be configured to copy all or part of the digital information to nonvolatile memory.
Abstract: A hardware assisted memory module (HAMM) is coupled to a conventional computer system. During normal operation of the computer system, the HAMM behaves like a conventional memory module. The HAMM, however, detects and responds to at least one of the following trigger events: 1) power failure, 2) operating system hang-up, or 3) unexpected system reset. Upon detection of a trigger event, the HAMM electronically isolates itself from the host computer system before copying digital information from volatile memory to nonvolatile memory. Once isolated, the HAMM takes its power from an auxiliary power supply. The HAMM can be configured to copy all or part of the digital information to nonvolatile memory. Upon request or at power-up, the HAMM copies the digital information from the nonvolatile memory into the volatile memory. If there is a normal computer shutdown, the operating system will first warn the HAMM before shutting down, thus precluding it from performing a backup operation. The operating system determines whether the last shutdown was unexpected by reading a register stored in a reserved area of memory. If the operating system wants the digital information restored, it orders the HAMM to restore the backed-up digital information from nonvolatile memory to volatile memory.

224 citations


Patent
28 Oct 1999
TL;DR: In this article, a stream of instructions is executed on a computer and a series of memory loads are issued from a computer CPU to a bus, some directed to well-behaved memory and some to non-well-behaving devices in I/O space.
Abstract: A method and a computer for execution of the method. As part of executing a stream of instructions, a series of memory loads is issued from a computer CPU to a bus, some directed to well-behaved memory and some directed to non-well-behaved devices in I/O space. The addresses of the instructions that issued memory loads to non-well-behaved memory are recorded, in a form that allows determining whether a memory load was to well-behaved or non-well-behaved memory without resolving any memory address stored in the recording.

205 citations


Journal ArticleDOI
01 Apr 1999
TL;DR: This paper addresses the key issues which one inevitably encounters when trying to achieve giga-to-tera-bit memory integration, from room-temperature operation to CMOS/single-electron-memory hybrid integration and its positioning among various memory architectures.
Abstract: Starting with a brief review on the single-electron memory and its significance among various single-electron devices, this paper addresses the key issues which one inevitably encounters when one tries to achieve giga-to-tera bit memory integration. Among the issues discussed are: room-temperature operation; memory-cell architecture; sensing scheme; cell-design guideline; use of nanocrystalline silicon versus lithography; array architecture; device-to-device variations; read/write error rate; and CMOS/single-electron-memory hybrid integration and its positioning among various memory architectures.

174 citations


Proceedings Article
06 Jun 1999
TL;DR: This study shows that technology trends favor compressed virtual memory--it is attractive now, offering reduction of paging costs of several tens of percent, and it will be increasingly attractive as CPU speeds increase faster than disk speeds.
Abstract: Compressed caching uses part of the available RAM to hold pages in compressed form, effectively adding a new level to the virtual memory hierarchy. This level attempts to bridge the huge performance gap between normal (uncompressed) RAM and disk. Unfortunately, previous studies did not show a consistent benefit from the use of compressed virtual memory. In this study, we show that technology trends favor compressed virtual memory--it is attractive now, offering reduction of paging costs of several tens of percent, and it will be increasingly attractive as CPU speeds increase faster than disk speeds. Two of the elements of our approach are innovative. First, we introduce novel compression algorithms suited to compressing in-memory data representations. These algorithms are competitive with more mature Ziv-Lempel compressors, and complement them. Second, we adaptively determine how much memory (if at all) should be compressed by keeping track of recent program behavior. This solves the problem of different programs, or phases within the same program, performing best for different amounts of compressed memory.
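
The mechanism is easy to sketch: pages evicted from uncompressed RAM are compressed into a reserved slice of RAM, and a hit there avoids a disk read. The sketch below uses zlib as a stand-in; the paper's own contribution includes specialized compressors for in-memory data, which are not reproduced here.

```python
# Minimal sketch of a compressed page cache: evicted pages are
# compressed and kept in a reserved slice of RAM; a hit there is
# far cheaper than a disk read. zlib stands in for the paper's
# specialized in-memory-data compressors.
import zlib

compressed_cache = {}          # page number -> compressed bytes

def evict(page_no: int, data: bytes):
    compressed_cache[page_no] = zlib.compress(data)

def fault(page_no: int) -> bytes:
    blob = compressed_cache.pop(page_no, None)
    if blob is not None:
        return zlib.decompress(blob)   # compression-cache hit
    return read_from_disk(page_no)     # miss: pay full disk latency

def read_from_disk(page_no: int) -> bytes:
    return bytes(4096)                 # stub for the slow path

evict(7, b"\x00" * 4096)               # in-memory data compresses well
assert fault(7) == b"\x00" * 4096
```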

167 citations


Journal ArticleDOI
TL;DR: An incomplete Cholesky factorization for the solution of large-scale trust region subproblems and positive definite systems of linear equations depends on a parameter p that specifies the amount of additional memory that is available; there is no need to specify a drop tolerance.
Abstract: We propose an incomplete Cholesky factorization for the solution of large-scale trust region subproblems and positive definite systems of linear equations. This factorization depends on a parameter p that specifies the amount of additional memory (in multiples of n, the dimension of the problem) that is available; there is no need to specify a drop tolerance. Our numerical results show that the number of conjugate gradient iterations and the computing time are reduced dramatically for small values of p. We also show that in contrast with drop tolerance strategies, the new approach is more stable in terms of number of iterations and memory requirements.
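
The zero-fill end of this idea (p = 0 extra memory) is compact enough to state in code: run the Cholesky recurrence but keep only entries where tril(A) itself is nonzero. The dense-storage numpy sketch below illustrates just that case; the paper's factorization additionally allows p extra entries per column and uses sparse storage.

```python
import numpy as np

def ichol0(A):
    """Zero-fill incomplete Cholesky: the Cholesky recurrence, but
    entries outside the sparsity pattern of tril(A) are never
    stored or updated."""
    n = A.shape[0]
    L = np.tril(A).astype(float)
    pattern = L != 0
    for k in range(n):
        L[k, k] = np.sqrt(L[k, k])
        for i in range(k + 1, n):
            if pattern[i, k]:
                L[i, k] /= L[k, k]
        for j in range(k + 1, n):
            for i in range(j, n):
                if pattern[i, j]:
                    L[i, j] -= L[i, k] * L[j, k]
    return L

# On a tridiagonal SPD matrix no fill-in is dropped, so the
# incomplete factor is exact: L @ L.T reproduces A.
n = 6
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L = ichol0(A)
assert np.allclose(L @ L.T, A)
# In preconditioned CG, each iteration applies the preconditioner
# by solving L (L.T z) = r with a forward and a backward solve.
```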

166 citations


Patent
25 Feb 1999
TL;DR: In this paper, a memory request detector generates memory request indication data, such as data representing whether memory requests have been received within a predetermined time, based on detection of graphics and/or video memory requests during an active mode of the display system operation.
Abstract: An apparatus and method dynamically control graphics and/or video memory power during idle periods of the memory interface in active system modes. In one embodiment, a memory request detector generates memory request indication data, such as data representing whether memory requests have been received within a predetermined time, based on detection of graphics and/or video memory requests during an active mode of the display system operation. A dynamic activity based memory power controller analyzes the memory request indication data and controls the power consumption of the graphics and/or video memory based on whether memory requests are detected.

Patent
Jay Wang1
20 Apr 1999
TL;DR: In this article, a method and system for storing data in data blocks of predetermined size in an electronic memory (e.g., FLASH memory), particularly data such as an updatable record of database transactions, is presented.
Abstract: A method and system for storing data in data blocks of predetermined size in an electronic memory (e.g., FLASH memory), particularly data such as an updatable record of database transactions. The FLASH operates logically as two stacks where data is pushed into either end of the memory in alternating cycles. Between each push or write cycle, a garbage collection cycle is performed whereby only the most recent transaction performed on any particular record is preserved at one end of the stack, while the rest of the stack is made available for new data. When the database being monitored is written to permanent memory, the entire FLASH is again made available for new data. If the database is periodically backed up to permanent memory, it can be restored to RAM by reading the copy from the permanent memory and modifying it according to the record of database transactions in the electronic memory.
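
A sketch of the bookkeeping described here: records are appended as transactions, and each garbage-collection cycle keeps only the newest transaction per record while switching the active end of the memory. The class and names below are invented for illustration.

```python
# Sketch of the two-stack transaction log: records are pushed into
# the flash block from alternating ends; each collection pass keeps
# only the newest transaction per record, freeing the rest.

class TwoStackLog:
    def __init__(self):
        self.entries = []          # (record_key, transaction) pairs
        self.end = 0               # which end receives new pushes

    def push(self, key, txn):
        self.entries.append((key, txn))

    def collect(self):
        """Keep only the most recent transaction per record and
        switch the active end, making the rest available."""
        latest = {}
        for key, txn in self.entries:   # later entries win
            latest[key] = txn
        self.entries = list(latest.items())
        self.end = 1 - self.end

log = TwoStackLog()
log.push("acct-17", "balance=100")
log.push("acct-42", "balance=55")
log.push("acct-17", "balance=80")      # supersedes the first entry
log.collect()
assert dict(log.entries)["acct-17"] == "balance=80"
```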

Proceedings ArticleDOI
01 May 1999
TL;DR: Maps, a compiler managed memory system for Raw architectures, is implemented based on the SUIF infrastructure and it is demonstrated that the exclusive use of static promotion yields roughly 20-fold speedup on 32 tiles for regular applications and about 5-foldspeedup on 16 or more tiles for irregular applications.
Abstract: This paper describes Maps, a compiler managed memory system for Raw architectures. Traditional processors for sequential programs maintain the abstraction of a unified memory by using a single centralized memory system. This implementation leads to the infamous "Von Neumann bottleneck," with machine performance limited by the large memory latency and limited memory bandwidth. A Raw architecture addresses this problem by taking advantage of the rapidly increasing transistor budget to move much of its memory on chip. To remove the bottleneck and complexity associated with centralized memory, Raw distributes the memory with its processing elements. Unified memory semantics are implemented jointly by the hardware and the compiler. The hardware provides a clean compiler interface to its two inter-tile interconnects: a fast, statically schedulable network and a traditional dynamic network. Maps then uses these communication mechanisms to orchestrate the memory accesses for low latency and parallelism while enforcing proper dependence. It optimizes for speed in two ways: by finding accesses that can be scheduled on the static interconnect through static promotion, and by minimizing dependence sequentialization for the remaining accesses. Static promotion is performed using equivalence class unification and modulo unrolling; memory dependences are enforced through explicit synchronization and software serial ordering. We have implemented Maps based on the SUIF infrastructure. This paper demonstrates that the exclusive use of static promotion yields roughly 20-fold speedup on 32 tiles for our regular applications and about 5-fold speedup on 16 or more tiles for our irregular applications. The paper also shows that selective use of dynamic accesses can be a useful complement to the mostly static memory system.
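
Static promotion by modulo unrolling is easy to illustrate: if arrays are low-order interleaved across tiles, unrolling a loop by the number of banks makes each unrolled access hit one compile-time-known bank. The bank function below is the generic interleaving, assumed for illustration rather than taken from Raw's actual layout.

```python
# Sketch of modulo unrolling: with arrays low-order interleaved
# across N_BANKS tiles (element i lives on bank i % N_BANKS),
# unrolling a loop over i by N_BANKS makes each unrolled access
# hit one fixed, statically known bank, so it can be scheduled on
# the static network. The bank function is illustrative.
N_BANKS = 4

def bank(i):
    return i % N_BANKS

# Original loop:  for i in range(n): a[i] += 1
# After unrolling by N_BANKS, iteration i = 4t + u always touches
# bank u, a static fact the compiler can exploit:
for u in range(N_BANKS):
    # every access in this unrolled copy goes to bank u
    assert all(bank(i) == u for i in range(u, 64, N_BANKS))
```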

Patent
30 Apr 1999
TL;DR: In this article, a failing memory module may be replaced by copying its contents to a new memory module in a background operation while the computer system runs its operating system and applications programs.
Abstract: A computer system adapted for hot-pluggable components such as memory modules that may be replaced, upgraded and/or added without disturbing normal operation of the computer system. A failing memory module may be replaced by copying its contents to a new memory module in a background operation while the computer system runs its operating system and applications programs. When all contents are copied to the new memory module, the failing memory module may be removed without having to shut down the computer system. Computer system memory may be upgraded or added to by inserting the new memory module(s) into vacant disconnected memory connectors, whereupon the computer system automatically recognizes the new memory module(s), synchronously connects the new memory module(s) to the computer system memory bus, initializes the new memory module(s), and then notifies the operating system that the new memory module(s) is available, all without disturbing normal operation of the computer system.

Proceedings ArticleDOI
19 Oct 1999
TL;DR: A new way of managing flash memory space for flash memory-specific file systems based on a log-structured file system that has a maximum of 35% reduction in the cleaning cost with evenly-spread writes across segments.
Abstract: Proposes a new way of managing flash memory space for flash memory-specific file systems based on a log-structured file system. Flash memory has attractive features such as non-volatility and fast I/O speed, but it also suffers from an inability to update in place, and limited usage cycles. These drawbacks require many changes to conventional storage (file) management techniques. Our focus is on lowering the cleaning cost and evenly utilizing flash memory cells while maintaining a balance between these two often-conflicting goals. The cleaning efficiency is enhanced by dynamically separating cold data and non-cold data. The second goal, cycle leveling, is achieved to the degree where the maximum difference between erase cycles is below the error range of the hardware. Simulation results show that the proposed method has a significant benefit over naive methods: a maximum of 35% reduction in the cleaning cost with evenly-spread writes across segments.
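
The two often-conflicting goals can be seen in victim selection: greedy cleaning picks the segment with the least valid data, while cycle leveling overrides that choice once erase counts drift too far apart. A sketch with invented numbers and thresholds:

```python
# Sketch of victim selection balancing cleaning cost against wear.
# Prefer segments with little valid data (cheap to clean), but when
# erase counts spread beyond a threshold, recycle a seldom-erased
# segment instead to level wear. All numbers are invented.

LEVELING_THRESHOLD = 100   # allowed spread in erase cycles

def pick_victim(segments):
    """segments: list of dicts with 'valid_pages' and 'erases'."""
    erases = [s["erases"] for s in segments]
    if max(erases) - min(erases) > LEVELING_THRESHOLD:
        # cycle leveling takes priority: recycle the seldom-erased
        # (cold) segment even though it costs more copying
        return min(segments, key=lambda s: s["erases"])
    # otherwise greedy: pick the cheapest segment to clean
    return min(segments, key=lambda s: s["valid_pages"])

segs = [{"valid_pages": 2, "erases": 500},
        {"valid_pages": 30, "erases": 350}]
assert pick_victim(segs)["erases"] == 350   # spread 150 > threshold
```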

Patent
22 Sep 1999
TL;DR: In this article, an instruction of one of the execution entities is retrieved and an associated identifier is decoded and information associated with the instruction is stored in a cache section based on the identifier.
Abstract: A system includes multiple program execution entities (e.g., tasks, processes, threads, and the like) and a cache memory having multiple sections. An identifier is assigned to each execution entity. An instruction of one of the execution entities is retrieved and an associated identifier is decoded. Information associated with the instruction is stored in one of the cache sections based on the identifier.
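
A sketch of the placement rule: the identifier decoded from the instruction selects a cache section, and the address then indexes within that section, so entities cannot evict each other's lines. The geometry below is invented for illustration.

```python
# Sketch of identifier-based cache partitioning: each execution
# entity (task, process, thread) carries a small ID, and a line is
# placed in the section reserved for that ID. Geometry invented.

NUM_SECTIONS = 4
SETS_PER_SECTION = 64
LINE_BYTES = 64

def cache_set(entity_id: int, address: int) -> int:
    section = entity_id % NUM_SECTIONS            # decoded from the instruction
    index = (address // LINE_BYTES) % SETS_PER_SECTION
    return section * SETS_PER_SECTION + index

# Two entities touching the same address land in disjoint sections,
# so neither can evict the other's working set:
assert cache_set(0, 0x1000) != cache_set(1, 0x1000)
```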

Book ChapterDOI
Wim De Pauw1, Gary Sevitsky1
14 Jun 1999
TL;DR: This paper shows a new methodology for finding the causes of memory leaks in Java, and proposes a novel combination of visual syntax and reference pattern extraction to manage this additional complexity.
Abstract: Many Java programmers believe they do not have to worry about memory management because of automatic garbage collection. In fact, many Java programs run out of memory unexpectedly after performing a number of operations. A memory leak in Java is caused when an object that is no longer needed cannot be reclaimed because another object is still referring to it. Memory leaks can be difficult to solve, since the complexity of most programs prevents us from manually verifying the validity of every reference. In this paper we show a new methodology for finding the causes of memory leaks. We have identified a basic memory leak scenario which fits many important cases. In this scenario, we allow the programmer to identify a period of time in which temporary objects are expected to be created and released. Using this information we are able to identify objects that persist beyond this period and the references which are holding on to them. Scaling this methodology to real-world systems brings additional challenges. We propose a novel combination of visual syntax and reference pattern extraction to manage this additional complexity. We also describe how these techniques can be applied to a wider class of memory problems, including the exploration of large data structures. These techniques have been implemented and have been proven successful on large projects.
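
The basic scenario transposes readily to other garbage-collected runtimes. The sketch below uses Python's gc module as a stand-in for the paper's Java heap analysis: snapshot live objects, run the supposedly temporary operations, and report survivors created inside the window.

```python
# Sketch of the leak-window methodology, transposed to Python's gc
# module (the paper's tool works on Java heaps): objects created
# during a period that should be temporary, yet survive it, are
# leak candidates.
import gc

def find_survivors(operation):
    """Snapshot tracked objects, run the operation, then report
    objects created during it that are still alive afterwards."""
    before = {id(o) for o in gc.get_objects()}
    operation()
    gc.collect()
    return [o for o in gc.get_objects() if id(o) not in before]

leaked_registry = []

def op():
    temp = [[0] * 256 for _ in range(10)]   # temporary: dies with the frame
    leaked_registry.append([1] * 256)       # survives the period: a "leak"

survivors = find_survivors(op)
assert any(isinstance(o, list) and o == [1] * 256 for o in survivors)
# The follow-up question, which reference is holding on to it, is
# gc.get_referrers() here and reference pattern extraction in the
# paper's tool (here the culprit is leaked_registry).
```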

Journal Article
TL;DR: This paper gives an overview of self-tuning methods for a spectrum of memory management issues, ranging from traditional caching to exploiting distributed memory in a server cluster and speculative prefetching in a Web-based system.
Abstract: Although today’s computers provide huge amounts of main memory, the ever-increasing load of large data servers, imposed by resource-intensive decision-support queries and accesses to multimedia and other complex data, often leads to memory contention and may result in severe performance degradation. Therefore, careful tuning of memory management is crucial for heavy-load data servers. This paper gives an overview of self-tuning methods for a spectrum of memory management issues, ranging from traditional caching to exploiting distributed memory in a server cluster and speculative prefetching in a Web-based system. The common, fundamental elements in these methods include on-line load tracking, near-future access prediction based on stochastic models and the available on-line statistics, and dynamic and automatic adjustment of control parameters in a feedback loop.

Journal ArticleDOI
TL;DR: This paper examines the theoretical upper bounds on the cache hit ratio that cache bypassing can provide for integer applications, including several Windows applications with OS activity, and proposes a microarchitecture scheme where the hardware determines data placement within the cache hierarchy based on dynamic referencing behavior.
Abstract: The growing disparity between processor and memory performance has made cache misses increasingly expensive. Additionally, data and instruction caches are not always used efficiently, resulting in large numbers of cache misses. Therefore, the importance of cache performance improvements at each level of the memory hierarchy will continue to grow. In numeric programs, there are several known compiler techniques for optimizing data cache performance. However, integer (nonnumeric) programs often have irregular access patterns that are more difficult for the compiler to optimize. In the past, cache management techniques such as cache bypassing were implemented manually at the machine-language-programming level. As the available chip area grows, it makes sense to spend more resources to allow intelligent control over the cache management. In this paper, we present an approach to improving cache effectiveness, taking advantage of the growing chip area, utilizing run-time adaptive cache management techniques, optimizing both performance and cost of implementation. Specifically, we are aiming to increase data cache effectiveness for integer programs. We propose a microarchitecture scheme where the hardware determines data placement within the cache hierarchy based on dynamic referencing behavior. This scheme is fully compatible with existing instruction set architectures. This paper examines the theoretical upper bounds on the cache hit ratio that cache bypassing can provide for integer applications, including several Windows applications with OS activity. Then, detailed trace-driven simulations of the integer applications are used to show that the implementation described in this paper can achieve performance close to that of the upper bound.
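
One common shape for such run-time management is a table of saturating confidence counters per load instruction: loads whose data shows no reuse get fetched around the cache. The sketch below is a generic illustration of cache bypassing, not the paper's specific microarchitecture.

```python
# Generic sketch of dynamic cache bypassing: per-load saturating
# counters track whether a load's data tends to be reused; loads
# with no observed reuse bypass allocation so they stop evicting
# useful lines. Thresholds and widths are invented.

reuse_counter = {}        # load PC -> saturating counter in 0..3

def on_access(pc, line, cache):
    if reuse_counter.get(pc, 2) == 0:
        return "bypass"                # fetch around the cache
    cache.add(line)
    return "allocate"

def on_evict_without_reuse(pc):
    """Line died untouched: lower confidence in this load."""
    reuse_counter[pc] = max(0, reuse_counter.get(pc, 2) - 1)

def on_hit(pc):
    """Line was reused: raise confidence."""
    reuse_counter[pc] = min(3, reuse_counter.get(pc, 2) + 1)

cache = set()
assert on_access(0x40, 0xA0, cache) == "allocate"
on_evict_without_reuse(0x40)
on_evict_without_reuse(0x40)           # confidence saturates at 0
assert on_access(0x40, 0xA0, cache) == "bypass"
```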

Patent
31 Aug 1999
TL;DR: In this article, a system and method for memory management in a smart card is described, where memory allocation for new data objects and memory deallocation as the result of data object deletion are made by reference to a memory management record, preferably a bitmap, which is stored in RAM and formed upon smart card initialization.
Abstract: A system and method for memory management in a smart card are disclosed. The memory manager, preferably part of a true operating system, is the single device by which memory in the smart card is allocated and deallocated. Memory allocation for new data objects and memory deallocation as the result of data object deletion are made by reference to a memory management record, preferably a bitmap, which is stored in RAM and formed upon smart card initialization.
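
A bitmap allocator of this kind is small enough to sketch in full: one bit per fixed-size cell, first-fit search for a free run, and deletion simply clears bits. Cell counts below are invented.

```python
# Sketch of bitmap-based allocation as in the smart-card patent:
# one bit per fixed-size memory cell, first-fit search for a run
# of free cells; deletion just clears the bits. Sizes invented.

NUM_CELLS = 64
bitmap = [0] * NUM_CELLS        # 0 = free, 1 = allocated

def alloc(cells: int):
    run = 0
    for i in range(NUM_CELLS):
        run = run + 1 if bitmap[i] == 0 else 0
        if run == cells:
            start = i - cells + 1
            bitmap[start:i + 1] = [1] * cells
            return start
    return None                  # out of memory

def free(start: int, cells: int):
    bitmap[start:start + cells] = [0] * cells

a = alloc(4)
b = alloc(8)
free(a, 4)
assert alloc(3) == a             # first-fit reuses the freed hole
```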

Journal ArticleDOI
TL;DR: This article examines the problem of an increasing Processor-Memory Performance Gap, which now is the primary obstacle to improved computer system performance.
Abstract: The rate of improvement in microprocessor speed exceeds the rate of improvement in DRAM (Dynamic Random Access Memory) speed. So although the disparity between processor and memory speed is already an issue, downstream someplace it will be a much bigger one. Hence computer designers are faced with an increasing Processor-Memory Performance Gap [1], which now is the primary obstacle to improved computer system performance. This article examines this problem as well as its various solutions.

Patent
19 Nov 1999
TL;DR: In this article, a highly intelligent programmable multi-tasking memory management system manages memory requests associated with a system on chip (SOC) device, which includes a routing controller or central processing unit (RCPU) that is used for routing/switching stream data between communication cores and digital signal processors.
Abstract: A highly intelligent programmable multi-tasking memory management system manages memory requests associated with a system on chip (SOC) device. The memory management system includes a routing controller or central processing unit (RCPU) that is used for routing/switching stream data between communication cores and digital signal processors with minimum reliance and demand on a main or virtual central processing unit (VCPU) residing on a system bus. Tasks are partitioned between the VCPU and the RCPU within the SOC architecture for communication applications. The VCPU performs system/application tasks while the RCPU simultaneously performs multiple memory routing/switching tasks and multiple concurrent memory access connections. The memory management system also enables other processors and communication cores to update their internal data once new data is written in the memory system. In addition, a method and system are provided for performing predictive protocol fetch for multiple DSPs on the SOC to increase data processing throughput.

Patent
26 Apr 1999
TL;DR: In this article, a B-tree structure is used to map physical memory locations to logical addresses, where each key in the tree structure contains the physical address corresponding to the logical address identifying the key and also contains the size of the data block at that address.
Abstract: A memory management system for random access memories employs a novel B-tree structure to map physical memory locations to logical addresses. In the preferred arrangement each key in the tree structure contains the physical address corresponding to the logical address identifying the key and also contains the size of the data block at that address. The invention also provides a novel arrangement for updating B-trees in response to changes in the keys. The tree buckets containing modified keys are recorded in storage locations other than the locations containing the keys prior to modification. Thus, until the modification of the tree is complete, the system contains a record of the entire tree structure prior to the beginning of the modification.
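
A sketch of the two ideas, using a sorted key list as a stand-in for the B-tree: each key maps a logical address to a (physical address, size) pair, and updates are shadowed by mutating a copy so the pre-modification tree survives until the commit point. Structure and names are illustrative.

```python
# Stand-in for the patent's B-tree: sorted logical addresses map
# to (physical address, block size) pairs, and modified "buckets"
# are written to fresh storage so the old tree stays recoverable
# until the update commits. Layout is illustrative.
import bisect
import copy

class AddressMap:
    def __init__(self):
        self.keys = []          # sorted logical addresses
        self.vals = []          # parallel (physical_addr, size) entries

    def insert(self, logical, physical, size):
        i = bisect.bisect_left(self.keys, logical)
        self.keys.insert(i, logical)
        self.vals.insert(i, (physical, size))

    def lookup(self, logical):
        i = bisect.bisect_left(self.keys, logical)
        if i < len(self.keys) and self.keys[i] == logical:
            return self.vals[i]
        return None

def shadow_update(tree, mutate):
    """Mutate a copy rather than the original, so the entire
    pre-modification structure survives an interrupted update."""
    shadow = copy.deepcopy(tree)
    mutate(shadow)
    return shadow               # commit point: caller adopts the new tree

m = AddressMap()
m.insert(0x1000, 0x80000, 512)
m2 = shadow_update(m, lambda t: t.insert(0x2000, 0x80200, 256))
assert m.lookup(0x2000) is None             # old tree intact
assert m2.lookup(0x2000) == (0x80200, 256)  # new tree committed
```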

Proceedings ArticleDOI
01 Oct 1999
TL;DR: This research explores the potential for an on-chip cache compression which can reduce not only cache miss ratio but also miss penalty, if main memory is also managed in compressed form, and suggests several techniques to reduce the decompression overhead and to manage the compressed blocks efficiently.
Abstract: This research explores the potential for an on-chip cache compression which can reduce not only the cache miss ratio but also the miss penalty, if main memory is also managed in compressed form. However, the decompression time causes a critical effect on the memory access time, and variable-sized compressed blocks tend to increase the design complexity of the compressed cache architecture. This paper suggests several techniques to reduce the decompression overhead and to manage the compressed blocks efficiently, including selective compression, fixed space allocation for the compressed blocks, parallel decompression, the use of a decompression buffer, and so on. Moreover, a simple compressed cache architecture based on the above techniques and its management method are proposed. The results from trace-driven simulation show that this approach can provide around a 35% decrease in the on-chip cache miss ratio as well as a 53% decrease in the data traffic over conventional memory systems. Also, a large amount of the decompression overhead can be reduced, and thus the average memory access time can be reduced by up to 20% against conventional memory systems.
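
Two of the listed techniques are easy to make concrete: selective compression stores a block compressed only if it shrinks below half a slot, so two compressed blocks share one fixed-size slot and incompressible blocks skip decompression entirely. The sketch below uses zlib and invented sizes; the paper's hardware compressor differs.

```python
# Sketch of selective compression with fixed space allocation:
# compress a block only if the result fits half a slot (two such
# blocks then share one fixed-size slot, sidestepping variable-
# sized block management); otherwise store it raw, which also
# avoids decompression latency on access. Sizes are illustrative.
import os
import zlib

BLOCK = 64                 # uncompressed block size in bytes

def store(block: bytes):
    comp = zlib.compress(block)
    if len(comp) <= BLOCK // 2:
        return ("compressed", comp)
    return ("raw", block)       # incompressible: no decompression cost

def load(entry) -> bytes:
    tag, payload = entry
    return zlib.decompress(payload) if tag == "compressed" else payload

zeros = bytes(BLOCK)
noise = os.urandom(BLOCK)
assert store(zeros)[0] == "compressed" and load(store(zeros)) == zeros
assert store(noise)[0] == "raw"   # random bytes essentially never halve
```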

Journal ArticleDOI
Yutai Ma1
TL;DR: The memory organization of FFT processors is considered and a new memory addressing assignment allows simultaneous access to all the data needed for butterfly calculations.
Abstract: The memory organization of FFT processors is considered. The new memory addressing assignment allows simultaneous access to all the data needed for butterfly calculations. The advantage of this memory addressing scheme lies in the fact that it reduces the delay of address generation by nearly half compared to existing ones.
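
The property underlying such conflict-free schemes is worth spelling out: the two operands of a radix-2 butterfly sit at addresses differing in exactly one bit, so banking each word by the parity of its address bits always separates them. The check below verifies that standard parity assignment; the paper's contribution is generating these addresses with roughly half the delay.

```python
# The two operands of a radix-2 FFT butterfly at stage s sit at
# addresses a and a + 2**s (bit s of a being 0), which differ in
# exactly one bit. Banking each word by the parity of its address
# bits therefore always puts the pair in different banks, allowing
# simultaneous access. This is the standard parity assignment.

def bank(addr: int) -> int:
    return bin(addr).count("1") % 2     # parity of the address bits

N = 256
for s in range(8):                      # all log2(N) stages
    for a in range(N):
        if not (a >> s) & 1:            # a is the "top" operand
            assert bank(a) != bank(a + (1 << s))
```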

Patent
Anthony Solomon1, Yan Li1
14 Apr 1999
TL;DR: In this paper, the authors present a method and apparatus for analyzing the configuration of a computer main memory, which can be used to program the memory controller and to determine whether a user-selected configuration is consistent with those restrictions.
Abstract: A method and apparatus for analyzing the configuration of a computer main memory. A complex memory controller, which imposes restrictions on the memory's configuration, determines whether a user-selected configuration is consistent with those restrictions. The results of the determination are then reported to the user. The results may also be used to program the memory controller.

01 Jan 1999
TL;DR: The Hot Pages system optimizes the translation code for the hit case and caches translated virtual page descriptions likely to be reused for nearby memory references, reducing the cost of performing translations, and eliminating it entirely in some cases.
Abstract: This paper describes Hot Pages, a software based solution for managing on-chip data on the MIT Raw Machine, a scalable, parallel, microprocessor architecture. This software system transparently manages the mapping between the program address space and available on-chip memory. Hot Pages implements a multi-bank memory structure, allowing multiple references in parallel, to provide memory bandwidth matched to the computational resources on the Raw microprocessor. Because virtualization is handled in software rather than hardware, the system is easier to design and test, and provides flexibility for customized solutions. The challenge for this kind of software based approach is to balance the tradeoffs between the added software overheads against opportunities provided by a memory management scheme specialized for each application. Our technique, called Hot Pages, combines both compile-time and runtime techniques to reduce the software overheads. The Hot Pages system optimizes the translation code for the hit case and caches translated virtual page descriptions likely to be reused for nearby memory references. We use pointer analysis to identify program memory references that can reuse a translated virtual page description. This allows us to specialize the code for each memory reference, reducing the cost of performing translations, and eliminating it entirely in some cases. The framework also provides additional opportunities for optimization because the cost of sophisticated memory management schemes can be relegated to translation misses. We present simulation results for a variety of applications running with Hot Pages on a prototype Raw system.
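
The fast path is the essence: each static memory reference caches its last translation, so a hit costs one compare and a miss falls back to the full software handler. A Python sketch of that structure, with invented page-table and buffer details:

```python
# Sketch of the Hot Pages fast path: each static memory reference
# remembers its last (virtual page -> on-chip buffer) translation,
# so the inlined hit case is a single compare; only a mismatch
# takes the full software translation path. Structures invented.

page_table = {}                 # virtual page -> on-chip buffer
PAGE = 4096

def make_reference():
    hot = {"vpage": None, "buf": None}   # this reference's cached translation
    def load(addr):
        vpage = addr // PAGE
        if vpage != hot["vpage"]:        # translation miss: slow path
            hot["vpage"] = vpage
            hot["buf"] = page_table.setdefault(vpage, [0] * PAGE)
        return hot["buf"][addr % PAGE]   # hit: no table lookup at all
    return load

load_a = make_reference()
for addr in range(100):         # spatial locality: 99 of 100 are hits
    load_a(addr)
```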

Patent
16 Sep 1999
TL;DR: In this article, a system for data access in a packet-switched network, including a sender/computer including an operating unit, a first memory, a permanent storage memory and a processor and a remote receiver/computer, was presented.
Abstract: The invention provides a system for data access in a packet-switched network, including a sender/computer (with an operating unit, a first memory, a permanent storage memory and a processor) and a remote receiver/computer (with an operating unit, a first memory, a permanent storage memory and a processor), the sender/computer and the receiver/computer communicating through the network; the sender/computer further including a device for calculating digital digests on data; the receiver/computer further including a network cache memory and a device for calculating digital digests on data in the network cache memory; and the receiver/computer and/or the sender/computer including a device for comparison between digital digests. The invention also provides a method and apparatus for increased data access in a packet-switched network.
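
The digest exchange is straightforward to sketch: the sender transmits a digest first, and the payload crosses the network only if the receiver's network cache does not already hold matching data. Protocol details below are invented for illustration.

```python
# Sketch of digest-based caching: a short digest is compared before
# any payload is sent; a match in the receiver's network cache means
# the data need not cross the network again. Protocol invented.
import hashlib

def digest(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()   # any strong digest works

receiver_cache = {}                       # digest -> cached data

def send(data: bytes) -> str:
    d = digest(data)
    if d in receiver_cache:
        return "digest only"              # receiver reuses its cached copy
    receiver_cache[d] = data              # first sight: full payload goes out
    return "full payload"

page = b"<html>...</html>"
assert send(page) == "full payload"
assert send(page) == "digest only"        # repeat transfer is avoided
```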

Journal ArticleDOI
TL;DR: This work addresses the problem of system power reduction through transition count minimization on the memory address bus when behavioral arrays are accessed from memory by exploiting regularity and spatial locality in the memory accesses and determining the mapping of behavioral array references to physical memory locations to minimize address bus transitions.
Abstract: Arrays in behavioral specifications that are too large to fit into on-chip registers are usually mapped to off-chip memories during behavioral synthesis. We address the problem of system power reduction through transition count minimization on the memory address bus when these arrays are accessed from memory. We exploit regularity and spatial locality in the memory accesses and determine the mapping of behavioral array references to physical memory locations to minimize address bus transitions. We describe array mapping strategies for two important memory configurations: all behavioral arrays mapped to a single off-chip memory and arrays mapped into multiple memory modules drawn from a library. For the single memory configuration, we describe a heuristic for selecting a memory mapping scheme to achieve low power for each behavioral array. For mapping into a library of multiple memory modules, we formulate the problem as three logical-to-physical memory mapping subtasks and present experiments demonstrating the transition count reductions based on our approach. Our experiments on several image processing benchmarks show power savings of up to 63% through reduced transition activity on the memory address bus in the single memory case. We also observe a further transition count reduction by a factor of 1.5-6.7 over a straightforward mapping scheme in the multiple memories configuration.
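
The quantity being minimized is simply the Hamming distance between consecutive addresses on the bus. The sketch below counts it and shows one classic mapping, Gray-coded addresses, cutting sequential-access transitions to one toggle per access; the paper's mappings are derived from access regularity rather than assumed.

```python
# Switching activity on the address bus is the Hamming distance
# between consecutive addresses. For a sequential scan, a Gray-code
# mapping toggles exactly one address line per access, versus ~2 on
# average for plain binary order. Gray code is one classic example;
# the paper derives mappings from the accesses' regularity.

def transitions(addresses):
    return sum(bin(a ^ b).count("1")
               for a, b in zip(addresses, addresses[1:]))

def gray(i):
    return i ^ (i >> 1)

seq = list(range(256))                      # a[0], a[1], ... in order
binary_cost = transitions(seq)
gray_cost = transitions([gray(i) for i in seq])
assert gray_cost == 255                     # one toggle per access
print(binary_cost, "->", gray_cost)         # 502 -> 255 here
```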

Patent
Paul Crowley1, John M. Jaugilas1, David S. Lampert1, Alex Nash1, Senthil K. Natesan1 
24 Mar 1999
TL;DR: In this article, a method and system for managing memory resources in a system used in conjunction with a navigation application program that accesses geographic data is presented, where the data records in each portion of the plurality of data records that forms each parcel are accessed together.
Abstract: A method and system for managing memory resources in a system used in conjunction with a navigation application program that accesses geographic data. The geographic data are comprised of a plurality of data records. The plurality of data records are organized into parcels, each of which contains a portion of the plurality of data records, such that the data records in each portion that forms a parcel are accessed together. One or more buffers, each forming a contiguous portion of the memory of the navigation system, are provided as a cache to store a plurality of parcels. One or more data structures located outside the contiguous portion of memory identify the parcels of data stored in the cache and the locations in the cache at which the parcels are stored. These data structures located outside the contiguous portion of memory in which the parcels are cached are used to manage the parcel cache efficiently and also to defragment it.
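
A sketch of why keeping the index outside the contiguous buffer makes defragmentation cheap: surviving parcels slide toward the front as opaque bytes, and only the external data structure is patched. Record layout below is invented.

```python
# Sketch of the parcel cache: parcels live packed in a contiguous
# buffer while the index that locates them lives outside it, so
# defragmentation is sliding parcels together and patching the
# index. Offsets and sizes are invented for illustration.

cache = bytearray(64)                 # the contiguous parcel buffer
index = {}                            # parcel id -> (offset, length)

def defragment():
    """Slide surviving parcels to the front; only the external
    index changes, the parcels themselves are opaque bytes."""
    pos = 0
    for pid in sorted(index, key=lambda p: index[p][0]):
        off, length = index[pid]
        cache[pos:pos + length] = cache[off:off + length]
        index[pid] = (pos, length)
        pos += length
    return pos                        # start of the reclaimed free space

index = {"A": (0, 8), "C": (24, 8)}   # parcel B at 8..24 was dropped
cache[24:32] = b"parcel-C"
free_from = defragment()
assert index["C"] == (8, 8) and cache[8:16] == b"parcel-C"
assert free_from == 16
```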