Topic

Memory management

About: Memory management is a research topic. Over its lifetime, 16,743 publications have been published within this topic, receiving 312,028 citations. The topic is also known as: memory allocation.


Papers
Journal ArticleDOI
Maurice V. Wilkes
TL;DR: Since 1980, the memory gap has been increasing steadily, and during the last ten years, processors have been improving in speed by 60% per annum, whereas DRAM memory access has been improving at barely 10%.
Abstract: The first main memories to be used on digital computers were constructed using a technology much slower than that used for the logic circuits, and it was taken for granted that there would be a memory gap. Mercury delay line memories spent a lot of their time waiting for the required word to come round and were very slow indeed. CRT (Williams Tube) memories and the core memories that followed them were much better. By the early 1970s semiconductor memories were beginning to appear. This did not result in memory performance catching up fully with processor performance, although in the 1970s it came close. It might have been expected that from that point memories and processors would scale together, but this did not happen. This was because of significant differences between the DRAM semiconductor technology used for memories and the technology used for logic circuits. The memory gap makes itself felt when a cache miss occurs and the missing word must be supplied from main memory. It thus only affects users whose programs do not fit into the L2 cache. As far as a workstation user is concerned, the most noticeable effect of an increased memory gap is to make the observed performance more dependent on the application area than it would otherwise be. Since 1980, the memory gap has been increasing steadily. During the last ten years, processors have been improving in speed by 60% per annum, whereas DRAM memory access has been improving at barely 10%. It may thus be said that, while the memory gap is not at present posing a major problem, the writing is on the wall. On an Alpha 21264 667 MHz workstation (XP1000) in 2000, a cache miss cost about 128 clock cycles. This may be compared with the 8–32 clock cycles in the minicomputers and workstations of 1990 [1]. If the memory latency remains unchanged, the number of cycles of processor idle time doubles with each doubling of processor speed. A factor of four will bring us to about 500 clock cycles.

115 citations
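
Wilkes's projection is plain arithmetic: if DRAM latency stays fixed in nanoseconds, the miss penalty measured in processor cycles grows linearly with clock rate. A minimal Python sketch of that calculation, assuming the roughly 192 ns latency implied by the abstract's 667 MHz / 128-cycle datum (the constant and function names here are illustrative, not from the paper):

```python
# Back-of-the-envelope model of the memory-gap arithmetic in Wilkes's abstract.
# Assumption: DRAM latency is fixed at 128 cycles / 0.667 GHz ~ 192 ns,
# derived from the Alpha 21264 (XP1000) figures quoted above.

DRAM_LATENCY_NS = 128 / 0.667  # ~192 ns

def miss_penalty_cycles(clock_ghz: float) -> float:
    """Cycles the processor idles on a cache miss if DRAM latency is unchanged."""
    return DRAM_LATENCY_NS * clock_ghz

for ghz in (0.667, 1.334, 2.668):  # 1x, 2x, and 4x the XP1000's clock rate
    print(f"{ghz:.3f} GHz -> {miss_penalty_cycles(ghz):.0f} cycles per miss")
# 0.667 GHz -> 128 cycles; 2.668 GHz -> ~512 cycles, i.e. the "about 500" above
```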

Journal ArticleDOI
TL;DR: STARAN® has a number of array modules, each with a multidimensional access (MDA) memory, which can be accessed in either the word direction or the bit-slice direction, making associative processing possible without the need for costly, custom-made logic-in-memory chips.
Abstract: STARAN® has a number of array modules, each with a multidimensional access (MDA) memory. The implementation of this memory with random-access memory (RAM) chips is described. Because data can be accessed in either the word direction or the bit-slice direction, associative processing is possible without the need for costly, custom-made logic-in-memory chips.

114 citations
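
The word-direction versus bit-slice-direction distinction in the abstract is easy to picture in code. The toy Python model below is an illustration only, not STARAN's hardware organization: the MDA memory is represented as a list of words of bits, and the two hypothetical read functions show the two access directions.

```python
# Toy model of an MDA memory's two access directions.
# The memory is a matrix of bits: N_WORDS words of WORD_BITS bits each.

N_WORDS, WORD_BITS = 4, 8
mem = [[0] * WORD_BITS for _ in range(N_WORDS)]

def read_word(w: int) -> list[int]:
    """Word-direction access: all bits of one word (ordinary RAM access)."""
    return mem[w]

def read_bit_slice(b: int) -> list[int]:
    """Bit-slice-direction access: bit b of every word at once -- the
    pattern that lets an associative processor test all words in parallel."""
    return [word[b] for word in mem]

mem[2][5] = 1
assert read_word(2)[5] == 1        # same bit, reached by word...
assert read_bit_slice(5)[2] == 1   # ...or by bit slice
```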

Proceedings ArticleDOI
Govindaraju, Lloyd, Dotsenko, Smith, Manferdelli 
01 Jan 2008

114 citations

Journal ArticleDOI
TL;DR: A user-transparent checkpointing recovery scheme and a new twin-page disk storage management technique are presented for implementing recoverable distributed shared virtual memory.
Abstract: The problem of rollback recovery in distributed shared virtual environments, in which the shared memory is implemented in software in a loosely coupled distributed multicomputer system, is examined. A user-transparent checkpointing recovery scheme and a new twin-page disk storage management technique are presented for implementing recoverable distributed shared virtual memory. The checkpointing scheme can be integrated with the memory coherence protocol for managing the shared virtual memory. The twin-page disk design allows checkpointing to proceed in an incremental fashion without an explicit undo at the time of recovery. The recoverable distributed shared virtual memory allows the system to restart computation from a checkpoint without a global restart.

114 citations
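
The twin-page idea from the abstract can be sketched in a few lines: each logical page keeps two disk slots, writes between checkpoints go to the uncommitted slot, and taking a checkpoint flips a per-page bit, so recovery simply reads the committed slots and needs no undo pass. The Python sketch below is a hypothetical illustration (the class and method names are invented, and the paper's actual design is integrated with the memory coherence protocol):

```python
# Illustrative twin-page store: every logical page owns two physical
# slots; 'committed' records which slot belongs to the last checkpoint.
# Dirty writes go to the other slot, and a checkpoint just flips the bit,
# so recovery never has to undo partial writes.

class TwinPageStore:
    def __init__(self, n_pages: int):
        self.slots = [[b"", b""] for _ in range(n_pages)]  # two copies per page
        self.committed = [0] * n_pages  # index of the checkpointed slot
        self.dirty = set()

    def write(self, page: int, data: bytes) -> None:
        # Write into the shadow slot; the committed copy stays intact.
        self.slots[page][1 - self.committed[page]] = data
        self.dirty.add(page)

    def checkpoint(self) -> None:
        # Incrementally adopt the shadow slots of pages dirtied since
        # the last checkpoint; untouched pages need no work at all.
        for page in self.dirty:
            self.committed[page] = 1 - self.committed[page]
        self.dirty.clear()

    def recover(self, page: int) -> bytes:
        # After a crash, the committed slot is the checkpointed state.
        return self.slots[page][self.committed[page]]
```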

Proceedings ArticleDOI
07 Dec 2013
TL;DR: This work designs and evaluates a locality-aware memory hierarchy for throughput processors, such as GPUs, that retains the advantages of coarse-grained accesses for spatially and temporally local programs while permitting selective fine-grained access to memory.
Abstract: As GPUs' compute capabilities grow, their memory hierarchy increasingly becomes a bottleneck. Current GPU memory hierarchies use coarse-grained memory accesses to exploit spatial locality, maximize peak bandwidth, simplify control, and reduce cache meta-data storage. These coarse-grained memory accesses, however, are a poor match for emerging GPU applications with irregular control flow and memory access patterns. Meanwhile, the massive multi-threading of GPUs and the simplicity of their cache hierarchies make CPU-specific memory system enhancements ineffective for improving the performance of irregular GPU applications. We design and evaluate a locality-aware memory hierarchy for throughput processors, such as GPUs. Our proposed design retains the advantages of coarse-grained accesses for spatially and temporally local programs while permitting selective fine-grained access to memory. By adaptively adjusting the access granularity, memory bandwidth and energy are reduced for data with low spatial/temporal locality without wasting control overheads or prefetching potential for data with high spatial locality. As such, our locality-aware memory hierarchy improves GPU performance, energy-efficiency, and memory throughput for a large range of applications.

114 citations
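
The core mechanism here, adaptively choosing between coarse-grained and fine-grained fetches based on observed spatial locality, can be approximated in a toy model. The Python sketch below is an assumption-laden illustration: the sector sizes, region size, threshold, and bookkeeping scheme are invented for clarity and are not the paper's actual parameters or hardware design.

```python
# Toy model of granularity adaptation: per-region counters estimate what
# fraction of each fetched coarse block is actually used; regions showing
# low spatial locality switch to fine (sector-sized) fetches.

from collections import defaultdict

COARSE = 128       # bytes fetched by a coarse-grained access (illustrative)
FINE = 32          # bytes fetched by a fine-grained access (illustrative)
REGION = 4096      # address range over which locality is tracked
THRESHOLD = 0.5    # fraction of fetched sectors that must be used to stay coarse

sectors_fetched = defaultdict(int)  # sectors brought in, per region
sectors_used = defaultdict(int)     # sectors actually touched, per region

def record_fetch(addr: int, bytes_used: int) -> None:
    """Account for one coarse fetch and how much of it the program touched."""
    region = addr // REGION
    sectors_fetched[region] += COARSE // FINE
    sectors_used[region] += max(1, bytes_used // FINE)

def next_fetch_size(addr: int) -> int:
    """Pick the granularity for the next miss in this region."""
    region = addr // REGION
    fetched = sectors_fetched[region]
    if fetched and sectors_used[region] / fetched < THRESHOLD:
        return FINE    # poor spatial locality: fetch only the needed sector
    return COARSE      # good locality: keep wide fetches for bandwidth

record_fetch(0x1000, bytes_used=32)       # only one sector of the block used
print(next_fetch_size(0x1040))            # -> 32: region demoted to fine grain
```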


Network Information
Related Topics (5)
Cache: 59.1K papers, 976.6K citations, 94% related
Scalability: 50.9K papers, 931.6K citations, 92% related
Server: 79.5K papers, 1.4M citations, 89% related
Virtual machine: 43.9K papers, 718.3K citations, 87% related
Scheduling (computing): 78.6K papers, 1.3M citations, 86% related
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    33
2022    88
2021    629
2020    467
2019    461
2018    591