Topic

Memory management

About: Memory management is a research topic. Over its lifetime, 16,743 publications have been published within this topic, receiving 312,028 citations. The topic is also known as: memory allocation.


Papers
Proceedings ArticleDOI
01 Oct 1998
TL;DR: It is shown that using the register allocator's coloring paradigm to assign spilled values to memory can greatly reduce the amount of memory required by a program, and that speedups from using a compiler-controlled memory (CCM) may be sizable.
Abstract: Optimizations aimed at reducing the impact of memory operations on execution speed have long concentrated on improving cache performance. These efforts achieve a reasonable level of success. The primary limit on the compiler's ability to improve memory behavior is its imperfect knowledge about the run-time behavior of the program. The compiler cannot completely predict runtime access patterns. There is an exception to this rule. During the register allocation phase, the compiler often must insert substantial amounts of spill code; that is, instructions that move values from registers to memory and back again. Because the compiler itself inserts these memory instructions, it has more knowledge about them than about other memory operations in the program. Spill-code operations are disjoint from the memory manipulations required by the semantics of the program being compiled, and, indeed, the two can interfere in the cache. This paper proposes a hardware solution to the problem of increased spill costs---a small compiler-controlled memory (CCM) to hold spilled values. This small random-access memory can (and should) be placed in a distinct address space from the main memory hierarchy. The compiler can target spill instructions to use the CCM, moving most compiler-inserted memory traffic out of the pathway to main memory and eliminating any impact that those spill instructions would have on the state of the main memory hierarchy. Such memories already exist on some DSP microprocessors, and our techniques can be applied directly on those chips. This paper presents two compiler-based methods to exploit such a memory, along with experimental results showing that speedups from using the CCM may be sizable. It shows that using the register allocator's coloring paradigm to assign spilled values to memory can greatly reduce the amount of memory required by a program.
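To make the coloring idea concrete, here is a minimal sketch, with the caveat that the live ranges and the greedy heuristic below are illustrative assumptions, not the paper's actual allocator: spilled values whose live ranges overlap interfere and need distinct CCM slots, so graph coloring lets non-interfering values share a slot and shrinks the CCM space a program needs.

```python
# Sketch: assigning spilled values to compiler-controlled memory (CCM)
# slots with the register allocator's graph-coloring paradigm. The live
# ranges and heuristic are illustrative, not the paper's algorithm.

def build_interference(live_ranges):
    """Two spilled values interfere if their live ranges overlap."""
    names = list(live_ranges)
    edges = {n: set() for n in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (s1, e1), (s2, e2) = live_ranges[a], live_ranges[b]
            if s1 < e2 and s2 < e1:          # ranges overlap
                edges[a].add(b)
                edges[b].add(a)
    return edges

def color_spills(live_ranges):
    """Greedy coloring: each color is one CCM slot; interfering values
    never share a slot, and non-interfering values reuse one."""
    edges = build_interference(live_ranges)
    # Color highest-degree nodes first (a common greedy heuristic).
    order = sorted(edges, key=lambda n: len(edges[n]), reverse=True)
    slot = {}
    for n in order:
        taken = {slot[m] for m in edges[n] if m in slot}
        slot[n] = next(c for c in range(len(edges)) if c not in taken)
    return slot

# Four spilled values, but overlapping lifetimes need only two CCM slots.
spills = {"v1": (0, 10), "v2": (5, 15), "v3": (12, 20), "v4": (18, 25)}
print(color_spills(spills))   # {'v2': 0, 'v3': 1, 'v1': 1, 'v4': 0}
```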

95 citations

Proceedings ArticleDOI
01 Jun 1998
TL;DR: Measurements and analysis show that by using available global resources, cooperative prefetching can obtain significant speedups for I/O-bound programs; for a graphics rendering application, the PGMS system achieves a speedup of 4.9 over a non-prefetching version of the same program, and a 3.1-fold improvement over that program using local-disk prefetching alone.
Abstract: This paper presents cooperative prefetching and caching --- the use of network-wide global resources (memories, CPUs, and disks) to support prefetching and caching in the presence of hints of future demands. Cooperative prefetching and caching effectively unites disk-latency reduction techniques from three lines of research: prefetching algorithms, cluster-wide memory management, and parallel I/O. When used together, these techniques greatly increase the power of prefetching relative to a conventional (non-global-memory) system. We have designed and implemented PGMS, a cooperative prefetching and caching system, under the Digital Unix operating system running on a 1.28 Gb/sec Myrinet-connected cluster of DEC Alpha workstations. Our measurements and analysis show that by using available global resources, cooperative prefetching can obtain significant speedups for I/O-bound programs. For example, for a graphics rendering application, our system achieves a speedup of 4.9 over a non-prefetching version of the same program, and a 3.1-fold improvement over that program using local-disk prefetching alone.
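As a rough illustration of the idea, not PGMS's actual interfaces (the directory, hint format, and cost ordering below are assumptions), a cooperative system serves each block from the cheapest level that holds it: local memory, then idle memory on peer nodes, then disk, and it uses application hints to pull future blocks up the hierarchy before they are demanded.

```python
# Sketch of the block-lookup path in a cooperative prefetching/caching
# system: local cache first, then idle memory on peer nodes, then disk.
# The directory, costs, and hint format are illustrative assumptions.

local_cache = {}                               # block id -> data
peer_memory = {"blk2": b"frame-2"}             # blocks cached on other nodes

def read_disk(block):
    return f"disk:{block}".encode()            # stand-in for a real disk read

def fetch(block):
    """Serve a block from the cheapest level that has it."""
    if block in local_cache:                   # ~microseconds
        return local_cache[block]
    if block in peer_memory:                   # ~one network round trip
        data = peer_memory[block]
    else:                                      # ~milliseconds: disk I/O
        data = read_disk(block)
    local_cache[block] = data                  # cache for later reuse
    return data

def prefetch(hints):
    """Pull hinted future blocks into local memory ahead of demand,
    so the later demand read in fetch() hits the local cache."""
    for block in hints:
        if block not in local_cache:
            local_cache[block] = peer_memory.get(block) or read_disk(block)

prefetch(["blk1", "blk2"])        # application hints its future accesses
print(fetch("blk2"))              # hits local cache, avoiding disk latency
```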

95 citations

Proceedings ArticleDOI
31 Aug 2012
TL;DR: This paper proposes a distributed PIT, named DiPIT, in which each interface holds a part of the table. The design relies on Bloom filters to reduce the memory needed to implement the PIT, complemented by a central Bloom filter that limits the false positives generated by the individual filters.
Abstract: Content-Centric Networking (CCN) is a novel Internet design that reflects the shift of Internet usage from browsing to content dissemination. This new architecture proposal can bring many benefits and has attracted much research. Studying this work, we found that one of the important components, the Pending Interest Table (PIT), has not received much attention. Because networking behaviour in CCN is no longer based on endpoint locations but on each piece of content itself, and because the PIT is involved in both the upstream and downstream forwarding processes, the table size becomes a serious issue under large volumes of requests, requiring a large memory space to implement such a CCN node. In this paper, we propose a distributed PIT, named DiPIT, in which each interface holds a part of the table. Our approach relies on Bloom filters to reduce the memory space needed to implement the PIT, complemented with a central Bloom filter that limits the false positives generated by the individual filters. Our evaluations highlight that the DiPIT approach can significantly reduce the memory space in a CCN node (by up to 63%) and support a higher incoming packet throughput than the hash table technology largely implemented in current routers.
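A minimal sketch of the two-filter idea follows; the filter sizes, hash choice, and AND-style check are illustrative assumptions, not the paper's exact design. Each interface keeps a small Bloom filter of pending names, and a Data packet is forwarded on a face only if both that face's filter and the larger central filter match, which damps each small filter's false positives.

```python
# Minimal sketch of a DiPIT-style distributed PIT: one small Bloom
# filter per interface plus a shared central filter that damps false
# positives. Sizes and hashing are illustrative assumptions.

import hashlib

class BloomFilter:
    def __init__(self, bits=1024, hashes=3):
        self.bits, self.hashes = bits, hashes
        self.array = bytearray(bits // 8)

    def _positions(self, name):
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{name}".encode()).digest()
            yield int.from_bytes(h[:4], "big") % self.bits

    def add(self, name):
        for p in self._positions(name):
            self.array[p // 8] |= 1 << (p % 8)

    def __contains__(self, name):
        return all(self.array[p // 8] & (1 << (p % 8))
                   for p in self._positions(name))

central = BloomFilter(bits=8192)           # larger shared filter
per_face = {0: BloomFilter(), 1: BloomFilter()}

def on_interest(name, face):
    per_face[face].add(name)               # remember who asked
    central.add(name)

def faces_to_forward(name):
    """A face gets the Data only if BOTH its own filter and the central
    filter match, shrinking each small filter's false-positive rate."""
    return [f for f, bf in per_face.items()
            if name in bf and name in central]

on_interest("/video/seg1", face=0)
print(faces_to_forward("/video/seg1"))     # -> [0]
```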

95 citations

Proceedings ArticleDOI
10 Jun 2002
TL;DR: An automatic data migration strategy is described that dynamically places arrays with temporal affinity into the same set of banks, increasing the number of banks that can be put into low-power modes and allowing the use of more aggressive energy-saving modes.
Abstract: An architectural solution to reducing memory energy consumption is to adopt a multi-bank memory system instead of a monolithic (single-bank) memory system. Some recent multi-bank memory architectures help reduce memory energy by allowing an unused bank to be placed into a low-power operating mode. This paper describes an automatic data migration strategy which dynamically places the arrays with temporal affinity into the same set of banks. This strategy increases the number of banks which can be put into low-power modes and allows the use of more aggressive energy-saving modes. Experiments using several array-dominated applications show the usefulness of data migration and indicate that large energy savings can be achieved with low overhead.
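The sketch below illustrates the underlying placement idea under simplified assumptions: the phase-to-array affinity input and the bank count are invented for illustration, and the paper's migration is dynamic rather than this one-shot assignment. Arrays touched in the same program phases are packed into the same bank, so banks holding only idle arrays can drop into a low-power mode.

```python
# Sketch of affinity-based bank assignment: arrays that tend to be live
# at the same time share banks, so the remaining banks can sleep.
# The affinity input and bank count are illustrative assumptions.

def assign_banks(access_phases, num_banks):
    """access_phases maps each program phase to the arrays it touches.
    Arrays co-accessed in a phase get the same bank when possible."""
    bank_of, next_bank = {}, 0
    for phase, arrays in access_phases.items():
        # Reuse a bank already holding one of this phase's arrays.
        existing = {bank_of[a] for a in arrays if a in bank_of}
        bank = min(existing) if existing else next_bank % num_banks
        if not existing:
            next_bank += 1
        for a in arrays:
            bank_of.setdefault(a, bank)
    return bank_of

phases = {"loop1": ["A", "B"], "loop2": ["C", "D"], "loop3": ["A", "B"]}
banks = assign_banks(phases, num_banks=4)
print(banks)            # A and B share a bank; C and D share another
active = set(banks.values())
print(f"{4 - len(active)} of 4 banks can stay in low-power mode")
```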

95 citations

Proceedings ArticleDOI
14 Mar 2015
TL;DR: A novel compiler framework, called Facade, can generate highly efficient data manipulation code by automatically transforming the data path of an existing Big Data application, leading to significantly reduced memory management cost and improved scalability.
Abstract: The past decade has witnessed increasing demands on data-driven business intelligence, which have led to the proliferation of data-intensive applications. A managed object-oriented programming language such as Java is often the developer's choice for implementing such applications, due to its quick development cycle and rich community resources. While the use of such languages makes programming easier, their automated memory management comes at a cost. When the managed runtime meets Big Data, this cost is significantly magnified and becomes a scalability-prohibiting bottleneck. This paper presents a novel compiler framework, called Facade, that can generate highly efficient data manipulation code by automatically transforming the data path of an existing Big Data application. The key treatment is that, in the generated code, the number of runtime heap objects created for data types in each thread is (almost) statically bounded, leading to significantly reduced memory management cost and improved scalability. We have implemented Facade and used it to transform 7 common applications on 3 real-world, already well-optimized Big Data frameworks: GraphChi, Hyracks, and GPS. Our experimental results are very positive: the generated programs (1) achieved a 3%--48% execution time reduction and an up to 88X GC reduction; (2) consumed up to 50% less memory; and (3) scaled to much larger datasets.
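A minimal sketch of the bounded-object idea behind such a transformation follows; the record layout and the PointFacade API are hypothetical, and the real Facade compiler generates Java rather than Python. Record contents live in one flat buffer, and a single reusable facade object per data type is re-pointed at records, so the heap object count stays constant regardless of data size.

```python
# Sketch of the facade idea: record data lives in one flat buffer, and a
# single reusable "facade" object per type (per thread) is re-pointed at
# records instead of allocating one object per record. The layout and
# API are illustrative assumptions, not Facade's generated code.

import struct

RECORD = struct.Struct("<qd")            # e.g. (int64 id, float64 score)

class PointFacade:
    """One instance serves every record: bounded objects, not O(#records)."""
    __slots__ = ("buf", "off")

    def bind(self, buf, index):
        self.buf, self.off = buf, index * RECORD.size
        return self

    @property
    def id(self):
        return RECORD.unpack_from(self.buf, self.off)[0]

    @property
    def score(self):
        return RECORD.unpack_from(self.buf, self.off)[1]

# Pack records into one flat buffer: one heap allocation regardless of count.
n = 3
buf = bytearray(RECORD.size * n)
for i in range(n):
    RECORD.pack_into(buf, i * RECORD.size, i, i * 0.5)

facade = PointFacade()                   # the (almost) statically bounded object
total = sum(facade.bind(buf, i).score for i in range(n))
print(total)                             # 0.0 + 0.5 + 1.0 = 1.5
```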

95 citations


Network Information
Related Topics (5)
Cache: 59.1K papers, 976.6K citations (94% related)
Scalability: 50.9K papers, 931.6K citations (92% related)
Server: 79.5K papers, 1.4M citations (89% related)
Virtual machine: 43.9K papers, 718.3K citations (87% related)
Scheduling (computing): 78.6K papers, 1.3M citations (86% related)
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    33
2022    88
2021    629
2020    467
2019    461
2018    591