Topic

Memory management

About: Memory management is a research topic. Over the lifetime, 16743 publications have been published within this topic receiving 312028 citations. The topic is also known as: memory allocation.

...read moreread less

Papers published on a yearly basis

1 / 2

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

In-place Activated BatchNorm for Memory-Optimized Training of DNNs

[...]

Samuel Rota Bulò, Lorenzo Porzi, Peter Kontschieder

18 Jun 2018

TL;DR: In-Place Activated Batch Normalization (INPLACE-ABN) as mentioned in this paper replaces the conventionally used succession of BatchNorm + Activation layers with a single plugin layer, hence avoiding invasive framework surgery.

...read moreread less

Abstract: In this work we present In-Place Activated Batch Normalization (INPLACE-ABN) - a novel approach to drastically reduce the training memory footprint of modern deep neural networks in a computationally efficient way. Our solution substitutes the conventionally used succession of BatchNorm + Activation layers with a single plugin layer, hence avoiding invasive framework surgery while providing straightforward applicability for existing deep learning frameworks. We obtain memory savings of up to 50% by dropping intermediate results and by recovering required information during the backward pass through the inversion of stored forward results, with only minor increase (0.8-2%) in computation time. Also, we demonstrate how frequently used checkpointing approaches can be made computationally as efficient as INPLACE-ABN. In our experiments on image classification, we demonstrate on-par results on ImageNet-1k with state-of-the-art approaches. On the memory-demanding task of semantic segmentation, we report competitive results for COCO-Stuff and set new state-of-the-art results for Cityscapes and Mapillary Vistas. Code can be found at https://github.com/mapillary/inplace_abn.

...read moreread less

281 citations

Proceedings Article•DOI•

Reducing memory interference in multicore systems via application-aware memory channel partitioning

[...]

Sai Prashanth Muralidhara¹, Lavanya Subramanian², Onur Mutlu², Mahmut Kandemir¹, Thomas Moscibroda³ - Show less +1 more•Institutions (3)

Pennsylvania State University¹, Carnegie Mellon University², Microsoft³

03 Dec 2011

TL;DR: In this paper, the authors present an alternative approach to reduce inter-application interference in the memory system: application-aware memory channel partitioning (MCP), which maps the data of applications that are likely to severely interfere with each other to different memory channels.

...read moreread less

Abstract: Main memory is a major shared resource among cores in a multicore system. If the interference between different applications' memory requests is not controlled effectively, system performance can degrade significantly. Previous work aimed to mitigate the problem of interference between applications by changing the scheduling policy in the memory controller, i.e., by prioritizing memory requests from applications in a way that benefits system performance.In this paper, we first present an alternative approach to reducing inter-application interference in the memory system: application-aware memory channel partitioning (MCP). The idea is to map the data of applications that are likely to severely interfere with each other to different memory channels. The key principles are to partition onto separate channels 1) the data of light (memory non-intensive) and heavy (memory-intensive) applications, 2) the data of applications with low and high row-buffer locality.Second, we observe that interference can be further reduced with a combination of memory channel partitioning and scheduling, which we call integrated memory partitioning and scheduling (IMPS). The key idea is to 1) always prioritize very light applications in the memory scheduler since such applications cause negligible interference to others, 2) use MCP to reduce interference among the remaining applications.We evaluate MCP and IMPS on a variety of multi-programmed workloads and system configurations and compare them to four previously proposed state-of-the-art memory scheduling policies. Averaged over 240 workloads on a 24-core system with 4 memory channels, MCP improves system throughput by 7.1% over an application-unaware memory scheduler and 1% over the previous best scheduler, while avoiding modifications to existing memory schedulers. IMPS improves system throughput by 11.1% over an application-unaware scheduler and 5% over the previous best scheduler, while incurring much lower hardware complexity than the latter.

...read moreread less

281 citations

Journal Article•DOI•

Trapezoid self-scheduling: a practical scheduling scheme for parallel compilers

[...]

Ten H. Tzen¹, Lionel M. Ni¹•Institutions (1)

Michigan State University¹

01 Jan 1993-IEEE Transactions on Parallel and Distributed Systems

TL;DR: The experiments conducted in a 96-node Butterfly GP-1000 clearly show the advantage of the trapezoid self-scheduling over other well-known self- scheduling approaches.

...read moreread less

Abstract: A practical processor self-scheduling scheme, trapezoid self-scheduling, is proposed for arbitrary parallel nested loops in shared-memory multiprocessors. Generally, loops are the richest source of parallelism in parallel programs. To dynamically allocate loop iterations to processors, one may achieve load balancing among processors at the expense of run-time scheduling overhead. By linearly decreasing the chunk size at run time, the best tradeoff between the scheduling overhead and balanced workload can be obtained in the proposed trapezoid self-scheduling approach. Due to its simplicity and flexibility, this approach can be efficiently implemented in any parallel compiler. The small and predictable number of chores also allow efficient management of memory in a static fashion. The experiments conducted in a 96-node Butterfly GP-1000 clearly show the advantage of the trapezoid self-scheduling over other well-known self-scheduling approaches. >

...read moreread less

279 citations

Proceedings Article•DOI•

MemScale: active low-power modes for main memory

[...]

Qingyuan Deng¹, David Meisner², Luiz Ramos¹, Thomas F. Wenisch², Ricardo Bianchini¹ - Show less +1 more•Institutions (2)

Rutgers University¹, University of Michigan²

05 Mar 2011

TL;DR: The results demonstrate that MemScale reduces energy consumption significantly compared to modern memory energy management approaches, and it is concluded that the potential benefits of the MemScale mechanisms and policy more than compensate for their small hardware cost.

...read moreread less

Abstract: Main memory is responsible for a large and increasing fraction of the energy consumed by servers. Prior work has focused on exploiting DRAM low-power states to conserve energy. However, these states require entire DRAM ranks to be idled, which is difficult to achieve even in lightly loaded servers. In this paper, we propose to conserve memory energy while improving its energy-proportionality by creating active low-power modes for it. Specifically, we propose MemScale, a scheme wherein we apply dynamic voltage and frequency scaling (DVFS) to the memory controller and dynamic frequency scaling (DFS) to the memory channels and DRAM devices. MemScale is guided by an operating system policy that determines the DVFS/DFS mode of the memory subsystem based on the current need for memory bandwidth, the potential energy savings, and the performance degradation that applications are willing to withstand. Our results demonstrate that MemScale reduces energy consumption significantly compared to modern memory energy management approaches. We conclude that the potential benefits of the MemScale mechanisms and policy more than compensate for their small hardware cost.

...read moreread less

277 citations

Proceedings Article•DOI•

Fine-grain access control for distributed shared memory

[...]

Ioannis T. Schoinas¹, Babak Falsafi¹, Alvin R. Lebeck¹, Steven K. Reinhardt¹, James R. Larus¹, Darien Wood¹ - Show less +2 more•Institutions (1)

University of Wisconsin-Madison¹

01 Nov 1994

TL;DR: In this paper, the authors discuss implementations of fine-grain memory access control, which selectively restricts reads and writes to cache-block-sized memory regions, and incorporate three techniques that require no additional hardware into Blizzard.

...read moreread less

Abstract: This paper discusses implementations of fine-grain memory access control, which selectively restricts reads and writes to cache-block-sized memory regions. Fine-grain access control forms the basis of efficient cache-coherent shared memory. This paper focuses on low-cost implementations that require little or no additional hardware. These techniques permit efficient implementation of shared memory on a wide range of parallel systems, thereby providing shared-memory codes with a portability previously limited to message passing.This paper categorizes techniques based on where access control is enforced and where access conflicts are handled. We incorporated three techniques that require no additional hardware into Blizzard, a system that supports distributed shared memory on the CM-5. The first adds a software lookup before each shared-memory reference by modifying the program's executable. The second uses the memory's error correcting code (ECC) as cache-block valid bits. The third is a hybrid. The software technique ranged from slightly faster to two times slower than the ECC approach. Blizzard's performance is roughly comparable to a hardware shared-memory machine. These results argue that clusters of workstations or personal computers with networks comparable to the CM-5's will be able to support the same shared-memory interfaces as supercomputers.

...read moreread less

275 citations

Collapse

Network Information

Performance

Metrics

16,861

Papers

331,311

Citations

No. of papers in the topic in previous years
Year	Papers
2023	33
2022	88
2021	629
2020	467
2019	461
2018	591

Memory management

Papers published on a yearly basis

Papers

Trending Questions (10)

Network Information

Related Topics (5)

Performance

Metrics