Topic

Memory management

About: Memory management is a research topic. Over its lifetime, 16,743 publications have been published within this topic, receiving 312,028 citations. The topic is also known as: memory allocation.


Papers
Patent
16 May 2008
TL;DR: In this paper, the authors propose virtualizing external memory resources as internal memory resources and connecting each storage device by a plurality of paths in a switchable manner, thus improving reliability and ease of use.
Abstract: The present invention uses memory resources effectively by virtualizing external memory resources as internal memory resources, and connects each storage device by a plurality of paths in a switchable manner, thus improving reliability and ease of use. External storage 2 is connected to the main storage 1, and the actual volume 2A is mapped onto the virtual volume 1A. A plurality of paths is connected between storages 1 and 2. When a failure occurs in the path in use (S3), the path having the next-highest priority is selected (S4), and processing is continued using this path (S5).
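
As a rough illustration of the failover steps S3-S5 the abstract sketches, the following C fragment selects the healthy path with the next-highest priority when the path in use fails. The struct layout and the lower-number-is-higher-priority convention are assumptions for illustration, not taken from the patent.

/* Hypothetical sketch of priority-based path failover between a main
 * storage and an external storage; illustrative only. */
#include <stdbool.h>
#include <stdio.h>

struct path {
    int  priority;   /* lower value = higher priority (assumed convention) */
    bool healthy;    /* cleared when a failure is detected on this path    */
};

/* Return the healthy path with the highest priority, or NULL if every
 * path between the main storage and the external storage has failed. */
static struct path *select_path(struct path *paths, int n)
{
    struct path *best = NULL;
    for (int i = 0; i < n; i++)
        if (paths[i].healthy && (!best || paths[i].priority < best->priority))
            best = &paths[i];
    return best;
}

int main(void)
{
    struct path paths[] = { {0, true}, {1, true}, {2, true} };

    paths[0].healthy = false;                /* failure on the path in use (S3) */
    struct path *p = select_path(paths, 3);  /* pick next-highest priority (S4) */
    if (p)
        printf("continuing on path with priority %d\n", p->priority);  /* (S5) */
    return 0;
}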

72 citations

Proceedings ArticleDOI
16 Oct 2000
TL;DR: Measurements show that the collector's performance advantage increases as the number of threads increases and that it provides uniformly low response times compared with a stop-the-world mark-sweep GC.
Abstract: Java uses garbage collection (GC) for the automatic reclamation of computer memory no longer required by a running application. GC implementations for Java Virtual Machines (JVMs) are typically designed for single-processor machines and do not necessarily perform well for a server program with many threads running on a multiprocessor. We designed and implemented an on-the-fly GC, based on the algorithm of Doligez, Leroy and Gonthier [13, 12] (DLG), for Java in this environment. An on-the-fly collector, a collector that does not stop the program threads, allows all processors to be utilized during collection and provides uniform response times. We extended and adapted DLG for Java (e.g., adding support for weak references) and for modern multiprocessors without sequential consistency, and added performance improvements (e.g., to keep track of the objects remaining to be traced). We compared the performance of our implementation with a stop-the-world mark-sweep GC. Our measurements show that the performance advantage of our collector increases as the number of threads increases and that it provides uniformly low response times.
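
The key point of an on-the-fly collector is that mutator threads keep modifying the heap while the collector traces it, which is typically made safe by a write barrier. Below is a minimal C sketch of a Dijkstra-style insertion barrier of the general kind such collectors rely on; the tri-color bookkeeping and all names are illustrative assumptions, far simpler than DLG's actual protocol.

/* Minimal sketch: any reference installed while marking is active is
 * shaded gray so a concurrent marker cannot miss it. Illustrative only. */
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

enum color { WHITE, GRAY, BLACK };      /* classic tri-color abstraction */

struct object {
    _Atomic enum color color;
    struct object *field;               /* a single reference field */
};

static _Atomic int marking_active;      /* set by the collector during a cycle */

/* Shade a white object gray so the tracer will (re)visit it. */
static void shade(struct object *o)
{
    enum color expected = WHITE;
    if (o)
        atomic_compare_exchange_strong(&o->color, &expected, GRAY);
}

/* Every mutator pointer store goes through the barrier, so references
 * installed during marking are never hidden from the collector. */
static void write_ref(struct object *src, struct object *newval)
{
    if (atomic_load(&marking_active))
        shade(newval);
    src->field = newval;
}

int main(void)
{
    struct object a = { WHITE, NULL }, b = { WHITE, NULL };

    atomic_store(&marking_active, 1);   /* a marking cycle is in progress */
    write_ref(&a, &b);                  /* b is shaded, not lost          */
    printf("b color = %d (1 = GRAY)\n", atomic_load(&b.color));
    return 0;
}

In the real collector, barriers like this are paired with per-thread handshakes rather than a global stop, which is what makes the collection "on-the-fly".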

72 citations

Proceedings ArticleDOI
13 Dec 2014
TL;DR: This paper proposes Equalizer, a low-overhead hardware runtime system that dynamically monitors the resource requirements of a kernel and manages the amount of on-chip concurrency, core frequency, and memory frequency to adapt the hardware to best match the needs of the running kernel.
Abstract: GPUs use thousands of threads to provide high performance and efficiency. In general, if one thread of a kernel uses one of the resources (compute, bandwidth, data cache) more heavily, there will be significant contention for that resource due to the large number of identical concurrent threads. This contention eventually saturates the performance of the kernel at the bottleneck resource, while at the same time leaving other resources underutilized. To overcome this problem, a runtime system that can tune the hardware to match the characteristics of a kernel can effectively mitigate the imbalance between the resource requirements of kernels and the hardware resources present on the GPU. We propose Equalizer, a low-overhead hardware runtime system that dynamically monitors the resource requirements of a kernel and manages the amount of on-chip concurrency, core frequency, and memory frequency to adapt the hardware to best match the needs of the running kernel. Equalizer provides efficiency in two modes. First, it can save energy without significant performance degradation by throttling under-utilized resources. Second, it can boost bottleneck resources to reduce contention and provide higher performance without significant energy increase. Across a spectrum of 27 kernels, Equalizer achieves 15% savings in energy mode and 22% speedup in performance mode.
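
To make the two modes concrete, here is a hedged C sketch of the kind of per-epoch decision such a runtime could evaluate: find the most heavily used resource and, depending on the mode, boost it (performance) or throttle the under-utilized ones (energy). The counters, threshold, and actions are invented for illustration and do not reflect Equalizer's actual hardware design.

/* Illustrative decision loop for a bottleneck-aware GPU runtime. */
#include <stdio.h>

enum resource { COMPUTE, BANDWIDTH, CACHE, NRES };
enum action   { STEADY, BOOST_BOTTLENECK, THROTTLE_IDLE };

struct counters { double util[NRES]; };  /* utilization in [0,1] per resource */

static enum action decide(const struct counters *c, int perf_mode)
{
    int bottleneck = 0;
    for (int r = 1; r < NRES; r++)
        if (c->util[r] > c->util[bottleneck])
            bottleneck = r;

    if (c->util[bottleneck] > 0.9)            /* one resource is saturated  */
        return perf_mode ? BOOST_BOTTLENECK   /* raise its frequency        */
                         : THROTTLE_IDLE;     /* slow the under-used others */
    return STEADY;                            /* balanced kernel: no change */
}

int main(void)
{
    struct counters c = { { 0.95, 0.40, 0.30 } };   /* compute-bound kernel */
    printf("action = %d (1 = boost bottleneck)\n", decide(&c, 1));
    return 0;
}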

72 citations

Patent
Landy Wang
08 Feb 2000
TL;DR: In this paper, a memory manager maintains information related to the mapping of virtual addresses to physical pages, in order to verify remap requests and invalidate existing mappings from a virtual address to a previously mapped physical page.
Abstract: A system and method for providing applications with the ability to access an increased amount of memory. An application maps a specified address range in its (small) virtual memory space to a corresponding number of pages allocated thereto in (relatively large) physical memory. When the application accesses an address in that range in virtual memory, e.g., via a thirty-two-bit address, the mapping information is used to access the corresponding page currently pointed to in the physical memory, allowing access to significantly greater amounts of memory. Fine granularity of access (e.g., one page) is provided, along with fast remapping, cross-process security and coherency across multiple processors in a multiprocessor system. To this end, a memory manager maintains information related to the mapping of virtual addresses to physical pages, in order to verify remap requests and invalidate existing mappings from a virtual address to a previously mapped physical page. For coherency in a multi-processor system, a list is maintained for invalidating existing mappings in the translation buffers of other processors in a consolidated operation, thereby requiring only a single interrupt of each processor to invalidate mappings.
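
A small C sketch of the bookkeeping pattern the abstract describes follows: verify a remap request against the manager's own mapping table, queue the stale translation, and invalidate the whole batch in one consolidated pass per processor instead of one interrupt per mapping. The types, limits, and the stand-in flush routine are assumptions for illustration, not the patent's design.

/* Illustrative remap verification plus consolidated TLB invalidation. */
#include <stdio.h>

#define NSLOTS 64

struct mapping { unsigned long va, pa; int valid; };

static struct mapping table[NSLOTS];   /* manager's view of VA -> phys page */
static unsigned long  stale[NSLOTS];   /* VAs awaiting TLB invalidation     */
static int            nstale;

/* Remap slot to new_pa; the old translation is queued, not flushed yet. */
static int remap(int slot, unsigned long new_pa)
{
    if (slot < 0 || slot >= NSLOTS || !table[slot].valid || nstale >= NSLOTS)
        return -1;                       /* reject bogus remap requests      */
    stale[nstale++] = table[slot].va;    /* old mapping must be invalidated  */
    table[slot].pa = new_pa;
    return 0;
}

/* One consolidated shootdown: a single "interrupt" per processor flushes
 * every queued translation (a printf stands in for a real IPI + flush). */
static void flush_all_cpus(int ncpus)
{
    for (int cpu = 0; cpu < ncpus; cpu++)
        printf("cpu %d: invalidating %d stale translations\n", cpu, nstale);
    nstale = 0;
}

int main(void)
{
    table[0] = (struct mapping){ 0x10000, 0x8000, 1 };
    remap(0, 0x9000);
    remap(0, 0xA000);
    flush_all_cpus(4);                   /* one pass covers both remaps      */
    return 0;
}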

72 citations

Proceedings ArticleDOI
Chanik Park, Junghee Lim, Kiwon Kwon, Jaejin Lee, Sang Lyul Min
27 Sep 2004
TL;DR: This paper proposes a novel, application-specific demand paging mechanism for low-end embedded systems with flash memory as secondary storage, and shows that this approach can reduce the code memory size by 33% on average with reasonable performance degradation and energy consumption.
Abstract: In this paper, we propose a novel, application-specific demand paging mechanism for low-end embedded systems with flash memory as secondary storage. These systems are not equipped with virtual memory. A small memory space called an execution buffer is allocated to page an application, and an application-specific page manager, generated by a compiler post-pass and combined with the application image, manages the buffer. Our compiler post-pass analyzes the ELF executable image of an application and transforms function call/return instructions into calls to the page manager. As a result, each function of the code can be loaded into memory on demand at run time. To minimize the overhead of demand paging, code clustering algorithms are also presented. We evaluate our techniques with five embedded applications and show that our approach can reduce the code memory size by 33% on average with reasonable performance degradation (8-20%) and energy consumption (10% more on average) for low-end embedded systems.
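
The mechanism lends itself to a short sketch: every transformed call site asks a page manager for the callee's address, and on a miss the manager copies the function's page from flash into the small execution buffer. The following C program simulates that idea; the buffer size, residency table, and round-robin eviction are illustrative assumptions, not the paper's design.

/* Illustrative demand pager for function code on a system without VM. */
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 256
#define BUF_PAGES 2                     /* tiny execution buffer */

static unsigned char flash[8][PAGE_SIZE];       /* code image in flash storage */
static unsigned char execbuf[BUF_PAGES][PAGE_SIZE];
static int           resident[BUF_PAGES];       /* flash page held by each slot */
static int           next_victim;               /* round-robin eviction cursor  */

/* Return a pointer to the in-buffer copy of flash page fn_page, paging
 * it in on demand; every transformed call site would go through here. */
static unsigned char *page_manager(int fn_page)
{
    for (int s = 0; s < BUF_PAGES; s++)
        if (resident[s] == fn_page)
            return execbuf[s];                  /* hit: already resident */

    int s = next_victim;                        /* miss: evict a slot    */
    next_victim = (next_victim + 1) % BUF_PAGES;
    memcpy(execbuf[s], flash[fn_page], PAGE_SIZE);
    resident[s] = fn_page;
    return execbuf[s];
}

int main(void)
{
    for (int i = 0; i < BUF_PAGES; i++) resident[i] = -1;
    flash[3][0] = 0xC3;                         /* pretend page 3 holds code */
    unsigned char *code = page_manager(3);      /* demand-load it            */
    printf("page 3 byte 0 = 0x%02X\n", code[0]);
    return 0;
}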

72 citations


Network Information
Related Topics (5)
Cache: 59.1K papers, 976.6K citations (94% related)
Scalability: 50.9K papers, 931.6K citations (92% related)
Server: 79.5K papers, 1.4M citations (89% related)
Virtual machine: 43.9K papers, 718.3K citations (87% related)
Scheduling (computing): 78.6K papers, 1.3M citations (86% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023        33
2022        88
2021       629
2020       467
2019       461
2018       591