FIRM: Fair and High-Performance Memory Control for Persistent Memory Systems
Jishen Zhao, Onur Mutlu, Yuan Xie +2 more
- Vol. 2014, pp 153-165
TLDR
The goal in this paper is to design a fair and high-performance memory control scheme for a persistent-memory-based system that runs both persistent and non-persistent applications. Detailed evaluations show that FIRM provides significantly higher system performance and fairness than five previous memory scheduler designs.
Abstract:
Byte-addressable nonvolatile memories promise a new technology, persistent memory, which incorporates desirable attributes from both traditional main memory (byte-addressability and fast interface) and traditional storage (data persistence). To support data persistence, a persistent memory system requires sophisticated data duplication and ordering control for write requests. As a result, applications that manipulate persistent memory (persistent applications) have very different memory access characteristics than traditional (non-persistent) applications, as shown in this paper. Persistent applications introduce heavy write traffic to contiguous memory regions at a memory channel, which cannot concurrently service read and write requests. This leads to memory bandwidth underutilization due to low bank-level parallelism, frequent write queue drains, and frequent bus turnarounds between reads and writes. These characteristics undermine the high performance and fairness offered by conventional memory scheduling schemes designed for non-persistent applications. Our goal in this paper is to design a fair and high-performance memory control scheme for a persistent-memory-based system that runs both persistent and non-persistent applications. Our proposal, FIRM, consists of three key ideas. First, FIRM categorizes request sources as non-intensive, streaming, random, and persistent, and forms batches of requests for each source. Second, FIRM strides persistent memory updates across multiple banks, thereby improving bank-level parallelism and hence memory bandwidth utilization of persistent memory accesses. Third, FIRM schedules read and write request batches from different sources in a manner that minimizes bus turnarounds and write queue drains. Our detailed evaluations show that, compared to five previous memory scheduler designs, FIRM provides significantly higher system performance and fairness.
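The second idea, striding persistent updates across banks, can be illustrated with a small sketch. This is not the FIRM hardware mechanism itself. The row-interleaved address mapping, the sizes `ROW_SIZE`, `NUM_BANKS`, and `LINE`, and the helper names are all assumptions chosen for illustration. The sketch only shows why a contiguous run of writes can land in a single bank, while a strided layout spreads the same number of writes over several banks.

```python
# Illustrative sketch only; all constants and the mapping are assumptions,
# not parameters taken from the FIRM paper.
ROW_SIZE = 2048   # bytes covered by one row (assumed)
NUM_BANKS = 8     # banks per channel (assumed)
LINE = 64         # cache-line size in bytes (assumed)


def bank_of(addr):
    """Assumed row-interleaved mapping: consecutive rows map to consecutive banks."""
    return (addr // ROW_SIZE) % NUM_BANKS


def contiguous_updates(base, n):
    """Addresses of n cache-line writes laid out back to back."""
    return [base + i * LINE for i in range(n)]


def strided_updates(base, n, group=8):
    """Addresses of n cache-line writes, where each group of `group` lines
    is offset by one row so that successive groups fall into different banks."""
    addrs = []
    for i in range(n):
        g, off = divmod(i, group)
        addrs.append(base + g * ROW_SIZE + off * LINE)
    return addrs


def banks_touched(addrs):
    """Number of distinct banks the given writes can be spread over."""
    return len({bank_of(a) for a in addrs})


contig = banks_touched(contiguous_updates(0, 32))
strided = banks_touched(strided_updates(0, 32))
print(contig, strided)
```

Under these assumed parameters, 32 contiguous line writes fit in one row and therefore one bank, whereas the strided layout touches four banks, so the writes could in principle be serviced in parallel.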
Citations
Proceedings Article
High-Performance Transactions for Persistent Memories
TL;DR: This work presents a comprehensive analysis contrasting two transaction designs across three NVRAM programming interfaces, demonstrating up to 2.5x speedup.
Journal Article
Research Problems and Opportunities in Memory Systems
Onur Mutlu, Lavanya Subramanian +1 more
TL;DR: This article describes three major new research challenges and solution directions, including enabling new DRAM architectures, functions, interfaces, and better integration of the DRAM and the rest of the system, and designing a memory system that employs emerging non-volatile memory technologies and takes advantage of multiple different technologies.
Proceedings Article
Understanding Reduced-Voltage Operation in Modern DRAM Devices: Experimental Characterization, Analysis, and Mechanisms
Kevin K. Chang, Abdullah Giray Yağlıkçı, Saugata Ghose, Aditya Agrawal, Niladrish Chatterjee, Abhijith Kashyap, Donghyuk Lee, Mike O'Connor, Hasan Hassan, Onur Mutlu +9 more
TL;DR: This paper takes a comprehensive approach to understanding and exploiting the latency and reliability characteristics of modern DRAM when the supply voltage is lowered below the nominal voltage level specified by manufacturers.
Proceedings Article
ThyNVM: enabling software-transparent crash consistency in persistent memory systems
TL;DR: A hardware-assisted DRAM+NVM hybrid persistent memory design, Transparent Hybrid NVM (ThyNVM), which supports software-transparent crash consistency of memory data in a hybrid memory system and efficiently enforces crash consistency through a new dual-scheme checkpointing mechanism.
Proceedings Article
Delegated persist ordering
Aasheesh Kolli, Jeffrey Rosen, Stephan Diestelhorst, Ali G. Saidi, Steven Pelley, Sihang Liu, Peter M. Chen, Thomas F. Wenisch +7 more
TL;DR: Delegated ordering is proposed, wherein ordering requirements are communicated explicitly to the PM controller, fully decoupling PM write ordering from volatile execution and cache management, and it is demonstrated that delegated ordering can bring performance within 1.93x of volatile execution, improving over SO by 3.73x.