Topic

Cache-only memory architecture

About: Cache-only memory architecture is a research topic. Over the lifetime, 5038 publications have been published within this topic receiving 114066 citations.

...read moreread less

Papers published on a yearly basis

1 / 2

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Cache Memories

[...]

Alan Jay Smith¹•Institutions (1)

University of California, Berkeley¹

01 Sep 1982-ACM Computing Surveys

TL;DR: Specific aspects of cache memories investigated include: the cache fetch algorithm (demand versus prefetch), the placement and replacement algorithms, line size, store-through versus copy-back updating of main memory, cold-start versus warm-start miss ratios, mulhcache consistency, the effect of input /output through the cache, the behavior of split data/instruction caches, and cache size.

...read moreread less

Abstract: design issues. Specific aspects of cache memories tha t are investigated include: the cache fetch algorithm (demand versus prefetch), the placement and replacement algorithms, line size, store-through versus copy-back updating of main memory, cold-start versus warm-start miss ratios, mulhcache consistency, the effect of input /output through the cache, the behavior of split data/instruction caches, and cache size. Our discussion includes other aspects of memory system architecture, including translation lookaside buffers. Throughout the paper, we use as examples the implementation of the cache in the Amdahl 470V/6 and 470V/7, the IBM 3081, 3033, and 370/168, and the DEC VAX 11/780. An extensive bibliography is provided.

...read moreread less

1,614 citations

Proceedings Article•DOI•

Scalable high performance main memory system using phase-change memory technology

[...]

Moinuddin K. Qureshi¹, Vijayalakshmi Srinivasan¹, Jude A. Rivers¹•Institutions (1)

IBM¹

20 Jun 2009

TL;DR: This paper analyzes a PCM-based hybrid main memory system using an architecture level model of PCM and proposes simple organizational and management solutions of the hybrid memory that reduces the write traffic to PCM, boosting its lifetime from 3 years to 9.7 years.

...read moreread less

Abstract: The memory subsystem accounts for a significant cost and power budget of a computer system. Current DRAM-based main memory systems are starting to hit the power and cost limit. An alternative memory technology that uses resistance contrast in phase-change materials is being actively investigated in the circuits community. Phase Change Memory (PCM) devices offer more density relative to DRAM, and can help increase main memory capacity of future systems while remaining within the cost and power constraints.In this paper, we analyze a PCM-based hybrid main memory system using an architecture level model of PCM.We explore the trade-offs for a main memory system consisting of PCMstorage coupled with a small DRAM buffer. Such an architecture has the latency benefits of DRAM and the capacity benefits of PCM. Our evaluations for a baseline system of 16-cores with 8GB DRAM show that, on average, PCM can reduce page faults by 5X and provide a speedup of 3X. As PCM is projected to have limited write endurance, we also propose simple organizational and management solutions of the hybrid memory that reduces the write traffic to PCM, boosting its lifetime from 3 years to 9.7 years.

...read moreread less

1,451 citations

Journal Article•DOI•

PRIME: a novel processing-in-memory architecture for neural network computation in ReRAM-based main memory

[...]

Ping Chi¹, Shuangchen Li¹, Cong Xu², Tao Zhang³, Jishen Zhao¹, Yongpan Liu⁴, Yu Wang⁴, Yuan Xie¹ - Show less +4 more•Institutions (4)

University of California¹, Hewlett-Packard², Nvidia³, Tsinghua University⁴

18 Jun 2016

TL;DR: This work proposes a novel PIM architecture, called PRIME, to accelerate NN applications in ReRAM based main memory, and distinguishes itself from prior work on NN acceleration, with significant performance improvement and energy saving.

...read moreread less

Abstract: Processing-in-memory (PIM) is a promising solution to address the "memory wall" challenges for future computer systems. Prior proposed PIM architectures put additional computation logic in or near memory. The emerging metal-oxide resistive random access memory (ReRAM) has showed its potential to be used for main memory. Moreover, with its crossbar array structure, ReRAM can perform matrix-vector multiplication efficiently, and has been widely studied to accelerate neural network (NN) applications. In this work, we propose a novel PIM architecture, called PRIME, to accelerate NN applications in ReRAM based main memory. In PRIME, a portion of ReRAM crossbar arrays can be configured as accelerators for NN applications or as normal memory for a larger memory space. We provide microarchitecture and circuit designs to enable the morphable functions with an insignificant area overhead. We also design a software/hardware interface for software developers to implement various NNs on PRIME. Benefiting from both the PIM architecture and the efficiency of using ReRAM for NN computation, PRIME distinguishes itself from prior work on NN acceleration, with significant performance improvement and energy saving. Our experimental results show that, compared with a state-of-the-art neural processing unit design, PRIME improves the performance by ~2360× and the energy consumption by ~895×, across the evaluated machine learning benchmarks.

...read moreread less

1,197 citations

Proceedings Article•DOI•

Memory consistency and event ordering in scalable shared-memory multiprocessors

[...]

Kourosh Gharachorloo¹, Daniel E. Lenoski¹, James Laudon¹, Phillip B. Gibbons¹, Anoop Gupta¹, John L. Hennessy¹ - Show less +2 more•Institutions (1)

Stanford University¹

01 May 1990

TL;DR: A new model of memory consistency, called release consistency, that allows for more buffering and pipelining than previously proposed models is introduced and is shown to be equivalent to the sequential consistency model for parallel programs with sufficient synchronization.

...read moreread less

Abstract: Scalable shared-memory multiprocessors distribute memory among the processors and use scalable interconnection networks to provide high bandwidth and low latency communication. In addition, memory accesses are cached, buffered, and pipelined to bridge the gap between the slow shared memory and the fast processors. Unless carefully controlled, such architectural optimizations can cause memory accesses to be executed in an order different from what the programmer expects. The set of allowable memory access orderings forms the memory consistency model or event ordering model for an architecture.This paper introduces a new model of memory consistency, called release consistency, that allows for more buffering and pipelining than previously proposed models. A framework for classifying shared accesses and reasoning about event ordering is developed. The release consistency model is shown to be equivalent to the sequential consistency model for parallel programs with sufficient synchronization. Possible performance gains from the less strict constraints of the release consistency model are explored. Finally, practical implementation issues are discussed, concentrating on issues relevant to scalable architectures.

...read moreread less

1,169 citations

Proceedings Article•DOI•

Memory access scheduling

[...]

Scott Rixner¹, William J. Dally², Ujval J. Kapasi², Peter Mattson², John D. Owens² - Show less +1 more•Institutions (2)

Massachusetts Institute of Technology¹, Stanford University²

01 May 2000

TL;DR: This paper introduces memory access scheduling, a technique that improves the performance of a memory system by reordering memory references to exploit locality within the 3-D memory structure.

...read moreread less

Abstract: The bandwidth and latency of a memory system are strongly dependent on the manner in which accesses interact with the “3-D” structure of banks, rows, and columns characteristic of contemporary DRAM chips. There is nearly an order of magnitude difference in bandwidth between successive references to different columns within a row and different rows within a bank. This paper introduces memory access scheduling, a technique that improves the performance of a memory system by reordering memory references to exploit locality within the 3-D memory structure. Conservative reordering, in which the first ready reference in a sequence is performed, improves bandwidth by 40% for traces from five media benchmarks. Aggressive reordering, in which operations are scheduled to optimize memory bandwidth, improves bandwidth by 93% for the same set of applications. Memory access scheduling is particularly important for media processors where it enables the processor to make the most efficient use of scarce memory bandwidth.

...read moreread less

1,009 citations

Collapse

Network Information

Performance

Metrics

5,063

Papers

118,491

Citations

No. of papers in the topic in previous years
Year	Papers
2023	8
2022	17
2021	19
2020	26
2019	28
2018	38

Cache-only memory architecture

Papers published on a yearly basis

Papers

Trending Questions (10)

Network Information

Related Topics (5)

Performance

Metrics