Open Access · Journal Article

NCDE: In-Network Caching for Directory Entries to Expedite Data Access in Tiled-Chip Multiprocessors

01 Jan 2023, Vol. 11, pp. 3080-3095
TLDR
In this paper, the authors explore the opportunity of mitigating problems associated with shared data access via in-network caching for directory entries (NCDE), which can use the virtual channels of every input port to hold directory entries.
Abstract
Data-intensive applications generate an unprecedented amount of data traffic and drive an explosive number of accesses to the memory subsystem. The overloaded memory subsystem in turn suffers from increased data access latency. To expedite data access, network caching techniques have emerged that leverage network-on-chip (NoC) virtual channels (VCs) as an extension of the memory subsystem. Previous network caching studies focused on using the VCs of the NoC’s local input port as a victim cache to reduce local data access latency. In contrast, we explore the opportunity of mitigating problems associated with shared data access via in-network caching for directory entries (NCDE), which can use the VCs of every input port to hold directory entries. NCDE exploits VCs as victim buffers and prefetch buffers for directory entries, which respectively reduce directory eviction-induced invalidations and simplify cache-to-cache (C2C) data transfers. The effectiveness of NCDE was evaluated using the gem5 full-system simulator; the results show that the average memory access time (AMAT) and workload execution time were reduced by 7.69% and 5.82%, respectively. In exchange for faster data access, implementing NCDE incurs a negligible router area overhead of 1.56%.
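The victim-buffer idea in the abstract can be illustrated with a toy model. The sketch below is not the NCDE design itself: the class name `ToyDirectory`, the helper `count_invalidations`, the LRU replacement policy, and the capacities are all assumptions made for illustration. It only shows why parking an evicted directory entry in spare buffer slots (standing in for VCs) can avoid invalidating the sharers of that entry.

```python
from collections import OrderedDict


class ToyDirectory:
    """Toy directory with an optional victim buffer (hypothetical model).

    LRU replacement is assumed everywhere; this is not the paper's mechanism.
    """

    def __init__(self, dir_capacity, victim_capacity):
        self.dir_capacity = dir_capacity
        self.victim_capacity = victim_capacity
        self.entries = OrderedDict()  # block address -> set of sharer core IDs
        self.victims = OrderedDict()  # evicted entries parked in buffer slots
        self.invalidations = 0        # sharer copies invalidated by evictions

    def _evict_from_directory(self):
        # Remove the least recently used directory entry.
        addr, sharers = self.entries.popitem(last=False)
        if self.victim_capacity == 0:
            # No victim buffer: every sharer of the entry must be invalidated.
            self.invalidations += len(sharers)
            return
        if len(self.victims) >= self.victim_capacity:
            # Victim buffer full: drop its oldest entry and invalidate sharers.
            _, dropped = self.victims.popitem(last=False)
            self.invalidations += len(dropped)
        self.victims[addr] = sharers

    def access(self, addr, core):
        """Record that `core` accesses the block at `addr`."""
        if addr in self.entries:
            self.entries.move_to_end(addr)  # refresh LRU position
        else:
            # A victim-buffer hit restores the sharer set; a miss starts empty.
            sharers = self.victims.pop(addr, set())
            if len(self.entries) >= self.dir_capacity:
                self._evict_from_directory()
            self.entries[addr] = sharers
        self.entries[addr].add(core)


def count_invalidations(trace, dir_capacity, victim_capacity):
    """Replay (address, core) pairs and return the invalidation count."""
    directory = ToyDirectory(dir_capacity, victim_capacity)
    for addr, core in trace:
        directory.access(addr, core)
    return directory.invalidations
```

Replaying a trace that cycles over three blocks through a two-entry directory, for example `[(0xA, 0), (0xB, 1), (0xC, 2)] * 2`, causes invalidations when `victim_capacity` is 0 and none when a single victim slot is available. The numbers reported in the abstract come from gem5 full-system simulation, not from a model like this one.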
