NCDE: In-Network Caching for Directory Entries to Expedite Data Access in Tiled-Chip Multiprocessors

doi:10.1109/access.2023.3234933

IEEE Access, Vol. 11, pp. 3080-3095, 01 Jan 2023

TLDR

In this paper, the authors explore mitigating the problems associated with shared data access via in-network caching for directory entries (NCDE), which can use the virtual channels of every input port to hold directory entries.

Abstract:

Data-intensive applications generate an unprecedented amount of data traffic and drive explosive growth in accesses to the memory subsystem. An overloaded memory subsystem suffers increased data access latency. To expedite data access, network caching techniques have emerged that use network-on-chip (NoC) virtual channels (VCs) as an extension of the memory subsystem. Previous network caching studies focused on using the VCs of the NoC's local input port as a victim cache to reduce local data access latency. In contrast, we explore the opportunity of mitigating problems associated with shared data access via in-network caching for directory entries (NCDE), which can use the VCs of every input port to hold directory entries. NCDE exploits VCs as victim buffers and prefetch buffers for directory entries, which respectively reduce directory eviction-induced invalidations and simplify cache-to-cache (C2C) data transfer. The effectiveness of NCDE was evaluated using the gem5 full-system simulator, and the results show that the average memory access time (AMAT) and workload execution time were reduced by 7.69% and 5.82%, respectively. As the cost of accelerating data access, implementing NCDE incurs a negligible router area overhead of 1.56%.
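The victim-buffer idea mentioned in the abstract can be illustrated with a toy model. The sketch below is not the paper's design: it assumes an LRU-managed directory, an LRU victim buffer standing in for spare VC slots, and hypothetical names such as `DirectoryWithVictimBuffer` and `count_invalidations`. It only counts how many sharer copies would have to be invalidated when a directory entry is dropped for good.

```python
from collections import OrderedDict


class DirectoryWithVictimBuffer:
    """Toy LRU directory backed by a small victim buffer (hypothetical model)."""

    def __init__(self, dir_capacity, victim_capacity):
        self.dir_capacity = dir_capacity
        self.victim_capacity = victim_capacity
        self.directory = OrderedDict()  # block address -> set of sharer core ids
        self.victims = OrderedDict()    # evicted entries parked in assumed "VC" slots
        self.invalidations = 0          # sharer copies invalidated by dropped entries

    def _evict_lru_entry(self):
        addr, sharers = self.directory.popitem(last=False)  # least recently used
        if self.victim_capacity == 0:
            # No victim buffer: losing the entry forces invalidation of all sharers.
            self.invalidations += len(sharers)
            return
        if len(self.victims) >= self.victim_capacity:
            # Victim buffer is full: its oldest entry is dropped for good.
            _, dropped = self.victims.popitem(last=False)
            self.invalidations += len(dropped)
        self.victims[addr] = sharers

    def access(self, addr, core):
        if addr in self.directory:
            self.directory.move_to_end(addr)  # refresh LRU position
        else:
            # Try to recover the entry from the victim buffer, else start fresh.
            sharers = self.victims.pop(addr, None)
            if sharers is None:
                sharers = set()
            if len(self.directory) >= self.dir_capacity:
                self._evict_lru_entry()
            self.directory[addr] = sharers
        self.directory[addr].add(core)


def count_invalidations(trace, dir_capacity, victim_capacity):
    """Replay (address, core) accesses and return the invalidation count."""
    model = DirectoryWithVictimBuffer(dir_capacity, victim_capacity)
    for addr, core in trace:
        model.access(addr, core)
    return model.invalidations


# Three blocks cycling through a two-entry directory: a worst case for LRU.
trace = [(addr, 0) for addr in [0, 1, 2] * 4]
without_buffer = count_invalidations(trace, dir_capacity=2, victim_capacity=0)
with_buffer = count_invalidations(trace, dir_capacity=2, victim_capacity=1)
print(without_buffer, with_buffer)
```

On this synthetic trace, a single victim slot is enough to absorb every directory eviction, whereas the configuration without a buffer invalidates a sharer on each eviction. Real workloads, VC occupancy by network traffic, and the coherence protocol itself are all outside what this sketch models.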
