Efficient implementation of distributed routing algorithms for NoCs
TLDR
LBDR (logic-based distributed routing) is proposed as a new routing method that removes the need for routing tables altogether and enables the implementation of many routing algorithms on most practical topologies in a multi-core system.
Abstract:
Chip multiprocessors (CMPs) are gaining momentum in the high-performance computing domain. Networks-on-chip (NoCs) are key components of CMP architectures, in that they have to deal with the communication scalability challenge while meeting tight power, area and latency constraints. 2D mesh topologies are usually preferred by designers of general-purpose NoCs. However, manufacturing faults may break their regularity. Moreover, resource management frameworks may require the segmentation of the network into irregular regions. Under these conditions, efficient routing becomes a challenge. Although the use of routing tables at switches is flexible, it does not scale in terms of latency and area due to its memory requirements. Logic-based distributed routing (LBDR) is proposed as a new routing method that removes the need for routing tables altogether. LBDR enables the implementation of many routing algorithms on most of the practical topologies we may find in the near future in a multi-core system. From an initial topology and routing algorithm, a set of three bits per switch/output port is computed. Evaluation results show that, with only a small amount of logic, LBDR matches the performance of the same routing algorithms implemented with routing tables, in both regular and irregular topologies. LBDR implementation in a real NoC switch is also explored, demonstrating its smooth integration in the architecture and its negligible hardware and performance overhead.
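The abstract states only that three bits per switch output port are computed from the topology and routing algorithm. As a minimal sketch of how such bits could replace a table lookup, the Python below assumes each output port carries two routing bits (whether a packet leaving through that port may later turn into each perpendicular direction) and one connectivity bit (whether the link exists). The function name `lbdr_ports`, the bit names such as `"NE"`, and the coordinate convention (x grows east, y grows north) are illustrative assumptions, not taken from the abstract.

```python
DIRS = ("N", "E", "S", "W")
# The two directions perpendicular to each output port.
PERP = {"N": ("E", "W"), "S": ("E", "W"), "E": ("N", "S"), "W": ("N", "S")}


def lbdr_ports(cur, dst, routing_bits, connectivity_bits):
    """Return the set of output ports a packet at `cur` may take toward `dst`.

    routing_bits["NE"] means: a packet leaving through N may turn E at the
    next switch (assumed encoding). connectivity_bits["N"] means the N link
    exists. An empty set means the packet has reached its destination.
    """
    (cx, cy), (dx, dy) = cur, dst
    # Comparator stage: which directions bring the packet closer to dst.
    want = {"N": dy > cy, "S": dy < cy, "E": dx > cx, "W": dx < cx}
    ports = set()
    for d in DIRS:
        p, q = PERP[d]
        # Destination lies straight along d: no later turn is required.
        straight = want[d] and not want[p] and not want[q]
        # Destination lies in a quadrant: the later turn must be permitted.
        turn_ok = want[d] and (
            (want[p] and routing_bits[d + p]) or (want[q] and routing_bits[d + q])
        )
        if (straight or turn_ok) and connectivity_bits[d]:
            ports.add(d)
    return ports


# Hypothetical configuration for XY routing: turns from Y into X are forbidden.
XY_BITS = {"NE": False, "NW": False, "SE": False, "SW": False,
           "EN": True, "ES": True, "WN": True, "WS": True}
ALL_LINKS = {d: True for d in DIRS}

if __name__ == "__main__":
    print(lbdr_ports((1, 1), (3, 2), XY_BITS, ALL_LINKS))
```

With the XY configuration, a packet at (1, 1) heading to (3, 2) is offered only the east port, since leaving north would later require a forbidden north-to-east turn. Clearing a connectivity bit models a missing link in an irregular region without changing the logic.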
Citations
Journal Article
Elevator-First: A Deadlock-Free Distributed Routing Algorithm for Vertically Partially Connected 3D-NoCs
TL;DR: It is formally proved that, independently of the shape and dimensions of the planar topologies and of the number and placement of the TSVs, the proposed routing algorithm using two virtual channels in the plane is deadlock- and livelock-free.
Journal Article
A Survey and Evaluation of Topology-Agnostic Deterministic Routing Algorithms
Jose Flich, Tor Skeie, A. Mejia, Olav Lysne, Pedro López, Antonio Robles, José Duato, Michihiro Koibuchi, Tomas Rokicki, Jose Carlos Sancho
TL;DR: This paper presents a comprehensive overview of the known topology-agnostic routing algorithms, classifies these algorithms by their most important properties, and evaluates them consistently, providing significant insight into the algorithms and their appropriateness for different on- and off-chip environments.
Proceedings Article
Addressing Manufacturing Challenges with Cost-Efficient Fault Tolerant Routing
Samuel Rodrigo, Jose Flich, Antoni Roca, Simone Medardoni, Davide Bertozzi, J. Camacho, Federico Silla, José Duato
TL;DR: Universal Logic-Based Distributed Routing (uLBDR) is an efficient logic-based mechanism that adapts to any irregular topology derived from 2D meshes, offering an alternative to the use of routing tables.
Proceedings Article
Exploiting Network-on-Chip structural redundancy for a cooperative and scalable built-in self-test architecture
Alessandro Strano, Crispín Gómez, Daniele Ludovici, Michele Favalli, Maria E. Gomez, Davide Bertozzi
TL;DR: This paper proposes a built-in self-test/self-diagnosis procedure at start-up of an on-chip network (NoC) that exploits the inherent structural redundancy of the NoC architecture in a cooperative way, thus detecting faults in test pattern generators too.
Journal Article
Cost-Efficient On-Chip Routing Implementations for CMP and MPSoC Systems
Samuel Rodrigo, Jose Flich, Antoni Roca, Simone Medardoni, Davide Bertozzi, Jesús Camacho, Federico Silla, José Duato
TL;DR: uLBDR is presented, an efficient logic-based mechanism that adapts to any irregular topology derived from 2-D meshes without using routing tables. It requires only a small set of configuration bits, making it more practical than large routing tables implemented in memories.
References
Journal Article
Introduction to the cell multiprocessor
TL;DR: This paper discusses the history of the project, the program objectives and challenges, the design concept, the architecture and programming models, and the implementation of the Cell multiprocessor.
Journal Article
A 5-GHz Mesh Interconnect for a Teraflops Processor
TL;DR: A multicore processor in 65-nm technology with 80 single-precision floating-point cores delivers performance in excess of a teraflops while consuming less than 100 W.
Proceedings Article
iWarp: an integrated solution to high-speed parallel computing
Shekhar Borkar, Robert Cohn, George W. Cox, S. Gleason, T. Gross, Hsiang-Tsung Kung, Monica S. Lam, B. Moore, Craig B. Peterson, J. Pieper, Linda J. Rankin, P.S. Tseng, J. Sutton, John A. Urbanski, Jon A. Webb
TL;DR: Because of their strong computation and communication capabilities, the iWarp components provide a versatile building block for high-performance parallel systems ranging from special-purpose systolic arrays to general-purpose distributed memory computers.
Journal Article
Error control schemes for on-chip communication links: the energy-reliability tradeoff
TL;DR: Redundant bus coding is proved to be an effective technique for trading off energy against reliability, so that the most efficient scheme can be selected to meet predefined reliability requirements in a low signal-to-noise ratio regime.