Proceedings Article

A Framework for Combining Concurrent Checking and On-Line Embedded Test for Low-Latency Fault Detection in NoC Routers

TLDR
A framework of tools for formally evaluating the quality of the checkers and for optimizing the area overhead under given fault coverage constraints is proposed and successfully applied to a realistic case study of a fault-tolerant NoC router design.
Abstract
This paper addresses fault detection in NoC routers by combining concurrent checkers with embedded on-line test, enabling cost-effective trade-offs between area overhead and test coverage. First, we propose a framework of tools for formally evaluating the quality of the checkers and for minimizing their area overhead under given fault coverage constraints. Particular emphasis is placed on minimizing the error detection latency, which is crucial for eliminating (or limiting) error propagation. Second, the concurrent checkers are complemented by embedded on-line test packets, which are applied as a periodic routine during idle periods of router operation. The framework, together with the corresponding methodology, has been successfully applied to a realistic case study of a fault-tolerant NoC router design. The case study shows that combining concurrent checkers with embedded test reduces the area overhead of the checkers from 31–35% to 1.5–10% without sacrificing fault coverage.
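
The idea of optimizing checker area under a fault coverage constraint can be illustrated with a small sketch. The code below is not the paper's tool flow: it assumes a hypothetical library of checkers, each with an area cost and a set of fault IDs it detects, and uses a simple greedy heuristic (most newly covered faults per unit of area) to pick checkers until a target coverage is reached. The names `select_checkers`, `chk_a` to `chk_d`, and all numbers are made up for illustration only.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical checker entry: (area cost, set of fault IDs the checker detects).
Checker = Tuple[float, FrozenSet[int]]


def select_checkers(
    checkers: Dict[str, Checker],
    all_faults: FrozenSet[int],
    min_coverage: float,
) -> Tuple[List[str], float, float]:
    """Greedily pick checkers until the coverage constraint is met.

    Assumes every area is positive and all_faults is non-empty.
    Returns (chosen checker names, total area, achieved coverage).
    """
    chosen: List[str] = []
    covered: set = set()
    total_area = 0.0
    remaining = dict(checkers)

    while len(covered) / len(all_faults) < min_coverage and remaining:
        # Score each candidate by newly covered faults per unit of area.
        def gain(name: str) -> float:
            area, faults = remaining[name]
            return len((faults & all_faults) - covered) / area

        best = max(remaining, key=gain)
        if gain(best) == 0:
            break  # no remaining checker detects anything new
        area, faults = remaining.pop(best)
        chosen.append(best)
        covered |= faults & all_faults
        total_area += area

    return chosen, total_area, len(covered) / len(all_faults)


if __name__ == "__main__":
    # Toy data, invented for this example only.
    faults = frozenset(range(10))
    library = {
        "chk_a": (4.0, frozenset({0, 1, 2, 3})),
        "chk_b": (1.0, frozenset({3, 4})),
        "chk_c": (2.0, frozenset({5, 6, 7, 8})),
        "chk_d": (5.0, frozenset({0, 1, 2, 3, 4, 9})),
    }
    names, area, cov = select_checkers(library, faults, min_coverage=0.8)
    print(names, area, cov)
```

A greedy heuristic like this does not guarantee a minimum-area solution; an exact formulation would typically use integer programming or a similar solver. In the setting described in the abstract, faults left uncovered by the selected checkers would be the ones targeted by the embedded on-line test packets.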


Citations
Proceedings Article

From online fault detection to fault management in Network-on-Chips: A ground-up approach

TL;DR: A ground-up approach from fault detection to fault management for a NoC-based system on chip is proposed that utilizes both local fault management for fast reaction to faults and a global fault management mechanism for triggering a large-scale reconfiguration of the NoC.
Journal Article

Efficient Design-for-Test Approach for Networks-on-Chip

TL;DR: EsyTest, a comprehensive test strategy with minimized influence on system performance, is proposed, which provides full test coverage for the NoC and better hardware compatibility compared with existing test strategies.
Journal Article

Link Testing: a Survey of Current Trends in Network on Chip

TL;DR: This pioneering survey paper classifies link testing approaches into two categories, online and offline, and extracts some conceptualizations to assist the research community.
Journal Article

An Energy-Efficient NoC Router with Adaptive Fault-Tolerance Using Channel Slicing and On-Demand TMR

TL;DR: This paper proposes an energy-efficient NoC router that exhibits strong fault tolerance by leveraging channel slicing, at the cost of a 7.8 percent increase in router logic area compared to the baseline.
Book Chapter

Designing Reliable Cyber-Physical Systems

TL;DR: In this article, the application in the physical environment drives the overall requirements that must be respected when designing the computing system, and reliability is a core aspect where some of the most pressing design challenges are: (1) monitoring failures throughout the computing system, (2) determining the impact of failures on the application constraints, and (3) ensuring correctness of the computing system with respect to application-driven requirements rooted in the physical environment.
References
Journal Article

A note on error detection codes for asymmetric channels

TL;DR: Some new codes are described which are separable and are perfect error detection codes in a completely asymmetric channel and the new code is found to compare favorably in error detection capability in several cases.
Journal Article

Selective triple modular redundancy (STMR) based single-event upset (SEU) tolerant synthesis for FPGAs

TL;DR: The proposed STMR method along with the readback and reconfiguration feature of Virtex can result in very high SEU immunity and greatly reduce the area overhead of the hardened circuit when compared to the state-of-the-art triple modular redundancy (TMR).
Proceedings Article

Assertion Checkers in Verification, Silicon Debug and In-Field Diagnosis

TL;DR: It is detailed how a checker generator can be used as a means of circuit design for certain portions of self test circuits, and more generally the design of monitoring circuits, in post-fabrication silicon debug.
Journal Article

Synthesis of circuits with low-cost concurrent error detection based on Bose-Lin codes

TL;DR: An efficient scheme for concurrent error detection in sequential circuits with no constraint on the state encoding is presented and its cost is reduced significantly compared to other methods based on other codes.
Proceedings Article

NoCAlert: An On-Line and Real-Time Fault Detection Mechanism for Network-on-Chip Architectures

TL;DR: NoCAlert is proposed, a comprehensive on-line and real-time fault detection mechanism that demonstrates 0% false negatives within the interconnect, for the fault model and stimulus set used in this study.