Proceedings ArticleDOI
A Framework for Combining Concurrent Checking and On-Line Embedded Test for Low-Latency Fault Detection in NoC Routers
Pietro Saltarelli,Behrad Niazmand,Jaan Raik,Vineeth Govind,Thomas Hollstein,Gert Jervan,Ranganathan Hariharan +6 more
- pp 6
TLDR
A framework of tools for formally evaluating the quality of the checkers and for optimizing the overhead area with given fault coverage constraints is proposed and successfully applied to a realistic case-study of a fault tolerant NoC router design.
Abstract:
This paper addresses fault detection in NoC routers by combining concurrent checkers with embedded on-line test, enabling cost-effective trade-offs between area overhead and test coverage. First, we propose a framework of tools for formally evaluating the quality of the checkers and for optimizing the area overhead under given fault coverage constraints. Particular emphasis is placed on minimizing error detection latency, which is crucial for eliminating (or limiting) error propagation. Second, the concurrent checkers are complemented by embedded on-line test packets, which are applied as a periodic routine during idle periods in router operation. The framework, together with the corresponding methodology, has been successfully applied to a realistic case study of a fault-tolerant NoC router design. The case study shows that combining concurrent checkers with embedded test reduces the area overhead of the checkers from 31–35% down to 1.5–10% without sacrificing fault coverage.
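The area-versus-coverage trade-off described in the abstract can be illustrated with a small sketch. This is not the paper's framework: it is a generic greedy heuristic, and the checker names, area costs, and fault sets below are hypothetical values chosen only for illustration. It assumes each checker has a known area cost and a known set of detected faults, and it picks checkers until a target fault coverage is reached.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical model: checker name -> (area cost, ids of the faults it detects).
Checker = Tuple[float, FrozenSet[int]]


def select_checkers(
    checkers: Dict[str, Checker],
    total_faults: int,
    target_coverage: float,
) -> Tuple[List[str], float, float]:
    """Greedily pick checkers by newly covered faults per unit area.

    Returns (chosen names, total area, achieved coverage). A greedy choice
    is a heuristic and is not guaranteed to give the minimum area.
    """
    covered: Set[int] = set()
    chosen: List[str] = []
    area = 0.0
    remaining = dict(checkers)
    while remaining and len(covered) / total_faults < target_coverage:
        # Best ratio of newly detected faults to area cost.
        best = max(
            remaining,
            key=lambda n: len(remaining[n][1] - covered) / remaining[n][0],
        )
        cost, faults = remaining.pop(best)
        gain = faults - covered
        if not gain:
            break  # no remaining checker detects anything new
        covered |= gain
        chosen.append(best)
        area += cost
    return chosen, area, len(covered) / total_faults


# Illustrative data only (not taken from the paper).
example = {
    "c1": (4.0, frozenset({0, 1, 2, 3})),
    "c2": (1.0, frozenset({0, 1})),
    "c3": (1.0, frozenset({2, 3})),
    "c4": (2.0, frozenset({4})),
}
names, total_area, coverage = select_checkers(example, total_faults=5, target_coverage=0.8)
print(names, total_area, coverage)
```

In this toy setting, faults left uncovered by the selected checkers would be the ones delegated to the periodic embedded test packets, at the cost of a longer detection latency.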
Citations
Proceedings ArticleDOI
From online fault detection to fault management in Network-on-Chips: A ground-up approach
Siavoosh Payandeh Azad,Behrad Niazmand,Karl Janson,Nevin George,Adeboye Stephen Oyeniran,Tsotne Putkaradze,Apneet Kaur,Jaan Raik,Gert Jervan,Raimund Ubar,Thomas Hollstein +10 more
TL;DR: A ground-up approach from fault detection to fault management for a NoC-based system on chip is proposed, which uses both local fault management for fast reaction to faults and a global fault management mechanism for triggering a large-scale reconfiguration of the NoC.
Journal ArticleDOI
Efficient Design-for-Test Approach for Networks-on-Chip
TL;DR: EsyTest, a comprehensive test strategy with minimized influence on system performance, is proposed, which provides full test coverage for the NoC and better hardware compatibility compared with existing test strategies.
Journal ArticleDOI
Link Testing: a Survey of Current Trends in Network on Chip
TL;DR: This is the pioneering survey paper concerned with classifying link testing approaches into two categories, online and offline, and it extracts some conceptualizations to assist the research community.
Journal ArticleDOI
An Energy-Efficient NoC Router with Adaptive Fault-Tolerance Using Channel Slicing and On-Demand TMR
Cheng Li,Mo Yang,Paul Ampadu +2 more
TL;DR: This paper proposes an energy-efficient NoC router that exhibits strong fault-tolerance by leveraging channel slicing, and increases router logic area by 7.8 percent compared to baseline.
Book ChapterDOI
Designing Reliable Cyber-Physical Systems
Gadi Aleksandrowicz,Eli Arbel,Roderick Bloem,Timon D. ter Braak,Sergei Devadze,Goerschwin Fey,Maksim Jenihhin,Artur Jutman,Hans G. Kerkhoff,Robert Könighofer,Shlomit Koyfman,Jan Malburg,Shiri Moran,Jaan Raik,Gerard Rauwerda,Heinz Riener,Franz Röck,Konstantin Shibin,Kim Sunesen,Jinbo Wan,Yong Zhao +20 more
TL;DR: In this article, the application in the physical environment drives the overall requirements that must be respected when designing the computing system, and reliability is a core aspect where some of the most pressing design challenges are: (1) monitoring failures throughout the computing system, (2) determining the impact of failures on the application constraints, and (3) ensuring correctness of the computing system with respect to application-driven requirements rooted in the physical environment.
References
Journal ArticleDOI
A note on error detection codes for asymmetric channels
TL;DR: Some new codes are described which are separable and are perfect error detection codes in a completely asymmetric channel and the new code is found to compare favorably in error detection capability in several cases.
Journal ArticleDOI
Selective triple Modular redundancy (STMR) based single-event upset (SEU) tolerant synthesis for FPGAs
TL;DR: The proposed STMR method along with the readback and reconfiguration feature of Virtex can result in very high SEU immunity and greatly reduce the area overhead of the hardened circuit when compared to the state-of-the-art triple modular redundancy (TMR).
Proceedings ArticleDOI
Assertion Checkers in Verification, Silicon Debug and In-Field Diagnosis
TL;DR: It is detailed how a checker generator can be used as a means of circuit design for certain portions of self test circuits, and more generally the design of monitoring circuits, in post-fabrication silicon debug.
Journal ArticleDOI
Synthesis of circuits with low-cost concurrent error detection based on Bose-Lin codes
D. Das,Nur A. Touba +1 more
TL;DR: An efficient scheme for concurrent error detection in sequential circuits with no constraint on the state encoding is presented and its cost is reduced significantly compared to other methods based on other codes.
Proceedings ArticleDOI
NoCAlert: An On-Line and Real-Time Fault Detection Mechanism for Network-on-Chip Architectures
TL;DR: NoCAlert is proposed, a comprehensive on-line and real-time fault detection mechanism that demonstrates 0% false negatives within the interconnect, for the fault model and stimulus set used in this study.