Journal Article

Transient fault tolerance in digital systems

Janusz Sosnowski
01 Feb 1994 - Vol. 14, Iss. 1, pp. 24-35
TLDR
This framework provides a basis for understanding transient fault problems in digital systems and can be helpful in selecting optimum techniques to mask or eliminate transient fault effects in developed systems.
Abstract
It is hard to shield systems effectively from transient faults (fault avoidance techniques). So some other means must be employed to assure appropriate levels of transient fault tolerance (insensitivity to transient faults). They are based on fault-masking and fault recovery ideas. Having analyzed this problem, the author identifies critical design points and outlines some practical solutions that refer to efficient on-line detectors (detecting errors during the system operation) and error handling procedures. This framework provides a basis for understanding transient fault problems in digital systems. It can be helpful in selecting optimum techniques to mask or eliminate transient fault effects in developed systems.
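
The two ideas named in the abstract, fault masking and fault recovery, can be illustrated with a minimal Python sketch. It is not taken from the article: the function names `majority_vote` and `retry_on_error`, the strict-majority rule, and the retry limit are all assumptions chosen for illustration. Masking is shown as voting over redundant results, and recovery as re-execution after an on-line check rejects a result.

```python
from collections import Counter
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def majority_vote(results: Sequence[T]) -> T:
    """Fault masking (hypothetical helper): return the value that a strict
    majority of redundant computations agree on."""
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        # No strict majority: the error is detected but cannot be masked.
        raise RuntimeError("no majority among redundant results")
    return value


def retry_on_error(
    operation: Callable[[], T],
    check: Callable[[T], bool],
    max_attempts: int = 3,  # assumed limit, not from the article
) -> T:
    """Fault recovery (hypothetical helper): re-execute the operation until
    an on-line check accepts the result or the attempts run out."""
    for _ in range(max_attempts):
        result = operation()
        if check(result):
            return result
    # Repeated failure suggests the fault is not transient.
    raise RuntimeError("error persisted across all attempts")
```

With three redundant results such as `[7, 9, 7]`, `majority_vote` returns the agreed value and hides the single corrupted one. `retry_on_error` relies on the assumption that a transient fault is unlikely to recur on re-execution.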


Citations
Journal Article

Fault and Error Tolerance in Neural Networks: A Review

TL;DR: A survey on fault tolerance in neural networks is presented, mainly focusing on well-established passive techniques to exploit and improve, by design, this potential but limited intrinsic property of neural models, particularly feedforward neural networks.
Proceedings Article

ICR: in-cache replication for enhancing data cache reliability

TL;DR: This paper proposes a novel solution to this problem by allowing in-cache replication, wherein reliability can be enhanced without excessively slowing down cache accesses or requiring significant area cost increases.
Journal Article

Threshold-based mechanisms to discriminate transient from intermittent faults

TL;DR: A class of count-and-threshold mechanisms, collectively named α-count, which are able to discriminate between transient faults and intermittent faults in computing systems and adopt a mathematically defined structure, which is simple enough to analyze by standard tools.
Journal Article

Design Optimization of Time- and Cost-Constrained Fault-Tolerant Embedded Systems With Checkpointing and Replication

TL;DR: This work uses checkpointing with rollback recovery and active replication for tolerating transient faults, and presents several design optimization approaches which are able to find fault-tolerant implementations given a limited amount of resources.
Proceedings Article

Area efficient architectures for information integrity in cache memories

TL;DR: This work focuses on transient fault tolerance in primary cache memories and develops new architectural solutions, to maximize fault coverage when the budgeted silicon area is not sufficient for the conventional configuration of an error checking code.
References
Book

Testing Semiconductor Memories: Theory and Practice

TL;DR: Memory modeling; functional testing: reduced functional RAM chip model, functional RAM chip testing, functional ROM chip testing, functional memory array testing, functional memory board testing; electrical testing: parametric testing, dynamic testing; on-chip testing; conclusions; address line scrambling; various proofs; software package.
Journal Article

Concurrent error detection using watchdog processors-a survey

TL;DR: It is shown that a large number of errors can be detected by monitoring the control flow and memory-access behavior and two techniques for control-flow checking are discussed and compared with current error-detection techniques.
Proceedings Article

Evaluation of error detection schemes using fault injection by heavy-ion radiation

TL;DR: Several concurrent error detection schemes suitable for a watchdog processor were evaluated by fault injection, and soft errors were induced into an MC6809E microprocessor by heavy-ion radiation from a Californium-252 source to characterize the errors and determine coverage and latency for the various error detection schemes.