Open Access, Journal Article

New Techniques for Improving the Performance of the Lockstep Architecture for SEEs Mitigation in FPGA Embedded Processors

Abstract
The growing availability of embedded processors inside FPGAs gives system designers unprecedented flexibility. The use of such devices in space or mission-critical applications, however, is being delayed by the lack of effective low-cost techniques to mitigate radiation-induced errors. This paper presents a non-invasive approach to implementing fault-tolerant systems based on COTS processors embedded in FPGAs, using lockstep in conjunction with checkpoint and rollback recovery. The approach requires no modifications to the processor architecture or to the application software. The paper describes the experimental validation of the approach through fault injection and discusses the results. It also proposes and evaluates the addition of a write history table as a means of reducing the performance overhead imposed by previous implementations.
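The recovery scheme named in the abstract can be illustrated with a small software model. The Python sketch below is not the authors' hardware design. It assumes two identical toy processor copies (`Lane`, a hypothetical name) that run the same made-up workload, a checker that compares their full state at fixed intervals, and one plausible reading of a write history table: a log of (address, previous value) pairs that lets rollback undo only the memory locations written since the last checkpoint. The workload, the `run` function, its parameters, and the fault model (a single transient bit flip in a register) are all assumptions made for illustration.

```python
class Lane:
    """One of two lockstep copies: registers, program counter, private memory.

    Hypothetical toy model, not the processor used in the paper.
    """

    def __init__(self, mem_size=8):
        self.regs = [0, 0]
        self.pc = 0
        self.mem = [0] * mem_size
        self.wht = []  # assumed write history table: (address, previous value)
        self.ckpt = (list(self.regs), self.pc)

    def step(self):
        # Made-up workload: r0 += pc, then store r0 to memory.
        self.regs[0] += self.pc
        addr = self.pc % len(self.mem)
        self.wht.append((addr, self.mem[addr]))  # log old value before the write
        self.mem[addr] = self.regs[0]
        self.pc += 1

    def checkpoint(self):
        # Save only the processor context; memory is covered by the write log.
        self.ckpt = (list(self.regs), self.pc)
        self.wht.clear()

    def rollback(self):
        # Undo logged writes newest-first, then restore the saved context.
        for addr, old in reversed(self.wht):
            self.mem[addr] = old
        self.wht.clear()
        self.regs, self.pc = list(self.ckpt[0]), self.ckpt[1]

    def state(self):
        return (tuple(self.regs), self.pc, tuple(self.mem))


def run(total_steps, interval, fault_at=None):
    """Run two lanes in lockstep; return (final state, number of rollbacks)."""
    a, b = Lane(), Lane()
    rollbacks = 0
    injected = False
    while a.pc < total_steps:
        for _ in range(min(interval, total_steps - a.pc)):
            a.step()
            b.step()
            if fault_at is not None and not injected and a.pc == fault_at:
                a.regs[0] ^= 0x4  # transient bit flip in lane a only
                injected = True
        if a.state() == b.state():
            a.checkpoint()
            b.checkpoint()
        else:
            a.rollback()
            b.rollback()
            rollbacks += 1
    return a.state(), rollbacks


if __name__ == "__main__":
    clean, clean_rollbacks = run(8, 4)
    faulty, faulty_rollbacks = run(8, 4, fault_at=6)
    print(clean == faulty, clean_rollbacks, faulty_rollbacks)
```

In this model the checkpoint copies only the registers and program counter, and the write log restores memory on rollback, which is one way a write history could reduce checkpointing cost. Whether this matches the mechanism evaluated in the paper cannot be determined from the abstract alone.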

