
Showing papers on "Fault coverage" published in 2006


Book
01 Jan 2006
TL;DR: This book covers fault-detection methods (limit checking, signal models, process identification, parity equations, state observers, and Principal Component Analysis), fault-diagnosis methods based on classification and inference, their comparison and combination, and fault-tolerant design, with application examples ranging from DC motor drives to automotive suspensions.
Abstract: Fundamentals: supervision and fault management of processes (tasks and terminology); Reliability, Availability and Maintainability (RAM); Safety, Dependability and System Integrity. Fault-Detection Methods: process models and fault modelling; signal models; fault detection with limit checking; fault detection with signal models; fault detection with process-identification methods; fault detection with parity equations; fault detection with state observers and state estimation; fault detection of control loops; fault detection with Principal Component Analysis (PCA); comparison and combination of fault-detection methods. Fault-Diagnosis Methods: diagnosis procedures and problems; fault diagnosis with classification methods; fault diagnosis with inference methods. Fault-Tolerant Systems: fault-tolerant design; fault-tolerant components and control. Application Examples: fault detection and diagnosis of DC motor drives; of a centrifugal pump-pipe-system; of an automotive suspension and the tire pressures.

1,754 citations


Proceedings ArticleDOI
28 May 2006
TL;DR: The dilemma between a reduced testing effort and the diagnosis accuracy is partly solved by selecting test cases that are dedicated to diagnosis, and a test-for-diagnosis criterion is proposed and validated through rigorous case studies.
Abstract: The need for testing-for-diagnosis strategies has been identified for a long time, but the explicit link from testing to diagnosis (fault localization) is rare. Analyzing the type of information needed for efficient fault localization, we identify the attribute (called Dynamic Basic Block) that restricts the accuracy of a diagnosis algorithm. Based on this attribute, a test-for-diagnosis criterion is proposed and validated through rigorous case studies: it shows that a test suite can be improved to reach a high level of diagnosis accuracy. So, the dilemma between a reduced testing effort (with as few test cases as possible) and the diagnosis accuracy (which needs as many test cases as possible to get more information) is partly solved by selecting test cases that are dedicated to diagnosis.

226 citations


Book
14 Jun 2006
TL;DR: In this article, the authors present a single event UPSET (SEU) MITIGATION TECHNIQUE for FPGA-based CIRCUITS, where the UPSET is used to detect faults in the FPGAs.
Abstract: DEDICATION. CONTRIBUTING AUTHORS. PREFACE. 1 INTRODUCTION. 2 RADIATION EFFECTS IN INTEGRATED CIRCUITS: 2.1 Radiation Environment Overview; 2.2 Radiation Effects in Integrated Circuits; 2.2.1 SEU Classification; 2.3 Peculiar Effects in SRAM-based FPGAs. 3 SINGLE EVENT UPSET (SEU) MITIGATION TECHNIQUES: 3.1 Design-based Techniques; 3.1.1 Detection Techniques; 3.1.2 Mitigation Techniques; 3.1.2.1 Full Time and Hardware Redundancy; 3.1.2.2 Error Correction and Detection Codes; 3.1.2.3 Hardened Memory Cells; 3.2 Examples of SEU Mitigation Techniques in ASICs; 3.3 Examples of SEU Mitigation Techniques in FPGAs; 3.3.1 Antifuse-based FPGAs; 3.3.2 SRAM-based FPGAs; 3.3.2.1 SEU Mitigation Solution in High-level Description; 3.3.2.2 SEU Mitigation Solutions at the Architectural Level; 3.3.2.3 Recovery Technique. 4 ARCHITECTURAL SEU MITIGATION TECHNIQUES. 5 HIGH-LEVEL SEU MITIGATION TECHNIQUES: 5.1 Triple Modular Redundancy Technique for FPGAs; 5.2 Scrubbing. 6 TRIPLE MODULAR REDUNDANCY (TMR) ROBUSTNESS: 6.1 Test Design Methodology; 6.2 Fault Injection in the FPGA Bitstream; 6.3 Locating the Upset in the Design Floorplanning; 6.3.1 Bit Column Location in the Matrix; 6.3.2 Bit Row Location in the Matrix; 6.3.3 Bit Location in the CLB; 6.3.4 Bit Classification; 6.4 Fault Injection Results; 6.5 The 'Golden' Chip Approach. 7 DESIGNING AND TESTING A TMR MICRO-CONTROLLER: 7.1 Area and Performance Results; 7.2 TMR 8051 Micro-controller Radiation Ground Test Results. 8 REDUCING TMR OVERHEADS, PART I: 8.1 Duplication with Comparison Combined with Time Redundancy; 8.2 Fault Injection in the VHDL Description; 8.3 Area and Performance Results. 9 REDUCING TMR OVERHEADS, PART II: 9.1 DWC-CED Technique in Arithmetic-based Circuits; 9.1.1 Using CED Based on Hardware Redundancy; 9.1.2 Using CED Based on Time Redundancy; 9.1.3 Choosing the Most Appropriate CED Block; 9.1.3.1 Multipliers; 9.1.3.2 Arithmetic and Logic Unit (ALU); 9.1.3.3 Digital FIR Filter; 9.1.4 Fault Coverage Results; 9.1.5 Area and Performance Results; 9.2 Designing the DWC-CED Technique in Non-arithmetic-based Circuits. 10 FINAL REMARKS. REFERENCES.

171 citations


Journal ArticleDOI
TL;DR: In this paper, a procedure based on the continuous wavelet transform (CWT) for the analysis of voltage transients due to line faults and its application to fault location in power distribution systems are discussed.

158 citations


Journal ArticleDOI
TL;DR: In order to prevent the Advanced Encryption Standard (AES) from suffering from differential fault attacks, the technique of error detection can be adopted to detect the errors during encryption or decryption and then to provide the information for taking further action, such as interrupting the AES process or redoing the process.
Abstract: In order to prevent the Advanced Encryption Standard (AES) from suffering from differential fault attacks, the technique of error detection can be adopted to detect errors during encryption or decryption and then to provide the information for taking further action, such as interrupting the AES process or redoing the process. Because errors occur within a function, it is not easy to predict the output. Therefore, general error control codes are not suited for AES operations. In this work, several error-detection schemes have been proposed. These schemes are based on the (n+1, n) cyclic redundancy check (CRC) over GF(2^8), where n ∈ {4, 8, 16}. Because of the good algebraic properties of AES, specifically the MixColumns operation, these error-detection schemes are suitable for AES and efficient for hardware implementation; they may be designed using round-level, operation-level, or algorithm-level detection. The proposed schemes have high fault coverage. In addition, they are scalable and symmetrical. The scalability makes these schemes suitable for an AES circuit implemented in 8-bit, 32-bit, or 128-bit architecture. Symmetry also lets the encryption process and the decryption process share the same error-detection hardware. These schemes are also suitable for encryption-only or decryption-only cases. Error detection for the key schedule in AES is also proposed, based on the results derived for the data procedure of AES.
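The "good algebraic property" the abstract alludes to can be demonstrated in a few lines: each column of the MixColumns matrix sums to 0x01 in GF(2^8), so the XOR of a state column is invariant under MixColumns, which is what makes a CRC-style check byte cheap to predict across the round. The Python sketch below illustrates only that property, not the authors' full (n+1, n) CRC scheme.

```python
def xtime(b):
    """Multiply a GF(2^8) element by x (i.e., by 0x02) modulo x^8+x^4+x^3+x+1."""
    b <<= 1
    return (b ^ 0x11B) & 0xFF if b & 0x100 else b & 0xFF

def mix_column(a):
    """AES MixColumns applied to one 4-byte state column."""
    return [xtime(a[i])                                   # 02 * a[i]
            ^ xtime(a[(i + 1) % 4]) ^ a[(i + 1) % 4]      # 03 * a[i+1]
            ^ a[(i + 2) % 4] ^ a[(i + 3) % 4]             # 01 * a[i+2], a[i+3]
            for i in range(4)]

col = [0xDB, 0x13, 0x53, 0x45]        # example column from the AES specification
parity_in = col[0] ^ col[1] ^ col[2] ^ col[3]
out = mix_column(col)                 # -> [0x8E, 0x4D, 0xA1, 0xBC]
parity_out = out[0] ^ out[1] ^ out[2] ^ out[3]
assert parity_in == parity_out        # the check byte predicts through MixColumns
print(hex(parity_in), hex(parity_out))
```

A fault injected into any single byte of the column breaks this equality, which is how round-level detection flags the error.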

151 citations


Proceedings ArticleDOI
05 Nov 2006
TL;DR: In this paper, the authors propose a new type of failure proximity, called R-Proximity, which regards two failing traces as similar if they suggest roughly the same fault location.
Abstract: Recent software systems usually feature an automated failure reporting system, with which a huge number of failing traces are collected every day. In order to prioritize fault diagnosis, failing traces due to the same fault are expected to be grouped together. Previous methods, by hypothesizing that similar failing traces imply the same fault, cluster failing traces based on the literal trace similarity, which we call trace proximity. However, since a fault can be triggered in many ways, failing traces due to the same fault can be quite different. Therefore, previous methods actually group together traces exhibiting similar behaviors, like similar branch coverage, rather than traces due to the same fault. In this paper, we propose a new type of failure proximity, called R-Proximity, which regards two failing traces as similar if they suggest roughly the same fault location. The fault location each failing case suggests is automatically obtained with Sober, an existing statistical debugging tool. We show that with R-Proximity, failing traces due to the same fault can be grouped together. In addition, we find that R-Proximity is helpful for statistical debugging: It can help developers interpret and utilize the statistical debugging result. We illustrate the usage of R-Proximity with a case study on the grep program and some experiments on the Siemens suite, and the result clearly demonstrates the advantage of R-Proximity over trace proximity.
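A minimal sketch of the grouping idea, with a hypothetical statement-to-score map standing in for Sober's output: each failing trace is reduced to the location it implicates most strongly, and traces implicating the same location are grouped, however different they look literally.

```python
from collections import defaultdict

def suspected_location(failing_trace, suspiciousness):
    """Most suspicious statement exercised by this failing trace;
    `suspiciousness` is a statement -> score map standing in for Sober."""
    return max(failing_trace, key=lambda stmt: suspiciousness.get(stmt, 0.0))

def group_by_r_proximity(failing_traces, suspiciousness):
    groups = defaultdict(list)
    for trace in failing_traces:
        groups[suspected_location(trace, suspiciousness)].append(trace)
    return dict(groups)

scores = {"s1": 0.2, "s7": 0.9, "s9": 0.4}
traces = [["s1", "s7"], ["s7", "s9"], ["s1", "s9"]]
# The first two traces differ literally but both implicate s7, so they group together.
print(group_by_r_proximity(traces, scores))
```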

142 citations


Proceedings ArticleDOI
17 Sep 2006
TL;DR: This paper presents a new approach to prioritize test cases based on the coverage requirements present in the relevant slices of the outputs of test cases, and presents experimental results comparing the effectiveness of the prioritization approach with existing techniques that only account for total requirement coverage.
Abstract: Software testing and retesting occurs continuously during the software development lifecycle to detect errors as early as possible. The sizes of test suites grow as software evolves. Due to resource constraints, it is important to prioritize the execution of test cases so as to increase chances of early detection of faults. Prior techniques for test case prioritization are based on the total number of coverage requirements exercised by the test cases. In this paper, we present a new approach to prioritize test cases based on the coverage requirements present in the relevant slices of the outputs of test cases. We present experimental results comparing the effectiveness of our prioritization approach with that of existing techniques that only account for total requirement coverage, in terms of ability to achieve high rate of fault detection. Our results present interesting insights into the effectiveness of using relevant slices for test case prioritization.
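As a rough illustration (hypothetical inputs; computing the relevant slices themselves is outside the scope of this sketch), the prioritization can be viewed as a greedy "additional coverage" ordering in which each test is credited only with the requirements appearing in the relevant slice of its output:

```python
def prioritize(tests, slice_reqs):
    """Greedy 'additional' ordering: repeatedly pick the test covering the
    most not-yet-covered requirements from its relevant slice."""
    remaining, covered, order = set(tests), set(), []
    while remaining:
        best = max(remaining, key=lambda t: len(slice_reqs[t] - covered))
        if not slice_reqs[best] - covered:   # nothing new left: start a new pass
            covered = set()
            best = max(remaining, key=lambda t: len(slice_reqs[t]))
        order.append(best)
        covered |= slice_reqs[best]
        remaining.remove(best)
    return order

reqs = {"t1": {"r1", "r2"}, "t2": {"r2"}, "t3": {"r3", "r4", "r5"}}
print(prioritize(["t1", "t2", "t3"], reqs))   # ['t3', 't1', 't2']
```

The difference from total-coverage prioritization is entirely in what goes into `slice_reqs`: requirements exercised but irrelevant to the test's output are excluded, since faults there could not have affected the observed result.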

138 citations


Proceedings ArticleDOI
16 Oct 2006
TL;DR: In this paper, the authors propose a method to extend conventional fault analysis so that IIDG contribution can be estimated; the method gives rms profiles of the fault currents of interest (i.e., the IIDG contribution and the fault current the protective device will see) under both balanced and unbalanced fault conditions.
Abstract: This paper shows that the current an inverter interfaced distributed generator (IIDG) contributes to a fault varies considerably, due mainly to the fast response of its controller. The paper proposes a method to extend the conventional fault analysis methods so that the IIDG contribution can be estimated in the fault analysis. The proposed method gives rms profiles of the fault currents of interest (the IIDG contribution and the fault currents the protective device will see). Test results, based on a prototype feeder, show that the proposed approach can estimate the fault current contributions under both balanced and unbalanced fault conditions.
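The time-varying nature of the contribution can be caricatured as follows; the numbers and the first-order decay are purely illustrative, not the paper's model, but they show why a fixed-source fault calculation misestimates what the protective device sees:

```python
import math

def iidg_fault_rms(i_rated, i_initial, limit=1.5, tau_cycles=1.0, n_cycles=6):
    """Cycle-by-cycle rms profile: an initial transient decays toward the
    controller-enforced limit (limit * i_rated) with time constant tau_cycles."""
    i_limit = limit * i_rated
    return [round(i_limit + (i_initial - i_limit) * math.exp(-n / tau_cycles), 1)
            for n in range(n_cycles + 1)]

# A unit rated 100 A that briefly pushes 400 A settles near 150 A within a few cycles:
print(iidg_fault_rms(i_rated=100.0, i_initial=400.0))
```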

124 citations


Journal ArticleDOI
TL;DR: This survey provides an overview of the current state of the art in fault tolerance for FPGAs; assuming faults have already been detected and diagnosed, the methods presented target tolerating them.
Abstract: A wide range of fault tolerance methods for FPGAs have been proposed. Approaches range from simple architectural redundancy to fully on-line adaptive implementations. The applications of these methods also differ; some are used only for manufacturing yield enhancement, while others can be used in-system. This survey attempts to provide an overview of the current state of the art for fault tolerance in FPGAs. It is assumed that faults have been previously detected and diagnosed; the methods presented are targeted towards tolerating the faults. A detailed description of each method is presented. Where applicable, the methods are compared using common metrics. Results are summarized to present a succinct, comprehensive comparison of the different approaches.

110 citations


Proceedings ArticleDOI
22 Oct 2006
TL;DR: This work presents a thorough analysis of the effects of soft errors on a production-grade, fully synthesized implementation of an ARM926EJ-S embedded microprocessor and designs two orthogonal low-cost soft error protection techniques that can be tuned to achieve variable levels of fault coverage as a function of area and power constraints.
Abstract: Device scaling trends dramatically increase the susceptibility of microprocessors to soft errors. Further, mounting demand for embedded microprocessors in a wide array of safety-critical applications, ranging from automobiles to pacemakers, compounds the importance of addressing the soft error problem. Historically, soft error tolerance techniques have been targeted mainly at high-end server markets, leading to solutions such as coarse-grained modular redundancy and redundant multithreading. However, these techniques tend to be prohibitively expensive to implement in the embedded design space. To address this problem, we first present a thorough analysis of the effects of soft errors on a production-grade, fully synthesized implementation of an ARM926EJ-S embedded microprocessor. We then leverage this analysis in the design of two orthogonal low-cost soft error protection techniques that can be tuned to achieve variable levels of fault coverage as a function of area and power constraints. The first technique uses a small cache of live register values in order to provide nearly twice the fault coverage of a register file protected using traditional error correcting codes at little or no additional area cost. The second technique is a statistical method used to significantly reduce the overhead of deploying time-delayed shadow latches for low-latency fault detection.

94 citations


Journal ArticleDOI
TL;DR: In this article, a fuzzy logic-based algorithm to identify the type of faults in radial, unbalanced distribution systems has been developed; it accurately identifies the phase(s) involved in all ten types of shunt faults under different fault types, fault resistances, fault inception angles, system topologies and loading levels.
Abstract: In this paper, a fuzzy logic-based algorithm to identify the type of faults in radial, unbalanced distribution systems has been developed. The proposed technique is able to accurately identify the phase(s) involved in all ten types of shunt faults that may occur in an electric power distribution system under different fault types, fault resistances, fault inception angles, system topologies and loading levels. The proposed method needs only the three line current measurements available at the substation and can perform the fault classification task in about half a cycle. All the test results show that the proposed fault identifier is well suited for identifying fault types in radial, unbalanced distribution systems.
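A toy sketch of the classification idea (illustrative memberships and thresholds, not the authors' rule base): each phase's post-fault to pre-fault current ratio is mapped through a fuzzy membership, and phases with high membership are declared involved. Ground involvement, which the paper also identifies, would additionally need zero-sequence logic omitted here.

```python
def membership_high(ratio, lo=1.2, hi=2.0):
    """Fuzzy membership of 'this phase current is fault-level high'."""
    if ratio <= lo: return 0.0
    if ratio >= hi: return 1.0
    return (ratio - lo) / (hi - lo)      # linear ramp between the soft thresholds

def classify(pre, post, cut=0.5):
    """Return the phases implicated in the fault, e.g. {'A', 'B'} for an AB fault."""
    return {ph for ph in "ABC" if membership_high(post[ph] / pre[ph]) >= cut}

print(classify(pre={"A": 100, "B": 100, "C": 100},
               post={"A": 450, "B": 430, "C": 105}))   # {'A', 'B'}
```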

Journal ArticleDOI
M.M. Eissa1
TL;DR: In this paper, a new compensation method based on fault resistance calculation is presented; the calculation is based on monitoring the active power at the relay point, and the compensated fault impedance accurately measures the impedance between the relay location and the fault point.
Abstract: The fault resistance introduces an error into the fault distance estimate, and hence may cause unreliable operation of a distance relay. A new compensation method based on fault resistance calculation is presented. The fault resistance calculation is based on monitoring the active power at the relay point. The compensated fault impedance accurately measures the impedance between the relay location and the fault point. The relay has shown satisfactory performance under various fault conditions, especially for ground faults with high fault resistance. This new compensation method avoids the under-reach problem in ground distance relays.
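The under-reach problem being compensated can be stated in one line: the apparent impedance seen by the relay is the line impedance up to the fault plus a term contributed by the fault resistance. A simplified single-end sketch (illustrative symbols and values, not Eissa's formulation):

```python
def apparent_impedance(z_line_per_km, d_km, r_fault, k_infeed=1.0):
    """Z_apparent = d * z_line + k * R_f. Without compensation, the R_f term
    inflates the estimate and the relay under-reaches; k_infeed models how
    remote infeed scales the fault-resistance term."""
    return d_km * z_line_per_km + k_infeed * r_fault

z = apparent_impedance(z_line_per_km=0.1 + 0.4j, d_km=50, r_fault=20)
print(z)   # (25+20j): the 20-ohm fault resistance masquerades as extra line length
```

The paper's method estimates R_f from the active power measured at the relay point and subtracts its effect, leaving only the relay-to-fault impedance.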

Proceedings ArticleDOI
24 Jul 2006
TL;DR: This paper proposes a systematic SBST methodology that enhances existing SBST programs so that they comprehensively test the pipeline logic, and applies it to two complex benchmark RISC processors under two fault models: stuck-at and transition delay.
Abstract: Software-based self-test (SBST) has recently emerged as an effective methodology for the manufacturing test of processors and other components in Systems-on-Chip (SoCs). By moving test related functions from external resources to the SoC's interior, in the form of test programs that the on-chip processor executes, SBST eliminates the need for high-cost testers, and enables high-quality at-speed testing. Thus far, SBST approaches have focused almost exclusively on the functional (directly programmer visible) components of the processor. In this paper, we analyze the challenges involved in testing an important component of modern processors, namely, the pipelining logic, and propose a systematic SBST methodology to address them. We first demonstrate that SBST programs that only target the functional components of the processor are insufficient to test the pipeline logic, resulting in a significant loss of fault coverage. We further identify the testability hotspots in the pipeline logic. Finally, we develop a systematic SBST methodology that enhances existing SBST programs to comprehensively test the pipeline logic. The proposed methodology is complementary to previous SBST techniques that target functional components (their results can form the input to our methodology), and can reuse the test development effort behind existing SBST programs. We applied the methodology to two complex, fully pipelined processors. Results show that our methodology provides fault coverage improvements of up to 15% (12% on average) for the entire processor, and fault coverage improvements of 22% for the pipeline logic, compared to a conventional SBST approach.

Journal ArticleDOI
TL;DR: Experimental results demonstrate that LT-RTPGs designed using the proposed methodology decrease switching activity during BIST by significant amounts while providing high fault coverage.
Abstract: A new built-in self-test (BIST) test pattern generator (TPG) design, called low-transition random TPG (LT-RTPG), is presented. An LT-RTPG is composed of a linear feedback shift register (LFSR), a κ-input AND gate, and a T flip-flop. When used to generate test patterns for test-per-scan BIST, it decreases the number of transitions that occur during scan shifting and, hence, decreases switching activity during testing. Various properties of LT-RTPGs are identified and a methodology for their design is presented. Experimental results demonstrate that LT-RTPGs designed using the proposed methodology decrease switching activity during BIST by significant amounts while providing high fault coverage.
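A behavioral sketch of the LT-RTPG structure just described, with an arbitrarily chosen 8-bit LFSR polynomial and tap positions: because the κ-input AND output is rarely 1, the T flip-flop seldom toggles, so consecutive scan-in bits are usually equal and shifting causes few transitions.

```python
def lt_rtpg_bits(seed, n_bits, k=3):
    """8-bit LFSR (taps for x^8 + x^6 + x^5 + x^4 + 1, chosen arbitrarily),
    k-input AND over the low k LFSR bits, T flip-flop on the AND output."""
    state, t_ff, out = seed & 0xFF, 0, []
    for _ in range(n_bits):
        and_out = 1
        for tap in range(k):                  # k-input AND gate
            and_out &= (state >> tap) & 1
        t_ff ^= and_out                       # T-FF toggles only when AND = 1
        out.append(t_ff)                      # T-FF output feeds the scan chain
        fb = ((state >> 7) ^ (state >> 5) ^ (state >> 4) ^ (state >> 3)) & 1
        state = ((state << 1) | fb) & 0xFF
    return out

bits = lt_rtpg_bits(seed=0x5A, n_bits=32)
print(bits)
print("scan-in transitions:", sum(a != b for a, b in zip(bits, bits[1:])))
```

Raising k lowers the toggle probability (roughly 2^-k per shift) and hence the switching activity, at some cost in pattern randomness; that trade-off is what the design methodology in the paper tunes.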

Proceedings ArticleDOI
06 Nov 2006
TL;DR: A method to enhance fault localization for software systems based on a frequent pattern mining algorithm that identifies frequent subtrees in successful and failing test executions to rank functions according to their likelihood of containing a fault.
Abstract: We present a method to enhance fault localization for software systems based on a frequent pattern mining algorithm. Our method is based on a large set of test cases for a given set of programs in which faults can be detected. The test executions are recorded as function call trees. Based on test oracles the tests can be classified into successful and failing tests. A frequent pattern mining algorithm is used to identify frequent subtrees in successful and failing test executions. This information is used to rank functions according to their likelihood of containing a fault. The ranking suggests an order in which to examine the functions during fault analysis. We validate our approach experimentally using a subset of Siemens benchmark programs.
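The final ranking step can be sketched simply (reducing mined subtree patterns to plain function occurrence; the paper's frequent-subtree mining is not reproduced here): score each function by how overrepresented it is in failing runs relative to passing runs, and examine functions in descending score.

```python
def rank_functions(failing_runs, passing_runs):
    """Each run is the set of functions observed in its call tree."""
    funcs = set().union(*failing_runs, *passing_runs)
    def score(f):
        fail = sum(f in run for run in failing_runs) / len(failing_runs)
        ok = sum(f in run for run in passing_runs) / len(passing_runs)
        return fail - ok       # likelihood proxy: overrepresented in failures
    return sorted(funcs, key=score, reverse=True)

failing = [{"parse", "eval"}, {"parse", "eval", "print"}]
passing = [{"parse", "print"}, {"print"}]
print(rank_functions(failing, passing))   # 'eval' first, then 'parse', 'print'
```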

Journal ArticleDOI
30 Nov 2006
TL;DR: In this article, a directional relay algorithm for EHV transmission lines using positive-sequence fault components is presented, where the phase relationship between the voltage and current measured at the relay point is compared to determine whether a fault is in the forward or backward direction.
Abstract: A directional relay algorithm for EHV transmission lines using positive-sequence fault components is presented. By comparing the phase relationship between the voltage and current measured at the relay point, the algorithm can determine correctly whether a fault is in the forward or backward direction. Specially designed techniques and logic are adopted to solve the difficult problems that exist in a real system. The signal-processing procedure for extracting the required fault components is provided in detail. Extensive simulation studies were conducted on a 500 kV system model using EMTDC. Theoretical analysis and simulation results show that the proposed algorithm provides adequate sensitivity, reliability and a fast operating response under a variety of system and fault conditions. In addition, it provides significant advantages over conventional directional relays, and these are discussed in the paper.
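The core decision can be sketched as a phase comparison on positive-sequence fault components (post-fault minus pre-fault phasors). This simplified model, with an assumed source impedance angle, omits the paper's filtering and special-case logic:

```python
import cmath, math

def direction(dV1, dI1, z_angle_deg=85.0):
    """For a forward fault, dV1 = -Z_source * dI1, so arg(dV1/dI1) sits near
    z_angle_deg - 180 (about -95 deg); for a backward fault the relay sees the
    line plus the remote source, so the angle sits near +z_angle_deg."""
    ang = math.degrees(cmath.phase(dV1 / dI1))
    diff = (ang - (z_angle_deg - 180.0) + 180.0) % 360.0 - 180.0  # wrap to +-180
    return "forward" if abs(diff) < 90.0 else "backward"

# Fault-component voltage is the drop across the local source impedance:
Zs = cmath.rect(5.0, math.radians(85.0))
dI1 = cmath.rect(100.0, math.radians(-30.0))
print(direction(-Zs * dI1, dI1))   # forward
print(direction(Zs * dI1, dI1))    # backward (angle near +85 degrees)
```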

Proceedings ArticleDOI
01 Aug 2006
TL;DR: This paper describes the modeling of a power distribution system and its protective relaying to obtain an extensive fault database using the capabilities of ATP and Matlab; a methodology to perform automatic simulations and a database with 930 fault situations in a 25 kV test system are obtained.
Abstract: Opportune fault location in power distribution systems is an important aspect of power quality, and especially of maintaining good continuity indexes. Fault location methods that use more information than the RMS values of voltage and current are commonly known as Knowledge Based Methods (KBM). These require a complete fault database to adequately perform the training and validation stages and, as a consequence, to successfully perform the fault location task. In this paper, the modeling of a power distribution system and its protective relaying to obtain an extensive fault database using the capabilities of ATP and Matlab is described. The obtained database can be used to perform different types of system analysis and, in this specific case, to solve the problem of fault location in power distribution systems. As a result, a methodology to perform automatic simulations and a database with 930 fault situations in a 25 kV test system was obtained.

Proceedings ArticleDOI
04 Oct 2006
TL;DR: The authors present a parity-based fault detection architecture for the S-box, aimed at designing high-performance fault detection structures for the Advanced Encryption Standard, and propose a parity-based fault detection scheme for reaching maximum fault coverage.
Abstract: In this paper, the authors present a parity-based fault detection architecture of the S-box for designing high-performance fault detection structures for the Advanced Encryption Standard. Instead of using look-up tables for the S-box and its parity prediction, logical gate implementations based on the composite field are utilized. After analyzing the error propagation for injected single faults, the authors modify the original S-box and suggest a fault detection architecture for the S-box. Using the closed formulations for the predicted parity bits, the authors propose a parity-based fault detection scheme for reaching the maximum fault coverage. Moreover, the overhead costs, including the space complexity and time delay of the modified S-box and the parity predictions, are also compared to those of previously reported ones.

Journal ArticleDOI
TL;DR: A comprehensive review of the design techniques for low-power TCAMs is presented and a novel test methodology for various TCAM components is proposed.
Abstract: Ternary content addressable memories (TCAMs) are gaining importance in high-speed lookup-intensive applications. However, their high cost and power consumption limit their popularity and versatility. TCAM testing is also time consuming due to the complex integration of logic and memory. In this paper, we present a comprehensive review of the design techniques for low-power TCAMs. We also propose a novel test methodology for various TCAM components. The proposed test algorithms show significant improvement over the existing algorithms in both test complexity and fault coverage.

Proceedings ArticleDOI
01 Oct 2006
TL;DR: This paper presents a new method for at-speed structural test of ASICs, having no tight restrictions on the circuit design, and describes a method to test asynchronous clock domains simultaneously.
Abstract: At-speed test of integrated circuits is becoming critical to detect subtle delay defects. Existing structural at-speed test methods are inadequate because they are unable to supply sufficiently-varied functional clock sequences to test complex sequential logic. Moreover, they require tight restrictions on the circuit design. In this paper, we present a new method for at-speed structural test of ASICs, having no tight restrictions on the circuit design. In the present implementation, any complex at-speed functional clock waveform for 16 cycles can be applied. We present DFT structures that can generate high-speed launch-off-capture as well as launch-off-scan clocking without the need to switch a scan enable at-speed. We also describe a method to test asynchronous clock domains simultaneously. Experimental results on fault coverage and hardware measurements for three multi-million gate ASICs demonstrate the feasibility of the proposed approach.

Journal ArticleDOI
TL;DR: A multiple-fault-diagnosis methodology based on the analysis of failing patterns and the structure of diagnosed circuits that has an approximately linear time complexity with respect to the fault multiplicity and achieves a high diagnostic resolution for multiple faults.
Abstract: In this paper, we propose a multiple-fault-diagnosis methodology based on the analysis of failing patterns and the structure of diagnosed circuits. We do not consider the multiple-fault behavior explicitly, but rather partition the failing outputs and use an incremental simulation-based technique to diagnose failures one at a time. Our methodology can be further improved by selecting appropriate diagnostic test patterns. The n-detection tests allow us to apply a simple single-fault-based diagnostic algorithm, and yet achieve good diagnosability for multiple faults. Experimental results demonstrate that our technique is highly efficient and effective. It has an approximately linear time complexity with respect to the fault multiplicity and achieves a high diagnostic resolution for multiple faults. Real manufactured industrial chips affected by multiple faults can be diagnosed in minutes of central processing unit (CPU) time.

Journal ArticleDOI
TL;DR: In this paper, a fault location scheme for transmission systems consisting of an overhead line combined with an underground power cable is presented, which can be used on-line or off-line using the data stored in the digital fault recording apparatuses.

Proceedings ArticleDOI
20 Oct 2006
TL;DR: This paper attempts to better understand Slipstream's fault tolerance, conjecturing that the mixture of partial duplication and confident predictions actually closely approximates the coverage of full duplication, and proposes and evaluates a suite of simple microarchitectural alterations to recovery and checking.
Abstract: Redundant threading architectures duplicate all instructions to detect and possibly recover from transient faults. Several lighter weight Partial Redundant Threading (PRT) architectures have been proposed recently. (i) Opportunistic Fault Tolerance duplicates instructions only during periods of poor single-thread performance. (ii) ReStore does not explicitly duplicate instructions and instead exploits mispredictions among highly confident branch predictions as symptoms of faults. (iii) Slipstream creates a reduced alternate thread by replacing many instructions with highly confident predictions. We explore PRT as a possible direction for achieving the fault tolerance of full duplication with the performance of single-thread execution. Opportunistic and ReStore yield partial coverage since they are restricted to using only partial duplication or only confident predictions, respectively. Previous analysis of Slipstream fault tolerance was cursory and concluded that only duplicated instructions are covered. In this paper, we attempt to better understand Slipstream's fault tolerance, conjecturing that the mixture of partial duplication and confident predictions actually closely approximates the coverage of full duplication. A thorough dissection of prediction scenarios confirms that faults in nearly 100% of instructions are detectable. Fewer than 0.1% of faulty instructions are not detectable due to coincident faults and mispredictions. Next we show that the current recovery implementation fails to leverage this excellent detection capability, since recovery sometimes initiates belatedly, after already retiring a detected faulty instruction. We propose and evaluate a suite of simple microarchitectural alterations to recovery and checking. Using the best alterations, Slipstream can recover from faults in 99% of instructions, compared to only 78% of instructions without alterations. Both results are much higher than predicted by past research, which claims coverage for only duplicated instructions, or 65% of instructions. On an 8-issue SMT processor, Slipstream performs within 1.3% of single-thread execution whereas full duplication slows performance by 14%. A key byproduct of this paper is a novel analysis framework in which every dynamic instruction is considered to be hypothetically faulty, thus not requiring explicit fault injection. Fault coverage is measured in terms of the fraction of candidate faulty instructions that are directly or indirectly detectable before retirement.

Proceedings ArticleDOI
01 Oct 2006
TL;DR: This work describes a new diagnostic ATPG implementation that uses a generalized fault model and shows that diagnostic resolution can be significantly enhanced over a traditional diagnostic test set aimed only at stuck-at faults.
Abstract: It is now generally accepted that the stuck-at fault model is no longer sufficient for many manufacturing test activities. Consequently, diagnostic test pattern generation based solely on distinguishing stuck-at faults is unlikely to achieve the resolution required for emerging fault types. In this work we describe a new diagnostic ATPG implementation that uses a generalized fault model. It can be easily used in any diagnosis framework to refine diagnostic resolution for complex defects. For various types of faults that include, for example, bridge, transition, and transistor stuck-open, we show that diagnostic resolution can be significantly enhanced over a traditional diagnostic test set aimed only at stuck-at faults. Finally, we illustrate the use of our diagnostic ATPG to distinguish faults derived from a state-of-the-art diagnosis flow based on layout.

Journal ArticleDOI
TL;DR: A simulator for resistive-bridging and stuck-at faults based on electrical equations rather than table look-up is presented, thus exposing more flexibility; the interaction of fault effects in the current time frame and earlier time frames is elaborated on.
Abstract: The authors present a simulator for resistive-bridging and stuck-at faults. In contrast to earlier work, it is based on electrical equations rather than table look-up, thus exposing more flexibility. For the first time, simulation of sequential circuits is dealt with; the interaction of fault effects in the current time frame and earlier time frames is elaborated on for different bridge resistances. Experimental results are given for resistive-bridging and stuck-at faults in combinational and sequential circuits. Different definitions of fault coverage are listed, and quantitative results with respect to all these definitions are given for the first time.
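The electrical-equation approach can be illustrated on the simplest case, two gates fighting across a bridge (illustrative resistances and threshold, not the simulator's device models): the bridged nodes settle on a resistive divider, and whether a downstream input sees a wrong logic value depends on the bridge resistance.

```python
def bridge_voltages(r_bridge, r_pull_up=2000.0, r_pull_down=1000.0, vdd=1.2):
    """Divider chain: VDD - R_up - node_a - R_bridge - node_b - R_down - GND."""
    i = vdd / (r_pull_up + r_bridge + r_pull_down)
    v_a = vdd - i * r_pull_up     # node whose driver tries to pull it to 1
    v_b = i * r_pull_down         # node whose driver tries to pull it to 0
    return v_a, v_b

def logic_value(v, v_th=0.6):
    """How a downstream input with threshold v_th interprets the node voltage."""
    return 1 if v > v_th else 0

for r in (0.0, 1000.0, 10000.0):  # the fault effect depends on bridge resistance
    va, vb = bridge_voltages(r)
    print(f"R={r:>7}: Va={va:.2f}V -> {logic_value(va)}, "
          f"Vb={vb:.2f}V -> {logic_value(vb)}")
```

Running this shows a hard short (R=0) corrupting the node driven to 1, while a 10 kΩ bridge leaves both logic values intact; in between, detectability depends on each downstream gate's threshold, which is exactly why fault coverage must be defined per resistance range.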

Proceedings ArticleDOI
11 Sep 2006
TL;DR: In this article, a new fault model, labeled crosspoint faults, is proposed for reversible logic circuits and a randomized Automatic Test Pattern Generation algorithm targeting this specific kind of fault is introduced and analyzed.
Abstract: Reversible logic computing is a rapidly developing research area. Testing such circuits is obviously an important issue. In this paper, we consider a new fault model, labeled crosspoint faults, for reversible logic circuits. A randomized Automatic Test Pattern Generation algorithm targeting this specific kind of fault is introduced and analyzed. Simulation results show that the algorithm yields very good performance. The relationship between the crosspoint faults and stuck-at faults is also investigated. We show that the crosspoint fault model is a better fault model for reversible circuits since it dominates the traditional stuck-at fault model in most instances.
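A crosspoint fault adds or removes a control connection of a gate in the reversible circuit. The sketch below simulates a "missing control" fault on a small Toffoli cascade (illustrative circuit; the paper's ATPG draws test vectors at random, while we enumerate here for brevity): a vector detects the fault iff the faulty and fault-free outputs differ.

```python
from itertools import product

def run(circuit, bits):
    """circuit: list of (controls, target) Toffoli gates acting on a bit list."""
    bits = list(bits)
    for controls, target in circuit:
        if all(bits[c] for c in controls):
            bits[target] ^= 1
    return bits

def missing_control(circuit, gate, ctrl):
    """Inject a crosspoint fault: drop one control connection from one gate."""
    faulty = [(list(cs), t) for cs, t in circuit]
    faulty[gate][0].remove(ctrl)
    return faulty

good = [([0, 1], 2), ([2], 1)]     # a Toffoli gate then a CNOT, on 3 lines
bad = missing_control(good, gate=0, ctrl=1)

for vec in product([0, 1], repeat=3):
    if run(good, vec) != run(bad, vec):
        print("test", vec, "detects the missing-control fault")
        break
```

Because reversible circuits are bijective, any internal difference propagates to the outputs, which is part of why random vectors do well against this fault model.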

Journal ArticleDOI
TL;DR: In this paper, the problems of fault distinguishability in the case of a multiple-valued evaluation of fault symptoms are discussed, and fault detectability factors are examined using the example of an actuation system.

Proceedings ArticleDOI
06 Mar 2006
TL;DR: This paper presents the first systematic study of the relationship between over testing prevention and test compression; the results emphasize the severity of over testing in scan-based delay test.
Abstract: We present an approach to prevent over testing in scan-based delay test. The test data is transformed with respect to functional constraints while simultaneously keeping as many positions as possible unspecified in order to facilitate test compression. The method is independent of the employed delay fault model, ATPG algorithm and test compression technique, and it is easy to integrate into an existing flow. Experimental results emphasize the severity of over testing in scan-based delay test. The influence of different functional constraints on the amount of required test data and the compression efficiency is investigated. To the best of our knowledge, this is the first systematic study of the relationship between over testing prevention and test compression.

Journal ArticleDOI
TL;DR: A novel integrated approach for delay-fault testing in external (automatic-test-equipment-based) and test-per-scan built-in self-test (BIST) using on-die delay sensing and test point insertion is proposed and a robust, low-overhead, and process-tolerant on-chip delay-sensing circuit is designed.
Abstract: A novel integrated approach for delay-fault testing in external (automatic-test-equipment-based) and test-per-scan built-in self-test (BIST) using on-die delay sensing and test point insertion is proposed. A robust, low-overhead, and process-tolerant on-chip delay-sensing circuit is designed for this purpose. An algorithm is also developed to judiciously insert delay-sensor circuits at the internal nodes of logic blocks for improving delay-fault coverage with little or no impact on the critical-path delay. The proposed delay-fault testing approach is verified for transition- and segment-delay-fault models. Experimental results for external testing (BIST) show up to 31% (30%) improvement in fault coverage and up to 67.5% (85.5%) reduction in test length for transition faults. An increase in the number of robustly detectable critical-path segments of up to 54% and a reduction in test length for the segment-delay-fault model of up to 76% were also observed. The delay and area overhead due to insertion of the delay-sensing hardware have been limited to 2% and 4%, respectively.

Journal ArticleDOI
01 Jul 2006 - Ubiquity
TL;DR: This article aims to present a survey of important software based (or software controlled) fault tolerance literature over the period of 1966 to 2006.
Abstract: This article aims to present a survey of important software based (or software controlled) fault tolerance literature over the period of 1966 to 2006. Nowadays, fault tolerance is a much researched topic. A system fails because of incorrect specification, incorrect design, design flaws, poor testing, undetected fault, environment, substandard implementation, aging component, operator errors or combination of these causes [1,7]. Modern microprocessors having faster and denser transistors with lower threshold voltages and tighter noise margins make them less reliable [6] but such transistors yield performance enhancements [4,5]. At the same time, such transistors render processors more susceptible to transient faults. Transient faults are intermittent faults that are caused by external events or by the environment [7], for examples, energetic particles [84,93] striking the chip or electrical surges [1,3] etc. Though these faults do not cause permanent faults [2], but they may result in incorrect program execution by inadvertently altering processors' states, signal transfers or stored values on registers etc. If a fault of such type affects program's normal execution, it is considered to be a soft error [2,8,85]. Though programming bugs is considered to be an important reason of the most system failures at present but the recent studies suggest that soft errors are increasingly responsible for system downtime. Computing system is becoming more complex and is getting optimized for performance and price but not for availability. This makes soft errors an even more common case. Using denser, smaller and lower voltage transistors has the potential threats to be more susceptible [92] to such increased transient errors. Soft errors