
Showing papers on "Fault coverage" published in 1999


Proceedings ArticleDOI
15 Jun 1999
TL;DR: A new time redundancy fault-tolerant approach in which a program is duplicated and the two redundant programs simultaneously run on the processor: the technique exploits several significant microarchitectural trends to provide broad coverage of transient faults and restricted coverage of permanent faults.
Abstract: This paper speculates that technology trends pose new challenges for fault tolerance in microprocessors. Specifically, severely reduced design tolerances implied by gigahertz clock rates may result in frequent and arbitrary transient faults. We suggest that existing fault-tolerant techniques (system-level, gate-level, or component-specific approaches) are either too costly for general-purpose computing, overly intrusive to the design, or insufficient for covering arbitrary logic faults. An approach in which the microarchitecture itself provides fault tolerance is required. We propose a new time-redundancy fault-tolerant approach in which a program is duplicated and the two redundant programs run simultaneously on the processor: the technique exploits several significant microarchitectural trends to provide broad coverage of transient faults and restricted coverage of permanent faults. These trends are simultaneous multithreading, control flow and data flow prediction, and hierarchical processors, all of which are intended for higher performance but can be easily leveraged for the specified fault tolerance goals. The overhead for achieving fault tolerance is low, both in terms of performance and changes to the existing microarchitecture. Detailed simulations of five of the SPEC95 benchmarks show that executing two redundant programs on the fault-tolerant microarchitecture takes only 10% to 30% longer than running a single version of the program.
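The duplicate-and-compare principle behind this approach can be illustrated in plain software, even though the paper realizes it inside an SMT microarchitecture. The sketch below is a hedged software analogy only: the workload, the injected bit flip, and all function names are invented for illustration, not taken from the paper.

```python
# Hedged software analogy of the time-redundancy idea: run two copies of the same
# computation and compare their results; a mismatch flags a (transient) fault.
from concurrent.futures import ThreadPoolExecutor

def compute(x, fault_injected=False):
    """Toy workload; a transient fault is modeled as a corrupted intermediate value."""
    acc = 0
    for i in range(1, x + 1):
        acc += i * i
    if fault_injected:          # bit-flip-like corruption of the result
        acc ^= 1 << 7
    return acc

def redundant_run(x, fault_in_copy_b=False):
    with ThreadPoolExecutor(max_workers=2) as pool:
        a = pool.submit(compute, x)
        b = pool.submit(compute, x, fault_in_copy_b)
        ra, rb = a.result(), b.result()
    if ra != rb:
        raise RuntimeError("transient fault detected: redundant copies disagree")
    return ra

print(redundant_run(1000))                      # fault-free run
try:
    redundant_run(1000, fault_in_copy_b=True)   # injected fault is caught
except RuntimeError as e:
    print(e)
```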

507 citations


Proceedings ArticleDOI
28 Sep 1999
TL;DR: The experimental results demonstrate that with automation of the proposed solutions, logic BIST can achieve test quality approaching that of ATPG with minimal area overhead and few changes to the design flow.
Abstract: This paper discusses practical issues involved in applying logic built-in self-test (BIST) to four large industrial designs. These multi-clock designs, ranging in size from 200 K to 800 K gates, pose significant challenges to logic BIST methodology, flow, and tools. The paper presents the process of generating a BIST-compliant core along with the logic BIST controller for at-speed testing. Comparative data on fault grades and area overhead between automatic test pattern generation (ATPG) and logic BIST are reported. The experimental results demonstrate that with automation of the proposed solutions, logic BIST can achieve test quality approaching that of ATPG with minimal area overhead and few changes to the design flow.

324 citations


Proceedings ArticleDOI
28 Sep 1999
TL;DR: The design modifications include some gating logic for masking the scan path activity during shifting, and the synthesis of additional logic for suppressing random patterns which do not contribute to increasing the fault coverage.
Abstract: Power consumption of digital systems may increase significantly during testing. In this paper, systems equipped with a scan-based built-in self-test like the STUMPS architecture are analyzed, the modules and modes with the highest power consumption are identified, and design modifications to reduce power consumption are proposed. The design modifications include some gating logic for masking the scan path activity during shifting, and the synthesis of additional logic for suppressing random patterns which do not contribute to increasing the fault coverage. These design changes reduce power consumption during BIST by several orders of magnitude, at very low cost in terms of area and performance.

312 citations


Journal ArticleDOI
TL;DR: This paper evaluates the concurrent error detection capabilities of system-level checks using fault and error injection, and proposes Enhanced Control-Flow Checking Using Assertions (ECCA) for detecting control-flow errors.
Abstract: This paper evaluates the concurrent error detection capabilities of system-level checks, using fault and error injection. The checks comprise application and system level mechanisms to detect control flow errors. We propose Enhanced Control-Flow Checking Using Assertions (ECCA). In ECCA, branch-free intervals (BFIs) in a given high- or intermediate-level program are identified and the entry and exit points of the intervals are determined. BFIs are then grouped into blocks, the size of which is determined through a performance/overhead analysis. The blocks are then fortified with preinserted assertions. For the high-level ECCA, we describe an implementation of ECCA through a preprocessor that automatically inserts the necessary assertions into the program. Then, we describe the intermediate-level implementation, made possible through modifications to gcc that make it ECCA-capable. The fault detection capabilities of the checks are evaluated both analytically and experimentally. Fault injection experiments are conducted using FERRARI to determine the fault coverage of the proposed techniques.
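The assertion scheme can be illustrated with a deliberately simplified sketch. The real ECCA assigns prime block identifiers and uses arithmetic assertions that trap on divide-by-zero; the sketch below replaces that machinery with a plain signature variable that checks each branch-free block is entered only from a legal predecessor. The toy program, block names, and checker class are invented.

```python
# Simplified, hedged illustration of assertion-based control-flow checking in the
# spirit of ECCA (not its exact prime-number assertions).
class ControlFlowError(Exception):
    pass

class CFChecker:
    def __init__(self):
        self.sig = "START"                       # runtime control-flow signature

    def enter(self, block, legal_predecessors):
        if self.sig not in legal_predecessors:   # preinserted entry assertion
            raise ControlFlowError(f"illegal entry into {block} from {self.sig}")
        self.sig = block                         # signature update at block entry

def classify(x, cfc):
    cfc.enter("B1", {"START"})
    y = abs(x)
    if y > 10:
        cfc.enter("B2", {"B1"})
        label = "large"
    else:
        cfc.enter("B3", {"B1"})
        label = "small"
    cfc.enter("B4", {"B2", "B3"})
    return label

print(classify(42, CFChecker()))        # fault-free run passes all assertions
bad = CFChecker()
bad.sig = "B7"                          # emulate a control-flow error before entry
try:
    classify(1, bad)
except ControlFlowError as e:
    print(e)
```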

251 citations


Proceedings ArticleDOI
26 Apr 1999
TL;DR: A test vector inhibiting technique is proposed which tackles the increased activity during test operation, and a mixed solution based on a reseeding scheme and the vector inhibiting technique is proposed to deal with hard-to-test circuits that contain pseudo-random-resistant faults.
Abstract: During self-test, the switching activity of the circuit under test is significantly increased compared to normal operation and leads to an increased power consumption which often exceeds specified limits. In the first part of this paper, we propose a test vector inhibiting technique which tackles the increased activity during test operation. Next, a mixed solution based on a reseeding scheme and the vector inhibiting technique is proposed to deal with hard-to-test circuits that contain pseudo-random resistant faults. From a general point of view, the goal of these techniques is to minimize the total energy consumption during test and to allow the test at system speed in order to achieve high fault coverage. The effectiveness of the proposed low energy BIST scheme has been validated on a set of benchmarks with respect to hardware overhead and power savings.

168 citations


Journal ArticleDOI
01 Jan 1999
TL;DR: In this paper, a fault locator unit is used to capture the high-frequency voltage transient signal generated by faults on the distribution line/cable, and the travelling time of the high-frequency components is used to determine the fault position.
Abstract: A technique is presented for accurate fault location on distribution overhead lines and underground cables. A specially designed fault locator unit is used to capture the high-frequency voltage transient signal generated by faults on the distribution line/cable. The travelling time of the high-frequency components is used to determine the fault position. The technique is insensitive to fault type, fault resistance, fault inception angle and system source configuration, and is able to offer very high accuracy in fault location in a distribution system.
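The underlying relationship is simple enough to state as a worked example: the distance to the fault follows from the wave propagation velocity and the delay between the first arriving transient and its reflection from the fault point. The velocity and time stamps below are invented example values, and this single-ended formula is a textbook simplification rather than the paper's full method.

```python
# Hedged numeric illustration of single-ended travelling-wave fault location.
v = 2.9e8            # assumed propagation velocity of the high-frequency wave, m/s
t_first   = 112.0e-6 # s, arrival of the initial transient at the fault locator
t_reflect = 132.0e-6 # s, arrival of its reflection from the fault point
distance_m = v * (t_reflect - t_first) / 2
print(f"estimated fault distance: {distance_m / 1000:.2f} km")   # 2.90 km
```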

168 citations


Proceedings ArticleDOI
01 Nov 1999
TL;DR: A systematic approach for automatically introducing data and code redundancy into an existing program written in a high-level language; the transformations can be applied automatically as a pre-compilation phase, freeing the programmer from the cost and responsibility of introducing suitable EDMs in their code.
Abstract: The paper describes a systematic approach for automatically introducing data and code redundancy into an existing program written using a high-level language. The transformations aim at making the program able to detect most of the soft errors affecting data and code, independently of the Error Detection Mechanisms (EDMs) possibly implemented by the hardware. Since the transformations can be automatically applied as a pre-compilation phase, the programmer is freed from the cost and responsibility of introducing suitable EDMs in their code. Preliminary experimental results are reported, showing the fault coverage obtained by the method, as well as some figures concerning the slow-down and code size increase it causes.
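A hand-written fragment can show the flavour of the transformation the tool automates: every variable is duplicated, operations are performed on both copies, and a consistency check runs before a value is used. The sketch below is only an illustration; the variable names, the check helper, and the toy workload are invented, and the paper's actual transformation rules are richer (covering code as well as data).

```python
# Minimal hand-written illustration of data duplication with consistency checks.
class SoftErrorDetected(Exception):
    pass

def checked(a0, a1):
    if a0 != a1:
        raise SoftErrorDetected("mismatch between duplicated variables")
    return a0

def scaled_sum(xs):
    # duplicated accumulators (acc0/acc1) stand in for the tool-inserted copies
    acc0, acc1 = 0, 0
    for x in xs:
        x0, x1 = x, x          # duplicated input value
        acc0 += 2 * x0
        acc1 += 2 * x1
        checked(acc0, acc1)    # detect a soft error that corrupted one copy
    return checked(acc0, acc1)

print(scaled_sum([1, 2, 3]))   # 12
```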

163 citations


Journal ArticleDOI
TL;DR: It is shown that there is a coverage hierarchy to fault classes that is consistent with, and may therefore explain, experimental results on fault-based testing.
Abstract: Some varieties of specification-based testing rely upon methods for generating test cases from predicates in a software specification. These methods derive various test conditions from logic expressions, with the aim of detecting different types of faults. Some authors have presented empirical results on the ability of specification-based test generation methods to detect failures. This article describes a method for computing the conditions that must be covered by a test set for the test set to guarantee detection of a particular fault class. It is shown that there is a coverage hierarchy to fault classes that is consistent with, and may therefore explain, experimental results on fault-based testing. The method is also shown to be effective for computing MCDC-adequate tests.
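The core notion, that a test detects a fault exactly when the original predicate and its faulted version evaluate differently, can be sketched directly. The predicate and the two mutants below (a variable negation fault and an expression negation fault) are invented examples; enumerating the distinguishing assignments gives the conditions a test set must cover, and the containment between the two detecting sets is the flavour of relationship the coverage hierarchy formalizes.

```python
# Hedged sketch: a test detects a fault in a predicate iff the original and the
# faulted predicate evaluate differently on that test.
from itertools import product

def detecting_tests(original, mutant, n_vars):
    return [bits for bits in product([False, True], repeat=n_vars)
            if original(*bits) != mutant(*bits)]

pred     = lambda a, b, c: (a and b) or c
var_neg  = lambda a, b, c: ((not a) and b) or c   # variable negation fault on 'a'
expr_neg = lambda a, b, c: not ((a and b) or c)   # expression negation fault

print(detecting_tests(pred, var_neg, 3))   # only two tests catch the variable negation
print(detecting_tests(pred, expr_neg, 3))  # every input detects the expression negation
```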

161 citations


Patent
John E. Seem
05 Aug 1999
TL;DR: A finite state machine (FSM) controller for an air handling system is used to determine whether a fault condition exists, based on saturation of the system control or on a comparison of actual performance to a mathematical model of the air handling system.
Abstract: Fault detection is implemented on a finite state machine controller for an air handling system. The method employs data, regarding the system performance in the current state and upon a transition occurring, to determine whether a fault condition exists. The fault detection may be based on saturation of the system control or on a comparison of actual performance to a mathematical model of the air handling system. As a consequence, the control does not have to be in steady-state operation to perform fault detection.

144 citations


Journal ArticleDOI
TL;DR: Block-minimized test sets offer a size/effectiveness advantage over the original non-minimized test sets: a significant reduction in test set size with almost the same fault detection effectiveness.

142 citations


Journal ArticleDOI
TL;DR: The design for testability (DFT) of active analog filters based on the oscillation-test methodology is described; the DFT techniques investigated are well suited to automatic testable filter synthesis and can be easily integrated into tools dedicated to automatic filter design.
Abstract: The oscillation-test strategy is a low cost and robust test method for mixed-signal integrated circuits. Being a vectorless test method, it allows one to eliminate the analog test vector generator. Furthermore, as the oscillation frequency is considered to be digital, it can be precisely analyzed using pure digital circuitry and can be easily interfaced to test techniques dedicated to the digital part of the circuit under test (CUT). This paper describes the design for testability (DFT) of active analog filters based on oscillation-test methodology. Active filters are transformed to oscillators using very simple techniques. The tolerance band of the oscillation frequency is determined by a Monte Carlo analysis taking into account the nominal tolerance of all circuit under test components. Discrete practical realizations and extensive simulations based on CMOS 1.2 /spl mu/m technology parameters affirm that the test technique presented for active analog filters ensures high fault coverage and requires a negligible area overhead. Finally, the DFT techniques investigated are very suitable for automatic testable filter synthesis and can be easily integrated in the tools dedicated to automatic filter design.
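The tolerance-band step lends itself to a small numeric illustration. The sketch below is a hedged example under assumed component values: an ideal RC oscillator with f = 1/(2*pi*R*C) and 5% component tolerances stands in for the converted active filter; it is not the paper's CMOS circuit or its Monte Carlo setup.

```python
# Hedged illustration of deriving an oscillation-frequency tolerance band by Monte Carlo.
import math, random

R_NOM, C_NOM, TOL = 10e3, 10e-9, 0.05    # 10 kOhm, 10 nF, +/-5 % (assumed values)

def sample_frequency(rng):
    r = R_NOM * (1 + rng.uniform(-TOL, TOL))
    c = C_NOM * (1 + rng.uniform(-TOL, TOL))
    return 1.0 / (2 * math.pi * r * c)

rng = random.Random(0)
freqs = [sample_frequency(rng) for _ in range(10_000)]
f_lo, f_hi = min(freqs), max(freqs)
print(f"tolerance band: {f_lo:.0f} Hz .. {f_hi:.0f} Hz")

def cut_passes(measured_f):
    # a measured oscillation frequency outside the band flags a faulty filter
    return f_lo <= measured_f <= f_hi

print(cut_passes(1 / (2 * math.pi * R_NOM * C_NOM)))     # nominal part  -> True
print(cut_passes(0.7 / (2 * math.pi * R_NOM * C_NOM)))   # deviated part -> False
```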

Proceedings ArticleDOI
28 Sep 1999
TL;DR: Experimental results demonstrate that LT-RTPGs designed using the proposed methodology decrease the heat dissipated during BIST by significant amounts while attaining high fault coverage, especially for circuits with a moderate to large number of scan inputs.
Abstract: A new BIST TPG design, called low-transition random TPG (LT-RTPG), composed of an LFSR, a k-input AND gate, and a T flip-flop, is presented. When used to generate test patterns for test-per-scan BIST, it decreases the number of transitions that occur during scan shifting and hence decreases the heat dissipated during testing. Various properties of LT-RTPGs are studied and a methodology for their design is presented. Experimental results demonstrate that LT-RTPGs designed using the proposed methodology decrease the heat dissipated during BIST by significant amounts while attaining high fault coverage, especially for circuits with a moderate to large number of scan inputs.
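The generator's structure is simple enough to emulate bit by bit. The sketch below follows the structure named in the abstract (LFSR, k-input AND of some LFSR cells, T flip-flop feeding the scan input), but the polynomial, the value of k, the taps, and the seed are assumptions; it only illustrates why the scan-in stream toggles rarely.

```python
# Sketch of an LT-RTPG-style generator: the AND output is 1 only rarely, so the
# T flip-flop toggles rarely and consecutive scan-in bits are mostly equal.
def lfsr_stream(state, taps, nbits):
    """Fibonacci LFSR over `nbits` bits; `taps` are the feedback bit positions."""
    while True:
        yield state
        fb = 0
        for t in taps:
            fb ^= (state >> t) & 1
        state = ((state << 1) | fb) & ((1 << nbits) - 1)

def lt_rtpg_bits(n, k=3, nbits=8, taps=(7, 5, 4, 3), seed=0xA5):
    t_ff, out = 0, []
    for state in lfsr_stream(seed, taps, nbits):
        if len(out) == n:
            break
        and_k = 1
        for i in range(k):                    # k-input AND of the low k LFSR cells
            and_k &= (state >> i) & 1
        t_ff ^= and_k                         # T flip-flop toggles only when AND == 1
        out.append(t_ff)
    return out

bits = lt_rtpg_bits(64)
transitions = sum(b1 != b2 for b1, b2 in zip(bits, bits[1:]))
print(bits)
print(f"{transitions} transitions in {len(bits)} scan-in bits")
```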

Journal ArticleDOI
TL;DR: This paper presents a new method for incorporating imperfect FC (fault coverage) into a combinatorial model, SEA, which applies to any system for which the FC probabilities are constant and state-independent; the hazard rates are state- independent; and an FC failure leads to immediate system failure.
Abstract: This paper presents a new method for incorporating imperfect FC (fault coverage) into a combinatorial model. Imperfect FC, the probability that a single malicious fault can thwart automatic recovery mechanisms, is important to accurate reliability assessment of fault-tolerant computer systems. Until recently, it was thought that the consideration of this probability necessitated a Markov model rather than the simpler (and usually faster) combinatorial model. SEA, the new approach, separates the modeling of FC failures into two terms that are multiplied to compute the system reliability. The first term, a simple product, represents the probability that no uncovered fault occurs. The second term comes from a combinatorial model which includes the covered faults that can lead to system failure. This second term can be computed from any common approach (e.g. fault tree, block diagram, digraph) which ignores the FC concept by slightly altering the component-failure probabilities. The result of this work is that reliability engineers can use their favorite software package (which ignores the FC concept) for computing reliability, and then adjust the input and output of that program slightly to produce a result which includes FC. This method applies to any system for which: the FC probabilities are constant and state-independent; the hazard rates are state-independent; and an FC failure leads to immediate system failure.
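A small numeric sketch may help make the two-term separation concrete. The component probabilities, coverage factors, and the 1-out-of-2 structure below are invented, and the altered-probability form used for the second term is one standard way to condition the combinatorial model on "no uncovered fault occurred"; it is shown here as an assumption, not necessarily the paper's exact formulation.

```python
# Hedged numeric illustration of separating imperfect fault coverage into two terms.
q = [0.01, 0.02]      # component failure probabilities (invented)
c = [0.95, 0.90]      # fault coverage: probability a component fault is handled

# Term 1: probability that no uncovered (single-point) failure occurs
p_no_uncovered = 1.0
for qi, ci in zip(q, c):
    p_no_uncovered *= 1 - (1 - ci) * qi

# Term 2: ordinary combinatorial model (parallel, 1-out-of-2 system) with component
# failure probabilities altered to covered failures conditioned on term 1
q_alt = [ci * qi / (1 - (1 - ci) * qi) for qi, ci in zip(q, c)]
r_parallel = 1 - q_alt[0] * q_alt[1]

print(f"system reliability with imperfect coverage: {p_no_uncovered * r_parallel:.6f}")
# With perfect coverage (c = 1) the same parallel model gives 1 - 0.01*0.02 = 0.9998
```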

Journal ArticleDOI
TL;DR: This paper proposes new methods to: 1) Perform fault tolerance based task clustering, which determines the best placement of assertion and duplicate-and-compare tasks, 2) Derive the best error recovery topology using a small number of extra processing elements, 3) Exploit multidimensional assertions, and 4) Share assertions to reduce the fault tolerance overhead.
Abstract: Embedded systems employed in critical applications demand high reliability and availability in addition to high performance. Hardware-software co-synthesis of an embedded system is the process of partitioning, mapping, and scheduling its specification into hardware and software modules to meet performance, cost, reliability, and availability goals. In this paper, we address the problem of hardware-software co-synthesis of fault-tolerant real-time heterogeneous distributed embedded systems. Fault detection capability is imparted to the embedded system by adding assertion and duplicate-and-compare tasks to the task graph specification prior to co-synthesis. The dependability (reliability and availability) of the architecture is evaluated during co-synthesis. Our algorithm, called COFTA (Co-synthesis Of Fault-Tolerant Architectures), allows the user to specify multiple types of assertions for each task. It uses the assertion or combination of assertions which achieves the required fault coverage without incurring too much overhead. We propose new methods to: 1) Perform fault tolerance based task clustering, which determines the best placement of assertion and duplicate-and-compare tasks, 2) Derive the best error recovery topology using a small number of extra processing elements, 3) Exploit multidimensional assertions, and 4) Share assertions to reduce the fault tolerance overhead. Our algorithm can tackle multirate systems commonly found in multimedia applications. Application of the proposed algorithm to a large number of real-life telecom transport system examples (the largest example consisting of 2,172 tasks) shows its efficacy. For fault secure architectures, which just have fault detection capabilities, COFTA is able to achieve up to 48.8 percent and 25.6 percent savings in embedded system cost over architectures employing duplication and task-based fault tolerance techniques, respectively. The average cost overhead of COFTA fault-secure architectures over simplex architectures is only 7.3 percent. In case of fault-tolerant architectures, which can not only detect but also tolerate faults, COFTA is able to achieve up to 63.1 percent and 23.8 percent savings in embedded system cost over architectures employing triple-modular redundancy, and task-based fault tolerance techniques, respectively. The average cost overhead of COFTA fault-tolerant architectures over simplex architectures is only 55.4 percent.

Proceedings ArticleDOI
26 Apr 1999
TL;DR: The concept of partial coverage is introduced; it is shown that partial coverage is inherent to analog testing and that coverage cannot be calculated without knowing the performance specifications for a circuit as well as the process parameter distributions.
Abstract: This paper first summarizes the complete range of analog defects and resultant faults. A complete set of metrics is then derived for measuring the quality of analog tests. The probability-based equations for fault coverage, defect level, yield coverage, and yield loss are self-consistent, and consistent with existing equations for digital test metrics. We introduce the concept of partial coverage, show that it is inherent to analog testing, and show that coverage cannot be calculated without knowing the performance specifications for a circuit, as well as the process parameter distributions. Practical methods for calculating probabilities are discussed, and simple, illustrative examples given.

Proceedings ArticleDOI
26 Apr 1999
TL;DR: It is shown that the path delay fault coverage achieved by an n-detection transition fault test set increases significantly as n is increased, and a method is introduced to reduce the number of tests included in an n -detection test set by using different values of n for different faults based on their potential effect on the defect coverage.
Abstract: We study the effectiveness of n-detection test sets based on transition faults in detecting defects that affect the timing behavior of a circuit. We use path delay faults as surrogates for unmodeled defects, and show that the path delay fault coverage achieved by an n-detection transition fault test set increases significantly as n is increased. We also introduce a method to reduce the number of tests included in an n-detection test set by using different values of n for different faults based on their potential effect on the defect coverage. The resulting test sets are referred to as variable n-detection test sets.

Proceedings ArticleDOI
28 Sep 1999
TL;DR: In this article, the authors develop models of resistive bridging faults and study the fault coverage of different test sets on ISCAS85 circuits using resistive and zero-ohm bridges at different supply voltages.
Abstract: In this work we develop models of resistive bridging faults and study the fault coverage on ISCAS85 circuits of different test sets using resistive and zero-ohm bridges at different supply voltages. These results explain several previously observed anomalous behaviors. In order to serve as a reference, we have developed the first resistive bridging fault ATPG, which attempts to detect the maximum possible bridging resistance at each fault site. We compare the results of the ATPG to the coverage obtained from other test sets, and coverage obtained by using the ATPG in a clean-up mode. Results on ISCAS85 circuits show that stuck-at test sets do quite well, but that the ATPG can still improve the coverage. We have also found that the loss of fault coverage is predominantly due to undetected faults, rather than faults in which only a small resistance is detected. This suggests that lower-cost fault models can be used to obtain high resistive bridge fault coverage.

Proceedings ArticleDOI
15 Jun 1999
TL;DR: A systematic and quantitative approach for using software-implemented fault injection to guide the design and implementation of a fault-tolerant system to improve robustness in the presence of operating system errors is presented.
Abstract: Fault injection is typically used to characterize failures and to validate and compare fault-tolerant mechanisms. However, fault injection is rarely used to guide the design and implementation of a fault-tolerant system. We present a systematic and quantitative approach for using software-implemented fault injection to guide the design and implementation of a fault-tolerant system. Our system design goal is to build a write-back file cache on Intel PCs that is as reliable as a write-through file cache. We follow an iterative approach to improve robustness in the presence of operating system errors. In each iteration, we measure the reliability of the system, analyze the fault symptoms that lead to data corruption, and apply fault-tolerant mechanisms that address the fault symptoms. Our initial system is 13 times less reliable than a write-through file cache. The result of several iterations is a design that is both more reliable (1.9% vs. 3.1% corruption rate) and 5-9 times as fast as a write-through file cache.

Proceedings ArticleDOI
28 Sep 1999
TL;DR: A new technique for diagnosis in a scan-based BIST environment is presented that allows non-adaptive identification of both the scan cells that capture errors as well as a subset of the failing test vectors (time information).
Abstract: A new technique for diagnosis in a scan-based BIST environment is presented. It allows non-adaptive identification of both the scan cells that capture errors (space information) as well as a subset of the failing test vectors (time information). Having both space and time information allows a faster and more precise diagnosis. Previous techniques for identifying the failing test vectors during BIST have been limited in the multiplicity of errors that can be handled and/or require a very large hardware overhead. The proposed approach, however, uses only two cycling registers at the output of the scan chain to accurately identify a subset of the failing BIST test vectors. This is accomplished using some novel pruning techniques that efficiently extract information from the signatures of the cycling registers. While not all the failing BIST test vectors can be identified, results indicate that a significant number of them can be. This additional information can save a lot of time in failure analysis.

Book ChapterDOI
05 Oct 1999
TL;DR: This paper presents a new algorithm, Hit-or-Jump, for embedded testing of components of communication systems that can be modeled by communicating extended finite state machines that constructs test sequences efficiently with a high fault coverage.
Abstract: This paper presents a new algorithm, Hit-or-Jump, for embedded testing of components of communication systems that can be modeled by communicating extended finite state machines. It constructs test sequences efficiently with a high fault coverage. It does not have state space explosion, as is often encountered in exhaustive search, and it quickly covers the system components under test without being “trapped”, as is experienced by random walks. Furthermore, it is a generalization and unification of both exhaustive search and random walks; both are special cases of Hit-or-Jump. The algorithm has been implemented and applied to embedded testing of telephone services in an Intelligent Network (IN) architecture, including the Basic Call Service and five supplementary services.
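The search strategy is concrete enough to sketch on a toy machine. The sketch below is a hedged illustration of the hit-or-jump idea on a plain FSM (the paper targets communicating extended FSMs): the machine, the depth bound, and the use of transition coverage as the target are all assumptions made for the example.

```python
# Hedged sketch of Hit-or-Jump: from the current state, search up to a bounded depth;
# take a path to an uncovered transition if one is found (a "hit"), otherwise move to a
# random leaf of the search tree (a "jump") instead of restarting.
import random
from collections import deque

# transitions: state -> list of (input_label, next_state); a toy FSM, not from the paper
fsm = {
    "A": [("a", "B"), ("b", "A")],
    "B": [("a", "C"), ("b", "A")],
    "C": [("a", "C"), ("b", "D")],
    "D": [("a", "A"), ("b", "B")],
}

def hit_or_jump(fsm, start="A", depth=3, rng=random.Random(1)):
    all_edges = {(s, lab, t) for s, outs in fsm.items() for lab, t in outs}
    covered, seq, state = set(), [], start
    while covered != all_edges:
        frontier = deque([(state, [])])        # bounded breadth-first search
        hit_path, leaves = None, []
        while frontier:
            s, path = frontier.popleft()
            if len(path) == depth:
                leaves.append(path)
                continue
            for lab, t in fsm[s]:
                edge = (s, lab, t)
                new_path = path + [edge]
                if edge not in covered and hit_path is None:
                    hit_path = new_path        # "hit": an uncovered transition reached
                frontier.append((t, new_path))
        path = hit_path if hit_path else rng.choice(leaves)   # otherwise "jump"
        for edge in path:
            covered.add(edge)
            seq.append(edge[1])                # record the input applied
        state = path[-1][2]
    return seq

inputs = hit_or_jump(fsm)
print(len(inputs), "inputs cover all", sum(len(v) for v in fsm.values()),
      "transitions:", inputs)
```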

Journal ArticleDOI
Zhang Qingchao, Zhang Yao, Song Wennan, Yu Yixin, Wang Zhigang
TL;DR: In this article, an accurate fault-location algorithm for the two-parallel transmission line of a direct ground neutral system is presented, which employs the faulted circuit and healthy circuit of a two-parallel line as the fault-location model, in which remote source impedance is not involved.
Abstract: Summary form only given. An accurate fault-location algorithm for the two-parallel transmission line of a direct ground neutral system is presented. The algorithm employs the faulted circuit and healthy circuit of a two-parallel line as the fault-location model, in which remote source impedance is not involved. It effectively eliminates the effect of load flow and fault resistance on the accuracy of fault location. It achieves accurate location using measured data from only one local end, and it is used in a procedure that automatically determines fault types and phases rather than requiring an engineer to specify them. Simulation results have shown the effectiveness of the algorithm under the conditions of non-earth faults (phase-to-phase and three-phase faults, with and without earth connection).

Journal ArticleDOI
TL;DR: Establishing computer system dependability benchmarks would make tests much easier and enable comparison of results across different machines.
Abstract: Computer-based systems are expected to be more and more dependable. For that, they have to operate correctly even in the presence of faults, and this fault tolerance of theirs must be thoroughly tested by the injection of faults both real and artificial. Users should start to request reports from manufacturers on the outcomes of such experiments, and on the mechanisms built into systems to handle faults. To inject artificial physical faults, fault injection offers a reasonably mature option today, with SWIFI (software-implemented fault injection) tools being preferred for most applications because of their flexibility and low cost. To inject software bugs, although some promising ideas are being researched, no established technique yet exists. In any case, establishing computer system dependability benchmarks would make tests much easier and enable comparison of results across different machines.

Proceedings ArticleDOI
24 Sep 1999
TL;DR: A fast simulation-based method to compute an efficient seed (initial state) of a given primitive-polynomial LFSR TPG that is able to deal with large combinational circuits with many primary inputs.
Abstract: Linear Feedback Shift Registers (LFSRs) are commonly used as pseudo-random test pattern generators (TPGs) in BIST schemes. This paper presents a fast simulation-based method to compute an efficient seed (initial state) of a given primitive-polynomial LFSR TPG. The size of the LFSR, the primitive feedback polynomial and the length of the generated test sequence are a priori known. The method uses a deterministic test cube compression technique and produces a one-seed LFSR test sequence of a predefined test length that achieves high fault coverage. This technique can be applied either in pseudo-random testing for BISTed circuits containing few random-resistant faults, or in pseudo-deterministic BIST where it allows the hardware generator overhead area to be reduced. Compared with existing methods, the proposed technique is able to deal with large combinational circuits with many primary inputs. Experimental results demonstrate the effectiveness of our method.

Proceedings ArticleDOI
30 May 1999
TL;DR: It is shown that appropriately selecting the seed of the LFSR can lead to a significant energy reduction, and a heuristic method based on a simulated annealing algorithm is proposed to significantly decrease the energy consumption of BIST sessions.
Abstract: Low-power design looks for low-energy BIST. This paper considers the problem of minimizing the energy required to test a BISTed combinational circuit without modifying the stuck-at fault coverage and with no extra area or delay overhead over the classical LFSR architectures. The objective of this paper is twofold. The first is to analyze the impact of the polynomial and seed selection of the LFSR used as TPG on the energy consumed by the circuit. It is shown that appropriately selecting the seed of the LFSR can lead to a significant energy reduction. The second is to propose a method to significantly decrease the energy consumption of BIST sessions. For this purpose, a heuristic method based on a simulated annealing algorithm is briefly described in this paper. Experimental results using the ISCAS benchmark circuits are reported, showing variations of the weighted switching activity ranging from 147% to 889% according to the seed selected for the LFSR. Note that these results are always obtained with no loss of stuck-at fault coverage.
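The seed-selection idea can be sketched with a toy annealing loop. In the sketch below the cost is simply the number of bit flips between consecutive LFSR states, a crude stand-in for the weighted switching activity of a real circuit under test; the LFSR size, feedback taps, test length, and annealing schedule are all assumptions rather than the paper's setup.

```python
# Hedged sketch: simulated annealing over LFSR seeds to reduce switching during test.
import math, random

NBITS, TAPS, TEST_LEN = 16, (15, 13, 12, 10), 500   # assumed LFSR configuration

def lfsr_sequence(seed):
    state, seq = seed, []
    for _ in range(TEST_LEN):
        seq.append(state)
        fb = 0
        for t in TAPS:
            fb ^= (state >> t) & 1
        state = ((state << 1) | fb) & ((1 << NBITS) - 1)
    return seq

def cost(seed):
    # number of bit transitions between consecutive patterns (proxy for switching)
    seq = lfsr_sequence(seed)
    return sum(bin(a ^ b).count("1") for a, b in zip(seq, seq[1:]))

def anneal(steps=1500, temp=50.0, cooling=0.995, rng=random.Random(3)):
    cur = best = rng.randrange(1, 1 << NBITS)
    for _ in range(steps):
        cand = cur ^ (1 << rng.randrange(NBITS))      # flip one bit of the seed
        if cand == 0:
            continue
        delta = cost(cand) - cost(cur)
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            cur = cand
        if cost(cur) < cost(best):
            best = cur
        temp *= cooling
    return best

seed = anneal()
print(f"selected seed 0x{seed:04X}: {cost(seed)} transitions "
      f"(seed 0x0001 gives {cost(0x0001)})")
```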

Proceedings ArticleDOI
04 Mar 1999
TL;DR: The proposed approach is based on the reordering of test vectors of a given test sequence to minimize the average and peak power dissipation during test operation and reduces the internal switching activity by lowering the transition density at circuit inputs.
Abstract: This paper considers the problem of testing VLSI integrated circuits without exceeding their power ratings during test. The proposed approach is based on the reordering of test vectors of a given test sequence to minimize the average and peak power dissipation during test operation. For this purpose, the proposed technique reduces the internal switching activity by lowering the transition density at circuit inputs. The technique considers combinational or full scan sequential circuits and does not modify the initial fault coverage. Results of experiments show reductions of the switching activity ranging from 11% to 66% during external test application.
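The reordering idea reduces to keeping consecutive vectors close in Hamming distance. The sketch below shows a simple greedy nearest-neighbour ordering on invented vectors; the paper's actual reordering technique may differ in detail, but the objective (fewer input transitions, unchanged vector set and hence unchanged fault coverage) is the same.

```python
# Hedged sketch: greedy reordering of test vectors to lower input transition counts.
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def total_transitions(order):
    return sum(hamming(v, w) for v, w in zip(order, order[1:]))

def greedy_reorder(vectors):
    remaining = list(vectors)
    order = [remaining.pop(0)]
    while remaining:
        nxt = min(remaining, key=lambda v: hamming(order[-1], v))
        remaining.remove(nxt)
        order.append(nxt)
    return order

tests = ["0000", "1111", "0011", "1100", "0101", "1010"]   # invented test vectors
print("original :", total_transitions(tests), "input transitions")
print("reordered:", total_transitions(greedy_reorder(tests)), "input transitions")
```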

Proceedings ArticleDOI
Michinobu Nakao, Seiji Kobayashi, Kazumi Hatayama, K. Iijima, S. Terada
28 Sep 1999
TL;DR: Efficient test point selection algorithms, which are suitable for utilizing overhead reduction approaches such as restricted cell replacement and test-point flip-flop sharing, are proposed to meet the above requirements.
Abstract: This paper presents a practical test point insertion method for scan-based BIST. To apply test point insertion in actual LSIs, especially high performance LSIs, it is important to reduce the delay penalty and the area overhead of the inserted test points. Here, efficient test point selection algorithms, which are suitable for utilizing overhead reduction approaches such as restricted cell replacement and test-point flip-flop sharing, are proposed to meet the above requirements. The effectiveness of the algorithms is demonstrated by some experiments.

Proceedings ArticleDOI
16 Nov 1999
TL;DR: A novel low power/energy built-in self test (BIST) strategy based on circuit partitioning to minimize the average power, the peak power and the energy consumption during pseudo-random testing without modifying the fault coverage.
Abstract: In this paper, we propose a novel low power/energy built-in self test (BIST) strategy based on circuit partitioning. The goal of the proposed strategy is to minimize the average power, the peak power and the energy consumption during pseudo-random testing without modifying the fault coverage. The strategy consists in partitioning the original circuit into two structural subcircuits so that each subcircuit can be successively tested through two different BIST sessions. In partitioning the circuit and planning the test session, the switching activity in a time interval (i.e. the average power) as well as the peak power consumption are minimized. Moreover, the total energy consumption during BIST is also reduced since the test length required to test the two subcircuits is roughly the same as the test length for the original circuit. Results on ISCAS circuits show that average power reduction of up to 72%, peak power reduction of up to 53%, and energy reduction of up to 84% can be achieved.

Journal ArticleDOI
TL;DR: Two fault injection methodologies are presented-stress-based injection and path-based injections; both are based on resource activity analysis to ensure that injections cause fault tolerance activity and, thus, the resulting exercise of fault tolerance mechanisms.
Abstract: The objective of fault injection is to mimic the existence of faults and to force the exercise of the fault tolerance mechanisms of the target system. To maximize the efficacy of each injection, the locations, timing, and conditions for faults being injected must be carefully chosen. Faults should be injected with a high probability of being accessed. This paper presents two fault injection methodologies-stress-based injection and path-based injection; both are based on resource activity analysis to ensure that injections cause fault tolerance activity and, thus, the resulting exercise of fault tolerance mechanisms. The difference between these two methods is that stress-based injection validates the system dependability by monitoring the run-time workload activity at the system level to select faults that coincide with the locations and times of greatest workload activity, while path-based injection validates the system from the application perspective by using an analysis of the program flow and resource usage at the application program level to select faults during the program execution. These two injection methodologies focus separately on the system and process viewpoints to facilitate the testing of system dependability. Details of these two injection methodologies are discussed in this paper, along with their implementations, experimental results, and advantages and disadvantages.

Proceedings ArticleDOI
10 Jan 1999
TL;DR: Results on ISCAS benchmark circuits show that energy reduction of up to 97.82% can be achieved (compared to equi-probable random-pattern testing with identical fault coverage) while achieving high fault coverage.
Abstract: Due to the increasing use of portable computing and wireless communications systems, energy consumption is of major concern in today's VLSI circuits. With that in mind we present an energy conscious weighted random pattern testing technique for Built-In-Self-Test (BIST) applications. Energy consumption during BIST operation can be minimized while achieving high fault coverage. Simple measures of observability and controllability of circuit nodes are proposed based on primary input signal probability (probability that a signal is logic ONE). Such measures help determine the testability of a circuit. We developed a tool, POWERTEST, which uses a genetic algorithm based search to determine optimal weight sets (signal probabilities or input signal distribution) at primary inputs to minimize energy dissipations. The inputs conforming to the primary input weight set can be generated using cellular automata or LFSR (Linear Feedback Shift Register). We observed that a single input distribution (or weights) may not be sufficient for some random-pattern resistant circuits, while multiple distributions consume larger area. As a trade-off, two distributions have been used in our analysis. Results on ISCAS benchmark circuits show that energy reduction of up to 97.82% can be achieved (compared to equi-probable random-pattern testing with identical fault coverage) while achieving high fault coverage.
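A tiny experiment shows why weighted patterns help random-pattern-resistant faults. In the sketch below, a fault that is detected only by the all-ones input of a wide AND gate is almost never hit by equiprobable patterns but is hit frequently once the input signal probabilities are biased toward 1; the weights are hand-picked for illustration, not produced by the paper's genetic-algorithm search (POWERTEST).

```python
# Hedged illustration of weighted random pattern testing for a random-resistant fault.
import random

def detection_rate(weight, n_inputs=8, n_patterns=10_000, rng=random.Random(0)):
    hits = 0
    for _ in range(n_patterns):
        pattern = [rng.random() < weight for _ in range(n_inputs)]
        if all(pattern):        # only the all-ones input detects the AND-output fault
            hits += 1
    return hits / n_patterns

print("p(detect) with weight 0.5:", detection_rate(0.5))   # ~0.004
print("p(detect) with weight 0.9:", detection_rate(0.9))   # ~0.43
```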

Proceedings ArticleDOI
28 Sep 1999
TL;DR: Results show that pattern generation should be driven by the most accurate modeling method when pursuing 100% bridging coverage, since less accurate methods will not necessarily converge to a high quality result.
Abstract: This study provides bridging fault simulation data obtained from the AMD-K6 microprocessor. It shows that: (1) high stuck-at fault coverage (99.5%) implies high bridging fault coverage; (2) coverage of a bridging fault by both wired-AND and wired-OR behavior does not guarantee detection of that fault when compared against a more accurate (transistor-level simulation) modeling method. A set of netname pairs representing bridging fault sites were extracted from layout and used for each fault modeling method. Results show that pattern generation should be driven by the most accurate modeling method when pursuing 100% bridging coverage, since less accurate methods will not necessarily converge to a high quality result.
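For readers unfamiliar with the two simple bridge models mentioned in point (2), the sketch below shows what wired-AND and wired-OR detection means on a tiny invented two-gate circuit; it does not reproduce the transistor-level simulation used as the reference model in the study.

```python
# Hedged sketch of the wired-AND and wired-OR bridging fault models: both shorted nets
# take the AND (resp. OR) of their driven values; a test detects the bridge under a
# model if some observed output differs from the fault-free circuit.
def circuit(a, b, c, bridge=None):
    n1 = a & b            # net n1 = AND(a, b)
    n2 = b | c            # net n2 = OR(b, c)
    if bridge == "wired-AND":
        n1 = n2 = n1 & n2
    elif bridge == "wired-OR":
        n1 = n2 = n1 | n2
    return n1 ^ n2        # primary output observes both nets

def detects(test, model):
    a, b, c = test
    return circuit(a, b, c) != circuit(a, b, c, bridge=model)

for test in [(1, 1, 0), (0, 1, 1), (1, 0, 1)]:
    print(test, "wired-AND:", detects(test, "wired-AND"),
                "wired-OR:",  detects(test, "wired-OR"))
```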