
Showing papers on "Fault model" published in 2001


Proceedings ArticleDOI
25 Jun 2001
TL;DR: This paper gives an overview of recent tools to analyze and explore structure and other fundamental properties of an automated system such that any inherent redundancy in the controlled process can be fully utilized to maintain availability, even though faults may occur.
Abstract: Faults in automated processes will often cause undesired reactions and shut-down of a controlled plant, and the consequences could be damage to technical parts of the plant, to personnel, or to the environment. Fault-tolerant control combines diagnosis with control methods to handle faults in an intelligent way. The aim is to prevent simple faults from developing into serious failures, and hence to increase plant availability and reduce the risk of safety hazards. Fault-tolerant control merges several disciplines into a common framework to achieve these goals. The desired features are obtained through online fault diagnosis, automatic condition assessment and calculation of appropriate remedial actions to avoid certain consequences of a fault. The envelope of possible remedial actions is very wide. Sometimes, simple re-tuning can suffice. In other cases, accommodation of the fault could be achieved by replacing a measurement from a faulty sensor by an estimate. In yet other situations, complex reconfiguration or online controller redesign is required. This paper gives an overview of recent tools for analyzing and exploring the structure and other fundamental properties of an automated system, such that any inherent redundancy in the controlled process can be fully utilized to maintain availability even though faults may occur.

289 citations
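
As a concrete illustration of the sensor-accommodation case mentioned in the abstract, the sketch below (an illustration, not the paper's method; the plant matrices A, B, C and observer gain L are assumed to come from a model) swaps a flagged sensor reading for a Luenberger-style observer estimate:

    import numpy as np

    def accommodated_output(y_raw, y_est, sensor_faulty):
        # Feed the controller the observer estimate once diagnosis has
        # flagged the sensor; otherwise pass the raw measurement through.
        return y_est if sensor_faulty else y_raw

    def observer_step(x_hat, u, y, A, B, C, L, dt):
        # One Euler step of a Luenberger observer for x' = Ax + Bu, y = Cx.
        y_hat = C @ x_hat
        x_hat = x_hat + dt * (A @ x_hat + B @ u + L @ (y - y_hat))
        return x_hat, y_hat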


Journal ArticleDOI
TL;DR: In this paper, the authors used a simplified Landers fault model where the fault segments were combined into a single vertical, planar fault to invert for the dynamic rupture propagation of the 1992 Landers earthquake.
Abstract: We have used observed band-pass filtered accelerograms and a previously determined slip distribution to invert for the dynamic rupture propagation of the 1992 Landers earthquake. In our simulations, dynamic rupture grows under the simultaneous control of initial stress and rupture resistance by friction, which we modeled using a simple slip-weakening law. We used a simplified Landers fault model where the fault segments were combined into a single vertical, planar fault. By trial and error we modified an initial stress field, inferred from the kinematic slip distribution proposed by Wald and Heaton [1994], until dynamic rupture generated a rupture history and final slip distribution that approximately matched those determined by the kinematic inversion. We found that rupture propagation was extremely sensitive to small changes in the distribution of prestress and that a delicate balance with energy release rate controls the average rupture speed. For the inversion we generated synthetic 0.5 Hz ground displacements using an efficient Green's function propagator method (AXITRA). This method enables us to propagate the radiation generated by the dynamic rupture to distances greater than those feasible using the finite difference method. The dynamic model built by trial-and-error inversion provides a very satisfactory fit between synthetics and strong motion data. We validated this model using records from stations used in the slip inversion as well as some which were not included. We also inverted for a complementary model that fits the data just as well but in which the initial stress was perfectly uniform while rupture resistance was heterogeneous. This demonstrates that inversion of ground motion is nonunique.

178 citations
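
For reference, the simple slip-weakening law the abstract refers to can be written as a one-line rule; the parameter names tau_s (static strength), tau_d (dynamic strength) and Dc (critical slip) are generic, not taken from the paper:

    def slip_weakening_strength(slip, tau_s, tau_d, Dc):
        # Strength drops linearly from the static level tau_s to the
        # dynamic level tau_d over the critical slip distance Dc.
        if slip >= Dc:
            return tau_d
        return tau_s - (tau_s - tau_d) * slip / Dc

Rupture advances at a point when the local shear stress reaches this strength, which is why the distribution of prestress controls how the front propagates.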


Patent
07 May 2001
TL;DR: In this article, a system for analyzing a fault includes a fault object factory (110) constructed and arranged to receive fault data and create a fault object (112), and a fault diagnosis engine (101) that performs root cause analysis of the fault object.
Abstract: A system (100) for analyzing a fault includes a fault object factory (110) constructed and arranged to receive fault data and create a fault object (112), and a fault diagnosis engine (101) constructed and arranged to perform root cause analysis of the fault object. The system may further include a fault detector (130) constructed and arranged to detect the fault data in a monitored entity, a fault repository (140) constructed and arranged to store and access the fault object; and a fault handler (150) constructed and arranged to be triggered by the fault diagnosis engine to analyze the fault object.

129 citations
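
A toy rendering of the claimed architecture (class and method names are invented for illustration; the patent's reference numerals are noted in comments):

    class FaultObject:                        # (112)
        def __init__(self, data):
            self.data = data
            self.root_cause = None

    class FaultObjectFactory:                 # (110)
        def create(self, fault_data):
            return FaultObject(fault_data)

    class FaultDiagnosisEngine:               # (101)
        def analyze(self, fault):
            # Placeholder root-cause analysis: blame the reported source.
            fault.root_cause = fault.data.get("source", "unknown")
            return fault.root_cause

    factory, engine = FaultObjectFactory(), FaultDiagnosisEngine()
    fault = factory.create({"source": "router-3", "symptom": "link down"})
    print(engine.analyze(fault))              # -> router-3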


Proceedings ArticleDOI
Jeff Offutt, Roger T. Alexander, Y. Wu, Q. Xiao, C. Hutchinson
27 Nov 2001
TL;DR: This paper presents a model for the appearance and realization of OO faults and defines and discusses specific categories of inheritance and polymorphic faults, which can be used to support empirical investigations of object-oriented testing techniques, to inspire further research into object-oriented testing and analysis, and to help improve design and development of object-oriented software.
Abstract: Although program faults are widely studied, there are many aspects of faults that we still do not understand, particularly about OO software. In addition to the simple fact that one important goal during testing is to cause failures and thereby detect faults, a full understanding of the characteristics of faults is crucial to several research areas. The power that inheritance and polymorphism brings to the expressiveness of programming languages also brings a number of new anomalies and fault types. This paper presents a model for the appearance and realization of OO faults and defines and discusses specific categories of inheritance and polymorphic faults. The model and categories can be used to support empirical investigations of object-oriented testing techniques, to inspire further research into object-oriented testing and analysis, and to help improve design and development of object-oriented software.

94 citations
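
One example in the spirit of the paper's inheritance and polymorphism categories (this specific code is illustrative, not from the paper): an overriding method that updates only part of the state its parent maintains, so a client of the base class observes a broken invariant:

    class Account:
        def __init__(self):
            self.balance = 0
            self.history = []

        def deposit(self, amount):
            self.balance += amount
            self.history.append(amount)

    class FastAccount(Account):
        def deposit(self, amount):     # override forgets the history update,
            self.balance += amount     # breaking len(history) == #deposits

    def audit(acct: Account):
        return len(acct.history)       # silently wrong for FastAccount

    a = FastAccount()
    a.deposit(100)
    print(a.balance, audit(a))         # 100 0: fault visible only via subclass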


Proceedings ArticleDOI
20 May 2001
TL;DR: This work introduces the concept of fail-stutter fault tolerance, a realistic and yet tractable fault model that accounts for both absolute failure and a new range of performance failures common in modern components.
Abstract: Traditional fault models present system designers with two extremes: the Byzantine fault model, which is general and therefore difficult to apply, and the fail-stop fault model, which is easier to employ but does not accurately capture modern device behavior. To address this gap, we introduce the concept of fail-stutter fault tolerance, a realistic and yet tractable fault model that accounts for both absolute failure and a new range of performance failures common in modern components. Systems built under the fail-stutter model will likely perform well, be highly reliable and available, and be easier to manage when deployed.

83 citations
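
A minimal sketch of what a fail-stutter component looks like to its clients (the probabilities and parameter names are assumptions for illustration): it usually answers quickly, sometimes answers correctly but slowly, and only rarely fail-stops:

    import random

    def request_latency(base_ms=1.0, stutter_prob=0.05, stutter_factor=50,
                        fail_prob=0.001):
        if random.random() < fail_prob:
            raise IOError("fail-stop: component dead")
        if random.random() < stutter_prob:
            return base_ms * stutter_factor   # performance fault, no error signal
        return base_ms

A design that assumes fail-stop only has to handle the IOError; a fail-stutter-aware design must also budget for the occasional slow, but correct, responses.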


Journal Article
TL;DR: The annotated bibliography highlights work in the area of algorithmic test generation from formal specifications with guaranteed fault coverage, i.e., fault model-driven test derivation. A fault model is understood as a triple comprising a finite state specification, a conformance relation, and a fault domain, i.e., the set of possible implementations.
Abstract: The annotated bibliography highlights work in the area of algorithmic test generation from formal specifications with guaranteed fault coverage, i.e., fault model-driven test derivation. A fault model is understood as a triple, comprising a finite state specification, conformance relation and fault domain that is the set of possible implementations. The fault model can be specialized to Input/Output FSM, Labeled Transition System, or Input/Output Automaton and to a number of conformance relations such as FSM equivalence, reduction or quasi-equivalence, trace inclusion or trace equivalence and others. The fault domain usually reflects test assumptions, as an example, it can be the universe of all possible I/O FSMs with a given number of states, a classical fault domain in FSM-based testing. A test suite is complete with respect to a given fault model when each implementation from the fault domain passes it if and only if the postulated conformance relation holds between the implementation and its specification. A complete test suite is said to provide fault coverage guarantee for a given fault model.

74 citations
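
The completeness definition lends itself to a brute-force check on toy examples. In the sketch below (an assumed encoding, with bounded trace equivalence standing in for the conformance relation), a suite is complete exactly when no implementation in the fault domain passes while non-conforming, or fails while conforming:

    from itertools import product

    def run(fsm, inputs):
        # fsm: dict (state, input) -> (next_state, output), complete, start state 0.
        s, out = 0, []
        for i in inputs:
            s, o = fsm[(s, i)]
            out.append(o)
        return out

    def is_complete(spec, fault_domain, tests, alphabet, depth=6):
        # Bounded trace equivalence stands in for the conformance relation;
        # for FSMs with a known state bound a finite depth suffices.
        for impl in fault_domain:
            passes = all(run(impl, t) == run(spec, t) for t in tests)
            conforms = all(run(impl, w) == run(spec, w)
                           for n in range(1, depth + 1)
                           for w in product(alphabet, repeat=n))
            if passes != conforms:
                return False
        return True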


Proceedings ArticleDOI
15 May 2001
TL;DR: In this paper, a ground-fault relay is used to detect the fault, identify the faulted phase, and measure the electrical distance of the fault from the substation, while remote fault indicators are used to visually indicate where the fault is located.
Abstract: One of the most common and difficult problems to solve in industrial power systems is the location and elimination of the ground fault. Ground faults that occur in ungrounded and high-resistance grounded systems do not draw enough current to trigger circuit breaker or fuse operation, making them difficult to localize. Techniques currently used to track down faults are time-consuming and cumbersome. A new approach developed for ground-fault localization on ungrounded and high-resistance grounded low-voltage systems is described. The system consists of a novel ground-fault relay that operates in conjunction with low-cost fault indicators permanently mounted in the circuit. The ground-fault relay employs digital signal processing techniques to detect the fault, identify the faulted phase, and measure the electrical distance of the fault from the substation. The remote fault indicators are used to visually indicate where the fault is located. The resulting system provides a fast, easy, economical, and safe detection system for ground-fault localization.

73 citations
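
One plausible way to turn relay measurements into an "electrical distance", sketched here purely for illustration (this is a textbook reactance-based estimate, not necessarily the paper's DSP method):

    def fault_distance_km(v0, i0, x0_ohm_per_km):
        # v0, i0: zero-sequence voltage and current phasors at the substation;
        # x0_ohm_per_km: per-kilometre zero-sequence reactance of the feeder.
        z0 = v0 / i0
        return z0.imag / x0_ohm_per_km

    print(round(fault_distance_km(120 + 0j, 5 - 20j, 0.35), 1))  # ~16.1 km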


Journal ArticleDOI
TL;DR: In this article, a robust fault isolation scheme for a class of non-linear systems with unstructured modelling uncertainty and partial state measurement is presented, which consists of a fault detection and approximation estimator and a bank of isolation estimators.
Abstract: The design and analysis of fault diagnosis methodologies for non-linear systems has received significant attention recently. This paper presents a robust fault isolation scheme for a class of non-linear systems with unstructured modelling uncertainty and partial state measurement. The proposed fault diagnosis architecture consists of a fault detection and approximation estimator and a bank of isolation estimators. Each isolation estimator corresponds to a particular type of fault in the fault class. A fault isolation decision scheme is presented with guaranteed performance. If at least one component of the output estimation error of a particular fault isolation estimator exceeds the corresponding adaptive threshold at some finite time, then the occurrence of that type of fault can be excluded. Fault isolation is achieved if this is valid for all but one isolation estimator. Based on the class of non-linear systems under consideration, fault isolability conditions are rigorously investigated, characterizing...

68 citations
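
The exclusion logic described above is easy to state in code. A sketch (the data layout is assumed): estimator j is ruled out once any component of its output estimation error exceeds the corresponding adaptive threshold, and the fault is isolated when one candidate remains:

    def isolate(errors, thresholds, candidates):
        # errors[j], thresholds[j]: per-estimator lists of |error| and
        # threshold values for each output component at the current time.
        remaining = {j for j in candidates
                     if not any(e > t for e, t in zip(errors[j], thresholds[j]))}
        return next(iter(remaining)) if len(remaining) == 1 else None

    faults = {1, 2, 3}
    errs = {1: [0.9, 0.1], 2: [0.2, 0.1], 3: [0.5, 0.8]}
    thrs = {1: [0.5, 0.4], 2: [0.6, 0.4], 3: [0.4, 0.4]}
    print(isolate(errs, thrs, faults))   # -> 2: only estimator 2 stays below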


Proceedings ArticleDOI
29 Mar 2001
TL;DR: A coupling fault model is developed that appropriately models disturbances in flash memories that use floating-gate transistors as their core memory element, and the behavior of faulty cells under different fault models, and how their characteristics change under each model, is described.
Abstract: Nonvolatile memories (NVMs) can undergo different types of disturbances. These disturbances are particular to the technology and the cell structure of the memory element. In this paper we develop a coupling fault model that appropriately models disturbances in flash memories that use floating-gate transistors as their core memory element. We describe the behavior of faulty cells under different fault models and how their characteristics change under each model. We demonstrate the inappropriateness of conventional march algorithms for testing flash memories and present a procedure to derive pseudo-algorithms that can be used in testing flash memories. In addition, we present an efficient test that detects these disturbances under the different fault models developed in this paper.

60 citations
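
For contrast, this is the kind of march element that conventional RAM tests sequence together; the paper's observation is that flash cannot execute such elements directly, because program and erase are one-way and block-oriented, so pseudo-march sequences must be derived instead (the encoding here is assumed):

    def apply_march_element(mem, ascending, ops):
        # ops: list like [('r', 0), ('w', 1)]: read expecting value / write value.
        addrs = range(len(mem)) if ascending else range(len(mem) - 1, -1, -1)
        for a in addrs:
            for op, val in ops:
                if op == 'r':
                    if mem[a] != val:
                        return f"fault at address {a}"
                else:
                    mem[a] = val
        return None

    mem = [0] * 8
    print(apply_march_element(mem, True, [('r', 0), ('w', 1)]))   # None: no fault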


Journal ArticleDOI
TL;DR: In this article, the authors considered a model where the local boundary condition corresponds to a linear slip-dependent friction law, and defined the effective slip-dependent friction law by analogy with the theoretical spectral solution for the initiation phase in the case of a homogeneous infinite fault.
Abstract: Numerical simulation of the rupture process is usually performed under an assumption of scale invariance of the friction process, although heterogeneous fault properties are shown by both direct observations of surface crack geometry and slip inversion results. We investigate whether it is possible to define an effective friction law for a finite fault with a small-scale heterogeneity, that is, with a distribution of narrow segments with a resistance to rupture higher than the rest of the fault. We consider a model where the local boundary condition corresponds to a linear slip-dependent friction law. We define the effective slip-dependent friction law by analogy with the theoretical spectral solution for the initiation phase in the case of a homogeneous infinite fault. We use finite difference simulations to test the validity of this approach. The results show a surprisingly good agreement between the calculations for the complete heterogeneous fault model and for a homogeneous fault with an effective friction law. The time of initiation and the average of the slip velocity on the fault are well predicted by the effective model. The effective friction law exhibits a nonlinear slip dependence with an initial weakening rate different from that of the local laws. This initial weakening rate is related to the geometry of the heterogeneity and can be obtained by an eigenvalue analysis. The effective law shows a kink at a slip that corresponds to the average slip on the fault for which the stress concentration of the strong segments is sufficient to trigger their rupture. While based on a rather simple model of a fault, these results indicate that an effective friction law can be defined and used for practical purposes. The heterogeneity of a fault tends to decrease the initial weakening rate of the local weak patches. Since the initial weakening rate controls the initiation duration, this last point indicates that the duration of initiation expected from actual heterogeneous faults is much larger than that deduced from small-scale laboratory measurements. The actual fracture energy is not conservative in the rescaling of the friction law.

54 citations


Journal ArticleDOI
TL;DR: A new fault-tolerant intersection function is presented, which satisfies the Lipschitz condition for the uniform metric and is optimal among all functions with this property; this settles Lamport's question about such a function raised in [5].
Abstract: We present a new fault-tolerant intersection function F, which satisfies the Lipschitz condition for the uniform metric and is optimal among all functions with this property. F thus settles Lamport's question about such a function raised in [5]. Our comprehensive analysis reveals that F has exactly the same worst-case performance as the optimal Marzullo function M, which does not satisfy a Lipschitz condition. The utilized modelling approach in conjunction with a powerful hybrid fault model ensures compatibility of our results with any known application framework, including replicated sensors and clock synchronization.
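
For context, a sketch of the Marzullo-style intersection the paper compares against: it returns the interval covered by the largest number of source intervals (a standard formulation, not the paper's function F):

    def marzullo(intervals):
        # Sweep the sorted endpoints; -1 opens an interval, +1 closes it.
        events = sorted((x, t) for lo, hi in intervals
                        for x, t in ((lo, -1), (hi, +1)))
        best_count = count = 0
        best = None
        for i, (x, t) in enumerate(events):
            count -= t
            if count > best_count:             # a new maximum always occurs
                best_count, best = count, (x, events[i + 1][0])  # at an opening
        return best, best_count

    sources = [(8, 12), (11, 13), (10, 12), (30, 99)]   # last source is faulty
    print(marzullo(sources))                            # ((11, 12), 3)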

Journal ArticleDOI
TL;DR: In this study, a new method that handles multiple hypotheses is presented for fault diagnosis using sequence-of-event recorders (SERs); to quantify the certainty of hypotheses, a method to calculate their credibility is provided.
Abstract: In this study, a new method that handles multiple hypotheses is presented for fault diagnosis using sequence of event recorders (SERs). To quantify the certainty of hypotheses, a method to calculate their credibility is provided. The proposed techniques are integrated in a generalized alarm analysis module (GAAM) and have been tested with numerous scenarios from the Italian power system.

Book
30 Nov 2001
TL;DR: This book discusses Von Neumann's approach to fault tolerance, redundant implementations of algebraic machines, and reliable dynamic systems built using distributed voting schemes and constant redundancy.
Abstract: List of Figures. List of Tables. Foreword by George Verghese. Preface. Acknowledgments.
1: Introduction. 1. Definitions, Motivation and Background. 2. Fault-Tolerant Combinational Systems. 2.1. Reliable Combinational Systems. 2.2. Minimizing Redundant Hardware. 3. Fault-Tolerant Dynamic Systems. 3.1. Redundant Implementations. 3.2. Faults in the Error-Correcting Mechanism. 4. Coding Techniques for Fault Diagnosis.
Part I: Fault-Tolerant Combinational Systems.
2: Reliable Combinational Systems Out of Unreliable Components. 1. Introduction. 2. Computational Models for Combinational Systems. 3. Von Neumann's Approach to Fault Tolerance. 4. Extensions of Von Neumann's Approach. 4.1. Maximum Tolerable Noise for 3-Input Gates. 4.2. Maximum Tolerable Noise for u-Input Gates. 5. Related Work and Further Reading.
3: ABFT for Combinational Systems. 1. Introduction. 2. Arithmetic Codes. 3. Algorithm-Based Fault Tolerance. 4. Generalizations of Arithmetic Coding to Operations with Algebraic Structure. 4.1. Fault Tolerance for Abelian Group Operations. 4.1.1. Use of Group Homomorphisms. 4.1.2. Error Detection and Correction. 4.1.3. Separate Group Codes. 4.2. Fault Tolerance for Semigroup Operations. 4.2.1. Use of Semigroup Homomorphisms. 4.2.2. Error Detection and Correction. 4.2.3. Separate Semigroup Codes. 4.3. Extensions.
Part II: Fault-Tolerant Dynamic Systems.
4: Redundant Implementations of Algebraic Machines. 1. Introduction. 2. Algebraic Machines: Definitions and Decompositions. 3. Redundant Implementations of Group Machines. 3.1. Separate Monitors for Group Machines. 3.2. Non-Separate Redundant Implementations for Group Machines. 4. Redundant Implementations of Semigroup Machines. 4.1. Separate Monitors for Reset-Identity Machines. 4.2. Non-Separate Redundant Implementations for Reset-Identity Machines. 5. Summary.
5: Redundant Implementations of Discrete-Time LTI Dynamic Systems. 1. Introduction. 2. Discrete-Time LTI Dynamic Systems. 3. Characterization of Redundant Implementations. 4. Hardware Implementation and Fault Model. 5. Examples of Fault-Tolerant Systems. 6. Summary.
6: Redundant Implementations of Linear Finite-state Machines. 1. Introduction. 2. Linear Finite-State Machines. 3. Characterization of Redundant Implementations. 4. Examples of Fault-Tolerant Systems. 5. Hardware Minimization in Redundant LFSM Implementations. 6. Summary.
7: Unreliable Error Correction in Dynamic Systems. 1. Introduction. 2. Fault Model for Dynamic Systems. 3. Reliable Dynamic Systems using Distributed Voting Schemes. 4. Reliable Linear Finite-State Machines. 4.1. Low-Density Parity Check Codes and Stable Memories. 4.2. Reliable Linear Finite-State Machines using Constant Redundancy. 5. Other Issues.
8: Coding A

Journal ArticleDOI
TL;DR: In this paper, it is shown that the binary decision diagram (BDD) method can overcome some of the difficulties in the analysis of non-coherent fault trees, and the potential benefits that can be derived from the incorporation of NOT logic are examined.
Abstract: Risk and safety assessments carried out on potentially hazardous industrial systems commonly employ fault tree analysis to predict the probability or frequency of system failure. Causes of the system failure mode are developed in an inverted tree structure where the events are linked using logic gates. The type of logic is usually restricted to AND and OR gates which makes the fault tree structure coherent. The use, directly or indirectly, of the NOT logic gate is generally discouraged as this can result in a non-coherent structure. Non-coherent structures mean that components' working states contribute to the failure of the system. The qualitative and quantitative analysis of such fault trees can present additional difficulties when compared to the coherent versions. This paper examines some of the difficulties that can occur, and what potential benefits can be derived from the incorporation of NOT logic. It is shown that the binary decision diagram (BDD) method can overcome some of the difficulties in the analysis of non-coherent fault trees. Copyright © 2001 John Wiley & Sons, Ltd.
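
A toy non-coherent structure function makes the difficulty concrete (this example is illustrative, not from the paper): with a NOT in the tree, repairing a component can switch the system from working to failed, violating the monotonicity that coherent-tree methods assume:

    from itertools import product

    def system_fails(a_failed, b_failed):
        # Top event: A failed AND B *working* (i.e., AND over A and NOT B).
        return a_failed and not b_failed

    for a, b in product([False, True], repeat=2):
        print(f"A_failed={a!s:5} B_failed={b!s:5} -> fails={system_fails(a, b)}")
    # Note the row where flipping B from failed to working turns failure ON:
    # that is exactly the behavior coherent-tree algorithms assume away.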

Journal ArticleDOI
01 Jan 2001
TL;DR: In this article, on-site condition monitoring, safety evaluation and preventive maintenance for earth faults are proposed in order to improve system safety, including on-line parameter measurement based on the resonance-measurement principle.
Abstract: Many mining power systems use ineffectively grounded sources to restrict the residual current of single-phase earth faults in order to reduce outages and shock hazards. In practice, with system expansion, topology changes and insulation aging, the potential residual current and zero-sequence voltage for an earth fault vary dynamically, and some arcing earth faults can easily cause overvoltage and induce multiple faults. In order to improve system safety, on-site condition monitoring, safety evaluation and preventive maintenance for earth faults are proposed in this paper. Some techniques for on-line parameter measurement based on the resonance-measurement principle are presented. Power system operation states are classified into normal secure state, alert state, incipient fault state and fault state. Some security enhancement methods (preventive action and remedial action) for each state are implemented. The prototype for safety evaluation has been developed and has been installed in industrial power systems for several years.

Proceedings ArticleDOI
22 Apr 2001
TL;DR: This work employs the finite state machine (FSM) model for networks to investigate fault identification using passive testing and develops the theorems and algorithms for fault identification.
Abstract: We employ the finite state machine (FSM) model for networks to investigate fault identification using passive testing. First we introduce the concept of passive testing. Then, we introduce the FSM model with necessary assumptions and justification. We introduce the fault model and the fault detection algorithm using passive testing. Extending this result, we develop the theorems and algorithms for fault identification. An example is given illustrating our approach. Then, extensions to our approach are introduced to achieve better fault identification. We then illustrate our technique through a simulation of a practical X.25 example. Finally future extensions and potential trends are discussed.
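
The core of passive testing can be sketched in a few lines (toy encoding assumed): track every specification state consistent with the observed input/output trace, and declare a fault when no state remains:

    def passive_detect(fsm, states, trace):
        # fsm: dict (state, input) -> (next_state, output); trace: [(i, o), ...]
        current = set(states)                 # start fully uncertain
        for i, o in trace:
            current = {fsm[(s, i)][0] for s in current
                       if (s, i) in fsm and fsm[(s, i)][1] == o}
            if not current:
                return "fault detected"
        return f"consistent, possible states: {sorted(current)}"

    fsm = {(0, 'a'): (1, 'x'), (1, 'a'): (0, 'y'),
           (0, 'b'): (0, 'x'), (1, 'b'): (1, 'y')}
    print(passive_detect(fsm, {0, 1}, [('a', 'x'), ('a', 'y'), ('b', 'y')]))
    # -> fault detected: no fault-free execution explains the last output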

Proceedings ArticleDOI
30 Oct 2001
TL;DR: A new fault signature type is introduced, and the idea of low-precision fault diagnosis in which useful diagnoses are obtained with a minimum of data and computation is developed.
Abstract: The foremost problem in VLSI fault diagnosis is the problem of data: there's simply too much of it. Circuits are large and contain enormous numbers of faults of many types. Full-response fault dictionaries are too large for practical or quick diagnosis, but pass-fail dictionaries often do not provide adequate resolution. The goal of this paper is to examine different sets of diagnostic data, determine the utility of each, and investigate ways of drastically reducing the data requirements for fault diagnosis. In doing so we introduce a new fault signature type, and begin to develop the idea of low-precision fault diagnosis in which useful diagnoses are obtained with a minimum of data and computation.
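
A sketch of the pass-fail-dictionary end of the spectrum discussed above (the encoding is assumed): each fault maps to a bit-vector of failing tests, and candidates are ranked by distance to the observed behavior:

    def diagnose(dictionary, observed):
        # Rank candidate faults by Hamming distance between their pass-fail
        # signature and the observed pass-fail behavior.
        def dist(sig):
            return sum(a != b for a, b in zip(sig, observed))
        return sorted(dictionary, key=lambda f: dist(dictionary[f]))

    pf_dict = {"f1": [1, 0, 0, 1], "f2": [1, 1, 0, 0], "f3": [0, 0, 1, 1]}
    print(diagnose(pf_dict, observed=[1, 1, 0, 1]))   # f1/f2 tie ahead of f3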

Proceedings ArticleDOI
04 Dec 2001
TL;DR: An integrated methodology for detecting, isolating and accommodating faults in a class of nonlinear dynamical systems using a fault diagnosis module and a fault-tolerant control module designed to compensate for the effects of faults.
Abstract: The paper presents an integrated methodology for detecting, isolating and accommodating faults in a class of nonlinear dynamical systems. A fault diagnosis module is used for fault detection and isolation. Based on the fault information obtained during the fault diagnosis procedure, a fault-tolerant control module is designed to compensate for the effects of faults. In the presence of a fault, a nominal controller guarantees the boundedness of all the system signals until the fault is detected. Then the controller is reconfigured after fault detection and after fault isolation, respectively, to improve the control performance using the fault information generated by the diagnosis module. Under certain assumptions, the stability of the closed-loop system is rigorously investigated.

Journal ArticleDOI
TL;DR: In this paper, an approach is proposed in which fault detection and diagnosis (FDD) tasks are distributed to separate FDD modules associated with each control system located throughout a plant.

Proceedings ArticleDOI
01 Jul 2001
TL;DR: A new hybrid fault model for clock synchronization and single-round agreement in synchronous distributed systems is proposed, which accurately captures both node and link faults; it is shown that the consistent broadcast primitive of Srikanth & Toueg (1987) can be analyzed under this model.
Abstract: We propose a new hybrid fault model for clock synchronization and single-round (approximate) agreement in synchronous distributed systems, which accurately captures both node and link faults. Unlike conventional "global" fault models, which rest upon the total number of faulty nodes in the system, it relies solely upon the number of faults in any two non-faulty nodes' "perceptions" of the system, conveyed by the messages from all other nodes. This way, arbitrary node and communication faults, including receiver-caused omission and time/value faults, can be modeled properly. As an example, we show that the consistent broadcast primitive (and hence the clock synchronization algorithms) of Srikanth & Toueg (1987) can be analyzed under this model. As far as link faults are concerned, our analysis reveals that as few as 4f_a + 2f_s + 2f_o + 1 nodes are sufficient for tolerating at most f_a asymmetric, f_s symmetric, and f_o omission link faults at any receiving node.
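
A worked check of the quoted sufficiency bound, with the subscripted fault counts written as f_a, f_s and f_o:

    def nodes_required(f_asym, f_sym, f_omit):
        # n >= 4*f_a + 2*f_s + 2*f_o + 1 nodes tolerate f_a asymmetric,
        # f_s symmetric, and f_o omission link faults at any receiving node.
        return 4 * f_asym + 2 * f_sym + 2 * f_omit + 1

    print(nodes_required(1, 0, 0))   # 5 nodes for one asymmetric link fault
    print(nodes_required(1, 1, 2))   # 11 nodes for a mixed fault load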

Journal ArticleDOI
TL;DR: Realization-independent block testing for cores (RIBTEC), a novel ATPG program for such designs, is described, which employs a functional (behavioral) fault model based on a class of nonexhaustive "universal" test sets.
Abstract: Conventional automatic test-pattern generation (ATPG) cannot effectively handle designs employing blocks whose implementation details are either unknown, unavailable, or subject to change. Realization-independent block testing for cores (RIBTEC), a novel ATPG program for such designs, is described, which employs a functional (behavioral) fault model based on a class of nonexhaustive "universal" test sets. Given a circuit's high-level block structure, RIBTEC constructs a universal test set (UTS) for each block from its functional description in such a way that realization independence of the blocks is ensured. Experimental results are presented for representative datapath circuits, which demonstrate that RIBTEC achieves very high fault coverage and an exceptionally high level of realization independence. We also show that RIBTEC can be applied to designs containing a class of small intellectual property (IP) circuits (cores).

Proceedings ArticleDOI
19 Nov 2001
TL;DR: The proposed algorithm mixes a code coverage-oriented approach with fault-oriented optimizations and exploits a fault model at the RT-level that enables efficient fault simulation and guarantees good correlation with gate-level fault coverage.
Abstract: The ASIC design flow is rapidly moving towards higher description levels, and most design activities are now performed at the RT-level. However, test-related activities are lagging behind this trend, mainly because effective fault models and test pattern generation tools are still missing. This paper proposes techniques for implementing a high-level ATPG. The proposed algorithm mixes a code coverage-oriented approach with fault-oriented optimizations. Moreover, it exploits a fault model at the RT-level that enables efficient fault simulation and guarantees good correlation with gate-level fault coverage. Experimental results show that the achieved results are comparable to or better than those obtained at the gate level or by similar RT-level approaches.

Journal ArticleDOI
TL;DR: In this article, a dynamic simulator for MSF desalination plants is presented, which allows the modification of MSF topology and parameters over a wide range, including the number of stages in the recovery and rejection sections, controller parameters (set point, integral time and gain), valve size, pump characteristics, seawater conditions, and stage and heater dimensions.

Journal ArticleDOI
TL;DR: A novel approach based on rough set theory and a pairwise comparison table for fault diagnosis is proposed that attempts to learn from the pattern of decision-making by domain experts from past experience and uses the knowledge acquired, which is in the form of a minimum decision rule set, to determine the ordering of basic events in a fault tree.
Abstract: The performance of manufacturing systems or equipment is, to a great extent, dependent upon the condition of their components. Closely monitoring the condition of the critical components and carrying out timely system diagnosis whenever a fault symptom is detected would help to reduce system downtime and improve overall productivity. Fault tree analysis (FTA) is a powerful tool for reliability studies and risk assessment. However, most research on FTA focuses on the generation of minimum cut sets and how to calculate the probability of main events. As a result, the issue concerning the ordering of basic events in a fault tree has been largely neglected. In this paper, a novel approach based on rough set theory and a pairwise comparison table for fault diagnosis is proposed. The approach attempts to learn from the pattern of decision-making by domain experts from past experience and uses the knowledge acquired, which is in the form of a minimum decision rule set, to determine the ordering of basic events in a fault tree. The details of the approach, together with the basic concepts of rough set theory, are presented. A case study is used to illustrate the application of the proposed approach. Results show that a reasonable ordering of basic events in a fault tree can be generated easily. With the ordering of basic events determined, a maintenance engineer in a manufacturing plant can then carry out fault diagnosis in an efficient and orderly manner.

Patent
11 May 2001
TL;DR: In this paper, a new testing method uses a field programmable gate array to emulate faults, instead of using a separate computer to simulate faults, and a fault model that can be used in the present invention is disclosed.
Abstract: A new testing method uses a field programmable gate array to emulate faults, instead of using a separate computer to simulate faults. In one embodiment, a few (e.g., two or three) known good FPGAs are selected. A fault is introduced into the design of a FPGA configuration. The configuration is loaded into the FPGAs. A test vector is applied and the result is evaluated. If the result is different from that of a fault-free configuration, the fault is caught. One application of this method is to evaluate fault coverage. A fault model that can be used in the present invention is disclosed.
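
A software stand-in for the flow described in the patent (the patent performs this on real FPGAs; the netlist encoding here is invented for illustration): inject one stuck-at fault, apply a test vector, and count the fault as caught if the output differs from the fault-free run:

    def evaluate(netlist, vector, stuck=None):
        # netlist: list of (name, fn, input_names) in topological order;
        # stuck: optional (name, value) emulating a stuck-at fault.
        values = dict(vector)
        for name, fn, ins in netlist:
            values[name] = fn(*(values[i] for i in ins))
            if stuck and stuck[0] == name:
                values[name] = stuck[1]
        return values[netlist[-1][0]]

    AND = lambda a, b: a & b
    OR  = lambda a, b: a | b
    net = [("n1", AND, ("a", "b")), ("out", OR, ("n1", "c"))]
    vec = {"a": 1, "b": 1, "c": 0}
    print(evaluate(net, vec), evaluate(net, vec, stuck=("n1", 0)))  # 1 0: caught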

Proceedings ArticleDOI
28 Oct 2001
TL;DR: An application of combinatorial designs and variance analysis to correlating events in the midst of multiple network faults shows that statistical analysis can pinpoint the probable causes of the observed symptoms with high accuracy and significant level of confidence.
Abstract: We present an application of combinatorial designs and variance analysis to correlating events in the midst of multiple network faults. The network fault model is based on the probabilistic dependency graph that accounts for the uncertainty about the state of network elements. Orthogonal arrays help reduce the exponential number of failure configurations to a small subset on which further analysis is performed. The preliminary results show that statistical analysis can pinpoint the probable causes of the observed symptoms with high accuracy and a significant level of confidence. An example demonstrates how multiple soft link failures are localized in MIL-STD 188-220's datalink layer to explain the end-to-end connectivity problems in the network layer. This technique can be utilized for networks operating in an unreliable environment such as wireless and/or military networks.

01 Jan 2001
TL;DR: A new distribution network fault diagnosis approach is proposed to deal with imperfect alarm signals caused by malfunction or failed operation of protection relays and circuit breakers, or by errors in the communication equipment.
Abstract: Based on rough set theory, this paper proposes a new distribution network fault diagnosis approach to deal with imperfect alarm signals caused by malfunction or failed operation of protection relays and circuit breakers, or by errors in the communication equipment. Because rough set theory can effectively handle imprecise problems without any prior information other than the data set itself, a decision table covering all kinds of fault cases is established from the signals of protection relays and circuit breakers. Diagnostic rules are then extracted by reducing the decision table; using the reducts, the diagnosis rules can be obtained directly from the established fault samples. The method can tell indispensable fault signals from dispensable ones and discover the inherent redundancy of the alarm signal set. Finally, a practical fault diagnosis program is presented, implemented in VC++ 6.0 with the main interface built in VB 6.0. Extensive simulation results show that the method is quite efficient and has excellent fault-tolerance capability.

Journal ArticleDOI
TL;DR: Results of trained OI Nets on the Iris classification problem show that fault tolerance can be increased with the algorithm presented, resulting in low weight salience and distributed computation.
Abstract: The recursive training algorithm for the optimal interpolative (OI) classification network is extended to include distributed fault tolerance. The conventional OI Net learning algorithm leads to network weights that are nonoptimally distributed (in the sense of fault tolerance). Fault tolerance is becoming an increasingly important factor in hardware implementations of neural networks. But fault tolerance is often taken for granted in neural networks rather than being explicitly accounted for in the architecture or learning algorithm. In addition, when fault tolerance is considered, it is often accounted for using an unrealistic fault model (e.g., neurons that are stuck on or off rather than small weight perturbations). Realistic fault tolerance can be achieved through a smooth distribution of weights, resulting in low weight salience and distributed computation. Results of trained OI Nets on the Iris classification problem show that fault tolerance can be increased with the algorithm presented in this paper.
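
The "realistic" fault model the abstract argues for, small weight perturbations rather than stuck-at neurons, can be evaluated with a simple Monte Carlo sketch (a toy linear classifier stands in for the OI Net; all names here are illustrative):

    import numpy as np

    def accuracy_under_perturbation(W, X, y, sigma, trials=100, rng=None):
        # Average accuracy when trained weights receive small Gaussian
        # perturbations; flat, distributed weights degrade gracefully.
        rng = rng or np.random.default_rng(0)
        accs = []
        for _ in range(trials):
            Wp = W + rng.normal(0.0, sigma, W.shape)
            pred = (X @ Wp).argmax(axis=1)
            accs.append((pred == y).mean())
        return float(np.mean(accs))

    rng = np.random.default_rng(1)
    X, W = rng.normal(size=(200, 4)), rng.normal(size=(4, 3))
    y = (X @ W).argmax(axis=1)            # labels from the clean weights
    print(accuracy_under_perturbation(W, X, y, sigma=0.05))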

Patent
Ochiai Shinichi1
26 Jan 2001
TL;DR: In this article, a fault management table is provided for storing operation mode information, indicating the operating status of the information processing system, together with the type of fault handling processing corresponding to the detected fault, in such a manner as to relate the operation mode information to the type of fault handling processing.
Abstract: A fault handling system which detects a fault that has occurred in an information processing system and performs fault handling processing corresponding to the detected fault in order to recover from the detected fault condition. The fault handling system is provided with a fault management table for storing operation mode information indicating the operating status of the information processing system and a type of fault handling processing corresponding to the detected fault in the information processing system, in such a manner as to relate the operation mode information to the type of fault handling processing, and a fault handling facility for determining the operation mode information and obtaining the type of fault handling processing corresponding to the determined operation mode information by referring to the fault management table.
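
A toy rendering of the claimed lookup (mode names, fault types, and actions are invented for illustration):

    # The handling action is looked up from the current operation mode and
    # the detected fault type, falling back to a default recovery action.
    FAULT_TABLE = {
        ("normal",      "memory_error"): "retry_and_log",
        ("normal",      "io_timeout"):   "failover_device",
        ("maintenance", "memory_error"): "log_only",
        ("maintenance", "io_timeout"):   "log_only",
    }

    def handle_fault(mode, fault_type):
        return FAULT_TABLE.get((mode, fault_type), "default_recovery")

    print(handle_fault("normal", "io_timeout"))     # -> failover_device
    print(handle_fault("degraded", "io_timeout"))   # -> default_recovery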

Patent
31 Aug 2001
TL;DR: In this paper, when a fault-on-fault condition arises in a data processing system which follows a backup fault procedure in the fault handling process, control is passed to dedicated firmware.
Abstract: When a fault-on-fault condition arises in a data processing system which follows a backup fault procedure in the fault handling process, control is passed to dedicated firmware. Fault flags are reset and information vital to maintaining operating system control is sent to a reserved memory (which can be written to in limited circumstances) under firmware control. Control is then transferred to an Intercept process resident in the reserved memory which attempts to build a stable environment for the operating system to dump the system memory. If possible, a dump is taken, and a normal operating system restart is carried out. If not possible, a message with the vital fault information is issued, and a full manual restart must be taken. Even in the latter case, the fault information is available to help in determining the cause of the fault-on-fault.