Showing papers on "Fault detection and isolation" published in 1995


Journal ArticleDOI
TL;DR: The approach to failure diagnosis presented in this paper is applicable to systems that fall naturally in the class of DES's; moreover, for the purpose of diagnosis, most continuous variable dynamic systems can be viewed as DES's at a higher level of abstraction.
Abstract: Fault detection and isolation is a crucial and challenging task in the automatic control of large complex systems. We propose a discrete-event system (DES) approach to the problem of failure diagnosis. We introduce two related notions of diagnosability of DES's in the framework of formal languages and compare diagnosability with the related notions of observability and invertibility. We present a systematic procedure for detection and isolation of failure events using diagnosers and provide necessary and sufficient conditions for a language to be diagnosable. The diagnoser performs diagnostics using online observations of the system behavior; it is also used to state and verify off-line the necessary and sufficient conditions for diagnosability. These conditions are stated on the diagnoser or variations thereof. The approach to failure diagnosis presented in this paper is applicable to systems that fall naturally in the class of DES's; moreover, for the purpose of diagnosis, most continuous variable dynamic systems can be viewed as DES's at a higher level of abstraction.

1,599 citations


Journal ArticleDOI
TL;DR: A unified theory of sequential changepoint detection is introduced which leads to a class of sequential detection rules which are not too demanding in computational and memory requirements for on-line implementation and yet are nearly optimal under several performance criteria.
Abstract: After a brief survey of a large variety of sequential detection procedures that are widely scattered in statistical references on quality control and engineering references on fault detection and signal processing, we study some open problems concerning these procedures and introduce a unified theory of sequential changepoint detection. This theory leads to a class of sequential detection rules which are not too demanding in computational and memory requirements for on-line implementation and yet are nearly optimal under several performance criteria.

563 citations
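
The abstract above does not name the detection rules it studies. As an illustration of what a sequential detection rule with low computational and memory requirements can look like, here is a minimal sketch of the classic one-sided CUSUM rule, which is a standard example in this area and not necessarily the rule the paper proposes. The function name and the parameters `target_mean`, `drift`, and `threshold` are assumptions chosen for this example.

```python
def cusum_alarm(samples, target_mean, drift, threshold):
    """Return the index of the first alarm, or None if no alarm is raised.

    One-sided CUSUM for an upward shift in the mean (illustrative only).
    Only a single running statistic is stored, so memory use is constant.
    """
    statistic = 0.0
    for index, value in enumerate(samples):
        # Accumulate evidence of an upward shift; reset at zero.
        statistic = max(0.0, statistic + (value - target_mean - drift))
        if statistic > threshold:
            return index
    return None


# Hypothetical data: the mean jumps from 0 to 1 at index 5.
data = [0.0] * 5 + [1.0] * 5
first_alarm = cusum_alarm(data, target_mean=0.0, drift=0.25, threshold=2.0)
```

With these hypothetical settings the statistic grows by 0.75 per sample after the change, so the alarm is raised a few samples after index 5.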


Journal ArticleDOI
TL;DR: The methodology and guidelines for the design of flexible software-based fault and error injection are described, and a tool, FERRARI, that incorporates the techniques is presented; experimental results demonstrate the effectiveness of the software-based error injection tool in evaluating the dependability properties of complex systems.
Abstract: A major step toward the development of fault-tolerant computer systems is the validation of the dependability properties of these systems. Fault/error injection has been recognized as a powerful approach to validate the fault tolerance mechanisms of a system and to obtain statistics on parameters such as coverages and latencies. This paper describes the methodology and guidelines for the design of flexible software-based fault and error injection and presents a tool, FERRARI, that incorporates the techniques. The techniques used to emulate transient errors and permanent faults in software are described in detail. Experimental results are presented for several error detection techniques, and they demonstrate the effectiveness of the software-based error injection tool in evaluating the dependability properties of complex systems.

370 citations


Journal ArticleDOI
01 Nov 1995
TL;DR: In this framework, neural network models constitute an important class of on-line approximators; a systematic procedure for constructing nonlinear estimation algorithms is developed, and a stable learning scheme is derived using Lyapunov theory.
Abstract: The detection, diagnosis, and accommodation of system failures or degradations are becoming increasingly important in modern engineering problems. A system failure often causes changes in critical system parameters, or even changes in the nonlinear dynamics of the system. This paper presents a general framework for constructing automated fault diagnosis and accommodation architectures using on-line approximators and adaptation/learning schemes. In this framework, neural network models constitute an important class of on-line approximators. Changes in the system dynamics are monitored by an on-line approximation model, which is used not only for detecting but also for accommodating failures. A systematic procedure for constructing nonlinear estimation algorithms is developed, and a stable learning scheme is derived using Lyapunov theory. Simulation studies are used to illustrate the results and to gain intuition into the selection of design parameters.

314 citations


Proceedings ArticleDOI
24 Oct 1995
TL;DR: A tool which supports execution slicing and dicing based on test cases is described and an experiment that uses heuristic techniques in fault localization is reported.
Abstract: Finding a fault in a program is a complex process which involves understanding the program's purpose, structure, semantics, and the relevant characteristics of failure producing tests. We describe a tool which supports execution slicing and dicing based on test cases. We report the results of an experiment that uses heuristic techniques in fault localization.

305 citations
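
The abstract above mentions execution slicing and dicing but does not describe the tool's heuristics. A minimal sketch of the basic idea, assuming an execution slice is the set of code blocks a test executes and a dice is a set difference between slices: the helper name and the block identifiers below are hypothetical.

```python
def execution_dice(failing_slices, passing_slices):
    """Blocks executed by every failing test but by no passing test.

    Each slice is a set of block identifiers. This is one simple dicing
    heuristic for illustration, not necessarily the one used by the tool.
    """
    if not failing_slices:
        return set()
    in_all_failing = set.intersection(*(set(s) for s in failing_slices))
    in_any_passing = set().union(*(set(s) for s in passing_slices))
    return in_all_failing - in_any_passing


# Hypothetical block identifiers for two failing tests and one passing test.
suspects = execution_dice(
    failing_slices=[{1, 2, 3, 4}, {2, 3, 4, 5}],
    passing_slices=[{1, 2, 5}],
)
```

The resulting set is a starting point for inspection; a fault can still lie in a block that passing tests also execute.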


Proceedings ArticleDOI
23 Apr 1995
TL;DR: An empirical study using the block and all-uses criteria as the coverage measures to address the issue of whether the size of T or the coverage of T on P determines the fault detection effectiveness of T for P.
Abstract: Size and code coverage are important attributes of a set of tests. When a program P is executed on elements of the test set T, we can observe the fault detecting capability of T for P. We can also observe the degree to which T induces code coverage on P according to some coverage criterion. We would like to know whether it is the size of T or the coverage of T on P which determines the fault detection effectiveness of T for P. To address this issue we ask the following question: While keeping coverage constant, what is the effect on fault detection of reducing the size of a test set? We report results from an empirical study using the block and all-uses criteria as the coverage measures.

264 citations
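
The question posed in the abstract above, reducing the size of a test set while keeping coverage constant, can be illustrated with a small sketch. The abstract does not say how the study reduced its test sets; the greedy selection below is an assumption made for illustration, and the test names and block numbers are hypothetical.

```python
def minimize_test_set(coverage):
    """Pick a subset of tests that covers the same blocks as the full set.

    `coverage` maps a test name to the set of blocks it covers. Greedy
    selection: repeatedly take the test covering the most uncovered blocks.
    """
    remaining = set().union(*coverage.values())
    chosen = []
    while remaining:
        # Sorting the names makes tie-breaking deterministic.
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & remaining))
        chosen.append(best)
        remaining -= coverage[best]
    return chosen


# Hypothetical block coverage for four tests.
block_coverage = {"t1": {1, 2}, "t2": {2, 3}, "t3": {1, 2, 3}, "t4": {4}}
reduced = minimize_test_set(block_coverage)
```

Comparing the faults detected by the full set and by the reduced set would then show how much detection depends on size rather than coverage.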


Journal ArticleDOI
01 Aug 1995
TL;DR: This paper presents a layered fault tolerance framework containing new fault detection and tolerance schemes, divided into servo, interface, and supervisor layers which provide different levels of detection and tolerance capabilities for structurally diverse robots.
Abstract: This paper presents a layered fault tolerance framework containing new fault detection and tolerance schemes. The framework is divided into servo, interface, and supervisor layers. The servo layer is the continuous robot system and its normal controller. The interface layer monitors the servo layer for sensor or motor failures using analytical redundancy based fault detection tests. A newly developed algorithm generates the dynamic thresholds necessary to adapt the detection tests to the modeling inaccuracies present in robotic control. Depending on the initial conditions, the interface layer can provide some sensor fault tolerance automatically without direction from the supervisor. If the interface runs out of alternatives, the discrete event supervisor searches for remaining tolerance options and initiates the appropriate action based on the current robot structure indicated by the fault tree database. The layers form a hierarchy of fault tolerance which provides different levels of detection and tolerance capabilities for structurally diverse robots.

182 citations


Journal ArticleDOI
TL;DR: In this article, a new approach to power system fault analysis using synchronized sampling is introduced, which can be extremely fast, selective and accurate, providing fault analysis performance that can not easily be matched by other known techniques.
Abstract: This paper introduces a new approach to power system fault analysis using synchronized sampling. A digital fault recorder with a Global Positioning System (GPS) satellite receiver is the source of data for this approach. Fault analysis functions, such as fault detection, classification and location are implemented for a power transmission line using synchronized samples from two ends of a line. This technique can be extremely fast, selective and accurate, providing fault analysis performance that can not easily be matched by other known techniques.

174 citations


Patent
08 May 1995
TL;DR: In this paper, a fault detection system was proposed to detect the existence of unwanted electrical paths between the high voltage traction system of an electric car and the chassis of the car, where a comparator compares the sum of the voltages to a setpoint which varies proportionately to the varying voltage of the battery of the traction system.
Abstract: A fault detection system detects the existence of unwanted electrical paths between the high voltage traction system of an electric car and the chassis of the car. The fault detection system includes a positive sampling RC circuit connected to the positive conductor of the traction system and a negative sampling RC circuit connected to the negative conductor of the traction system. Each RC circuit generates a voltage, and the voltages are balanced, i.e., the voltages are equal and opposite, when no leakage path exists. In contrast, when a leakage path to chassis exists, the voltages are not balanced. A comparator compares the sum of the voltages to a setpoint which varies proportionately to the varying voltage of the battery of the traction system, and a fault condition is indicated when the sum of the voltages exceeds the setpoint.

159 citations
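
The comparator logic described in the abstract above can be sketched in a few lines. The patent describes an analog circuit, so this is only a numerical illustration; the function name, the proportionality constant `ratio`, and the use of the absolute value of the sum are assumptions, not details taken from the patent.

```python
def leakage_fault(v_pos, v_neg, battery_voltage, ratio=0.05):
    """Return True when the sampled voltages indicate a leakage path.

    With no leakage, v_pos and v_neg are equal and opposite, so their sum
    is near zero. The setpoint scales with the battery voltage.
    `ratio` is a hypothetical constant chosen for this example.
    """
    setpoint = ratio * battery_voltage
    # Assumption: an imbalance in either direction counts as a fault.
    return abs(v_pos + v_neg) > setpoint


balanced = leakage_fault(5.0, -5.0, battery_voltage=300.0)
leaking = leakage_fault(2.0, -30.0, battery_voltage=300.0)
```

Scaling the setpoint with the battery voltage keeps the test equally sensitive as the battery discharges.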


Book
01 Jan 1995
TL;DR: A methodical procedure for organization of fault detection experiments for synchronous sequential machines possessing distinguishing sequences (DS) is given, based on the transition checking approach.
Abstract: A methodical procedure for organization of fault detection experiments for synchronous sequential machines possessing distinguishing sequences (DS) is given. The organization is based on the transition checking approach. The checking experiment is considered in three concatenative parts: 1) the initial sequence which brings the machine under test into a specific state, 2) the α-sequence to recognize all the states and to establish the information about the next states under the input DS, and 3) the β-sequence to check all the individual transitions in the state table.

157 citations
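
The procedure above relies on the machine having a distinguishing sequence, that is, an input sequence whose output response is different for every starting state. A minimal sketch of checking that property for a Mealy machine follows; the dictionary encoding of the state table and the example machine are assumptions made for illustration.

```python
def run(machine, state, inputs):
    """Apply `inputs` from `state` and return the tuple of outputs.

    `machine` maps (state, input symbol) to (next state, output).
    """
    outputs = []
    for symbol in inputs:
        state, out = machine[(state, symbol)]
        outputs.append(out)
    return tuple(outputs)


def is_distinguishing(machine, states, inputs):
    """True if every starting state yields a different output sequence."""
    responses = {run(machine, s, inputs) for s in states}
    return len(responses) == len(states)


# Hypothetical three-state machine over the single input symbol "a".
table = {
    ("A", "a"): ("B", 0),
    ("B", "a"): ("C", 0),
    ("C", "a"): ("A", 1),
}
short_ok = is_distinguishing(table, ["A", "B", "C"], "a")
long_ok = is_distinguishing(table, ["A", "B", "C"], "aa")
```

In this example one input symbol cannot separate states A and B, while two symbols separate all three states.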


Patent
04 Dec 1995
TL;DR: In this paper, the authors present an efficient method for implementing a safe virtual machine, in software, that embodies a general purpose memory protection model, running on any general purpose computer architecture and will run an executable that has been developed for the virtual machine.
Abstract: An efficient method for implementing a safe virtual machine, in software, that embodies a general purpose memory protection model. The present invention runs on any general purpose computer architecture and will run an executable that has been developed for the virtual machine. The present invention compiles the executable into the native instructions of the hardware. During the compilation, specialized code sequences are added to the code using a technique called software fault isolation. A set of allowed behaviors and a set of responses to the undesirable actions will be created and written to memory. A series of optimizations are applied so that the translated code executes at nearly the native speed of the architecture, but the fault isolation sequences prevent it from engaging in undesirable actions. In particular, the memory protection model is enforced, providing the same level of protection that customarily requires hardware support to enforce efficiently.

Book ChapterDOI
25 Sep 1995
TL;DR: The results suggest that inexperienced subjects can apply a formal verification technique (code reading) as effectively as an execution-based validation technique, but they are most efficient when using functional testing.
Abstract: We replicated a controlled experiment first run in the early 1980's to evaluate the effectiveness and efficiency of 50 student subjects who used three defect-detection techniques to observe failures and isolate faults in small C programs. The three techniques were code reading by stepwise abstraction, functional (black-box) testing, and structural (white-box) testing. Two internal replications showed that our relatively inexperienced subjects were similarly effective at observing failures and isolating faults with all three techniques. However, our subjects were most efficient at both tasks when they used functional testing. Some significant differences among the techniques in their effectiveness at isolating faults of different types were seen. These results suggest that inexperienced subjects can apply a formal verification technique (code reading) as effectively as an execution-based validation technique, but they are most efficient when using functional testing.

Journal ArticleDOI
TL;DR: Modeling, based on the data, shows that, in addition to reducing the number of software faults, software dependability can be enhanced by reducing the recurrence rate.
Abstract: Based on extensive field failure data for Tandem's GUARDIAN operating system, the paper discusses evaluation of the dependability of operational software. Software faults considered are major defects that result in processor failures and invoke backup processes to take over. The paper categorizes the underlying causes of software failures and evaluates the effectiveness of the process pair technique in tolerating software faults. A model to describe the impact of software faults on the reliability of an overall system is proposed. The model is used to evaluate the significance of key factors that determine software dependability and to identify areas for improvement. An analysis of the data shows that about 77% of processor failures that are initially considered due to software are confirmed as software problems. The analysis shows that the use of process pairs to provide checkpointing and restart (originally intended for tolerating hardware faults) allows the system to tolerate about 75% of reported software faults that result in processor failures. The loose coupling between processors, which results in the backup execution (the processor state and the sequence of events) being different from the original execution, is a major reason for the measured software fault tolerance. Over two-thirds (72%) of measured software failures are recurrences of previously reported faults. Modeling, based on the data, shows that, in addition to reducing the number of software faults, software dependability can be enhanced by reducing the recurrence rate.

Journal ArticleDOI
TL;DR: The authors introduce the methodology behind a novel hybrid neural/fuzzy system which merges the neural network and fuzzy logic technologies to solve fault detection problems.
Abstract: The use of electric motors in industry is extensive. These motors are exposed to a wide variety of environments and conditions which age the motor and make it subject to incipient faults. These incipient faults, if left undetected, contribute to the degradation and eventual failure of the motors. Artificial neural networks have been proposed and have demonstrated the capability of solving the motor monitoring and fault detection problem using an inexpensive, reliable, and noninvasive procedure. However, the major drawback of conventional artificial neural network fault detection is the inherent black box approach that can provide the correct solution, but does not provide heuristic interpretation of the solution. Engineers prefer accurate fault detection as well as the heuristic knowledge behind the fault detection process. Fuzzy logic is a technology that can easily provide heuristic reasoning but has difficulty providing exact solutions. The authors introduce the methodology behind a novel hybrid neural/fuzzy system which merges the neural network and fuzzy logic technologies to solve fault detection problems. They also discuss a training procedure for this neural/fuzzy fault detection system. This procedure is used to determine the correct solutions while providing qualitative, heuristic knowledge about the solutions.

Journal ArticleDOI
Ron J. Patton
TL;DR: In this paper, the authors give the state-of-the-art in robust fault diagnosis, based principally on residual generation, and some of the key challenges and potential for future directions in the research are drawn up.

15 Sep 1995
TL;DR: An approximate model is proposed, incorporating the main inertial effects contributing to integrity, that can be used to calculate the achievable horizontal protection level (HPL) at any geographical location and time and is used to estimate the availability of fault detection for an integrated GPS/inertial system.
Abstract: GPS systems possess a high availability of accurate horizontal positioning, but detection and exclusion techniques based on satellite redundancy do not provide the integrity availability sufficient for primary means navigation in the terminal and non-precision phases of flight. To achieve primary means integrity availability, different types of augmentations such as WAAS, Loran, baro aiding, atomic clock and inertial aiding are under consideration. Different techniques have been proposed to combine GPS and inertial sensor information, and recently published findings seem to indicate that primary means integrity availability is achievable with standard 2 nmi/h inertial sensor performance. This paper investigates and quantifies the different inertial effects that contribute to enhanced integrity of the integrated GPS/inertial system, such as coasting and Schuler feedback. A Kalman filter based integration scheme that preserves the integrity information in an optimal fashion is presented and used to quantify integrity performance. The availability of fault detection and exclusion will depend on the geometry of the satellites used for positioning. The availability of satellites in good geometry is in turn a function of the status of the GPS constellation (failures, maintenance, etc.). Methods for availability calculations have been developed and adopted by the RTCA SC-159. The availability of fault detection and exclusion over a specified region can be calculated for most augmentations, but corresponding estimates (conforming with RTCA SC-159 guidelines) of the FDE availability provided by GPS/inertial integration techniques have not yet been published. This paper proposes an approximate model, incorporating the main inertial effects contributing to integrity, that can be used to calculate the achievable integrity (horizontal integrity limit) at any geographical location and time. This model is used to estimate the availability of fault detection (FD) for an integrated GPS/inertial system. The availability of fault detection for the GPS/inertial system is compared to the availability of FD for other augmentations to provide trade-off information.

Journal ArticleDOI
TL;DR: The parity relation approach is compared with the traditional detection filter design, and is shown to be more straightforward and have milder existence conditions; if subjected to the same specification, the two approaches yield identical residual generators.

Proceedings ArticleDOI
21 Jun 1995
TL;DR: The described methodology was verified by experiments with several technical processes like electric motors, actuators, pumps, machine tools, robots, heat exchangers, combustion engines and vehicles.
Abstract: For the fault detection of technical processes, different methods can be applied based on the information extracted from directly measured signals, from signal models and from process models. Examples of signal model based fault detection methods are spectral analysis or parameter estimation of ARMA models; examples of process model based methods are parameter estimation, state estimation or parity equation approaches. A comparison of these methods shows that they have different properties with regard to the detection of faults in the process, the actuators and the sensors. By a proper integration of different fault detection methods, their advantages can be combined to generate a number of different analytical symptoms. For fault diagnosis a knowledge based procedure is required, because qualitative information in the form of heuristic symptoms also has to be taken into account. Based on heuristic process knowledge as fault-symptom causalities and a unified representation of all symptoms, an integrated fault diagnosis can be performed. This comprises the treatment of the symptoms as uncertain facts and approximate diagnostic reasoning via if-then rules either in a probabilistic or a fuzzy-logic (possibilistic) frame. The described methodology was verified by experiments with several technical processes like electric motors, actuators, pumps, machine tools, robots, heat exchangers, combustion engines and vehicles.

Proceedings ArticleDOI
24 Apr 1995
TL;DR: In this paper, the most relevant techniques, from classical monitoring to more recent model-based methods, are reviewed for improving reliability and avoiding unplanned maintenance in rail vehicle traction and braking systems.
Abstract: A number of techniques are available for improving reliability and avoiding unplanned maintenance in rail vehicle traction and braking systems. In this paper, the most relevant techniques, from classical monitoring to the more recent model-based methods, are reviewed. Although available for some years, many of the classical methods are often not used because of the added instrumentation cost. With increasingly tighter slip and slide control requirements, more sophisticated control strategies, many of which make use of state variable observers, are being used. As a result, the use of modern fault detection algorithms is enabled at little extra cost. Possibilities for observer design for fault tolerant control in the face of sensor failures are also explored. (13 pages)

Patent
01 Jun 1995
TL;DR: In this paper, a fault detection circuit for detecting leakage currents between a DC power source and chassis of an automobile, including a voltage sensor coupled to the power source, the voltage sensor including an analog reference and a chassis ground, was proposed.
Abstract: A fault detection circuit for detecting leakage currents between a DC power source and chassis of an automobile includes a voltage sensor coupled to the DC power source, the voltage sensor including an analog reference and a chassis ground. A differential amplifier is coupled to the voltage sensor and detects variations in the analog reference and the chassis ground. A voltage comparator unit determines whether the variations detected in the differential amplifier are above a predetermined threshold value. A built-in test circuit tests whether the fault detection circuit is operating correctly.

Journal ArticleDOI
TL;DR: Two active compaction methods based on essential faults are developed to reduce a given test set, forced pair-merging and essential fault pruning, which achieves further compaction from removal of a pattern by modifying other patterns of the test set to detect the essential faults of the target pattern.
Abstract: Test set compaction for combinational circuits is studied in this paper. Two active compaction methods based on essential faults are developed to reduce a given test set. The special feature is that the given test set will be adaptively renewed to increase the chance of compaction. In the first method, forced pair-merging, pairs of patterns are merged by modifying their incompatible specified bits without sacrificing the original fault coverage. The other method, essential fault pruning, achieves further compaction from removal of a pattern by modifying other patterns of the test set to detect the essential faults of the target pattern. With these two developed methods, the compacted test size on the ISCAS'85 benchmark circuits is smaller than that of COMPACTEST by more than 20%, and 12% smaller than that by ROTCO+COMPACTEST.

Patent
24 Jul 1995
TL;DR: In this article, the authors present an approach for detecting and correcting various fault conditions in an operating Coriolis effect mass flowmeter, including detecting the presence of a crack in the flow tubes and stopping the flow of material to prevent release of the material through a cracked flow tube.
Abstract: Apparatus and methods for detecting and correcting various fault conditions in an operating Coriolis effect mass flowmeter. The apparatus of the present invention receives information from an operating Coriolis mass flowmeter and compares the information to threshold signatures representing various fault conditions. When a fault condition is detected, output signals are applied to inform an operator and to control the mass flow rate through the flowmeter to correct the fault condition. Specifically, the methods of the present invention detect the presence of a crack in the flow tubes and stop the flow of material to prevent release of the material through a cracked flow tube. Other methods of the present invention detect the void fraction of material flowing through the flow tubes, compute a corrected actual mass flow rate, and control the mass flow rate through the flowmeter to compensate for the effects of the void fraction. Signature information relating to threshold values for measured frequency, drive power, temperature and mass flow of the operating flowmeter as well as the slope and curvature of changes in each measured operating parameter are stored in memory within the fault detection apparatus of the present invention.

Journal ArticleDOI
Mihalis Yannakakis, David Lee
TL;DR: In this article, simple randomized algorithms for fault detection of finite state machines are presented. But they do not consider the fault detection problem of partially specified finite state machine (PSM) specifications.

Proceedings ArticleDOI
30 Apr 1995
TL;DR: A new parametric bridging fault model is proposed that realistically represents the faulty behavior according to the intrinsic resistance, which is not known a priori.
Abstract: From circuit measurement, it has been demonstrated that actual bridging faults have an intrinsic resistance mainly in the range from 0 Ω to 500 Ω. This paper first analyses the consequences of this resistance on the electrical and logic behavior of bridging faults. Second, it is demonstrated that the classical models such as the voting model, which consider the resistance as negligible, do not accurately and realistically represent the behavior of the fault. Third, a new parametric bridging fault model is proposed that realistically represents the faulty behavior according to the intrinsic resistance, which is not known a priori. Finally, a parametric bridging fault simulation algorithm is described together with some redefinition of the classical concepts of fault detection and fault coverage.

Patent
24 Aug 1995
TL;DR: In this article, a fault detection element reads the resultant signal from the bus and compares it with at least portions of the corresponding signals originally generated by the processing sections themselves; if there is a discrepancy, the fault detector signals a fault, e.g., causing the unit to be taken off-line.
Abstract: A digital data processing device includes a bus for transmitting signals (e.g., data and/or address information) between plural functional units (e.g., a central processing unit and a peripheral controller). A first such unit includes first and second processing sections that concurrently apply to the bus complementary portions of like information signals (e.g., longwords containing data). A fault detection element reads the resultant signal from the bus and compares it with at least portions of the corresponding signals originally generated by the processing sections themselves. If there is discrepancy, the fault-detector signals a fault, e.g., causing the unit to be taken off-line. By use of a redundant unit, processing can continue for fault-tolerant operation.

Book ChapterDOI
20 Sep 1995
TL;DR: FTAPE (Fault Tolerance And Performance Evaluator) is a tool that can be used to compare fault-tolerant computers and the errors/fault ratio, performance degradation, and number of system crashes are presented as measures of fault tolerance.
Abstract: This paper describes FTAPE (Fault Tolerance And Performance Evaluator), a tool that can be used to compare fault-tolerant computers. The major parts of the tool include a system-wide fault injector, a workload generator, and a workload activity measurement tool. The workload creates high stress conditions on the machine. Using stress-based injection, the fault injector is able to utilize knowledge of the workload activity to ensure a high level of fault propagation. The errors/fault ratio, performance degradation, and number of system crashes are presented as measures of fault tolerance.

Proceedings ArticleDOI
02 Apr 1995
TL;DR: The authors introduce a new model for faults and alarms based on probabilistic finite state machines and propose two algorithms: one that acquires the fault models from possibly incomplete and incorrect data, and one that correlates alarms in the presence of multiple faults and noisy information.
Abstract: In communication networks, a large number of alarms exist to signal any abnormal behavior of the network. As network faults typically result in a number of alarms, correlating these different alarms and identifying their source is a major problem in fault management. The alarm correlation problem is of major practical significance. Alarms that have not been correlated may not only lead to significant misdirected efforts, based on insufficient information, but may cause multiple corrective actions (possibly contradictory) as each alert is handled independently. The paper proposes a general framework to solve the alarm correlation problem. The authors introduce a new model for faults and alarms based on probabilistic finite state machines. They propose two algorithms. The first one acquires the fault models starting from possibly incomplete and incorrect data. The second one correlates alarms in the presence of multiple faults and noisy information. Both algorithms have polynomial time complexity, use an extension of the Viterbi algorithm to deal with the corrupted data, and can be implemented in hardware. As an example, they are applied to analyse faults using data generated by the ANS (Advanced Network and Services, Inc.)/NSF T3 network.

Proceedings ArticleDOI
05 Dec 1995
TL;DR: A scheme to guarantee that the execution of real-time tasks can tolerate transient and intermittent faults assuming any queue-based scheduling technique and a dynamic programming optimal solution and a greedy heuristic which closely approximates the optimal are presented.
Abstract: We present a scheme to guarantee that the execution of real-time tasks can tolerate transient and intermittent faults assuming any queue-based scheduling technique. The scheme is based on reserving sufficient slack in a schedule such that a task can be re-executed before its deadline without compromising guarantees given to other tasks. Only enough slack is reserved in the schedule to guarantee fault tolerance if at most one fault occurs within a time interval. This results in increased schedulability and a very low percentage of deadline misses even if no restriction is placed on the fault separation. We provide two algorithms to solve the problem of adding fault tolerance to a queue of real-time tasks. The first is a dynamic programming optimal solution and the second is a greedy heuristic which closely approximates the optimal.

Journal ArticleDOI
TL;DR: In this article, a nonlinear parity equation residual generation scheme is presented that uses forward and inverse dynamic models of nonlinear systems to diagnose sensor and actuator faults in an internal combustion engine during execution of the United States Environmental Protection Agency Inspection and Maintenance 240 driving cycle.

Journal ArticleDOI
TL;DR: In this article, a review of model-based fault diagnosis techniques is provided, starting from basic principles, the properties and limitations of different methods are discussed, and the problems encountered in both the residual generation and evaluation stages of fault diagnosis have been outlined.
Abstract: This paper provides a review of model-based fault diagnosis techniques. Starting from basic principles, the properties and limitations of different methods are discussed. The main aim is to give some guide-lines for the use of model-based methods. Accordingly, the problems encountered in both the residual generation and evaluation stages of fault diagnosis have been outlined. A comparison of quantitative and qualitative methods is also given. Finally, the limitation of model-based methods and the alternative and supplementary use of heuristic knowledge are discussed.
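
The two stages named in the abstract above, residual generation and residual evaluation, can be illustrated with a minimal sketch. It assumes a static linear process model `y = gain * u` and a fixed threshold; both are simplifications chosen for this example and are not taken from the review, which covers dynamic models and more elaborate evaluation schemes.

```python
def generate_residuals(inputs, measurements, gain):
    """Residual generation: measured output minus model-predicted output.

    The model y = gain * u is a hypothetical stand-in for a process model.
    """
    return [y - gain * u for u, y in zip(inputs, measurements)]


def evaluate_residuals(residuals, threshold):
    """Residual evaluation: flag samples whose residual magnitude is large."""
    return [abs(r) > threshold for r in residuals]


# Hypothetical data: the last measurement deviates from the model.
u = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 9.0]
residuals = generate_residuals(u, y, gain=2.0)
alarms = evaluate_residuals(residuals, threshold=0.5)
```

A fixed threshold is the simplest evaluation rule; it trades missed detections against false alarms caused by modeling errors and noise.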