Showing papers on "Fault detection and isolation" published in 1995
••
TL;DR: The approach to failure diagnosis presented in this paper is applicable to systems that fall naturally in the class of DES's; moreover, for the purpose of diagnosis, most continuous variable dynamic systems can be viewed as DES's at a higher level of abstraction.
Abstract: Fault detection and isolation is a crucial and challenging task in the automatic control of large complex systems. We propose a discrete-event system (DES) approach to the problem of failure diagnosis. We introduce two related notions of diagnosability of DES's in the framework of formal languages and compare diagnosability with the related notions of observability and invertibility. We present a systematic procedure for detection and isolation of failure events using diagnosers and provide necessary and sufficient conditions for a language to be diagnosable. The diagnoser performs diagnostics using online observations of the system behavior; it is also used to state and verify off-line the necessary and sufficient conditions for diagnosability. These conditions are stated on the diagnoser or variations thereof. The approach to failure diagnosis presented in this paper is applicable to systems that fall naturally in the class of DES's; moreover, for the purpose of diagnosis, most continuous variable dynamic systems can be viewed as DES's at a higher level of abstraction.
1,599 citations
••
TL;DR: A unified theory of sequential changepoint detection is introduced which leads to a class of sequential detection rules which are not too demanding in computational and memory requirements for on-line implementation and yet are nearly optimal under several performance criteria.
Abstract: After a brief survey of a large variety of sequential detection procedures that are widely scattered in statistical references on quality control and engineering references on fault detection and signal processing, we study some open problems concerning these procedures and introduce a unified theory of sequential changepoint detection. This theory leads to a class of sequential detection rules which are not too demanding in computational and memory requirements for on-line implementation and yet are nearly optimal under several performance criteria.
563 citations
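Sequential changepoint detection of this kind is commonly built on cumulative-sum (CUSUM) statistics. A minimal two-sided CUSUM sketch, where the drift `k` and threshold `h` are illustrative tuning parameters rather than values from the paper:

```python
# Two-sided CUSUM detector for a mean shift in a noisy stream.
def cusum(samples, target=0.0, k=0.5, h=5.0):
    """Return the index of the first alarm, or None if no change is flagged."""
    s_pos = s_neg = 0.0
    for i, x in enumerate(samples):
        s_pos = max(0.0, s_pos + (x - target) - k)  # statistic for upward shifts
        s_neg = max(0.0, s_neg - (x - target) - k)  # statistic for downward shifts
        if s_pos > h or s_neg > h:
            return i
    return None

# A mean shift from 0 to 2 at index 20 trips the alarm a few samples later.
alarm = cusum([0.0] * 20 + [2.0] * 20)
```

Only the two running sums are kept between samples, which illustrates the kind of modest computational and memory requirement the paper targets for on-line implementation.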
••
TL;DR: The methodology and guidelines for the design of flexible software-based fault and error injection are described, and a tool, FERRARI, that incorporates the techniques is presented; experimental results demonstrate the effectiveness of the software-based error injection tool in evaluating the dependability properties of complex systems.
Abstract: A major step toward the development of fault-tolerant computer systems is the validation of the dependability properties of these systems. Fault/error injection has been recognized as a powerful approach to validate the fault tolerance mechanisms of a system and to obtain statistics on parameters such as coverages and latencies. This paper describes the methodology and guidelines for the design of flexible software-based fault and error injection and presents a tool, FERRARI, that incorporates the techniques. The techniques used to emulate transient errors and permanent faults in software are described in detail. Experimental results are presented for several error detection techniques, and they demonstrate the effectiveness of the software-based error injection tool in evaluating the dependability properties of complex systems.
370 citations
••
01 Nov 1995
TL;DR: In this framework, neural network models constitute an important class of on-line approximators and adaptation/learning schemes and a systematic procedure for constructing nonlinear estimation algorithms is developed, and a stable learning scheme is derived using Lyapunov theory.
Abstract: The detection, diagnosis, and accommodation of system failures or degradations are becoming increasingly more important in modern engineering problems. A system failure often causes changes in critical system parameters, or even, changes in the nonlinear dynamics of the system. This paper presents a general framework for constructing automated fault diagnosis and accommodation architectures using on-line approximators and adaptation/learning schemes. In this framework, neural network models constitute an important class of on-line approximators. Changes in the system dynamics are monitored by an on-line approximation model, which is used not only for detecting but also for accommodating failures. A systematic procedure for constructing nonlinear estimation algorithms is developed, and a stable learning scheme is derived using Lyapunov theory. Simulation studies are used to illustrate the results and to gain intuition into the selection of design parameters.
314 citations
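The detection side of such an architecture reduces to monitoring the residual between measured and predicted states. A toy sketch with invented scalar dynamics and an invented threshold, not the paper's algorithm:

```python
# Residual-based detection sketch: a model of the healthy dynamics predicts
# the next state; a residual above a threshold flags a fault.
def nominal_model(x, u):
    return 0.9 * x + u                     # assumed healthy dynamics

def first_fault(xs, us, threshold=0.5):
    """Index of the first step whose residual exceeds the threshold."""
    for t in range(1, len(xs)):
        predicted = nominal_model(xs[t - 1], us[t - 1])
        if abs(xs[t] - predicted) > threshold:
            return t
    return None

# Simulate: healthy for 5 steps, then an additive actuator fault of magnitude 2.
x, xs, us = 0.0, [0.0], [1.0] * 10
for t in range(10):
    x = 0.9 * x + us[t] + (2.0 if t >= 5 else 0.0)
    xs.append(x)
detected = first_fault(xs, us)
```

In the paper's framework the fixed `nominal_model` would be replaced by an on-line approximator (e.g. a neural network) that is also adapted to accommodate the fault after detection.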
••
24 Oct 1995
TL;DR: A tool which supports execution slicing and dicing based on test cases is described and an experiment that uses heuristic techniques in fault localization is reported.
Abstract: Finding a fault in a program is a complex process which involves understanding the program's purpose, structure, semantics, and the relevant characteristics of failure producing tests. We describe a tool which supports execution slicing and dicing based on test cases. We report the results of an experiment that uses heuristic techniques in fault localization.
305 citations
••
23 Apr 1995
TL;DR: An empirical study is reported that uses the block and all-uses criteria as coverage measures to address whether it is the size of T or the coverage of T on P that determines the fault detection effectiveness of T for P.
Abstract: Size and code coverage are important attributes of a set of tests. When a program P is executed on elements of the test set T, we can observe the fault detecting capability of T for P. We can also observe the degree to which T induces code coverage on P according to some coverage criterion. We would like to know whether it is the size of T or the coverage of T on P which determines the fault detection effectiveness of T for P. To address this issue we ask the following question: While keeping coverage constant, what is the effect on fault detection of reducing the size of a test set? We report results from an empirical study using the block and all-uses criteria as the coverage measures.
264 citations
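The "keep coverage constant, shrink the set" question can be made concrete with a greedy coverage-preserving reduction; the suite and its coverage data below are hypothetical:

```python
# Greedy coverage-preserving test-set reduction: drop any test whose covered
# blocks are all covered by the remaining tests.
def reduce_tests(coverage):
    """coverage: test name -> set of covered blocks; result covers the same blocks."""
    kept = dict(coverage)
    for name in sorted(coverage):
        others = set().union(*(kept[n] for n in kept if n != name))
        if kept[name] <= others:   # everything it covers is covered elsewhere
            del kept[name]
    return kept

suite = {"t1": {1, 2}, "t2": {2, 3}, "t3": {1, 2, 3}}
reduced = reduce_tests(suite)
```

The result depends on the order in which tests are considered, so this is one possible minimized suite, not a unique optimum; the study then compares fault detection of the reduced set against the original.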
••
01 Aug 1995
TL;DR: This paper presents a layered fault tolerance framework containing new fault detection and tolerance schemes, divided into servo, interface, and supervisor layers which provide different levels of detection and tolerance capabilities for structurally diverse robots.
Abstract: This paper presents a layered fault tolerance framework containing new fault detection and tolerance schemes. The framework is divided into servo, interface, and supervisor layers. The servo layer is the continuous robot system and its normal controller. The interface layer monitors the servo layer for sensor or motor failures using analytical redundancy based fault detection tests. A newly developed algorithm generates the dynamic thresholds necessary to adapt the detection tests to the modeling inaccuracies present in robotic control. Depending on the initial conditions, the interface layer can provide some sensor fault tolerance automatically without direction from the supervisor. If the interface runs out of alternatives, the discrete event supervisor searches for remaining tolerance options and initiates the appropriate action based on the current robot structure indicated by the fault tree database. The layers form a hierarchy of fault tolerance which provides different levels of detection and tolerance capabilities for structurally diverse robots.
182 citations
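The dynamic-threshold idea can be sketched as a detection bound that widens with the commanded input, so that model error under high excitation is not flagged as a fault; the base and gain values below are placeholders, not the paper's algorithm:

```python
# Adaptive-threshold residual test: the bound grows with command magnitude.
def dynamic_threshold(command, base=0.1, gain=0.05):
    return base + gain * abs(command)

def check(residuals, commands):
    """Return indices where the residual exceeds the adaptive threshold."""
    return [i for i, (r, u) in enumerate(zip(residuals, commands))
            if abs(r) > dynamic_threshold(u)]

# The same residual (0.3) is tolerated under a large command but flagged
# under a small one, where the model should be accurate.
faults = check(residuals=[0.3, 0.3], commands=[10.0, 1.0])
```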
••
TL;DR: In this article, a new approach to power system fault analysis using synchronized sampling is introduced, which can be extremely fast, selective and accurate, providing fault analysis performance that cannot easily be matched by other known techniques.
Abstract: This paper introduces a new approach to power system fault analysis using synchronized sampling. A digital fault recorder with a Global Positioning System (GPS) satellite receiver is the source of data for this approach. Fault analysis functions, such as fault detection, classification and location, are implemented for a power transmission line using synchronized samples from the two ends of a line. This technique can be extremely fast, selective and accurate, providing fault analysis performance that cannot easily be matched by other known techniques.
174 citations
•
08 May 1995
TL;DR: In this paper, a fault detection system was proposed to detect the existence of unwanted electrical paths between the high voltage traction system of an electric car and the chassis of the car, where a comparator compares the sum of the voltages to a setpoint which varies proportionately to the varying voltage of the battery of the traction system.
Abstract: A fault detection system detects the existence of unwanted electrical paths between the high voltage traction system of an electric car and the chassis of the car. The fault detection system includes a positive sampling RC circuit connected to the positive conductor of the traction system and a negative sampling RC circuit connected to the negative conductor of the traction system. Each RC circuit generates a voltage, and the voltages are balanced, i.e., the voltages are equal and opposite, when no leakage path exists. In contrast, when a leakage path to chassis exists, the voltages are not balanced. A comparator compares the sum of the voltages to a setpoint which varies proportionately to the varying voltage of the battery of the traction system, and a fault condition is indicated when the sum of the voltages exceeds the setpoint.
159 citations
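The balance check reduces to comparing the signed sum of the two sampled voltages against a setpoint proportional to the battery voltage; the proportionality constant below is an assumed placeholder, not a value from the patent:

```python
# Leakage-path detection by voltage balance: with no leakage path the two
# sampled voltages are equal and opposite, so their sum stays near zero.
def leakage_fault(v_pos, v_neg, v_battery, k=0.05):
    """True when the sampled voltages no longer cancel within the setpoint."""
    setpoint = k * v_battery          # setpoint tracks the battery voltage
    return abs(v_pos + v_neg) > setpoint

balanced = leakage_fault(v_pos=150.0, v_neg=-150.0, v_battery=300.0)  # no fault
leaky = leakage_fault(v_pos=150.0, v_neg=-120.0, v_battery=300.0)     # fault
```

Scaling the setpoint with the battery voltage is what keeps the test's sensitivity consistent as the traction battery charges and discharges.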
•
01 Jan 1995
TL;DR: A methodical procedure for organization of fault detection experiments for synchronous sequential machines possessing distinguishing sequences (DS) is given, based on the transition checking approach.
Abstract: A methodical procedure for organization of fault detection experiments for synchronous sequential machines possessing distinguishing sequences (DS) is given. The organization is based on the transition checking approach. The checking experiment is considered in three concatenative parts: 1) the initial sequence which brings the machine under test into a specific state, 2) the α-sequence to recognize all the states and to establish the information about the next states under the input DS, and 3) the β-sequence to check all the individual transitions in the state table.
157 citations
•
04 Dec 1995
TL;DR: In this paper, the authors present an efficient method for implementing, in software, a safe virtual machine that embodies a general purpose memory protection model; it runs on any general purpose computer architecture and will run an executable that has been developed for the virtual machine.
Abstract: An efficient method for implementing a safe virtual machine, in software, that embodies a general purpose memory protection model. The present invention runs on any general purpose computer architecture and will run an executable that has been developed for the virtual machine. The present invention compiles the executable into the native instructions of the hardware. During the compilation, specialized code sequences are added to the code using a technique called software fault isolation. A set of allowed behaviors and a set of responses to the undesirable actions will be created and written to memory. A series of optimizations are applied so that the translated code executes at nearly the native speed of the architecture, but the fault isolation sequences prevent it from engaging in undesirable actions. In particular, the memory protection model is enforced, providing the same level of protection that customarily requires hardware support to enforce efficiently.
••
25 Sep 1995
TL;DR: The results suggest that inexperienced subjects can apply a formal verification technique (code reading) as effectively as an execution-based validation technique, but they are most efficient when using functional testing.
Abstract: We replicated a controlled experiment first run in the early 1980's to evaluate the effectiveness and efficiency of 50 student subjects who used three defect-detection techniques to observe failures and isolate faults in small C programs. The three techniques were code reading by stepwise abstraction, functional (black-box) testing, and structural (white-box) testing. Two internal replications showed that our relatively inexperienced subjects were similarly effective at observing failures and isolating faults with all three techniques. However, our subjects were most efficient at both tasks when they used functional testing. Some significant differences among the techniques in their effectiveness at isolating faults of different types were seen. These results suggest that inexperienced subjects can apply a formal verification technique (code reading) as effectively as an execution-based validation technique, but they are most efficient when using functional testing.
••
TL;DR: Modeling, based on the data, shows that, in addition to reducing the number of software faults, software dependability can be enhanced by reducing the recurrence rate.
Abstract: Based on extensive field failure data for Tandem's GUARDIAN operating system, the paper discusses evaluation of the dependability of operational software. Software faults considered are major defects that result in processor failures and invoke backup processes to take over. The paper categorizes the underlying causes of software failures and evaluates the effectiveness of the process pair technique in tolerating software faults. A model to describe the impact of software faults on the reliability of an overall system is proposed. The model is used to evaluate the significance of key factors that determine software dependability and to identify areas for improvement. An analysis of the data shows that about 77% of processor failures that are initially considered due to software are confirmed as software problems. The analysis shows that the use of process pairs to provide checkpointing and restart (originally intended for tolerating hardware faults) allows the system to tolerate about 75% of reported software faults that result in processor failures. The loose coupling between processors, which results in the backup execution (the processor state and the sequence of events) being different from the original execution, is a major reason for the measured software fault tolerance. Over two-thirds (72%) of measured software failures are recurrences of previously reported faults. Modeling, based on the data, shows that, in addition to reducing the number of software faults, software dependability can be enhanced by reducing the recurrence rate.
••
TL;DR: The authors introduce the methodology behind a novel hybrid neural/fuzzy system which merges the neural network and fuzzy logic technologies to solve fault detection problems.
Abstract: The use of electric motors in industry is extensive. These motors are exposed to a wide variety of environments and conditions which age the motor and make it subject to incipient faults. These incipient faults, if left undetected, contribute to the degradation and eventual failure of the motors. Artificial neural networks have been proposed and have demonstrated the capability of solving the motor monitoring and fault detection problem using an inexpensive, reliable, and noninvasive procedure. However, the major drawback of conventional artificial neural network fault detection is the inherent black box approach that can provide the correct solution but does not provide a heuristic interpretation of the solution. Engineers prefer accurate fault detection as well as the heuristic knowledge behind the fault detection process. Fuzzy logic is a technology that readily provides heuristic reasoning but has difficulty producing exact solutions. The authors introduce the methodology behind a novel hybrid neural/fuzzy system which merges the neural network and fuzzy logic technologies to solve fault detection problems. They also discuss a training procedure for this neural/fuzzy fault detection system. This procedure is used to determine the correct solutions while providing qualitative, heuristic knowledge about the solutions.
••
TL;DR: In this paper, the authors give the state of the art in robust fault diagnosis, based principally on residual generation, and draw up some of the key challenges and potential future directions for the research.
••
15 Sep 1995
TL;DR: An approximate model is proposed, incorporating the main inertial effects contributing to integrity, that can be used to calculate the achievable horizontal protection level (HPL) at any geographical location and time and is used to estimate the availability of fault detection for an integrated GPS/inertial system.
Abstract: GPS systems possess a high availability of accurate horizontal positioning, but detection and exclusion techniques based on satellite redundancy do not provide the integrity availability sufficient for primary means navigation in the terminal and non-precision phases of flight. To achieve primary means integrity availability, different types of augmentations such as WAAS, Loran, baro aiding, atomic clocks and inertial aiding are under consideration. Different techniques have been proposed to combine GPS and inertial sensor information, and recently published findings seem to indicate that primary means integrity availability is achievable with standard 2 nmi/h inertial sensor performance. This paper investigates and quantifies the different inertial effects that contribute to enhanced integrity of the integrated GPS/inertial system, such as coasting, Schuler feedback, etc. A Kalman filter based integration scheme that preserves the integrity information in an optimal fashion is presented and used to quantify integrity performance. The availability of fault detection and exclusion depends on the geometry of the satellites used for positioning. The availability of satellites in good geometry is in turn a function of the status of the GPS constellation (failures, maintenance, etc.). Methods for availability calculations have been developed and adopted by the RTCA SC-159. The availability of fault detection and exclusion over a specified region can be calculated for most augmentations, but corresponding estimates (conforming with RTCA SC-159 guidelines) of the FDE availability provided by GPS/inertial integration techniques have not yet been published. This paper proposes an approximate model, incorporating the main inertial effects contributing to integrity, that can be used to calculate the achievable integrity (horizontal integrity limit) at any geographical location and time. This model is used to estimate the availability of fault detection (FD) for an integrated GPS/inertial system. The availability of fault detection for the GPS/inertial system is compared to the availability of FD for other augmentations to provide trade-off information.
••
TL;DR: The parity relation approach is compared with the traditional detection filter design, and is shown to be more straightforward and have milder existence conditions; if subjected to the same specification, the two approaches yield identical residual generators.
••
21 Jun 1995
TL;DR: The described methodology was verified by experiments with several technical processes like electric motors, actuators, pumps, machine tools, robots, heat exchangers, combustion engines and vehicles.
Abstract: For the fault detection of technical processes, different methods can be applied based on the information extracted from directly measured signals, from signal models and from process models. Examples of signal model based fault detection methods are spectral analysis or parameter estimation of ARMA models; examples of process model based methods are parameter estimation, state estimation or parity equation approaches. A comparison of these methods shows that they have different properties with regard to the detection of faults in the process, the actuators and the sensors. By properly integrating different fault detection methods, their respective advantages can be exploited to generate a number of different analytical symptoms. For fault diagnosis a knowledge based procedure is required, because qualitative information in the form of heuristic symptoms also has to be taken into account. Based on heuristic process knowledge such as fault-symptom causalities and a unified representation of all symptoms, an integrated fault diagnosis can be performed. This comprises the treatment of the symptoms as uncertain facts and approximate diagnostic reasoning via if-then rules, either in a probabilistic or a fuzzy-logic (possibilistic) frame. The described methodology was verified by experiments with several technical processes such as electric motors, actuators, pumps, machine tools, robots, heat exchangers, combustion engines and vehicles.
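The approximate reasoning step can be illustrated with a toy fuzzy rule base: symptoms are uncertain facts in [0, 1] and conjunctive if-then rules are evaluated with min. The symptoms and rules below are invented for the example, not taken from the paper:

```python
# Fuzzy diagnostic reasoning sketch: each rule ANDs its symptom conditions
# (min operator) to yield a belief in the corresponding fault.
def diagnose(symptoms, rules):
    """rules: fault -> list of symptom names treated as a conjunction."""
    return {fault: min(symptoms.get(s, 0.0) for s in conditions)
            for fault, conditions in rules.items()}

symptoms = {"high_vibration": 0.8, "temp_rise": 0.6, "current_ripple": 0.2}
rules = {"bearing_wear": ["high_vibration", "temp_rise"],
         "rotor_fault": ["current_ripple", "high_vibration"]}
beliefs = diagnose(symptoms, rules)
```

A probabilistic frame would replace the min/max operators with Bayesian combination; the structure of the fault-symptom causalities stays the same.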
••
24 Apr 1995
TL;DR: In this paper, the most relevant techniques, from classical monitoring to more recent model-based methods, are reviewed for improving reliability and avoiding unplanned maintenance in rail vehicle traction and braking systems.
Abstract: A number of techniques are available for improving reliability and avoiding unplanned maintenance in rail vehicle traction and braking systems. In this paper, the most relevant techniques, from classical monitoring to the more recent model-based methods, are reviewed. Although available for some years, many of the classical methods are often not used because of the added instrumentation cost. With increasingly tighter slip and slide control requirements, more sophisticated control strategies, many of which make use of state variable observers, are being used. As a result, the use of modern fault detection algorithms is enabled at little extra cost. Possibilities for observer design for fault tolerant control in the face of sensor failures are also explored.
•
01 Jun 1995
TL;DR: In this paper, a fault detection circuit for detecting leakage currents between a DC power source and chassis of an automobile, including a voltage sensor coupled to the power source, the voltage sensor including an analog reference and a chassis ground, was proposed.
Abstract: A fault detection circuit for detecting leakage currents between a DC power source and the chassis of an automobile includes a voltage sensor coupled to the DC power source, the voltage sensor including an analog reference and a chassis ground. A differential amplifier is coupled to the voltage sensor and detects variations in the analog reference and the chassis ground. A voltage comparator unit determines whether the variations detected in the differential amplifier are above a predetermined threshold value. A built-in test circuit tests whether the fault detection circuit is operating correctly.
••
TL;DR: Two active compaction methods based on essential faults, forced pair-merging and essential fault pruning, are developed to reduce a given test set; the latter achieves further compaction from removal of a pattern by modifying other patterns of the test set to detect the essential faults of the target pattern.
Abstract: Test set compaction for combinational circuits is studied in this paper. Two active compaction methods based on essential faults are developed to reduce a given test set. The special feature is that the given test set will be adaptively renewed to increase the chance of compaction. In the first method, forced pair-merging, pairs of patterns are merged by modifying their incompatible specified bits without sacrificing the original fault coverage. The other method, essential fault pruning, achieves further compaction from removal of a pattern by modifying other patterns of the test set to detect the essential faults of the target pattern. With these two developed methods, the compacted test size on the ISCAS'85 benchmark circuits is smaller than that of COMPACTEST by more than 20%, and 12% smaller than that by ROTCO+COMPACTEST.
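The basic compaction step, merging two patterns that never specify conflicting bits ('x' marks a don't-care), can be sketched as follows; this shows pattern compatibility only, not the paper's forced pair-merging heuristic, which additionally modifies incompatible bits:

```python
# Merge two test patterns over {'0', '1', 'x'}: compatible patterns combine
# into one pattern that detects the faults of both.
def merge(p, q):
    """Return the merged pattern, or None if p and q conflict in some bit."""
    out = []
    for a, b in zip(p, q):
        if a == 'x':
            out.append(b)            # p doesn't care: take q's bit
        elif b == 'x' or a == b:
            out.append(a)            # q doesn't care, or both agree
        else:
            return None              # specified bits conflict
    return ''.join(out)

combined = merge("1x0x", "1x01")     # compatible patterns merge
clash = merge("1x0x", "0x01")        # bit 0 conflicts: no merge
```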
•
24 Jul 1995
TL;DR: In this article, the authors present an approach for detecting and correcting various fault conditions in an operating Coriolis effect mass flowmeter, including detecting the presence of a crack in the flow tubes and stopping the flow of material to prevent release of the material through a cracked flow tube.
Abstract: Apparatus and methods for detecting and correcting various fault conditions in an operating Coriolis effect mass flowmeter. The apparatus of the present invention receives information from an operating Coriolis mass flowmeter and compares the information to threshold signatures representing various fault conditions. When a fault condition is detected, output signals are applied to inform an operator and to control the mass flow rate through the flowmeter to correct the fault condition. Specifically, the methods of the present invention detect the presence of a crack in the flow tubes and stop the flow of material to prevent release of the material through a cracked flow tube. Other methods of the present invention detect the void fraction of material flowing through the flow tubes, compute a corrected actual mass flow rate, and control the mass flow rate through the flowmeter to compensate for the effects of the void fraction. Signature information relating to threshold values for measured frequency, drive power, temperature and mass flow of the operating flowmeter, as well as the slope and curvature of changes in each measured operating parameter, is stored in memory within the fault detection apparatus of the present invention.
••
TL;DR: In this article, simple randomized algorithms for fault detection of finite state machines are presented, though they do not consider the fault detection problem for partially specified finite state machine (PSM) specifications.
••
30 Apr 1995
TL;DR: A new parametric bridging fault model is proposed that realistically represents the faulty behavior as a function of the intrinsic resistance, which is not known a priori.
Abstract: From circuit measurements, it has been demonstrated that actual bridging faults have an intrinsic resistance mainly in the range from 0 Ω to 500 Ω. This paper first analyses the consequences of this resistance on the electrical and logic behavior of bridging faults. Second, it is demonstrated that the classical models, such as the voting model, which consider the resistance as negligible do not accurately and realistically represent the behavior of the fault. Third, a new parametric bridging fault model is proposed that realistically represents the faulty behavior as a function of the intrinsic resistance, which is not known a priori. Finally, a parametric bridging fault simulation algorithm is described, together with a redefinition of the classical concepts of fault detection and fault coverage.
•
24 Aug 1995
TL;DR: In this article, a fault detection element reads the resultant signal from the bus and compares it with at least portions of the corresponding signals originally generated by the processing sections themselves; if there is a discrepancy, the fault detector signals a fault, e.g., causing the unit to be taken off-line.
Abstract: A digital data processing device includes a bus for transmitting signals (e.g., data and/or address information) between plural functional units (e.g., a central processing unit and a peripheral controller). A first such unit includes first and second processing sections that concurrently apply to the bus complementary portions of like information signals (e.g., longwords containing data). A fault detection element reads the resultant signal from the bus and compares it with at least portions of the corresponding signals originally generated by the processing sections themselves. If there is a discrepancy, the fault detector signals a fault, e.g., causing the unit to be taken off-line. By use of a redundant unit, processing can continue for fault-tolerant operation.
••
20 Sep 1995
TL;DR: FTAPE (Fault Tolerance And Performance Evaluator) is a tool that can be used to compare fault-tolerant computers; the errors/fault ratio, performance degradation, and number of system crashes are presented as measures of fault tolerance.
Abstract: This paper describes FTAPE (Fault Tolerance And Performance Evaluator), a tool that can be used to compare fault-tolerant computers. The major parts of the tool include a system-wide fault injector, a workload generator, and a workload activity measurement tool. The workload creates high stress conditions on the machine. Using stress-based injection, the fault injector is able to utilize knowledge of the workload activity to ensure a high level of fault propagation. The errors/fault ratio, performance degradation, and number of system crashes are presented as measures of fault tolerance.
••
02 Apr 1995
TL;DR: The authors introduce a new model for faults and alarms based on probabilistic finite state machines and propose two algorithms that correlate alarms in the presence of multiple faults and noisy information.
Abstract: In communication networks, a large number of alarms exist to signal any abnormal behavior of the network. As network faults typically result in a number of alarms, correlating these different alarms and identifying their source is a major problem in fault management. The alarm correlation problem is of major practical significance. Alarms that have not been correlated may not only lead to significant misdirected efforts, based on insufficient information, but may cause multiple corrective actions (possibly contradictory) as each alert is handled independently. The paper proposes a general framework to solve the alarm correlation problem. The authors introduce a new model for faults and alarms based on probabilistic finite state machines. They propose two algorithms. The first one acquires the fault models starting from possibly incomplete and incorrect data. The second one correlates alarms in the presence of multiple faults and noisy information. Both algorithms have polynomial time complexity, use an extension of the Viterbi algorithm to deal with the corrupted data, and can be implemented in hardware. As an example, they are applied to analyse faults using data generated by the ANS (Advanced Network and Services, Inc.)/NSF T3 network.
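The correlation step can be illustrated with a plain Viterbi decode over a tiny probabilistic state machine; the two-state model and its probabilities are invented for the sketch, and the paper's extension for corrupted data is omitted:

```python
# Viterbi decode: recover the most likely hidden fault-state sequence
# from an observed alarm stream.
def viterbi(obs, states, start, trans, emit):
    """Return the most likely state path for the observation sequence."""
    paths = {s: (start[s] * emit[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        new = {}
        for s in states:
            prob, best = max((paths[p][0] * trans[p][s] * emit[s][o], paths[p][1])
                             for p in states)
            new[s] = (prob, best + [s])
        paths = new
    return max(paths.values())[1]

states = ["ok", "link_down"]
start = {"ok": 0.9, "link_down": 0.1}
trans = {"ok": {"ok": 0.9, "link_down": 0.1},
         "link_down": {"ok": 0.2, "link_down": 0.8}}
emit = {"ok": {"quiet": 0.8, "alarm": 0.2},
        "link_down": {"quiet": 0.1, "alarm": 0.9}}
path = viterbi(["quiet", "alarm", "alarm"], states, start, trans, emit)
```

A single alarm might still be noise, but two in a row make the fault state the most likely explanation, which is exactly the kind of evidence accumulation alarm correlation relies on.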
••
05 Dec 1995
TL;DR: A scheme is presented to guarantee that the execution of real-time tasks can tolerate transient and intermittent faults under any queue-based scheduling technique, along with a dynamic programming optimal solution and a greedy heuristic which closely approximates the optimal.
Abstract: We present a scheme to guarantee that the execution of real-time tasks can tolerate transient and intermittent faults assuming any queue-based scheduling technique. The scheme is based on reserving sufficient slack in a schedule such that a task can be re-executed before its deadline without compromising guarantees given to other tasks. Only enough slack is reserved in the schedule to guarantee fault tolerance if at most one fault occurs within a time interval. This results in increased schedulability and a very low percentage of deadline misses even if no restriction is placed on the fault separation. We provide two algorithms to solve the problem of adding fault tolerance to a queue of real-time tasks. The first is a dynamic programming optimal solution and the second is a greedy heuristic which closely approximates the optimal.
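The underlying feasibility test, whether re-executing one task still meets every deadline at or after it, can be sketched as follows. The task set is invented and the single-fault-per-interval assumption follows the abstract; this is not the paper's dynamic programming or greedy algorithm itself:

```python
# For each task in a sequentially executed queue, check whether one
# re-execution of that task would still let every task meet its deadline.
def tolerable(tasks):
    """tasks: list of (exec_time, deadline) in schedule order.
    Return indices of tasks whose single re-execution can be guaranteed."""
    guaranteed = []
    for i in range(len(tasks)):
        finish, ok = 0.0, True
        for j, (c, d) in enumerate(tasks):
            finish += c + (c if j == i else 0.0)  # re-execute task i once
            if finish > d:
                ok = False
                break
        if ok:
            guaranteed.append(i)
    return guaranteed

# Re-running the first task leaves enough slack; re-running the second
# would push it past its own deadline.
safe = tolerable([(2.0, 6.0), (3.0, 7.0)])
```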
••
TL;DR: In this article, a nonlinear parity equation residual generation scheme is presented that uses forward and inverse dynamic models of nonlinear systems to diagnose sensor and actuator faults in an internal combustion engine during execution of the United States Environmental Protection Agency Inspection and Maintenance 240 driving cycle.
••
TL;DR: In this article, a review of model-based fault diagnosis techniques is provided; starting from basic principles, the properties and limitations of different methods are discussed.
Abstract: This paper provides a review of model-based fault diagnosis techniques. Starting from basic principles, the properties and limitations of different methods are discussed. The main aim is to give some guidelines for the use of model-based methods. Accordingly, the problems encountered in both the residual generation and evaluation stages of fault diagnosis have been outlined. A comparison of quantitative and qualitative methods is also given. Finally, the limitations of model-based methods and the alternative and supplementary use of heuristic knowledge are discussed.
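Residual generation, the first of the two stages discussed, can be illustrated with a scalar Luenberger-style observer; all dynamics, gains and thresholds below are invented for the sketch:

```python
# Observer-based residual generation in miniature: the observer tracks a
# scalar plant, and the residual y - y_hat stays near zero until a fault.
def residuals(y_meas, u, a=0.8, b=1.0, L=0.5):
    """Return the output residual sequence for measurements y_meas, inputs u."""
    x_hat, out = 0.0, []
    for k in range(len(u)):
        r = y_meas[k] - x_hat                  # residual (output equals state here)
        out.append(r)
        x_hat = a * x_hat + b * u[k] + L * r   # observer update with feedback gain L
    return out

# Simulate: healthy plant for 5 steps, then a +1 sensor bias appears.
x, y, u = 0.0, [], [1.0] * 10
for k in range(10):
    y.append(x + (1.0 if k >= 5 else 0.0))
    x = 0.8 * x + u[k]
alarms = [k for k, r in enumerate(residuals(y, u)) if abs(r) > 0.5]
```

The residual evaluation stage then decides how to threshold such signals robustly, which is where most of the review's cited difficulties (model uncertainty, disturbances) arise.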