
System fault diagnostics using fault tree analysis

Emma E. Hurdle, +2 more
- Vol. 221, Iss: 1, pp 43-55


This item was submitted to Loughborough’s Institutional Repository
(https://dspace.lboro.ac.uk/) by the author and is made available under the
following Creative Commons Licence conditions.
For the full text of this licence, please go to:
http://creativecommons.org/licenses/by-nc-nd/2.5/

System fault diagnostics using fault tree analysis
E E Hurdle, L M Bartlett, and J D Andrews*
Department of Aeronautical and Automotive Engineering, Loughborough University, Loughborough, UK
The manuscript was received on 15 June 2005 and was accepted after revision for publication on 17 May 2006.
DOI: 10.1243/1748006XJRR6
Abstract: Over the last 50 years, advances in technology have led to an increase in the
complexity and sophistication of systems. More complex systems can be harder to maintain
and the root cause of a fault more difficult to isolate.
Downtime resulting from a system failure can be dangerous or expensive, depending on the
type of system. In aircraft systems the ability to diagnose quickly the causes of a fault can
have a significant impact on the time taken to rectify the problem and to return the aircraft
to service. In chemical process plants the need to diagnose causes of a safety-critical failure in
a system can be vital and a diagnosis may be required within minutes. Speed of fault
isolation can save time, reduce costs, and increase company productivity and therefore
profits. System fault diagnosis is the process of identifying the cause of a malfunction by
observing its effect at various test points.
Fault tree analysis (FTA) is a method that describes all possible causes of a specified system
state in terms of the state of the components within the system. A system model is used to
identify the states that the system should be in at any point in time. This paper presents a
method for diagnosing faults in systems using FTA to explain the deviations from normal
operation observed in sensor outputs. The causes of a system’s failure modes will be
described in terms of the component states. This will be achieved with the use of coherent
and non-coherent fault trees. A coherent fault tree is constructed from AND and OR logic
and therefore considers only component-failed states. The non-coherent method expands
this, allowing the use of NOT logic, which implies that the existence of component-failed
states and component-working states are both taken into account. This paper illustrates the
concepts of this method by applying the technique to a simplified water tank level control
system.
Keywords: fault diagnosis, fault tree analysis
1 INTRODUCTION
A system can be analysed for faults in two different ways. The first is to test the
system functionality at one point in time. The second continuously monitors the system
and detects faults as they occur.
An example of an approach that carries out a series of tests to determine the system
status at one point in time is the sequential fault diagnostic tool developed by Novak
and co-workers [1–4]. The approach uses information about which symptoms are exhibited
when the faults are present. The sequential fault diagnostic tool determines the best
sequence in which to conduct the tests to locate the fault condition in the cheapest (or
quickest) way. A similar method by Pattipati and Alexandridis [5] uses heuristic
algorithms in order to determine the most cost-effective sequence of tests. These
methods are limited to situations where only a single fault is expected to exist at any
point in time; they do not take into consideration multiple component failures. Shakeri
et al. [6] extended sequential test sequencing to diagnose multiple failures in systems.
However, this method takes a considerable length of time to obtain a diagnosis.
* Corresponding author: Department of Aeronautical and Automotive Engineering,
University of Loughborough, Loughborough, Leicestershire LE11 3TU, UK. email:
J.D.Andrews@lboro.ac.uk
JRR6 © IMechE 2007 Proc. IMechE Vol. 221 Part O: J. Risk and Reliability

Another approach to diagnosis is the use of graphical models to describe the
propagation of faults in systems. Rao [7] developed two algorithms that use the
information from directed graphs of systems to diagnose single failures at one point in
time. This technique was developed further by Pattipati [8] to diagnose multiple
failures in systems. This graphical method, however, does not take into consideration
faults that do not immediately affect the status of the system when they occur.
Failure modes and effects analysis (FMEA) is a structured qualitative analysis of a
system, subsystem, or function that can be used to identify potential system failure
modes, their causes, and the effects on the system operation associated with the
occurrence of those failure modes. Price [9] demonstrated the use of automated FMEA to
generate reports that could be used in a diagnostic tool to diagnose multiple faults in
systems at one point in time. The failures from the FMEA are only generated to a chosen
likelihood of occurrence; therefore not all possible failure outcomes for a system
scenario may be obtained. Paasch and Mocko [10] use FMEA and fault tree analysis (FTA)
to develop a model that uses matrix manipulation for diagnosing faults in systems at
one point in time. This research does not, however, consider multiple failures.
Papadopoulos [11] has carried out work using state charts and fault trees to provide
continuous on-line monitoring and rectification of systems. NOT logic is excluded from
the fault trees; therefore only component failures are taken into account to obtain a
diagnosis. As a result, some faults occurring simultaneously have required conflicting
remedial procedures. Yangping et al. [12] also developed a fault-tree-based method that
considers only component failures, which uses genetic algorithms to monitor
continuously for faults in nuclear power plants. Genetic search is slow in obtaining
solutions and there can be problems determining when a global rather than a local
diagnosis has been obtained.
Many system failures are not the result of a single fault. Therefore the ability to
diagnose multiple faults is vitally important. Methods of finding faults or
combinations of faults as they occur are the subject of this paper. The approach is
based on the fault tree method [13–15]. This is traditionally used to quantify the
likelihood of a system failure. In this application the logic diagram is used to
develop causes of a system symptom exhibited by a sensor reading, in terms of component
conditions. In order to illustrate the features of the method described in this paper
it will be applied to a simplified water tank level control system.
2 THE WATER TANK SYSTEM
The water tank system is illustrated in Fig. 1. It aims to maintain the level of water
between two predetermined limits. In normal operational mode, water flows out of the
system through valve V2. The level control system determines when water is flowing from
the tank and then refills it by opening valve V1 until the desired tank level is
reached. The overspill tray located beneath the tank catches water if the tank has
ruptured, if water has leaked out through a crack or hole, or if the water level
overflows from the top.
Fig. 1 Water tank system (schematic; labels: main supply, inlet valve, outlet valve, safety valve)
2.1 System component description
The water tank system shown in Fig. 1 consists of three valves labelled V1, V2, and V3,
two level sensors represented by S1 and S2, two controllers C1 and C2, and an overspill
tray labelled TRAY. There are six sections of pipes identified by the labels P1 to P6.
V1 is an air-to-open (A/O) inlet valve controlled by C1. The level sensor S1 detects the
height of the water in the tank. In normal operating mode, if the water in the tank
falls below the required level (as indicated by the sensor S1), the controller C1 would
open valve V1, allowing water into the tank. Conversely, if the water in the tank rises
to the required level, then C1 will close V1.
V2 is a manual (MAN) valve, controlled by an
operator in response to demand. Finally valve V3 is
an air-to-close (A/C) valve that operates as a safety
valve controlled by C2. Normally this will only
become operational when a component failure
occurs which causes a very high level of water in
the tank. A signal from S2 would cause the controller
C2 to open valve V3 to reduce the level of water in
the tank.
The overspill tray, located underneath the tank, collects any spillages that may occur
owing to a failure in the system. So, water in the overspill tray will occur if the
tank has ruptured, if the tank is leaking, or if the water level overflows from the top
of the tank.
2.2 System operating modes
2.2.1 Sensor locations
The status of the system is determined using measurements provided by flow sensors
situated next to each of the three valves in the system. The sensors are denoted by
VF1, VF2, and VF3 for locations at V1, V2, and V3 respectively. These each detect the
presence or absence of flow of water, which can be denoted as flow F or no flow NF
respectively. For this demonstration study it is assumed the sensors are perfectly
reliable. A fourth sensor denoted by SP1 is located in the overspill tray to indicate
whether any water has escaped from the tank. Its reading is interpreted as water W or
no water NW. The sensor locations described above are called the system observation
points. The observation points for V1, V2, V3, and TRAY generate 16 different scenarios
that the system could potentially produce. These are listed in Table 1.
Were this a real system, additional sensors could be added in order to provide a more
complete picture of its operating state. The level control sensors S1 and S2 could also
be used. However, for the purposes of demonstrating the fault-tree-based fault
diagnostics technique the system sensors as described are sufficient, without any
additional complexity.
The system has two operating modes: ACTIVE, when the operator opens valve V2, and
DORMANT, when V2 is closed. In the ACTIVE operating mode, water is taken out of the
system through valve V2 and the tank is refilled by water coming in through valve V1
from the main water supply. Water would not exit the system through valve V3 and there
would be no water in the overspill tray. The sensor readings for the system when ACTIVE
should be those given in scenario 4 in Table 1. In the DORMANT operating mode the
system is effectively on standby with all three valves remaining closed and the
overspill tray empty. This should result in the sensor readings given in scenario 16 in
Table 1.
The sensor readings given by scenarios 4 and 16
are those which, under steady state conditions,
represent the model of how the system should
behave when ACTIVE and DORMANT respectively.
Given that the system is in the ACTIVE or DORMANT
state, any sensor readings that deviate from those
expected are regarded as being indicative of some
fault within the system.
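The comparison of observed readings against the expected scenario can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the names `SCENARIOS`, `EXPECTED`, and `deviations` are assumptions introduced here, and the scenario numbering is taken from Table 1.

```python
from itertools import product

# Observation points in the column order of Table 1.
SENSORS = ("V1", "V2", "V3", "TRAY")

# Enumerate every combination of readings. Listing F before NF and
# W before NW reproduces the scenario numbering used in Table 1.
SCENARIOS = {
    number: dict(zip(SENSORS, readings))
    for number, readings in enumerate(
        product(("F", "NF"), ("F", "NF"), ("F", "NF"), ("W", "NW")), start=1
    )
}

# Expected steady-state scenario for each operating mode (section 2.2.1).
EXPECTED = {"ACTIVE": 4, "DORMANT": 16}


def deviations(mode, observed):
    """Return {sensor: (expected, observed)} for every sensor that deviates."""
    expected = SCENARIOS[EXPECTED[mode]]
    return {
        sensor: (expected[sensor], observed[sensor])
        for sensor in SENSORS
        if observed[sensor] != expected[sensor]
    }
```

For example, `deviations("DORMANT", SCENARIOS[7])` reports that V1 shows flow and the tray shows water when neither is expected; an empty result means the readings match the model for that mode.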
2.2.2 Possible component failures
In order to apply FTA to a system the faults that
could occur for each of the system components
need to be defined. Table 2 contains a list of possible
component failures and their code.
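As a minimal sketch, the failure codes of Table 2 can be generated programmatically from the component counts given in section 2.1 (six pipes, three valves, two sensors, two controllers). The function name `component_failure_codes` and the data layout are assumptions made for illustration only.

```python
def component_failure_codes():
    """Return {code: description} for the component failures of Table 2."""
    families = [
        # (code prefix, component name, number of components, failure modes)
        ("P", "Pipe", 6, [("B", "is blocked"), ("F", "is fractured")]),
        ("V", "Valve", 3, [("FC", "fails closed"), ("FO", "fails open")]),
        ("S", "Sensor", 2, [("FH", "fails high"), ("FL", "fails low")]),
        ("C", "Controller", 2, [("FH", "fails high"), ("FL", "fails low")]),
    ]
    codes = {}
    for prefix, name, count, failures in families:
        for suffix, text in failures:
            for i in range(1, count + 1):
                codes[f"{prefix}{i}{suffix}"] = f"{name} {prefix}{i} {text}"
    # Failures that are not indexed by component number.
    codes["TR"] = "Water tank ruptured"
    codes["TL"] = "Water tank leaks"
    codes["NWMS"] = "No water from main supply"
    return codes
```

Under these assumptions the table expands to 29 distinct basic events, for instance `V2FO` for "Valve V2 fails open".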
The two operating modes are also represented in the fault trees. ACTIVE signifies that
the operator has attempted to open valve V2. DORMANT is used to indicate that the
operator has tried to close V2. It should be noted that this is a two-mode system and so
only one of the variables ACTIVE or DORMANT can be true at any time.
Table 1 System scenarios
Scenario  V1  V2  V3  TRAY
1         F   F   F   W
2         F   F   F   NW
3         F   F   NF  W
4         F   F   NF  NW
5         F   NF  F   W
6         F   NF  F   NW
7         F   NF  NF  W
8         F   NF  NF  NW
9         NF  F   F   W
10        NF  F   F   NW
11        NF  F   NF  W
12        NF  F   NF  NW
13        NF  NF  F   W
14        NF  NF  F   NW
15        NF  NF  NF  W
16        NF  NF  NF  NW
3 SENSOR DEVIATION MODELS
In the application of FTA to system fault diagnostics a series of failure logic
diagrams is produced, representing the causes of any sensor readings. These are
developed in a fault tree in terms of the component failure conditions of the system
operating state. The list of all possible sensor readings for the system is shown in
Table 3.
Two other possible sensor readings for the system
are ‘no flow through valve V3’ and ‘no water in the
overspill tray’. However, these sensor readings
would occur under normal operational conditions
(i.e. without any failures occurring) in both the
ACTIVE and the DORMANT operating states.
3.1 Fault tree construction
Fault trees were generated for each sensor reading listed in Table 3 using both
coherent and non-coherent methods. Coherent fault trees are constructed from AND and OR
logic and feature only component failure events in the causes of sensor status. The
non-coherent method expands this, allowing the use of the NOT operator. This implies
that both component-failed and component-working states are taken into account when
developing the causes of the sensor status.
3.1.1 Coherent fault tree for the sensor reading flow through valve V2
The coherent fault tree for the sensor reading ‘flow
through valve V2’ is presented in Fig. 2. Referring to
Fig. 1, ‘flow through valve V2’ can only occur
because V2 is open. There are two possibilities for
having V2 open, these being that either the valve
has failed open, which is a basic event, or that
the system is in a flow phase and is therefore in the
ACTIVE operating mode, represented by a house
event. The two prospective outcomes terminate
the logic development; in this instance the fault
tree is small.
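The logic of this small tree can be sketched with a generic gate evaluator. The classes `Event` and `Gate` and the function `evaluate` are hypothetical names introduced for illustration; they are not the authors' software. The NOT gate is included only because section 3.1 notes that non-coherent trees use it, and the coherent tree below needs just a single OR gate.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple, Union


@dataclass(frozen=True)
class Event:
    """A basic event (component failure) or a house event (operating mode)."""
    name: str


@dataclass(frozen=True)
class Gate:
    """A logic gate; kind is 'AND', 'OR', or 'NOT' (NOT takes one input)."""
    kind: str
    inputs: Tuple["Node", ...]


Node = Union[Event, Gate]


def evaluate(node: Node, state: Mapping[str, bool]) -> bool:
    """Evaluate a tree; events missing from state are treated as False."""
    if isinstance(node, Event):
        return state.get(node.name, False)
    values = [evaluate(child, state) for child in node.inputs]
    if node.kind == "AND":
        return all(values)
    if node.kind == "OR":
        return any(values)
    if node.kind == "NOT":
        (value,) = values  # a NOT gate must have exactly one input
        return not value
    raise ValueError(f"unknown gate kind: {node.kind}")


# Coherent tree for 'flow through valve V2':
# V2 has failed open (basic event) OR the system is ACTIVE (house event).
FTV2 = Gate("OR", (Event("V2FO"), Event("ACTIVE")))
```

With this sketch, `evaluate(FTV2, {"ACTIVE": True})` holds, as does `evaluate(FTV2, {"DORMANT": True, "V2FO": True})`, whereas a DORMANT system with a healthy V2 does not explain the reading.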
3.1.2 Non-coherent fault tree for the sensor reading flow through valve V2
The non-coherent fault tree for the sensor reading
‘flow through valve V2’ is shown in Figs 3(a) to (c).
The information described by the coherent fault
tree in section 3.1.1 can now be expanded to include
everything in the system that is known not to have
failed. Figures 3(a) to (c) show that the introduction
of NOT logic significantly increases the amount of
information known about the system behaviour.
The sensor reading ‘flow through valve V2’ occurs
because V2 is open and also because there is water
available. As in the coherent fault tree, V2 is open
either because it has failed open or because it is in
a flow phase (therefore it is in the ACTIVE operating
mode). If V2 is open, then it is definitely not closed. Therefore it cannot have
failed closed or be in a no-flow (DORMANT) phase.
Table 2 Potential component failures
Code               Component failure
PiB (1 ≤ i ≤ 6)    Pipe Pi is blocked
PiF (1 ≤ i ≤ 6)    Pipe Pi is fractured
ViFC (1 ≤ i ≤ 3)   Valve Vi fails closed
ViFO (1 ≤ i ≤ 3)   Valve Vi fails open
SiFH (1 ≤ i ≤ 2)   Sensor Si fails high
SiFL (1 ≤ i ≤ 2)   Sensor Si fails low
CiFH (1 ≤ i ≤ 2)   Controller Ci fails high
CiFL (1 ≤ i ≤ 2)   Controller Ci fails low
TR                 Water tank ruptured
TL                 Water tank leaks
NWMS               No water from main supply
Fig. 2 Coherent fault tree for the sensor reading flow through valve V2
Table 3 Sensor readings
Abbreviation  Sensor reading
FTV1          Flow through valve V1
FTV2          Flow through valve V2
FTV3          Flow through valve V3
NFTV1         No flow through valve V1
NFTV2         No flow through valve V2
WOST          Water in the overspill tray
