What are the contributions in this paper?

This paper presents a method for diagnosing faults in systems using FTA to explain the deviations from normal operation observed in sensor outputs. This paper illustrates the concepts of this method by applying the technique to a simplified water tank level control system.

What is the most accurate scheme for modelling the behaviour of the water tank system?

Scheme 4, which uses non-coherent fault trees and checks for consistency by using information from all the observation points, is the most accurate of schemes 1 to 4 for modelling the behaviour of the water tank system.

What is the way to narrow down the possibilities of a leak?

If the real cause were, for example, TL.S2FH (the first potential cause listed in Table 8), one way to narrow down the possibilities would be to switch the operating mode of the system, in this case from ACTIVE to DORMANT.

What is the reason why the water is being supplied to the tank?

Therefore water is contained in the tank, indicating that it has not ruptured, and water is being supplied to the tank, indicating that pipes P1 and P2 are clear and water is coming into the tank.

(Open Access) System fault diagnostics using fault tree analysis (2007) | Emma E. Hurdle

This item was submitted to Loughborough’s Institutional Repository (https://dspace.lboro.ac.uk/) by the author and is made available under the following Creative Commons Licence conditions.

For the full text of this licence, please go to: http://creativecommons.org/licenses/by-nc-nd/2.5/

System fault diagnostics using fault tree analysis

E E Hurdle, L M Bartlett, and J D Andrews*

Department of Aeronautical and Automotive Engineering, Loughborough University, Loughborough, UK

The manuscript was received on 15 June 2005 and was accepted after revision for publication on 17 May 2006.

DOI: 10.1243/1748006XJRR6

Abstract: Over the last 50 years, advances in technology have led to an increase in the complexity and sophistication of systems. More complex systems can be harder to maintain and the root cause of a fault more difficult to isolate.

Downtime resulting from a system failure can be dangerous or expensive, depending on the type of system. In aircraft systems the ability to diagnose quickly the causes of a fault can have a significant impact on the time taken to rectify the problem and to return the aircraft to service. In chemical process plants the need to diagnose causes of a safety-critical failure in a system can be vital and a diagnosis may be required within minutes. Speed of fault isolation can save time, reduce costs, and increase company productivity and therefore profits. System fault diagnosis is the process of identifying the cause of a malfunction by observing its effect at various test points.

Fault tree analysis (FTA) is a method that describes all possible causes of a specified system state in terms of the state of the components within the system. A system model is used to identify the states that the system should be in at any point in time. This paper presents a method for diagnosing faults in systems using FTA to explain the deviations from normal operation observed in sensor outputs. The causes of a system’s failure modes will be described in terms of the component states. This will be achieved with the use of coherent and non-coherent fault trees. A coherent fault tree is constructed from AND and OR logic and therefore considers only component-failed states. The non-coherent method expands this, allowing the use of NOT logic, which implies that the existence of component-failed states and component-working states are both taken into account. This paper illustrates the concepts of this method by applying the technique to a simplified water tank level control system.

Keywords: fault diagnosis, fault tree analysis

1 INTRODUCTION

A system can be analysed for faults in two different ways. The first is to test the system functionality at one point in time. The second continuously monitors the system and detects faults as they occur.

An example of an approach that carries out a series of tests to determine the system status at one point in time is the sequential fault diagnostic tool developed by Novak and co-workers [1–4]. The approach uses information about which symptoms are exhibited when the faults are present. The sequential fault diagnostic tool determines the best sequence in which to conduct the tests to locate the fault condition in the cheapest (or quickest) way. A similar method by Pattipati and Alexandridis [5] uses heuristic algorithms in order to determine the most cost-effective sequence of tests. These methods are limited to situations where only a single fault is expected to exist at any point in time; they do not take into consideration multiple component failures. Shakeri et al. [6] extended sequential test sequencing to diagnose multiple failures in systems. However, this method takes a considerable length of time to obtain a diagnosis.

* Corresponding author: Department of Aeronautical and Automotive Engineering, University of Loughborough, Loughborough, Leicestershire LE11 3TU, UK. email: J.D.Andrews@lboro.ac.uk

JRR6 © IMechE 2007 Proc. IMechE Vol. 221 Part O: J. Risk and Reliability

Another approach to diagnosis is the use of graphical models to describe the propagation of faults in systems. Rao [7] developed two algorithms that use the information from directed graphs of systems to diagnose single failures at one point in time. This technique was developed further by Pattipati [8] to diagnose multiple failures in systems. This graphical method, however, does not take into consideration faults that do not immediately affect the status of the system when they occur.

Failure modes and effects analysis (FMEA) is a structured qualitative analysis of a system, subsystem, or function that can be used to identify potential system failure modes, their causes, and the effects on the system operation associated with the occurrence of the failure modes. Price [9] demonstrated the use of automated FMEA to generate reports that could be used in a diagnostic tool to diagnose multiple faults in systems at one point in time. The failures from the FMEA are only generated to a chosen likelihood of occurrence; therefore not all possible failure outcomes for a system scenario may be obtained. Paasch and Mocko [10] use FMEA and fault tree analysis (FTA) to develop a model that uses matrix manipulation for diagnosing faults in systems at one point in time. This research does not, however, consider multiple failures.

Papadopoulos [11] has carried out work using state charts and fault trees to provide continuous on-line monitoring and rectification of systems. NOT logic is excluded from the fault trees; therefore only component failures are taken into account to obtain a diagnosis. As a result, some faults occurring simultaneously have required conflicting remedial procedures. Yangping et al. [12] also developed a fault-tree-based method that considers only component failures, which uses genetic algorithms to monitor continuously for faults in nuclear power plants. Genetic search is slow in obtaining solutions and there can be problems determining when a global rather than a local diagnosis has been obtained.

Many system failures are not the result of a single fault. Therefore the ability to diagnose multiple faults is vitally important. Methods of finding faults or combinations of faults as they occur are the subject of this paper. The approach is based on the fault tree method [13–15]. This is traditionally used to quantify the likelihood of a system failure. In this application the logic diagram is used to develop causes of a system symptom exhibited by a sensor reading, in terms of component conditions. In order to illustrate the features of the method described in this paper it will be applied to a simplified water tank level control system.

2 THE WATER TANK SYSTEM

The water tank system is illustrated in Fig. 1. It aims to maintain the level of water between two predetermined limits. In normal operational mode, water flows out of the system through valve V2. The level control system determines when water is flowing from the tank and then refills it by opening valve V1 until the desired tank level is reached. The overspill tray located beneath the tank catches water if the tank has ruptured, if water has leaked out through a crack or hole, or if the water level overflows from the top.

Fig. 1 Water tank system (figure labels: main supply, inlet valve, outlet valve, safety valve)

2.1 System component description

The water tank system shown in Fig. 1 consists of three valves labelled V1, V2, and V3, two level sensors represented by S1 and S2, two controllers C1 and C2, and an overspill tray labelled TRAY. There are six sections of pipes identified by the labels P1 to P6.

V1 is an air-to-open (A/O) inlet valve controlled by C1. The level sensor S1 detects the height of the water in the tank. In normal operating mode, if the water in the tank falls below the required level (as indicated by the sensor S1), the controller C1 opens valve V1, allowing water into the tank. Conversely, if the water in the tank rises to the required level, then C1 closes V1.

V2 is a manual (MAN) valve, controlled by an operator in response to demand. Finally, valve V3 is an air-to-close (A/C) valve that operates as a safety valve controlled by C2. Normally this will only become operational when a component failure occurs which causes a very high level of water in the tank. A signal from S2 would cause the controller C2 to open valve V3 to reduce the level of water in the tank.

The overspill tray, located underneath the tank, collects any spillages that may occur owing to a failure in the system. Water will therefore be present in the overspill tray if the tank has ruptured, if the tank is leaking, or if the water level overflows from the top of the tank.

2.2 System operating modes

2.2.1 Sensor locations

The status of the system is determined using measurements provided by flow sensors situated next to each of the three valves in the system. The sensors are denoted by VF1, VF2, and VF3 for locations at V1, V2, and V3 respectively. These each detect the presence or absence of flow of water, denoted as flow F or no flow NF respectively. For this demonstration study it is assumed that the sensors are perfectly reliable. A fourth sensor denoted by SP1 is located in the overspill tray to indicate whether any water has escaped from the tank. Its reading is interpreted as water W or no water NW. The sensor locations described above are called the system observation points. The four observation points for V1, V2, V3, and TRAY each take one of two readings, giving 2^4 = 16 different scenarios that the system could potentially produce. These are listed in Table 1.
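The scenario list can be reproduced mechanically from the four two-state observation points. The following is a minimal Python sketch, assuming the ordering used in Table 1 (flow before no flow, water before no water, with the last observation point varying fastest); the names `OBSERVATION_POINTS` and `enumerate_scenarios` are illustrative and do not come from the paper.

```python
from itertools import product

# Possible readings at each observation point, in the order used by Table 1.
OBSERVATION_POINTS = {
    "V1": ("F", "NF"),
    "V2": ("F", "NF"),
    "V3": ("F", "NF"),
    "TRAY": ("W", "NW"),
}


def enumerate_scenarios():
    """Return {scenario number: (V1, V2, V3, TRAY)} for every combination."""
    combos = product(*OBSERVATION_POINTS.values())
    return {number: combo for number, combo in enumerate(combos, start=1)}


scenarios = enumerate_scenarios()
```

Under this ordering, `scenarios[4]` and `scenarios[16]` correspond to the rows that the text identifies as the expected ACTIVE and DORMANT readings.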

Were this a real system, additional sensors could be added in order to provide a more complete picture of its operating state. The level control sensors S1 and S2 could also be used. However, for the purposes of demonstrating the fault-tree-based fault diagnostics technique the system sensors as described are sufficient, without any additional complexity.

The system has two operating modes: ACTIVE, when the operator opens valve V2, and DORMANT, when V2 is closed. In the ACTIVE operating mode, water is taken out of the system through valve V2 and the tank is refilled by water coming in through valve V1 from the main water supply. Water would not exit the system through valve V3 and there would be no water in the overspill tray. The sensor readings for the system when ACTIVE should be as those given in scenario 4 in Table 1. In the DORMANT operating mode the system is effectively on standby with all three valves remaining closed and the overspill tray empty. This should result in the sensor readings given in scenario 16 in Table 1.

The sensor readings given by scenarios 4 and 16 are those which, under steady state conditions, represent the model of how the system should behave when ACTIVE and DORMANT respectively. Given that the system is in the ACTIVE or DORMANT state, any sensor readings that deviate from those expected are regarded as being indicative of some fault within the system.
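The comparison between observed and expected readings can be sketched as follows. This is an illustrative Python fragment only: the expected readings are those of scenarios 4 and 16, while the function name `find_deviations`, the dictionary layout, and the example observation are assumptions made here for demonstration.

```python
# Expected steady-state readings per operating mode (scenarios 4 and 16 of Table 1).
EXPECTED = {
    "ACTIVE": {"V1": "F", "V2": "F", "V3": "NF", "TRAY": "NW"},
    "DORMANT": {"V1": "NF", "V2": "NF", "V3": "NF", "TRAY": "NW"},
}


def find_deviations(mode, observed):
    """Return {point: (expected, observed)} for readings that differ from the model."""
    expected = EXPECTED[mode]
    return {
        point: (expected[point], observed[point])
        for point in expected
        if observed[point] != expected[point]
    }


# Hypothetical observation: system ACTIVE, but no flow at V1 and water in the tray.
deviations = find_deviations(
    "ACTIVE", {"V1": "NF", "V2": "F", "V3": "NF", "TRAY": "W"}
)
```

Each deviating reading returned by such a check is a symptom whose causes would then be developed with a fault tree.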

2.2.2 Possible component failures

In order to apply FTA to a system the faults that could occur for each of the system components need to be defined. Table 2 contains a list of possible component failures and their codes.
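The full set of basic events can be enumerated from the component counts given in section 2.1 (six pipes, three valves, two sensors, two controllers). The sketch below assumes that each indexed code is formed as component label, index, then failure suffix, a format inferred from codes such as S2FH rather than stated explicitly; `basic_event_codes` and the list names are illustrative.

```python
# Assumed code format: component label + index + failure suffix (e.g. "S2FH").
# Each entry is (component label, failure suffix, number of such components).
INDEXED_FAILURES = [
    ("P", "B", 6),   # pipe blocked
    ("P", "F", 6),   # pipe fractured
    ("V", "FC", 3),  # valve fails closed
    ("V", "FO", 3),  # valve fails open
    ("S", "FH", 2),  # sensor fails high
    ("S", "FL", 2),  # sensor fails low
    ("C", "FH", 2),  # controller fails high
    ("C", "FL", 2),  # controller fails low
]

# Failures that are not indexed: tank ruptured, tank leaks, no water from main supply.
SINGLE_FAILURES = ["TR", "TL", "NWMS"]


def basic_event_codes():
    """Return every basic-event code implied by the component counts."""
    codes = [
        f"{label}{index}{suffix}"
        for label, suffix, count in INDEXED_FAILURES
        for index in range(1, count + 1)
    ]
    return codes + SINGLE_FAILURES


codes = basic_event_codes()
```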

The two operating modes are also represented in the fault trees. ACTIVE signifies that the operator has attempted to open valve V2. DORMANT is used to indicate that the operator has tried to close V2. It should be noted that this is a two-mode system and so only one of the variables ACTIVE or DORMANT can be true at any time.

Table 1 System scenarios

Scenario  V1  V2  V3  TRAY
1         F   F   F   W
2         F   F   F   NW
3         F   F   NF  W
4         F   F   NF  NW
5         F   NF  F   W
6         F   NF  F   NW
7         F   NF  NF  W
8         F   NF  NF  NW
9         NF  F   F   W
10        NF  F   F   NW
11        NF  F   NF  W
12        NF  F   NF  NW
13        NF  NF  F   W
14        NF  NF  F   NW
15        NF  NF  NF  W
16        NF  NF  NF  NW

3 SENSOR DEVIATION MODELS

In the application of FTA to system fault diagnostics a series of failure logic diagrams is produced, representing the causes of any sensor readings. These are developed in a fault tree in terms of the component failure conditions of the system operating state. The list of all possible sensor readings for the system is shown in Table 3.

Two other possible sensor readings for the system are ‘no flow through valve V3’ and ‘no water in the overspill tray’. However, these sensor readings would occur under normal operational conditions (i.e. without any failures occurring) in both the ACTIVE and the DORMANT operating states.

3.1 Fault tree construction

Fault trees were generated for each sensor reading listed in Table 3 using both coherent and non-coherent methods. Coherent fault trees are constructed from AND and OR logic and feature only component failure events in the causes of sensor status. The non-coherent method expands this, allowing the use of the NOT operator. This implies that both component-failed and component-working states are taken into account when developing the causes of the sensor status.

3.1.1 Coherent fault tree for the sensor reading flow through valve V2

The coherent fault tree for the sensor reading ‘flow through valve V2’ is presented in Fig. 2. Referring to Fig. 1, ‘flow through valve V2’ can only occur because V2 is open. There are two possibilities for having V2 open: either the valve has failed open, which is a basic event, or the system is in a flow phase and is therefore in the ACTIVE operating mode, represented by a house event. These two outcomes terminate the logic development; in this instance the fault tree is small.
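The logic of this small tree can be written down directly as a single OR gate over one basic event and one house event. The Python sketch below is illustrative only: the nested-tuple representation, the `evaluate` helper, and the event name `V2FO` (valve V2 fails open, following the assumed code format of Table 2) are choices made here, not the authors' implementation.

```python
# A tree node is either a leaf event name or a tuple ("AND" | "OR", child, ...).
# FTV2: flow through valve V2 occurs if V2 has failed open OR the mode is ACTIVE.
FTV2_COHERENT = ("OR", "V2FO", "ACTIVE")


def evaluate(node, state):
    """Evaluate a coherent fault tree against a {event name: bool} state."""
    if isinstance(node, str):
        return state[node]
    gate, *children = node
    results = [evaluate(child, state) for child in children]
    if gate == "AND":
        return all(results)
    if gate == "OR":
        return any(results)
    raise ValueError(f"unsupported gate: {gate}")


# House event ACTIVE is true and the valve has not failed: flow is expected.
flow_when_active = evaluate(FTV2_COHERENT, {"V2FO": False, "ACTIVE": True})
# Mode is not ACTIVE but the valve has failed open: flow indicates a fault.
flow_when_failed_open = evaluate(FTV2_COHERENT, {"V2FO": True, "ACTIVE": False})
```

Because only AND and OR gates are supported, this evaluator covers coherent trees only; representing the non-coherent trees would additionally require a NOT gate.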

3.1.2 Non-coherent fault tree for the sensor reading flow through valve V2

The non-coherent fault tree for the sensor reading ‘flow through valve V2’ is shown in Figs 3(a) to (c). The information described by the coherent fault tree in section 3.1.1 can now be expanded to include everything in the system that is known not to have failed. Figures 3(a) to (c) show that the introduction of NOT logic significantly increases the amount of information known about the system behaviour.

The sensor reading ‘flow through valve V2’ occurs because V2 is open and also because there is water available. As in the coherent fault tree, V2 is open either because it has failed open or because it is in a flow phase (therefore it is in the ACTIVE operating mode). If V2 is open, then it is definitely not closed. Therefore it cannot have failed closed or be in a no-flow (DORMANT) phase.

Table 2 Potential component failures

Code               Component failure
PiB (1 ≤ i ≤ 6)    Pipe Pi is blocked
PiF (1 ≤ i ≤ 6)    Pipe Pi is fractured
ViFC (1 ≤ i ≤ 3)   Valve Vi fails closed
ViFO (1 ≤ i ≤ 3)   Valve Vi fails open
SiFH (1 ≤ i ≤ 2)   Sensor Si fails high
SiFL (1 ≤ i ≤ 2)   Sensor Si fails low
CiFH (1 ≤ i ≤ 2)   Controller Ci fails high
CiFL (1 ≤ i ≤ 2)   Controller Ci fails low
TR                 Water tank ruptured
TL                 Water tank leaks
NWMS               No water from main supply

Fig. 2 Coherent fault tree for the sensor reading flow through valve V2

Table 3 Sensor readings

Abbreviation  Sensor reading
FTV1          Flow through valve V1
FTV2          Flow through valve V2
FTV3          Flow through valve V3
NFTV1         No flow through valve V1
NFTV2         No flow through valve V2
WOST          Water in the overspill tray


References

Reliability Evaluation of Engineering Systems

Fault Tree Analysis, Methods, and Applications - A Review

Reliability and Risk Assessment

Application of heuristic search and information theory to sequential fault diagnosis

Reliability and Risk Assessment
