An overview of fault tree analysis and its application in model based dependability analysis

Sohag Kabir1
01 Jul 2017-Expert Systems With Applications (Elsevier)-Vol. 77, pp 114-135
TL;DR: The standard fault tree and its limitations are reviewed, along with a number of prominent MBDA techniques in which fault trees are used as a means for system dependability analysis; insight into their working mechanisms, applicability, strengths and challenges is provided.
Abstract: I provide an overview of the Fault Tree Analysis method. I review different extensions of fault trees. A number of model-based dependability analysis approaches are reviewed. I outline the future outlook for model-based dependability analysis. Fault Tree Analysis (FTA) is a well-established and well-understood technique, widely used for dependability evaluation of a wide range of systems. Although many extensions of fault trees have been proposed, they suffer from a variety of shortcomings. In particular, even where software tool support exists, these analyses require a lot of manual effort. Over the past two decades, research has focused on simplifying dependability analysis by looking at how we can synthesise dependability information from system models automatically. This has led to the field of model-based dependability analysis (MBDA). Different tools and techniques have been developed as part of MBDA to automate the generation of dependability analysis artefacts such as fault trees. Firstly, this paper reviews the standard fault tree with its limitations. Secondly, different extensions of standard fault trees are reviewed. Thirdly, this paper reviews a number of prominent MBDA techniques where fault trees are used as a means for system dependability analysis and provides an insight into their working mechanism, applicability, strengths and challenges. Finally, the future outlook for MBDA is outlined, which includes the prospect of developing expert and intelligent systems for dependability analysis of complex open systems under the conditions of uncertainty.

Summary (5 min read)

1. Introduction

  • Safety critical systems are extensively used in many industries, including the aerospace, automotive, medical, and energy sectors.
  • FTA is deductive in nature, meaning that the analysis starts with a top event (a system failure) and works backwards from the top of the tree towards the leaves to determine the root causes of the top event.
  • Dynamic dependability assessment overcomes many of the limitations of static dependability analysis by allowing the dependability assessment of dynamic systems.
  • The remainder of this paper is organised as follows: Section 2 reviews the classical fault tree analysis technique and describes its limitations.

2. Standard Fault Trees

  • FTA was originally developed to help in the design of the US Air Force’s Minuteman missile system.
  • The approach was successfully used by David Haasl from the Boeing Company to analyse the whole system.
  • Several papers on fault tree analysis were presented at the first System Safety Conference in 1965 (Ericson, 1999).
  • Since its creation, FTA has been used in a variety of fields, including but not limited to the automotive, aerospace, and nuclear industries (Walker & Papadopoulos, 2009; Kabir, Azad, Walker, & Gheraibia, 2015).
  • The Fault Tree Handbook (Vesely et al., 2002) provides a broad introduction to standard fault trees.

2.1 Fault Tree Symbology

  • A fault tree consists of three types of nodes: events, gates and transfer symbols.
  • Basic events are represented as leaf nodes in the fault tree, and they combine to cause intermediate events.
  • Normal events are represented by a house symbol.
  • As with the OR gate, an AND gate may have any number of input events; in contrast to the OR gate, however, the AND gate usually represents a causal relationship between its inputs and outputs.
  • The Transfer In symbol specifies that the tree is developed further elsewhere and that the branch corresponds to this transfer.

2.2 Analysis of Standard Fault Trees

  • Analysis of standard fault trees is usually performed on two levels: a qualitative level and a quantitative level.
  • A classical top-down algorithm starts with the top event of the fault tree and recursively explores the tree by expanding the intermediate events into their contributing basic events.
  • After performing all the above steps, each row of the resulting table will contain a minimal cut set.
  • This top-down expansion is not the most efficient technique; algorithms such as MICSUP (Pande, Spector, & Chatterjee, 1975) and ELRAFT (Semanderes, 1971) tend to be faster.
  • In addition to determining the dominant MCS, the importance of basic events can be obtained in a similar way.
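
The two analysis levels described above can be illustrated with a minimal sketch. It is not the paper's algorithm: the example tree, the event names and the probabilities are hypothetical, the expansion is a plain recursive top-down substitution, and the quantitative step uses the rare-event approximation (the sum of the cut set probabilities), which assumes statistically independent basic events and gives an upper bound on the top event probability.

```python
from itertools import product

# Hypothetical tree: TOP = G1 OR C, G1 = A AND G2, G2 = B OR A
TREE = {
    "TOP": ("OR", ["G1", "C"]),
    "G1": ("AND", ["A", "G2"]),
    "G2": ("OR", ["B", "A"]),
}
# Hypothetical basic event probabilities (assumed independent)
BASIC_PROB = {"A": 0.01, "B": 0.02, "C": 0.001}


def cut_sets(node):
    """Expand a node top-down into a list of cut sets (frozensets of basic events)."""
    if node not in TREE:  # leaf: a basic event
        return [frozenset([node])]
    gate, inputs = TREE[node]
    child_sets = [cut_sets(child) for child in inputs]
    if gate == "OR":  # OR: every input contributes its own rows
        return [cs for sets in child_sets for cs in sets]
    if gate == "AND":  # AND: merge one row from each input
        return [frozenset().union(*combo) for combo in product(*child_sets)]
    raise ValueError(f"unsupported gate type: {gate}")


def minimise(sets):
    """Drop duplicates and every cut set that strictly contains another one."""
    unique = set(sets)
    minimal = [s for s in unique if not any(other < s for other in unique)]
    return sorted(minimal, key=lambda s: (len(s), sorted(s)))


def rare_event_probability(minimal_cut_sets, prob):
    """Sum of cut set probabilities (rare-event approximation, independence assumed)."""
    total = 0.0
    for cs in minimal_cut_sets:
        p = 1.0
        for event in cs:
            p *= prob[event]
        total += p
    return total


mcs = minimise(cut_sets("TOP"))
print([sorted(cs) for cs in mcs])
print(rare_event_probability(mcs, BASIC_PROB))
```

In this example the cut set {A, B} is discarded because {A} alone already causes the top event, which is the minimisation step that turns cut sets into minimal cut sets.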

2.3 Limitations of Standard Fault Tree

  • Modern large and complex systems are becoming increasingly dynamic in nature.
  • The dynamic behaviour of systems leads to different dynamic failure characteristics, such as functionally dependent events and priorities of failure events.
  • Considering the first two conditions, one can conclude that without input the system cannot operate, and that the system also cannot operate if all three components fail.
  • To overcome this limitation, a number of extensions to static fault trees such as dynamic fault trees (DFTs) (Dugan et al., 1992), State/Event Fault Trees (Kaiser et al., 2007), Temporal Fault Trees (Palshikar, 2002) and Pandora Temporal Fault Trees (Walker, 2009) have been proposed.
  • Secondly, as the system grows in size, the manual nature of the analysis process increases the risk of introducing errors or producing incomplete results.

3.1 Component Fault Trees

  • This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/.
  • In the CFT approach, smaller fault trees for each system component are defined and those component fault trees are organised in a hierarchical structure according to the architectural hierarchy of the system (Kaiser, Liggesmeyer, & Mäckel, 2003).
  • Moreover, CFTs utilise input and output failure ports and internal failure events.
  • CFTs differ from SFTs in that they allow multiple top events to be defined and represent repeating events only once.

3.2 Dynamic Fault Trees

  • Dynamic Fault Trees (DFTs) (Dugan et al., 1992) are the most prominent dynamic extension of SFTs; they enable a fault tree to capture sequence-dependent dynamic behaviour.
  • The FDEP gate models a scenario in which the operation of some components of a system depends on the operation of another component of the system.
  • The dormancy factor of a component in warm spare mode lies somewhere between the dormancy factors of the cold and hot spare modes (e.g., 0.5).
  • If the primary component of any of the SPARE gates fails, it is then replaced by the first available spare component (i.e., neither failed nor already occupied by another SPARE gate).
  • DFTs are intended to perform quantitative reliability analysis of dynamic systems, and consequently they have limited support for qualitative analysis.
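
The effect of the dormancy factor can be made concrete with a small sketch. It is a worked formula under stated assumptions, not the DFT solution method of the paper: one primary and one spare, both with the same exponential failure rate, perfect and instantaneous switching, and a spare whose failure rate is scaled by the dormancy factor `alpha` while it is dormant. Under these assumptions the gate survives to time t with probability R(t) = e^{-λt} [1 + (1 - e^{-αλt}) / α], which reduces to e^{-λt}(1 + λt) for a cold spare (α = 0) and to the two-component parallel formula for a hot spare (α = 1). The function name and the numbers are illustrative.

```python
import math


def spare_gate_reliability(rate, alpha, t):
    """Probability that a one-primary, one-spare SPARE gate has not failed by time t.

    rate  : exponential failure rate of an active component (assumed equal for both)
    alpha : dormancy factor, 0.0 (cold) <= alpha <= 1.0 (hot)
    """
    base = math.exp(-rate * t)
    if alpha == 0.0:
        # Cold spare: the spare cannot fail while dormant.
        return base * (1.0 + rate * t)
    # Warm or hot spare: the dormant spare fails at rate alpha * rate.
    return base * (1.0 + (1.0 - math.exp(-alpha * rate * t)) / alpha)


for alpha in (0.0, 0.5, 1.0):
    print(alpha, spare_gate_reliability(rate=0.001, alpha=alpha, t=1000.0))
```

As expected, the cold spare gives the highest reliability and the hot spare the lowest, with the warm spare in between.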

3.3 Pandora Temporal Fault Trees

  • Pandora is an extension of classical fault trees, which makes conventional fault trees capable of dynamic analysis (Walker, Bottaci, & Papadopoulos, 2007; Walker, 2009).
  • The POR gate also represents a sequence between the events, but it specifies an ordered disjunction rather than an ordered conjunction, i.e., event X must occur before event Y if event Y occurs at all.
  • In order to relate the Boolean gates with the temporal gates, Pandora defines a set of new temporal laws (Walker, 2009).
  • The analytical solution (Edifor, Walker, & Gordon, 2012, 2013) to Pandora defines mathematical expressions for probabilistic evaluation of Pandora TFTs.
  • Diagnostic analysis involves calculating and updating the posterior probability of basic events given observed evidence of the system failure.
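
The difference between an ordered conjunction and the ordered disjunction of the POR gate can be sketched by evaluating gates over event occurrence times. This is an illustration only, assuming that each event is described by a single occurrence time and that an event which never occurs has time infinity; the function names are hypothetical, and the ordered conjunction is written here as `pand` (Priority-AND).

```python
import math

NEVER = math.inf  # occurrence time of an event that never happens


def gate_and(x, y):
    """Boolean AND: true once both inputs have occurred."""
    return max(x, y)


def gate_or(x, y):
    """Boolean OR: true once either input has occurred."""
    return min(x, y)


def pand(x, y):
    """Ordered conjunction: both occur, and X strictly before Y."""
    return y if x < y else NEVER


def por(x, y):
    """Ordered disjunction: X occurs, and Y (if it occurs at all) comes later."""
    return x if x < y else NEVER


# X fails at t=3 and Y never fails: POR is satisfied, PAND is not.
print(por(3.0, NEVER), pand(3.0, NEVER))
# Y fails first: neither temporal gate is satisfied, but the Boolean AND is.
print(por(5.0, 2.0), pand(5.0, 2.0), gate_and(5.0, 2.0))
```

The first case shows the distinction stated in the text: POR does not require Y to occur, whereas the ordered conjunction does.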

3.4 State Event Fault Trees

  • As already mentioned, the classical combinatorial FTA is suitable for modelling static behaviour of systems but is not suitable for modelling dynamic behaviour.
  • Therefore, elements/components in the system architecture are modelled with a set of states and probabilistic transitions between these states (see Fig. 9).
  • Different types of techniques are now required to convert SEFTs into other representations, such as Petri Nets or Markov chains, for quantitative evaluation.
  • This problem can be remedied to some extent by using a combinatorial-FTA like algorithm for the static part of the system and using more efficient algorithms for the dynamic part of the system.
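
To illustrate what quantitative evaluation on a Markov chain involves, the sketch below computes the transient state probabilities of a single repairable component modelled as a two-state continuous-time Markov chain. It is not an SEFT translation: the state names, the rates and the explicit Euler integration of the forward equations are all assumptions chosen for brevity, and dedicated tools use more efficient numerical methods.

```python
import math


def transient(rates, initial, t, steps=20_000):
    """Integrate the forward equations dp/dt = pQ with explicit Euler steps.

    rates   : {(source, target): transition rate}
    initial : {state: probability at time 0}
    """
    p = dict(initial)
    dt = t / steps
    for _ in range(steps):
        flow = {state: 0.0 for state in p}
        for (src, dst), rate in rates.items():
            moved = p[src] * rate * dt  # probability mass moving src -> dst
            flow[src] -= moved
            flow[dst] += moved
        for state in p:
            p[state] += flow[state]
    return p


# Hypothetical component: fails at rate 0.002 per hour, is repaired at 0.05 per hour.
FAIL, REPAIR = 0.002, 0.05
rates = {("ok", "failed"): FAIL, ("failed", "ok"): REPAIR}
result = transient(rates, {"ok": 1.0, "failed": 0.0}, t=100.0)

# Known closed form for the two-state chain, used here as a cross-check.
exact = FAIL / (FAIL + REPAIR) * (1.0 - math.exp(-(FAIL + REPAIR) * 100.0))
print(result["failed"], exact)
```

The number of states grows quickly once several such components are composed, which is the state-space problem that the last bullet above refers to.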

4.1 Failure Propagation and Transformation Notation

  • Failure Propagation and Transformation Notation (FPTN), which overcomes many of the limitations of FTA, is the first modular and graphical method to specify failure behaviour of systems with complex architectures.
  • The basic unit of FPTN is a “module”, which is usually represented by a simple box with a set of input and output failure modes.
  • This section is important because it defines how the module is affected by the other modules or environment and how other modules or environment are likely to be affected by this module.
  • In normal operating condition, the sensors monitor the pressure on the steam boiler and report it back to the controller.

4.2 Hierarchically Performed Hazard Origin and Propagation Studies

  • Hierarchically Performed Hazard Origin & Propagation Studies or HiP-HOPS (Papadopoulos & McDermid, 1999; Papadopoulos, 2000) is one of the more advanced and well supported compositional model based dependability analysis techniques.
  • The dependability related information includes component failure modes and expressions for output deviations, which describe how a component can fail and how it responds to failures occurring in other parts of the system.
  • The system modelling and failure annotation phase allows analysts to provide information to the HiP-HOPS tool on how the different system components are interconnected and how they can fail.
  • In this way the tool traverses the whole architecture and creates local fault trees for the interconnected components.
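
The synthesis idea described above, where local failure expressions are stitched together by following component connections, can be sketched as follows. This is not the HiP-HOPS tool or its input format: the components, port names, expression encoding and helper functions are all hypothetical, and the sketch assumes an acyclic architecture.

```python
# Local failure data: (component, output deviation) -> expression over
# internal basic events ("basic") and deviations at the component's inputs ("input").
LOCAL = {
    ("valve", "omission-out"): ("OR", [("basic", "valve.stuck_closed"),
                                       ("input", "omission-cmd")]),
    ("controller", "omission-out"): ("OR", [("basic", "controller.crash"),
                                            ("AND", [("input", "omission-s1"),
                                                     ("input", "omission-s2")])]),
    ("sensor1", "omission-out"): ("basic", "sensor1.dead"),
    ("sensor2", "omission-out"): ("basic", "sensor2.dead"),
}

# Architecture: (component, input deviation) -> (upstream component, output deviation)
CONNECTIONS = {
    ("valve", "omission-cmd"): ("controller", "omission-out"),
    ("controller", "omission-s1"): ("sensor1", "omission-out"),
    ("controller", "omission-s2"): ("sensor2", "omission-out"),
}


def synthesise(component, deviation):
    """Build the fault tree for an output deviation by walking upstream."""
    return _expand(component, LOCAL[(component, deviation)])


def _expand(component, expr):
    kind = expr[0]
    if kind == "basic":
        return expr
    if kind == "input":
        # Replace an input deviation with the local tree of whatever feeds it.
        upstream, upstream_deviation = CONNECTIONS[(component, expr[1])]
        return synthesise(upstream, upstream_deviation)
    return (kind, [_expand(component, child) for child in expr[1]])


def render(expr):
    """Print a synthesised tree as a nested Boolean expression."""
    if expr[0] == "basic":
        return expr[1]
    joiner = " OR " if expr[0] == "OR" else " AND "
    return "(" + joiner.join(render(child) for child in expr[1]) + ")"


print(render(synthesise("valve", "omission-out")))
```

The resulting tree contains only basic events, so it can be handed to a conventional minimal cut set analysis.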

4.4 Formal Safety Analysis Platform—New Symbolic Model Verifier

  • The Formal Safety Analysis Platform FSAP/NuSMV-SA (Bozzano & Villafiorita, 2003, 2007) consists of a set of tools including a graphical user interface tool, FSAP, and an extension of model checking engine NuSMV.
  • The primary goal of the FSAP/NuSMV-SA toolset is to verify whether a system model satisfies its safety requirements; however, it is also capable of performing different types of safety analyses, e.g., automatic fault tree generation.
  • The next step is to assess the annotated system model against its functional safety requirements.
  • The minimal cut sets are obtained from this set of states by extracting only the failure variables from the states and writing the expressions for those failure modes based on the information obtained from the state machine.
  • As standard FTs are combinatorial, this process ignores dynamic information stored in the system model.
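
The idea of deriving cut sets from the states that violate a safety requirement can be illustrated with a brute-force sketch. It does not use NuSMV or reproduce the FSAP/NuSMV-SA algorithm: the system model, the failure variables and the hazard predicate are hypothetical, the search simply enumerates combinations of injected failures in order of increasing size, and it assumes that adding failures never repairs the system.

```python
from itertools import combinations

# Hypothetical failure variables injected into a small cooling system model.
FAILURE_VARS = ["pump1_failed", "pump2_failed", "power_failed"]


def hazard(state):
    """Safety requirement violated: neither pump delivers coolant."""
    pump1_runs = not state["pump1_failed"] and not state["power_failed"]
    pump2_runs = not state["pump2_failed"] and not state["power_failed"]
    return not (pump1_runs or pump2_runs)


def minimal_cut_sets():
    """Return the smallest failure combinations that lead to the hazard."""
    found = []
    for size in range(1, len(FAILURE_VARS) + 1):
        for subset in combinations(FAILURE_VARS, size):
            active = set(subset)
            # Skip supersets of a cut set that has already been found.
            if any(previous <= active for previous in found):
                continue
            state = {var: var in active for var in FAILURE_VARS}
            if hazard(state):
                found.append(frozenset(active))
    return found


print([sorted(cs) for cs in minimal_cut_sets()])
```

Because only the final combination of failure variables is inspected, the order in which the failures occur is lost, which mirrors the limitation noted in the last bullet above.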

4.5 AADL

  • Architecture Analysis and Design Language (AADL) is a domain-specific language standardised by the International Society of Automotive Engineers (SAE) for the specification and analysis of hardware and software architectures of performance-critical real-time systems (SAE, 2012).
  • AADL supports different types of interaction between components, for example events and dataflows, and the interactions between hardware and software components are defined through binding.
  • Several model transformation based methods have been proposed so far to support various analyses based on AADL models.
  • Li, Zhu, Ma, & Xu (2011) have also proposed a method to translate AADL models into fault trees.

4.6 Other Approaches

  • This section briefly describes two other approaches in which fault tree analysis could be used as a means for system dependability analysis.
  • Deductive Failure Order Analysis (Güdemann, Ortmeier, & Reif, 2008), an extension of DCCA, makes it possible to deduce temporal ordering information of critical sets from DCCA.
  • SAML utilises finite state automata with parallel synchronous execution capability and discrete time steps to describe a system model consisting of hardware and/or software components, environmental conditions, etc.
  • SAML models can be transformed automatically to the input format of other model based safety analysis techniques without changing the architecture of the systems.

5. Discussion and Future Outlook for MBDA

  • It is a priority for system analysts and engineers to ensure the dependability of safety-critical systems by identifying the potential risks they pose as early as possible and then minimising the likelihood of these risks.
  • The model based dependability analysis paradigm overcomes the above problems: it simplifies the dependability analysis process by automatically synthesising dependability related data from system models to generate dependability analysis artefacts such as fault trees and FMEAs.
  • This means that system analysts and risk modellers can apply MBDA approaches to early design models of the systems as part of the model-based design process.
  • Not all of the data needs to be analysed online; offline analysis of vast amounts of data will also be needed.

6. Conclusion

  • This paper provided an overview of fault tree analysis.
  • Different extensions of standard fault trees have been proposed to overcome some of the limitations.
  • Several tools and techniques have been developed to support MBDA.
  • MBDA techniques allow dependability models and analyses to be automatically synthesised from system engineering models.
  • This paper has discussed some of these challenges and provided directions for future research.


Citations
Journal ArticleDOI
TL;DR: The proposed approach makes the use of expert knowledge and fuzzy set theory for handling the uncertainty in the failure data and employs the Bayesian network modeling for capturing dependency among the events and for a robust probabilistic reasoning in the conditions of uncertainty.

179 citations


Cites methods from "An overview of fault tree analysis ..."

  • ...FTA is the most popular among all available techniques and it has been extensively used for risk analysis in several industries (Kabir, 2017; Khan et al., 2008; Wang et al., 2002)....


Journal ArticleDOI
TL;DR: A review of fuzzy set theory based methodologies applied to safety and reliability engineering, which include fuzzy FTA, fuzzy FMEA, fuzzy ETA, fuzzy Bayesian networks, fuzzy Markov chains, and fuzzy Petri nets is presented.

108 citations

Journal ArticleDOI
TL;DR: A method incorporating fuzzy probability and Bayesian network (BN) into multi-state systems (MSSs) with CCFs is proposed and can improve the ability of BN on reliability evaluation of complex system with uncertainty issues.
Abstract: Multi-state components, common cause failures (CCFs) and data uncertainty are the general problems for reliability analysis of complex engineering systems. In this paper, a method incorporating fuzzy probability and Bayesian network (BN) into multi-state systems (MSSs) with CCFs is proposed. In particular, basic theories of multi-state BN and fuzzy probability are developed. Moreover, a model integrating CCFs with BN has also been illustrated. In order to incorporate fuzzy probability into MSSs reliability evaluation considering common parent node generated by CCFs, fuzzy probability has to be translated into accurate probability through defuzzification and normalization methods which are both elaborated. In addition, quantitative analysis based on BN is carried out. In this paper, feed system of boring spindle in computer numerical control machine is analyzed as an example to validate the feasibility of the proposed method. It can improve the ability of BN on reliability evaluation of complex system with uncertainty issues.

96 citations

Journal ArticleDOI
TL;DR: In this article, the authors used the Fault Tree Analysis (FTA) method for both qualitative and quantitative evaluation of semi-submersible floating offshore wind turbine failure characteristics, indicating that most of the failures are caused by several basic factors.

93 citations

Journal ArticleDOI
TL;DR: This study aimed at establishing fault tree analysis (FTA) using expert opinion to compute the probability of an event using Boolean algebra, and the effectiveness of the proposed approach is demonstrated with a real-life case study.
Abstract: This study aimed at establishing fault tree analysis (FTA) using expert opinion to compute the probability of an event. To find the probability of the top event (TE), all probabilities of the basic events (BEs) should be available when the FTA is drawn. In this case, employing expert judgment can be used as an alternative to failure data in an awkward situation. The fuzzy analytical hierarchy process as a standard technique is used to give a specific weight to each expert, and fuzzy set theory is engaged for aggregating expert opinion. In this regard, the probability of BEs will be computed and, consequently, the probability of the TE obtained using Boolean algebra. Additionally, to reduce the probability of the TE in terms of three parameters (safety consequences, cost and benefit), the importance measurement technique and modified TOPSIS was employed. The effectiveness of the proposed approach is demonstrated with a real-life case study.

89 citations


Cites background from "An overview of fault tree analysis ..."

  • ...Kabir [6] presented an extension of temporal FTA....


References
Book
01 Aug 1996
TL;DR: A separation theorem for convex fuzzy sets is proved without requiring that the fuzzy sets be disjoint.
Abstract: A fuzzy set is a class of objects with a continuum of grades of membership. Such a set is characterized by a membership (characteristic) function which assigns to each object a grade of membership ranging between zero and one. The notions of inclusion, union, intersection, complement, relation, convexity, etc., are extended to such sets, and various properties of these notions in the context of fuzzy sets are established. In particular, a separation theorem for convex fuzzy sets is proved without requiring that the fuzzy sets be disjoint.

52,705 citations

Book ChapterDOI
14 Jul 2011
TL;DR: A major new release of the PRISM probabilistic model checker is described, adding, in particular, quantitative verification of (priced) probabilistic timed automata.
Abstract: This paper describes a major new release of the PRISM probabilistic model checker, adding, in particular, quantitative verification of (priced) probabilistic timed automata. These model systems exhibiting probabilistic, nondeterministic and real-time characteristics. In many application domains, all three aspects are essential; this includes, for example, embedded controllers in automotive or avionic systems, wireless communication protocols such as Bluetooth or Zigbee, and randomised security protocols. PRISM, which is open-source, also contains several new components that are of independent use. These include: an extensible toolkit for building, verifying and refining abstractions of probabilistic models; an explicit-state probabilistic model checking library; a discrete-event simulation engine for statistical model checking; support for generation of optimal adversaries/strategies; and a benchmark suite.

2,377 citations

Book
01 Jan 1995
TL;DR: This chapter discusses the role of humans in Automated Systems, the nature of risk, and elements of a Safeware Program, which aims to manage Safety and Security through design and implementation.
Abstract: I The Nature Of Risk. Risk In Modern Society. Changing Attitudes Toward Risk. Is Increased Concern Justified?. Unique Risk Factors in Industrialized Society. Computers And Risk. The Role of Computers in Accidents. Software Myths. Why Software Engineering is hard. The Reality We Face. Causes Of Accidents. The Concept of Causality. Flaws in the Safety Culture. Ineffective Organizational Structure. Ineffective Technical Activities. Human Error And Risk. Do Humans Cause Most Accidents?. The Need for Humans in Automated Systems. Human Error as Human-Task Mismatch. Conclusions. The Role Of Humans In Automated Systems. Mental Models. The Human as Monitor. The Human as Backup. The Human as Partner. Conclusions. II Introduction To System Safety. Foundations Of System Safety. Safety Engineering Pre-World War II. Systems Theory. Systems Engineering. Systems Analysis. Fundamentals Of System Safety. Historical Development. Basic Concepts. Software System Safety. Cost and Effectiveness of System Safety. Other Approaches To Safety. Industrial Safety. Reliability Engineering. Application-Specific Approaches to Safety. III Definitions And Models. Terminology. Failure and Error. Accident and Incident. Hazard. Risk. Safety. Safety and Security. Accident And Human Error Models. Accident Models. Human Task and Error Models. Summary. IV Elements Of A Safeware Program. Managing Safety. The Role of General Management. Place in the Organizational Structure. Documentation. The System And Software Safety Process. The General Tasks. Conceptual Development. Design. Full-Scale Development. Production and Deployment. Operation. "Examples. Hazard Analysis. The Hazard Analysis Process. Types of System Models. General Types of Analysis. Limitations and Criticisms of Hazard Analysis. Hazard Analysis Models And Techniques. Checklists. Hazard Indices. Fault Tree Analysis. Management Oversight and Risk Tree (MORT) Analysis. Event Tree Analysis. Cause-Consequence analysis (CCA). 
Hazards and Operability Analysis (HAZOP). Interface Analyses. Failure Modes and Effects Analysis (FMEA). Failure Modes, Effects, and Criticality Analysis (FMECA). Fault Hazard Analysis (FHA). State Machine Hazard Analysis (SMHA). Task and Human Error Analysis Techniques. Evaluations of Hazard Analysis Techniques. Software Hazard And Requirements Analysis. Process Considerations. Requirements Specification Components. Completeness in Requirements Specifications. Completeness Criteria for Requirements Analysis. Constraint Analysis. Designing For Safety. The Design Process. Design Techniques. Design Modification and Maintenance. Design Of The Human-Machine Interface. General Process Considerations. Matching Tasks to Human Characteristics. Reducing Safety-Critical Human Errors. Providing Appropriate Information and Feedback. Training and Maintaining Skills. Guidelines for Safe HMI Design. Verification Of Safety. Dynamic Analysis. Static Analysis. Independent Verification and Validation. Summary.

1,833 citations


"An overview of fault tree analysis ..." refers background in this paper

  • ...…with fault tree analysis is that it is primarily a manual process, i.e., performed manually either by a single person or a group of persons to produce some comprehensive documents to satisfy the safety requirements and to determine strategies to alleviate the effects of failures (Leveson, 1995)....


Book
28 Nov 1995
TL;DR: This book presents a unified theory of Generalized Stochastic Petri Nets together with a set of illustrative examples from different application fields to show how this methodology can be applied in a range of domains.
Abstract: From the Publisher: This book presents a unified theory of Generalized Stochastic Petri Nets (GSPNs) together with a set of illustrative examples from different application fields. The continuing success of GSPNs and the increasing interest in using them as a modelling paradigm for the quantitative analysis of distributed systems suggested the preparation of this volume with the intent of providing newcomers to the field with a useful tool for their first approach. Readers will find a clear and informal explanation of the concepts followed by formal definitions when necessary or helpful. The largest section of the book however is devoted to showing how this methodology can be applied in a range of domains.

1,487 citations

Journal ArticleDOI
TL;DR: It is shown that any FT can be directly mapped into a BN and that basic inference techniques on the latter may be used to obtain classical parameters computed from the former, i.e. reliability of the Top Event or of any sub-system, criticality of components, etc.

819 citations

Frequently Asked Questions (2)
Q1. What are the contributions mentioned in the paper "An overview of fault tree analysis and its application in model based dependability analysis" ?

Firstly, this paper reviews the standard fault tree with its limitations. Thirdly, this paper reviews a number of prominent MBDA techniques where fault trees are used as a means for system dependability analysis and provides an insight into their working mechanism, applicability, strengths and challenges. 

Therefore, future research associated with these approaches is likely to be concerned with improving the power and time complexity of the tools and techniques in the context of large and complex system models. This has opened new avenues for further research to develop expert systems by combining MBDA approaches with other soft computing approaches for the assurance of dependability of such open systems. One possible avenue worthy of further research is the improvement of the MBDA approaches to perform real time analysis of systems, although this will complicate the analysis process and affect the scalability of the approaches. Future trends are likely to lead to more robust integrations between different existing MBDA approaches so that different strengths (e.g., dependability analysis and model checking capability) of the existing approaches can be utilised in a complementary manner.
