
Showing papers on "Dependability" published in 2004


Journal ArticleDOI
TL;DR: The aim is to explicate a set of general concepts, of relevance across a wide range of situations and, therefore, helping communication and cooperation among a number of scientific and technical communities, including ones that are concentrating on particular types of system, of system failures, or of causes of system failures.
Abstract: This paper gives the main definitions relating to dependability, a generic concept including as special cases such attributes as reliability, availability, safety, integrity, maintainability, etc. Security brings in concerns for confidentiality, in addition to availability and integrity. Basic definitions are given first. They are then commented upon, and supplemented by additional definitions, which address the threats to dependability and security (faults, errors, failures), their attributes, and the means for their achievement (fault prevention, fault tolerance, fault removal, fault forecasting). The aim is to explicate a set of general concepts, of relevance across a wide range of situations and, therefore, helping communication and cooperation among a number of scientific and technical communities, including ones that are concentrating on particular types of system, of system failures, or of causes of system failures.

4,695 citations


Journal ArticleDOI
TL;DR: It is found that many techniques from dependability evaluation can be applied in the security domain, but that significant challenges remain, largely due to fundamental differences between the accidental nature of the faults commonly assumed in dependability evaluation, and the intentional, human nature of cyber attacks.
Abstract: The development of techniques for quantitative, model-based evaluation of computer system dependability has a long and rich history. A wide array of model-based evaluation techniques is now available, ranging from combinatorial methods, which are useful for quick, rough-cut analyses, to state-based methods, such as Markov reward models, and detailed, discrete-event simulation. The use of quantitative techniques for security evaluation is much less common, and has typically taken the form of formal analysis of small parts of an overall design, or experimental red team-based approaches. Alone, neither of these approaches is fully satisfactory, and we argue that there is much to be gained through the development of a sound model-based methodology for quantifying the security one can expect from a particular design. In this work, we survey existing model-based techniques for evaluating system dependability, and summarize how they are now being extended to evaluate system security. We find that many techniques from dependability evaluation can be applied in the security domain, but that significant challenges remain, largely due to fundamental differences between the accidental nature of the faults commonly assumed in dependability evaluation, and the intentional, human nature of cyber attacks.

537 citations
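As a minimal, hedged illustration of the state-based methods the survey covers (not an example from the paper itself): the simplest Markov dependability model has one "up" and one "down" state, and its steady-state availability follows directly from the balance equation. The rates below are assumed.

```python
# Minimal sketch: steady-state availability of a two-state Markov model.
# failure_rate (lambda) and repair_rate (mu) are illustrative assumptions.

failure_rate = 1 / 1000.0   # one failure per 1000 hours on average
repair_rate = 1 / 4.0       # four-hour mean time to repair

# Balance equation pi_up * lambda = pi_down * mu, with pi_up + pi_down = 1,
# yields the classic formula A = mu / (lambda + mu) = MTTF / (MTTF + MTTR).
availability = repair_rate / (failure_rate + repair_rate)

print(f"MTTF = {1 / failure_rate:.0f} h, MTTR = {1 / repair_rate:.0f} h")
print(f"Steady-state availability = {availability:.6f}")   # ~0.996016
```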


Proceedings Article
06 Dec 2004
TL;DR: By allowing distinct device drivers to reside in separate virtual machines, this technique isolates faults caused by defective or malicious drivers, thus improving a system's dependability, and enables extensive reuse of existing and unmodified drivers.
Abstract: We propose a method to reuse unmodified device drivers and to improve system dependability using virtual machines. We run the unmodified device driver, with its original operating system, in a virtual machine. This approach enables extensive reuse of existing and unmodified drivers, independent of the OS or device vendor, significantly reducing the barrier to building new OS endeavors. By allowing distinct device drivers to reside in separate virtual machines, this technique isolates faults caused by defective or malicious drivers, thus improving a system's dependability. We show that our technique requires minimal support infrastructure and provides strong fault isolation. Our prototype's network performance is within 3-8% of a native Linux system. Each additional virtual machine increases the CPU utilization by about 0.12%. We have successfully reused a wide variety of unmodified Linux network, disk, and PCI device drivers.

278 citations


Book ChapterDOI
01 Jan 2004
TL;DR: This paper gives the main definitions relating to dependability, a generic concept including as special cases such attributes as reliability, availability, safety, confidentiality, integrity, maintainability, etc.
Abstract: This paper gives the main definitions relating to dependability, a generic concept including as special cases such attributes as reliability, availability, safety, confidentiality, integrity, maintainability, etc. Basic definitions are given first. They are then commented upon, and supplemented by additional definitions, which address the threats to dependability (faults, errors, failures), and the attributes of dependability. The discussion on the attributes encompasses the relationship of dependability with security, survivability and trustworthiness.

250 citations


Journal Article
TL;DR: A survey of fault injection techniques is presented, with a comparison of the different injection techniques and an overview of the available tools.
Abstract: Fault-tolerant circuits are currently required in several major application sectors. Alongside, and in complement to, other possible approaches such as formal proof or analytical modeling, whose applicability and accuracy are significantly restricted in the case of complex fault-tolerant systems, fault injection has been recognized as particularly attractive and valuable. Fault injection provides a method of assessing the dependability of a system under test. It involves inserting faults into a system and monitoring the system to determine its behavior in response to a fault. Several fault injection techniques have been proposed and practically experimented with. They can be grouped into hardware-based fault injection, software-based fault injection, simulation-based fault injection, emulation-based fault injection and hybrid fault injection. This paper presents a survey of fault injection techniques, with a comparison of the different injection techniques and an overview of the different tools.

234 citations
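To make the surveyed idea concrete, here is a hedged sketch of software-implemented fault injection: corrupt one bit of the target's input, compare the output with a fault-free "golden" run, and count how often the fault propagates to a failure. The target function and campaign parameters are invented for illustration.

```python
import random

def target(data):
    """System under test (toy example): a threshold alarm on the mean."""
    return sum(data) / len(data) > 18.0

def inject_bit_flip(data, index, bit):
    faulty = list(data)
    faulty[index] ^= 1 << bit        # corrupt a single bit of one byte
    return faulty

def campaign(data, runs=10_000, seed=0):
    rng = random.Random(seed)
    golden = target(data)            # fault-free reference outcome
    failures = 0
    for _ in range(runs):
        i, b = rng.randrange(len(data)), rng.randrange(8)
        if target(inject_bit_flip(data, i, b)) != golden:
            failures += 1            # the injected fault became a failure
    return failures / runs

# ~12.5% here: only flips of the top bit shift the mean past the threshold.
print(f"Observed failure ratio: {campaign(bytes(range(32))):.2%}")
```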


Journal ArticleDOI
TL;DR: This work investigates ways to address the problem of dependability in end-user programming by developing a software engineering paradigm viable for end-user programming, an approach called end-user software engineering.
Abstract: End-user programming has become the most common form of programming in use today [2], but there has been little investigation into the dependability of the programs end users create. This is problematic because the dependability of these programs can be very important; in some cases, errors in end-user programs, such as formula errors in spreadsheets, have cost millions of dollars. (For example, see www.theregister.co.uk/content/67/31298.html or panko.cba.hawaii.edu/ssr/Mypapers/whatknow.htm.) We have been investigating ways to address this problem by developing a software engineering paradigm viable for end-user programming, an approach we call end-user software engineering.

226 citations


Proceedings Article
31 Mar 2004
TL;DR: The solution is a tool that automates the design of disaster-tolerant solutions and appropriately selects designs that meet its objectives under specified disaster scenarios, so designing for disasters no longer needs to be a hit-or-miss affair.
Abstract: Losing information when a storage device or data center fails can bring a company to its knees--or put it out of business altogether. Such catastrophic outcomes can readily be prevented with today's storage technology, albeit with some difficulty: the design space of solutions is surprisingly large, the configuration choices are myriad, and the alternatives interact in complicated ways. Thus, solutions are often over- or under-engineered, and administrators may not understand the degree of dependability they provide. Our solution is a tool that automates the design of disaster-tolerant solutions. Driven by financial objectives and detailed models of the behaviors and costs of the most common solutions (tape backup, remote mirroring, site failover, and site reconstruction), it appropriately selects designs that meet its objectives under specified disaster scenarios. As a result, designing for disasters no longer needs to be a hit-or-miss affair.

222 citations
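The abstract does not detail the tool's internals; as a rough illustration of the objective-driven selection it describes, the sketch below picks the candidate design with the lowest expected annual cost. All designs, costs, and disaster probabilities are assumptions.

```python
designs = {
    # name: (annual outlay $, recovery time h, data-loss window h) -- assumed
    "tape backup":       (50_000, 48.0, 24.0),
    "remote mirroring":  (400_000, 1.0, 0.0),
    "mirror + failover": (650_000, 0.1, 0.0),
}

P_DISASTER = 0.02       # assumed annual probability of a site disaster
OUTAGE_COST = 100_000   # assumed cost per hour of downtime
LOSS_COST = 50_000      # assumed cost per hour of lost updates

def expected_annual_cost(outlay, rto, rpo):
    return outlay + P_DISASTER * (rto * OUTAGE_COST + rpo * LOSS_COST)

for name, params in designs.items():
    print(f"{name:17s} expected cost ${expected_annual_cost(*params):,.0f}")
best = min(designs, key=lambda d: expected_annual_cost(*designs[d]))
print(f"-> selected design: {best}")
```

Under these particular numbers the cheap design wins; raise the outage cost or disaster probability and the selection flips, which is exactly the over/under-engineering trade-off the paper automates.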


Proceedings ArticleDOI
28 Jun 2004
TL;DR: This paper presents techniques that continuously detect faults and repair the overlay to achieve high dependability and good performance in realistic environments and shows that previous concerns about the performance and dependability are unfounded.
Abstract: Structured peer-to-peer (P2P) overlay networks provide a useful substrate for building distributed applications. They map object keys to overlay nodes and offer a primitive to send a message to the node responsible for a key. They can implement, for example, distributed hash tables and multicast trees. However, there are concerns about the performance and dependability of these overlays in realistic environments. Several studies have shown that current P2P environments have high churn rates: nodes join and leave the overlay continuously. This paper presents techniques that continuously detect faults and repair the overlay to achieve high dependability and good performance in realistic environments. The techniques are evaluated using large-scale network simulation experiments with fault injection guided by real traces of node arrivals and departures. The results show that previous concerns are unfounded; our techniques can achieve dependable routing in realistic environments with an average delay stretch below two and a maintenance overhead of less than half a message per second per node.

200 citations
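As a hedged sketch of the general probe-and-repair idea (not the authors' implementation), the following maintains a routing table by probing entries periodically and replacing nodes that miss several consecutive probes; the class names and parameters are assumptions.

```python
MAX_MISSES = 3   # evict an entry after this many consecutive missed probes

class RoutingEntry:
    def __init__(self, node_id):
        self.node_id = node_id
        self.misses = 0

class OverlayMaintainer:
    def __init__(self, entries, probe, find_replacement):
        self.entries = entries                  # list of RoutingEntry
        self.probe = probe                      # probe(node_id) -> acked?
        self.find_replacement = find_replacement

    def maintenance_round(self):
        for entry in list(self.entries):
            if self.probe(entry.node_id):
                entry.misses = 0
            else:
                entry.misses += 1
                if entry.misses >= MAX_MISSES:  # declare the node faulty
                    self.entries.remove(entry)
                    substitute = self.find_replacement(entry.node_id)
                    if substitute is not None:  # repair the routing table
                        self.entries.append(RoutingEntry(substitute))

# Toy usage: node "B" never answers and is eventually replaced by "D".
m = OverlayMaintainer([RoutingEntry("A"), RoutingEntry("B")],
                      probe=lambda n: n != "B",
                      find_replacement=lambda n: "D")
for _ in range(MAX_MISSES):
    m.maintenance_round()
print([e.node_id for e in m.entries])           # ['A', 'D']
```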


Journal ArticleDOI
08 Oct 2004
TL;DR: The paper focuses on the use of the AOM approach to produce logical, aspect-oriented architecture models (AAMs) that describe how concerns are addressed in technology-independent terms, allowing developers to more easily evolve and replace the parts as they explore alternative ways of balancing concerns in the early stages of development.
Abstract: Developers of modern software systems are often required to build software that addresses security, fault-tolerance and other dependability concerns. A decision to address a dependability concern in a particular manner can make it difficult or impossible to address other concerns in software. Proper attention to balancing key dependability and other concerns in the early phases of development can help developers better manage product risks through early identification and resolution of conflicts and undesirable emergent behaviours that arise as a result of interactions across behaviours that address different concerns. The authors describe an aspect-oriented modelling (AOM) approach that eases the task of exploring alternative ways of addressing concerns during software modelling. The paper focuses on use of the AOM approach to produce logical, aspect-oriented architecture models (AAMs) that describe how concerns are addressed in technology-independent terms. An AAM consists of a set of aspect models and a base architecture model called the primary model. An aspect model describes how a dependability concern is addressed, and a primary model describes how other concerns are addressed. Composition of the aspect and primary models in an AAM produces an integrated view of the logical architecture described by the AAM. Composition can reveal conflicts and undesirable emergent properties. Resolving these problems can involve developing and analysing alternative ways of addressing concerns. Localising the parts of an architecture that address pervasive and nonorthogonal dependability concerns in aspect models allows developers to more easily evolve and replace the parts as they explore alternative ways of balancing concerns in the early stages of development.

170 citations


Proceedings ArticleDOI
18 Oct 2004
TL;DR: This paper extends the normal asynchronous system with a special distributed oracle called TTCB to implement an intrusion-tolerant service based on the state machine approach with only 2f + 1 replicas; this is the first time the number of replicas is reduced from 3f + 1 to 2f + 1.
Abstract: The application of dependability concepts and techniques to the design of secure distributed systems is raising a considerable amount of interest in both communities under the designation of intrusion tolerance. However, practical intrusion-tolerant replicated systems based on the state machine approach (SMA) can handle at most f Byzantine components out of a total of n = 3f + 1, which is the maximum resilience in asynchronous systems. This paper extends the normal asynchronous system with a special distributed oracle called TTCB. Using this extended system we manage to implement an intrusion-tolerant service based on the SMA with only 2f + 1 replicas. Although a few other papers in the literature present intrusion-tolerant services with this approach, this is the first time the number of replicas is reduced from 3f + 1 to 2f + 1. Another interesting characteristic of the described service is a low time complexity.

156 citations
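The replica-count arithmetic can be checked directly. The sketch below only illustrates the bounds named in the abstract, not the TTCB protocol itself.

```python
def replicas_plain_bft(f):
    return 3 * f + 1   # classic bound for asynchronous Byzantine SMR

def replicas_with_oracle(f):
    return 2 * f + 1   # bound achieved here thanks to the TTCB oracle

for f in range(1, 5):
    # With n = 2f + 1, any f + 1 matching replies must include at least one
    # correct replica, which suffices once request order is fixed externally.
    print(f"f={f}: 3f+1 = {replicas_plain_bft(f)} replicas "
          f"vs 2f+1 = {replicas_with_oracle(f)} (quorum {f + 1})")
```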


Journal ArticleDOI
TL;DR: A new concept of aging tokens (tokens with memory) is introduced, and the resulting framework provides for flexible and transparent graphical modeling with excellent representational power that is particularly suited for system reliability modeling with non-exponentially distributed firing times.

Journal ArticleDOI
TL;DR: Building systems to recover fast may be more productive than aiming for systems that never fail, and the authors advocate multiple lines of defense in managing failures.
Abstract: Building systems to recover fast may be more productive than aiming for systems that never fail. Because recovery is not immune to failure either, the authors advocate multiple lines of defense in managing failures.

DOI
31 Aug 2004
TL;DR: SaveCCM is a simple model in which flexibility is limited to facilitate analysis of real-time properties and dependability, intended for embedded control applications in vehicular systems.
Abstract: Component-based development has proven effective in many engineering domains, and several general component technologies are available. Most of these are focused on providing an efficient software-engineering process. However, for the majority of embedded systems, run-time efficiency and prediction of system behaviour are as important as process efficiency. This calls for specialized technologies. There is even a need for further specialized technologies adapted to different types of embedded systems, due to the heterogeneity of the domain and the close relation between the software and the often very application-specific system. This work presents the SaveCCM component model, intended for embedded control applications in vehicular systems. SaveCCM is a simple model in which flexibility is limited to facilitate analysis of real-time properties and dependability. We present and motivate the model, and provide examples of its use.

Book ChapterDOI
01 Jan 2004
TL;DR: System safety and availability principles of the Airbus airplanes are presented, with an emphasis on their evolution and on future challenges.
Abstract: This paper deals with the digital electrical flight control system of the Airbus airplanes. This system is built to very stringent dependability requirements both in terms of safety (the system must not output erroneous signals) and availability. System safety and availability principles are presented, with an emphasis on their evolution and on future challenges.

Journal ArticleDOI
TL;DR: It is argued that modern complexity poses a major challenge to the ability to achieve successful systems and that this complexity must be understood, predicted and measured if the authors are to engineer systems confidently.
Abstract: This paper considers the creation of Complex Engineered Systems (CESs) and the Systems Engineering approach by which they are designed. The changing nature of the challenges facing Systems Engineering is discussed, with particular focus on the increasing complexity of modern systems. It is argued that modern complexity poses a major challenge to our ability to achieve successful systems and that this complexity must be understood, predicted and measured if we are to engineer systems confidently. We acknowledge previous work which concluded that, in complex systems, failures (“accidents”) may be inevitable and unavoidable. To further explore potential tools for increasing our confidence in complex systems, we review research in the field of Complexity Theory to seek potentially useful approaches and measures and find ourselves particularly interested in the potential usefulness of relationships between the magnitudes of events and their frequency of occurrence. Complexity Theory is found to have characterized naturally occurring systems and to potentially be the source of profitable application to the systems engineering challenge, viz., the creation of complex engineered systems. We are left with the tentative conclusion that truly complex systems, with our present understanding of complex behavior, cannot be designed with a degree of confidence that is acceptable given our current expectations. We recommend that the discipline of systems engineering must investigate this issue as a matter of priority and urgency and seek to develop approaches to respond to the challenge. © 2003 Wiley Periodicals, Inc. Syst Eng 7: 25–34, 2004

Journal ArticleDOI
01 Mar 2004
TL;DR: A systematic investigation of fine-grain component-level restarts (microreboots) as high-availability medicine, and a set of guidelines for building systems amenable to recursive reboots, known as "crash-only software systems."
Abstract: Even after decades of software engineering research, complex computer systems still fail. This paper makes the case for increasing research emphasis on dependability and, specifically, on improving availability by reducing time-to-recover. All software fails at some point, so systems must be able to recover from failures. Recovery itself can fail too, so systems must know how to intelligently retry their recovery. We present here a recursive approach, in which a minimal subset of components is recovered first; if that does not work, progressively larger subsets are recovered. Our domain of interest is Internet services; these systems experience primarily transient or intermittent failures that can typically be resolved by rebooting. Conceding that failure-free software will continue eluding us for years to come, we undertake a systematic investigation of fine-grain component-level restarts, microreboots, as high-availability medicine. Building and maintaining an accurate model of large Internet systems is nearly impossible, due to their scale and constantly evolving nature, so we take an application-generic approach that relies on empirical observations to manage recovery. We apply recursive microreboots to Mercury, a commercial off-the-shelf (COTS)-based satellite ground station that is based on an Internet service platform. Mercury has been in successful operation for over 3 years. From our experience with Mercury, we draw design guidelines and lessons for the application of recursive microreboots to other software systems. We also present a set of guidelines for building systems amenable to recursive reboots, known as "crash-only software systems."
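A minimal sketch of the recursive recovery policy follows, under the assumption that components form a tree with per-component restart and health probes; the helper names are hypothetical.

```python
def recursive_microreboot(node, restart, healthy):
    """Recover `node`; return True once its subtree is healthy again.

    node:    dict with keys 'name' and 'children' (a list of such dicts)
    restart: function that reboots one component subtree by name
    healthy: function that probes whether a component works again
    """
    # Cheapest recovery first: microreboot only the faulty children.
    for child in node.get("children", []):
        if not healthy(child["name"]):
            recursive_microreboot(child, restart, healthy)
    if healthy(node["name"]):
        return True
    # Fine-grain recovery did not help: escalate to this whole subtree.
    restart(node["name"])
    return healthy(node["name"])
```

The restart granularity thus grows only when the cheaper fix fails, mirroring the "progressively larger subsets" strategy in the abstract.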

Proceedings Article
20 Jul 2004
TL;DR: The study focuses in particular on the fault detection stage, which precedes any diagnosis stage, based on the application of the Page-Hinckley test to a pasteurization system in an agro-food production plant.
Abstract: The increasing automation of manufacturing processes has made evident the dependability needs of industrial installations. To ensure the dependability of an industrial process, establishing a monitoring system is essential; its role is to recognize and signal, in real time, behavioral anomalies from the information available on the system. Indeed, the monitoring function of a system is to detect, locate and diagnose the faults that can affect its performance and dependability. The objective of this communication is the study and design of a detection module based on statistical analysis and modelling techniques: the task is to establish the operations that, starting from data coming from the industrial system, make it possible to detect abnormal situations in order to prevent or reduce the risk of malfunction. The study thus consists in developing a malfunction-detection module for the diagnosis tool. We focus in particular on the fault detection stage, which precedes any diagnosis stage, based on the application of the Page-Hinckley test to a pasteurization system in an agro-food production plant.
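The Page-Hinckley test mentioned above is a standard cumulative-sum change detector; a compact sketch with illustrative tuning parameters follows.

```python
def page_hinkley(samples, delta=0.1, lam=5.0):
    """Return the sample index where a mean increase is detected, or None.

    delta: tolerated drift around the running mean (assumed tuning value)
    lam:   alarm threshold on the cumulative deviation (assumed tuning value)
    """
    mean = 0.0
    cum = 0.0       # cumulative deviation m_t
    cum_min = 0.0   # running minimum M_t
    for t, x in enumerate(samples, start=1):
        mean += (x - mean) / t       # incremental running mean
        cum += x - mean - delta
        cum_min = min(cum_min, cum)
        if cum - cum_min > lam:      # m_t - M_t exceeds the threshold
            return t
    return None

# Toy usage: a reading stuck near 0.0 jumps to 2.0 at sample 51.
readings = [0.0] * 50 + [2.0] * 20
print(page_hinkley(readings))        # 53: alarm soon after the jump
```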

Book ChapterDOI
TL;DR: The paper proposes an adaptation model which is built upon a classification of component mismatches, identifies a number of patterns to be used for eliminating them, and outlines an engineering approach to component adaptation that relies on the use of patterns and provides additional support for the development of trustworthy component-based systems.
Abstract: Component adaptation needs to be taken into account when developing trustworthy systems, where the properties of component assemblies have to be reliably obtained from the properties of their constituent components. Thus, a more systematic approach to component adaptation is required when building trustworthy systems. In this paper, we illustrate how (design and architectural) patterns can be used to achieve component adaptation and thus serve as the basis for such an approach. The paper proposes an adaptation model which is built upon a classification of component mismatches, and identifies a number of patterns to be used for eliminating them. We conclude by outlining an engineering approach to component adaptation that relies on the use of patterns and provides additional support for the development of trustworthy component-based systems.

Book ChapterDOI
TL;DR: While some of the tools needed to assure a swarm for dependability exist, many do not, and hence much work needs to be done before dependable swarms become a reality.
Abstract: This review paper sets out to explore the question of how future complex engineered systems based upon the swarm intelligence paradigm could be assured for dependability. The paper introduces the new concept of ‘swarm engineering’: a fusion of dependable systems engineering and swarm intelligence. The paper reviews the disciplines and processes conventionally employed to assure the dependability of conventional complex (and safety critical) systems in the light of swarm intelligence research and in so doing tries to map processes of analysis, design and test for safety-critical systems against relevant research in swarm intelligence. A case study of a swarm robotic system is used to illustrate this mapping. The paper concludes that while some of the tools needed to assure a swarm for dependability exist, many do not, and hence much work needs to be done before dependable swarms become a reality.

Book
29 Sep 2004
TL;DR: The objective of this book is to provide a discussion of dependability problems of the CAN protocol and its application in the TTA.
Abstract: Contents:
Foreword
Preface
1: INTRODUCTION
  1.1 Goal of this book
  1.2 Overview
2: BASIC CONCEPTS AND RELATED WORK
  2.1 Distributed Real-Time Systems
  2.2 Concepts of Dependability
  2.3 Degrees of Synchrony
  2.4 Communication System Paradigms
  2.5 Computational Models
  2.6 System Architectures
3: REQUIREMENTS OF AN INTEGRATED ARCHITECTURE
  3.1 Integration Directions
  3.2 Required Properties
  3.3 Generic Architectural Services
4: INTEGRATION OF EVENT-TRIGGERED AND TIME-TRIGGERED CONTROL PARADIGMS
  4.1 Synchrony Model
  4.2 Architecture
  4.3 Component Level
  4.4 Dependability Mechanisms
  4.5 Membership Information
5: CAN EMULATION IN THE TTA
  5.1 The Time-Triggered Architecture
  5.2 Controller Area Network
  5.3 Requirements and Objectives
  5.4 CAN Communication Services in the TTA
  5.5 Implementation
6: RESULTS AND VALIDATION
  6.1 Validation Objectives
  6.2 Transmission Latencies
  6.3 Input Message Sets
  6.4 Simulations and Measurements
  6.5 Authentic CAN Communication Service
  6.6 Extended CAN Communication Service
  6.7 Solving of Dependability Problems of the CAN Protocol
7: CONCLUSION
References
Index

Journal ArticleDOI
TL;DR: DEEM is able to deal with all the scenarios of MPS which have been analytically treated in the literature, at a cost which is comparable with that of the cheapest ones, completely solving the issues posed by the phased behavior of MPS.
Abstract: Multiple-Phased Systems (MPS), i.e., systems whose operational life can be partitioned into a set of disjoint periods, called "phases", include several classes of systems such as Phased Mission Systems and Scheduled Maintenance Systems. Because of their deployment in critical applications, the dependability modeling and analysis of Multiple-Phased Systems is a task of primary relevance. The phased behavior makes the analysis of Multiple-Phased Systems extremely complex. This paper describes the modeling methodology and the solution procedure implemented in DEEM, a dependability modeling and evaluation tool specifically tailored for Multiple-Phased Systems. It also describes its use for the solution of representative MPS problems. DEEM relies upon Deterministic and Stochastic Petri Nets as the modeling formalism, and on Markov Regenerative Processes for the model solution. When compared to existing general-purpose tools based on similar formalisms, DEEM offers advantages on both the modeling side (sub-models neatly model the phase-dependent behaviors of MPS), and on the evaluation side (a specialized algorithm allows a considerable reduction of the solution cost and time). Thus, DEEM is able to deal with all the scenarios of MPS which have been analytically treated in the literature, at a cost which is comparable with that of the cheapest ones, completely solving the issues posed by the phased behavior of MPS.

Proceedings ArticleDOI
26 Jun 2004
TL;DR: A theoretical model of the proposed communication architecture, in which a master-slave protocol typical of traditional field buses runs on top of IEEE802.11, is developed, allowing the evaluation of some performance metrics.
Abstract: The recent performance improvements of wireless communication systems are making possible the use of such networks for industrial applications, which typically impose severe requirements in terms of both real-time communication and dependability. Several independent studies have highlighted that the IEEE802.11 wireless LAN is one of the most suitable products for such applications. However, since the standard is only concerned with the lower layers of the communication stack, it is necessary to integrate it with appropriate protocols, typical of industrial communications. In this direction, the protocols used by the traditional field buses could represent an interesting opportunity. In this paper we consider one of these protocols, based on a master-slave architecture, and analyze the possibility of implementing it on top of IEEE802.11. After a description of how the master-slave functions could be mapped onto the IEEE802.11 services, we develop a theoretical model of the proposed communication architecture which allows for the evaluation of some performance metrics.
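The paper's model is not reproduced in the abstract; as a back-of-the-envelope illustration of the kind of metric such a model yields, the sketch below estimates the mean polling cycle of a master-slave exchange over IEEE802.11. All timing figures are assumptions.

```python
N_SLAVES = 16
POLL_FRAME_US = 300.0   # poll transmission time incl. MAC overhead (assumed)
RESP_FRAME_US = 400.0   # response transmission time incl. overhead (assumed)
IFS_US = 50.0           # inter-frame spacing per exchange (assumed)
RETRY_PROB = 0.05       # chance an exchange must be repeated (assumed)

# One exchange = poll + response + spacing; with geometric retries the
# expected number of attempts is 1 / (1 - RETRY_PROB).
exchange_us = (POLL_FRAME_US + RESP_FRAME_US + IFS_US) / (1 - RETRY_PROB)
cycle_ms = N_SLAVES * exchange_us / 1000.0
print(f"Mean polling cycle ~ {cycle_ms:.2f} ms for {N_SLAVES} slaves")
```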

Proceedings ArticleDOI
28 Jun 2004
TL;DR: This paper proposes an extension, repairable fault trees, which allows the designer to evaluate the effects of different repair policies on a repairable system and it is supported by a solution technique which transparently exploits generalized stochastic Petri nets for modelling the repairing process.
Abstract: Fault trees are a well known mean for the evaluation of dependability of complex systems. Many extensions have been proposed to the original formalism in order to enhance the advantages of fault tree analysis for the design and assessment of systems. In this paper we propose an extension, repairable fault trees, which allows the designer to evaluate the effects of different repair policies on a repairable system: this extended formalism has been integrated in a multi-formalism multi-solution framework, and it is supported by a solution technique which transparently exploits generalized stochastic Petri nets (GSPN)for modelling the repairing process. The modelling technique and the solution process are illustrated through an example.
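For contrast with the repairable extension, here is a minimal sketch of plain (non-repairable) fault-tree evaluation with independent basic events; the example tree and probabilities are invented.

```python
def gate_and(probs):
    """Output fails only if every input fails (independent events)."""
    p = 1.0
    for q in probs:
        p *= q
    return p

def gate_or(probs):
    """Output fails if any input fails (independent events)."""
    p_all_ok = 1.0
    for q in probs:
        p_all_ok *= 1.0 - q
    return 1.0 - p_all_ok

# Example: the system fails if the controller fails OR both redundant
# power supplies fail.
P_CONTROLLER = 1e-4
P_SUPPLY = 1e-2
p_top = gate_or([P_CONTROLLER, gate_and([P_SUPPLY, P_SUPPLY])])
print(f"Top-event probability: {p_top:.4e}")   # ~2.0e-04
```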

Proceedings ArticleDOI
28 Jun 2004
TL;DR: This work presents a framework for evaluating the dependability of data storage systems, including both individual data protection techniques and their compositions, and estimates storage system recovery time, data loss, normal mode system utilization and operational costs under a variety of failure scenarios.
Abstract: Designing storage systems to provide business continuity in the face of failures requires the use of various data protection techniques, such as backup, remote mirroring, point-in-time copies and vaulting, often in concert. Predicting the dependability provided by such compositions of techniques is difficult, yet necessary for dependable system design. We present a framework for evaluating the dependability of data storage systems, including both individual data protection techniques and their compositions. Our models estimate storage system recovery time, data loss, normal mode system utilization and operational costs under a variety of failure scenarios. We demonstrate the effectiveness of these modeling techniques through a case study using real-world storage system designs and workloads.
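As a hedged illustration of the estimates such a framework produces, the sketch below computes recovery time and worst-case data loss for a daily-backup design; capacities and bandwidths are assumed.

```python
BACKUP_INTERVAL_H = 24.0   # one full backup per day (assumed design)
DATA_GB = 2000.0           # protected capacity (assumed)
RESTORE_GB_PER_H = 400.0   # restore bandwidth from backup media (assumed)
UPDATE_GB_PER_H = 5.0      # rate at which new data is written (assumed)

def after_site_failure(recency_h=BACKUP_INTERVAL_H):
    recovery_time_h = DATA_GB / RESTORE_GB_PER_H   # reload everything
    data_loss_gb = recency_h * UPDATE_GB_PER_H     # updates since last backup
    return recovery_time_h, data_loss_gb

rt, loss = after_site_failure()
print(f"Recovery time ~ {rt:.1f} h, data loss up to {loss:.0f} GB")
```

Composing techniques (e.g. adding a remote mirror) shrinks one or both figures at extra cost, which is precisely the trade-off space the framework models.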

Proceedings Article
01 Jan 2004
TL;DR: In this article, the authors analyze the relation between the quality attributes of components and those of their compositions, and the types of relations are classified according to the possibility of predicting properties of compositions from the properties of the components and according to their influences of other factors such as software architecture or system environment.
Abstract: One of the main objectives of developing component-based software systems is to enable efficient building of systems through the integration of components. All component models define some form of component interface standard that facilitates the programmatic integration of components, but they do not facilitate or provide theories for the prediction of the quality attributes of the component compositions. This significantly decreases the value of the component-based approach to building dependable systems. If it is not possible to predict the value of a particular attribute of a system prior to integration and deployment to the target environment, the system must be subjected to other procedures, often costly, to determine this value empirically. For this reason, one of the challenges of the component-based approach is to obtain means for the “composition” of quality attributes. This is a very difficult challenge because the diverse types of quality attributes do not have the same underlying conceptual characteristics, and many factors, in addition to component properties, influence the system properties. This paper analyses the relation between the quality attributes of components and those of their compositions. The types of relations are classified according to the possibility of predicting properties of compositions from the properties of the components and according to the influences of other factors such as software architecture or system environment. The classification is exemplified with particular cases of compositions of quality attributes, and its relation to dependability is discussed. Such a classification can indicate the efforts that would be required to predict the system attributes which are essential for system dependability and, in this way, the feasibility of the component-based approach in developing dependable systems.

Journal ArticleDOI
TL;DR: An approach for analyzing the propagation and effect of data errors in modular software enabling the profiling of the vulnerabilities of software to find the modules and signals most likely exposed to propagating errors.
Abstract: We present an approach for analyzing the propagation and effect of data errors in modular software enabling the profiling of the vulnerabilities of software to find 1) the modules and signals most likely exposed to propagating errors and 2) the modules and signals which, when subjected to error, tend to cause more damage than others from a systems operation point-of-view. We discuss how to use the obtained profiles to identify where dependability structures and mechanisms will likely be the most effective, i.e., how to perform a cost-benefit analysis for dependability. A fault-injection-based method for estimation of the various measures is described and the software of a real embedded control system is profiled to show the type of results obtainable by the analysis framework.
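The profile the paper derives can be pictured as a conditional-probability table estimated from injection counts; the sketch below uses fabricated counts purely for illustration.

```python
# injections[module] = number of fault-injection experiments on that module
injections = {"sensor": 200, "filter": 200, "actuator": 200}
# corrupted[(module, signal)] = runs in which the signal ended up erroneous
corrupted = {
    ("sensor", "speed_signal"): 150,
    ("filter", "speed_signal"): 40,
    ("actuator", "speed_signal"): 5,
}

def propagation_profile():
    """Estimate P(signal erroneous | fault injected into module)."""
    return {pair: hits / injections[pair[0]] for pair, hits in corrupted.items()}

for (module, signal), p in sorted(propagation_profile().items(),
                                  key=lambda kv: -kv[1]):
    print(f"{module:8s} -> {signal}: exposure {p:.2f}")
# High-exposure pairs show where error-detection wrappers pay off most.
```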

01 Jan 2004
TL;DR: A next-generation architecture that addresses problems of dependability, maintainability, and manageability of I/O devices and their software drivers on the PC platform is presented, based on the Xen virtual machine monitor.
Abstract: We present a next-generation architecture that addresses problems of dependability, maintainability, and manageability of I/O devices and their software drivers on the PC platform. Our architecture resolves both hardware and software issues, exploiting emerging hardware features to improve device safety. Our high-performance implementation, based on the Xen virtual machine monitor, provides an immediate transition opportunity for today’s systems.

Journal ArticleDOI
TL;DR: An Information Dependability Attribute Value Estimation model (iDAVE) is developed for reasoning about software dependability's ROI; it helps stakeholders determine their desired levels for each dependability attribute and estimate the cost, value, and ROI of achieving those levels.
Abstract: In most organizations, proposed investments in software dependability compete for limited resources with proposed investments in software and system functionality, response time, adaptability, speed of development, ease of use, and other system capabilities. The lack of good return-on-investment models for software dependability makes determining the overall business case for dependability investments difficult. So, with a weak business case, investments in software dependability and the resulting system dependability are frequently inadequate. Dependability models will need to support stakeholders in determining their desired levels for each dependability attribute and estimating the cost, value, and ROI of achieving those levels. At the University of Southern California, researchers have developed software cost- and quality-estimation models and value-based software engineering processes, methods, and tools. We used these models and the value-based approach to develop an Information Dependability Attribute Value Estimation model (iDAVE) for reasoning about software dependability's ROI.
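A toy calculation in the spirit of iDAVE (not the model itself; all dollar figures and availability levels are invented) shows how ROI can be reasoned about by weighing dependability investments against avoided downtime cost.

```python
REVENUE_PER_H = 20_000     # assumed business value of one hour of uptime
HOURS_PER_YEAR = 8766

investments = {
    # availability level: added annual investment ($, assumed)
    0.99: 0,
    0.999: 150_000,
    0.9999: 600_000,
}

baseline_loss = (1 - 0.99) * HOURS_PER_YEAR * REVENUE_PER_H
for avail, cost in investments.items():
    loss = (1 - avail) * HOURS_PER_YEAR * REVENUE_PER_H
    value = baseline_loss - loss               # downtime cost avoided
    roi = (value - cost) / cost if cost else float("nan")
    print(f"A={avail}: value ${value:,.0f}, cost ${cost:,}, ROI {roi:.2f}")
```

The diminishing ROI from 0.999 to 0.9999 is the kind of result that lets stakeholders pick a dependability level deliberately instead of defaulting to "as high as possible".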

Journal ArticleDOI
TL;DR: An overview of probabilistic model checking and of the tool PRISM, which supports these techniques, is provided, and it is demonstrated that a wide range of useful dependability properties can be analysed in this way.
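To give a flavour of the dependability queries such tools answer (e.g. the time-bounded reachability property P=? [ F<=t "failed" ]), the sketch below performs transient analysis of a tiny CTMC by uniformization in plain Python. It is an illustration with assumed rates, not PRISM itself.

```python
import math

LAM, MU = 0.001, 0.25   # per-hour failure and repair rates (assumed)
# Duplex system: state 0 = both units up, 1 = one up, 2 = failed (absorbing).
Q = [[-2 * LAM, 2 * LAM, 0.0],
     [MU, -(MU + LAM), LAM],
     [0.0, 0.0, 0.0]]

def p_reach_failed(t, eps=1e-12):
    n = len(Q)
    q = max(-Q[i][i] for i in range(n))           # uniformization rate
    P = [[Q[i][j] / q + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]                       # uniformized DTMC
    v = [1.0, 0.0, 0.0]                           # start with both units up
    acc, w, total, k = 0.0, math.exp(-q * t), 0.0, 0
    while total < 1.0 - eps:
        acc += w * v[2]        # Poisson-weighted probability of "failed"
        total += w
        k += 1
        w *= q * t / k
        v = [sum(v[a] * P[a][b] for a in range(n)) for b in range(n)]
    return acc

# Close to 2*LAM**2/MU * t ~= 8.0e-3 for LAM << MU.
print(f'P=? [ F<=1000h "failed" ] ~ {p_reach_failed(1000.0):.3e}')
```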

Book ChapterDOI
01 Jan 2004
TL;DR: In this article, the authors used the AltaRica formal language and associated tools to perform safety assessments of an electrical and hydraulic system, and learned lessons learnt during the study of a system.
Abstract: AIRBUS and ONERA used the AltaRica formal language and associated tools to perform safety assessments. Lessons learnt during the study of an electrical and hydraulic system are presented.