
Showing papers on "Dependability published in 2005"


Journal ArticleDOI
TL;DR: This research shows that a BN-based reliability formalism is a powerful potential solution for modeling and analyzing various kinds of system component behaviors and interactions, and provides a basis for more advanced and useful analyses such as system diagnosis.
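As a rough illustration of the idea only (not the model from the paper), the sketch below encodes a two-component parallel system as a tiny Bayesian network and computes the system failure probability by enumerating component states; the failure probabilities are assumed purely for illustration.

```python
# Minimal sketch (not the paper's model): a two-component parallel system
# encoded as a tiny Bayesian network and solved by exhaustive enumeration.
# Failure probabilities are illustrative assumptions.
from itertools import product

p_fail = {"C1": 0.01, "C2": 0.05}            # prior failure probabilities

def p_system_fails(c1_failed, c2_failed):
    # Deterministic CPT: the system fails only if both components fail.
    return 1.0 if (c1_failed and c2_failed) else 0.0

total = 0.0
for c1, c2 in product([True, False], repeat=2):
    prior = (p_fail["C1"] if c1 else 1 - p_fail["C1"]) * \
            (p_fail["C2"] if c2 else 1 - p_fail["C2"])
    total += prior * p_system_fails(c1, c2)

print(f"P(system failure) = {total:.6f}")    # 0.01 * 0.05 = 0.0005
```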

381 citations


Journal ArticleDOI
TL;DR: Results show that user session data can be used to produce test suites more effective overall than those produced by the white-box techniques considered; however, the faults detected by the two classes of techniques differ, suggesting that the techniques are complementary.
Abstract: Web applications are vital components of the global information infrastructure, and it is important to ensure their dependability. Many techniques and tools for validating Web applications have been created, but few of these have addressed the need to test Web application functionality and none have attempted to leverage data gathered in the operation of Web applications to assist with testing. In this paper, we present several techniques for using user session data gathered as users operate Web applications to help test those applications from a functional standpoint. We report results of an experiment comparing these new techniques to existing white-box techniques for creating test cases for Web applications, assessing both the adequacy of the generated test cases and their ability to detect faults on a point-of-sale Web application. Our results show that user session data can be used to produce test suites more effective overall than those produced by the white-box techniques considered; however, the faults detected by the two classes of techniques differ, suggesting that the techniques are complementary.
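As a loose illustration of the approach described above (not the authors' tooling), the sketch below shows how logged user sessions might be grouped and turned into replayable functional test cases; the log format with session_id, method, url and data columns is a hypothetical simplification of what an application server might record.

```python
# Illustrative sketch only: turning logged user sessions into replayable test
# cases. The log format (session id, method, URL, form data) is hypothetical.
import csv
from collections import defaultdict

def load_sessions(log_path):
    """Group logged requests into per-session request sequences."""
    sessions = defaultdict(list)
    with open(log_path, newline="") as f:
        for row in csv.DictReader(f):        # columns: session_id,method,url,data
            sessions[row["session_id"]].append(
                (row["method"], row["url"], row["data"]))
    return sessions

def sessions_to_test_suite(sessions):
    """Each session becomes one functional test case: replay its requests in order."""
    return [{"name": f"replay_session_{sid}", "steps": steps}
            for sid, steps in sessions.items()]
```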

250 citations


Proceedings ArticleDOI
12 Jul 2005
TL;DR: This paper describes a tool architecture called PUMA, which provides a unified interface between different kinds of design information and different kinds of performance models, for example Markov models, stochastic Petri nets and process algebras, queues and layered queues.
Abstract: Evaluation of non-functional properties of a design (such as performance, dependability, security, etc.) can be enabled by design annotations specific to the property to be evaluated. Performance properties, for instance, can be annotated on UML designs by using the "UML Profile for Schedulability, Performance and Time (SPT)". However, the communication between the design description in UML and the tools used for non-functional properties evaluation requires support, particularly for performance where there are many alternative performance analysis tools that might be applied. This paper describes a tool architecture called PUMA, which provides a unified interface between different kinds of design information and different kinds of performance models, for example Markov models, stochastic Petri nets and process algebras, queues and layered queues. The paper concentrates on the creation of performance models. The unified interface of PUMA is centered on an intermediate model called Core Scenario Model (CSM), which is extracted from the annotated design model. Experience shows that CSM is also necessary for cleaning and auditing the design information, and providing default interpretations in case it is incomplete, before creating a performance model.

221 citations


Journal ArticleDOI
TL;DR: The use of stricter assessment criteria or more structured and prescribed content would improve interrater reliability, but would obliterate the essence of portfolio assessment in terms of flexibility, personal orientation and authenticity.
Abstract: Aim: Because it deals with qualitative information, portfolio assessment inevitably involves some degree of subjectivity. The use of stricter assessment criteria or more structured and prescribed content would improve interrater reliability, but would obliterate the essence of portfolio assessment in terms of flexibility, personal orientation and authenticity. We resolved this dilemma by using qualitative research criteria as opposed to reliability in the evaluation of portfolio assessment. Methodology/research design: Five qualitative research strategies were used to achieve credibility and dependability of assessment: triangulation, prolonged engagement, member checking, audit trail and dependability audit. Mentors read portfolios at least twice during the year, providing feedback and guidance (prolonged engagement). Their recommendation for the end-of-year grade was discussed with the student (member checking) and submitted to a member of the portfolio committee. Information from different sources was combined (triangulation). Portfolios causing persistent disagreement were submitted to the full portfolio assessment committee. Quality assurance procedures with external auditors were used (dependability audit) and the assessment process was thoroughly documented (audit trail). Results: A total of 233 portfolios were assessed. Students and mentors disagreed on 7 (3%) portfolios and 9 portfolios were submitted to the full committee. The final decision on 29 (12%) portfolios differed from the mentor's recommendation. Conclusion: We think we have devised an assessment procedure that safeguards the characteristics of portfolio assessment, with credibility and dependability of assessment built into the judgement procedure. Further support for credibility and dependability might be sought by means of a study involving different assessment committees.

201 citations


Proceedings ArticleDOI
02 Feb 2005
TL;DR: A combination of IR and instance-based learning along with the consistency-based subset evaluation technique provides a relatively better consistency in accuracy prediction compared to other models, and "size" and "complexity" metrics are not sufficient for accurately predicting real-time software defects.
Abstract: The wide variety of real-time software systems, including telecontrol/telepresence systems, robotic systems, and mission planning systems, can entail dynamic code synthesis based on runtime mission-specific requirements and operating conditions. This creates the need for dynamic dependability assessment to ensure that these systems perform as specified and do not fail in catastrophic ways. One approach to achieving this is to dynamically assess the modules in the synthesized code using software defect prediction techniques. Statistical models, such as stepwise multi-linear regression models and multivariate models, and machine learning approaches, such as artificial neural networks, instance-based reasoning, Bayesian-belief networks, decision trees, and rule inductions, have been investigated for predicting software quality. However, there is still no consensus about the best predictor model for software defects. In this paper, we evaluate different predictor models on four different real-time software defect data sets. The results show that a combination of IR and instance-based learning along with the consistency-based subset evaluation technique provides relatively better consistency in prediction accuracy compared to other models. The results also show that "size" and "complexity" metrics are not sufficient for accurately predicting real-time software defects.
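The sketch below gives a rough flavor of instance-based learning combined with feature-subset selection. It is not the paper's setup: the paper uses a consistency-based subset evaluation technique, whereas this substitute uses a simple greedy wrapper search scored with scikit-learn's kNN classifier, and the synthetic data stands in for real module metrics and defect labels.

```python
# Rough sketch of the general idea (instance-based learning plus feature-subset
# selection); the greedy wrapper search below is a simplified substitute for
# the paper's consistency-based subset evaluation.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def greedy_feature_subset(X, y, k=5):
    """Greedy forward selection of metrics, scored by cross-validated kNN accuracy."""
    remaining, selected, best_score = list(range(X.shape[1])), [], 0.0
    while remaining:
        scores = {f: cross_val_score(KNeighborsClassifier(n_neighbors=k),
                                     X[:, selected + [f]], y, cv=5).mean()
                  for f in remaining}
        f_best = max(scores, key=scores.get)
        if scores[f_best] <= best_score:
            break                            # no further improvement
        selected.append(f_best)
        remaining.remove(f_best)
        best_score = scores[f_best]
    return selected, best_score

# Synthetic stand-in for module metrics (X) and defect-proneness labels (y):
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 8))
y = (X[:, 0] + X[:, 3] > 0).astype(int)
print(greedy_feature_subset(X, y))
```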

198 citations


Journal ArticleDOI
TL;DR: A corpus of spreadsheets is assembled that is suitable for evaluating dependability devices in Microsoft Excel, and a variety of features of these spreadsheets are measured to aid researchers in selecting subsets of the corpus appropriate to their needs.
Abstract: In recent years several tools and methodologies have been developed to improve the dependability of spreadsheets. However, there has been little evaluation of these dependability devices on spreadsheets in actual use by end users. To assist in the process of evaluating these methodologies, we have assembled a corpus of spreadsheets from a variety of sources. We have ensured that these spreadsheets are suitable for evaluating dependability devices in Microsoft Excel (the most commonly used commercial spreadsheet environment) and have measured a variety of features of these spreadsheets to aid researchers in selecting subsets of the corpus appropriate to their needs.

193 citations


Journal Article
TL;DR: Singularity demonstrates the practicality of new technologies and architectural decisions, which should lead to the construction of more robust and dependable systems.
Abstract: Singularity is a research project in Microsoft Research that started with the question: what would a software platform look like if it was designed from scratch with the primary goal of dependability? Singularity is working to answer this question by building on advances in programming languages and tools to develop a new system architecture and operating system (named Singularity), with the aim of producing a more robust and dependable software platform. Singularity demonstrates the practicality of new technologies and architectural decisions, which should lead to the construction of more robust and dependable systems.

162 citations


Book ChapterDOI
28 Sep 2005
TL;DR: This paper reports the initial experience in using model-based safety analysis on an example system taken from the ARP Safety Assessment guidelines document.
Abstract: Safety analysis techniques have traditionally been performed manually by the safety engineers. Since these analyses are based on an informal model of the system, it is unlikely that these analyses will be complete, consistent, and error-free. Using precise formal models of the system as the basis of the analysis may help reduce errors and provide a more thorough analysis. Further, these models allow automated analysis, which may reduce the manual effort required. The process of creating system models suitable for safety analysis closely parallels the model-based development process that is increasingly used for critical system and software development. By leveraging the existing tools and techniques, we can create formal safety models using tools that are familiar to engineers and we can use the static analysis infrastructure available for these tools. This paper reports our initial experience in using model-based safety analysis on an example system taken from the ARP Safety Assessment guidelines document.

159 citations


Journal ArticleDOI
J. Bertsch, C. Carnal, D. Karlson, J. McDaniel, Khoi Vu
09 May 2005
TL;DR: The main targets for this paper are to sort out the terminology used in this area, describe different application areas and related requirements, and illustrate different design principles-"top-down", "bottom-up", hierarchy, flat, etc., for different applications.
Abstract: This paper describes basic principles and philosophy for wide-area protection schemes, also known as remedial action schemes (RAS) or system protection schemes (SPS). In the areas of power system automation and substation automation, there are two parallel trends in different directions: centralization and decentralization. More and more functions are moved from local and regional control centers toward the central or national control center. At the same time we also observe more and more "intelligence" and "decision-power" moving closer toward the actual power system process. We also see a great deal of functional integration, i.e., more and more functionality enclosed in the same hardware. This raises discussions concerning reliability (security and dependability). The main targets for this paper are therefore to: 1) sort out the terminology used in this area; 2) describe different application areas and related requirements; 3) illustrate different design principles-"top-down", "bottom-up", hierarchy, flat, etc., for different applications; and 4) identify similarities and differences between classic equipment protection and system protection-concerning philosophy as well as concerning product and system design. The theme of the paper is the use of information technology to obtain more flexibility and smartness in power-system controls.

156 citations


Proceedings ArticleDOI
15 May 2005
TL;DR: This tutorial gives insights into basic principles of CBD, the main concerns and characteristics of embedded systems and possible directions of adaptation of component-based approach for these systems.
Abstract: Although attractive, CBD has not been widely adopted in the domain of embedded systems. The main reason is the inability of these technologies to cope with the important concerns of embedded systems, such as resource constraints, real-time or dependability requirements. However, an increasing understanding of the principles of CBD makes it possible to utilize these principles in the implementation of different component-based models more appropriate for embedded systems. The aim of this tutorial is to point to the opportunity of applying this approach to the development and maintenance of embedded systems. The tutorial gives insights into the basic principles of CBD, the main concerns and characteristics of embedded systems, and possible directions for adaptation of the component-based approach for these systems. Different types of embedded systems and approaches for applying CBD are presented and illustrated by examples from research and practice. Also, challenges and research directions of CBD for embedded systems are discussed.

112 citations


Journal ArticleDOI
TL;DR: A new analytical approach is described to estimate the dependability of TMR designs implemented on SRAM-based FPGAs that is able to predict the effects of single event upsets with the same accuracy as fault injection but at a fraction of its execution time.
Abstract: In order to successfully deploy commercial off-the-shelf SRAM-based FPGA devices in safety- or mission-critical applications, designers need to adopt suitable hardening techniques, as well as methods for validating the correctness of the obtained designs as far as the system's dependability is concerned. In this paper we describe a new analytical approach to estimate the dependability of TMR designs implemented on SRAM-based FPGAs that, by exploiting a detailed knowledge of FPGA architectures and configuration memory, is able to predict the effects of single event upsets with the same accuracy as fault injection but at a fraction of its execution time.
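For orientation only, the sketch below applies the textbook TMR reliability formula (the system works if at least two of three replicas work, assuming a perfect voter) with an assumed per-bit upset probability and an assumed count of critical configuration bits per replica; it is not the analytical approach developed in the paper, which models the FPGA architecture and configuration memory in detail.

```python
# Textbook illustration only (not the paper's analytical model): reliability of
# a TMR design with a perfect majority voter, where each replica survives an
# upset campaign with probability r = (1 - p_bit) ** n_bits.
def module_reliability(p_bit, n_bits):
    return (1.0 - p_bit) ** n_bits

def tmr_reliability(r):
    # Works if at least 2 of 3 replicas work: 3*r^2*(1-r) + r^3 = 3r^2 - 2r^3
    return 3 * r**2 - 2 * r**3

r = module_reliability(p_bit=1e-7, n_bits=200_000)   # assumed illustrative numbers
print(f"simplex: {r:.6f}   TMR: {tmr_reliability(r):.6f}")
```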

Journal ArticleDOI
TL;DR: Discussion on some of the recent advances in logic- and architectural-level techniques to deal with transient errors serves as a "springboard" to motivate the need for hardware-level application-aware runtime checks, which the application can invoke from within its instruction stream per its dependability requirements.
Abstract: Discussion on some of the recent advances in logic- and architectural-level techniques to deal with transient errors serves as a "springboard" to motivate the need for hardware-level application-aware runtime checks, which the application can invoke from within its instruction stream per its dependability requirements. In contrast with traditional approaches of software and hardware duplication, alternative techniques such as fine-grained application-aware runtime checking offer more efficient, low-overhead detection, correction, and recovery.

Journal ArticleDOI
TL;DR: In this paper, the authors describe embedded system coursework during the first four years of university education (the U.S. undergraduate level) and describe lessons learned from teaching courses in many of these areas, as well as general skills taught and approaches used.
Abstract: Embedded systems encompass a wide range of applications, technologies, and disciplines, necessitating a broad approach to education. We describe embedded system coursework during the first four years of university education (the U.S. undergraduate level). Embedded application curriculum areas include: small and single-microcontroller applications, control systems, distributed embedded control, system-on-chip, networking, embedded PCs, critical systems, robotics, computer peripherals, wireless data systems, signal processing, and command and control. Additional cross-cutting skills that are important to embedded system designers include: security, dependability, energy-aware computing, software/systems engineering, real-time computing, and human-computer interaction. We describe lessons learned from teaching courses in many of these areas, as well as general skills taught and approaches used, including a heavy emphasis on course projects to teach system skills.

Journal ArticleDOI
TL;DR: The MEAD (Middleware for Embedded Adaptive Dependability) system attempts to identify and to reconcile the conflicts between real‐time and fault tolerance, in a resource‐aware manner, for distributed CORBA applications.
Abstract: The OMG's Real-Time CORBA (RT-CORBA) and Fault-Tolerant CORBA (FT-CORBA) specifications make it possible for today's CORBA implementations to exhibit either real-time behavior or fault tolerance in isolation. While real-time requires a priori knowledge of the system's temporal operation, fault tolerance necessarily deals with faults that occur unexpectedly, and with possibly unpredictable fault recovery times. The MEAD (Middleware for Embedded Adaptive Dependability) system attempts to identify and to reconcile the conflicts between real-time and fault tolerance, in a resource-aware manner, for distributed CORBA applications. MEAD supports transparent yet tunable fault tolerance in real-time, proactive dependability, and resource-aware system adaptation to crash, communication and timing faults with bounded fault detection and fault recovery.

Journal ArticleDOI
TL;DR: The paper introduces the design of wormhole-aware intrusion-tolerant protocols using a classical distributed systems problem, consensus, and shows the significance of the TTCB as an engineering paradigm, since the protocol manages to be simple when compared with other protocols in the literature.
Abstract: The application of the tolerance paradigm to security - intrusion tolerance - has been attracting a reasonable amount of attention in the dependability and security communities. In this paper we present a novel approach to intrusion tolerance. The idea is to use privileged components - generically designated by wormholes - to support the execution of intrusion-tolerant protocols, often called Byzantine-resilient in the literature. The paper introduces the design of wormhole-aware intrusion-tolerant protocols using a classical distributed systems problem: consensus. The system where the consensus protocol runs is mostly asynchronous and can fail in an arbitrary way, except for the wormhole, which is secure and synchronous. Using the wormhole to execute a few critical steps, the protocol manages to have a low time complexity: in the best case, it runs in two rounds, even if some processes are malicious. The protocol also shows how often-theoretical partial synchrony assumptions can be substantiated in practical distributed systems. The paper shows the significance of the TTCB as an engineering paradigm, since the protocol manages to be simple when compared with other protocols in the literature.

Book ChapterDOI
01 Jan 2005
TL;DR: This work presents an approach to the reliability prediction of service-oriented computing services that is based on the partial information published with each service and lends itself to automation.
Abstract: In service-oriented computing, services are dynamically built as an assembly of pre-existing, independently developed, network-accessible services. Hence, predicting their dependability as automatically as possible is important to appropriately drive the selection and assembly of services, in order to achieve some required dependability level. We present an approach to the reliability prediction of such services that is based on the partial information published with each service and lends itself to automation. The proposed methodology exploits ideas from the Software Architecture- and Component-based approaches to software design.
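As a minimal sketch of the general idea, not the paper's methodology: if each published service advertises a reliability figure, the reliability of a sequential composition can be roughly approximated as the product of per-service reliabilities weighted by their expected invocation counts in a usage profile. The service names and figures below are assumptions.

```python
# Hedged sketch: composite reliability as a product of published per-service
# reliabilities raised to their expected invocation counts (assumed usage
# profile). Service names and numbers are illustrative.
def composite_reliability(published, expected_calls):
    r = 1.0
    for service, rel in published.items():
        r *= rel ** expected_calls.get(service, 0)
    return r

published = {"auth": 0.999, "catalog": 0.995, "payment": 0.990}   # assumed figures
expected_calls = {"auth": 1, "catalog": 3, "payment": 1}
print(f"predicted composite reliability: "
      f"{composite_reliability(published, expected_calls):.4f}")
```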

Book ChapterDOI
01 Jan 2005
TL;DR: A classification can indicate the efforts that would be required to predict the system attributes which are essential for system dependability and in this way, the feasibility of the component-based approach in developing dependable systems is indicated.
Abstract: One of the main objectives of developing component-based software systems is to enable efficient building of systems through the integration of components. All component models define some form of component interface standard that facilitates the programmatic integration of components, but they do not facilitate or provide theories for the prediction of the quality attributes of the component compositions. This decreases significantly the value of the component-based approach to building dependable systems. If it is not possible to predict the value of a particular attribute of a system prior to integration and deployment to the target environment the system must be subjected to other procedures, often costly, to determine this value empirically. For this reason one of the challenges of the component-based approach is to obtain means for the “composition” of quality attributes. This challenge poses a very difficult task because the diverse types of quality attributes do not have the same underlying conceptual characteristics, since many factors, in addition to component properties, influence the system properties. This paper analyses the relation between the quality attributes of components and those of their compositions. The types of relations are classified according to the possibility of predicting properties of compositions from the properties of the components and according to the influences of other factors such as software architecture or system environment. The classification is exemplified with particular cases of compositions of quality attributes, and its relation to dependability is discussed. Such a classification can indicate the efforts that would be required to predict the system attributes which are essential for system dependability and in this way, the feasibility of the component-based approach in developing dependable systems.

Proceedings ArticleDOI
26 Jul 2005
TL;DR: This paper demonstrates the Web Service based N-Version model, WS-FTM (Web Service-Fault Tolerance Mechanism), which applies this well proven technique to the domain of Web Services to increase system dependability and achieves transparent usage of replicated Web Services by use of a modified stub.
Abstract: This paper demonstrates our Web Service based N-Version model, WS-FTM (Web Service-Fault Tolerance Mechanism), which applies this well-proven technique to the domain of Web Services to increase system dependability. WS-FTM achieves transparent usage of replicated Web Services by use of a modified stub. The stub is created using tools included in WS-FTM. Our initial implementation includes a simple consensus voter that allows generic result comparison. Finally, we show, through the use of a non-trivial example, that WS-FTM can be used to increase the reliability of a Web Service system.
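The sketch below illustrates only the N-version voting idea; WS-FTM itself works through a modified client stub generated by its tools. The replica callables are hypothetical stand-ins for invocations of independently deployed Web Service endpoints, and this simple consensus voter assumes results are hashable so they can be compared generically.

```python
# N-version voting sketch (not WS-FTM's stub mechanism): call all replicas in
# parallel and keep the majority result, if one exists.
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def invoke_with_voting(replicas, *args):
    """Call all replicas concurrently and return the majority result."""
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(replica, *args) for replica in replicas]
        results = [f.result() for f in futures]
    value, votes = Counter(results).most_common(1)[0]   # results must be hashable
    if votes <= len(replicas) // 2:
        raise RuntimeError("no majority consensus among replicas")
    return value

# Hypothetical stand-ins for three independently implemented service endpoints:
replicas = [lambda x: x * 2, lambda x: x * 2, lambda x: x * 2 + 1]
print(invoke_with_voting(replicas, 21))   # 42 wins two votes to one
```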

01 Jan 2005
TL;DR: This chapter analyzes the real-time and dependability constraints of X-by-Wire systems and reviews the fault-tolerant services that are needed and the communication protocols considered for use in such systems.
Abstract: X-by-Wire is a generic term referring to the replacement of mechanical or hydraulic systems, such as braking or steering, by electronic ones. In this chapter, we analyze the real-time and dependability constraints of X-by-Wire systems, review the fault-tolerant services that are needed and the communication protocols (TTP/C, FlexRay and TTCAN) considered for use in such systems. Using a Steer-by-Wire case-study, we detail the design principles and verification methods that can be used to ensure the stringent constraints of X-by-Wire systems.

Journal ArticleDOI
TL;DR: Main theoretical aspects and practical implications of the multicriteria decision model, which is based on the ELECTRE method combined with utility functions, are presented, including a numerical application.
Abstract: A decision maker (DM) faces a choice among several alternative repair contracts for a system. Each repair contract alternative implies specific results regarding the following characteristics or criteria: response time, service quality, dependability and related cost. This problem has been analysed through a multicriteria decision model. The model is based on the ELECTRE method combined with utility functions. Main theoretical aspects and practical implications are presented, including a numerical application.
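As an illustration of the kind of outranking computation involved, the sketch below computes simplified ELECTRE-I-style concordance indices between two contract alternatives. It is not the paper's exact model, which combines ELECTRE with utility functions, and the criteria, weights and scores are invented.

```python
# Simplified ELECTRE-I-style concordance computation (illustrative only).
# All criterion scores are assumed to be scaled so that higher is better.
def concordance(a, b, weights):
    """Share of criterion weight on which alternative a is at least as good as b."""
    agreeing = sum(w for w, x, y in zip(weights, a, b) if x >= y)
    return agreeing / sum(weights)

# criteria: response time (inverted), service quality, dependability, cost (inverted)
weights = [0.3, 0.25, 0.3, 0.15]
contract_A = [0.8, 0.7, 0.9, 0.6]
contract_B = [0.6, 0.9, 0.7, 0.8]
print(f"c(A,B) = {concordance(contract_A, contract_B, weights):.2f}")
print(f"c(B,A) = {concordance(contract_B, contract_A, weights):.2f}")
```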

01 Jan 2005
TL;DR: In this article, the authors propose CANcentrate, a new active star topology for the Controller Area Network (CAN), which is fully compatible with existing CAN controllers but requires double links.
Abstract: The Controller Area Network (CAN) is a field bus that is nowadays widespread in distributed embedded systems due to its electrical robustness, low price, and deterministic access delay. However, its use in safety-critical applications has been controversial due to dependability limitations, such as those arising from its bus topology. In particular, in a CAN bus there are multiple components such that if any of them is faulty, a general failure of the communication system may happen. In this document, we propose the design of a new active star topology called CANcentrate. Our design solves the limitations indicated above by means of an active hub which prevents error propagation from any of its ports to the others. CANcentrate exhibits improved fault diagnosis and isolation mechanisms with respect to both all communication systems that rely on a CAN bus and all commercially available CAN communication systems based on a hub. Due to the specific characteristics of our hub, CANcentrate is fully compatible with existing CAN controllers, but requires double links. The present document is devoted to reporting in detail the work we have done regarding CANcentrate. First, the document compares bus and star topologies, analyzes related work and describes the CANcentrate basics, paying special attention to the mechanisms used for detecting faulty ports. Afterwards, the document explains the reintegration policy the hub performs to deal with transient faults and addresses some issues concerning the cabling length in a star topology. Finally it describes the implementation and tests of a prototype of CANcentrate.

Book
27 Dec 2005
TL;DR: System developers, stakeholders, decision makers, policy makers and academics will find that this book highlights the core issues for all those who depend on complex computer-based environments.
Abstract: This book collects the latest research on computer-based systems from computer scientists, sociologists and psychologists, combining their perspectives into a picture of recent work on dependable computer-based systems. Stakeholders and system designers, as well as the scientific community, now agree that human and social issues must be covered along with technical problems. System developers, stakeholders, decision makers, policy makers and academics will find that this book highlights the core issues for all those who depend on complex computer-based environments.

Proceedings ArticleDOI
28 Jun 2005
TL;DR: This paper identifies some examples of CRC usage that compromise ultra-dependable system design goals, and recommends alternate ways to improve system dependability via architectural approaches rather than error detection coding approaches.
Abstract: A cyclic redundancy code (CRC), when used properly, can be an effective and relatively inexpensive method to detect data corruption across communication channels. However, some systems use CRCs in ways that violate common assumptions made in analyzing CRC effectiveness, resulting in an overly optimistic prediction of system dependability. CRCs detect errors with some finite probability, which depends on factors including the strength of the particular code used, the bit-error rate, and the message length being checked. Common assumptions also include a passive network inter-stage, explicit data words, memoryless channels, and random independent symbol errors. In this paper we identify some examples of CRC usage that compromise ultra-dependable system design goals, and recommend alternate ways to improve system dependability via architectural approaches rather than error detection coding approaches.
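For reference, the sketch below is a generic bitwise (MSB-first, non-reflected) CRC routine; it is not taken from the paper, but it makes concrete the point that detection strength depends on the chosen polynomial, the CRC width and the length of the message actually checked. The parameters in the usage line correspond to the common CRC-16/CCITT-FALSE configuration.

```python
# Generic bitwise, MSB-first CRC with no input/output reflection and no final
# XOR. Not from the paper; polynomial, width and init are caller-supplied.
def crc(data: bytes, poly: int, width: int, init: int = 0) -> int:
    top = 1 << (width - 1)
    mask = (1 << width) - 1
    reg = init
    for byte in data:
        reg ^= byte << (width - 8)          # assumes width >= 8
        for _ in range(8):
            reg = ((reg << 1) ^ poly) if (reg & top) else (reg << 1)
        reg &= mask                         # truncate to the CRC width
    return reg

# Usage with CRC-16/CCITT-FALSE parameters (poly 0x1021, init 0xFFFF):
print(hex(crc(b"123456789", poly=0x1021, width=16, init=0xFFFF)))
```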

Journal ArticleDOI
TL;DR: TTCAN, TTP/C, Byteflight, FlexRay and Bluetooth are some of the most promising emerging solutions that have already been defined and can be embedded in new projects right away; their characteristics are compared, pointing out the main advantages and drawbacks.

Journal ArticleDOI
TL;DR: This paper gives an introduction to the reliable data transport problem and surveys protocols and approaches for this problem, often developed for particular applications to reflect the application-specific dependability requirements.
Abstract: Reliable data transport is an important facet of dependability and quality of service in wireless sensor networks. This paper gives an introduction to the reliable data transport problem and surveys protocols and approaches for this problem, often developed for particular applications to reflect the application-specific dependability requirements. A joint characteristic of many of the discussed protocols is that they combine mechanisms from several layers to achieve their reliability goals while being energy-efficient. This very need to be energy-efficient precludes Internet-style approaches to reliability – handle it in the end system – and necessitates in-network solutions.

Proceedings ArticleDOI
27 Jul 2005
TL;DR: There is statistical evidence that large, networked, evolving systems, either fixed or mobile, with demanding requirements driven by their domain of application, suffer from a significant drop in dependability and security in comparison with the former systems.
Abstract: The current state-of-knowledge and state-of-the-art reasonably enable the construction and operation of critical systems, be they safety-critical or availability-critical. The situation drastically worsens when considering large, networked, evolving systems, either fixed or mobile, with demanding requirements driven by their domain of application. There is statistical evidence that these emerging systems suffer from a significant drop in dependability and security in comparison with the former systems. The cost of failures in service is growing rapidly, as a consequence of the degree of dependence placed on computing systems, reaching up to several million euros per hour of downtime for some businesses.

Proceedings ArticleDOI
28 Jun 2005
TL;DR: An experiment is performed on a wide area network to assess and fairly compare the quality of service provided by a large family of failure detectors, and the choices for estimators and safety margins used to build several failure detectors are introduced.
Abstract: This paper describes an experiment performed on a wide area network to assess and fairly compare the quality of service provided by a large family of failure detectors. Failure detectors are a popular middleware mechanism used for improving the dependability of distributed systems and applications. Their QoS greatly influences the QoS that upper layers may provide. It is thus of utmost importance to equip a system with an appropriate failure detector and to properly tune its parameters for the most desirable QoS to be provided. The paper first analyzes the QoS indicators and the structure of push-style failure detectors and then introduces the choices for estimators and safety margins used to build several (30) failure detectors. The experimental setup designed and implemented to allow a fair comparison of the QoS of the several alternatives in a real, representative experimental setting is then described. Finally, the results obtained through the experiments and their interpretation are provided.
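As a minimal sketch of one push-style failure detector from the family the paper studies: the next heartbeat arrival is estimated from a sliding window of past arrivals and a constant safety margin is added before suspecting the monitored process. The window size and margin below are illustrative; the paper compares many estimator/margin combinations.

```python
# Sketch of a push-style heartbeat failure detector: arrival-time estimator
# (mean inter-arrival gap over a sliding window) plus a constant safety margin.
from collections import deque

class HeartbeatFailureDetector:
    """Suspect the monitored process when a heartbeat is 'too late'."""

    def __init__(self, window=100, alpha=0.5):
        self.arrivals = deque(maxlen=window)   # recent heartbeat arrival times (s)
        self.alpha = alpha                     # constant safety margin (s)

    def heartbeat(self, t):
        """Record the arrival time of a heartbeat."""
        self.arrivals.append(t)

    def suspects(self, now):
        """Return True if the monitored process is currently suspected."""
        if len(self.arrivals) < 2:
            return False
        times = list(self.arrivals)
        gaps = [b - a for a, b in zip(times, times[1:])]
        expected_next = times[-1] + sum(gaps) / len(gaps)   # arrival estimator
        return now > expected_next + self.alpha
```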

Proceedings ArticleDOI
20 Sep 2005
TL;DR: It is argued that distinguishing between trust and assurance yields a wider range of strategies for ensuring dependability of the human element in a secure socio-technical system, and correctly placed trust can also benefit an organisation's culture and performance.
Abstract: In order to be effective, secure systems need to be both correct (i.e. effective when used as intended) and dependable (i.e. actually being used as intended). Given that most secure systems involve people, a strategy for achieving dependable security must address both people and technology. Current research in Human-Computer Interactions in Security (HCISec) aims to increase dependability of the human element by reducing mistakes (e.g. through better user interfaces to security tools). We argue that a successful strategy also needs to consider the impact of social interaction on security, and in this respect trust is a central concept. We compare the understanding of trust in secure systems with the more differentiated models of trust in social science research. The security definition of "trust" turns out to map onto strategies that would be correctly described as "assurance" in the more differentiated model. We argue that distinguishing between trust and assurance yields a wider range of strategies for ensuring dependability of the human element in a secure socio-technical system. Furthermore, correctly placed trust can also benefit an organisation's culture and performance. We conclude by presenting design principles to help security designers decide "when to trust" and "when to assure", and give examples of how both strategies would be implemented in practice.

Reference EntryDOI
15 Oct 2005
TL;DR: Generalizability (G) theory is a statistical theory for evaluating the dependability (reliability) of behavioral measurements and permits decision makers to design a measurement procedure that minimizes error.
Abstract: Generalizability (G) theory is a statistical theory for evaluating the dependability (reliability) of behavioral measurements. G theory estimates multiple sources of measurement error and permits decision makers to design a measurement procedure that minimizes error. Keywords: dependability; generalizability theory; reliability
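A minimal numeric sketch of a one-facet (persons x raters) G study follows, using the standard random-effects ANOVA estimators for the variance components and the absolute ("dependability", phi) coefficient; the score matrix is made up for illustration and is not from the cited work.

```python
# One-facet (persons x raters) G-study sketch with standard ANOVA estimators.
# The score matrix is an illustrative assumption.
import numpy as np

scores = np.array([[7, 8, 6],      # rows: persons, columns: raters
                   [5, 5, 4],
                   [9, 9, 8],
                   [6, 7, 6]], dtype=float)
n_p, n_r = scores.shape
grand = scores.mean()
person_means = scores.mean(axis=1)
rater_means = scores.mean(axis=0)

ms_p = n_r * np.sum((person_means - grand) ** 2) / (n_p - 1)
ms_r = n_p * np.sum((rater_means - grand) ** 2) / (n_r - 1)
resid = scores - person_means[:, None] - rater_means[None, :] + grand
ms_pr = np.sum(resid ** 2) / ((n_p - 1) * (n_r - 1))

var_pr = ms_pr                          # person x rater interaction + error
var_r = max((ms_r - ms_pr) / n_p, 0.0)
var_p = max((ms_p - ms_pr) / n_r, 0.0)

# Dependability (phi) coefficient for absolute decisions with n_r raters:
phi = var_p / (var_p + (var_r + var_pr) / n_r)
print(f"variance components: p={var_p:.3f}, r={var_r:.3f}, pr,e={var_pr:.3f}")
print(f"dependability coefficient (phi) = {phi:.3f}")
```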

01 Jan 2005
TL;DR: This paper addresses dependability as a whole, but focuses specifically on fault tolerance, and presents some particularities of autonomous systems and discusses the dependability mechanisms that are currently employed.
Abstract: Autonomous systems are starting to appear in space exploration, elderly care and domestic service; they are particularly attractive for such applications because their advanced decisional mechanisms allow them to execute complex missions in uncertain environments. However, systems embedding such mechanisms simultaneously raise new concerns regarding their dependability. We aim in this paper to present these concerns and suggest possible ways to resolve them. We address dependability as a whole, but focus specifically on fault tolerance. We present some particularities of autonomous systems and discuss the dependability mechanisms that are currently employed. We then concentrate on the dependability concerns raised by decisional mechanisms and consider the introduction and assessment of appropriate fault tolerance mechanisms.