
Showing papers on "Dependability published in 2006"


Book
01 Jan 2006
TL;DR: In this book, the authors cover fault-detection methods (limit checking, signal models, process identification, parity equations, state observers, and Principal Component Analysis (PCA)), fault-diagnosis methods based on classification and inference, fault-tolerant design, and the comparison and combination of fault-detection methods.
Abstract: Fundamentals.- Supervision and fault management of processes - tasks and terminology.- Reliability, Availability and Maintainability (RAM).- Safety, Dependability and System Integrity.- Fault-Detection Methods.- Process Models and Fault Modelling.- Signal models.- Fault detection with limit checking.- Fault detection with signal models.- Fault detection with process-identification methods.- Fault detection with parity equations.- Fault detection with state observers and state estimation.- Fault detection of control loops.- Fault detection with Principal Component Analysis (PCA).- Comparison and combination of fault-detection methods.- Fault-Diagnosis Methods.- Diagnosis procedures and problems.- Fault diagnosis with classification methods.- Fault diagnosis with inference methods.- Fault-Tolerant Systems.- Fault-tolerant design.- Fault-tolerant components and control.- Application Examples.- Fault detection and diagnosis of DC motor drives.- Fault detection and diagnosis of a centrifugal pump-pipe-system.- Fault detection and diagnosis of an automotive suspension and the tire pressures.
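
To make the flavour of these methods concrete, the following Python sketch (not from the book; the model coefficients, thresholds and injected drift are invented) combines two of the listed fault-detection methods: limit checking on a measured signal and a parity-equation residual computed from an assumed first-order process model.

```python
import numpy as np

def limit_check(signal, low, high):
    """Flag samples that violate fixed limits (the simplest fault-detection method)."""
    return (signal < low) | (signal > high)

def parity_residual(u, y, a=0.9, b=0.5):
    """Residual from an assumed first-order model y[k] = a*y[k-1] + b*u[k-1].
    A residual far from zero indicates a fault (parity-equation idea)."""
    y_pred = a * y[:-1] + b * u[:-1]
    return y[1:] - y_pred

# Hypothetical data: a step input and a sensor whose reading drifts after sample 60.
u = np.ones(100)
y = np.zeros(100)
for k in range(1, 100):
    y[k] = 0.9 * y[k - 1] + 0.5 * u[k - 1]
y[60:] += np.linspace(0.0, 2.0, 40)          # injected sensor drift fault

alarms_limit = limit_check(y, low=-1.0, high=5.5)
alarms_parity = np.abs(parity_residual(u, y)) > 0.1

print("limit-check alarms start at sample:", np.flatnonzero(alarms_limit)[:3])
print("parity-residual alarms start at sample:", np.flatnonzero(alarms_parity)[:3])
```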

1,754 citations


Journal ArticleDOI
27 Oct 2006-Science
TL;DR: The economics of information security has recently become a thriving and fast-moving discipline and provides valuable insights into more general areas such as the design of peer-to-peer systems, the optimal balance of effort by programmers and testers, why privacy gets eroded, and the politics of digital rights management.
Abstract: The economics of information security has recently become a thriving and fast-moving discipline. As distributed systems are assembled from machines belonging to principals with divergent interests, we find that incentives are becoming as important as technical design in achieving dependability. The new field provides valuable insights not just into "security" topics (such as bugs, spam, phishing, and law enforcement strategy) but into more general areas such as the design of peer-to-peer systems, the optimal balance of effort by programmers and testers, why privacy gets eroded, and the politics of digital rights management.

737 citations


DOI
01 Feb 2006
TL;DR: The AADL is a modeling language that supports early and repeated analyses of a system's architecture with respect to performance-critical properties through an extendable notation, a tool framework, and precisely defined semantics.
Abstract: In November 2004, the Society of Automotive Engineers (SAE) released the aerospace standard AS5506, named the Architecture Analysis & Design Language (AADL). The AADL is a modeling language that supports early and repeated analyses of a system's architecture with respect to performance-critical properties through an extendable notation, a tool framework, and precisely defined semantics. The language employs formal modeling concepts for the description and analysis of application system architectures in terms of distinct components and their interactions. It includes abstractions of software, computational hardware, and system components for (a) specifying and analyzing real-time embedded and high dependability systems, complex systems of systems, and specialized performance capability systems and (b) mapping of software onto computational hardware elements. The AADL is especially effective for model-based analysis and specification of complex real-time embedded systems. This technical note is an introduction to the concepts, language structure, and application of the AADL.

650 citations


Proceedings ArticleDOI
09 Oct 2006
TL;DR: This paper presents the state of the art in the field as surveyed by the PHRIDOM project and highlights a number of challenges that will be addressed within the PHRIENDS project.
Abstract: In the immediate future, metrics related to safety and dependability have to be found in order to successfully introduce robots in everyday environments. The crucial issues needed to tackle the problem of a safe and dependable physical human-robot interaction (pHRI) were addressed in the EURON Perspective Research Project PHRIDOM (Physical Human-Robot Interaction in Anthropic Domains), aimed at charting the new "territory" of pHRI. While there are certainly also "cognitive" issues involved, due to the human perception of the robot (and vice versa), and other objective metrics related to fault detection and isolation, the discussion in this paper will focus on the peculiar aspects of "physical" interaction with robots. In particular, safety and dependability will be the underlying evaluation criteria for mechanical design, actuation, and control architectures. Mechanical and control issues will be discussed with emphasis on techniques that provide safety in an intrinsic way or by means of control components. Attention will be devoted to dependability, mainly related to sensors, control architectures, and fault handling and tolerance. After PHRIDOM, a novel research project has been launched under the Information Society Technologies Sixth Framework Programme of the European Commission. This "Specific Targeted Research or Innovation" project is dedicated to "Physical Human-Robot Interaction: depENDability and Safety" (PHRIENDS). PHRIENDS is about developing key components of the next generation of robots, including industrial robots and assist devices, designed to share the environment and to physically interact with people. The philosophy of the project proposes an integrated approach to the co-design of robots for safe physical interaction with humans, which revolutionizes the classical approach for designing industrial robots – rigid design for accuracy, active control for safety – by creating a new paradigm: design robots that are intrinsically safe, and control them to deliver performance. This paper presents the state of the art in the field as surveyed by the PHRIDOM project and highlights a number of challenges that will be addressed within the PHRIENDS project.

231 citations


Journal ArticleDOI
TL;DR: In this paper, the authors identify eight relatively surprise-free trends in software engineering: increasing interaction of software engineering and systems engineering, increased emphasis on users and end value, increasing emphasis on systems and software dependability, increasingly rapid change, increasing global connectivity and need for systems to interoperate, increasingly complex systems of systems, increasing needs for COTS, reuse, and legacy systems and software integration, and computational plenty.
Abstract: In response to the increasing criticality of software within systems and the increasing demands being put onto 21st century systems, systems and software engineering processes will evolve significantly over the next two decades. This paper identifies eight relatively surprise-free trends—the increasing interaction of software engineering and systems engineering; increased emphasis on users and end value; increased emphasis on systems and software dependability; increasingly rapid change; increasing global connectivity and need for systems to interoperate; increasingly complex systems of systems; increasing needs for COTS, reuse, and legacy systems and software integration; and computational plenty. It also identifies two “wild card” trends: increasing software autonomy and combinations of biology and computing. It then discusses the likely influences of these trends on systems and software engineering processes between now and 2025, and presents an emerging scalable spiral process model for coping with the resulting challenges and opportunities of developing 21st century software-intensive systems and systems of systems. © 2006 Wiley Periodicals, Inc. Syst Eng 9: 1–19, 2006

146 citations


Journal ArticleDOI
TL;DR: The purpose of this article is to provide a comprehensive overview of measurement error as it applies to research design and instrumentation issues and to serve as a succinct, practical reminder of the definitions and relationships of the concepts of validity and reliability.

110 citations


BookDOI
TL;DR: This paper analyzes the applicability of various performance prediction methods for the development of component-based systems and contrasts their inherent strengths and weaknesses in different engineering problem scenarios, establishing a basis for selecting an appropriate prediction method.
Abstract: Performance predictions of component assemblies and the ability of obtaining system-level performance properties from these predictions are a crucial success factor when building trustworthy component-based systems. In order to achieve this goal, a collection of methods and tools to capture and analyze the performance of software systems has been developed. These methods and tools aim at helping software engineers by providing them with the capability to understand design trade-offs, optimize their design by identifying performance inhibitors, or predict a system's performance within a specified deployment environment. In this paper, we analyze the applicability of various performance prediction methods for the development of component-based systems and contrast their inherent strengths and weaknesses in different engineering problem scenarios. In so doing, we establish a basis to select an appropriate prediction method and to provide recommendations for future research activities, which could significantly improve the performance prediction of component-based systems.

110 citations


Journal ArticleDOI
TL;DR: This article used generalizability theory (G-theory) procedures to examine the impact of the number of tasks and raters per speech sample and of subsection lengths on the dependability of speaking scores.
Abstract: A multitask speaking measure consisting of both integrated and independent tasks is expected to be an important component of a new version of the TOEFL test. This study considered two critical issues concerning score dependability of the new speaking measure: How much would the score dependability be impacted by (1) combining scores on different task types into a composite score and (2) rating each task only once? To answer these questions, generalizability theory (G-theory) procedures were used to examine the impact of the numbers of tasks and raters per speech sample and of subsection lengths on the dependability of speaking scores. Univariate and multivariate G-theory analyses were conducted on rating data collected for 261 examinees for the study. The finding in the univariate analyses was that it would be more efficient to increase the number of tasks rather than the number of ratings per speech sample in maximizing the score dependability. The multivariate G-theory analyses also revealed that (1) th...
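
For readers unfamiliar with G-theory, the following Python sketch (the variance components are invented, not the study's estimates) shows how a dependability (Phi) coefficient for a persons x tasks x raters design responds to adding tasks versus adding ratings per task, the trade-off examined in the study.

```python
# Hypothetical variance components for a persons x tasks x raters (p x t x r) design.
# In the study these would come from the G-study of the rating data.
var = {
    "p": 0.40,                      # persons (universe-score variance)
    "t": 0.02, "r": 0.005,          # task and rater main effects
    "pt": 0.14, "pr": 0.01, "tr": 0.005, "ptr": 0.12,
}

def phi(n_tasks, n_raters, v=var):
    """Dependability (Phi) coefficient for a D-study with n_tasks tasks,
    each rated by n_raters raters."""
    error = (v["t"] / n_tasks + v["r"] / n_raters
             + v["pt"] / n_tasks + v["pr"] / n_raters
             + (v["tr"] + v["ptr"]) / (n_tasks * n_raters))
    return v["p"] / (v["p"] + error)

# Adding tasks vs. adding ratings per task:
print(f"4 tasks x 1 rating : Phi = {phi(4, 1):.3f}")
print(f"8 tasks x 1 rating : Phi = {phi(8, 1):.3f}")
print(f"4 tasks x 2 ratings: Phi = {phi(4, 2):.3f}")
```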

95 citations


Proceedings ArticleDOI
10 Jul 2006
TL;DR: Results presented in this paper show that fault-tolerant techniques can easily be optimized away by synthesis tools, reducing the robustness of the final design.
Abstract: This work discusses the use of two fault-tolerant techniques, duplication with self-checking and triple modular redundancy, for one-hot encoded FSMs in SRAM-based FPGAs. The FSM encoding style has a significant influence on the dependability of the machine in the presence of bit-flips, known as single event upsets (SEUs). Although the one-hot encoding style presents the best trade-off in terms of reliability, modern synthesis tools tend to optimize crucial characteristics of the one-hot style. Consequently, techniques must be applied in the hardware description language to ensure the reliability of protected one-hot FSMs. Results presented in this paper show that fault-tolerant techniques can easily be optimized away by the tools, reducing the robustness of the final design. Solutions at the RTL level are proposed to ensure reliability.
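
The following behavioural Python sketch (a hypothetical ring-counter FSM, not the paper's HDL designs) illustrates the triple modular redundancy idea for a one-hot state register: three register copies feed a bitwise majority voter, so a single injected bit-flip is masked, provided the synthesis flow has not collapsed the redundant copies.

```python
import random

N = 4  # one-hot register width of a hypothetical FSM (a simple ring counter)

def next_state(onehot):
    """Stand-in next-state logic: advance the single hot bit by one position."""
    i = onehot.index(1)
    nxt = [0] * N
    nxt[(i + 1) % N] = 1
    return nxt

def vote(a, b, c):
    """Bitwise majority voter over the three redundant state registers (TMR)."""
    return [1 if x + y + z >= 2 else 0 for x, y, z in zip(a, b, c)]

regs = [[1] + [0] * (N - 1) for _ in range(3)]   # three copies of the state register

for cycle in range(6):
    if cycle == 3:                                # inject a single event upset (SEU)
        regs[0][random.randrange(N)] ^= 1
    voted = vote(*regs)                           # the upset is masked by the voter
    regs = [next_state(voted) for _ in range(3)]  # copies reloaded from the voted state
    print(cycle, voted)
```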

81 citations


01 Jan 2006
TL;DR: This paper gives an overview of this emerging research field, suggests directions that could usefully be taken in the field of dependability requirements, and considers the links between the dependability ontology, an ontology for requirements, and domain ontologies.
Abstract: There is a long history of research into utilising ontologies in the Requirements Engineering process. An ontology is generally based upon some logical formalism, and has the benefits for requirements of explicitly modelling domain knowledge in a machine interpretable way, e.g. allowing requirements to be traced and checked for consistency by an inference engine, and software specifications to be derived. With the emergence of the semantic web, the interest in ontologies for Requirements Engineering is on the increase. Whilst efforts have been concentrated upon re-interpreting software engineering techniques for the semantic web, it is interesting to consider what benefits there are to be passed from the semantic web to traditional Software Engineering techniques. In this paper we give an overview of this emerging research field, suggesting directions that could usefully be taken in the field of dependability requirements. We present our work on a dependability ontology compliant with the IFIP Working Group 10.4 taxonomy and discuss how this, and other ontologies, must interact in the course of Dependability Requirements Engineering. In particular we consider the links between the dependability ontology, an ontology for requirements and domain ontologies, identifying the advantages and difficulties of this approach.

77 citations


Journal Article
TL;DR: This paper proposes a new approach to anomaly detection based on design diversity, a technique from the dependability field that has been widely ignored in the intrusion detection area; it provides an implicit and complete reference model, instead of the explicit model usually required.
Abstract: It is commonly accepted that intrusion detection systems (IDS) are required to compensate for the insufficient security mechanisms that are available on computer systems and networks. However, the anomaly-based IDSes that have been proposed in recent years present some drawbacks, e.g., the necessity to explicitly define a behaviour reference model. In this paper, we propose a new approach to anomaly detection based on design diversity, a technique from the dependability field that has been widely ignored in the intrusion detection area. Its main advantage is that it provides an implicit and complete reference model, instead of the explicit model usually required. For practical reasons, we actually use components-off-the-shelf (COTS) diversity, and discuss the impact of this choice. We present an architecture using COTS diversity, and then apply it to web servers. We also provide experimental results that confirm the expected properties of the built IDS, and compare it with other IDSes.
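
A minimal sketch of the COTS-diversity principle, assuming hypothetical server names and response digests, is given below: the same request is served by several diverse off-the-shelf servers, their outputs are compared, and any dissenting server is flagged as a possible intrusion.

```python
from collections import Counter

def detect_anomaly(responses):
    """responses: mapping server name -> (status code, body digest).
    The majority output is taken as the implicit reference model;
    dissenting servers are flagged as anomalous."""
    majority, _ = Counter(responses.values()).most_common(1)[0]
    suspects = [srv for srv, out in responses.items() if out != majority]
    return majority, suspects

# Hypothetical outputs for one request: one server is assumed compromised
# and returns different content.
responses = {
    "apache": (200, "digest-aaaa"),
    "iis":    (200, "digest-aaaa"),
    "thttpd": (200, "digest-beef"),   # deviating answer -> possible intrusion
}
majority, suspects = detect_anomaly(responses)
print("reference output:", majority, "| flagged servers:", suspects)
```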

Proceedings ArticleDOI
20 Apr 2006
TL;DR: A new iterative algorithm is applied that efficiently combines dynamic discretisation with robust propagation algorithms on junction tree structures to perform inference in hybrid BNs.
Abstract: A hybrid Bayesian network (BN) is one that incorporates both discrete and continuous nodes. In our extensive applications of BNs for system dependability assessment the models are invariably hybrid and the need for efficient and accurate computation is paramount. We apply a new iterative algorithm that efficiently combines dynamic discretisation with robust propagation algorithms on junction tree structures to perform inference in hybrid BNs. We illustrate its use on two example dependability problems: reliability estimation and diagnosis of a faulty sensor in a temporal system. Dynamic discretisation can be used as an alternative to analytical or Monte Carlo methods with high precision and can be applied to a wide range of dependability problems.
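
The sketch below is a deliberately simplified, single-node illustration of the dynamic discretisation idea (the actual algorithm operates on junction trees over the whole hybrid BN): a continuous failure probability is discretised, the posterior given binomial test evidence is computed on the current grid, and the bin carrying the most posterior mass is repeatedly split. The evidence counts and the number of refinement steps are invented.

```python
import math

def binom_lik(k, n, p):
    """Likelihood of k failures in n demands given failure probability p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def posterior_on_grid(edges, k, n):
    """Discretise p into the given bins and return the normalised posterior mass
    per bin (uniform prior assumed)."""
    mass = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mid = 0.5 * (lo + hi)
        mass.append((hi - lo) * binom_lik(k, n, mid))
    z = sum(mass)
    return [m / z for m in mass]

k, n = 2, 50                      # evidence: 2 failures observed in 50 demands
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
for _ in range(6):                # dynamic refinement: split the highest-mass bin
    post = posterior_on_grid(edges, k, n)
    i = max(range(len(post)), key=post.__getitem__)
    edges.insert(i + 1, 0.5 * (edges[i] + edges[i + 1]))

post = posterior_on_grid(edges, k, n)
mean = sum(p * 0.5 * (lo + hi) for p, lo, hi in zip(post, edges[:-1], edges[1:]))
print("estimated failure probability ~", round(mean, 3))
```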

Proceedings ArticleDOI
14 Jun 2006
TL;DR: The dynamic RBD makes it possible to model the dynamic reliability behavior of a system through various proposed dependency models as well as redundancy and load-sharing policy models.
Abstract: Nowadays, also as a consequence of recent events, aspects like security, availability, reliability and other features of a generic system, summarized under the concept of dependability, are receiving increasing attention. This fact is translated into specific requirements as well as explicit and tighter constraints the system must satisfy. But sometimes, with particular reference to reliability, there is a lack of suitable tools to properly model, analyze and study these aspects in depth. To fill this gap, a new modeling tool is proposed, extending and enhancing the existing reliability block diagram (RBD) formalism: the dynamic RBD (DRBD). The DRBD makes it possible to model the dynamic reliability behavior of a system through various proposed dependency models as well as redundancy and load-sharing policy models. The flexible DRBD dependency management is based on an articulated dynamic model composed of states and events. The capabilities, flexibility, and other features of the proposed new modeling approach are illustrated by modeling an example computer system. Different possible solution techniques for a DRBD model are also discussed: Markov chains, Petri nets, analytic methods, Monte Carlo simulation, and a modular approach which combines the classical analytic RBD solution and the dynamic solutions.
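
As an example of a dependency that a static RBD cannot express but a DRBD can, the Monte Carlo sketch below (failure rates and mission time are invented) estimates the mission reliability of a cold-standby pair in which the spare only starts to age once the primary has failed.

```python
import random

LAMBDA_PRIMARY = 1e-3    # failures per hour (invented)
LAMBDA_SPARE   = 2e-3
MISSION        = 1000.0  # mission time in hours

def mission_survives():
    """One random mission: the spare is only activated (and only ages) on demand."""
    t_primary = random.expovariate(LAMBDA_PRIMARY)
    if t_primary >= MISSION:
        return True                              # primary lasts the whole mission
    t_spare = random.expovariate(LAMBDA_SPARE)   # spare starts aging now
    return t_primary + t_spare >= MISSION

runs = 100_000
reliability = sum(mission_survives() for _ in range(runs)) / runs
print(f"estimated mission reliability: {reliability:.4f}")
```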

Proceedings ArticleDOI
24 Jul 2006
TL;DR: A new SLA-oriented software rejuvenation technique is proposed that proved to be a simple way to increase the dependability of the SOAP server and the degree of self-healing, and to maintain a sustained level of performance in the applications.
Abstract: Web services and service-oriented architectures are gaining momentum in the area of distributed systems and Internet applications. However, as we increase the abstraction level of the applications we are also increasing the complexity of the underlying middleware. In this paper, we present a dependability benchmarking study to evaluate and compare the robustness of some of the most popular SOAP-RPC implementations that are intensively used in industry. The study was focused on Apache Axis, where we have observed a high susceptibility to software aging. Building on these results, we propose a new SLA-oriented software rejuvenation technique that proved to be a simple way to increase the dependability of the SOAP server and the degree of self-healing, and to maintain a sustained level of performance in the applications.
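
A minimal sketch of what an SLA-oriented rejuvenation trigger could look like is shown below; the latency threshold, window size and restart hook are hypothetical and not taken from the paper. The moving average of observed response times is compared against a margin below the SLA limit, and a restart is scheduled once aging-related drift approaches it.

```python
from collections import deque

SLA_LATENCY_MS = 200        # agreed service-level objective (hypothetical)
WINDOW = 50                 # observations in the moving window

recent = deque(maxlen=WINDOW)

def restart_soap_server():
    # Placeholder for the actual rejuvenation action (e.g. recycling the worker).
    print("rejuvenation: restarting worker before the SLA is violated")

def observe(latency_ms):
    """Feed one measured response time; trigger rejuvenation when the moving
    average approaches the SLA limit (aging shows up as a slow upward drift)."""
    recent.append(latency_ms)
    if len(recent) == WINDOW and sum(recent) / WINDOW > 0.8 * SLA_LATENCY_MS:
        restart_soap_server()
        recent.clear()

# Simulated aging: latency drifts upwards until the trigger fires.
for i in range(300):
    observe(50 + 0.5 * i)
```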

Journal ArticleDOI
TL;DR: This paper proposes a design for an active star topology called CANcentrate, which solves the limitations of a CAN bus by means of an active hub, which prevents error propagation from any of its ports to the others.
Abstract: The controller area network (CAN) is a field bus that is nowadays widespread in distributed embedded systems due to its electrical robustness, low price, and deterministic access delay. However, its use in safety-critical applications has been controversial due to dependability limitations, such as those arising from its bus topology. In particular, in a CAN bus, there are multiple components such that if any of them is faulty, a general failure of the communication system may happen. In this paper, we propose a design for an active star topology called CANcentrate. Our design solves the limitations indicated above by means of an active hub, which prevents error propagation from any of its ports to the others. Due to the specific characteristics of this hub, CANcentrate is fully compatible with existing CAN controllers. This paper compares bus and star topologies, analyzes related work, describes the CANcentrate basics, paying special attention to the mechanisms used for detecting faulty ports, and finally describes the implementation and test of a CANcentrate prototype.
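
The following behavioural sketch (in Python, with a simplified stuck-at-dominant criterion; the real hub implements richer per-port fault diagnosis) illustrates the containment idea: the hub computes the wired-AND of the enabled ports' contributions and isolates any port whose contribution stays dominant for longer than any legitimate CAN traffic should produce.

```python
# CAN is a wired-AND medium: dominant = 0, recessive = 1. The isolation
# threshold below is illustrative only.
class Hub:
    def __init__(self, n_ports, limit=11):
        self.enabled = [True] * n_ports
        self.dominant_run = [0] * n_ports
        self.limit = limit

    def couple(self, port_bits):
        """One bit time: combine the contributions of all non-isolated ports."""
        for p, bit in enumerate(port_bits):
            if not self.enabled[p]:
                continue
            self.dominant_run[p] = self.dominant_run[p] + 1 if bit == 0 else 0
            if self.dominant_run[p] > self.limit:
                self.enabled[p] = False           # isolate the faulty port
                print(f"port {p} isolated (stuck at dominant)")
        active = [b for p, b in enumerate(port_bits) if self.enabled[p]]
        return min(active) if active else 1       # wired-AND of enabled ports

hub = Hub(n_ports=3)
for t in range(20):
    # port 2 fails stuck-at-dominant; ports 0 and 1 send recessive (idle) bits
    coupled = hub.couple([1, 1, 0])
print("bus value after isolation:", coupled)
```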

Proceedings ArticleDOI
20 Apr 2006
TL;DR: This position paper suggests a high-level conceptual model that is aimed to give a novel approach to the area of security and dependability and to provide an overall means for finding and applying fundamental defense mechanisms.
Abstract: It is now commonly accepted that security and dependability largely represent two different aspects of an overall meta-concept that reflects the trust that we put in a computer system. There exist a large number of models of security and dependability with various definitions and terminology. This position paper suggests a high-level conceptual model that is aimed to give a novel approach to the area. The model defines security and dependability characteristics in terms of a system's interaction with its environment via the system boundaries and attempts to clarify the relation between malicious environmental influence, e.g. attacks, and the service delivered by the system. The model is intended to help reasoning about security and dependability and to provide an overall means for finding and applying fundamental defense mechanisms. Since the model is high-level and conceptual it must be interpreted into each specific sub-area of security/dependability to be practically useful.

Journal ArticleDOI
TL;DR: A new approach to integrated security and dependability evaluation is presented, based on stochastic modeling techniques; it opens up the use of traditional Markov analysis to make new types of probabilistic predictions for a system, such as its expected time to security failure.
Abstract: This paper presents a new approach to integrated security and dependability evaluation, which is based on stochastic modeling techniques. Our proposal aims to provide operational measures of the trustworthiness of a system, regardless of whether the underlying failure cause is intentional or not. By viewing system states as elements in a stochastic game, we can compute the probabilities of expected attacker behavior, and thereby model attacks as transitions between system states. The proposed game model is based on a reward and cost concept. A section of the paper is devoted to demonstrating how the expected attacker behavior is affected by the parameters of the game. Our model opens up the use of traditional Markov analysis to make new types of probabilistic predictions for a system, such as its expected time to security failure.
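
To illustrate the kind of Markov prediction this enables, the sketch below computes the mean time to security failure of a tiny continuous-time Markov chain with an absorbing "compromised" state; the transition rates are invented, whereas in the paper's approach they would be derived from the computed attacker-behaviour probabilities.

```python
import numpy as np

# States: G (good), V (vulnerable), C (compromised, absorbing).
# Generator restricted to the transient states {G, V}:
#   G -> V : 0.02/h (vulnerability appears), V -> G : 0.05/h (patched),
#   V -> C : 0.01/h (successful attack).  All rates are invented.
Q_T = np.array([[-0.02,  0.02],
                [ 0.05, -0.06]])

# Expected time spent in transient states before absorption: solve (-Q_T) t = 1.
t = np.linalg.solve(-Q_T, np.ones(2))
print(f"MTTSF starting from the good state: {t[0]:.1f} hours")
```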

Book ChapterDOI
19 Dec 2006
TL;DR: In this article, the authors present an approach for simulating threats to corporate assets that takes the entire infrastructure into account; the ontology used for the simulation is based on Landwehr's [ALRL04] taxonomy of computer security and dependability.
Abstract: Threat analysis and mitigation, both essential for corporate security, are time consuming, complex and demand expert knowledge. We present an approach for simulating threats to corporate assets, taking the entire infrastructure into account. Using this approach, effective countermeasures and their costs can be calculated quickly without expert knowledge, and subsequent security decisions will be based on objective criteria. The ontology used for the simulation is based on Landwehr's [ALRL04] taxonomy of computer security and dependability.


Journal Article
TL;DR: In this paper, the authors identify parameters impacting the Web services dependability, describe the methods of dependability enhancement by redundancy in space and redundancy in time and perform a series of experiments to evaluate the availability of Web services.
Abstract: With the ever growing use of the Internet, Web services become increasingly popular and their growth rate surpasses even the most optimistic predictions. Services are self-descriptive, self-contained, platform-independent and openly-available components that interact over the network. They are written strictly according to open specifications and/or standards and provide important and often critical functions for many business-to-business systems. Failures causing either service downtime or invalid results in such systems may range from a mere inconvenience to significant monetary penalties or even loss of human lives. In applications where sensing and control of machines and other devices take place via services, making the services highly dependable is one of the main critical goals. Currently, there is no experimental investigation to evaluate the reliability and availability of Web services systems. In this paper, we identify parameters impacting Web services dependability, describe methods of dependability enhancement by redundancy in space and redundancy in time, and perform a series of experiments to evaluate the availability of Web services. To increase the availability of the Web service, we use several replication schemes and compare them with a single service. The Web services are coordinated by a replication manager. The replication algorithm and the detailed system configuration are described in this paper.
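
A client-side sketch of the two enhancement methods mentioned, redundancy in time (retrying the same replica) and redundancy in space (failing over to another replica), is given below; the replica URLs and the invoke() placeholder are hypothetical, not the paper's experimental setup.

```python
import time

REPLICAS = ["http://ws-a.example/quote", "http://ws-b.example/quote"]

class ServiceUnavailable(Exception):
    pass

def invoke(url, request):
    """Placeholder for the actual SOAP/HTTP call to one replica."""
    raise ServiceUnavailable(url)            # pretend every call fails here

def dependable_call(request, retries=2, backoff=0.1):
    for url in REPLICAS:                     # redundancy in space
        for _ in range(retries):             # redundancy in time
            try:
                return invoke(url, request)
            except ServiceUnavailable:
                time.sleep(backoff)
    raise ServiceUnavailable("all replicas failed")

try:
    dependable_call({"symbol": "XYZ"})
except ServiceUnavailable as e:
    print("request failed after space/time redundancy:", e)
```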

Proceedings ArticleDOI
18 Oct 2006
TL;DR: The results show the consequences of internal component faults in several operational scenarios and provide empirical evidence that interface faults and software component faults have a different impact on the system.
Abstract: The injection of interface faults through API parameter corruption is a technique commonly used in experimental dependability evaluation. Although the interface faults injected by this approach can be considered a possible consequence of actual software faults in real applications, the question of whether the typical exceptional inputs and invalid parameters used in these techniques do represent the consequences of software bugs is largely an open issue. This question may not be an issue in the context of robustness testing aimed at the identification of weaknesses in software components. However, the use of interface faults by API parameter corruption as a general approach for dependability evaluation in component-based systems requires an in-depth study of interface faults and a close observation of the way internal component faults propagate to the component interfaces. In this paper, we present the results of an experimental evaluation of realistic component-based applications developed in Java and C, using the injection of interface faults by API parameter corruption and the injection of software faults inside the components by modification of the target code. The faults injected inside software components emulate typical programming errors and are based on an extensive field data study previously published. The results show the consequences of internal component faults in several operational scenarios and provide empirical evidence that interface faults and software component faults have a different impact on the system.
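
The sketch below illustrates the interface-fault-injection side of the comparison: a wrapper corrupts one parameter at a component's API boundary with typical operators (sign flip, boundary value, empty string, null). The target function and the specific operators are hypothetical, not those of the study.

```python
import functools

def corrupt(value):
    """Apply a typical parameter-corruption operator depending on the type."""
    if isinstance(value, bool):
        return not value
    if isinstance(value, int):
        return -value if value > 0 else 2**31 - 1   # sign flip / boundary value
    if isinstance(value, str):
        return ""                                   # empty string
    return None                                     # null reference

def inject_interface_fault(func, position=0):
    """Wrap a component API call so that one parameter is corrupted on entry."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        args = list(args)
        if position < len(args):
            args[position] = corrupt(args[position])
        return func(*args, **kwargs)
    return wrapper

# Hypothetical component API under test.
def transfer(amount, account):
    assert amount > 0, "robustness check triggered"
    return f"moved {amount} to {account}"

faulty_transfer = inject_interface_fault(transfer, position=0)
print(transfer(10, "acc-1"))
try:
    print(faulty_transfer(10, "acc-1"))   # 10 is corrupted to -10 at the interface
except AssertionError as e:
    print("failure mode observed:", e)
```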

Proceedings ArticleDOI
20 Apr 2006
TL;DR: A new approach to integrated security and dependability evaluation is presented, based on stochastic modelling techniques; it opens up the use of traditional Markov analysis to make new types of probabilistic predictions for a system, such as its expected time to security failure.
Abstract: We present a new approach to integrated security and dependability evaluation, which is based on stochastic modelling techniques. Our proposal aims to provide operational measures of the trustworthiness of a system, regardless of whether the underlying failure cause is intentional or not. By viewing system states as elements in a stochastic game, we can compute the probabilities of expected attacker behavior, and thereby model attacks as transitions between system states. The proposed game model is based on a reward and cost concept. A section of the paper is devoted to demonstrating how the expected attacker behavior is affected by the parameters of the game. Our model opens up the use of traditional Markov analysis to make new types of probabilistic predictions for a system, such as its expected time to security failure.

Book
27 Nov 2006
TL;DR: Train Systems, Fault-Tolerant Insulin Pump Therapy, and Programming-Logic Analysis of Fault Tolerance: Expected Performance of Self-stabilisation.
Abstract: Train Systems.- Train Systems.- Formalising Reconciliation in Partitionable Networks with Distributed Services.- The Fault-Tolerant Insulin Pump Therapy.- Reasoning About Exception Flow at the Architectural Level.- Are Practitioners Writing Contracts?.- Determining the Specification of a Control System: An Illustrative Example.- Achieving Fault Tolerance by a Formally Validated Interaction Policy.- F(I)MEA-Technique of Web Services Analysis and Dependability Ensuring.- On Specification and Verification of Location-Based Fault Tolerant Mobile Systems.- Formal Development of Mechanisms for Tolerating Transient Faults.- Separating Concerns in Requirements Analysis: An Example.- Rigorous Fault Tolerance Using Aspects and Formal Methods.- Rigorous Development of Fault-Tolerant Agent Systems.- Formal Service-Oriented Development of Fault Tolerant Communicating Systems.- Programming-Logic Analysis of Fault Tolerance: Expected Performance of Self-stabilisation.- Formal Analysis of the Operational Concept for the Small Aircraft Transportation System.- Towards a Method for Rigorous Development of Generic Requirements Patterns.- Rigorous Design of Fault-Tolerant Transactions for Replicated Database Systems Using Event B.- Engineering Reconfigurable Distributed Software Systems: Issues Arising for Pervasive Computing.- Position Papers.- Tools for Developing Large Systems (A Proposal).- Why Programming Languages Still Matter.

Journal ArticleDOI
TL;DR: Three methods are presented: verification of the controller alone, constraint-based verification, in which the plant is simply modeled as a set of physical constraints, and model-based verification, which relies on a detailed model of the plant.

Journal ArticleDOI
TL;DR: This paper gathers and reviews the main mechanisms that were developed to provide dependability to the FTT-CAN protocol, namely, master replication and fail-silence enforcement.
Abstract: The traditional approaches to the design of distributed safety-critical systems, due to fault-tolerance reasons, have mostly considered static cyclic table-based traffic scheduling. However, there is a growing demand for operational flexibility and integration, mainly to improve efficiency in the use of system resources, with the network playing a central role to support such properties. This calls for dynamic online traffic scheduling techniques so that dynamic communication requirements are adequately supported. Nevertheless, using dynamic traffic management mechanisms raises additional problems, in terms of fault-tolerance, related with the weaker knowledge of the future system state caused by the higher level of operational flexibility. Such problems have been recently addressed in the scope of using flexible time-triggered CAN (FTT-CAN) in safety-critical applications in order to benefit from the high operational flexibility of this protocol. This paper gathers and reviews the main mechanisms that were developed to provide dependability to the protocol, namely, master replication and fail-silence enforcement.

Journal ArticleDOI
TL;DR: The approach, a specialization of the peer-to-peer architectural style, hides inside the architectural elements the complexities of exception handling and propagation to improve a system's overall reliability and availability by making it tolerant of nonmalicious faults.
Abstract: A system's structure enables it to generate its intended behavior from its components' behavior. A well-structured system simplifies relationships among components, which can increase dependability. With software systems, the architecture is an abstraction of the structure. Architectural reasoning about dependability has become increasingly important because emerging applications are increasingly complex. We've developed an architectural approach for effectively representing and analyzing fault-tolerant software systems. The proposed solution relies on exception handling to tolerate faults associated with component and connector failures, architectural mismatches, and configuration faults. Our approach, a specialization of the peer-to-peer architectural style, hides inside the architectural elements the complexities of exception handling and propagation. Our goal is to improve a system's overall reliability and availability by making it tolerant of nonmalicious faults.
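
A minimal sketch of the underlying idea, with hypothetical component and connector names, is shown below: a wrapper converts a component's internal failures into a typed architectural exception, which the connector propagates to the configured recovery action, keeping the exception-handling complexity out of the application components.

```python
class ArchitecturalException(Exception):
    """Exception propagated between architectural elements."""

class PaymentComponent:
    def execute(self, order):
        raise ValueError("internal failure")      # simulated component fault

class FaultTolerantWrapper:
    """Hides local recovery and exception conversion inside the element."""
    def __init__(self, component, fallback):
        self.component, self.fallback = component, fallback

    def execute(self, order):
        try:
            return self.component.execute(order)
        except Exception:
            try:
                return self.fallback(order)       # local recovery (masking)
            except Exception as err:
                raise ArchitecturalException("payment unavailable") from err

class Connector:
    """Propagates architectural exceptions to the configured handler."""
    def __init__(self, target, on_failure):
        self.target, self.on_failure = target, on_failure

    def request(self, order):
        try:
            return self.target.execute(order)
        except ArchitecturalException as exc:
            return self.on_failure(exc)

def failing_fallback(order):
    raise RuntimeError("backup also failed")

def queue_for_retry(exc):
    return f"degraded mode: {exc}"

wrapped = FaultTolerantWrapper(PaymentComponent(), fallback=failing_fallback)
print(Connector(wrapped, queue_for_retry).request({"id": 1}))
```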

Journal ArticleDOI
TL;DR: This work explores how regression testing can be systematically applied at the software architecture level in order to reduce the cost of retesting modified systems, and also to assess the regression testability of the evolved system.

Journal Article
TL;DR: In this paper, the authors present a comprehensive fault hypothesis for safety-critical real-time computer systems, under the assumption that a distributed system node is expected to be a system-on-a-chip (SOC).
Abstract: A safety-critical real-time computer system must provide its services with a dependability that is much better than the dependability of any one of its constituent components. This challenging goal can only be achieved by the provision of fault tolerance. The design of any fault-tolerant system proceeds in four distinct phases. In the first phase the fault hypothesis is shaped, i.e. assumptions are made about the types and numbers of faults that must be tolerated by the planned system. In the second phase an architecture is designed that tolerates the specified faults. In the third phase the architecture is implemented and the functions and fault-tolerance mechanisms are validated. Finally, in the fourth phase it has to be confirmed experimentally that the assumptions contained in the fault-hypothesis are met by reality. The first part of this contribution focuses on the establishment of a comprehensive fault hypothesis for safety-critical real-time computer systems. The size of the fault containment regions, the failure mode of the fault containment regions, the assumed frequency of the faults and the assumptions about error detection latency and error containment are discussed under the premise that in future a distributed system node is expected to be a system-on-a-chip (SOC). The second part of this contribution focuses on the implications that such a fault hypothesis will have on the future architecture of distributed safety-critical real-time computer systems in the automotive domain.

Book ChapterDOI
01 Jan 2006
TL;DR: This chapter has argued that complex systems exhibit behaviour at many different time levels and that a useful aid in structuring, describing and specifying such behaviour is to use time bands.
Abstract: In this chapter we have argued that complex systems exhibit behaviour at many different time levels and that a useful aid in structuring, describing and specifying such behaviour is to use time bands. Viewing a system as a collection of activities within a finite set of bands is an effective means of separating concerns and identifying inconsistencies between different ‘layers’ of the system. Time bands are not mapped on to a single notion of physical time. Within a system there will always be a relation between bands but the bands need not be tightly synchronised. There is always some level of imprecision between any two adjacent bands. Indeed the imprecision may be large in social systems and be a source of dependability (robustness).

Journal ArticleDOI
TL;DR: A practical framework for eliciting and modeling dependability requirements devised to support and improve stakeholders' participation is suggested and an air traffic control system, adopted as testbed within the NASA High Dependability Computing Project, is used as a case study.

Proceedings ArticleDOI
24 Apr 2006
TL;DR: This work presents a systematic resource allocation approach for the consolidated mapping of safety critical and non-safety critical applications onto a distributed platform such that their operational delineation is maintained over integration.
Abstract: Mapping of software onto hardware elements under platform resource constraints is a crucial step in the design of embedded systems. As embedded systems are increasingly integrating both safety-critical and non-safety critical software functionalities onto a shared hardware platform, a dependability driven integration is desirable. Such an integration approach faces new challenges of mapping software components onto shared hardware resources while considering extra-functional (dependability, timing, power consumption, etc.) requirements of the system. Considering dependability and real-time as primary drivers, we present a systematic resource allocation approach for the consolidated mapping of safety critical and non-safety critical applications onto a distributed platform such that their operational delineation is maintained over integration. The objective of our allocation technique is to come up with a feasible solution satisfying multiple concurrent constraints. Ensuring criticality partitioning, avoiding error propagation and reducing interactions across components are addressed in our approach. In order to demonstrate the usefulness and effectiveness of the mapping, the developed approach is applied to an actual automotive system.
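
The toy sketch below (components, utilisations and platform are invented) illustrates two of the constraints mentioned, criticality partitioning and node capacity, by enumerating component-to-node mappings and keeping only the feasible ones; the actual approach handles many more extra-functional constraints.

```python
from itertools import product

components = {            # name: (cpu utilisation, safety-critical?)
    "brake_ctrl": (0.40, True),
    "steer_ctrl": (0.35, True),
    "telematics": (0.30, False),
    "infotain":   (0.50, False),
}
nodes = {"ecu1": 1.0, "ecu2": 1.0}   # capacity per node

def feasible(assign):
    """Check node capacity and criticality partitioning for one mapping."""
    for node, cap in nodes.items():
        members = [c for c, n in assign.items() if n == node]
        if sum(components[c][0] for c in members) > cap:
            return False                       # over capacity
        if len({components[c][1] for c in members}) > 1:
            return False                       # criticality levels mixed on one node
    return True

names = list(components)
solutions = [dict(zip(names, placement))
             for placement in product(nodes, repeat=len(names))
             if feasible(dict(zip(names, placement)))]
print(f"{len(solutions)} feasible mappings, e.g. {solutions[0]}")
```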