Proceedings ArticleDOI
DEFINE: a distributed fault injection and monitoring environment
Wei lun Kao, Ravishankar K. Iyer
- pp 252-259
TLDR
DEFINE can inject both hardware faults and software faults into any process running in a distributed system, either in user mode or in supervisor mode, and monitor the fault impact and propagation in software systems and among machines.
Abstract:
This paper presents a distributed fault injection and monitoring environment (DEFINE) as a tool to evaluate system dependability, to investigate fault propagation, and to validate fault-tolerant mechanisms. DEFINE can inject both hardware faults (hardware-induced software errors) and software faults into any process running in a distributed system, either in user mode or in supervisor mode, and monitor the fault impact and propagation in software systems and among machines. It employs two fault injection techniques: (i) using hardware clock interrupts to control the time of fault injection and activation, and (ii) using software traps to inject all the faults except communication faults and memory faults in the data/stack segment. Experiments on six Sun SPARCstations, which study system behavior under faults, demonstrate the application of DEFINE.
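The two techniques named in the abstract can be illustrated with a toy simulation. The sketch below is not DEFINE's implementation; it assumes a hypothetical accumulator machine in which a tick counter stands in for the hardware clock interrupt and a `TRAP` pseudo-instruction stands in for the software trap. The names `Fault`, `run`, and `PROGRAM` are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Fault:
    """Hypothetical fault description for the toy machine."""
    target_pc: int      # index of the instruction replaced by a trap
    activate_tick: int  # tick at or after which the fault becomes active
    bit: int            # operand bit flipped when the fault is active


# Toy program: acc = ((0 + 2) * 3) + 4
PROGRAM = [("ADD", 2), ("MUL", 3), ("ADD", 4)]


def run(program, fault=None, loops=1):
    """Execute the program `loops` times; return (accumulator, activations)."""
    code = list(program)
    saved = None
    if fault is not None:
        # Injection: save the original instruction and plant a trap in its place.
        saved = code[fault.target_pc]
        code[fault.target_pc] = ("TRAP", 0)

    acc = 0
    tick = 0
    activations = 0
    for _ in range(loops):
        for op, arg in code:
            tick += 1  # assumption: one simulated clock tick per instruction
            if op == "TRAP":
                # Trap handler: recover the original instruction, and corrupt
                # its operand only once the activation time has been reached.
                op, arg = saved
                if tick >= fault.activate_tick:
                    arg ^= 1 << fault.bit
                    activations += 1
            if op == "ADD":
                acc += arg
            elif op == "MUL":
                acc *= arg
            else:
                raise ValueError(f"unknown opcode: {op}")
    return acc, activations


golden = run(PROGRAM)                   # fault-free reference run
faulty = run(PROGRAM, Fault(1, 0, 0))   # fault active from the first tick
dormant = run(PROGRAM, Fault(1, 100, 0))  # trap planted, fault never activated
```

Comparing `faulty` against `golden` shows the fault's impact on the result, while `dormant` shows that a planted trap leaves behavior unchanged until the activation time is reached. Keeping the activation count separate from the final value is one simple way to distinguish injected faults from activated ones.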
Citations
Journal ArticleDOI
Xception: a technique for the experimental evaluation of dependability in modern computers
TL;DR: Experimental results are presented to demonstrate the accuracy and potential of Xception in evaluating the dependability properties of the complex computer systems available today.
Journal ArticleDOI
FERRARI: a flexible software-based fault and error injection system
TL;DR: The methodology and guidelines for the design of flexible software-based fault and error injection are described, and a tool, FERRARI, that incorporates these techniques is presented and shown to be effective in evaluating the dependability properties of complex systems.
Journal ArticleDOI
Dependability of COTS microkernel-based systems
TL;DR: A prototype environment, called MAFALDA (Microkernel Assessment by Fault injection AnaLysis and Design Aid), that is aimed at providing objective failure data on a candidate microkernel and also improving its error detection capabilities is described.
Journal ArticleDOI
DEPEND: a simulation-based environment for system level dependability analysis
TL;DR: The rationale for a functional simulation tool, called DEPEND, which provides an integrated design and fault injection environment for system-level dependability analysis, is presented, along with techniques developed to simulate realistic fault scenarios, reduce simulation time explosion, and handle the large fault model and component domain associated with system-level analysis.
Journal ArticleDOI
Assessing Dependability with Software Fault Injection: A Survey
TL;DR: This survey provides a comprehensive overview of the state of the art on Software Fault Injection to support researchers and practitioners in the selection of the approach that best fits their dependability assessment goals.
References
Proceedings ArticleDOI
FIAT-fault injection based automated testing environment
Zary Segall, D. Vrsalovic, Daniel P. Siewiorek, D. Yaskin, J. Kownacki, J. Barton, R. Dancey, Anna Robinson, T. Lin
TL;DR: An automated real-time distributed accelerated fault injection environment (FIAT) is presented as an attempt to provide suitable tools for the validation process and an example of fault tolerant systems such as checkpointing and duplicate and match is used to show its usefulness.
Journal ArticleDOI
Fault injection experiments using FIAT
TL;DR: FIAT is capable of emulating a variety of distributed system architectures and it provides the capabilities to monitor system behavior and inject faults for the purpose of experimental characterization and validation of a system's dependability.
Proceedings ArticleDOI
FERRARI: a tool for the validation of system dependability properties
TL;DR: FERRARI is a fault and error automatic real-time injector that can evaluate complex systems by emulating most hardware faults in software, including permanent faults and transient errors.
Journal ArticleDOI
FINE: A fault injection and monitoring environment for tracing the UNIX system behavior under faults
TL;DR: Experimental results show that memory and software faults usually have a very long latency, while bus and CPU faults tend to crash the system immediately. Markov reward analysis shows that the performance loss incurred by bus and CPU faults is much higher than that incurred by software and memory faults.
Proceedings ArticleDOI
Understanding large system failures-a fault injection experiment
R. Chillarege, N.S. Bowen
TL;DR: The idea of failure acceleration is introduced to conduct experiments that enhance the understanding of large system failures and provide a foundation for design enhancements and modeling of availability.