Proceedings ArticleDOI
Real-Time Distributed Discrete-Event Execution with Fault Tolerance
Thomas Huining Feng, Edward A. Lee
- pp 205-214
TLDR
This paper takes a program transformation approach to automatically enhance DE models with incremental checkpointing and state recovery functionality and incorporates this mechanism into PTIDES for efficient execution of fault-tolerant real-time distributed DE systems.
Abstract:
We build on PTIDES, a programming model for distributed embedded systems that uses discrete-event (DE) models as program specifications. PTIDES improves on distributed DE execution by allowing more concurrent event processing without backtracking. This paper discusses the general execution strategy for PTIDES, and provides two feasible implementations. This execution strategy is then extended with tolerance for hardware errors. We take a program transformation approach to automatically enhance DE models with incremental checkpointing and state recovery functionality. Our fault tolerance mechanism is lightweight and has low overhead. It requires very little human intervention. We incorporate this mechanism into PTIDES for efficient execution of fault-tolerant real-time distributed DE systems.
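The abstract's idea of incremental checkpointing with state recovery can be illustrated by a minimal Python sketch. This is not the paper's program transformation or the PTIDES implementation. The class `CheckpointedState` and its methods are hypothetical names chosen for illustration. The sketch assumes that state is a flat set of named fields and that a checkpoint only needs to remember the fields written since the previous checkpoint.

```python
class CheckpointedState:
    """Hypothetical helper: keeps an undo record only for fields
    modified since the last checkpoint (incremental checkpointing)."""

    _MISSING = object()  # marks a field that did not exist at checkpoint time

    def __init__(self, **fields):
        self._fields = dict(fields)
        self._undo = {}  # field name -> value it had at the last checkpoint

    def get(self, name):
        return self._fields[name]

    def set(self, name, value):
        # Record the old value only on the first write after a checkpoint,
        # so the cost is proportional to the number of fields touched.
        if name not in self._undo:
            self._undo[name] = self._fields.get(name, self._MISSING)
        self._fields[name] = value

    def checkpoint(self):
        # Commit the current state: earlier undo records are no longer needed.
        self._undo.clear()

    def rollback(self):
        # Restore every field touched since the last checkpoint.
        for name, old in self._undo.items():
            if old is self._MISSING:
                self._fields.pop(name, None)
            else:
                self._fields[name] = old
        self._undo.clear()


# Usage: an error is detected after two writes, so the state is recovered.
state = CheckpointedState(count=0, total=0.0)
state.set("count", 1)
state.checkpoint()        # state is now {count: 1, total: 0.0}
state.set("count", 2)
state.set("total", 5.0)
state.rollback()          # back to the checkpointed values
```

Recording old values on write, rather than copying the whole state at each checkpoint, is one common way to keep checkpoint overhead low; whether the paper uses exactly this scheme is not stated in the abstract.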
Citations
Journal ArticleDOI
The past, present and future of cyber-physical systems: a focus on models.
TL;DR: Two projects show that deterministic CPS models with faithful physical realizations are possible and practical, and that the timing precision of synchronous digital logic can be practically made available at the software level of abstraction.
Proceedings ArticleDOI
CPS foundations
TL;DR: It is argued that cyber-physical systems present a substantial intellectual challenge that requires changes in both theories of computation and dynamical systems theory, and demands models that embrace both.
Book ChapterDOI
A Provenance-Based Fault Tolerance Mechanism for Scientific Workflows
Daniel Crawl, Ilkay Altintas
TL;DR: This paper presents a method for capturing data value- and control- dependencies for provenance information collection in the Kepler scientific workflow system and describes how the collected information based on these dependencies could be used for a fault tolerance framework in different models of computation.
Proceedings ArticleDOI
Execution Strategies for PTIDES, a Programming Model for Distributed Embedded Systems
TL;DR: This paper first defines a general execution strategy that conforms to the DE semantics, and then specializes this strategy to give practical, implementable and distributed policies.
ReportDOI
PTIDES: A Programming Model for Distributed Real-Time Embedded Systems
Patricia Derler, Thomas Huining Feng, Edward A. Lee, Slobodan Matic, Hiren D. Patel, Yang Zhao, Jia Zou
TL;DR: This report presents an execution strategy that is aggressive in concurrent execution of events, allows independent events to be processed out of time stamp order, and uses clock synchronization as a replacement for null message communication across distributed platforms.
References
Journal ArticleDOI
Virtual time
TL;DR: Virtual time is a new paradigm for organizing and synchronizing distributed systems which can be applied to such problems as distributed discrete event simulation and distributed database concurrency control.
Journal ArticleDOI
A survey of rollback-recovery protocols in message-passing systems
TL;DR: This survey covers rollback-recovery techniques that do not require special language constructs and distinguishes between checkpoint-based protocols, which rely solely on checkpointing for system state restoration, and log-based protocols.
Journal ArticleDOI
System structure for software fault tolerance
TL;DR: In this article, the authors present a method for structuring complex computing systems by the use of what they term "recovery blocks", "conversations", and "fault-tolerant interfaces".
Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems
Kang B. Lee, J. Eidson
TL;DR: A protocol is provided in this standard that enables precise synchronization of clocks in measurement and control systems implemented with technologies such as network communication, local computing, and distributed objects.
Journal ArticleDOI
Synchronization and Linearity: An Algebra for Discrete Event Systems
TL;DR: This book proposes a unified mathematical treatment of a class of 'linear' discrete event systems, which contains important subclasses of Petri nets and queuing networks with synchronization constraints, which is shown to parallel the classical linear system theory in several ways.
Related Papers (5)
A Programming Model for Time-Synchronized Distributed Real-Time Systems
Yang Zhao, Jie Liu, Edward A. Lee