Open Access

User-Space Process Virtualization in the Context of Checkpoint-Restart and Virtual Machines

Kapil Arya
TLDR
This dissertation presents user-space process virtualization to decouple application processes from external subsystems. It uses an adaptive plugin-based approach to implement the virtualization layers, which allows the checkpoint-restart system to grow organically.
Abstract
Checkpoint-Restart is the ability to save a set of running processes to a checkpoint image on disk, and to later restart them from the disk. In addition to its traditional use in fault tolerance (recovering from a system failure), it has numerous other uses, such as application debugging and save/restore of the workspace of an interactive problem-solving environment. Transparent checkpointing operates without modifying the underlying application program, but it implicitly relies on a “Closed World Assumption”: the world (including file system, network, etc.) will look the same upon restart as it did at the time of checkpoint. This assumption does not hold for more complex programs. Until now, checkpoint-restart packages have adopted ad hoc solutions for each case where the environment changes upon restart. This dissertation presents user-space process virtualization to decouple application processes from the external subsystems. A thin virtualization layer is introduced between the application and each external subsystem. It provides the application with a consistent view of the external world and allows checkpoint-restart to succeed. The ever-growing number of external subsystems makes it harder to deploy and maintain virtualization layers in a monolithic checkpoint-restart system. To address this, an adaptive plugin-based approach is used to implement the virtualization layers, which allows the checkpoint-restart system to grow organically. The principle of decoupling the external subsystem through process virtualization is also applied in the context of virtual machines to provide a solution to the long-standing double-paging problem. Double-paging occurs when the guest attempts to page out memory that has previously been swapped out by the hypervisor. This leads to long delays for the guest, as the contents are read back into machine memory only to be written out again. Performance drops rapidly because the time to complete the guest I/O request lengthens significantly.
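
The idea of a thin virtualization layer implemented as a plugin can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the class names (`VirtualIdTable`, `Plugin`, `PidPlugin`, `Coordinator`), the hook names (`on_checkpoint`, `on_restart`), and the fake process spawner are all hypothetical and chosen only for illustration. The sketch assumes the application only ever sees stable virtual ids, while the layer rebinds them to whatever real ids the environment hands out after restart.

```python
from typing import Callable, Dict, List


class VirtualIdTable:
    """Maps stable virtual ids (seen by the application) to real ids."""

    def __init__(self) -> None:
        self._virt_to_real: Dict[int, int] = {}
        self._next_virt = 1000  # arbitrary start value, an assumption

    def register(self, real_id: int) -> int:
        """Allocate a new virtual id for a real id and return it."""
        virt = self._next_virt
        self._next_virt += 1
        self._virt_to_real[virt] = real_id
        return virt

    def to_real(self, virt_id: int) -> int:
        return self._virt_to_real[virt_id]

    def rebind(self, virt_id: int, new_real_id: int) -> None:
        """Point an existing virtual id at a new real id."""
        self._virt_to_real[virt_id] = new_real_id

    def virtual_ids(self) -> List[int]:
        return list(self._virt_to_real)


class Plugin:
    """Hypothetical plugin interface: one plugin per external subsystem."""

    name = "base"

    def on_checkpoint(self) -> dict:
        return {}

    def on_restart(self, saved: dict) -> None:
        pass


class PidPlugin(Plugin):
    """Virtualizes process ids; `spawn` stands in for the real subsystem."""

    name = "pid"

    def __init__(self, spawn: Callable[[], int]) -> None:
        self.table = VirtualIdTable()
        self._spawn = spawn

    def create_process(self) -> int:
        # The application receives only the virtual id.
        return self.table.register(self._spawn())

    def on_checkpoint(self) -> dict:
        return {"virtual_ids": self.table.virtual_ids()}

    def on_restart(self, saved: dict) -> None:
        # Real ids differ after restart, so rebind each virtual id.
        for virt in saved["virtual_ids"]:
            self.table.rebind(virt, self._spawn())


class Coordinator:
    """Drives every registered plugin at checkpoint and restart time."""

    def __init__(self, plugins: List[Plugin]) -> None:
        self.plugins = plugins

    def checkpoint(self) -> dict:
        return {p.name: p.on_checkpoint() for p in self.plugins}

    def restart(self, image: dict) -> None:
        for p in self.plugins:
            p.on_restart(image[p.name])


def make_fake_spawner(start: int) -> Callable[[], int]:
    """Return a callable that yields increasing fake 'real' ids."""
    state = {"next": start}

    def spawn() -> int:
        real = state["next"]
        state["next"] += 1
        return real

    return spawn


if __name__ == "__main__":
    plugin = PidPlugin(make_fake_spawner(500))
    coordinator = Coordinator([plugin])
    vpid = plugin.create_process()
    print("before:", vpid, "->", plugin.table.to_real(vpid))
    coordinator.restart(coordinator.checkpoint())
    print("after: ", vpid, "->", plugin.table.to_real(vpid))
```

In this sketch, supporting a new external subsystem means writing another `Plugin` subclass and adding it to the coordinator's list, leaving the core checkpoint and restart loop untouched. That is one way to read the abstract's claim that a plugin-based design lets the system grow without a monolithic core.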

