
Showing papers on "Rollback published in 1991"


Patent
26 Nov 1991
TL;DR: An improved concurrency control system for a distributed concurrent transaction and query processing system uses multi-version database records to overcome delays arising from lock conflicts. At least two database versions are required, although availability of more versions permits long read-only queries to phase out over time without forcing new queries to use aged "stable-state" data and without roll-back.
Abstract: An improved concurrency control system for application to a distributed concurrent transaction and query processing system using multi-version database records to overcome delays arising from lock conflicts. Read-only queries are afforded a consistent "stable state" of the database during the life of the query. Updating transactions requiring locks can proceed without waiting for the termination of long queries. At least two database versions are necessary, although availability of more versions permits long read-only queries to phase-out over time without forcing new queries to use aged "stable-state" data and without roll-back. Read-only queries can be terminated and converted to locking transactions to permit an update of the "stable state" database version before the queries would normally terminate. A novel record key structure having a plurality of substructures corresponding to the several database versions is used to access database records. Rapid selection of proper record version and efficient version tracking and updating is effected using several bit-mapped transaction index tables.

207 citations


Journal ArticleDOI
TL;DR: A general model of rollback in parallel processing is presented; a tunable algorithm, Filtered Rollback, is given that is designed to avoid the failure modes; and a rigorous mathematical proof is provided that Filtered Rollback is efficient if implemented on a reasonably efficient multiprocessor.
Abstract: We present and analyze a general model of rollback in parallel processing. The analysis points out three possible modes where rollback may become excessive; we provide an example of each type. We identify the parameters that determine a stability, or efficiency, region for the simulation. Our analysis suggests the possibility of a dangerous “phase-transition” from stability to instability in the parameter space. In particular, a rollback algorithm may work efficiently for a small system but become inefficient for a large system. Moreover, for a given system, it may work quickly for a while and then suddenly slow down. On the positive side, we give a tunable algorithm, Filtered Rollback, that is designed to avoid the failure modes. Under appropriate assumptions, we provide a rigorous mathematical proof that Filtered Rollback is efficient, if implemented on a reasonably efficient multiprocessor. In particular, we show that the average time r to complete the simulation of a system with N nodes and R events on a p-processor PRAM satisfies

100 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider the use of massively parallel architectures to execute discrete-event simulations of self-initiating models and derive upper and lower bounds on optimal performance.
Abstract: This paper considers the use of massively parallel architectures to execute discrete-event simulations of what we term “self-initiating” models. A logical process in a self-initiating model schedules its own state reevaluation times, independently of any other logical process, and sends its new state to other logical processes following the reevaluation. Our interest is in the effects of that communication on synchronization. Using a model that idealizes the communication topology of a simulation, we consider the performance of various synchronization protocols by deriving upper and lower bounds on optimal performance, upper bounds on Time Warp's performance, and lower bounds on the performance of a new conservative protocol. Our analysis of Time Warp includes some of the overhead costs of state saving and rollback; the effects of propagating rollbacks are ignored. The analysis points out sufficient conditions for the conservative protocol to outperform Time Warp. The analysis also quantifies the sensitivity of performance to message fanout, lookahead ability, and the probability distributions underlying the simulation.

91 citations


Journal ArticleDOI
TL;DR: The goal is to design an efficient memory management protocol which guarantees that the memory consumption of parallel simulation is of the same order as sequential simulation, and an optimal algorithm called artificial rollback is proposed.
Abstract: Recently there has been a great deal of interest in performance evaluation of parallel simulation. Most work is devoted to the time complexity and assumes that the amount of memory available for parallel simulation is unlimited. This paper studies the space complexity of parallel simulation. Our goal is to design an efficient memory management protocol which guarantees that the memory consumption of parallel simulation is of the same order as sequential simulation. (Such an algorithm is referred to as optimal.) First, we derive the relationships among the space complexities of sequential simulation, Chandy-Misra simulation [2], and Time Warp simulation [7]. We show that Chandy-Misra may consume more storage than sequential simulation, or vice versa. Then we show that Time Warp never consumes less memory than sequential simulation. Then we describe cancelback, an optimal Time Warp memory management protocol proposed by Jefferson. Although cancelback is considered to be a complete solution for the storage management problem in Time Warp, some efficiency issues in implementing this algorithm must be considered. We propose an optimal algorithm called artificial rollback. We show that this algorithm is easy to implement and analyze. An implementation of artificial rollback is given, which is integrated with processor scheduling to adjust the memory consumption rate based on the amount of free storage available in the system.

82 citations


Journal ArticleDOI
TL;DR: Results indicate that message preemption has a significant effect on performance when (1) the processor is highly utilized, (2) the execution times of messages have high variance, and (3) rollbacks occur frequently.
Abstract: The Time Warp “optimistic” approach is one of the most important parallel simulation protocols. Time Warp synchronizes processes via rollback. The original rollback mechanism is called aggressive cancellation; a later variant called lazy cancellation has aroused great interest. This paper studies these rollback mechanisms. The general tradeoffs between aggressive and lazy cancellation are discussed, and a conservative-optimal simulation is defined for comparative purposes. Within the framework of aggressive cancellation, we offer some observations and analyze the rollback behavior of tandem systems. The lazy cancellation mechanism is examined using a metric called the sensitivity of output message. Both aggressive and lazy cancellation are shown to work well for a process with a small simulated load intensity. Finally, an analytical model is given to analyze message preemption, an important factor that affects the performance of rollback mechanisms. Results indicate that message preemption has a significant effect on performance when (1) the processor is highly utilized, (2) the execution times of messages have high variance, and (3) rollbacks occur frequently.

72 citations
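
The difference between the two cancellation policies named in this abstract can be illustrated with a small sketch. It is not the paper's model: the `Message` type, the function names, and the convention that everything sent at or after the rollback time is undone are assumptions made for illustration. Under aggressive cancellation every undone message is cancelled at once; under lazy cancellation a message is cancelled only if re-execution fails to reproduce it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    send_time: int  # virtual time at which the message was sent (hypothetical field)
    dest: str
    payload: str


def aggressive_cancel(sent, rollback_time):
    """Cancel every message sent at or after rollback_time immediately.

    Returns (kept, antimessages).
    """
    kept = [m for m in sent if m.send_time < rollback_time]
    anti = [m for m in sent if m.send_time >= rollback_time]
    return kept, anti


def lazy_cancel(sent, rollback_time, regenerated):
    """Cancel only rolled-back messages that re-execution did not reproduce.

    `regenerated` holds the messages produced when the process re-executes
    forward from rollback_time. Returns (kept, antimessages).
    """
    regenerated = set(regenerated)
    kept = [m for m in sent if m.send_time < rollback_time or m in regenerated]
    anti = [m for m in sent
            if m.send_time >= rollback_time and m not in regenerated]
    return kept, anti


# Example: a rollback to virtual time 10 after three messages were sent.
sent = [Message(5, "B", "x"), Message(12, "C", "y"), Message(15, "B", "z")]
# Re-execution reproduces the message at time 12 but changes the one at time 15.
regenerated = [Message(12, "C", "y"), Message(15, "B", "w")]

aggressive_anti = aggressive_cancel(sent, 10)[1]      # cancels both later messages
lazy_anti = lazy_cancel(sent, 10, regenerated)[1]     # cancels only the changed one
```

In this example lazy cancellation sends one antimessage where aggressive cancellation sends two, at the cost of waiting for re-execution before the incorrect message is cancelled.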


Proceedings ArticleDOI
01 Dec 1991
TL;DR: Any consistent global state of the computation can be restored; execution can be replayed either exactly as it occurred initially or with user-controlled variations; there is no need to know a priori what states might be of interest.
Abstract: We present a mechanism for restoring any consistent global state of a distributed computation. This capability can form the basis of support for rollback and replay of computations, an activity we view as essential in a comprehensive environment for debugging distributed programs. Our mechanism records occasional state checkpoints and logs all messages communicated between processes. Our mechanism offers flexibility in the following ways: any consistent global state of the computation can be restored; execution can be replayed either exactly as it occurred initially or with user-controlled variations; there is no need to know a priori what states might be of interest. In addition, if checkpoints and logs are written to stable storage, our mechanism can be used to restore states of computations that cause the system to crash.

62 citations
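
The combination of occasional checkpoints and a complete message log described in this abstract can be sketched for a single deterministic process. This is a simplified illustration, not the authors' mechanism: the class `LoggedProcess`, its toy state, and its method names are hypothetical, and coordinating several processes into a consistent global state is left out.

```python
import copy


class LoggedProcess:
    """Toy deterministic process whose state is a running total of received integers."""

    def __init__(self):
        self.state = {"total": 0, "received": 0}
        self.log = []          # every delivered message, in delivery order
        self.checkpoints = []  # (messages delivered so far, copy of the state)

    @staticmethod
    def _apply(state, value):
        state["total"] += value
        state["received"] += 1

    def deliver(self, value):
        self.log.append(value)
        self._apply(self.state, value)

    def checkpoint(self):
        self.checkpoints.append((len(self.log), copy.deepcopy(self.state)))

    def restore(self, n_delivered):
        """Rebuild the state as it was after n_delivered messages."""
        # Start from the latest checkpoint taken at or before the target point,
        # or from the initial state if no such checkpoint exists.
        base_n, base_state = 0, {"total": 0, "received": 0}
        for n, saved in self.checkpoints:
            if base_n <= n <= n_delivered:
                base_n, base_state = n, saved
        state = copy.deepcopy(base_state)
        # Replay the logged messages between the checkpoint and the target.
        for value in self.log[base_n:n_delivered]:
            self._apply(state, value)
        return state


proc = LoggedProcess()
proc.deliver(1)
proc.deliver(2)
proc.checkpoint()
proc.deliver(3)
proc.deliver(4)
state_after_three = proc.restore(3)  # checkpoint at 2 messages, then replay one
```

Because every message is logged, any point in the delivery sequence can be reached, not only the points at which a checkpoint happened to be taken.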


Proceedings ArticleDOI
25 Feb 1991
TL;DR: An environment designed to support activities such as a purchase order is described, which consists of a system call interface which programs can use to request services to support some of the important requirements of data processing activities.
Abstract: The authors describe an environment designed to support activities such as a purchase order. They propose a simple set of services which would be useful for describing and executing activities. In an implementation, an underlying system would provide these services for activities, much as an operating system provides a set of services for processes. The environment consists of a system call interface (Create, Bind, Commit, Abort, CompensationBind, Send, Receive) which programs can use to request services. These services are designed to support some of the important requirements of data processing activities, including concurrency, modularity, fault tolerance, rollback, and communication.

60 citations


Patent
Kobayashi Seiichi
23 May 1991
TL;DR: In this paper, a distributed transaction processing system of a two-phase commit scheme is presented, where a client sequentially requests all the servers to perform PHASE I processing, and the client stores data indicating the completion of the processing.
Abstract: In a distributed transaction processing system of a two-phase commit scheme, a client sequentially requests all the servers to perform PHASE I processing. When all the servers complete the PHASE I processing, the client stores data indicating the completion of the processing. When operation is restarted after a given server has gone down, the server inquires of the client whether all the servers have completed the PHASE I processing. If all the servers have completed the PHASE I processing, the server executes PHASE II processing. If not all the servers have completed the PHASE I processing, the server in which the failure occurred, causing abnormal system termination, performs rollback processing, and the client requests the other servers that have completed the PHASE I processing to perform rollback processing.

59 citations
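
The restart rule in this abstract can be written as a small decision function. This is an illustrative sketch only: the function `recover_server`, its parameters, and the action labels `"phase2"` and `"rollback"` are hypothetical and do not come from the patent.

```python
def recover_server(restarted, phase1_done, client_recorded_completion):
    """Decide what each server does after `restarted` comes back up.

    restarted: name of the server that went down and restarted.
    phase1_done: set of servers known to have completed PHASE I.
    client_recorded_completion: True if the client stored the record saying
        that all servers completed PHASE I.
    Returns a dict mapping server name to "phase2" or "rollback".
    """
    if client_recorded_completion:
        # Every server finished PHASE I, so the restarted server can go on
        # to PHASE II.
        return {restarted: "phase2"}
    # Otherwise the restarted server rolls back, and the client asks every
    # other server that finished PHASE I to roll back as well.
    actions = {restarted: "rollback"}
    for server in phase1_done:
        if server != restarted:
            actions[server] = "rollback"
    return actions


all_prepared = recover_server("s2", {"s1", "s2", "s3"}, True)
partly_prepared = recover_server("s2", {"s1"}, False)
```

The only durable fact the restarted server needs is the client's completion record, which is why the client stores it as soon as the last server finishes PHASE I.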


Journal ArticleDOI
TL;DR: The behavior of n interacting processors synchronized by the Time Warp protocol is analyzed using a discrete-state, continuous-time Markov chain model and the results have been validated through performance measurements of a Time Warp testbed executing on a shared-memory multiprocessor.
Abstract: The behavior of n interacting processors synchronized by the Time Warp protocol is analyzed using a discrete-state, continuous-time Markov chain model. The performance and dynamics of the processes (or processors) are analyzed under the following assumptions: exponential task times and timestamp increments on messages, each event message generates one new message that is sent to a randomly selected process, negligible rollback, state saving, and communication delay, unbounded message buffers, and homogeneous processors. Several performance measures are determined, such as: the fraction of processed events that commit, speedup, rollback probability, expected length of rollback, the probability mass function for the number of uncommitted processed events, the probability distribution function for the virtual time of a process, and the fraction of time the processors remain idle. The analysis is approximate, thus the results have been validated through performance measurements of a Time Warp testbed executing on a shared-memory multiprocessor.

54 citations


Proceedings Article
01 Aug 1991
TL;DR: The results indicate that the cache-based schemes can provide checkpointing capability with low performance overhead but uncontrollable high variability in the checkpoint interval.
Abstract: Several variations of cache-based checkpointing for rollback error recovery in shared-memory multiprocessors have been recently developed. By modifying the cache replacement policy, these techniques use the inherent redundancy in the memory hierarchy to periodically checkpoint the computation state. Three schemes, different in the manner in which they avoid rollback propagation, are evaluated. By simulation with address traces from parallel applications running on an Encore Multimax shared-memory multiprocessor, the performance effect of integrating the recovery schemes in the cache coherence protocol is evaluated. The results indicate that the cache-based schemes can provide checkpointing capability with low performance overhead but uncontrollable high variability in the checkpoint interval.

30 citations


Proceedings ArticleDOI
25 Jun 1991
TL;DR: It is shown that micro rollback can be implemented in a practical VLSI chip and is a practical technique for minimizing the latencies normally associated with concurrent error detection.
Abstract: The design and implementation of a RISC microprocessor, called the UCLA mirror processor, which is capable of micro rollback, are reported. Two mirror processors operating in lock step achieve concurrent error detection by comparing external signals and a signature of internal signals every clock cycle. A mismatch causes both processors to roll back to the beginning of the cycle in which the error occurred. In some cases an erroneous state is corrected by copying a value from the fault-free processor to the faulty processor. The architecture, microarchitecture, and VLSI implementation of the mirror processor, with an emphasis on its error-detection and error-recovery capabilities, are described. The overhead and design issues encountered are evaluated. It is shown that micro rollback can be implemented in a practical VLSI chip and is a practical technique for minimizing the latencies normally associated with concurrent error detection.

Proceedings ArticleDOI
01 Jun 1991
TL;DR: The problem of automated synthesis of a self-recovery chip using micro-roll-back and checkpoint insertion techniques is discussed and an efficient solution is proposed; the proposed checkpointing algorithm will allow the system to recover from most transient faults.
Abstract: The problem of automated synthesis of a self-recovery chip using micro-roll-back and checkpoint insertion techniques is discussed, and an efficient solution is proposed. An efficient design of micro-roll-back and checkpoint insertion can be achieved by considering them during the scheduling and allocation steps. The rollback and recovery scheme is designed to satisfy the constraints of the available number of registers and the maximum allowable recovery time. The proposed checkpointing (rollback point) algorithm will allow the system to recover from most transient faults.

Proceedings ArticleDOI
08 Apr 1991
TL;DR: A database concurrency control object called ROLL (request order linked list), which is a linked list of bit vectors, is introduced together with three simple operations available to transactions: POST, CHECK and RELEASE.
Abstract: A database concurrency control object called ROLL (request order linked list), which is a linked list of bit vectors, is introduced together with three simple operations available to transactions: POST, CHECK and RELEASE. POST is used to establish serialization order. CHECK is used to determine current resource availability. RELEASE is used to relinquish resources. ROLL is based on the serialization graph testing method, but no system scheduler module is involved. Using ROLL, waiting, restarting, deadlock and livelock are minimized and almost all operations can be invoked in parallel by individual transaction manager modules. The ROLL object, performance, problems and desirable extensions are discussed.
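
One way to picture the three operations named in this abstract is the following sketch. The abstract does not give their exact semantics, so everything here is an assumption made for illustration: the class `Roll`, the use of Python integers as bit vectors, the use of a Python list in place of a linked list, and the rule that a resource is available to a transaction when no earlier entry in the list still requests it.

```python
class Roll:
    """Ordered list of per-transaction request bit vectors (ints used as bit sets)."""

    def __init__(self):
        self.entries = []  # [txn_id, bitmask] pairs, in POST order

    def post(self, txn_id, resources):
        """Append the transaction's request vector, fixing its place in the order."""
        mask = 0
        for r in resources:
            mask |= 1 << r
        self.entries.append([txn_id, mask])

    def check(self, txn_id):
        """Return the requested resources that no earlier entry still claims."""
        earlier = 0
        for tid, mask in self.entries:
            if tid == txn_id:
                free = mask & ~earlier
                return {r for r in range(free.bit_length()) if (free >> r) & 1}
            earlier |= mask
        raise KeyError(txn_id)

    def release(self, txn_id, resources):
        """Clear the given resource bits; drop entries with nothing left to request."""
        for entry in self.entries:
            if entry[0] == txn_id:
                for r in resources:
                    entry[1] &= ~(1 << r)
        self.entries = [e for e in self.entries if e[1]]


roll = Roll()
roll.post("T1", [0, 1])
roll.post("T2", [1, 2])
before_release = roll.check("T2")  # resource 1 is still claimed by T1
roll.release("T1", [1])
after_release = roll.check("T2")   # resource 1 is now available to T2
```

In this sketch a transaction never blocks inside the object; it calls `check` again after other transactions release, which matches the abstract's point that no central scheduler module is involved.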

Journal ArticleDOI
TL;DR: This work has analyzed the behavior of a new deadlock-free two-phase locking mechanism, called Cautious Waiting, and has compared its performance with other well-known mechanisms and shown that this mechanism performs consistently better under a wide range of parameter values than the other mechanisms investigated here.

Proceedings ArticleDOI
J.T. Lim, Songchun Moon
20 May 1991
TL;DR: The proposed algorithm never enforces termination of normal operations of transactions and changes of checkpointing algorithms in local database systems, so the global checkpoints generated by the algorithm can be used to reconstruct the previous consistent states of a database efficiently.
Abstract: For efficient construction of the distributed database from media failure, a transaction-consistent checkpointing algorithm is proposed for heterogeneous distributed database systems. For full design autonomy and increased availability on the heterogeneous distributed database systems, the proposed algorithm never enforces termination of normal operations of transactions and changes of checkpointing algorithms in local database systems. The global checkpoints generated by the algorithm can be used to reconstruct the previous consistent states of a database efficiently.

Proceedings ArticleDOI
02 Apr 1991
TL;DR: The behavior of n interacting processors synchronized by the "Time Warp" protocol is analyzed using a discrete state continuous time Markov chain model to determine the fraction of processed events that commit, speedup, rollback probability, expected length of rollback, and the probability distribution function for the virtual time of a process.
Abstract: The behavior of n interacting processors synchronized by the "Time Warp" protocol is analyzed using a discrete state continuous time Markov chain model. The performance and dynamics of the processes are analyzed under the following assumptions: exponential task times and timestamp increments on messages, each event message generates one new message that is sent to a randomly selected process, negligible rollback, state saving, and communication delay, unbounded message buffers, and homogeneous processors that are never idle. We determine the fraction of processed events that commit, speedup, rollback probability, expected length of rollback, the probability mass function for the number of uncommitted processed events, and the probability distribution function for the virtual time of a process. The analysis is approximate, so the results have been validated through performance measurements of a Time Warp testbed (PHOLD workload model) executing on a shared memory multiprocessor.

Proceedings ArticleDOI
M.J. Iacoponi
14 Oct 1991
TL;DR: The hardware macro-rollback technique presented has been implemented in the advanced fault-tolerant data processor (AFTDP), which is a high-performance fault-tolerant shared memory multiprocessor.
Abstract: Rollback recovery offers an efficient method of recovering from transient faults and permanent faults when rollback is combined with spare resource reconfiguration. The hardware macro-rollback technique presented has been implemented in the advanced fault-tolerant data processor (AFTDP), which is a high-performance fault-tolerant shared memory multiprocessor. The architecture discussion focuses on the unique problems of achieving both low overhead and fast recovery in high-throughput cached multiprocessors. Macro-rollback recovery from transient faults and hard macro-rollback from permanent faults are examined. In addition, deadline analysis based on a semi-Markov model is presented.

Proceedings ArticleDOI
01 Aug 1991
TL;DR: In this article, the authors present a technique that embeds the support for checkpoint and rollback recovery into the memory translation hardware, which can be implemented on various scopes of data such as a portion of an address space, a single address space or multiple address spaces.
Abstract: Checkpoint and rollback recovery is a technique that allows a system to tolerate a failure by periodically saving the entire state and, if an error occurs, rolling back to the prior checkpoint. This technique is particularly suited to applications with long execution times such as those typically found in supercomputer environments. This paper presents a technique that embeds the support for checkpoint and rollback recovery directly into the virtual memory translation hardware. The scheme is general enough to be implemented on various scopes of data such as a portion of an address space, a single address space or multiple address spaces. A basic model is developed which measures the amount of work required by the scheme as a function of the checkpoint interval size. Using this model the degree to which the overhead decreases as the interval size increases is shown.
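
A software analogy may help show why the work per checkpoint interval can shrink relative to the interval's length. The sketch below is not the paper's hardware scheme or its model: the class `CheckpointedMemory`, the page granularity, and the save-on-first-write policy are assumptions chosen for illustration. Each page is saved at most once per interval, so repeated writes to the same page within a longer interval add no further saving work.

```python
class CheckpointedMemory:
    """Toy paged memory that saves a page the first time it is written in an interval."""

    def __init__(self, n_pages):
        self.pages = [0] * n_pages
        self.saved = {}  # page index -> contents at the last checkpoint
        self.copies = 0  # total pages saved so far (the work being counted)

    def write(self, page, value):
        if page not in self.saved:
            # First write to this page since the last checkpoint: keep the old copy.
            self.saved[page] = self.pages[page]
            self.copies += 1
        self.pages[page] = value

    def checkpoint(self):
        """Commit the current contents; the saved copies are no longer needed."""
        self.saved.clear()

    def rollback(self):
        """Restore every page modified since the last checkpoint."""
        for page, old in self.saved.items():
            self.pages[page] = old
        self.saved.clear()


mem = CheckpointedMemory(4)
mem.write(0, 1)
mem.checkpoint()
mem.write(0, 2)
mem.write(0, 3)  # same page, same interval: no additional copy
mem.write(1, 9)
mem.rollback()   # contents return to the state at the checkpoint
```

Here four writes cause only three page saves, and the rollback touches only the two pages modified after the checkpoint.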

Proceedings ArticleDOI
28 Aug 1991
TL;DR: A solution is proposed by adopting transaction classification and a mixed concurrency control to provide an efficient technique for management of real time transactions.
Abstract: Real time systems are characterised by the existence of transactions that need to complete execution within a limited time. Because of their special nature, these transactions cause scheduling difficulties. Traditional concurrency control methods are unsatisfactory for their management. In this study we propose a method to handle real time transactions efficiently. We propose a solution by adopting transaction classification and a mixed concurrency control to provide an efficient technique for management of real time transactions. Extension to the case of a multitude of real time transactions has also been considered.

Proceedings ArticleDOI
27 Mar 1991
TL;DR: A variation of the time warp protocol for efficient parallel logic simulation on a massively parallel SIMD machine is proposed and an immediate cancellation technique of the rollback mechanism of time warp is proposed which works well for the single queue structure.
Abstract: A variation of the time warp protocol for efficient parallel logic simulation on a massively parallel SIMD machine is proposed. A scheme in which each object has a single event queue instead of three queues needed in the conventional time warp protocol is developed for fast queue manipulation and small storage requirement. An immediate cancellation technique of the rollback mechanism of time warp is proposed which does not require the use of antimessages and works well for the single queue structure. To minimize the queue size needed for simulation, a lower bound of rollback at each object is computed and used to discard processed events with timestamps less than the bound. Even if extra work is needed for computing and delivering the lower bound, the simulation using the lower bound of rollback is faster than the time warp protocol since queue manipulation time is significantly saved. Preliminary experimental results of the time warp variation on the Connection Machine-2 indicate significant savings in time and space over the time warp protocol. Moreover, the proposed scheme is much faster than the Intermetric VHDL simulator.

Proceedings ArticleDOI
01 Apr 1991
TL;DR: This work discusses extensions to the basic model that include two new kinds of objects, multifutures and guarded objects, and language constructs such as parallel-do and divide-and-conquer, as well as other constructs that allow the programmer to control how futures are processed.
Abstract: We present an overview of a model of execution for concurrent object-oriented general-purpose computation, and a run-time system---SAM---that supports the model of execution. The basic model, which is transparent to the programmer, uses data-driven synchronization and speculative computation to obtain concurrency, and rollback to ensure correctness. We discuss extensions to the basic model that include two new kinds of objects, multifutures and guarded objects, and language constructs such as parallel-do and divide-and-conquer, as well as other constructs that allow the programmer to control how futures are processed. While all of these extensions appear useful, some can be integrated more naturally than others into the model of execution.

Book ChapterDOI
01 Jan 1991
TL;DR: The data model underlying Cactis is based on a principle the authors call active semantics, and is designed to support complex functionally-defined data, and supports an efficient rollback and recovery mechanism, which enables the user to freely explore the database.
Abstract: Cactis is an object-oriented database management system developed at the University of Colorado. The data model underlying Cactis is based on a principle we call active semantics, and is designed to support complex functionally-defined data. In an active semantics database, each entity is assigned a behavioral specification which allows it to respond to changes elsewhere in the database. Each entity may be a piece of non-derived or (possibly complex) derived data, and may have constraints associated with it. Derived data and constraint specifications are maintained automatically and efficiently by the system. Furthermore, the active semantics data model supports an efficient rollback and recovery mechanism, which enables the user to freely explore the database. Cactis has been implemented and a distributed version is under development.

Patent
17 Dec 1991
TL;DR: In this paper, the authors propose to save a memory area automatically at the point of time of the execution of a commit instruction and restore the memory area at the time of execution of the rollback instruction.
Abstract: PURPOSE:To simplify a recovery processing after the occurrence of deadlock, and to improve the productivity of programming by saving a memory area automatically at the point of time of the execution of a commit instruction, and restoring the memory area automatically at the point of time of the execution of a rollback instruction. CONSTITUTION:At the time of linking an object program, information about the memory area which is generated by a linker not shown in a figure and is the object of the saving and the restoration in an object program is stored in a backup control table 6. Next, when a memory save area links the object program, the memory area which is secured by the linker and is the object of the saving and the restoration in the program is saved, and when a commit mechanism 2 requires the commitment of a data base file, a table 1 called out from a saving mechanism 4 is referred to, and the contents of the memory area to be the object of the saving and the restoration are saved in the mechanism 4, and the memory area to be the object is restored by a rollback mechanism 3.

Proceedings ArticleDOI
08 Apr 1991
TL;DR: It is argued that the unique ways stable memory is used and the structure that is imposed on the log are steps in the right direction in the anticipated evolution of recovery management.
Abstract: A page-based incremental restart algorithm is proposed that enables resuming transaction processing immediately after recovering from a crash. Data items are recovered individually and according to the demands of the post-crash transactions. The support that such an algorithm needs in terms of nonvolatile RAM and efficient log retrieval methods is outlined. It is discussed how to construct high-level recovery based on operation logging on top of the page-based algorithm proposed. It is argued that the unique ways stable memory is used and the structure that is imposed on the log are steps in the right direction in the anticipated evolution of recovery management.