
Showing papers in "ACM Transactions on Modeling and Computer Simulation in 1991"


Journal ArticleDOI
TL;DR: A general model of rollback in parallel processing is presented, and a tunable algorithm, Filtered Rollback, designed to avoid the failure modes is given, together with a rigorous mathematical proof that Filtered Rollback is efficient if implemented on a reasonably efficient multiprocessor.
Abstract: We present and analyze a general model of rollback in parallel processing. The analysis points out three possible modes where rollback may become excessive; we provide an example of each type. We identify the parameters that determine a stability, or efficiency, region for the simulation. Our analysis suggests the possibility of a dangerous “phase transition” from stability to instability in the parameter space. In particular, a rollback algorithm may work efficiently for a small system but become inefficient for a large system. Moreover, for a given system, it may work quickly for a while and then suddenly slow down. On the positive side, we give a tunable algorithm, Filtered Rollback, that is designed to avoid the failure modes. Under appropriate assumptions, we provide a rigorous mathematical proof that Filtered Rollback is efficient if implemented on a reasonably efficient multiprocessor. In particular, we show that the average time to complete the simulation of a system with N nodes and R events on a P-processor PRAM satisfies an explicit upper bound.

100 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider the use of massively parallel architectures to execute discrete-event simulations of self-initiating models and derive upper and lower bounds on optimal performance.
Abstract: This paper considers the use of massively parallel architectures to execute discrete-event simulations of what we term “self-initiating” models. A logical process in a self-initiating model schedules its own state reevaluation times, independently of any other logical process, and sends its new state to other logical processes following the reevaluation. Our interest is in the effects of that communication on synchronization. Using a model that idealizes the communication topology of a simulation, we consider the performance of various synchronization protocols by deriving upper and lower bounds on optimal performance, upper bounds on Time Warp's performance, and lower bounds on the performance of a new conservative protocol. Our analysis of Time Warp includes some of the overhead costs of state saving and rollback; the effects of propagating rollbacks are ignored. The analysis points out sufficient conditions for the conservative protocol to outperform Time Warp. The analysis also quantifies the sensitivity of performance to message fanout, lookahead ability, and the probability distributions underlying the simulation.
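
As a rough illustration of the kind of logical process the paper analyzes (not code from the paper), the C sketch below shows a self-initiating process that draws its own next reevaluation time, updates its state, and pushes the new state to its fan-out neighbors; draw_interval() and send_state() are hypothetical placeholders for the model's dynamics and for whatever synchronization protocol delivers the messages.

/*
 * Hypothetical sketch of a "self-initiating" logical process: it schedules
 * its own next reevaluation time, independently of other processes, then
 * sends its new state to its neighbors.  send_state() is a placeholder for
 * the synchronization protocol (Time Warp, conservative, ...), not a real
 * library call.
 */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define FANOUT 3   /* message fanout, one of the sensitivities studied above */

static double draw_interval(void)
{
    /* exponential(1) holding time drawn by the process itself */
    double u = (rand() + 1.0) / (RAND_MAX + 2.0);
    return -log(u);
}

static void send_state(int neighbor, double timestamp, double state)
{
    /* placeholder: enqueue the new state for a neighboring logical process */
    printf("to LP %d: state %.3f at time %.3f\n", neighbor, state, timestamp);
}

int main(void)
{
    double now = 0.0, state = 0.0, end_time = 10.0;
    int neighbors[FANOUT] = { 1, 2, 3 };

    while (now < end_time) {
        now += draw_interval();            /* self-initiated reevaluation time  */
        state += 1.0;                      /* trivial stand-in for state update */
        for (int i = 0; i < FANOUT; i++)   /* this communication is what the    */
            send_state(neighbors[i], now, state);  /* protocols must order      */
    }
    return 0;
}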

91 citations


Journal ArticleDOI
Shu Tezuka, Pierre L'Ecuyer
TL;DR: This paper proposes three combined Tausworthe random number generators with period length about 10^18, whose k-distribution properties are good and which can be implemented in a portable way; a battery of statistical tests applied to these generators found no apparent defect.
Abstract: In this paper, we propose three combined Tausworthe random number generators with period length about 10^18, whose k-distribution properties are good and which can be implemented in a portable way. These generators are found through an exhaustive search for the combination with the best lattice structure in GF{2, x}^k, the k-dimensional vector space over the field of all Laurent series with coefficients in GF(2). We then apply a battery of statistical tests to these generators for a comprehensive investigation of their empirical statistical properties. No apparent defect was found. In the appendix, we give a sample program in C for the generators.
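
For concreteness, the C sketch below illustrates the general combined-Tausworthe construction: each component is a small linear feedback shift register generator and their outputs are XORed together. The shift and mask parameters are the widely circulated taus88-style values and are used here only as an assumed example; they are not necessarily the combinations selected in this paper, whose own sample program appears in its appendix.

/*
 * Sketch of the combined-Tausworthe construction: each component is a
 * maximal-period LFSR generator and the outputs are XORed.  The parameters
 * below are illustrative (taus88-style), NOT the paper's generators.
 */
#include <stdint.h>
#include <stdio.h>

static uint32_t s1 = 12345, s2 = 12345, s3 = 12345;  /* seeds; each must exceed a small threshold */

static uint32_t taus_step(uint32_t *s, int q, int r, int t, uint32_t mask)
{
    uint32_t b = ((*s << q) ^ *s) >> r;   /* one step of a single Tausworthe component */
    *s = ((*s & mask) << t) ^ b;
    return *s;
}

static uint32_t combined_taus(void)       /* the combination is a bitwise XOR */
{
    return taus_step(&s1, 13, 19, 12, 0xFFFFFFFEu)
         ^ taus_step(&s2,  2, 25,  4, 0xFFFFFFF8u)
         ^ taus_step(&s3,  3, 11, 17, 0xFFFFFFF0u);
}

int main(void)
{
    for (int i = 0; i < 5; i++)
        printf("%u\n", combined_taus());
    return 0;
}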

88 citations


Journal ArticleDOI
TL;DR: This work considers simulation experiments in which the CPU-time budget t is viewed either as a strict constraint or as a guideline, in which case simulation beyond time t is permitted, and proposes an unbiased estimator for a simple mean value.
Abstract: We analyze properties associated with a simple yet effective way to exploit parallel processors in discrete event simulations: averaging the results of multiple, independent replications that are run, in parallel, on multiple processors. We focus on estimating expectations from terminating simulations, or steady-state parameters from regenerative simulations. We assume that there is a CPU time constraint, t, on each of P processors. Unless the replication lengths are bounded, one must be willing to simulate beyond any fixed, finite time t on at least some processors in order to always obtain a strongly consistent estimator (as the number of processors increases). We therefore consider simulation experiments in which t is viewed as either being a strict constraint, or a guideline, in which case simulation beyond time t is permitted. The statistical properties, including strong laws, central limit theorems, bias expansions, and completion time distributions, of a variety of estimators obtainable from such an experiment are derived. We propose an unbiased estimator for a simple mean value. This estimator requires preselecting a fraction of the processors. Simulation beyond time t may be required on a preselected processor, but only if no replications have yet been completed on that processor.
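
The C sketch below is a hedged toy version of such an experiment (assumptions only, not the paper's estimator): each of P processors runs independent replications within a CPU budget t, and on a preselected processor the first replication is always completed even if it runs past t, which is the ingredient behind the unbiased mean estimator; here all completed outputs are simply averaged, and replication lengths and outputs are random placeholders.

/*
 * Toy sketch of parallel independent replications under a per-processor CPU
 * budget t.  For simplicity a replication's CPU cost is known before it runs;
 * in a real experiment the replication in progress at time t is either
 * discarded (strict constraint) or finished (guideline).
 */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

static double urand(void) { return (rand() + 1.0) / (RAND_MAX + 2.0); }

int main(void)
{
    const int    P = 8;       /* processors (run one after another in this toy) */
    const double t = 5.0;     /* CPU-time budget per processor                  */
    double sum = 0.0;
    long   n   = 0;

    for (int p = 0; p < P; p++) {
        int preselected = (p == 0);   /* first processor stands in for the preselected fraction */
        double cpu = 0.0;
        long completed = 0;
        for (;;) {
            double length = -log(urand());   /* CPU time of the next replication */
            double output = urand();         /* its simulation output            */
            if (cpu + length > t && !(preselected && completed == 0))
                break;                       /* budget exhausted: stop           */
            cpu += length;                   /* a preselected processor may pass */
            sum += output;                   /* t, but only for its first        */
            n++; completed++;                /* replication                      */
        }
    }
    printf("estimate = %g from %ld completed replications\n", sum / (double) n, n);
    return 0;
}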

83 citations


Journal ArticleDOI
TL;DR: The goal is to design an efficient memory management protocol which guarantees that the memory consumption of parallel simulation is of the same order as that of sequential simulation, and an optimal algorithm called artificial rollback is proposed.
Abstract: Recently there has been a great deal of interest in performance evaluation of parallel simulation. Most work is devoted to time complexity and assumes that the amount of memory available for parallel simulation is unlimited. This paper studies the space complexity of parallel simulation. Our goal is to design an efficient memory management protocol which guarantees that the memory consumption of parallel simulation is of the same order as that of sequential simulation. (Such an algorithm is referred to as optimal.) First, we derive the relationships among the space complexities of sequential simulation, Chandy-Misra simulation [2], and Time Warp simulation [7]. We show that Chandy-Misra may consume more storage than sequential simulation, or vice versa, and that Time Warp never consumes less memory than sequential simulation. We then describe cancelback, an optimal Time Warp memory management protocol proposed by Jefferson. Although cancelback is considered a complete solution to the storage management problem in Time Warp, some efficiency issues in implementing this algorithm must be considered. We propose an optimal algorithm called artificial rollback. We show that this algorithm is easy to implement and analyze. An implementation of artificial rollback is given, which is integrated with processor scheduling to adjust the memory consumption rate based on the amount of free storage available in the system.
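
The toy C sketch below (a rough abstraction, not the paper's implementation) conveys the artificial rollback idea: when the shared event pool is exhausted, the process furthest ahead in virtual time is forced to roll back, returning its speculatively held buffers to the pool. The pool, the per-process counters, and artificial_rollback() are all hypothetical placeholders.

/*
 * Rough abstraction of artificial rollback (placeholders only): when no free
 * buffers remain, the logical process furthest ahead in virtual time is
 * rolled back, and the buffers it held speculatively return to the pool.
 */
#include <stdio.h>

#define NLP 4
#define GVT 3.0                                       /* committed frontier in this toy  */

static int    pool_free    = 2;                       /* free buffers in the shared pool */
static double lp_lvt[NLP]  = { 3.0, 9.5, 4.2, 7.1 };  /* local virtual time of each LP   */
static int    lp_held[NLP] = { 1, 5, 2, 3 };          /* buffers each LP holds speculatively */

static void artificial_rollback(int lp)
{
    pool_free   += lp_held[lp];   /* undone events give their buffers back    */
    lp_held[lp]  = 0;
    lp_lvt[lp]   = GVT;           /* the LP restarts from the committed state */
    printf("artificially rolled back LP %d\n", lp);
}

static void alloc_event_buffer(void)
{
    while (pool_free == 0) {                /* memory exhausted                    */
        int victim = 0;
        for (int i = 1; i < NLP; i++)       /* choose the LP furthest ahead in     */
            if (lp_lvt[i] > lp_lvt[victim]) /* virtual time as the rollback victim */
                victim = i;
        artificial_rollback(victim);
    }
    pool_free--;
}

int main(void)
{
    for (int i = 0; i < 6; i++)
        alloc_event_buffer();
    printf("free buffers remaining: %d\n", pool_free);
    return 0;
}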

82 citations


Journal ArticleDOI
TL;DR: Results indicate that message preemption has a significant effect on performance when (1) the processor is highly utilized, (2) the execution times of messages have high variance, and (3) rollbacks occur frequently.
Abstract: The Time Warp “optimistic” approach is one of the most important parallel simulation protocols. Time Warp synchronizes processes via rollback; a variant of the original rollback mechanism, called lazy cancellation, has aroused great interest. This paper studies these rollback mechanisms. The general tradeoffs between aggressive and lazy cancellation are discussed, and a conservative-optimal simulation is defined for comparative purposes. Within the framework of aggressive cancellation, we offer some observations and analyze the rollback behavior of tandem systems. The lazy cancellation mechanism is examined using a metric called the sensitivity of an output message. Both aggressive and lazy cancellation are shown to work well for a process with a small simulated load intensity. Finally, an analytical model is given to analyze message preemption, an important factor that affects the performance of rollback mechanisms. Results indicate that message preemption has a significant effect on performance when (1) the processor is highly utilized, (2) the execution times of messages have high variance, and (3) rollbacks occur frequently.
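
As a hedged illustration of the tradeoff discussed above (names and data are invented for the example, not Time Warp API calls), the C sketch below contrasts the two policies: aggressive cancellation sends anti-messages for every output message beyond the rollback time immediately, while lazy cancellation waits for re-execution and cancels only the messages the new execution does not reproduce.

/*
 * Toy contrast of aggressive vs. lazy cancellation on rollback.  "Messages"
 * are reduced to (timestamp, payload) pairs; cancel() just prints the
 * anti-message that would be sent.
 */
#include <stdio.h>
#include <string.h>

typedef struct { double ts; char payload[16]; } Msg;

static void cancel(const Msg *m) { printf("anti-message for t=%.1f %s\n", m->ts, m->payload); }

static void rollback_aggressive(const Msg *sent, int n, double rb_time)
{
    for (int i = 0; i < n; i++)                      /* cancel first, re-execute later */
        if (sent[i].ts > rb_time) cancel(&sent[i]);
}

static void rollback_lazy(const Msg *sent, int n, const Msg *redone, int m, double rb_time)
{
    for (int i = 0; i < n; i++) {                    /* after re-execution: keep matches */
        if (sent[i].ts <= rb_time) continue;
        int reproduced = 0;
        for (int j = 0; j < m; j++)
            if (sent[i].ts == redone[j].ts &&
                strcmp(sent[i].payload, redone[j].payload) == 0) reproduced = 1;
        if (!reproduced) cancel(&sent[i]);           /* only genuinely wrong output is undone */
    }
}

int main(void)
{
    Msg sent[]   = { {1.0, "a"}, {2.0, "b"}, {3.0, "c"} };
    Msg redone[] = { {2.0, "b"}, {3.0, "x"} };       /* re-execution after a rollback to t=1.5 */
    puts("aggressive:"); rollback_aggressive(sent, 3, 1.5);
    puts("lazy:");       rollback_lazy(sent, 3, redone, 2, 1.5);
    return 0;
}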

72 citations


Journal ArticleDOI
TL;DR: This work presents a time-division parallel simulation algorithm that partitions the time domain via state matching and shows that linear speedup can be achieved.
Abstract: Most parallel simulation algorithms (e.g., Chandy and Misra’s algorithm or the Time Warp algorithm) are based on a “space-division” approach. The parallelism of this approach is limited by causality constraints. Another approach, the “time-division” approach, may provide more parallelism if the time domain is appropriately partitioned. We present a time-division parallel simulation algorithm that partitions the time domain via state matching. We show that linear speedup can be achieved. For a complex system, the best parallel simulation approach is to integrate “time-division” and “space-division” algorithms: the simulated system is partitioned into several subsystems; a subsystem may be simulated by the time-division approach (e.g., our algorithm), while the overall system is simulated by the space-division approach.
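
The toy C sketch below conveys the general time-division flavor under strong simplifying assumptions: the horizon is split into K intervals, each interval is simulated from a guessed starting state (on a real machine, all intervals in parallel), and the passes repeat until the states agree at every interval boundary. The paper's actual algorithm chooses the partition points by state matching, which this placeholder does not reproduce.

/*
 * Toy time-division sketch: fixed-point iteration over the states at the
 * interval boundaries.  simulate_interval() is a placeholder deterministic
 * map standing in for simulating one time interval of the system.
 */
#include <math.h>
#include <stdio.h>

#define K 4

static double simulate_interval(double start_state)
{
    return 0.5 * start_state + 1.0;        /* placeholder dynamics for one interval */
}

int main(void)
{
    double guess[K + 1] = { 0.0 };          /* guess[i]: state at boundary i */
    int changed = 1, rounds = 0;

    while (changed) {                       /* repeat until boundary states match */
        changed = 0;
        rounds++;
        for (int i = 0; i < K; i++) {       /* each pass is fully parallel across i */
            double end = simulate_interval(guess[i]);
            if (fabs(end - guess[i + 1]) > 1e-12) { guess[i + 1] = end; changed = 1; }
        }
    }
    printf("converged after %d rounds; final state %.6f\n", rounds, guess[K]);
    return 0;
}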

71 citations


Journal ArticleDOI
TL;DR: In this article, a theory of distributed simulation applicable to both discrete-event and continuous simulation is presented; many existing simulation algorithms are derived from the theory, and an implementation on parallel computers of a new algorithm derived from the theory is described.
Abstract: A theory of distributed simulation applicable to both discrete-event and continuous simulation is presented. Many existing simulation algorithms are derived from the theory, and an implementation of a new algorithm derived from the theory is described. A high-level discrete-event simulation language has been implemented, using the new algorithm, on parallel computers; performance results of the implementation are also presented.

64 citations


Journal ArticleDOI
TL;DR: In this article, the suitability of the Chandy-Misra-Bryant (CMB) algorithm for the domain of digital logic simulation is explored, based on results for six realistic benchmark circuits, one of them being the R6000 microprocessor from MIPS.
Abstract: We explore the suitability of the Chandy-Misra-Bryant (CMB) algorithm for the domain of digital logic simulation. Our evaluation is based on results for six realistic benchmark circuits, one of them being the R6000 microprocessor from MIPS. A quantitative evaluation of the concurrency exhibited by the CMB algorithm shows that an average of 42-196 element activations can be evaluated in parallel if arbitrarily many processors are available. One major factor limiting the parallel performance is the large number of deadlocks that occur. We present a classification of the types of deadlocks and describe them in terms of circuit structure. Using domain-specific knowledge, we propose and evaluate several methods both for reducing the number of deadlock occurrences and for reducing the time spent on each occurrence. Running on a 16-processor Encore Multimax we observe speedups of 6-9. While these self-relative speedups are larger than those of a parallel version of the traditional centralized-time event-driven algorithm, they come at the price of large overheads: significantly more complex element evaluations, extra element evaluations, and deadlock resolution time. These overheads overwhelm the advantages of using distributed time and consistently make the parallel performance of the CMB algorithm about three times slower than that of the traditional parallel event-driven algorithm. Our experience leads us to conclude that the distributed-time CMB algorithm does not present a viable alternative to the centralized-time event-driven algorithm in the domain of parallel digital logic simulation.

49 citations


Journal ArticleDOI
TL;DR: It is found that the optimistic approach scales well as P increases, and the model tracks the progress of Global Virtual Time and eliminates the need to know the virtual time positions of all processors, thus making the analysis quite straightforward.
Abstract: We provide upper and lower bounds and an approximation for speedup of an optimistic self-initiated distributed simulation using a very simple model. We assume an arbitrary number of processors and a uniform connection topology. By showing that the lower bound increases essentially linearly with P, the number of processors, we find that the optimistic approach scales well as P increases. The model tracks the progress of Global Virtual Time (GVT) and eliminates the need to know the virtual time positions of all processors, thus making the analysis quite straightforward.

26 citations


Journal ArticleDOI
TL;DR: It is shown how the pruning process supports reuse of previously pruned structures, and concepts of context-sensitive pruning and partitioned entity structure bases are introduced to promote model base coherence and evolvability.
Abstract: This article describes further efforts to employ the Systems Entity Structure/Model Base framework as a workable foundation for model base management in advanced simulation environments and workbenches. Such management facilities aim to provide a sharable repository of models and a means of assisting users to synthesize models to satisfy the objectives of the current study. In our approach, we view a multifaceted system as needing many models on which to base control, management, design and other interventions. These models differ in level of abstraction and in formalism. Concepts and tools are needed to organize the models into a coherent whole. This paper deals with the management of model bases using system entity structure concepts. We show how the pruning process supports reuse of previously pruned structures. Concepts of context-sensitive pruning and partitioned entity structure bases are introduced to promote model base coherence and evolvability.

Journal ArticleDOI
TL;DR: I’d like to thank Dick Nance and Randy Sadowski for inviting me to give this address; I was somewhat reluctant to do so, as I have been out of touch with the simulation community and was afraid that too much had changed since I was active for me to feel comfortable as a keynote speaker.

Journal ArticleDOI
TL;DR: It is proposed that synchronization be enforced by the distributed computing system itself, instead of by the computation; this model of distributed computation is called the Self-Synchronizing Concurrent Computing System (SESYCCS).
Abstract: Models for synchronization of distributed computation on message-passing computing systems are proposed, and new algorithms for efficient synchronization are analyzed. It is observed that separating synchronization from computation results in an efficient implementation. We propose that synchronization be enforced by the distributed computing system itself, instead of by the computation. This model of distributed computation is called the Self-Synchronizing Concurrent Computing System (SESYCCS). Simulations of the models confirm the analytical estimates derived. These results find immediate application in the parallel and distributed simulation of discrete-event dynamical systems. An operating system for parallel simulations that validates the results of the theory has recently been implemented on the BBN Butterfly Parallel Computer.

Journal ArticleDOI
TL;DR: The design characteristics and programming constructs of HSL are described and several issues relevant to simulation languages in general and HSL in particular are discussed.
Abstract: The Hierarchical Simulation Language (HSL) was designed and developed to serve process-oriented simulation of discrete systems. It is interpreter-based and hence offers certain advantages, such as portability (hardware independence) and modifiability (during program execution). An HSL model consists of two major sections. The Environment contains the specifications of the model and model control statements. The Simulator is a set of functions and processes that carry out the run-time activities of the model. Processes can be hierarchically refined or compressed, to whatever level of model detail is desired. This paper describes the design characteristics and programming constructs of HSL. Several issues relevant to simulation languages in general and HSL in particular are then discussed. This is followed by an example HSL program presented to illustrate many of its features.

Journal ArticleDOI
TL;DR: It is argued that support for a retraction primitive within the Time Warp kernel can significantly improve performance for programs that use event retraction to a moderate or heavy degree.
Abstract: We examine a primitive that allows application programs to explicitly retract previously scheduled events in Time Warp. Specifically, a simple mechanism for retracting messages is defined, and a correctness proof is presented. Various optimizations to the proposed mechanism are also described. The proposed mechanism is intended to be implemented within the kernel of the Time Warp runtime executive. It is also possible to implement event retraction within the application program itself, without kernel support. Empirical data is presented to compare the performance observed in these two approaches. Based on these studies, we argue that support for a retraction primitive within the Time Warp kernel can significantly improve performance for programs that use event retraction to a moderate or heavy degree.