
Showing papers by "Richard M. Fujimoto published in 1999"


Book
01 Oct 1999
TL;DR: PADS expert Richard M. Fujimoto provides software developers with cutting-edge techniques for speeding up the execution of simulations across multiple processors and dealing with data distribution over wide area networks, including the Internet.
Abstract: From the Publisher: A state-of-the-art guide for the implementation of distributed simulation technology. The rapid expansion of the Internet and commodity parallel computers has made parallel and distributed simulation (PADS) a hot technology indeed. Applications abound not only in the analysis of complex systems such as transportation or the next-generation Internet, but also in computer-generated virtual worlds for military and professional training, interactive computer games, and the entertainment industry. In this book, PADS expert Richard M. Fujimoto provides software developers with cutting-edge techniques for speeding up the execution of simulations across multiple processors and dealing with data distribution over wide area networks, including the Internet. With an emphasis on parallel and distributed discrete event simulation technologies, Dr. Fujimoto compiles and consolidates research results in the field spanning the last twenty years, discussing the use of parallel and distributed computers in both the modeling and analysis of system behavior and the creation of distributed virtual environments. While other books on PADS concentrate on applications, Parallel and Distributed Simulation Systems clearly shows how to implement the technology. It explains in detail the synchronization algorithms needed to properly realize the simulations, including an in-depth discussion of time warp and advanced optimistic techniques. Finally, the book is richly supplemented with references, tables and illustrations, and examples of contemporary systems such as the Department of Defense's High Level Architecture (HLA), which has become the standard architecture for defense programs in the United States.

547 citations


Journal ArticleDOI
TL;DR: For certain fine-grain models, such as queuing network models, it is shown that reverse computation can yield significant improvement in execution speed coupled with significant reduction in memory utilization, as compared to traditional state-saving.
Abstract: In optimistic parallel simulations, state-saving techniques have traditionally been used to realize rollback. In this article, we propose reverse computation as an alternative approach, and compare its execution performance against that of state-saving. Using compiler techniques, we describe an approach to automatically generate reversible computations, and to optimize them to reap the performance benefits of reverse computation transparently. For certain fine-grain models, such as queuing network models, we show that reverse computation can yield significant improvement in execution speed coupled with significant reduction in memory utilization, as compared to traditional state-saving. On sample models using reverse computation, we observe as much as a six-fold improvement in execution speed over traditional state-saving.
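The state-saving versus reverse computation trade-off described above can be sketched with a toy queueing event. This is a hypothetical illustration, not the authors' implementation: instead of copying state before each event, each event handler has a hand-written inverse that undoes its increments on rollback, so no state copy is ever made.

```python
# Sketch: rollback via reverse computation for a queue-arrival event.
# Hypothetical model code, not the paper's generated code.

class Queue:
    def __init__(self):
        self.length = 0
        self.arrivals = 0

    # Forward event: built from constructive operations (increments) only.
    def arrive(self):
        self.length += 1
        self.arrivals += 1

    # Reverse event: undo by inverting each operation in reverse order.
    # No snapshot of the state is taken, so memory use stays constant.
    def unarrive(self):
        self.arrivals -= 1
        self.length -= 1

q = Queue()
q.arrive()
q.arrive()
q.unarrive()                  # rollback of the second arrival
print(q.length, q.arrivals)   # 1 1
```

For constructive operations such as `+=`, the inverse is mechanical, which is what makes compiler generation of the reverse code feasible.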

235 citations


Proceedings ArticleDOI
24 Mar 1999
TL;DR: This research develops and demonstrates a practical, scalable approach to parallel and distributed simulation that will enable widespread reuse of sequential network simulation models and software and describes the lessons learned in applying it to the publicly available ns software package.
Abstract: Discrete event simulation is widely used within the networking community for purposes such as demonstrating the validity of network protocols and architectures. Depending on the level of detail modeled within the simulation, the running time and memory requirements can be excessive. The goal of our research is to develop and demonstrate a practical, scalable approach to parallel and distributed simulation that will enable widespread reuse of sequential network simulation models and software. We focus on an approach to parallelization where an existing network simulator is used to build models of subnetworks that are composed to create simulations of larger networks. Changes to the original simulator are minimized, enabling the parallel simulator to easily track enhancements to the sequential version. We describe our lessons learned in applying this approach to the publicly available ns software package (McCanne and Floyd, 1997) and converting it to run in a parallel fashion on a network of workstations. This activity highlights a number of important problems, both in how to parallelize an existing serial simulation model and in achieving acceptable parallel performance.
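When subnetwork simulators are composed this way, they typically synchronize conservatively: each instance may only advance to the earliest timestamp a message from a neighbor could still carry, with link latency supplying the lookahead. A minimal sketch of that bound, under assumed names (this is not the ns parallelization itself):

```python
# Minimal sketch of conservative synchronization between subnetwork
# simulators: a subnet may safely process all local events earlier than
# the minimum of its neighbors' clocks plus the link latency (lookahead).
# Hypothetical structure for illustration only.

def lower_bound_on_incoming(neighbor_clocks, link_latency):
    """Earliest timestamp a message from any neighbor could still carry."""
    return min(t + link_latency for t in neighbor_clocks)

# Neighbors have advanced to simulation times 10 and 12, over links with
# 3 time units of latency: every local event with timestamp < 13 is safe.
safe_until = lower_bound_on_incoming([10, 12], 3)
print(safe_until)  # 13
```

The larger the link latency, the more events can be processed between synchronizations, which is why network topology strongly affects parallel performance.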

168 citations


Proceedings ArticleDOI
01 May 1999
TL;DR: A partial order called approximate time (AT) is proposed to order events in both domains, facilitating reuse of simulations across DVE and analysis applications and to exploit temporal uncertainty in the model to achieve efficient conservative parallel simulation despite little or no lookahead.
Abstract: Parallel discrete event simulation algorithms are usually based on time stamp ordering of events. Distributed virtual environment (DVE) applications such as DIS typically use unordered event delivery. A partial order called approximate time (AT) is proposed to order events in both domains, facilitating reuse of simulations across DVE and analysis applications. A variation on AT-order called approximate time causal (ATC) order is also described. Synchronisation algorithms to realize these orderings are presented as well as performance measurements on a workstation cluster. A long-term goal of this work is to use AT and ATC order to exploit temporal uncertainty in the model to achieve efficient conservative parallel simulation despite little or no lookahead, a long-standing problem in the field.
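One way to picture an approximate-time style partial order: each event carries a time interval rather than a point timestamp, and ordering is only required between events whose intervals do not overlap. The encoding below is an illustrative sketch, not the paper's formal definition:

```python
# Sketch of an approximate-time (AT) style partial order: each event
# carries an interval (lo, hi); e1 must precede e2 only when e1's
# interval lies entirely before e2's. Overlapping intervals are left
# unordered, which is the temporal uncertainty the synchronization
# algorithm can exploit. Hypothetical encoding for illustration.

def must_precede(e1, e2):
    lo1, hi1 = e1
    lo2, hi2 = e2
    return hi1 < lo2   # strictly earlier interval => ordered

a, b, c = (0, 2), (1, 3), (5, 6)
print(must_precede(a, c))  # True  -- disjoint intervals, a comes first
print(must_precede(a, b))  # False -- overlap: either delivery order is valid
```

Point timestamps are the degenerate case lo == hi, which recovers ordinary timestamp order; widening the intervals trades timestamp precision for synchronization slack.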

97 citations


Proceedings ArticleDOI
01 May 1999
TL;DR: It is shown that reverse computation can yield significant improvement in execution speed coupled with significant reduction in memory utilization, as compared to traditional state-saving in certain fine-grain models.
Abstract: In optimistic parallel simulations, state-saving techniques have been traditionally used to realize rollback. We propose reverse computation as an alternative approach, and compare its execution performance against that of state-saving. Using compiler techniques, we describe an approach to automatically generate reversible computations, and to optimize them to transparently reap the performance benefits of reverse computation. For certain fine-grain models, such as queuing network models, we show that reverse computation can yield significant improvement in execution speed coupled with significant reduction in memory utilization, as compared to traditional state-saving. On sample models using reverse computation, we observe as much as three-fold improvement in execution speed over traditional state-saving.

92 citations


01 Jan 1999
TL;DR: Results of this study demonstrate the technical feasibility and performance that can be obtained by exploiting high performance interconnection hardware and software in realizing HLA RTIs in high-speed LAN environments.
Abstract: This paper presents recent results concerning the realization of HLA RTIs in high-speed LAN environments. Specifically, the UK-RTI and a second, simplified RTI implementation were realized on a cluster of Sun workstations using Myrinet, a gigabit, low-latency interconnection switch developed by Myricom, Inc. The performance of these implementations was compared with the UK-RTI and version 1.3 of the DMSO RTI using UDP and TCP/IP on an Ethernet LAN. The Myrinet implementations utilize a software package called RTI-Kit that implements group communication services and time management algorithms on high-speed interconnection hardware. Results of this study demonstrate the technical feasibility and performance that can be obtained by exploiting high performance interconnection hardware and software in realizing HLA RTIs. In particular, in most experiments the RTIs using Myrinet achieved one to two orders of magnitude improvement in performance relative to the UK-RTI and DMSO version 1.3 RTI using UDP/TCP, in attribute update latency time and in the wallclock time required to perform a time management cycle. 1 Work completed while on leave at the Defence Evaluation and Research Agency, Malvern, UK. © British Crown Copyright 1998/DERA. Published with the permission of the Controller of Her Britannic Majesty's Stationery Office.

46 citations


01 Jan 1999
TL;DR: Extensions to the existing time management services in the HLA are proposed that allow repeatable executions, federate control over the ordering of simultaneous events, and zero lookahead, a feature not supported in the baseline HLA.
Abstract: A distributed simulation is said to be repeatable if successive executions utilizing the same inputs produce exactly the same outputs. Repeatability is a highly desirable property, particularly for analytic simulation models. This paper discusses the question of repeatability in distributed simulations in general, and in the context of the High Level Architecture in particular. Specifically, allowing zero lookahead, a feature not supported in the baseline HLA, has important ramifications with respect to achieving repeatable executions. Extensions to the existing time management services in the HLA are proposed that allow (1) repeatable executions, (2) federate control over the ordering of simultaneous events, and (3) zero lookahead. The extensions proposed in this paper are currently under consideration by DMSO for possible future inclusion in the High Level Architecture time management services.
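The standard recipe for repeatable ordering of simultaneous events is to extend the timestamp into a tuple with deterministic tie-breaking fields, some of which the federate can control. The field names below are illustrative, not the proposal's actual service parameters:

```python
# Sketch: deterministic ordering of simultaneous events by extending the
# timestamp into a tuple (time, user_priority, federate_id, seq).
# Hypothetical field names; the HLA extensions' actual fields differ.

events = [
    (5.0, 1, "fedB", 0, "fire"),
    (5.0, 0, "fedA", 1, "move"),
    (3.0, 9, "fedC", 2, "spawn"),
]

# Sorting the extended timestamps yields a total order, so any two
# executions with identical inputs process events identically -- the
# repeatability property. The user_priority field lets a federate
# control how its simultaneous events (here both at time 5.0) break ties.
ordered = sorted(events)
print([e[4] for e in ordered])  # ['spawn', 'move', 'fire']
```

Zero lookahead complicates this because a federate may generate an event at its current time, so the tie-breaking fields must be defined before delivery order is committed.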

46 citations


Proceedings ArticleDOI
01 Dec 1999
TL;DR: The article addresses problems involving the size and complexity of models, verification, validation and accreditation, the modeling methodological and model execution implications of parallel and distributed simulation, and random number generation and execution efficiency improvements through quasi-Monte Carlo, and variance reduction.
Abstract: The future directions of simulation research are analysed. The formulation of such a vision could provide valuable guidance and assistance with respect to decisions involving the generation and allocation of future research funding. The article addresses problems involving: (1) the size and complexity of models; (2) verification, validation and accreditation; (3) the modeling methodological and model execution implications of parallel and distributed simulation; (4) the centrality of modeling to the discipline of computer science; and (5) random number generation and execution efficiency improvements through quasi-Monte Carlo, and variance reduction.

36 citations


Proceedings ArticleDOI
01 Dec 1999
TL;DR: The low level machine performance statistics, especially those that relate to memory system performance, such as cache and translation look-aside buffer misses, are examined and suggest that TLB misses are the primary culprit for state-saving's performance degradation.
Abstract: State-saving and reverse computation are two different approaches by which rollback is realized in Time Warp-based parallel simulation systems. Of the two approaches, state-saving is, in general, more memory-intensive than reverse computation. When executed on a state-of-the-art commercial CC-NUMA (Cache Coherent Non-Uniform Memory Architecture) multiprocessor, our Time Warp system runs almost 6 times slower if state-saving is used than if reverse computation is used. The focus of this paper is to understand why state-saving yields such poor performance when compared to reverse computation on a CC-NUMA multiprocessor. To address this question, we examined the low level machine performance statistics, especially those that relate to memory system performance, such as cache and translation look-aside buffer (TLB) misses. The outcome of the performance study suggests that TLB misses are the primary culprit for state-saving's performance degradation.
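The memory-pressure argument can be made concrete with back-of-envelope arithmetic: copying the full state on every event touches orders of magnitude more pages (and hence TLB entries) than reverse computation's small undo records. Illustrative numbers only, not the paper's measurements:

```python
# Back-of-envelope sketch of why state-saving stresses the TLB: copying
# an S-byte state every event touches far more pages than reverse
# computation's tiny per-event undo records. Illustrative arithmetic only.

PAGE = 4096                                    # bytes per page (typical)

def pages_touched_state_saving(state_bytes, events):
    pages_per_copy = -(-state_bytes // PAGE)   # ceil division
    return pages_per_copy * events

def pages_touched_reverse(undo_bytes, events):
    return -(-undo_bytes * events // PAGE)

# A 64 KB state over 10,000 events: 160,000 page touches for copying,
# versus about 20 pages of 8-byte undo records -- a plausible source of
# the TLB pressure the measurements point to.
print(pages_touched_state_saving(64 * 1024, 10_000))  # 160000
print(pages_touched_reverse(8, 10_000))               # 20
```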

22 citations



Proceedings ArticleDOI
01 Dec 1999
TL;DR: To mark the 50th anniversary of the Association for Computing Machinery, Volume 28, Number 4 of ACM Computing Surveys was released, entitled "Strategic Directions in Computing Research," and among the topics covered was computer simulation.
Abstract: To mark the 50th anniversary of the Association for Computing Machinery, Volume 28, Number 4 of ACM Computing Surveys was released, entitled "Strategic Directions in Computing Research." Notable, to attendees of the Winter Simulation Conference at least, among the topics covered was computer simulation. One may reasonably ask why this is so. Are there unanswered questions remaining in computer simulation? Is computer simulation an unimportant topic? Does simulation not have relevance as a computing discipline? Should it rather be considered solely in terms of operations research, statistics or mathematics? Arguably computer simulation is quite relevant to technological advance in many arenas. Modeling and simulation are playing key roles within industry, academia and the government. The papers appearing in these Proceedings

01 Jan 1999
TL;DR: The High Level Architecture effort is viewed by many as the next generation for DIS, and the proposed time management services include different categories of message transportation reliability and ordering, and mechanisms for controlling time advances.
Abstract: The High Level Architecture (HLA) effort is viewed by many as the next generation for DIS. HLA encompasses a broad range of simulation applications including training, analysis, and test and evaluation of components and systems. A challenging aspect of the HLA concerns defining a single time management structure that not only supports a wide variety of federations (e.g., DIS, ALSP, hardware-in-the-loop simulations), but also supports interoperability among simulations using different local time management mechanisms. For example, a single federation execution may include both DIS simulations where interactions are based on the real-time arrival of simulation messages, and constructive simulations such as ALSP where events must be processed according to logical time (timestamp) order to ensure cause-and-effect relationships are correctly reproduced by the simulation. This paper describes the time management services that have been proposed for the HLA. These services include different categories of message transportation reliability and ordering, and mechanisms for controlling time advances. The ramifications of these time advance services on DIS are discussed.
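The two delivery disciplines described above can be contrasted directly: receive-order delivers messages as they arrive off the network, while timestamp-order buffers them and releases them sorted by timestamp once no earlier message can arrive. A simplified sketch (real timestamp-order delivery also involves lookahead and time-advance grants):

```python
# Sketch contrasting the two delivery categories: receive-order (deliver
# on arrival) vs. timestamp-order (buffer, then release in timestamp
# order). Simplified illustration of the concept, not the HLA services.

import heapq

def receive_order(arrivals):
    """Deliver in the order messages arrived, ignoring timestamps."""
    return [msg for (_, msg) in arrivals]

def timestamp_order(arrivals):
    """Buffer all messages, then release them smallest timestamp first."""
    heap = list(arrivals)
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]

# Messages arrive out of timestamp order over the network:
arrivals = [(7, "detonate"), (3, "launch"), (5, "track")]
print(receive_order(arrivals))    # ['detonate', 'launch', 'track']
print(timestamp_order(arrivals))  # ['launch', 'track', 'detonate']
```

A DIS-style federate would use the first discipline for low latency; an ALSP-style constructive federate needs the second to preserve cause-and-effect.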

Dissertation
01 Jan 1999
TL;DR: It is asserted that, by use of appropriately constructed novel techniques, it is indeed possible to perform fast optimistic parallel simulation of fine-grained models, using both event-oriented and process-oriented views, and that this work serves to demonstrate the readiness and feasibility of applying parallel simulation technology to today's large and complex models.
Abstract: It is widely recognized that parallel simulation technology is necessary to address the new simulation requirements of important applications such as large-scale network models. However, current parallel simulation techniques possess limitations in the type and sizes of models that can be efficiently simulated in parallel. In particular, optimistic parallel simulation of fine-grained models has been plagued by large state saving overheads, both in event-oriented and process-oriented views, resulting in unsatisfactory parallel execution speed for many important applications. In the absence of alternative solutions, it was generally believed that optimistic approaches were inapplicable in such applications. This thesis asserts that, by use of appropriately constructed novel techniques, it is indeed possible to perform fast optimistic parallel simulation of fine-grained models, using both event-oriented and process-oriented views. In support of this claim, techniques are presented here that significantly lower the overheads, thereby enabling the capability to efficiently simulate large-scale, fine-grained models in parallel. On sample models, when compared to previously known approaches, the techniques presented here improve the simulation speed by a factor of 3 or more, while simultaneously reducing the memory requirements by almost half. The first technique addresses the high overheads of state saving mechanisms that are traditionally used in supporting rollback operations in optimistic parallel simulation. An alternative approach called reverse computation is identified for realizing rollback, which is demonstrated to significantly improve the parallel simulation speed while greatly reducing the memory utilization of the simulation. The next technique concerns the process-oriented worldview, which is extremely useful in many domains, such as telecommunication network protocol modeling.
An approach called stack reconstruction is developed to address the high execution overheads traditionally associated with process-oriented views, and its effectiveness is demonstrated in achieving a high rate of process-context switching during optimistic simulation. Additional contributions of this thesis include the identification and solution of other typical problems encountered in the design, development and parallel simulation of models for real-life telecommunication network protocols. This work serves to demonstrate the readiness and feasibility of applying parallel simulation technology to today's large and complex models. The parallel simulation techniques described in this thesis are applied and analyzed in the context of telecommunication network simulation, using representative network models. The techniques, however, are not restricted to network simulation, but are equally applicable to other domains as well. For example, the parallel simulation of any network of queues can benefit from the reverse computation system presented here. In fact, the reverse computation system is relevant to other application areas as well, such as in database recovery, and debugging environments. Similarly, the stack reconstruction approach is applicable to any multi-threaded system that requires an efficient incremental checkpointing facility for its thread states.

Journal ArticleDOI
TL;DR: PVaniM is extended into a new system, called PVaniM-GTW, by adding middleware-specific views that satisfy the needs of PDES middleware better than general-purpose visualization systems, while not requiring the development of application-specific visualizations by the end user.

Journal ArticleDOI
TL;DR: Measurements indicate that schedulers can be composed and specialized to offer performance similar to that of dedicated scheduling co-processors.
Abstract: Dynamic, high-performance or real-time applications require scheduling latencies and throughput not typically offered by current kernel or user-level threads schedulers. Moreover, it is widely accepted that it is important to be able to specialize scheduling policies for specific target applications and their execution environments. This paper presents one solution to the construction of such high-performance, application-specific thread schedulers. Specifically, scheduler implementations are composed from modular components, where individual scheduler modules may be specialized to underlying hardware characteristics or implement precisely the mechanisms and policies desired by application programs. The resulting user-level schedulers' implementations can provide resource guarantees by interaction with kernel-level facilities which provide means of resource reservation. This paper demonstrates the concept of composable schedulers by construction of several compositions for highly dynamic target applications, where low scheduling latencies are critical to application performance. Claims about the importance and effectiveness of scheduler composition are validated experimentally on a shared-memory multiprocessor. Scheduler compositions are optimized to take advantage of different low-level hardware attributes and of knowledge about application requirements specific to certain applications, including a Time Warp-based real-time discrete event simulator. Experimental evaluations are based on synthetic workloads, on a real-time simulation blending simulated with implemented control system components, and on a dynamic robot control program. Measurements indicate that schedulers can be composed and specialized to offer performance similar to that of dedicated scheduling co-processors. Copyright © 1999 John Wiley & Sons, Ltd.
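The composition idea can be sketched as small policy modules sharing one interface, so a policy can be swapped or specialized without touching the rest of the scheduler. The interfaces below are hypothetical; the paper's component model is considerably richer (and interacts with kernel-level resource reservation):

```python
# Sketch of scheduler composition: interchangeable policy modules behind
# a common add/pick interface. Hypothetical interfaces for illustration,
# not the paper's scheduler framework.

class Fifo:
    """Baseline policy: run tasks in arrival order."""
    def __init__(self):
        self.q = []
    def add(self, task):
        self.q.append(task)
    def pick(self):
        return self.q.pop(0) if self.q else None

class Deadline:
    """Earliest-deadline-first policy behind the same interface;
    tasks are (deadline, name) pairs."""
    def __init__(self):
        self.q = []
    def add(self, task):
        self.q.append(task)
    def pick(self):
        if not self.q:
            return None
        task = min(self.q, key=lambda t: t[0])
        self.q.remove(task)
        return task

def make_scheduler(policy):
    # The composition point: the application selects the policy module
    # suited to its latency requirements.
    return policy()

s = make_scheduler(Deadline)
s.add((30, "log"))
s.add((10, "control"))
print(s.pick()[1])   # 'control' -- the earliest deadline wins
```

A Time Warp-based real-time simulator, for instance, could plug in a deadline-driven module while reusing the rest of the scheduler unchanged.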