Author

Marc Snir

Bio: Marc Snir is an academic researcher from the University of Illinois at Urbana–Champaign. The author has contributed to research in topics including shared memory and the Message Passing Interface. The author has an h-index of 63 and has co-authored 227 publications receiving 15,410 citations. Previous affiliations of Marc Snir include New York University and the Courant Institute of Mathematical Sciences.


Papers
Book
01 Jan 1996
TL;DR: MPI: The Complete Reference is an annotated manual for the latest 1.1 version of the standard that illuminates the more advanced and subtle features of MPI and covers such advanced issues in parallel computing and programming as true portability, deadlock, high-performance message passing, and libraries for distributed and parallel computing.
Abstract: From the Publisher: MPI, the Message Passing Interface, is a standard and portable library of communications subroutines for parallel programming designed to function on a wide variety of parallel computers. It is useful both on parallel computers, such as IBM's SP2, the Cray Research T3D, and the Connection Machine, and on networks of workstations. Written by five of the principal creators of the latest MPI standard, MPI: The Complete Reference is an annotated manual for the latest 1.1 version of the standard that illuminates the more advanced and subtle features of MPI. It can be read in conjunction with the companion tutorial volume, Using MPI: Portable Parallel Programming with the Message-Passing Interface, by William Gropp, Ewing Lusk, and Anthony Skjellum. MPI: The Complete Reference is the only source that covers such advanced issues in parallel computing and programming as true portability, deadlock, high-performance message passing, and libraries for distributed and parallel computing. The annotations provide numerous illustrative programming examples and delve into even the most esoteric features or consequences of the standard. They explain why certain design choices were made, how users should use the interface, and how implementors should construct their own version of MPI. Part of the Scientific and Engineering Computation series.
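To make the message-passing style concrete, here is a minimal sketch of an MPI point-to-point exchange using standard MPI calls (the program and file name are illustrative, not examples from the book): rank 0 sends one integer to rank 1.

#include <mpi.h>
#include <stdio.h>

/* Minimal MPI point-to-point example: rank 0 sends one integer to rank 1.
 * Build and run with any MPI implementation, e.g.:
 *   mpicc send_recv.c -o send_recv && mpirun -np 2 ./send_recv          */
int main(int argc, char **argv)
{
    int rank, value;
    MPI_Status status;

    MPI_Init(&argc, &argv);                  /* start the MPI runtime     */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* which process am I?       */

    if (rank == 0) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();                          /* shut down the MPI runtime */
    return 0;
}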

2,635 citations

Journal ArticleDOI
01 Feb 2011
TL;DR: Describes the work of the community to prepare for the challenges of exascale computing, ultimately combining its efforts in a coordinated International Exascale Software Project.
Abstract: Over the last 20 years, the open-source community has provided more and more software on which the world’s high-performance computing systems depend for performance and productivity. The community has invested millions of dollars and years of effort to build key components. However, although the investments in these separate software elements have been tremendously valuable, a great deal of productivity has also been lost because of the lack of planning, coordination, and key integration of technologies necessary to make them work together smoothly and efficiently, both within individual petascale systems and between different systems. It seems clear that this completely uncoordinated development model will not provide the software needed to support the unprecedented parallelism required for peta/exascale computation on millions of cores, or the flexibility required to exploit new hardware models and features, such as transactional memory, speculative execution, and graphics processing units. This report describes the work of the community to prepare for the challenges of exascale computing, ultimately combining their efforts in a coordinated International Exascale Software Project.

736 citations

Book
19 Sep 1998
TL;DR: This volume, the definitive reference manual for the latest version of MPI-1, contains a complete specification of the MPI Standard, annotated with comments that clarify complicated issues, including why certain design choices were made, how users are intended to use the interface, and how implementors should construct their version of MPI.
Abstract: From the Publisher: Since its release in summer 1994, the Message Passing Interface (MPI) specification has become a standard for message-passing libraries for parallel computations. There exist more than a dozen implementations on a variety of computing platforms, from the IBM SP-2 supercomputer to PCs running Windows NT. The initial MPI Standard, known as MPI-1, has been modified over the last two years. This volume, the definitive reference manual for the latest version of MPI-1, contains a complete specification of the MPI Standard. It is annotated with comments that clarify complicated issues, including why certain design choices were made, how users are intended to use the interface, and how they should construct their version of MPI. The volume also provides many detailed, illustrative programming examples.
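Beyond point-to-point messages, the standard also specifies collective operations. The following hedged sketch (values and output are illustrative, not taken from the manual) sums each process's rank onto rank 0 with MPI_Reduce.

#include <mpi.h>
#include <stdio.h>

/* Sketch of an MPI collective operation: every process contributes its own
 * rank number, and MPI_Reduce sums the contributions onto rank 0.          */
int main(int argc, char **argv)
{
    int rank, size, total;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Sum 0 + 1 + ... + (size - 1) across all processes onto rank 0. */
    MPI_Reduce(&rank, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum of ranks over %d processes = %d\n", size, total);

    MPI_Finalize();
    return 0;
}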

437 citations

Journal ArticleDOI
01 May 2014
TL;DR: This paper presents a report produced by the workshop 'Addressing failures in exascale computing', held in Park City, Utah, 4–11 August 2012, which summarizes and builds on the workshop's discussions of resilience.
Abstract: We present here a report produced by a workshop on 'Addressing failures in exascale computing' held in Park City, Utah, 4-11 August 2012. The charter of this workshop was to establish a common taxonomy about resilience across all the levels in a computing system, discuss existing knowledge on resilience across the various hardware and software layers of an exascale system, and build on those results, examining potential solutions from both a hardware and software perspective and focusing on a combined approach. The workshop brought together participants with expertise in applications, system software, and hardware; they came from industry, government, and academia, and their interests ranged from theory to implementation. The combination allowed broad and comprehensive discussions and led to this document, which summarizes and builds on those discussions.

406 citations

Book
01 Jan 2011
TL;DR: The analysis finds a minimal set of delays that enforces sequential consistency in the execution of parallel programs on shared-memory multiple-instruction-stream, multiple-data-stream (MIMD) computers and uses a conflict graph similar to that used to schedule transactions in distributed databases to do without locks.
Abstract: In this paper we consider an optimization problem that arises in the execution of parallel programs on shared-memory multiple-instruction-stream, multiple-data-stream (MIMD) computers. A program on such machines consists of many sequential program segments, each executed by a single processor. These segments interact as they access shared variables. Access to memory is asynchronous, and memory accesses are not necessarily executed in the order they were issued. An execution is correct if it is sequentially consistent: it should seem as if all the instructions were executed sequentially, in an order obtained by interleaving the instruction streams of the processors. Sequential consistency can be enforced by delaying each access to shared memory until the previous access of the same processor has terminated. For performance reasons, however, we want to allow several accesses by the same processor to proceed concurrently. Our analysis finds a minimal set of delays that enforces sequential consistency. The analysis extends to interprocessor synchronization constraints and to code where blocks of operations have to execute atomically. We use a conflict graph similar to that used to schedule transactions in distributed databases. Our graph incorporates the order on operations given by the program text, enabling us to do without locks even when database conflict graphs would suggest that locks are necessary. Our work has implications for the design of multiprocessors; it offers new compiler optimization techniques for parallel languages that support shared variables.
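The classic pattern behind this kind of analysis is a pair of segments that each write one shared variable and then read the other; without a delay between the write and the following read, both reads can return stale values and sequential consistency is violated. The sketch below is an illustrative reconstruction (not code from the paper) using C11 sequentially consistent atomics, which insert the required delays automatically; with relaxed accesses the outcome r0 == 0 and r1 == 0 would be allowed.

#include <stdatomic.h>
#include <threads.h>
#include <stdio.h>

atomic_int x = 0, y = 0;   /* shared flags */
int r0, r1;                /* values read by each thread */

int thread_a(void *arg)
{
    (void)arg;
    atomic_store(&x, 1);     /* write own flag ...                          */
    r0 = atomic_load(&y);    /* ... then read the other (needs a delay here) */
    return 0;
}

int thread_b(void *arg)
{
    (void)arg;
    atomic_store(&y, 1);
    r1 = atomic_load(&x);
    return 0;
}

int main(void)
{
    thrd_t a, b;
    thrd_create(&a, thread_a, NULL);
    thrd_create(&b, thread_b, NULL);
    thrd_join(a, NULL);
    thrd_join(b, NULL);
    /* Under sequential consistency at least one thread sees the other's store,
     * so r0 == 0 && r1 == 0 cannot happen. */
    printf("r0=%d r1=%d\n", r0, r1);
    return 0;
}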

402 citations


Cited by
01 May 1993
TL;DR: Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.
Abstract: Three parallel algorithms for classical molecular dynamics are presented. The first assigns each processor a fixed subset of atoms; the second assigns each a fixed subset of inter-atomic forces to compute; the third assigns each a fixed spatial region. The algorithms are suitable for molecular dynamics models which can be difficult to parallelize efficiently: those with short-range forces where the neighbors of each atom change rapidly. They can be implemented on any distributed-memory parallel machine which allows for message-passing of data between independently executing processors. The algorithms are tested on a standard Lennard-Jones benchmark problem for system sizes ranging from 500 to 100,000,000 atoms on several parallel supercomputers: the nCUBE 2, Intel iPSC/860 and Paragon, and Cray T3D. Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems. For large problems, the spatial algorithm achieves parallel efficiencies of 90%, and an 1840-node Intel Paragon performs up to 165 times faster than a single Cray C90 processor. Trade-offs between the three algorithms and guidelines for adapting them to more complex molecular dynamics simulations are also discussed.
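To illustrate the first strategy, a hedged sketch of one atom-decomposition step is shown below (the constants, buffer names, and omitted pair-force kernel are assumptions for illustration, not the paper's code): each rank owns a fixed block of atoms, gathers all positions with MPI_Allgather, and computes forces only for its own block.

#include <mpi.h>
#include <stdlib.h>

#define N   1024            /* total number of atoms (assumed divisible by P) */
#define DIM 3               /* spatial dimensions                             */

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int nlocal = N / nprocs;                               /* atoms this rank owns */
    double *xlocal = calloc((size_t)nlocal * DIM, sizeof(double)); /* my positions */
    double *xall   = calloc((size_t)N * DIM,      sizeof(double)); /* everyone's   */
    double *flocal = calloc((size_t)nlocal * DIM, sizeof(double)); /* my forces    */

    /* ... initialize xlocal with this rank's atom coordinates ... */

    /* One step of the atom-decomposition loop: share all positions, */
    /* then compute forces only for locally owned atoms.             */
    MPI_Allgather(xlocal, nlocal * DIM, MPI_DOUBLE,
                  xall,   nlocal * DIM, MPI_DOUBLE, MPI_COMM_WORLD);

    for (int i = 0; i < nlocal; i++) {        /* my atoms               */
        int gi = rank * nlocal + i;           /* global index of atom i */
        for (int j = 0; j < N; j++) {         /* all other atoms        */
            if (j == gi) continue;
            /* ... accumulate the short-range pair force on atom i into
             *     flocal[i*DIM .. i*DIM+2] from xall[gi*DIM] and xall[j*DIM] ... */
        }
    }

    /* ... integrate xlocal forward in time using flocal, then repeat ... */

    free(xlocal); free(xall); free(flocal);
    MPI_Finalize();
    return 0;
}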

29,323 citations

Book
01 Jan 1995
TL;DR: This book introduces the basic concepts in the design and analysis of randomized algorithms and presents basic tools such as probability theory and probabilistic analysis that are frequently used in algorithmic applications.
Abstract: For many applications, a randomized algorithm is either the simplest or the fastest algorithm available, and sometimes both. This book introduces the basic concepts in the design and analysis of randomized algorithms. The first part of the text presents basic tools such as probability theory and probabilistic analysis that are frequently used in algorithmic applications. Algorithmic examples are also given to illustrate the use of each tool in a concrete setting. In the second part of the book, each chapter focuses on an important area to which randomized algorithms can be applied, providing a comprehensive and representative selection of the algorithms that might be used in each of these areas. Although written primarily as a text for advanced undergraduates and graduate students, this book should also prove invaluable as a reference for professionals and researchers.
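As a small, hedged illustration of the kind of algorithm the book covers (a sketch of my own, not an example taken from the text), randomized quickselect finds the k-th smallest element in expected linear time by choosing pivots uniformly at random.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Return the k-th smallest (0-based) element of a[0..n-1]; a is reordered. */
static int quickselect(int *a, int n, int k)
{
    int lo = 0, hi = n - 1;
    while (lo < hi) {
        /* The uniformly random pivot is what makes the expected time linear. */
        int p = lo + rand() % (hi - lo + 1);
        swap(&a[p], &a[hi]);
        int store = lo;                        /* Lomuto partition around a[hi] */
        for (int i = lo; i < hi; i++)
            if (a[i] < a[hi])
                swap(&a[i], &a[store++]);
        swap(&a[store], &a[hi]);               /* pivot now sits at 'store'     */
        if (k == store)      return a[store];
        else if (k < store)  hi = store - 1;
        else                 lo = store + 1;
    }
    return a[lo];
}

int main(void)
{
    int a[] = {9, 3, 7, 1, 8, 2, 5, 6, 4, 0};
    srand((unsigned)time(NULL));
    printf("6th smallest = %d\n", quickselect(a, 10, 5));  /* prints 5 */
    return 0;
}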

4,412 citations

Journal ArticleDOI
TL;DR: This paper describes PTRAJ and its successor CPPTRAJ, two complementary, portable, and freely available computer programs for the analysis and processing of time series of three-dimensional atomic positions and the data derived from them.
Abstract: We describe PTRAJ and its successor CPPTRAJ, two complementary, portable, and freely available computer programs for the analysis and processing of time series of three-dimensional atomic positions (i.e., coordinate trajectories) and the data therein derived. Common tools include the ability to manipulate the data to convert among trajectory formats, process groups of trajectories generated with ensemble methods (e.g., replica exchange molecular dynamics), image with periodic boundary conditions, create average structures, strip subsets of the system, and perform calculations such as RMS fitting, measuring distances, B-factors, radii of gyration, radial distribution functions, and time correlations, among other actions and analyses. Both the PTRAJ and CPPTRAJ programs and source code are freely available under the GNU General Public License version 3 and are currently distributed within the AmberTools 12 suite of support programs that make up part of the Amber package of computer programs (see http://ambe...

4,382 citations

Book
01 Jan 1996
TL;DR: This book familiarizes readers with important problems, algorithms, and impossibility results in the area, and teaches readers how to reason carefully about distributed algorithms: to model them formally, devise precise specifications for their required behavior, prove their correctness, and evaluate their performance with realistic measures.
Abstract: In Distributed Algorithms, Nancy Lynch provides a blueprint for designing, implementing, and analyzing distributed algorithms. She directs her book at a wide audience, including students, programmers, system designers, and researchers. Distributed Algorithms contains the most significant algorithms and impossibility results in the area, all in a simple automata-theoretic setting. The algorithms are proved correct, and their complexity is analyzed according to precisely defined complexity measures. The problems covered include resource allocation, communication, consensus among distributed processes, data consistency, deadlock detection, leader election, global snapshots, and many others. The material is organized according to the system model: first by the timing model and then by the interprocess communication mechanism. The material on system models is isolated in separate chapters for easy reference. The presentation is completely rigorous, yet is intuitive enough for immediate comprehension. This book familiarizes readers with important problems, algorithms, and impossibility results in the area: readers can then recognize the problems when they arise in practice, apply the algorithms to solve them, and use the impossibility results to determine whether problems are unsolvable. The book also provides readers with the basic mathematical tools for designing new algorithms and proving new impossibility results. In addition, it teaches readers how to reason carefully about distributed algorithms: to model them formally, devise precise specifications for their required behavior, prove their correctness, and evaluate their performance with realistic measures.

Table of Contents: 1. Introduction; 2. Modelling I: Synchronous Network Model; 3. Leader Election in a Synchronous Ring; 4. Algorithms in General Synchronous Networks; 5. Distributed Consensus with Link Failures; 6. Distributed Consensus with Process Failures; 7. More Consensus Problems; 8. Modelling II: Asynchronous System Model; 9. Modelling III: Asynchronous Shared Memory Model; 10. Mutual Exclusion; 11. Resource Allocation; 12. Consensus; 13. Atomic Objects; 14. Modelling IV: Asynchronous Network Model; 15. Basic Asynchronous Network Algorithms; 16. Synchronizers; 17. Shared Memory versus Networks; 18. Logical Time; 19. Global Snapshots and Stable Properties; 20. Network Resource Allocation; 21. Asynchronous Networks with Process Failures; 22. Data Link Protocols; 23. Partially Synchronous System Models; 24. Mutual Exclusion with Partial Synchrony; 25. Consensus with Partial Synchrony

4,340 citations