scispace - formally typeset
Search or ask a question

Showing papers by "Steven J. Plimpton published in 1993"


01 May 1993
TL;DR: Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.
Abstract: Three parallel algorithms for classical molecular dynamics are presented. The first assigns each processor a fixed subset of atoms; the second assigns each a fixed subset of inter-atomic forces to compute; the third assigns each a fixed spatial region. The algorithms are suitable for molecular dynamics models which can be difficult to parallelize efficiently—those with short-range forces where the neighbors of each atom change rapidly. They can be implemented on any distributed-memory parallel machine which allows for message-passing of data between independently executing processors. The algorithms are tested on a standard Lennard-Jones benchmark problem for system sizes ranging from 500 to 100,000,000 atoms on several parallel supercomputers--the nCUBE 2, Intel iPSC/860 and Paragon, and Cray T3D. Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems. For large problems, the spatial algorithm achieves parallel efficiencies of 90% and a 1840-node Intel Paragon performs up to 165 faster than a single Cray C9O processor. Trade-offs between the three algorithms and guidelines for adapting them to more complex molecular dynamics simulations are also discussed.

29,323 citations


ReportDOI
01 May 1993
TL;DR: In this article, three parallel algorithms for classical molecular dynamics are presented, which can be implemented on any distributed-memory parallel machine which allows for message-passing of data between independently executing processors.
Abstract: Three parallel algorithms for classical molecular dynamics are presented. The first assigns each processor a subset of atoms; the second assigns each a subset of inter-atomic forces to compute; the third assigns each a fixed spatial region. The algorithms are suitable for molecular dynamics models which can be difficult to parallelize efficiently -- those with short-range forces where the neighbors of each atom change rapidly. They can be implemented on any distributed-memory parallel machine which allows for message-passing of data between independently executing processors. The algorithms are tested on a standard Lennard-Jones benchmark problem for system sizes ranging from 500 to 10,000,000 atoms on three parallel supercomputers, the nCUBE 2, Intel iPSC/860, and Intel Delta. Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems. For large problems, the spatial algorithm achieves parallel efficiencies of 90% and the Intel Delta performs about 30 times faster than a single Y-MP processor and 12 times faster than a single C90 processor. Trade-offs between the three algorithms and guidelines for adapting them to more complex molecular dynamics simulations are also discussed.

218 citations


Journal ArticleDOI
TL;DR: The authors present a detailed description of the implementation on a parallel supercomputer (hypercube) of the first-order equation-of-motion solution to Schrodinger`s equation, using plane-wave basis functions and ab initio separable pseudopotentials.
Abstract: The development of iterative solutions of Schrodinger`s equation in a plane-wave (pw) basis over the last several years has coincided with great advances in the computational power available for performing the calculations. These dual developments have enabled many new and interesting condensed matter phenomena to be studied from a first-principles approach. The authors present a detailed description of the implementation on a parallel supercomputer (hypercube) of the first-order equation-of-motion solution to Schrodinger`s equation, using plane-wave basis functions and ab initio separable pseudopotentials. By distributing the plane-waves across the processors of the hypercube many of the computations can be performed in parallel, resulting in decreases in the overall computation time relative to conventional vector supercomputers. This partitioning also provides ample memory for large Fast Fourier Transform (FFT) meshes and the storage of plane-wave coefficients for many hundreds of energy bands. The usefulness of the parallel techniques is demonstrated by benchmark timings for both the FFT`s and iterations of the self-consistent solution of Schrodinger`s equation for different sized Si unit cells of up to 512 atoms.

31 citations


Journal ArticleDOI
01 Jun 1993
TL;DR: This article documents the implementation of digital spotlight SAR processing components on three commercially avail able massively parallel computers: the Connection Ma chine (CM-2), the nCUBE 2, and the Intel iPSC/860.
Abstract: Near-real-time digital formation of large synthetic aper ture radar SAR images requires the computational throughput that only a dedicated special processor or a massively parallel computer can offer. This article documents the implementation of digital spotlight SAR processing components on three commercially avail able massively parallel computers: the Connection Ma chine CM-2, the nCUBE 2, and the Intel iPSC/860. The three basic spotlight SAR processing components, the polar reformatter, the two-dimensional fast Fourier transformation, and autofocus, are briefly discussed to provide a technical background for the implementation issues that apply to massively parallel computers. As pects of the SAR components that can exploit features of a SIMD or MIMD architecture are also presented. Finally, timing test results on various computers are provided for evaluation and comparison.

13 citations