Showing papers by "Robert E. Walkup" published in 2005

Open Access

Journal Article

Optimizing task layout on the Blue Gene/L supercomputer

Gyan Bhanot¹, Alan Gara¹, P. Heidelberger¹, E. Lawless², James C. Sexton¹, Robert E. Walkup¹

IBM¹, Trinity College, Dublin²

01 Mar 2005 - IBM Journal of Research and Development

TL;DR: A heuristic map is implemented that attempts to sequentially map a domain and its communication neighbors either to the same BG/L node or to near-neighbor nodes on the BG/L torus, while keeping the number of domains mapped to a BG/L node constant.

Abstract: A general method for optimizing problem layout on the Blue Gene®/L (BG/L) supercomputer is described. The method takes as input the communication matrix of an arbitrary problem as an array with entries C(i, j), which represents the data communicated from domain i to domain j. Given C(i, j), we implement a heuristic map that attempts to sequentially map a domain and its communication neighbors either to the same BG/L node or to near-neighbor nodes on the BG/L torus, while keeping the number of domains mapped to a BG/L node constant. We then generate a Markov chain of maps using Monte Carlo simulation with free energy F = Σ_{i,j} C(i, j) H(i, j), where H(i, j) is the smallest number of hops on the BG/L torus between domain i and domain j. For two large parallel applications, SAGE and UMT2000, the method was tested against the default Message Passing Interface rank order layout on up to 2,048 BG/L nodes. It produced maps that improved communication efficiency by up to 45%.
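
The cost function and Markov chain described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the pairwise-swap move, the fixed temperature, and the step count are all assumptions chosen for illustration. The abstract specifies only the free energy F = Σ C(i, j) H(i, j) and the use of Monte Carlo simulation.

```python
import math
import random

def torus_hops(a, b, dims):
    """Smallest number of hops between node coordinates a and b on a torus."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))

def free_energy(C, placement, dims):
    """F = sum over i, j of C[i][j] * H(i, j), where placement[i] is domain i's node."""
    n = len(C)
    return sum(
        C[i][j] * torus_hops(placement[i], placement[j], dims)
        for i in range(n) for j in range(n) if C[i][j]
    )

def anneal(C, placement, dims, steps=2000, temperature=1.0, seed=0):
    """Metropolis walk over maps (hypothetical move set: swap two domains).

    Swapping two domains leaves the number of domains per node unchanged.
    """
    rng = random.Random(seed)
    current = list(placement)
    f_current = free_energy(C, current, dims)
    best, f_best = list(current), f_current
    for _ in range(steps):
        i, j = rng.sample(range(len(current)), 2)
        current[i], current[j] = current[j], current[i]
        f_new = free_energy(C, current, dims)
        delta = f_new - f_current
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            f_current = f_new  # accept the swap
            if f_new < f_best:
                best, f_best = list(current), f_new
        else:
            current[i], current[j] = current[j], current[i]  # reject: undo the swap
    return best, f_best
```

For example, four domains that communicate in a ring (C[i][(i + 1) % 4] = 1) can be placed on a four-node one-dimensional torus with dims = (4,); calling anneal on a scrambled placement returns a map whose free energy is no higher than that of the starting map. Recomputing F in full after every swap keeps the sketch short; a practical version would update only the terms that involve the two swapped domains.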

80 citations

Journal Article

Blue Gene/L performance tools

Xavier Martorell¹, N. Smeds², Robert E. Walkup³, Jose R. Brunheroto³, G. Almasi³, John A. Gunnels³, Luiz DeRose⁴, Jesús Labarta¹, Francesc Escalé¹, Judit Gimenez¹, Harald Servat¹, José E. Moreira⁵

Polytechnic University of Catalonia¹, Royal Institute of Technology², IBM³, Cray⁴, University of Rochester⁵

01 Mar 2005 - IBM Journal of Research and Development

TL;DR: This work provides a variety of performance analysis tools for the new Blue Gene®/L supercomputer, and demonstrates their usefulness and applicability with case studies of application optimization.

Abstract: Good performance monitoring is the basis of modern performance analysis tools for application optimization. We are providing a variety of such performance analysis tools for the new Blue Gene®/L supercomputer. Those tools can be divided into two categories: single-node performance tools and multinode performance tools. From a single-node perspective, we provide standard interfaces and libraries, such as PAPI and libHPM, that provide access to the hardware performance counters for applications running on the Blue Gene/L compute nodes. From a multinode perspective, we focus on tools that analyze Message Passing Interface (MPI) behavior. Those tools work by first collecting message-passing trace data when a program runs. The trace data is then used by graphical interface tools that analyze the behavior of applications. Using the current prototype tools, we demonstrate their usefulness and applicability with case studies of application optimization.

10 citations

Patent

Optimizing layout of an application on a massively parallel supercomputer

Gyan V. Bhanot, Alan Gara, Philip Heidelberger, Eoin M. Lawless, James C. Sexton, Robert E. Walkup

06 Oct 2005

TL;DR: A general computer-implemented method and apparatus to optimize problem layout on a massively parallel supercomputer is described. It takes as input the communication matrix of an arbitrary problem in the form of an array whose entries C(i, j) are the amount of data communicated from domain i to domain j. Given C(i, j), a heuristic map is first implemented which attempts sequentially to map a domain and its communication neighbors either to the same supercomputer node or to near-neighbor nodes on the supercomputer torus while keeping the ...

Abstract: A general computer-implemented method and apparatus to optimize problem layout on a massively parallel supercomputer is described. The method takes as input the communication matrix of an arbitrary problem in the form of an array whose entries C(i, j) are the amount of data communicated from domain i to domain j. Given C(i, j), a heuristic map is first implemented which attempts sequentially to map a domain and its communication neighbors either to the same supercomputer node or to near-neighbor nodes on the supercomputer torus while keeping the number of domains mapped to a supercomputer node constant (as much as possible). Next, a Markov chain of maps is generated from the initial map using Monte Carlo simulation with free energy (cost function) F = Σ_{i,j} C(i, j) H(i, j), where H(i, j) is the smallest number of hops on the supercomputer torus between domain i and domain j. In the cases tested, the method was found to produce good mappings, and it has the potential to be used as a general layout optimization tool for parallel codes. At the moment, the serial code implemented to test the method is unoptimized, so the computation time to find the optimum map can be several hours on a typical PC. A production implementation would require good parallel code for the algorithm, which could itself run on a supercomputer.
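
The initial heuristic map mentioned in this abstract is described only in outline, so the following Python sketch is one possible reading rather than the patented procedure. It assumes a breadth-first walk over the communication graph, places each communication partner on the nearest torus node that still has room, and caps every node at per_node domains. The names greedy_map, torus_nodes_near, and per_node are hypothetical.

```python
from collections import deque

def torus_nodes_near(start, dims):
    """Yield every torus node in breadth-first (hop-distance) order from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        yield node
        for axis, size in enumerate(dims):
            for step in (1, -1):
                nxt = list(node)
                nxt[axis] = (nxt[axis] + step) % size  # wrap around the torus
                nxt = tuple(nxt)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)

def greedy_map(C, dims, per_node=1):
    """Place each domain, then its partners, on the same or the nearest free node."""
    n = len(C)
    load = {}       # node -> number of domains already placed there
    placement = {}  # domain -> node

    def place(domain, near):
        for node in torus_nodes_near(near, dims):
            if load.get(node, 0) < per_node:
                load[node] = load.get(node, 0) + 1
                placement[domain] = node
                return
        raise ValueError("not enough node capacity for all domains")

    origin = (0,) * len(dims)
    for first in range(n):
        if first in placement:
            continue
        place(first, origin)
        queue = deque([first])
        while queue:
            d = queue.popleft()
            # Unplaced partners of d, heaviest communication first.
            partners = sorted(
                (j for j in range(n)
                 if j not in placement and (C[d][j] or C[j][d])),
                key=lambda j: -(C[d][j] + C[j][d]),
            )
            for j in partners:
                place(j, placement[d])
                queue.append(j)
    return [placement[d] for d in range(n)]
```

The returned list maps domain index to node coordinates and could serve as the starting map for a Monte Carlo refinement of F. The per_node cap is the sketch's way of keeping the number of domains per node constant "as much as possible"; when n is not a multiple of the node count, some nodes end up with fewer domains.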

2 citations