
Showing papers by "Michael Wilde" published in 2009


Journal ArticleDOI
TL;DR: Parallel scripting extends this technique to allow for the rapid development of highly parallel applications that can run efficiently on platforms ranging from multicore workstations to petascale supercomputers.
Abstract: Scripting accelerates and simplifies the composition of existing codes to form more powerful applications. Parallel scripting extends this technique to allow for the rapid development of highly parallel applications that can run efficiently on platforms ranging from multicore workstations to petascale supercomputers.
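
The idea can be illustrated with a minimal sketch in plain Python (the paper itself concerns the Swift parallel scripting language, not Python): a script composes existing programs into a larger application, runs many invocations concurrently, and passes data between them through files. The executables "simulate" and "analyze" are hypothetical stand-ins for existing codes.

```python
# Minimal parallel-scripting sketch: run many invocations of an existing program
# concurrently and hand their file outputs to a downstream program.
# "simulate" and "analyze" are hypothetical placeholders for real executables.
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def run_task(i: int) -> Path:
    """Invoke an existing code once, writing its result to a per-task file."""
    out = Path("out") / f"result_{i:04d}.txt"
    out.parent.mkdir(exist_ok=True)
    with out.open("w") as f:
        subprocess.run(["simulate", "--seed", str(i)], stdout=f, check=True)
    return out

# The scripting layer only expresses the parallelism; on a larger system the same
# structure would be mapped onto many cores or nodes by the runtime.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(run_task, range(100)))

subprocess.run(["analyze", *map(str, results)], check=True)
```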

110 citations


Journal ArticleDOI
01 Jul 2009
TL;DR: The applications that can benefit from parallel scripting on petascale-class machines are characterized, the mechanisms that make this feasible on such systems are described, and results achieved with parallel scripts on currently available petascale computers are presented.
Abstract: Parallel scripting is a loosely coupled programming model in which applications are composed of highly parallel scripts of program invocations that process and exchange data via files. We characterize here the applications that can benefit from parallel scripting on petascale-class machines, describe the mechanisms that make this feasible on such systems, and present results achieved with parallel scripts on currently available petascale computers.

28 citations


Proceedings ArticleDOI
14 Nov 2009
TL;DR: This work profiles the essential operations in the I/O workload for five loosely coupled scientific applications and offers an analysis to motivate and aid the development of programming tools, I/O subsystems, and filesystems.
Abstract: A large number of real-world scientific applications can be characterized as loosely coupled: the communication among tasks is infrequent and can be performed by using file operations. While these applications may be ported to large-scale machines designed for tightly coupled, massively parallel jobs, direct implementations do not perform well because of the large number of small, latency-bound file accesses. This problem may be overcome through the use of a variety of custom, hand-coded strategies applied at various subsystems of modern near-petascale computers, but this is a labor-intensive process that will become increasingly difficult at the petascale and beyond. This work profiles the essential operations in the I/O workload for five loosely coupled scientific applications. We characterize the I/O workload induced by these applications and offer an analysis to motivate and aid the development of programming tools, I/O subsystems, and filesystems.
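
To make the kind of per-task I/O profile described here concrete, the sketch below (an illustration, not the authors' instrumentation) tallies opens, reads, writes, and byte counts for the Python-level file I/O performed by one loosely coupled task; real profiles of such workloads would be gathered at the system-call or filesystem level.

```python
# Illustrative tally of one task's file operations, to expose the many small,
# latency-bound accesses the paper describes. Counts only Python-level I/O
# performed in this process while the counting wrapper is installed.
import builtins
from collections import Counter

stats = Counter()
_real_open = builtins.open

class CountingFile:
    def __init__(self, f):
        self._f = f
    def read(self, *args):
        data = self._f.read(*args)
        stats["reads"] += 1
        stats["bytes_read"] += len(data)
        return data
    def write(self, data):
        stats["writes"] += 1
        stats["bytes_written"] += len(data)
        return self._f.write(data)
    def __getattr__(self, name):
        return getattr(self._f, name)   # delegate close(), seek(), readline(), ...
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        self._f.close()

def counting_open(path, mode="r", *args, **kwargs):
    stats["opens"] += 1
    return CountingFile(_real_open(path, mode, *args, **kwargs))

builtins.open = counting_open      # install the wrapper ...
# ... run one task's Python-level I/O here ...
builtins.open = _real_open         # ... then restore and report the tallies
print(dict(stats))                 # per-task counts of opens/reads/writes and bytes
```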

23 citations


Journal ArticleDOI
TL;DR: A computational framework suitable for a data-driven approach to structural equation modeling (SEM) is presented and several workflows for modeling functional magnetic resonance imaging (fMRI) data within this framework are described.
Abstract: We present a computational framework suitable for a data-driven approach to structural equation modeling (SEM) and describe several workflows for modeling functional magnetic resonance imaging (fMRI) data within this framework. The Computational Neuroscience Applications Research Infrastructure (CNARI) employs a high-level scripting language called Swift, which is capable of spawning hundreds of thousands of simultaneous R processes (R Core Development Team, 2008), consisting of self-contained structural equation models, on a high-performance computing (HPC) system. These self-contained R processing jobs are data objects generated by OpenMx, a plug-in for R, which can generate a single model object containing the matrices and algebraic information necessary to estimate parameters of the model. With such an infrastructure in place, a structural modeler may begin to investigate exhaustive searches of the model space. Specific applications of the infrastructure, statistics related to model fit, and limitations are discussed in relation to exhaustive SEM. In particular, we discuss how workflow management techniques can help to solve large computational problems in neuroimaging.
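
As a rough picture of what an exhaustive search over a model space looks like, the sketch below enumerates candidate sets of directed paths and fits each candidate in its own R process. It is a conceptual stand-in only: the paper's infrastructure uses Swift to dispatch the R/OpenMx jobs onto an HPC system, and the region names and the script name fit_model.R here are illustrative assumptions.

```python
# Conceptual exhaustive SEM search: one independent R process per candidate model.
# Region names and "fit_model.R" are hypothetical placeholders.
import itertools
import subprocess
from concurrent.futures import ThreadPoolExecutor

regions = ["IFG", "STG", "MTG", "AG"]                        # candidate ROIs (illustrative)
candidate_paths = list(itertools.permutations(regions, 2))   # all directed connections

def fit(model_id: int, paths) -> str:
    """Fit one self-contained structural equation model in its own R process."""
    spec = ";".join(f"{a}->{b}" for a, b in paths)
    result = subprocess.run(
        ["Rscript", "fit_model.R", "--paths", spec, "--out", f"fit_{model_id}.csv"],
        capture_output=True, text=True, check=True)
    return result.stdout.strip()            # e.g. a line of fit statistics

# Exhaustive search over all path subsets of size 3 (kept small for the sketch);
# the workflow layer fans these out as independent jobs.
subsets = list(itertools.combinations(candidate_paths, 3))
with ThreadPoolExecutor(max_workers=16) as pool:
    summaries = list(pool.map(lambda args: fit(*args), enumerate(subsets)))
```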

18 citations


Journal ArticleDOI
TL;DR: The Computational Neuroscience Applications Research Infrastructure (CNARI) incorporates novel methods for maintaining, serving, and analyzing massive amounts of fMRI data; the authors believe that these advanced computational approaches will fundamentally change the future shape of cognitive brain imaging with fMRI.

15 citations


Proceedings ArticleDOI
11 Dec 2009
TL;DR: An automation tool, ADEM, is proposed for grid application software deployment and management, and experimental results on the Open Science Grid show that ADEM is easy to use and more productive for users than manual operation.
Abstract: In grid environments, the deployment and management of application software presents a major practical challenge for end users. Performing these tasks manually is error-prone and not scalable to large grids. In this work, we propose an automation tool, ADEM, for grid application software deployment and management, and demonstrate and evaluate the tool on the Open Science Grid. ADEM uses Globus for basic grid services, and integrates the grid software installer Pacman. It supports both centralized “prebuild” and on-site “dynamic-build” approaches to software compilation, using the NMI Build and Test system to perform central prebuilds for specific target platforms. ADEM's parallel workflow automatically determines available grid sites and their platform “signatures”, checks for and integrates dependencies, and performs software build, installation, and testing. ADEM's tracking log of build and installation activities is helpful for troubleshooting potential exceptions. Experimental results on the Open Science Grid show that ADEM is easy to use and more productive for users than manual operation.
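
The control flow this implies can be sketched as follows; it is only an illustration of the prebuild-versus-dynamic-build decision and the per-site parallelism, with hypothetical site names, signature strings, and package URLs. The actual tool drives Globus, Pacman, and the NMI Build and Test system rather than the stubs shown here.

```python
# Illustrative ADEM-style deployment flow: discover sites, derive a platform
# "signature", then install a central prebuild if one exists or fall back to an
# on-site dynamic build. All names and URLs below are hypothetical.
from concurrent.futures import ThreadPoolExecutor

SITES = ["osg.site-a.example.org", "osg.site-b.example.org"]            # hypothetical
PREBUILT = {"x86_64-el5-gcc4": "http://repo.example.org/app-x86_64-el5.tar.gz"}

def platform_signature(site: str) -> str:
    # In practice this would query the site (OS, architecture, compiler) through a
    # grid job; here it is a stub returning a fixed value.
    return "x86_64-el5-gcc4"

def deploy(site: str) -> str:
    sig = platform_signature(site)
    if sig in PREBUILT:
        plan = f"install prebuilt package {PREBUILT[sig]}"               # centralized prebuild
    else:
        plan = "dynamic-build: fetch sources, resolve dependencies, compile on site"
    # ... stage files / submit the build job to the site, then run its tests ...
    return f"{site} [{sig}]: {plan}"

with ThreadPoolExecutor() as pool:            # one deployment task per site, in parallel
    for line in pool.map(deploy, SITES):
        print(line)
```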

13 citations


Journal ArticleDOI
TL;DR: The Open Science Grid (OSG) enables new science, new scientists, and new modalities in support of computationally based research, and leverages its deliverables to the large-scale physics experiment member communities to benefit new communities at all scales through activities in education, engagement, and the distributed facility.
Abstract: The Open Science Grid (OSG) includes work to enable new science, new scientists, and new modalities in support of computationally based research. There are frequently significant sociological and organizational changes required in the transformation from the existing to the new. OSG leverages its deliverables to the large-scale physics experiment member communities to benefit new communities at all scales through activities in education, engagement, and the distributed facility. As a partner to the poster and tutorial at SciDAC 2008, this paper gives both a brief general description and some specific examples of new science enabled on the OSG. More information is available at the OSG web site: http://www.opensciencegrid.org.

5 citations


Proceedings ArticleDOI
08 Dec 2009
TL;DR: A runtime reputation-based grid resource selection algorithm is proposed that adapts dynamically to the runtime availability, load, and performance of the grid resources.
Abstract: Scheduling and executing grid applications is an important problem in grid environments. To achieve high reliability and efficiency, we propose a runtime reputation-based grid resource selection algorithm. From an accumulated raw score, the runtime reputation degree of a grid resource is quantified as an evaluation score while an application runs. Rather than depending on historical experience, it adapts dynamically to the runtime availability, load, and performance of the grid resources. The execution framework on the grid is based on the Globus Toolkit and the Swift system. On a real production grid, the Open Science Grid (OSG), we experimented with a typical grid application consisting of large-scale independent jobs based on BLAST. Experimental results for the performance of different policies are presented, with a benchmark workload of 10,000 jobs, along with runtime reputation and behavior statistics for the grid resources.
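
A minimal sketch of the runtime-reputation idea is shown below, assuming a simple illustrative scoring rule (the abstract does not give the exact raw-score formula): each site accumulates a raw score from job outcomes observed during the current run, and the next batch of jobs goes to the sites with the highest current reputation.

```python
# Runtime reputation sketch: scores come only from outcomes observed during this
# run, not from historical records. The scoring rule below is an assumption.
from collections import defaultdict

raw_score = defaultdict(float)
jobs_seen = defaultdict(int)

def record_outcome(site: str, succeeded: bool, runtime_s: float, queued_s: float):
    """Update a site's accumulated raw score from one completed (or failed) job."""
    jobs_seen[site] += 1
    if succeeded:
        raw_score[site] += 1.0 / (1.0 + queued_s / max(runtime_s, 1.0))  # reward responsive sites
    else:
        raw_score[site] -= 1.0                                           # penalize failures

def reputation(site: str) -> float:
    # Normalize so a site is judged on per-job behaviour, not on how many jobs it received.
    return raw_score[site] / jobs_seen[site] if jobs_seen[site] else 0.0

def select_sites(candidates, k: int):
    """Pick the k sites with the highest current runtime reputation for the next batch."""
    return sorted(candidates, key=reputation, reverse=True)[:k]
```

Because the score is rebuilt from outcomes within the run, a site that becomes overloaded or starts failing mid-run quickly stops receiving new jobs, which is the adaptivity the abstract emphasizes.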

2 citations


Proceedings ArticleDOI
25 Jun 2009
TL;DR: This paper describes how to use Swift to enable the on-demand execution of large-scale PSAs on the Open Science Grid (OSG) and presents experimental results for the performance of different policies.
Abstract: Large-scale parameter sweep applications (PSAs) are among the main classes of grid applications, and individual PSAs can have very different characteristics and demands. In this paper, we describe how to use Swift to enable the on-demand execution of large-scale PSAs on the Open Science Grid (OSG). The basic on-demand concept is to provide grid resources appropriate to the application, as determined by its characteristics and demands, so that large sets of independent PSA jobs can run on the OSG with high reliability, efficiency, and scalability. The main on-demand policies include trust-based site selection and pre-selection; on-demand configuration of the scheduling policy; clustering of small jobs; adaptive execution and automatic data staging; and divide-and-conquer for scalability. Usage examples of Swift for executing large-scale PSAs, such as DOCK and BLAST, are presented. Experimental results for the performance of the different policies are presented, with a benchmark workload of 10,000 jobs.
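
One of the listed policies, clustering of small jobs, can be sketched simply: group many short PSA invocations into batches so that each grid submission amortizes its scheduling and queuing overhead. The batch size and the command name run_sweep are illustrative assumptions, not Swift's actual clustering mechanism.

```python
# Job-clustering sketch: pack many short, independent PSA jobs into fixed-size
# batches so one grid submission carries many invocations.
# "run_sweep" is a hypothetical command; the batch size is an illustrative choice.
from typing import List

def cluster_jobs(commands: List[str], batch_size: int = 100) -> List[List[str]]:
    """Split a large list of independent job command lines into fixed-size batches."""
    return [commands[i:i + batch_size] for i in range(0, len(commands), batch_size)]

# 10,000 independent invocations, matching the benchmarking workload size.
jobs = [f"run_sweep --param {n}" for n in range(10_000)]
batches = cluster_jobs(jobs)          # 100 submissions of 100 jobs instead of 10,000
```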

1 citation