Proceedings ArticleDOI

Experiment Management Support for Performance Tuning

TL;DR: The design and preliminary implementation of a tool that views each execution as a scientific experiment and provides the functionality to answer questions about a program's performance that span more than a single execution or environment are reported on.
Abstract: The development of a high-performance parallel system or application is an evolutionary process. It may begin with models or simulations, followed by an initial implementation of the program. The code is then incrementally modified to tune its performance and continues to evolve throughout the application's life span. At each step, the key question for developers is: how and how much did the performance change? This question arises when comparing an implementation to models or simulations; when considering versions of an implementation that use a different algorithm, communication or numeric library, or language; when studying code behavior by varying the number or type of processors, type of network, type of processes, input data set or workload, or scheduling algorithm; and in benchmarking or regression testing. Despite the broad utility of this type of comparison, no existing performance tool provides the functionality needed to answer this question; even state-of-the-art research tools such as Paradyn [2] and Pablo [3] focus instead on measuring the performance of a single program execution.

We describe an infrastructure for answering this question at all stages of the life of an application. We view each program run, simulation result, or program model as an experiment, and provide this functionality in an Experiment Management system. Our project has three parts: (1) a representation for the space of executions, (2) techniques for quantitatively and automatically comparing two or more executions, and (3) enhanced performance diagnosis abilities based on historical performance data. In this paper we present initial results for the first two parts. The measure of success for this project is that we can automate an activity that was previously complex and cumbersome to do manually.

The first part is a concise representation for the set of executions collected over the life of an application. We store information about each experiment in a Program Event, which enumerates the components of the code executed and the execution environment, and stores the performance data collected. The possible combinations of code and execution environment form the multi-dimensional Program Space, with one dimension for each axis of variation and one point for each Program Event. We enable exploration of this space with a simple naming mechanism, a selection and query facility, and a set of interactive visualizations. Queries on a Program Space may be made both on the contents of the performance data and on the metadata that describes the multi-dimensional program space. A graphical representation of the Program Space serves as the user interface to the Experiment Management system.

The second part of the project is to develop techniques for automating comparison between experiments. Performance tuning across multiple executions must answer the deceptively simple question: what changed in this run of the program? We have developed techniques for determining the "difference" between two or more program runs, automatically describing both the structural differences (differences in program execution structure and resources used) and the performance variation (how the resources were used and how this changed from one run to the next). We can apply our technique to compare an actual execution with a predicted or desired performance measure for the application, and to compare distinct time intervals of a single program execution.
Uses for this include performance tuning efforts, automated scalability studies, resource allocation for metacomputing [4], performance model validation studies, and dynamic execution models in which processes are created, destroyed, or migrated [5], communication patterns and use of distributed shared memory may be optimized [6,9], or data values or code may be changed by steering [7,8]. The difference information is not necessarily a simple measure such as total execution time; it may be a more complex measure derived from details of the program structure, an analytical performance prediction, an actual previous execution of the code, a set of performance thresholds that the application is required to meet or exceed, or an incomplete set of data from selected intervals of an execution.

The third part of this research is to investigate the use of the predicted, summary, and historical data contained in the Program Events and Program Space for performance diagnosis. We are exploring novel opportunities for exploiting this collection of data to focus data gathering and analysis efforts on the critical sections of a large application, and to isolate spurious effects from interesting performance variations. Details of this are outside the scope of this paper.
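To make the Program Event / Program Space representation and the run-to-run comparison concrete, the following is a minimal Python sketch under stated assumptions: the class names, fields, query helpers, and the resource-matching rule are illustrative, not the authors' implementation or their Figure 2 algorithm.

from dataclasses import dataclass, field

@dataclass
class ProgramEvent:
    # One experiment: metadata naming its point in the Program Space, plus collected metrics.
    metadata: dict                                  # e.g. {"platform": "SP2", "processes": 16, "version": "v2"}
    metrics: dict = field(default_factory=dict)     # e.g. {"cpu_time": 41.2, "msgs_sent": 10342}

class ProgramSpace:
    # The set of Program Events, queryable on metadata and on performance data.
    def __init__(self):
        self.events = []

    def add(self, event):
        self.events.append(event)

    def select(self, metadata_pred=None, metric_pred=None):
        # Return events satisfying both predicates (either may be omitted).
        result = []
        for e in self.events:
            if metadata_pred and not metadata_pred(e.metadata):
                continue
            if metric_pred and not metric_pred(e.metrics):
                continue
            result.append(e)
        return result

def structural_difference(resources1, resources2, match):
    # Pair resources from two executions that "match" (same code unit, process, etc.);
    # unmatched resources on either side are reported as structural differences.
    paired, only_in_1, remaining = [], [], list(resources2)
    for r1 in resources1:
        hit = next((r2 for r2 in remaining if match(r1, r2)), None)
        if hit is not None:
            paired.append((r1, hit))
            remaining.remove(hit)
        else:
            only_in_1.append(r1)
    return paired, only_in_1, remaining             # remaining = resources only in the second run

For example, space.select(metadata_pred=lambda md: md.get("processes") == 16, metric_pred=lambda m: m.get("cpu_time", 0) > 40.0) would pick out the 16-process experiments whose CPU time exceeds a threshold, and structural_difference(run1_resources, run2_resources, match=lambda a, b: a == b) would report which resources (processes, files, functions, message tags) appear in only one of the two runs.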


Citations
Journal ArticleDOI
TL;DR: The overall architecture of the ASKALON tool set is described and the basic functionality of the four constituent tools is outlined, enabling tool interoperability and demonstrating the usefulness and effectiveness of ASKALON by applying the tools to real-world applications.
Abstract: Performance engineering of parallel and distributed applications is a complex task that iterates through various phases, ranging from modeling and prediction, to performance measurement, experiment ...

202 citations

Proceedings ArticleDOI
29 Mar 2006
TL;DR: A novel approach for finding performance problems in applications with a large number of processes that leverages the authors' multicast and data aggregation infrastructure to address these three performance tool scalability barriers is presented.
Abstract: Performance analysis tools are critical for the effective use of large parallel computing resources, but existing tools have failed to address three problems that limit their scalability: (1) management and processing of the volume of performance data generated when monitoring a large number of application processes, (2) communication between a large number of tool components, and (3) presentation of performance data and analysis results for applications with a large number of processes. In this paper, we present a novel approach for finding performance problems in applications with a large number of processes that leverages our multicast and data aggregation infrastructure to address these three performance tool scalability barriers.

First, we show how to design a scalable, distributed performance diagnosis facility. We demonstrate this design with an on-line, automated strategy for finding performance bottlenecks. Our strategy uses distributed, independent bottleneck search agents located in the tool agent processes that monitor running application processes. Second, we present a technique for constructing compact displays of the results of our bottleneck detection strategy. This technique, called the Sub-Graph Folding Algorithm, presents bottleneck search results using dynamic graphs that record the refinement of a bottleneck search. The complexity of the results graph is controlled by combining sub-graphs showing similar local application behavior into a composite sub-graph.

Using an approach that combines these two synergistic parts, we performed bottleneck searches on programs with up to 1024 processes with no sign of tool resource saturation. With 1024 application processes, our visualization technique reduced a search results graph containing over 30,000 nodes to a single composite 44-node sub-graph showing the same qualitative performance information as the original graph.
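The abstract describes the Sub-Graph Folding Algorithm only at a high level; the sketch below is a hedged Python illustration of the folding idea, with made-up data shapes and a simple equality-based grouping key rather than the paper's actual algorithm.

from collections import defaultdict

def fold_subgraphs(subgraphs):
    # subgraphs: process name -> set of bottleneck nodes found on that process,
    # e.g. {"p0": {"CPUbound"}, "p1": {"CPUbound"}, "p2": {"SyncWait/barrier"}}.
    # Processes with identical bottleneck sets are folded into one composite sub-graph.
    groups = defaultdict(list)
    for process, nodes in subgraphs.items():
        groups[frozenset(nodes)].append(process)
    return [{"bottlenecks": sorted(key), "processes": sorted(members)}
            for key, members in groups.items()]

if __name__ == "__main__":
    runs = {f"p{i}": {"CPUbound"} for i in range(1022)}
    runs["p1022"] = runs["p1023"] = {"SyncWait/barrier"}
    for composite in fold_subgraphs(runs):
        print(len(composite["processes"]), "processes share", composite["bottlenecks"])
    # Here 1024 per-process results collapse into two composite entries.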

71 citations

Proceedings ArticleDOI
24 Jun 2012
TL;DR: Expertus, a flexible code generation framework for automated performance testing of distributed applications in Infrastructure as a Service (IaaS) clouds, uses a multi-pass compiler approach and leverages template-driven code generation to modularly incorporate different software applications on IaaS clouds.
Abstract: Cloud computing is an emerging technology paradigm that revolutionizes the computing landscape by providing on-demand delivery of software, platform, and infrastructure over the Internet. Yet, architecting, deploying, and configuring enterprise applications to run well on modern clouds remains a challenge due to associated complexities and non-trivial implications. The natural and presumably unbiased approach to these questions is thorough testing before moving applications to production settings. However, thorough testing of enterprise applications on modern clouds is cumbersome and error-prone due to the large number of relevant scenarios and the difficulties of the testing process. We address some of these challenges through Expertus---a flexible code generation framework for automated performance testing of distributed applications in Infrastructure as a Service (IaaS) clouds. Expertus uses a multi-pass compiler approach and leverages template-driven code generation to modularly incorporate different software applications on IaaS clouds. Expertus automatically handles complex configuration dependencies of software applications and significantly reduces human errors associated with manual approaches for software configuration and testing. To date, Expertus has been used to study three distributed applications on five IaaS clouds with over 10,000 different hardware, software, and virtualization configurations. The flexibility and extensibility of Expertus, and our own experience in using it, show that new clouds, applications, and software packages can easily be incorporated.
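Expertus's template-driven generation is described above only in outline; the snippet below is a generic Python illustration of the template-expansion idea, with hypothetical template text, application names, and parameters rather than Expertus's actual templates or compiler passes.

from string import Template

# Hypothetical deployment-script template; real Expertus templates and passes differ.
DEPLOY_TEMPLATE = Template(
    "#!/bin/sh\n"
    "# deploy $app on $cloud\n"
    "provision --cloud $cloud --instances $instances --image $image\n"
    "install $app --db-pool-size $db_pool\n"
    "run-benchmark $app --clients $clients\n"
)

def generate_scripts(configs):
    # One generated script per configuration dictionary.
    return {f"{c['app']}-{c['cloud']}-{c['clients']}.sh": DEPLOY_TEMPLATE.substitute(c)
            for c in configs}

if __name__ == "__main__":
    configs = [
        {"app": "webapp", "cloud": "cloudA", "instances": 8,  "image": "ubuntu-12.04",
         "db_pool": 64, "clients": 1000},
        {"app": "webapp", "cloud": "cloudB", "instances": 16, "image": "ubuntu-12.04",
         "db_pool": 64, "clients": 4000},
    ]
    for name, script in generate_scripts(configs).items():
        print("==", name, "==")
        print(script)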

55 citations

Book ChapterDOI
12 Apr 1999
TL;DR: The abstract view on an event trace that the EARL interpreter provides to the user is described, a set of EARL script examples is used to demonstrate the features of EARL, and an overview of the EARL language is given.
Abstract: This paper describes a new meta-tool named EARL, which consists of a new high-level trace analysis language and its interpreter, and which allows users to easily construct new trace analysis tools. Because of its programmability and flexibility, EARL can be used for a wide range of event trace analysis tasks. It is especially well suited for automatic and for application- or domain-specific trace analysis and program validation. We describe the abstract view on an event trace that the EARL interpreter provides to the user, and give an overview of the EARL language. Finally, a set of EARL script examples is used to demonstrate the features of EARL.
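EARL's own language is not shown in the abstract, so the following is a plain-Python stand-in for the kind of programmable trace analysis such a tool enables (pairing send and receive events to compute message latencies); the event format is an assumption, not EARL's event model.

def match_messages(trace):
    # Pair send and receive events from a flat event trace and report message latencies.
    # Each event: {"type": "send"/"recv", "time": t, "src": p, "dst": q, "tag": tag}.
    pending = {}        # (src, dst, tag) -> list of unmatched send times, oldest first
    latencies = []
    for ev in trace:
        key = (ev["src"], ev["dst"], ev["tag"])
        if ev["type"] == "send":
            pending.setdefault(key, []).append(ev["time"])
        elif ev["type"] == "recv" and pending.get(key):
            latencies.append(ev["time"] - pending[key].pop(0))
    return latencies

if __name__ == "__main__":
    trace = [
        {"type": "send", "time": 0.10, "src": 0, "dst": 1, "tag": 7},
        {"type": "recv", "time": 0.13, "src": 0, "dst": 1, "tag": 7},
        {"type": "send", "time": 0.20, "src": 1, "dst": 0, "tag": 9},
        {"type": "recv", "time": 0.26, "src": 1, "dst": 0, "tag": 9},
    ]
    print(match_messages(trace))   # [0.03, 0.06] up to floating-point rounding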

50 citations


Cites background from "Experiment Management Support for P..."

  • ...It is also one of the few tools which support experiment management [12]....


Proceedings ArticleDOI
23 Sep 2002
TL;DR: The ZENTURIO experiment management system for parameter studies, performance analysis, and software testing for cluster and Grid architectures, implemented based on Java/Jini distributed technology is introduced.
Abstract: The need to conduct and manage large sets of experiments for scientific applications dramatically increased over the last decade. However, there is still very little tool support for this complex and tedious process. We introduce the ZENTURIO experiment management system for parameter studies, performance analysis, and software testing for cluster and Grid architectures. ZENTURIO uses the ZEN directive-based language to specify arbitrarily complex program executions. ZENTURIO is designed as a collection of Grid services that comprise: (1) a registry service which supports registering and locating Grid services; (2) an experiment generator that parses files with ZEN directives and instruments applications for performance analysis and parameter studies; (3) an experiment executor that compiles and controls the execution of experiments on the target machine. A graphical user portal allows the user to control and monitor the experiments and to automatically visualise performance and output data across multiple experiments. ZENTURIO has been implemented based on Java/Jini distributed technology. It supports experiment management on cluster architectures via PBS and on Grid infrastructures through GRAM. We report results of using ZENTURIO for performance analysis of an ocean simulation application and a parameter study of a computational finance code.
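The ZEN directive syntax is not reproduced here, so the sketch below only illustrates, in generic Python, what an experiment generator does with annotated parameters: expand them into the cross-product of concrete experiment configurations. The parameter names and values are hypothetical.

from itertools import product

def generate_experiments(parameters):
    # Expand annotated parameter ranges into one configuration per experiment:
    # the cross-product an experiment generator hands to the experiment executor.
    names = list(parameters)
    for values in product(*(parameters[n] for n in names)):
        yield dict(zip(names, values))

if __name__ == "__main__":
    # Hypothetical axes of variation for a parameter study.
    axes = {"processes": [4, 8, 16], "problem_size": [128, 256], "scheduler": ["fifo", "backfill"]}
    for i, config in enumerate(generate_experiments(axes)):
        print(f"experiment {i}: {config}")   # 3 * 2 * 2 = 12 experiments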

40 citations

References
Journal ArticleDOI

40,330 citations


"Experiment Management Support for P..." refers background or methods in this paper

  • ...while !(isEmpty(pendingQueue))
        currentFocus ← dequeue(pendingQueue)
        pr1 ← P(E1, m, currentFocus, )
        pr2 ← P(E2, m, currentFocus, )
        if dd(pr1, pr2) = true...


  • ...match(ri, rj) then
        S ← S ∪ {(ri ⊕ rj, Ti ⊕ Tj)}
        E2 ← E2 - (rj, Tj)
    else
        S ← S ∪ {(ri, Ti)}
    S ← S ∪ E2
    (Figure 2: Algorithm to find the Structural Difference of two Program Events, E1 ⊕ E2)


  • ...Uses for this include performance tuning efforts, automated scalability studies, resource allocation for metacomputing [4], performance model validation studies, and dynamic execution models where processes are created, destroyed, migrated [5], communication patterns and use of distributed shared memory may be optimized [6,9], or data values or code may be changed by steering [7,8]....


Journal ArticleDOI
01 Jun 1997
TL;DR: The Globus system is intended to achieve a vertically integrated treatment of application, middleware, and network; the long-term goal is an integrated set of higher-level services that enable applications to adapt to heterogeneous and dynamically changing metacomputing environments.
Abstract: The Globus system is intended to achieve a vertically integrated treatment of application, middleware, and network. A low-level toolkit provides basic mechanisms such as communication, authentication, network information, and data access. These mechanisms are used to construct various higher-level metacomputing services, such as parallel programming tools and schedulers. The long-term goal is to build an adaptive wide area resource environment (AWARE), an integrated set of higher-level services that enable applications to adapt to heterogeneous and dynamically changing metacomputing environments. Preliminary versions of Globus components were deployed successfully as part of the I-WAY networking experiment.

3,450 citations

Book
01 Jan 1994

1,985 citations

Journal ArticleDOI
TL;DR: Dynamic instrumentation lets us defer insertion until the moment it is needed (and remove it when it is no longer needed); Paradyn's Performance Consultant decides when and where to insert instrumentation.
Abstract: Paradyn is a tool for measuring the performance of large-scale parallel programs. Our goal in designing a new performance tool was to provide detailed, flexible performance information without incurring the space (and time) overhead typically associated with trace-based tools. Paradyn achieves this goal by dynamically instrumenting the application and automatically controlling this instrumentation in search of performance problems. Dynamic instrumentation lets us defer insertion until the moment it is needed (and remove it when it is no longer needed); Paradyn's Performance Consultant decides when and where to insert instrumentation.
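As a rough illustration of the deferred-instrumentation idea described above (insert a probe only while a metric is requested, remove it when no longer needed), here is a toy Python sketch; it is not Paradyn's API or mechanism.

from collections import defaultdict

class InstrumentationManager:
    # Toy model: probes exist only while some consumer has requested the metric/focus pair.
    def __init__(self):
        self.active = defaultdict(int)      # (metric, focus) -> outstanding request count
        self.samples = defaultdict(float)   # accumulated metric values while instrumented

    def request(self, metric, focus):
        if self.active[(metric, focus)] == 0:
            print(f"insert probe for {metric} at {focus}")
        self.active[(metric, focus)] += 1

    def release(self, metric, focus):
        self.active[(metric, focus)] -= 1
        if self.active[(metric, focus)] == 0:
            print(f"remove probe for {metric} at {focus}")

    def record(self, metric, focus, value):
        if self.active[(metric, focus)] > 0:    # data is collected only while the probe is present
            self.samples[(metric, focus)] += value

if __name__ == "__main__":
    mgr = InstrumentationManager()
    mgr.request("cpu_time", "whole_program")
    mgr.record("cpu_time", "whole_program", 0.8)
    mgr.release("cpu_time", "whole_program")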

864 citations

Book
01 Jan 1995
TL;DR: ParaGraph as mentioned in this paper is a software tool that provides a detailed, dynamic, graphical animation of the behavior of message-passing parallel programs and graphical summaries of their performance, animating trace information from actual runs to depict behavior and obtain the performance summaries.
Abstract: ParaGraph, a software tool that provides a detailed, dynamic, graphical animation of the behavior of message-passing parallel programs and graphical summaries of their performance, is presented. ParaGraph animates trace information from actual runs to depict behavior and obtain the performance summaries. It provides twenty-five perspectives on the same data, lending insight that might otherwise be missed. ParaGraph's features are described, its use is explained, its software design is briefly discussed, and its displays are examined in some detail. Future work on ParaGraph is indicated.

540 citations