
Showing papers by "Carl Kesselman" published in 2003


Posted Content
TL;DR: In this paper, the authors propose an approach to the representation, maintenance, and enforcement of fine-grained access control policies in distributed communities of resource providers and resource consumers, within which often complex and dynamic policies govern who can use which resources for which purpose.
Abstract: In "Grids" and "collaboratories," we find distributed communities of resource providers and resource consumers, within which often complex and dynamic policies govern who can use which resources for which purpose. We propose a new approach to the representation, maintenance, and enforcement of such policies that provides a scalable mechanism for specifying and enforcing these policies. Our approach allows resource providers to delegate some of the authority for maintaining fine-grained access control policies to communities, while still maintaining ultimate control over their resources. We also describe a prototype implementation of this approach and an application in a data management context.

680 citations



Proceedings ArticleDOI
22 Jun 2003
TL;DR: This work describes new approaches developed to support the Globus Toolkit version 3 (GT3) implementation of the Open Grid Services Architecture, an initiative that is recasting Grid concepts within a service-oriented framework based on Web services.
Abstract: Grid computing is concerned with the sharing and coordinated use of diverse resources in distributed "virtual organizations." The dynamic and multi-institutional nature of these environments introduces challenging security issues that demand new technical approaches. In particular, one must deal with diverse local mechanisms, support dynamic creation of services, and enable dynamic creation of trust domains. We describe how these issues are addressed in two generations of the Globus Toolkit®. First, we review the Globus Toolkit version 2 (GT2) approach; then we describe new approaches developed to support the Globus Toolkit version 3 (GT3) implementation of the Open Grid Services Architecture, an initiative that is recasting Grid concepts within a service-oriented framework based on Web services. GT3's security implementation uses Web services security mechanisms for credential exchange and other purposes, and introduces a tight least-privilege model that avoids the need for any privileged network service.

518 citations


Journal ArticleDOI
TL;DR: The current ACWG based on AI planning technologies is described and it is outlined how these technologies can play a crucial role in developing complex application workflows in Grid environments.
Abstract: In this paper we address the problem of automatically generating job workflows for the Grid. These workflows describe the execution of a complex application built from individual application components. In our work we have developed two workflow generators: the first (the Concrete Workflow Generator CWG) maps an abstract workflow defined in terms of application-level components to the set of available Grid resources. The second generator (Abstract and Concrete Workflow Generator, ACWG) takes a wider perspective and not only performs the abstract to concrete mapping but also enables the construction of the abstract workflow based on the available components. This system operates in the application domain and chooses application components based on the application metadata attributes. We describe our current ACWG based on AI planning technologies and outline how these technologies can play a crucial role in developing complex application workflows in Grid environments. Although our work is preliminary, CWG has already been used to map high energy physics applications onto the Grid. In one particular experiment, a set of production runs lasted 7 days and resulted in the generation of 167,500 events by 678 jobs. Additionally, ACWG was used to map gravitational physics workflows, with hundreds of nodes onto the available resources, resulting in 975 tasks, 1365 data transfers and 975 output files produced.

517 citations


Posted Content
TL;DR: This paper reviews the security approach of the Globus Toolkit version 2 (GT2) and describes new approaches developed to support the Globus Toolkit version 3 (GT3) implementation of the Open Grid Services Architecture, an initiative that is recasting Grid concepts within a service-oriented framework based on Web services.
Abstract: Grid computing is concerned with the sharing and coordinated use of diverse resources in distributed "virtual organizations." The dynamic and multi-institutional nature of these environments introduces challenging security issues that demand new technical approaches. In particular, one must deal with diverse local mechanisms, support dynamic creation of services, and enable dynamic creation of trust domains. We describe how these issues are addressed in two generations of the Globus Toolkit. First, we review the Globus Toolkit version 2 (GT2) approach; then, we describe new approaches developed to support the Globus Toolkit version 3 (GT3) implementation of the Open Grid Services Architecture, an initiative that is recasting Grid concepts within a service oriented framework based on Web services. GT3's security implementation uses Web services security mechanisms for credential exchange and other purposes, and introduces a tight least-privilege model that avoids the need for any privileged network service.

507 citations


Book ChapterDOI
30 May 2003
TL;DR: This presentation complements an earlier foundational article, “The Anatomy of the Grid,” by describing how Grid mechanisms can implement a service-oriented architecture, explaining how Grid functionality can be incorporated into a Web services framework, and illustrating how the architecture can be applied within commercial computing as a basis for distributed system integration.
Abstract: In both e-business and e-science, we often need to integrate services across distributed, heterogeneous, dynamic “virtual organizations” formed from the disparate resources within a single enterprise and/or from external resource sharing and service provider relationships. This integration can be technically challenging because of the need to achieve various qualities of service when running on top of different native platforms. We present an Open Grid Services Architecture that addresses these challenges. Building on concepts and technologies from the Grid and Web services communities, this architecture defines a uniform exposed service semantics (the Grid service); defines standard mechanisms for creating, naming, and discovering transient Grid service instances; provides location transparency and multiple protocol bindings for service instances; and supports integration with underlying native platform facilities. The Open Grid Services Architecture also defines, in terms of Web Services Description Language (WSDL) interfaces and associated conventions, mechanisms required for creating and composing sophisticated distributed systems, including lifetime management, change management, and notification. Service bindings can support reliable invocation, authentication, authorization, and delegation, if required. Our presentation complements an earlier foundational article, “The Anatomy of the Grid,” by describing how Grid mechanisms can implement a service-oriented architecture, explaining how Grid functionality can be incorporated into a Web services framework, and illustrating how our architecture can be applied within commercial computing as a basis for distributed system integration—within and across organizational domains. This is a DRAFT document and continues to be revised. The latest version can be found at http://www.globus.org/research/papers/ogsa.pdf. 
Please send comments to foster@mcs.anl.gov, carl@isi.edu, jnick@us.ibm.com, tuecke@mcs.anl.gov.

449 citations


Proceedings ArticleDOI
15 Nov 2003
TL;DR: The Metadata Catalog Service (MCS) provides a mechanism for storing and accessing descriptive metadata and allows users to query for data items based on desired attributes.
Abstract: Advances in computational, storage and network technologies as well as middleware such as the Globus Toolkit allow scientists to expand the sophistication and scope of data-intensive applications. These applications produce and analyze terabytes and petabytes of data that are distributed in millions of files or objects. To manage these large data sets efficiently, metadata or descriptive information about the data needs to be managed. There are various types of metadata, and it is likely that a range of metadata services will exist in Grid environments that are specialized for particular types of metadata cataloguing and discovery. In this paper, we present the design of a Metadata Catalog Service (MCS) that provides a mechanism for storing and accessing descriptive metadata and allows users to query for data items based on desired attributes. We describe our experience in using the MCS with several applications and present a scalability study of the service.

258 citations


Book ChapterDOI
20 Oct 2003
TL;DR: This paper has designed and prototyped an ontology-based resource selector that exploits ontologies, background knowledge, and rules for solving resource matching in the Grid using semantic web technologies.
Abstract: The Grid is an emerging technology for enabling resource sharing and coordinated problem solving in dynamic multi-institutional virtual organizations. In the Grid environment, shared resources and users typically span different organizations. The resource matching problem in the Grid involves assigning resources to tasks in order to satisfy task requirements and resource policies. These requirements and policies are often expressed in disjoint application and resource models, forcing a resource selector to perform semantic matching between the two. In this paper, we propose a flexible and extensible approach for solving resource matching in the Grid using semantic web technologies. We have designed and prototyped an ontology-based resource selector that exploits ontologies, background knowledge, and rules for solving resource matching in the Grid.

194 citations


Proceedings ArticleDOI
22 Jun 2003
TL;DR: This work proposes a grid workflow system (grid-WFS), a flexible failure handling framework for the grid, which addresses these grid-unique failure recovery requirements. Central to the framework is flexibility through the use of workflow structure as a high-level recovery policy specification.
Abstract: The generic, heterogeneous, and dynamic nature of the grid requires a new form of failure recovery mechanism to address its unique requirements such as support for diverse failure handling strategies, separation of failure handling strategies from application codes, and user-defined exception handling. We here propose a grid workflow system (grid-WFS), a flexible failure handling framework for the grid, which addresses these grid-unique failure recovery requirements. Central to the framework is flexibility by the use of workflow structure as a high-level recovery policy specification. We show how this use of high-level workflow structure allows users to achieve failure recovery in a variety of ways depending on the requirements and constraints of their applications. We also demonstrate that this use of workflow structure enables users to not only rapidly prototype and investigate failure handling strategies, but also easily change them by simply modifying the encompassing workflow structure, while the application code remains intact. Finally, we present an experimental evaluation of our framework using a simulation, demonstrating the value of supporting multiple failure recovery techniques in grid systems to achieve high performance in the presence of failures.

176 citations


Posted Content
TL;DR: This paper describes CAS and the past and current implementations of CAS, and it discusses the plans for CAS-related research.
Abstract: Virtual organizations (VOs) are communities of resource providers and users distributed over multiple policy domains. These VOs often wish to define and enforce consistent policies in addition to the policies of their underlying domains. This is challenging, not only because of the problems in distributing the policy to the domains, but also because of the fact that those domains may each have different capabilities for enforcing the policy. The Community Authorization Service (CAS) solves this problem by allowing resource providers to delegate some policy authority to the VO while maintaining ultimate control over their resources. In this paper we describe CAS and our past and current implementations of CAS, and we discuss our plans for CAS-related research.

150 citations


01 Jan 2003
TL;DR: This thesis describes how to achieve flexibility through the use of workflow structure as a high-level recovery policy specification, which enables support for multiple failure recovery techniques, the separation of failure handling strategies from the application code, and user-defined exception handling.
Abstract: Over the past few years, the Grid has emerged as a new infrastructure for developing so-called Grid applications by enabling the integration of instruments, display, and computing resources that are managed by diverse organizations in widespread locations. Even though the geographically distributed and non-centralized administrative nature of the Grid can make it prone to failures during task execution, the research focus so far has not been on fault tolerance. This thesis is intended to improve this situation by presenting the Grid Workflow System (Grid-WFS) designed to provide a special form of fault tolerance for the Grid: a generic failure detection mechanism and a flexible failure handling framework. The generic failure detection mechanism enables the detection of generic task crash failures. In addition, the mechanism allows users to define exceptions to handle task-specific failures without requiring any modifications to either the Grid protocol or the local policy of each Grid node. This thesis describes how to overcome the challenge by employing an event notification mechanism that is based on the interpretation of notification messages being delivered from different entities residing on each Grid node. The flexible failure handling framework allows users to achieve failure recovery in a variety of ways depending on the requirements and constraints of their applications. Central to the framework is flexibility in handling failures. The heterogeneity of the Grid environment and applications, and the dynamic nature of the Grid dictate that a single monolithic failure recovery strategy is not appropriate. This thesis describes how to achieve the flexibility by the use of workflow structure as a high-level recovery policy specification, which enables support for multiple failure recovery techniques, the separation of failure handling strategies from the application code, and user-defined exception handling. Finally, this thesis presents an experimental evaluation of the Grid-WFS using a simulation, demonstrating the value of supporting multiple failure recovery techniques in Grid systems to achieve high performance in the presence of failures.

Journal ArticleDOI
TL;DR: In this article, a failure detection service (FDS) and a flexible failure handling framework (Grid-WFS) are presented as a fault tolerance mechanism on the Grid, which enables the detection of both task crashes and user-defined exceptions.
Abstract: This paper presents a failure detection service (FDS) and a flexible failure handling framework (Grid-WFS) as a fault tolerance mechanism on the Grid. The FDS enables the detection of both task crashes and user-defined exceptions. A major challenge in providing such a generic failure detection service on the Grid is to detect those failures without requiring any modification to either the Grid protocol or the local policy of each Grid node. This paper describes how to overcome the challenge by using a notification mechanism which is based on the interpretation of notification messages being delivered from the underlying Grid resources. The Grid-WFS built on top of FDS allows users to achieve failure recovery in a variety of ways depending on the requirements and constraints of their applications. Central to the framework is flexibility in handling failures. This paper describes how to achieve the flexibility by the use of workflow structure as a high-level recovery policy specification, which enables support for multiple failure recovery techniques, the separation of failure handling strategies from the application code, and user-defined exception handling. Finally, this paper presents an experimental evaluation of the Grid-WFS using a simulation, demonstrating the value of supporting multiple failure recovery techniques in Grid applications to achieve high performance in the presence of failures.

Proceedings Article
09 Jun 2003
TL;DR: This work has implemented a planning system to generate task workflows for the Grid automatically, allowing the user to specify the desired data products in simple terms, and believes AI planning will play a crucial role in developing complex application workflows in the Grid.
Abstract: Grid computing gives users access to widely distributed networks of computing resources to solve large-scale tasks such as scientific computation. These tasks are defined as standalone components that can be combined to process the data in various ways. We have implemented a planning system to generate task workflows for the Grid automatically, allowing the user to specify the desired data products in simple terms. The planner uses heuristic control rules and searches a number of alternative complete plans in order to find a high-quality solution. We describe an implemented test case in gravitational wave interferometry and show how the planner is integrated in the Grid environment. We discuss promising future directions of this work. We believe AI planning will play a crucial role in developing complex application workflows for the Grid.

Journal ArticleDOI
01 Oct 2003
TL;DR: The Earth System Grid prototype is described, which brings together advanced analysis, replica management, data transfer, request management, and other technologies to support high-performance, interactive analysis of replicated data.
Abstract: In numerous scientific disciplines, terabyte and petabyte-scale data collections are emerging as critical community resources. A new class of "data grid" infrastructure is required to support management, transport, distributed access to, and analysis of these datasets by potentially thousands of users. Researchers who face this challenge include the climate modeling community, which performs long-duration computations accompanied by frequent output of very large files that must be further analyzed. We describe the Earth System Grid-I prototype, which brings together advanced analysis, replica management, data transfer, request management, and other technologies to support high-performance, interactive analysis of replicated data. We present performance results that demonstrate our ability to manage the location and movement of large datasets from the user's desktop. We report on experiments conducted over SciNET at SC'2000, where we achieved peak performance of 1.55 Gb/s and sustained performance of 512.9 Mb/s for data transfers between Texas and California. Finally, we describe the development of the next-generation Earth System Grid-II (ESG-II) project. Important issues for ESG-II include security requirements for production environments, efficient data filtering and transport, metadata services for discovery of relevant climate datasets, and sophisticated request or workflow management for complex tasks.


Proceedings Article
01 Jan 2003
TL;DR: This paper describes the initial work in capturing knowledge and heuristics about how to select application components and computing resources, and using that knowledge to generate automatically executable job workflows for the Grid.
Abstract: Grid computing provides key infrastructure for distributed problem solving in dynamic virtual organizations. It has been adopted by many scientific projects, and industrial interest is rising rapidly. However, Grids are still the domain of a few highly trained programmers with expertise in networking, high-performance computing, and operating systems. This paper describes our initial work in capturing knowledge and heuristics about how to select application components and computing resources, and using that knowledge to generate automatically executable job workflows for the Grid. Our system is implemented and integrated with a Grid environment where it has generated dozens of workflows with hundreds of jobs in real time. The paper also discusses the prospects of using AI to improve current Grid infrastructure.

Proceedings ArticleDOI
12 May 2003
TL;DR: The increasing role of ontologies in the context of Grid Computing for obtaining, comparing and analyzing data is described, and a declarative model that provides the outline for an ontology of scientific information is presented.
Abstract: In the emerging world of Grid Computing, shared computational, data, and other distributed resources are becoming available to enable scientific advancement through collaborative research and collaboratories. This paper describes the increasing role of ontologies in the context of Grid Computing for obtaining, comparing and analyzing data. We present ontology entities and a declarative model that provide the outline for an ontology of scientific information. Relationships between concepts are also given. The implementation of some concepts described in this ontology is discussed within the context of the Earth System Grid II (ESG)[1].

Proceedings ArticleDOI
15 Nov 2003
TL;DR: This paper focuses on the key enabling technology components, particularly Chimera and Pegasus which are used to create and manage the computational workflow that must be present to deal with the challenging application requirements.
Abstract: As part of the development of the National Virtual Observatory (NVO), a Data Grid for astronomy, we have developed a prototype science application to explore the dynamical history of galaxy clusters by analyzing the galaxies' morphologies. The purpose of the prototype is to investigate how Grid-based technologies can be used to provide specialized computational services within the NVO environment. In this paper we focus on the key enabling technology components, particularly Chimera and Pegasus which are used to create and manage the computational workflow that must be present to deal with the challenging application requirements. We illustrate how the components interplay with each other and can be driven from a special purpose application portal.

Book ChapterDOI
07 Sep 2003
TL;DR: Pegasus as mentioned in this paper is a workflow mapping and planning system that can map complex workflows onto the Grid using an AI-based planner to perform the mapping from high-level metadata descriptions to a workflow that can be executed on the Grid.
Abstract: This paper describes the Pegasus workflow mapping and planning system that can map complex workflows onto the Grid. In particular, Pegasus can be configured to generate an executable workflow based on application-specific attributes. In that configuration, Pegasus uses an AI-based planner to perform the mapping from high-level metadata descriptions to a workflow that can be executed on the Grid. This configuration of Pegasus was used in the context of the Laser Interferometer Gravitational Wave Observatory (LIGO) pulsar search. We conducted a successful demonstration of the system at SC 2002 during which time we ran approximately 200 pulsar searches.



Proceedings Article
01 Jan 2003

01 Jan 2003
TL;DR: This work describes in detail new approaches developed to support the GT3 implementation of the Open Grid Services Architecture, a new initiative aimed at recasting key Grid concepts within a service-oriented framework.
Abstract: Grid computing is concerned with the sharing and coordinated use of diverse resources in distributed "virtual organizations." The dynamic and multi-institutional nature of these environments introduces challenging security concerns that demand new technical approaches. In particular, we must deal with diverse local mechanisms, support dynamic creation of services, and enable dynamic creation of trust domains. We describe how these issues are addressed in two generations of the Globus Toolkit (GT2). First, we review the GT2 approach; then, we describe in detail new approaches developed to support the GT3 implementation of the Open Grid Services Architecture, a new initiative aimed at recasting key Grid concepts within a service-oriented framework. GT3's security implementation uses WS-Security mechanisms for credential exchange and other purposes, and introduces a tight least privilege model that avoids the need for any privileged service.


Journal ArticleDOI
TL;DR: This paper describes how the Grid enables new research possibilities in astronomy through multi-wavelength images, which can be used for pushing source detection and statistics by an order of magnitude from current techniques and for optimization of multi-wavelength image registration.
Abstract: We describe how the Grid enables new research possibilities in astronomy through multi-wavelength images. To see sky images in the same pixel space, they must be projected to that space, a computer-intensive process. There is thus a virtual data space induced that is defined by an image and the applied projection. This virtual data can be created and replicated with Planners and Replica catalog technology developed under the GriPhyN project. We plan to deploy our system (MONTAGE) on the U.S. Teragrid. Grid computing is also needed for ingesting data—computing background correction on each image—which forms a separate virtual data space. Multi-wavelength images can be used for pushing source detection and statistics by an order of magnitude from current techniques; for optimization of multi-wavelength image registration for detection and characterization of extended sources; and for detection of new classes of essentially multi-wavelength astronomical phenomena. The paper discusses both the Grid architecture and the scientific goals.