Showing papers by "Carl Kesselman" published in 2002

Open Access

The Physiology of the Grid An Open Grid Services Architecture for Distributed Systems Integration

Ian Foster, Carl Kesselman, Jeffrey M. Nick, Steven Tuecke

01 Jan 2002

TL;DR: This presentation complements an earlier foundational article, “The Anatomy of the Grid,” by describing how Grid mechanisms can implement a service-oriented architecture, explaining how Grid functionality can be incorporated into a Web services framework, and illustrating how the architecture can be applied within commercial computing as a basis for distributed system integration.

Abstract: In both e-business and e-science, we often need to integrate services across distributed, heterogeneous, dynamic “virtual organizations” formed from the disparate resources within a single enterprise and/or from external resource sharing and service provider relationships. This integration can be technically challenging because of the need to achieve various qualities of service when running on top of different native platforms. We present an Open Grid Services Architecture that addresses these challenges. Building on concepts and technologies from the Grid and Web services communities, this architecture defines a uniform exposed service semantics (the Grid service); defines standard mechanisms for creating, naming, and discovering transient Grid service instances; provides location transparency and multiple protocol bindings for service instances; and supports integration with underlying native platform facilities. The Open Grid Services Architecture also defines, in terms of Web Services Description Language (WSDL) interfaces and associated conventions, mechanisms required for creating and composing sophisticated distributed systems, including lifetime management, change management, and notification. Service bindings can support reliable invocation, authentication, authorization, and delegation, if required. Our presentation complements an earlier foundational article, “The Anatomy of the Grid,” by describing how Grid mechanisms can implement a service-oriented architecture, explaining how Grid functionality can be incorporated into a Web services framework, and illustrating how our architecture can be applied within commercial computing as a basis for distributed system integration—within and across organizational domains. This is a DRAFT document and continues to be revised. The latest version can be found at http://www.globus.org/research/papers/ogsa.pdf. 
Please send comments to foster@mcs.anl.gov, carl@isi.edu, jnick@us.ibm.com, tuecke@mcs.anl.gov.

3,455 citations

Journal Article•DOI•

Grid services for distributed system integration

Ian Foster¹, Carl Kesselman², Jeffrey M. Nick³, Steven Tuecke¹•Institutions (3)

Argonne National Laboratory¹, University of Southern California², IBM³

01 Jun 2002-IEEE Computer

TL;DR: In this paper, the authors focus on the nature of the services that respond to protocol messages and propose a set of services that can be aggregated in various ways to meet the needs of virtual organizations, which themselves can be defined by the services they operate and share.

Abstract: Increasingly, computing addresses collaboration, data sharing, and interaction modes that involve distributed resources, resulting in an increased focus on the interconnection of systems both within and across enterprises. These evolutionary pressures have led to the development of Grid technologies. The authors' work focuses on the nature of the services that respond to protocol messages. Grid provides an extensible set of services that can be aggregated in various ways to meet the needs of virtual organizations, which themselves can be defined in part by the services they operate and share.

1,816 citations

Proceedings Article•DOI•

A community authorization service for group collaboration

Laura Pearlman¹, Von Welch, Ian Foster, Carl Kesselman, Steven Tuecke•Institutions (1)

University of Southern California¹

05 Jun 2002

TL;DR: This approach allows resource providers to delegate some of the authority for maintaining fine-grained access control policies to communities, while still maintaining ultimate control over their resources.

Abstract: In "grids" and "collaboratories", we find distributed communities of resource providers and resource consumers, within which often complex and dynamic policies govern who can use which resources for which purpose. We propose a new approach to the representation, maintenance and enforcement of such policies that provides a scalable mechanism for specifying and enforcing these policies. Our approach allows resource providers to delegate some of the authority for maintaining fine-grained access control policies to communities, while still maintaining ultimate control over their resources. We also describe a prototype implementation of this approach and an application in a data management context.
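
The delegation idea in this abstract can be illustrated with a minimal sketch: a request succeeds only if the community's fine-grained policy grants it and the resource provider's own coarse policy still permits the community to perform that action. All names here (`CommunityPolicy`, `ResourcePolicy`, `is_allowed`) are hypothetical and are not taken from the paper's prototype.

```python
from dataclasses import dataclass, field


@dataclass
class CommunityPolicy:
    """Fine-grained rules maintained by the community (hypothetical structure)."""
    # Maps a user name to the set of (resource, action) pairs the community grants.
    grants: dict = field(default_factory=dict)

    def permits(self, user, resource, action):
        return (resource, action) in self.grants.get(user, set())


@dataclass
class ResourcePolicy:
    """Coarse-grained rules kept by the resource provider (hypothetical structure)."""
    # Maps a community name to the set of actions the provider allows it at all.
    community_rights: dict = field(default_factory=dict)

    def permits(self, community, action):
        return action in self.community_rights.get(community, set())


def is_allowed(user, community, resource, action, community_policy, resource_policy):
    """Grant access only when both the community and the provider agree."""
    return (community_policy.permits(user, resource, action)
            and resource_policy.permits(community, action))


# Example data, invented for illustration.
community_policy = CommunityPolicy(grants={"alice": {("dataset-1", "read")}})
resource_policy = ResourcePolicy(community_rights={"physics-vo": {"read"}})
```

Because the provider's check is applied last and independently, the provider keeps the final say even though the community maintains the per-user rules.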

665 citations

Journal Article•DOI•

Data management and transfer in high-performance computational grid environments

Bill Allcock¹, Joe Bester¹, John Bresnahan¹, Ann L. Chervenak², Ian Foster³, Carl Kesselman², Sam Meder¹, Veronika Nefedova¹, Darcy Quesnel¹, Steven Tuecke¹•Institutions (3)

Argonne National Laboratory¹, University of Southern California², University of Chicago³

01 May 2002

TL;DR: Two services are described: GridFTP, a high-speed transport service that extends the popular FTP protocol with new features required for Data Grid applications, such as striping and partial file access; and a replica management service that integrates a replica catalog with GridFTP transfers to provide for the creation, registration, location, and management of dataset replicas.

Abstract: An emerging class of data-intensive applications involve the geographically dispersed extraction of complex scientific information from very large collections of measured or computed data. Such applications arise, for example, in experimental physics, where the data in question is generated by accelerators, and in simulation science, where the data is generated by supercomputers. So-called Data Grids provide essential infrastructure for such applications, much as the Internet provides essential services for applications such as e-mail and the Web. We describe here two services that we believe are fundamental to any Data Grid: reliable, high-speed transport and replica management. Our high-speed transport service, GridFTP, extends the popular FTP protocol with new features required for Data Grid applications, such as striping and partial file access. Our replica management service integrates a replica catalog with GridFTP transfers to provide for the creation, registration, location, and management of dataset replicas. We present the design of both services and also preliminary performance results. Our implementations exploit security and other services provided by the Globus Toolkit.
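
Striping and partial file access both rest on addressing a file by byte ranges. The sketch below only illustrates that arithmetic: it splits a requested range into contiguous pieces, one per stripe. It is not the GridFTP wire protocol, and the function name `stripe_ranges` and the even-split policy are assumptions made for this example.

```python
def stripe_ranges(offset, length, stripes):
    """Split the byte range [offset, offset + length) into contiguous pieces.

    Returns a list of (start, end) pairs with `end` exclusive. Pieces differ
    in size by at most one byte; empty pieces are omitted.
    """
    if stripes < 1 or length < 0:
        raise ValueError("stripes must be >= 1 and length must be >= 0")
    base, extra = divmod(length, stripes)
    ranges = []
    start = offset
    for i in range(stripes):
        # The first `extra` stripes carry one additional byte each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue
        ranges.append((start, start + size))
        start += size
    return ranges
```

A partial read of 10 bytes starting at offset 100, spread over three stripes, would be requested as `stripe_ranges(100, 10, 3)`.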

633 citations

Proceedings Article•DOI•

Giggle: A Framework for Constructing Scalable Replica Location Services

Ann L. Chervenak¹, Ewa Deelman¹, Ian Foster², Leanne P. Guy³, Wolfgang Hoschek³, Adriana Iamnitchi⁴, Carl Kesselman¹, Peter Z. Kunszt³, Matei Ripeanu⁴, Bob Schwartzkopf¹, Heinz Stockinger³, Kurt Stockinger³, Brian Tierney⁵•Institutions (5)

University of Southern California¹, Argonne National Laboratory², CERN³, University of Chicago⁴, Lawrence Berkeley National Laboratory⁵

16 Nov 2002

TL;DR: A parameterized architectural framework is described, named Giggle (for GIGa-scale Global Location Engine), within which a wide range of RLSs can be defined, and initial performance results for an RLS prototype are presented, demonstrating that RLS systems can be constructed that meet performance goals.

Abstract: In wide area computing systems, it is often desirable to create remote read-only copies (replicas) of files. Replication can be used to reduce access latency, improve data locality, and/or increase robustness, scalability and performance for distributed applications. We define a replica location service (RLS) as a system that maintains and provides access to information about the physical locations of copies. An RLS typically functions as one component of a data grid architecture. This paper makes the following contributions. First, we characterize RLS requirements. Next, we describe a parameterized architectural framework, which we name Giggle (for GIGa-scale Global Location Engine), within which a wide range of RLSs can be defined. We define several concrete instantiations of this framework with different performance characteristics. Finally, we present initial performance results for an RLS prototype, demonstrating that RLS systems can be constructed that meet performance goals.
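
The abstract defines an RLS as a system that maintains and provides access to information about the physical locations of copies. A minimal in-memory sketch of that definition follows; it does not model the Giggle framework's parameters or any distribution across sites, and the class and method names are hypothetical.

```python
from collections import defaultdict


class ReplicaLocationService:
    """Toy mapping from a logical file name to its known physical copies."""

    def __init__(self):
        self._locations = defaultdict(set)

    def register(self, logical_name, physical_location):
        """Record that a copy of `logical_name` exists at `physical_location`."""
        self._locations[logical_name].add(physical_location)

    def unregister(self, logical_name, physical_location):
        """Forget one copy; unknown names or locations are ignored."""
        copies = self._locations.get(logical_name)
        if copies is None:
            return
        copies.discard(physical_location)
        if not copies:
            del self._locations[logical_name]

    def lookup(self, logical_name):
        """Return the known locations in sorted order (empty if none)."""
        return sorted(self._locations.get(logical_name, ()))
```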

440 citations

Book Chapter•DOI•

SNAP: A Protocol for Negotiating Service Level Agreements and Coordinating Resource Management in Distributed Systems

Karl Czajkowski¹, Ian Foster²,³, Carl Kesselman¹, Volker Sander⁴, Steven Tuecke³•Institutions (4)

University of Southern California¹, University of Chicago², Argonne National Laboratory³, Forschungszentrum Jülich⁴

24 Jul 2002

TL;DR: A resource management model is defined that distinguishes three kinds of resource-independent service level agreements (SLAs), formalizing agreements to deliver capability, perform activities, and bind activities to capabilities, respectively.

Abstract: A fundamental problem in distributed computing is to map activities such as computation or data transfer onto resources that meet requirements for performance, cost, security, or other quality of service metrics. The creation of such mappings requires negotiation among application and resources to discover, reserve, acquire, configure, and monitor resources. Current resource management approaches tend to specialize for specific resource classes, and address coordination across resources only in a limited fashion. We present a new approach that overcomes these difficulties. We define a resource management model that distinguishes three kinds of resource-independent service level agreements (SLAs), formalizing agreements to deliver capability, perform activities, and bind activities to capabilities, respectively. We also define a Service Negotiation and Acquisition Protocol (SNAP) that supports reliable management of remote SLAs. Finally, we explain how SNAP can be deployed within the context of the Globus Toolkit.
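
The three kinds of agreement named in the abstract can be pictured as three small record types, with the third referring to the other two. The sketch below is only an illustration of that relationship: the class names, fields, and the `binding_is_valid` helper are assumptions made for this example and are not the SNAP message formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilitySLA:
    """Agreement to deliver a capability (hypothetical fields)."""
    sla_id: str
    resource: str
    amount: int


@dataclass(frozen=True)
class ActivitySLA:
    """Agreement to perform an activity (hypothetical fields)."""
    sla_id: str
    description: str


@dataclass(frozen=True)
class BindingSLA:
    """Agreement to bind one activity to one capability."""
    sla_id: str
    activity_id: str
    capability_id: str


def binding_is_valid(binding, activities, capabilities):
    """A binding is meaningful only if both agreements it refers to exist.

    `activities` and `capabilities` are dicts keyed by SLA id.
    """
    return (binding.activity_id in activities
            and binding.capability_id in capabilities)
```

Keeping the binding separate from the other two agreements reflects the abstract's point that capability can be secured and an activity agreed to independently, and then associated later.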

426 citations

Grid Service Specification

Steven Tuecke, Karl Czajkowski, Ian Foster, Jeffrey Frey, Steve Graham, Carl Kesselman

01 Jan 2002

TL;DR: Technical details are presented, including a full specification of the behaviors and Web Services Description Language (WSDL) interfaces that define a Grid service.

Abstract: Building on both Grid and Web services technologies, the Open Grid Services Architecture (OGSA) defines mechanisms for creating, managing, and exchanging information among entities called Grid services. Succinctly, a Grid service is a Web service that conforms to a set of conventions (interfaces and behaviors) that define how a client interacts with a Grid service. These conventions, and other OGSA mechanisms associated with Grid service creation and discovery, provide for the controlled, fault resilient, and secure management of the distributed and often long-lived state that is commonly required in advanced distributed applications. In a separate document, we have presented in detail the motivation, requirements, structure, and applications that underlie OGSA. Here we focus on technical details, providing a full specification of the behaviors and Web Services Description Language (WSDL) interfaces that define a Grid service.

221 citations

Proceedings Article•

A Metadata Catalog Service for Data Intensive Applications

Ann L. Chervenak, Ewa Deelman, Carl Kesselman, Laurie Anne Pearlman, Gurmeet Singh

01 Jan 2002

TL;DR: The design of a Metadata Catalog Service (MCS) is presented that provides a mechanism for storing and accessing descriptive metadata and allows users to query for data items based on desired attributes; a scalability study of the service is also presented.

Abstract: Advances in computational, storage and network technologies, as well as middleware such as the Globus Toolkit, allow scientists to expand the sophistication and scope of data-intensive applications. These applications produce and analyze terabytes and petabytes of data that are distributed in millions of files or objects. To manage these large data sets efficiently, metadata or descriptive information about the data needs to be managed. There are various types of metadata, and it is likely that a range of metadata services will exist in Grid environments that are specialized for particular types of metadata cataloguing and discovery. In this paper, we present the design of a Metadata Catalog Service (MCS) that provides a mechanism for storing and accessing descriptive metadata and allows users to query for data items based on desired attributes. We describe our experience in using the MCS with several applications and present a scalability study of the service.
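
Querying for data items by desired attributes can be sketched as a filter over a name-to-metadata mapping. The catalog layout, the attribute names, and the `query` function below are invented for illustration; they do not describe the MCS schema or its interface.

```python
def query(catalog, **desired):
    """Return the names of items whose metadata matches every desired attribute.

    `catalog` maps an item name to a dict of descriptive attributes.
    Matching is by exact equality; results are sorted for determinism.
    """
    return sorted(
        name
        for name, attrs in catalog.items()
        if all(attrs.get(key) == value for key, value in desired.items())
    )


# Example catalog, invented for illustration.
catalog = {
    "run-001.dat": {"experiment": "demo", "year": 2001, "format": "raw"},
    "run-002.dat": {"experiment": "demo", "year": 2002, "format": "raw"},
    "run-002.sum": {"experiment": "demo", "year": 2002, "format": "summary"},
}
```

For example, `query(catalog, year=2002, format="raw")` selects only the raw file from 2002.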

177 citations

Proceedings Article•DOI•

GriPhyN and LIGO, building a virtual data Grid for gravitational wave scientists

Ewa Deelman¹, Carl Kesselman¹, Gaurang Mehta¹, L. Meshkat¹, Laura Pearlman¹, Kent Blackburn², P. Ehrens², Albert Lazzarini², Roy Williams², Scott Koranda³•Institutions (3)

University of Southern California¹, California Institute of Technology², University of Wisconsin–Milwaukee³

24 Jul 2002

TL;DR: The initial design and prototype of a virtual data Grid for LIGO, which is being built to observe the gravitational waves predicted by general relativity, is described.

Abstract: Many Physics experiments today generate large volumes of data. That data is then processed in a variety of ways in order to achieve the understanding of fundamental physical phenomena. The goal of the NSF-funded GriPhyN project (Grid Physics Network) is to enable scientists to seamlessly access data whether it is raw experimental data or a data product which is a result of further processing. GriPhyN provides a new degree of transparency in how data-handling and processing capabilities are integrated to deliver data products to end-users or applications, so that requests for such products are easily mapped into computation and/or data access at multiple locations. GriPhyN refers to the set of all data products available to the user as virtual data. Among the physics applications participating in the project is the Laser Interferometer Gravitational-wave Observatory (LIGO), which is being built to observe the gravitational waves predicted by general relativity. We describe our initial design and prototype of a virtual data Grid for LIGO.
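
The notion that a request for a data product is mapped into either data access or computation can be sketched as a catalog that returns a stored product when one exists and otherwise derives it from its inputs. This is a toy illustration under that assumption only; the class `VirtualDataCatalog` and its methods are hypothetical and do not describe the GriPhyN or LIGO prototype.

```python
class VirtualDataCatalog:
    """Serve a data product from storage if present, else compute it."""

    def __init__(self):
        self._materialized = {}   # product name -> stored value
        self._derivations = {}    # product name -> (function, input names)

    def add_data(self, name, value):
        """Register a product that already exists (for example, raw data)."""
        self._materialized[name] = value

    def add_derivation(self, name, func, inputs):
        """Register how `name` can be computed from other named products."""
        self._derivations[name] = (func, list(inputs))

    def request(self, name):
        """Return the product, deriving and caching it when necessary.

        Raises KeyError if the product is neither stored nor derivable.
        """
        if name in self._materialized:
            return self._materialized[name]
        func, inputs = self._derivations[name]
        value = func(*(self.request(i) for i in inputs))
        self._materialized[name] = value
        return value
```

The caller issues the same `request` whether the product is raw or derived, which is the transparency the abstract describes.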

150 citations

Proceedings Article•DOI•

Protocols and services for distributed data-intensive science

William Allcock, Ian Foster, Steven Tuecke, Ann L. Chervenak, Carl Kesselman

29 Jan 2002

TL;DR: Data transfer protocols based on FTP and replica catalog services, developed in the Globus project, leverage the substantial body of “Grid” services and protocols developed within the project and by its collaborators, and are being used in a number of data-intensive application projects.

Abstract: We describe work being performed in the Globus project to develop enabling protocols and services for distributed data-intensive science. These services include:
* High-performance, secure data transfer protocols based on FTP, plus a range of libraries and tools that use these protocols
* Replica catalog services supporting the creation and location of file replicas in distributed systems
These components leverage the substantial body of “Grid” services and protocols developed within the Globus project and by its collaborators, and are being used in a number of data-intensive application projects.

90 citations

Pegasus: Planning for Execution in Grids

Ewa Deelman¹, Jim Blythe, Yolanda Gil, Carl Kesselman•Institutions (1)

Information Sciences Institute¹

01 Jan 2002

TL;DR: Grid computing has made great progress in the last few years, but finding resources and scheduling computations and data movements by hand is time-consuming and potentially complex, so higher-level services are needed to automate the process and provide an adequate level of performance and reliability.

Abstract: Grid computing has made great progress in the last few years. The basic mechanisms for accessing remote resources have been developed as part of the Globus Toolkit and are now widely deployed and used. Among such mechanisms are:
* Information services, which allow for the discovery and monitoring of resources. The information provided can be used to find the available resources and select the resources which are the most appropriate for the task.
* Security services, which allow users and resources to mutually authenticate and allow the resources to authorize users based on local policies.
* Resource management, which allows for the scheduling of jobs on particular resources.
* Data management services, which enable users and applications to manage large, distributed and replicated data sets. Some of the available services deal with locating particular data sets, others with efficiently moving large amounts of data across wide area networks.
With the use of the above mechanisms, one can manually find out about the resources and schedule the desired computations and data movements. However, this process is time-consuming and can potentially be complex. As a result, it is becoming increasingly necessary to develop higher-level services which can automate the process and provide an adequate level of performance and reliability.

SNAP: A Protocol for Negotiation of Service Level Agreements and Coordinated Resource Management in Distributed Systems

Karl Czajkowski, Ian Foster, Carl Kesselman, Volker Sander, Steven Tuecke

01 Jan 2002

TL;DR: The Service Negotiation and Acquisition Protocol (SNAP) instantiates a generalized resource management model in which resource interactions are mapped onto a well-defined set of symmetric, resource-independent service level agreements.

Abstract: A fundamental problem with distributed applications is how to map activities such as computation or data transfer onto a set of resources that will meet the application’s requirement for performance, cost, security, or other quality of service metrics. An application or client must engage in a multi-phase negotiation process with resource managers, as it discovers, reserves, acquires, configures, monitors, and potentially renegotiates resource access. We present a generalized resource management model in which resource interactions are mapped onto a well-defined set of symmetric and resource-independent service level agreements. We instantiate this model in the Service Negotiation and Acquisition Protocol (SNAP), which provides integrated support for lifetime management and an at-most-once creation semantics for SLAs. The result is a resource management framework for distributed systems that we believe is more powerful and general than current approaches. We explain how SNAP can be deployed within the context of the Globus Toolkit.
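
At-most-once creation and lifetime management can be illustrated with a small store in which a retried creation request does not produce a second agreement and agreements disappear when their lifetime ends. The abstract does not say how SNAP achieves this; the sketch assumes a client-chosen identifier and an explicit expiry time, and the class `SLAManager` is hypothetical.

```python
class SLAManager:
    """Toy SLA store with idempotent creation and expiry (illustrative only)."""

    def __init__(self):
        self._slas = {}  # sla_id -> (terms, expires_at)

    def create(self, sla_id, terms, expires_at):
        """Create the SLA once; a retry with the same id changes nothing.

        Returns the stored (terms, expires_at) pair in either case.
        """
        if sla_id not in self._slas:
            self._slas[sla_id] = (terms, expires_at)
        return self._slas[sla_id]

    def extend(self, sla_id, new_expires_at):
        """Replace the expiry time of an existing SLA."""
        terms, _ = self._slas[sla_id]
        self._slas[sla_id] = (terms, new_expires_at)

    def expire(self, now):
        """Drop every SLA whose lifetime has ended; return the removed ids."""
        expired = sorted(i for i, (_, t) in self._slas.items() if t <= now)
        for sla_id in expired:
            del self._slas[sla_id]
        return expired
```

With this design a client that loses a reply can safely resend the same request, and an agreement that nobody renews is cleaned up without an explicit cancel message.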

Book Chapter•DOI•

The Grid, Grid Services and the Semantic Web: Technologies and Opportunities

Carl Kesselman¹•Institutions (1)

Information Sciences Institute¹

09 Jun 2002

TL;DR: This talk introduces the Grid concept, illustrates it with application examples from a range of scientific disciplines, and explores roles that Semantic Web technologies could play in Grid services, identifying those the author thinks offer the most potential.

Abstract: Grids are an emerging computational infrastructure that enables resource sharing and coordinated problem solving across dynamic, distributed collaborations that have come to be known as virtual organizations. Unlike the web, which primarily focuses on the sharing of information, the Grid provides a range of fundamental mechanisms for sharing diverse types of resource, such as computers, storage, data, software, and scientific instruments. In this talk, I will introduce the Grid concept and illustrate it with application examples from a range of scientific disciplines. It is likely that technology that is being developed for the Semantic Web will have important roles to play in Grid Services; I will explore some of these potential areas of Semantic Web technologies, identifying those that I think offer the most potential.

Book Chapter•DOI•

Die Anatomie des Grid

Ian Foster¹, Carl Kesselman¹, Steven Tuecke¹•Institutions (1)

Argonne National Laboratory¹

01 Jan 2002

TL;DR: We describe the requirements that we believe such mechanisms must satisfy, and we discuss the importance of defining a compact family of intergrid protocols that provide interoperability among different Grid systems.

Abstract: “Grid computing” has established itself as an important new field, distinguished from conventional “distributed computing” by its primary focus on shared access to very large resource pools that offer innovative applications and, in some cases, a high-performance orientation. In this article we define this new field. We first look at the “Grid problem,” which we define as flexible, secure, and coordinated access to shared resources among dynamic groups of individuals, institutions, and resources, which we refer to below as virtual organizations. In such scenarios we deal with issues such as unique authentication and authorization, resource access, resource discovery, and other challenges. It is precisely this class of problems that Grid technology addresses. Next, we present a scalable and open Grid architecture in which protocols, services, application programming interfaces, and software development kits are categorized according to their role in enabling resource sharing. We describe the requirements that we believe such mechanisms must satisfy, and we discuss the importance of defining a compact family of intergrid protocols that provide interoperability among different Grid systems. Finally, we describe how Grid technologies relate to other current technologies such as enterprise integration, application service provision, storage service provision, and peer-to-peer computing. We believe that Grid concepts and technologies not only complement these other approaches but can enhance them overall.

Journal Article•

Applications of Virtual Data in the LIGO experiment

Ewa Deelman, Carl Kesselman, Roy Williams, Kent Blackburn, Albert Lazzarini, Scott Koranda

01 Jan 2002-Lecture Notes in Computer Science

TL;DR: The Grid Physics Network (GriPhyN) is an NSF-funded project that aims to realize the concept of Virtual Data, which unifies the view of data whether it is raw or derived; the paper describes some of the challenges of building Virtual Data Grids for experiments such as LIGO.

Abstract: Many Physics experiments today generate large volumes of data. That data is then processed in many ways in order to achieve the understanding of fundamental physical phenomena. Virtual Data is a concept that unifies the view of the data whether it is raw or derived. It provides a new degree of transparency in how data-handling and processing capabilities are integrated to deliver data products to end-users or applications, so that requests for such products are easily mapped into computation and/or data access at multiple locations. GriPhyN (Grid Physics Network) is an NSF-funded project, which aims to realize the concepts of Virtual Data. Among the physics applications participating in the project is the Laser Interferometer Gravitational-wave Observatory (LIGO), which is being built to observe the gravitational waves predicted by general relativity. LIGO will produce large amounts of data, which are expected to reach hundreds of petabytes over the next decade. Large communities of scientists, distributed around the world, need to access parts of these datasets and perform efficient analysis on them. It is expected that the raw and processed data will be distributed among various national centers, university computing centers, and individual workstations. In this paper we describe some of the challenges associated with building Virtual Data Grids for experiments such as LIGO.
