A directory service for configuring high-performance distributed computations


The submitted manuscript has been created by the University of Chicago as Operator of Argonne National Laboratory ("Argonne") under Contract No. W-31-109-ENG-38 with the U.S. Department of Energy. The U.S. Government retains for itself, and others acting on its behalf, a paid-up, nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government.
A Directory Service for Configuring High-Performance Distributed Computations

Steven Fitzgerald,¹ Ian Foster,² Carl Kesselman,¹ Gregor von Laszewski,² Warren Smith,² Steven Tuecke²

¹ Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292
² Mathematics and Computer Science, Argonne National Laboratory, Argonne, IL 60439

http://www.globus.org/
Abstract

High-performance execution in distributed computing environments often requires careful selection and configuration not only of computers, networks, and other resources but also of the protocols and algorithms used by applications. Selection and configuration in turn require access to accurate, up-to-date information on the structure and state of available resources. Unfortunately, no standard mechanism exists for organizing or accessing such information. Consequently, different tools and applications adopt ad hoc mechanisms, or they compromise their portability and performance by using default configurations. We propose a Metacomputing Directory Service that provides efficient and scalable access to diverse, dynamic, and distributed information about resource structure and state. We define an extensible data model to represent required information and present a scalable, high-performance, distributed implementation. The data representation and application programming interface are adopted from the Lightweight Directory Access Protocol; the data model and implementation are new. We use the Globus distributed computing toolkit to illustrate how this directory service enables the development of more flexible and efficient distributed computing services and applications.
1 Introduction

High-performance distributed computing often requires careful selection and configuration of computers, networks, application protocols, and algorithms. These requirements do not arise in traditional distributed computing, where configuration problems can typically be avoided by the use of standard default protocols, interfaces, and so on. The situation is also quite different in traditional high-performance computing, where systems are usually homogeneous and hence can be configured manually. But in high-performance distributed computing, neither defaults nor manual configuration is acceptable. Defaults often do not result in acceptable performance, and manual configuration requires low-level knowledge of remote systems that an average programmer does not possess. We need an information-rich approach to configuration in which decisions are made (whether at compile-time, link-time, or run-time [19]) based upon information about the structure and state of the system on which a program is to run.
An example from the I-WAY networking experiment illustrates some of the difficulties associated with the configuration of high-performance distributed systems. The I-WAY was composed of massively parallel computers, workstations, archival storage systems, and visualization devices [6]. These resources were interconnected by both the Internet and a dedicated 155 Mb/sec IP-over-ATM network. In this environment, applications might run on a single or multiple parallel computers, of the same or different types. An optimal communication configuration for a particular situation might use vendor-optimized communication protocols within a computer but TCP/IP between computers over an ATM network (if available). A significant amount of information must be available to select such configurations, for example:

- What are the network interfaces (i.e., IP addresses) for the ATM network and the Internet?
- What is the raw bandwidth of the ATM network and the Internet, and which …



- Is the ATM network currently available?
- Between which pairs of nodes can we use vendor protocols to access fast internal networks?
- Between which pairs of nodes must we use TCP/IP?
Additional information is required if we use a resource location service to select an "optimal" set of resources from among the machines available on the I-WAY at a given time.

In our experience, such configuration decisions are not difficult if the right information is available. Until now, however, this information has not been easily available, and this lack of access has hindered application optimization. Furthermore, making this information available in a useful fashion is a nontrivial problem: the information required to configure high-performance distributed systems is diverse in scope, dynamic in value, distributed across the network, and detailed in nature.
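The kind of configuration decision described above can be sketched as a simple selection function over directory-supplied attributes. The attribute names below are hypothetical placeholders, not part of any actual I-WAY or Globus interface:

```python
# Illustrative sketch (not from the paper): choose a communication
# method for a pair of nodes from directory-style attributes.
# All attribute names here are hypothetical.

def choose_protocol(pair_info):
    """pair_info: dict of attributes describing a node pair."""
    if pair_info.get("same_host") and pair_info.get("vendor_protocol"):
        # Nodes share a fast internal network: prefer the
        # vendor-optimized protocol.
        return pair_info["vendor_protocol"]
    if pair_info.get("atm_available"):
        # Dedicated ATM link between machines: TCP/IP over ATM.
        return "tcp/ip-over-atm"
    # Fall back to TCP/IP over the shared Internet.
    return "tcp/ip"

print(choose_protocol({"same_host": True, "vendor_protocol": "mpl"}))
print(choose_protocol({"same_host": False, "atm_available": True}))
```

The point is not the selection logic, which is trivial, but that every branch depends on information (interface lists, link availability, protocol support) that must come from somewhere: this is the gap the directory service fills.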
In this article, we propose an approach to the design of high-performance distributed systems that addresses this need for efficient and scalable access to diverse, dynamic, and distributed information about the structure and state of resources. The core of this approach is the definition and implementation of a Metacomputing Directory Service (MDS) that provides a uniform interface to diverse information sources. We show how a simple data representation and application programming interface (API) based on the Lightweight Directory Access Protocol (LDAP) meet requirements for uniformity, extensibility, and distributed maintenance. We introduce a data model suitable for distributed computing applications and show how this model is able to represent computers and networks of interest. We also present novel implementation techniques for this service that address the unique requirements of high-performance applications. Finally, we use examples from the Globus distributed computing toolkit [9] to show how MDS data can be used to guide configuration decisions in realistic settings. We expect these techniques to be equally useful in other systems that support computing in distributed environments, such as Legion [12], NEOS [5], NetSolve [4], Condor [16], Nimrod [1], PRM [18], AppLeS [2], and heterogeneous implementations of MPI [13].
The principal contributions of this article are

- a new architecture for high-performance distributed computing systems, based upon an information service called the Metacomputing Directory Service;
- a design for this directory service, addressing issues of data representation, data model, and implementation;
- a data model able to represent the network structures commonly used by distributed computing systems, including various types of supercomputers; and
- a demonstration of the use of the information provided by MDS to guide resource and communication configuration within a distributed computing toolkit.
The rest of this article is organized as follows. In Section 2, we explain the requirements that a distributed computing information infrastructure must satisfy, and we propose MDS in response to these requirements. We then describe the representation (Section 3), the data model (Section 4), and the implementation (Section 5) of MDS. In Section 6, we demonstrate how MDS information is used within Globus. We conclude in Section 7 with suggestions for future research efforts.
2 Designing a Metacomputing Directory Service

The problem of organizing and providing access to information is a familiar one in computer science, and there are many potential approaches to the problem, ranging from database systems to the Simple Network Management Protocol (SNMP). The appropriate solution depends on the ways in which the information is produced, maintained, accessed, and used.
2.1 Requirements

Following are the requirements that shaped our design of an information infrastructure for distributed computing applications. Some of these requirements can be expressed in quantitative terms (e.g., scalability, performance); others are more subjective (e.g., expressiveness, deployability).

Performance. The applications of interest to us frequently operate on a large scale (e.g., hundreds of processors) and have demanding performance requirements. Hence, an information infrastructure must permit rapid access to frequently used configuration information. It is not acceptable to contact a server for every item: caching is required.

Scalability and cost. The infrastructure must scale to large numbers of components and permit concurrent access by many entities. At the same time, its organization must permit easy discovery of information. The human and resource costs (CPU cycles, disk space, network bandwidth) of creating and maintaining information must also be low, both at individual sites and in total.

Uniformity. Our goal is to simplify the development of tools and applications that use data to guide configuration decisions. We require a uniform data model as well as an application programming interface (API) for common operations on the data represented via that model. One aspect of this uniformity is a standard representation for data about common resources, such as processors and networks.

Expressiveness. We require a data model rich enough to represent relevant structure within distributed computing systems. A particular challenge is representing characteristics that span organizations, for example network bandwidth between sites.

Extensibility. Any data model that we define will be incomplete. Hence, the ability to incorporate additional information is important. For example, an application can use this facility to record specific information about its behavior (observed bandwidth, memory requirements) for use in subsequent runs.

Multiple information sources. The information that we require may be generated by many different sources. Consequently, an information infrastructure must integrate information from multiple sources.

Dynamic data. Some of the data required by applications is highly dynamic: for example, network availability or load. An information infrastructure must be able to make this data available in a timely fashion.

Flexible access. We require the ability to both read and update data contained within the information infrastructure. Some form of search capability is also required, to assist in locating stored data.

Security. It is important to control who is allowed to update configuration data. Some sites will also want to control access.

Deployability. An information infrastructure is useful only if it is broadly deployed. In the current case, we require techniques that can be installed and maintained easily at many sites.

Decentralized maintenance. It must be possible to delegate the task of creating and maintaining information about resources to the sites at which resources are located. This delegation is important for both scalability and security reasons.
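The Performance and Dynamic data requirements above pull in opposite directions: caching avoids contacting a server for every item, but cached values must expire quickly enough to track dynamic state such as load. One common way to balance the two is a time-to-live (TTL) cache in front of the lookup, with long TTLs for static attributes and short TTLs for dynamic ones. The sketch below is illustrative only; its lookup function and class are not part of MDS:

```python
import time

# Illustrative sketch: a time-to-live (TTL) cache in front of a
# directory lookup. The lookup function and TTL values are
# hypothetical, not MDS's actual caching mechanism.

class TTLCache:
    def __init__(self, lookup, ttl_seconds, clock=time.monotonic):
        self._lookup = lookup     # function: key -> value (e.g., a server query)
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}        # key -> (value, expiry_time)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]         # fresh cached value: no server contact
        value = self._lookup(key) # stale or missing: query the source
        self._entries[key] = (value, now + self._ttl)
        return value

# Usage: static data (CPU count) can use a long TTL; dynamic data
# (current load) a short one.
calls = []
cache = TTLCache(lambda key: calls.append(key) or len(calls), ttl_seconds=60)
cache.get("cpu-count")   # first access queries the source
cache.get("cpu-count")   # second access is served from the cache
```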
2.2 Approaches

It is instructive to review, with respect to these requirements, the various (incomplete) approaches to information infrastructure that have been used by distributed computing systems.

Operating system commands such as uname and sysinfo can provide important information about a particular machine but do not support remote access. SNMP [21] and the Network Information Service (NIS) both permit remote access but are defined within the context of the IP protocol suite, which can add significant overhead to a high-performance computing environment. Furthermore, SNMP does not define an API, thus preventing its use as a component within other software architectures.
High-performance computing systems such as PVM [11], p4 [3], and MPICH [13] provide rapid access to configuration data by placing this data (e.g., machine names, network interfaces) into files maintained by the programmer, called "hostfiles." However, lack of support for remote access means that hostfiles must be replicated at each host, complicating maintenance and dynamic update.
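As a concrete illustration, a hostfile binds static configuration into per-host text. The format and hostnames below are a simplified, hypothetical example, not the exact PVM or MPICH syntax:

```python
# Illustrative sketch: parsing a simplified hostfile of the kind
# PVM/MPICH-era systems used. The format and hostnames here are
# hypothetical; real hostfile syntaxes differ between systems.

HOSTFILE = """\
# host             network-interface
dark.example.org   10.0.0.1
flash.example.org  10.0.0.2
"""

def parse_hostfile(text):
    """Return {hostname: interface} from a simplified hostfile."""
    hosts = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if not line:
            continue
        name, interface = line.split()
        hosts[name] = interface
    return hosts

print(parse_hostfile(HOSTFILE))
```

Because every machine needs its own copy of this file, any change (a new node, a changed interface) must be propagated to all replicas by hand. This is precisely the maintenance and dynamic-update problem that a shared directory service removes.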
The Domain Name Service (DNS) provides a highly distributed, scalable service for resolving Internet addresses to values (e.g., IP addresses) but is not, in general, extensible. Furthermore, its update strategies are designed to support values that change relatively rarely.
The X.500 standard [14, 20] defines a directory service that can be used to provide extensible distributed directory services within a wide area environment. A directory service is a service that provides read-optimized access to general data about entities, such as people, corporations, and computers. X.500 provides a framework that could, in principle, be used to organize the information that is of interest to us. However, it is complex and requires ISO protocols and the heavyweight ASN.1 encodings of data. For these and other reasons, it is not widely used.
The Lightweight Directory Access Protocol [24] is a streamlined version of the X.500 directory service. It removes the requirement for an ISO protocol stack, defining a standard wire protocol based on the IP protocol suite. It also simplifies the data encoding and command set of X.500 and defines a standard API for directory access [15]. LDAP is seeing wide-scale deployment as the directory service of choice for the World Wide Web. Disadvantages include its only moderate performance (see Section 5), limited access to external data sources, and rigid approach to distributing data across servers.
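LDAP-style directories name each entry with a distinguished name (DN): an ordered list of attribute=value components running from the entry up to the root. The toy in-memory directory below shows how DN components induce the tree that a subtree search walks; the organization names and attributes are invented for illustration, not the MDS schema:

```python
# Illustrative sketch: how LDAP-style distinguished names (DNs)
# induce a hierarchical namespace. The names and attributes below
# are invented examples, not the actual MDS data model.

def dn_components(dn):
    """Split 'ou=MCS, o=Argonne, c=US' into root-first components."""
    parts = [p.strip() for p in dn.split(",")]
    return list(reversed(parts))            # root (c=US) first

class Directory:
    def __init__(self):
        self._entries = {}                  # DN tuple -> attribute dict

    def add(self, dn, attributes):
        self._entries[tuple(dn_components(dn))] = attributes

    def search(self, base_dn, attr, value):
        """Find DNs under base_dn whose attr equals value (subtree search)."""
        base = tuple(dn_components(base_dn))
        return [key for key, attrs in self._entries.items()
                if key[:len(base)] == base and attrs.get(attr) == value]

d = Directory()
d.add("hn=node1.example.org, ou=MCS, o=Argonne, c=US", {"type": "computer"})
d.add("nn=atm-net, ou=MCS, o=Argonne, c=US", {"type": "network"})
print(d.search("o=Argonne, c=US", "type", "network"))
```

Because a DN prefix identifies an entire subtree, responsibility for a subtree can be delegated to the site that owns it, which is how a directory service meets the decentralized-maintenance requirement of Section 2.1.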
Reviewing these various systems, we see that each is in some way incomplete, failing to address the types of information needed to build high-performance distributed computing systems, being too slow, or not defining an API to enable uniform access to the service. For these reasons, we have defined our own metacomputing information infrastructure that integrates existing systems while providing a uniform and extensible data model, support for multiple information service providers, and a uniform API.
2.3 A Metacomputing Directory Service

Our analysis of requirements and existing systems leads us to define what we call the Metacomputing Directory Service …

Citations

- Globus: a Metacomputing Infrastructure Toolkit
- Grid information services for distributed resource sharing
- The network weather service: a distributed resource performance forecasting service for metacomputing
- The data grid
- A taxonomy and survey of grid resource management systems for distributed computing
References

- Globus: a Metacomputing Infrastructure Toolkit
- Condor: a hunter of idle workstations
- A high-performance, portable implementation of the MPI message passing interface standard
- PVM: Parallel Virtual Machine: a users' guide and tutorial for networked parallel computing