
Showing papers on "Scalability" published in 2001


Proceedings ArticleDOI
16 Jul 2001
TL;DR: A novel approach to the localization of sensors in an ad-hoc network that enables sensor nodes to discover their locations using a set of distributed iterative algorithms is described.
Abstract: The recent advances in radio and embedded system technologies have enabled the proliferation of wireless microsensor networks. Such wirelessly connected sensors are released in many diverse environments to perform various monitoring tasks. In many such tasks, location awareness is inherently one of the most essential system parameters. It is not only needed to report the origins of events, but also to assist group querying of sensors, routing, and to answer questions on the network coverage. In this paper we present a novel approach to the localization of sensors in an ad-hoc network. We describe a system called AHLoS (Ad-Hoc Localization System) that enables sensor nodes to discover their locations using a set of distributed iterative algorithms. The operation of AHLoS is demonstrated with an accuracy of a few centimeters using our prototype testbed while scalability and performance are studied through simulation.
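
A minimal sketch of the atomic multilateration step that such iterative localization schemes build on (an illustration under assumed details, not the AHLoS implementation): a node with range estimates to several already-localized beacons solves a small least-squares problem for its own position.

```python
# A minimal sketch (not the AHLoS implementation) of the atomic
# multilateration step iterative localization schemes build on:
# a node with range estimates to several already-localized beacons
# solves a least-squares problem for its own (x, y) position.
import numpy as np

def multilaterate(beacons, ranges):
    """Estimate (x, y) from beacon positions and measured distances.

    beacons: list of (x, y) tuples of nodes with known positions
    ranges:  list of measured distances to those beacons
    """
    beacons = np.asarray(beacons, dtype=float)
    ranges = np.asarray(ranges, dtype=float)
    # Linearize by subtracting the last beacon's circle equation from the others.
    x_n, y_n = beacons[-1]
    A = 2 * (beacons[:-1] - beacons[-1])
    b = (ranges[-1] ** 2 - ranges[:-1] ** 2
         + np.sum(beacons[:-1] ** 2, axis=1) - (x_n ** 2 + y_n ** 2))
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pos  # estimated (x, y)

# Example: a node roughly 7.07 m from three beacons placed around it.
print(multilaterate([(0, 0), (10, 0), (0, 10)], [7.07, 7.07, 7.07]))  # ~[5, 5]
```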

2,931 citations


Journal ArticleDOI
TL;DR: SPADE is a new algorithm for fast discovery of Sequential Patterns that utilizes combinatorial properties to decompose the original problem into smaller sub-problems that can be independently solved in main memory using efficient lattice search techniques and simple join operations.
Abstract: In this paper we present SPADE, a new algorithm for fast discovery of Sequential Patterns. The existing solutions to this problem make repeated database scans, and use complex hash structures which have poor locality. SPADE utilizes combinatorial properties to decompose the original problem into smaller sub-problems that can be independently solved in main memory using efficient lattice search techniques and simple join operations. All sequences are discovered in only three database scans. Experiments show that SPADE outperforms the best previous algorithm by a factor of two, and by an order of magnitude with some pre-processed data. It also has linear scalability with respect to the number of input-sequences, and a number of other database parameters. Finally, we discuss how the results of sequence mining can be applied in a real application domain.
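
A toy sketch of the vertical-format idea at the core of SPADE (an assumed simplification, not the published algorithm): items are kept as id-lists of (sequence, time) pairs built in one scan, and longer patterns are obtained by joining id-lists rather than re-scanning the database.

```python
# Toy sketch of id-list joins (assumed simplification, not SPADE itself):
# each item has a vertical id-list of (sequence_id, event_time) pairs, and
# 2-sequences are found by temporal joins over those lists.
def temporal_join(idlist_a, idlist_b):
    """Id-list for the 2-sequence 'a followed by b'."""
    out = []
    for sid, t_a in idlist_a:
        for sid2, t_b in idlist_b:
            if sid == sid2 and t_b > t_a:   # b occurs after a in the same sequence
                out.append((sid, t_b))
                break                       # one occurrence per sequence is enough for support
    return out

def frequent_2_sequences(idlists, min_support):
    """idlists: {item: [(sid, time), ...]} built in one database scan."""
    freq = {}
    for a, la in idlists.items():
        for b, lb in idlists.items():
            joined = temporal_join(la, lb)
            support = len({sid for sid, _ in joined})   # distinct sequences containing a -> b
            if support >= min_support:
                freq[(a, b)] = support
    return freq

idlists = {
    "A": [(1, 1), (2, 1), (3, 2)],
    "B": [(1, 3), (2, 2)],
}
print(frequent_2_sequences(idlists, min_support=2))   # {('A', 'B'): 2}
```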

2,063 citations


Proceedings ArticleDOI
21 Oct 2001
TL;DR: The Cooperative File System is a new peer-to-peer read-only storage system that provides provable guarantees for the efficiency, robustness, and load-balance of file storage and retrieval with a completely decentralized architecture that can scale to large systems.
Abstract: The Cooperative File System (CFS) is a new peer-to-peer read-only storage system that provides provable guarantees for the efficiency, robustness, and load-balance of file storage and retrieval. CFS does this with a completely decentralized architecture that can scale to large systems. CFS servers provide a distributed hash table (DHash) for block storage. CFS clients interpret DHash blocks as a file system. DHash distributes and caches blocks at a fine granularity to achieve load balance, uses replication for robustness, and decreases latency with server selection. DHash finds blocks using the Chord location protocol, which operates in time logarithmic in the number of servers. CFS is implemented using the SFS file system toolkit and runs on Linux, OpenBSD, and FreeBSD. Experience on a globally deployed prototype shows that CFS delivers data to clients as fast as FTP. Controlled tests show that CFS is scalable: with 4,096 servers, looking up a block of data involves contacting only seven servers. The tests also demonstrate nearly perfect robustness and unimpaired performance even when as many as half the servers fail.
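
A minimal sketch of the consistent-hashing idea behind block placement in DHash/Chord (an illustration, not the CFS code): a block's key is hashed onto a circular identifier space and stored at the first server whose ID follows it on the ring; Chord's finger tables make the lookup take O(log N) hops, while the sketch below just does the ring arithmetic directly.

```python
# Consistent-hashing sketch (illustration only, not the CFS/Chord code):
# hash blocks and servers onto one identifier ring and place each block
# on its clockwise successor server.
import hashlib
from bisect import bisect_right

RING_BITS = 32

def ring_id(name: str) -> int:
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** RING_BITS)

class Ring:
    def __init__(self, servers):
        self.ids = sorted(ring_id(s) for s in servers)
        self.by_id = {ring_id(s): s for s in servers}

    def successor(self, key: int) -> str:
        """First server clockwise from key (wrapping around the ring)."""
        i = bisect_right(self.ids, key)
        return self.by_id[self.ids[i % len(self.ids)]]

ring = Ring([f"server-{n}" for n in range(8)])
block_key = ring_id("block:README:0")
print("block", hex(block_key), "is stored on", ring.successor(block_key))
```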

1,733 citations


Proceedings ArticleDOI
07 Aug 2001
TL;DR: This work presents an information services architecture that addresses performance, security, scalability, and robustness requirements of Grid software infrastructure and has been implemented as MDS-2, which forms part of the Globus Grid toolkit and has been widely deployed and applied.
Abstract: Grid technologies enable large-scale sharing of resources within formal or informal consortia of individuals and/or institutions: what are sometimes called virtual organizations. In these settings, the discovery, characterization, and monitoring of resources, services, and computations are challenging problems due to the considerable diversity, large numbers, dynamic behavior, and geographical distribution of the entities in which a user might be interested. Consequently, information services are a vital part of any Grid software infrastructure, providing fundamental mechanisms for discovery and monitoring, and hence for planning and adapting application behavior. We present an information services architecture that addresses performance, security, scalability, and robustness requirements. Our architecture defines simple low-level enquiry and registration protocols that make it easy to incorporate individual entities into various information structures, such as aggregate directories that support a variety of different query languages and discovery strategies. These protocols can also be combined with other Grid protocols to construct additional higher-level services and capabilities such as brokering, monitoring, fault detection, and troubleshooting. Our architecture has been implemented as MDS-2, which forms part of the Globus Grid toolkit and has been widely deployed and applied.

1,707 citations


Journal ArticleDOI
TL;DR: SIENA, an event notification service that is designed and implemented to exhibit both expressiveness and scalability, is presented, and the service's interface to applications, the algorithms used by networks of servers to select and deliver event notifications, and the strategies used to optimize performance are described.
Abstract: The components of a loosely coupled system are typically designed to operate by generating and responding to asynchronous events. An event notification service is an application-independent infrastructure that supports the construction of event-based systems, whereby generators of events publish event notifications to the infrastructure and consumers of events subscribe with the infrastructure to receive relevant notifications. The two primary services that should be provided to components by the infrastructure are notification selection (i.e., determining which notifications match which subscriptions) and notification delivery (i.e., routing matching notifications from publishers to subscribers). Numerous event notification services have been developed for local-area networks, generally based on a centralized server to select and deliver event notifications. Therefore, they suffer from an inherent inability to scale to wide-area networks, such as the Internet, where the number and physical distribution of the service's clients can quickly overwhelm a centralized solution. The critical challenge in the setting of a wide-area network is to maximize the expressiveness in the selection mechanism without sacrificing scalability in the delivery mechanism. This paper presents SIENA, an event notification service that we have designed and implemented to exhibit both expressiveness and scalability. We describe the service's interface to applications, the algorithms used by networks of servers to select and deliver event notifications, and the strategies used to optimize performance. We also present results of simulation studies that examine the scalability and performance of the service.
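
A hedged sketch of the content-based matching a publish/subscribe service of this kind performs (an illustration of the model, not SIENA's routing code): a notification is a set of attributes, a subscription is a conjunction of per-attribute constraints, and a notification is delivered to every subscriber whose constraints all hold.

```python
# Content-based matching sketch (illustrative model, not SIENA's implementation):
# deliver a notification to every subscription whose constraints all hold.
import operator

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt, ">=": operator.ge}

def matches(notification: dict, subscription: list) -> bool:
    """subscription: list of (attribute, op, value) constraints, all must hold."""
    return all(
        attr in notification and OPS[op](notification[attr], value)
        for attr, op, value in subscription
    )

subscriptions = {
    "alice": [("type", "=", "stock"), ("symbol", "=", "ACME"), ("price", ">", 100)],
    "bob":   [("type", "=", "stock"), ("price", "<", 50)],
}
event = {"type": "stock", "symbol": "ACME", "price": 105.3}
recipients = [who for who, sub in subscriptions.items() if matches(event, sub)]
print(recipients)   # ['alice']
```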

1,568 citations


Patent
09 Apr 2001
TL;DR: In this article, the authors present a system and apparatus for efficient and reliable, control and distribution of data files or portions of files, applications, or other data objects in large-scale distributed networks.
Abstract: The present invention provides a system and apparatus for efficient and reliable, control and distribution of data files or portions of files, applications, or other data objects in large-scale distributed networks. A unique content-management front-end provides efficient controls for triggering distribution of digitized data content to selected groups of a large number of remote computer servers. Transport-layer protocols interact with distribution controllers to automatically determine an optimized tree-like distribution sequence to group leaders selected by network devices at each remote site. Reliable store-and-forward transfer to clusters is accomplished using a unicast protocol in the ordered tree sequence. Once command messages and content arrive at all participating group leaders, local hybrid multicast protocols efficiently and reliably distribute them to the back-end nodes for interpretation and execution. Positive acknowledgement is then sent back to the content manager from each group leader, and the updated content in each remote device autonomously goes 'live' when the content change is locally completed.

1,261 citations


Journal ArticleDOI
TL;DR: An overview is provided of the FGS video coding technique in this Amendment of MPEG-4, which addresses a variety of challenging problems in delivering video over the Internet.
Abstract: The streaming video profile is the subject of an Amendment of MPEG-4, and is developed in response to the growing need for a video coding standard for streaming video over the Internet. It provides the capability to distribute single-layered frame-based video over a wide range of bit rates with high coding efficiency. It also provides fine granularity scalability (FGS), and its combination with temporal scalability, to address a variety of challenging problems in delivering video over the Internet. This paper provides an overview of the FGS video coding technique in this Amendment of MPEG-4.
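
A simplified sketch of the bit-plane idea behind fine granularity scalability (an illustration under assumed details, not the MPEG-4 FGS syntax): enhancement-layer residuals are sent bit-plane by bit-plane from most to least significant, so the stream can be truncated at any point and the decoder still reconstructs a coarser residual.

```python
# Bit-plane truncation sketch (assumed simplification, not the FGS bitstream
# syntax): encode non-negative residual magnitudes MSB-plane first, decode
# from however many planes survived truncation.
def encode_bitplanes(residuals, num_planes=8):
    """Yield bit planes (MSB first) of non-negative residual magnitudes."""
    for plane in range(num_planes - 1, -1, -1):
        yield [(abs(r) >> plane) & 1 for r in residuals]

def decode_bitplanes(planes_received, num_planes=8):
    """Rebuild magnitudes from however many planes arrived before truncation."""
    values = [0] * len(planes_received[0])
    for i, plane_bits in enumerate(planes_received):
        shift = num_planes - 1 - i
        values = [v | (bit << shift) for v, bit in zip(values, plane_bits)]
    return values

residuals = [37, 5, 18, 0]
planes = list(encode_bitplanes(residuals))
print(decode_bitplanes(planes[:4]))   # only 4 of 8 planes received -> [32, 0, 16, 0]
print(decode_bitplanes(planes))       # all planes received -> [37, 5, 18, 0]
```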

1,023 citations


Proceedings ArticleDOI
27 Aug 2001
TL;DR: A 'crawler' is built to extract the topology of Gnutella's application-level network; the topology graph is analyzed, and although Gnutella is not a pure power-law network, its current configuration has the benefits and drawbacks of a power-law structure.
Abstract: Despite recent excitement generated by the P2P paradigm and despite surprisingly fast deployment of some P2P applications, there are few quantitative evaluations of P2P system behavior. Due to its open architecture and achieved scale, Gnutella is an interesting P2P architecture case study. Gnutella, like most other P2P applications, builds at the application level a virtual network with its own routing mechanisms. The topology of this overlay network and the routing mechanisms used have a significant influence on application properties such as performance, reliability, and scalability. We built a 'crawler' to extract the topology of Gnutella's application-level network; we analyze the topology graph and evaluate the generated network traffic. We find that although Gnutella is not a pure power-law network, its current configuration has the benefits and drawbacks of a power-law structure. These findings lead us to propose changes to the Gnutella protocol and implementations that bring significant performance and scalability improvements.
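
A small sketch of the kind of topology analysis the paper describes (an assumed simplification): given a crawled snapshot of the overlay as an adjacency list, compute the node-degree distribution, which for a power-law graph falls off roughly as degree**(-alpha) on a log-log plot.

```python
# Degree-distribution sketch (assumed simplification of the paper's analysis):
# summarize a crawled overlay snapshot by the fraction of nodes at each degree.
from collections import Counter

def degree_distribution(adjacency):
    """adjacency: {node: set of neighbour nodes} from a crawl snapshot."""
    degrees = Counter(len(neigh) for neigh in adjacency.values())
    total = sum(degrees.values())
    return {deg: count / total for deg, count in sorted(degrees.items())}

snapshot = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "b"},
}
print(degree_distribution(snapshot))   # fraction of nodes having each degree
```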

824 citations


Proceedings ArticleDOI
20 May 2001
TL;DR: The design of PAST, a large-scale, Internet-based, global storage utility that provides scalability, high availability, persistence and security, is sketched, including the use of randomization to ensure diversity in the set of nodes that store a file's replicas.
Abstract: This paper sketches the design of PAST, a large-scale, Internet-based, global storage utility that provides scalability, high availability, persistence and security. PAST is a peer-to-peer Internet application and is entirely self-organizing. PAST nodes serve as access points for clients, participate in the routing of client requests, and contribute storage to the system. Nodes are not trusted; they may join the system at any time and may silently leave the system without warning. Yet, the system is able to provide strong assurances, efficient storage access, load balancing and scalability. Among the most interesting aspects of PAST's design are (1) the Pastry location and routing scheme, which reliably and efficiently routes client requests among the PAST nodes, has good network locality properties and automatically resolves node failures and node additions; (2) the use of randomization to ensure diversity in the set of nodes that store a file's replicas and to provide load balancing; and (3) the optional use of smartcards, which are held by each PAST user and issued by a third party called a broker. The smartcards support a quota system that balances supply and demand of storage in the system.

702 citations


Book ChapterDOI
05 Sep 2001
TL;DR: This paper introduces P-Grid, a scalable access structure that is specifically designed for Peer-To-Peer information systems; P-Grids provide reliable data access even with unreliable peers and scale gracefully in both storage and communication cost.
Abstract: Peer-To-Peer systems are driving a major paradigm shift in the era of genuinely distributed computing. Gnutella is a good example of a Peer-To-Peer success story: a rather simple piece of software enables Internet users to freely exchange files, such as MP3 music files. But it also exposes some of the limitations of current P2P information systems with respect to their ability to manage data efficiently. In this paper we introduce P-Grid, a scalable access structure that is specifically designed for Peer-To-Peer information systems. P-Grids are constructed and maintained by using randomized algorithms strictly based on local interactions, provide reliable data access even with unreliable peers, and scale gracefully both in storage and communication cost.

490 citations


01 Jan 2001
TL;DR: P-Grid, as mentioned in this paper, is a scalable access structure specifically designed for P2P information systems; it is constructed and maintained using randomized algorithms strictly based on local interactions, provides reliable data access even with unreliable peers, and scales gracefully in both storage and communication cost.
Abstract: Peer-To-Peer systems are driving a major paradigm shift in the era of genuinely distributed computing. Gnutella is a good example of a Peer-To-Peer success story: a rather simple piece of software enables Internet users to freely exchange files, such as MP3 music files. But it also exposes some of the limitations of current P2P information systems with respect to their ability to manage data efficiently. In this paper we introduce P-Grid, a scalable access structure that is specifically designed for Peer-To-Peer information systems. P-Grids are constructed and maintained by using randomized algorithms strictly based on local interactions, provide reliable data access even with unreliable peers, and scale gracefully both in storage and communication cost. Keywords: Peer-To-Peer computing, Distributed Indexing, Distributed Databases, Randomized Algorithms.

Proceedings ArticleDOI
15 Oct 2001
TL;DR: This work presents a novel backoff-based cost field setup algorithm that finds the optimal costs of all nodes to the sink with a single message overhead at each node in a large sensor network.
Abstract: Wireless sensor networks offer a wide range of challenges to networking research, including unconstrained network scale, limited computing, memory and energy resources, and wireless channel errors. We study the problem of delivering messages from any sensor to an interested client user along the minimum-cost path in a large sensor network. We propose a new cost-field-based approach to minimum cost forwarding. In the design, we present a novel backoff-based cost field setup algorithm that finds the optimal costs of all nodes to the sink with a single message overhead at each node. Once the field is established, the message, carrying dynamic cost information, flows along the minimum cost path in the cost field. Each intermediate node forwards the message only if it finds itself to be on the optimal path, based on dynamic cost states. Our design does not require an intermediate node to maintain explicit "forwarding path" states. It requires a few simple operations and scales to any network size. We show the correctness and effectiveness of the design by both simulations and analysis.
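
A hedged sketch of the backoff-based cost-field setup (an assumed simplification of the paper's algorithm): on hearing an advertised cost, a node schedules its own broadcast after a backoff proportional to its tentative cost; better advertisements heard during the backoff replace that tentative cost, so each node broadcasts its minimum cost to the sink only once.

```python
# Event-driven sketch of backoff-based cost-field setup (assumed
# simplification, not the paper's protocol implementation).
import heapq

def setup_cost_field(links, sink, gamma=1.0):
    """links: {node: [(neighbour, link_cost), ...]}; returns min cost per node.

    gamma scales the backoff so a node normally hears all better
    advertisements before its own timer fires.
    """
    cost = {node: float("inf") for node in links}
    cost[sink] = 0.0
    # events: (fire_time, node) -- the node broadcasts its current cost at fire_time
    events = [(0.0, sink)]
    broadcasted = set()
    while events:
        now, node = heapq.heappop(events)
        if node in broadcasted:
            continue                      # already advertised its final cost
        broadcasted.add(node)
        for neigh, link_cost in links[node]:
            new_cost = cost[node] + link_cost
            if new_cost < cost[neigh]:
                cost[neigh] = new_cost
                # backoff proportional to the tentative cost to the sink
                heapq.heappush(events, (now + gamma * new_cost, neigh))
    return cost

links = {
    "sink": [("a", 1), ("b", 4)],
    "a": [("sink", 1), ("b", 1), ("c", 5)],
    "b": [("sink", 4), ("a", 1), ("c", 1)],
    "c": [("a", 5), ("b", 1)],
}
print(setup_cost_field(links, "sink"))   # {'sink': 0.0, 'a': 1.0, 'b': 2.0, 'c': 3.0}
```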

Proceedings ArticleDOI
29 Nov 2001
TL;DR: This study shows that H-mine has high performance in various kinds of data, outperforms the previously developed algorithms in different settings, and is highly scalable in mining large databases.
Abstract: Methods for efficient mining of frequent patterns have been studied extensively by many researchers. However, the previously proposed methods still encounter some performance bottlenecks when mining databases with different data characteristics, such as dense vs. sparse, long vs. short patterns, memory-based vs. disk-based, etc. In this study, we propose a simple and novel hyper-linked data structure, H-struct, and a new mining algorithm, H-mine, which takes advantage of this data structure and dynamically adjusts links in the mining process. A distinct feature of this method is that it has very limited and precisely predictable space overhead and runs very fast in a memory-based setting. Moreover, it can be scaled up to very large databases by database partitioning, and when the data set becomes dense, (conditional) FP-trees can be constructed dynamically as part of the mining process. Our study shows that H-mine has high performance in various kinds of data, outperforms the previously developed algorithms in different settings, and is highly scalable in mining large databases. This study also proposes a new data mining methodology, space-preserving mining, which may have a strong impact on the future development of efficient and scalable data mining methods.

Proceedings ArticleDOI
15 May 2001
TL;DR: The paper presents the design of XtremWeb; two essential features of this design are multi-application support and high performance, the latter ensured by scalability, fault tolerance, efficient scheduling and a large base of volunteer PCs.
Abstract: Global computing achieves high throughput computing by harvesting a very large number of unused computing resources connected to the Internet. This parallel computing model targets a parallel architecture defined by a very high number of nodes, poor communication performance and continuously varying resources. The unprecedented scale of the global computing architecture paradigm requires us to revisit many basic issues related to parallel architecture programming models, performance models, and the class of applications or algorithms suitable for this architecture. XtremWeb is an experimental global computing platform dedicated to providing a tool for such studies. The paper presents the design of XtremWeb. Two essential features of this design are multi-application support and high performance. Accepting multiple applications allows institutions or enterprises to set up their own global computing applications or experiments. High performance is ensured by scalability, fault tolerance, efficient scheduling and a large base of volunteer PCs. We also present an implementation of the first global application running on XtremWeb.

Proceedings ArticleDOI
23 Apr 2001
TL;DR: This work argues that a computational economy is required in order to create a real-world scalable Grid because it provides a mechanism for regulating Grid resource demand and supply, offers an incentive for resource owners to be part of the Grid, and encourages consumers to optimally utilize resources and balance timeframe and access costs.
Abstract: Computational Grids are a promising platform for executing large-scale resource-intensive applications. However, resource management and scheduling in the Grid environment is a complex undertaking as resources are (geographically) distributed, heterogeneous in nature, owned by different individuals or organizations with their own policies, have different access and cost models, and have dynamically varying loads and availability. This introduces a number of challenging issues such as site autonomy, heterogeneous interaction, policy extensibility, resource allocation or co-allocation, online control, scalability, transparency, resource brokering, and "computational economy". A number of Grid systems (such as Globus and Legion) have addressed many of these issues with the exception of a computational economy. We argue that a computational economy is required in order to create a real-world scalable Grid because it provides a mechanism for regulating Grid resource demand and supply. It offers an incentive for resource owners to be part of the Grid and encourages consumers to optimally utilize resources and balance timeframe and access costs. We propose a 'computational economy framework' that builds on the existing Grid middleware systems and offers an infrastructure for resource management and trading in the Grid environment. We discuss the use of economic models for resource trading in the Nimrod/G resource broker and present deadline- and cost-based scheduling experimental results on the Grid.
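
A toy sketch of deadline- and cost-based scheduling in the spirit of the computational-economy argument (an assumed illustration, not the Nimrod/G scheduler): for each job, among the resources that can still finish it before the deadline and within the remaining budget, pick the cheapest.

```python
# Deadline/budget scheduling sketch (assumed illustration, not Nimrod/G):
# greedily assign each job to the cheapest resource that still meets the
# deadline and budget constraints.
def schedule(jobs, resources, deadline, budget):
    """jobs: list of job lengths in work units.
    resources: {name: {"rate": units per hour, "price": cost per unit}}.
    Returns {job_index: resource_name} or None if the constraints cannot be met."""
    plan, spent = {}, 0.0
    busy_until = {name: 0.0 for name in resources}
    for i, length in enumerate(jobs):
        candidates = []
        for name, r in resources.items():
            finish = busy_until[name] + length / r["rate"]
            cost = length * r["price"]
            if finish <= deadline and spent + cost <= budget:
                candidates.append((cost, finish, name))
        if not candidates:
            return None                       # deadline/budget infeasible
        cost, finish, name = min(candidates)  # cheapest feasible resource first
        plan[i], spent, busy_until[name] = name, spent + cost, finish
    return plan

resources = {"cheap": {"rate": 1.0, "price": 1.0}, "fast": {"rate": 4.0, "price": 3.0}}
# A tight deadline forces the scheduler to buy time on the expensive resource.
print(schedule(jobs=[2, 2, 2], resources=resources, deadline=4.0, budget=20.0))
```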

Proceedings ArticleDOI
01 May 2001
TL;DR: A fast and scalable algorithm for determining whether part or all of a query can be computed from materialized views and how it can be incorporated in transformation-based optimizers is presented.
Abstract: Materialized views can provide massive improvements in query processing time, especially for aggregation queries over large tables. To realize this potential, the query optimizer must know how and when to exploit materialized views. This paper presents a fast and scalable algorithm for determining whether part or all of a query can be computed from materialized views and describes how it can be incorporated in transformation-based optimizers. The current version handles views composed of selections, joins and a final group-by. Optimization remains fully cost based, that is, a single “best” rewrite is not selected by heuristic rules but multiple rewrites are generated and the optimizer chooses the best alternative in the normal way. Experimental results based on an implementation in Microsoft SQL Server show outstanding performance and scalability. Optimization time increases slowly with the number of views but remains low even up to a thousand.
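
A much-simplified sketch of the view-matching test such an algorithm performs (an assumed illustration, not the SQL Server implementation): a select-project-join query can be computed from a materialized view if the view joins the same tables, its filter is no stricter than the query's, and it exposes every column the query still needs.

```python
# View-matching sketch (assumed simplification, not the paper's algorithm):
# check whether a simple SPJ query can be answered from a materialized view.
def predicate_columns(predicate):
    """Columns referenced by a predicate encoded as (column, op, constant)."""
    return {predicate[0]}

def can_answer_from_view(query, view):
    """query/view: dicts with 'tables', 'predicates' and 'columns' sets."""
    same_tables = query["tables"] == view["tables"]
    # Every view predicate must also be required by the query, so the view
    # never filters out rows the query needs.
    view_not_stricter = view["predicates"] <= query["predicates"]
    # The query's output columns, plus the columns of predicates it still has
    # to apply on top of the view, must all be present in the view's output.
    extra_preds = query["predicates"] - view["predicates"]
    needed = query["columns"] | {c for p in extra_preds for c in predicate_columns(p)}
    return same_tables and view_not_stricter and needed <= view["columns"]

view = {"tables": {"sales", "product"},
        "predicates": {("year", ">=", 2000)},
        "columns": {"product_name", "amount", "year", "region"}}
query = {"tables": {"sales", "product"},
         "predicates": {("year", ">=", 2000), ("region", "=", "EU")},
         "columns": {"product_name", "amount"}}
print(can_answer_from_view(query, view))   # True: the residual region filter runs on the view
```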

Proceedings Article
11 Sep 2001
TL;DR: This paper models a file-sharing application, developing a probabilistic model to describe query behavior and expected query result sizes and an analytic model to describe system performance, and discusses the tradeoffs between the architectures.
Abstract: “Peer-to-peer” systems like Napster and Gnutella have recently become popular for sharing information. In this paper, we study the relevant issues and tradeoffs in designing a scalable P2P system. We focus on a subset of P2P systems, known as “hybrid” P2P, where some functionality is still centralized. (In Napster, for example, indexing is centralized, and file exchange is distributed.) We model a file-sharing application, developing a probabilistic model to describe query behavior and expected query result sizes. We also develop an analytic model to describe system performance. Using experimental data collected from a running, publicly available hybrid P2P system, we validate both models. We then present several hybrid P2P system architectures and evaluate them using our model. We discuss the tradeoffs between the architectures and highlight the effects of key parameter values on system performance.

Proceedings ArticleDOI
25 Nov 2001
TL;DR: Targeted at multi-hop wireless sensor networks, a set of low-power MAC design principles has been proposed, and a novel ultra-low power MAC is designed to be distributed in nature to support scalability, survivability and adaptability requirements.
Abstract: Targeted at multi-hop wireless sensor networks, a set of low-power MAC design principles has been proposed, and a novel ultra-low power MAC is designed to be distributed in nature to support scalability, survivability and adaptability requirements. Simple CSMA and spread spectrum techniques are combined to trade off bandwidth and power efficiency. A distributed algorithm is used to perform dynamic channel assignment. A novel wake-up radio scheme is incorporated to take advantage of new radio technologies. The notion of mobility awareness is introduced into an adaptive protocol to reduce network maintenance overhead. The resulting protocol shows much higher power efficiency for typical sensor network applications.

Journal ArticleDOI
TL;DR: Explores mechanisms for storage-level management in OceanStore, a global-scale distributed storage utility infrastructure, designed to scale to billions of users and exabytes of data, and concludes that OceanStore is self-maintaining.
Abstract: Explores mechanisms for storage-level management in OceanStore, a global-scale distributed storage utility infrastructure, designed to scale to billions of users and exabytes of data. OceanStore automatically recovers from server and network failures, incorporates new resources and adjusts to usage patterns. It provides its storage platform through adaptation, fault tolerance and repair. The only role of human administrators in the system is to physically attach or remove server hardware. Of course, an open question is how to scale a research prototype in such a way as to demonstrate the basic thesis of this article - that OceanStore is self-maintaining. The allure of connecting millions or billions of components together is the hope that aggregate systems can provide scalability and predictable behavior under a wide variety of failures. The OceanStore architecture is a step towards this goal.

Proceedings ArticleDOI
01 Oct 2001
TL;DR: This work proposes a key management scheme to periodically update the symmetric keys used by all pebbles, combining mobility-adaptive clustering and an effective probabilistic selection of the key-generating node, which meets the requirements of efficiency, scalability and security needed for the survivability of networks of pebbles (pebblenets).
Abstract: We consider the problem of securing communication in large ad hoc networks, i.e., wireless networks with no fixed, wired infrastructure and with multi-hop routes. Such networks, e.g., networks of sensors, are deployed for applications such as microsensing, monitoring and control, and for extending the peer-to-peer communication capability of smaller groups of network users. Because the nodes of these networks, which we term pebbles for their very limited size and large number, are resource constrained, only symmetric key cryptography is feasible. We propose a key management scheme to periodically update the symmetric keys used by all pebbles. By combining mobility-adaptive clustering and an effective probabilistic selection of the key-generating node, the proposed scheme meets the requirements of efficiency, scalability and security needed for the survivability of networks of pebbles (pebblenets).

Proceedings ArticleDOI
10 Nov 2001
TL;DR: In this article, a predictive analytical model that encompasses the performance and scaling characteristics of an important ASCI application is presented, and validated against measurements on several systems including ASCI Blue Mountain, ASCI White, and a Compaq Alphaserver ES45 system showing high accuracy.
Abstract: In this work we present a predictive analytical model that encompasses the performance and scaling characteristics of an important ASCI application. SAGE (SAIC's Adaptive Grid Eulerian hydrocode) is a multidimensional hydrodynamics code with adaptive mesh refinement. The model is validated against measurements on several systems including ASCI Blue Mountain, ASCI White, and a Compaq Alphaserver ES45 system showing high accuracy. It is parametric --- basic machine performance numbers (latency, MFLOPS rate, bandwidth) and application characteristics (problem size, decomposition method, etc.) serve as input. The model is applied to add insight into the performance of current systems, to reveal bottlenecks, and to illustrate where tuning efforts can be effective. We also use the model to predict performance on future systems.
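
A hedged sketch of the kind of parametric model the paper describes: per-cycle time is built from a compute term on the local sub-grid plus communication terms formed from machine latency and bandwidth. The structure is illustrative only; the coefficients, the decomposition assumption and the input numbers below are assumptions, not SAGE's published model.

```python
# Parametric performance-model sketch (structure illustrative; coefficients
# and inputs are assumptions, not SAGE's published model).
def cycle_time(cells_total, procs, flops_per_cell, mflops_rate,
               latency_s, bandwidth_bps, bytes_per_boundary_cell, neighbours=2):
    cells_per_proc = cells_total / procs
    # Compute term: work on the local sub-grid at the machine's sustained rate.
    compute = cells_per_proc * flops_per_cell / (mflops_rate * 1e6)
    # Communication term: exchange one boundary surface with each neighbour
    # (a simple slab decomposition is assumed here).
    boundary_cells = cells_per_proc ** (2.0 / 3.0)
    message_bytes = boundary_cells * bytes_per_boundary_cell
    comm = neighbours * (latency_s + message_bytes / bandwidth_bps)
    return compute + comm

# How does the modelled cycle time change as the processor count grows?
for p in (128, 512, 2048):
    t = cycle_time(cells_total=100e6, procs=p, flops_per_cell=500,
                   mflops_rate=1000, latency_s=5e-6, bandwidth_bps=300e6,
                   bytes_per_boundary_cell=80)
    print(p, "processors:", round(t, 4), "s per cycle")
```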

Proceedings ArticleDOI
12 Jun 2001
TL;DR: An overview of research in real-time data mining-based intrusion detection systems (IDSs) is presented, along with an architecture consisting of sensors, detectors, a data warehouse, and model generation components that improves the efficiency and scalability of the IDS.
Abstract: We present an overview of our research in real time data mining-based intrusion detection systems (IDSs). We focus on issues related to deploying a data mining-based IDS in a real time environment. We describe our approaches to address three types of issues: accuracy, efficiency, and usability. To improve accuracy, data mining programs are used to analyze audit data and extract features that can distinguish normal activities from intrusions; we use artificial anomalies along with normal and/or intrusion data to produce more effective misuse and anomaly detection models. To improve efficiency, the computational costs of features are analyzed and a multiple-model cost-based approach is used to produce detection models with low cost and high accuracy. We also present a distributed architecture for evaluating cost-sensitive models in real-time. To improve usability, adaptive learning algorithms are used to facilitate model construction and incremental updates; unsupervised anomaly detection algorithms are used to reduce the reliance on labeled data. We also present an architecture consisting of sensors, detectors, a data warehouse, and model generation components. This architecture facilitates the sharing and storage of audit data and the distribution of new or updated models. This architecture also improves the efficiency and scalability of the IDS.

Journal ArticleDOI
TL;DR: It is demonstrated that the incorporation of energy considerations into multicast algorithms can, indeed, result in improved energy efficiency.
Abstract: In this paper we address the problem of multicasting in ad hoc wireless networks from the viewpoint of energy efficiency. We discuss the impact of the wireless medium on the multicasting problem and the fundamental trade-offs that arise. We propose and evaluate several algorithms for defining multicast trees for session (or connection-oriented) traffic when transceiver resources are limited. The algorithms select the relay nodes and the corresponding transmission power levels, and achieve different degrees of scalability and performance. We demonstrate that the incorporation of energy considerations into multicast algorithms can, indeed, result in improved energy efficiency.
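
A sketch in the spirit of the incremental-power heuristics such work evaluates (an assumed simplification, not the authors' exact algorithms): because one wireless transmission reaches every node within range, the marginal energy to add a receiver to a tree is only the increase in some already-covered node's transmit power.

```python
# Greedy incremental-power broadcast tree sketch (assumed simplification,
# not the paper's algorithms): exploit the "wireless multicast advantage".
import math

def power_needed(a, b, alpha=2.0):
    """Transmit power to reach b from a, proportional to distance**alpha."""
    return math.dist(a, b) ** alpha

def broadcast_incremental_power(nodes, source):
    """Greedy tree construction: nodes is {name: (x, y)}; returns parent map and powers."""
    tx_power = {name: 0.0 for name in nodes}      # current transmit power per node
    covered, parent = {source}, {}
    while len(covered) < len(nodes):
        best = None
        for u in covered:
            for v in set(nodes) - covered:
                extra = max(0.0, power_needed(nodes[u], nodes[v]) - tx_power[u])
                if best is None or extra < best[0]:
                    best = (extra, u, v)
        extra, u, v = best
        tx_power[u] += extra                       # raise u's power just enough to reach v
        covered.add(v)
        parent[v] = u
    return parent, tx_power

nodes = {"s": (0, 0), "a": (1, 0), "b": (2, 0), "c": (0, 3)}
print(broadcast_incremental_power(nodes, "s"))
```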

Patent
13 Jun 2001
TL;DR: In this article, a Virtual Server Farm (VSF) is created out of a wide scale computing fabric ('Computing Grid') which is physically constructed once and then logically divided up into VSFs for various organizations on demand.
Abstract: Methods and apparatus providing, controlling and managing a dynamically sized, highly scalable and available server farm are disclosed. A Virtual Server Farm (VSF) is created out of a wide scale computing fabric ('Computing Grid') which is physically constructed once and then logically divided up into VSFs for various organizations on demand. Each organization retains independent administrative control of a VSF. A VSF is dynamically firewalled within the Computing Grid. Allocation and control of the elements in the VSF is performed by a control plane connected to all computing, networking, and storage elements in the computing grid through special control ports. The internal topology of each VSF is under control of the control plane. No physical rewiring is necessary in order to construct VSFs in many different configurations, including single-tier Web server or multi-tier Web-server, application server, database server configurations.

Posted Content
TL;DR: In this article, a high-level replica selection service that uses information regarding replica location and user preferences to guide selection from among storage replica alternatives is presented, and the use of Condor's ClassAds resource description and matchmaking mechanism as an efficient tool for representing and matching storage resource capabilities and policies against application requirements.
Abstract: The Globus Data Grid architecture provides a scalable infrastructure for the management of storage resources and data that are distributed across Grid environments. These services are designed to support a variety of scientific applications, ranging from high-energy physics to computational genomics, that require access to large amounts of data (terabytes or even petabytes) with varied quality of service requirements. By layering on a set of core services, such as data transport, security, and replica cataloging, one can construct various higher-level services. In this paper, we discuss the design and implementation of a high-level replica selection service that uses information regarding replica location and user preferences to guide selection from among storage replica alternatives. We first present a basic replica selection service design, then show how dynamic information collected using Globus information service capabilities concerning storage system properties can help improve and optimize the selection process. We demonstrate the use of Condor's ClassAds resource description and matchmaking mechanism as an efficient tool for representing and matching storage resource capabilities and policies against application requirements.

Patent
10 Sep 2001
TL;DR: In this paper, the authors provide on-demand, scalable computational resources to application providers over a distributed network and system, where resources are made available based on demand for applications, and application providers are charged fees based on the amount of resources utilized to satisfy the needs of the application.
Abstract: Method, system, apparatus, and computer program and computer program product provide on-demand, scalable computational resources to application providers over a distributed network and system. Resources are made available based on demand for applications. Application providers are charged fees based on the amount of resources utilized to satisfy the needs of the application. In providing compute resources, method and apparatus is capable of rapidly activating a plurality of instances of the applications as demand increases and to halt instances as demand drops. Application providers are charged based on metered amount of computational resources utilized in processing their applications. Application providers access the network to distribute applications onto network to utilize distributed compute resources for processing of the applications. Application providers are further capable of monitoring, updating and replacing distributed applications. Apparatus and system includes plurality of computing resources distributed across a network capable of restoring and snapshotting provisioned applications based on demand.

Proceedings ArticleDOI
15 May 2001
TL;DR: This work discusses the design and implementation of a high-level replica selection service that uses information regarding replica location and user preferences to guide selection from among storage replica alternatives, and shows how dynamic information collected using Globus information service capabilities concerning storage system properties can help improve and optimize the selection process.
Abstract: The Globus Data Grid architecture (I. Foster and C. Kesselman, 1998), provides a scalable infrastructure for the management of storage resources and data that are distributed across Grid environments. These services are designed to support a variety of scientific applications, ranging from high-energy physics to computational genomics, that require access to large amounts of data (terabytes or even petabytes) with varied quality of service requirements. By layering on a set of core services, such as data transport, security, and replica cataloging, one can construct various higher-level services. We discuss the design and implementation of a high-level replica selection service that uses information regarding replica location and user preferences to guide selection from among storage replica alternatives. We first present a basic replica selection service design, then show how dynamic information collected using Globus information service capabilities concerning storage system properties can help improve and optimize the selection process. We demonstrate the use of Condor's ClassAds resource description and matchmaking mechanism as an efficient tool for representing and matching storage resource capabilities and policies against application requirements.
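
A minimal sketch of ClassAds-style matchmaking as applied to replica selection (the attribute names and the rank expression are assumptions for illustration; the real service uses Condor's ClassAds language): each replica's storage system advertises its properties, the application states requirements and a rank, and the highest-ranked matching replica is chosen.

```python
# Matchmaking sketch (assumed illustration of the ClassAds idea, not the
# Condor ClassAds language or the replica selection service's code).
def select_replica(request, replica_ads):
    """request: {'requirements': callable(ad) -> bool, 'rank': callable(ad) -> float}."""
    matches = [ad for ad in replica_ads if request["requirements"](ad)]
    return max(matches, key=request["rank"]) if matches else None

replica_ads = [
    {"site": "anl", "bandwidth_mbps": 80, "free_gb": 500, "protocol": "gridftp"},
    {"site": "isi", "bandwidth_mbps": 200, "free_gb": 40, "protocol": "gridftp"},
    {"site": "lbl", "bandwidth_mbps": 300, "free_gb": 900, "protocol": "ftp"},
]
request = {
    "requirements": lambda ad: ad["protocol"] == "gridftp" and ad["free_gb"] >= 100,
    "rank": lambda ad: ad["bandwidth_mbps"],   # prefer the fastest qualifying replica
}
print(select_replica(request, replica_ads))    # the 'anl' ad is the only one that qualifies
```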

Proceedings ArticleDOI
01 Dec 2001
TL;DR: This paper surveys the design space of a new class of architectures called Grid Processor Architectures (GPAs), designed to scale with technology, allowing faster clock rates than conventional architectures while providing superior instruction-level parallelism on traditional workloads and high performance across a range of application classes.
Abstract: In this paper, we survey the design space of a new class of architectures called Grid Processor Architectures (GPAs). These architectures are designed to scale with technology, allowing faster clock rates than conventional architectures while providing superior instruction-level parallelism on traditional workloads and high performance across a range of application classes. A GPA consists of an array of ALUs, each with limited control, connected by a thin operand network. Programs are executed by mapping blocks of statically scheduled instructions to the ALU array and executing them dynamically in dataflow order. This organization enables the critical paths of instruction blocks to be executed on chains of ALUs without transmitting temporary values back to the register file, avoiding most of the large, unscalable structures that limit the scalability of conventional architectures. Finally, we present simulation results of a preliminary design, the GPA-1. With a half-cycle routing delay, we obtain performance roughly equal to an ideal 8-way, 512-entry window superscalar core. With no inter-ALU delay, perfect memory, and perfect branch prediction, the IPC of the GPA-1 is more than twice that of the ideal superscalar core, achieving an average of 11 IPC across nine SPEC CPU2000 and Mediabench benchmarks.

Proceedings ArticleDOI
22 Aug 2001
TL;DR: The Alpha 21364 processor provides a high-performance, highly scalable, and highly reliable network architecture with a variety of reliability features, such as per-flit ECC, that make the network well-suited to supporting communication-intensive server applications.
Abstract: The Alpha 21364 processor provides a high-performance, highly scalable, and highly reliable network architecture. The router runs at 1.2 GHz and routes packets at a peak bandwidth of 22.4 GB/s. The network architecture scales up to a 128-processor configuration, which can support up to four terabytes of distributed Rambus memory and hundreds of terabytes of disk storage. The distributed Rambus memory is kept coherent via a scalable, directory-based cache coherence scheme. The network also provides a variety of reliability features, such as per-flit ECC. These features make the 21364 network architecture well-suited to support communication-intensive server applications.

Proceedings ArticleDOI
01 Aug 2001
TL;DR: This paper quantifies the optimal scalability, in terms of network load, of distributed, complete failure detectors as a function of application-specified requirements, and presents a randomized, distributed, failure detector algorithm that imposes an equal expected load per group member.
Abstract: Process groups in distributed applications and services rely on failure detectors to detect process failures completely, and as quickly, accurately, and scalably as possible, even in the face of unreliable message deliveries. In this paper, we look at quantifying the optimal scalability, in terms of network load (in messages per second, with messages having a size limit), of distributed, complete failure detectors as a function of application-specified requirements. These requirements are 1) quick failure detection by some non-faulty process, and 2) accuracy of failure detection. We assume a crash-recovery (non-Byzantine) failure model, and a network model that is probabilistically unreliable (w.r.t. message deliveries and process failures). First, we characterize, under certain independence assumptions, the optimum worst-case network load imposed by any failure detector that achieves an application's requirements. We then discuss why traditional heartbeating schemes are inherently unscalable according to the optimal load. We also present a randomized, distributed failure detector algorithm that imposes an equal expected load per group member. This protocol satisfies the application-defined constraints of completeness and accuracy, and speed of detection on average. It imposes a network load that differs from the optimal by a sub-optimality factor that is much lower than that for traditional distributed heartbeating schemes. Moreover, this sub-optimality factor does not vary with group size (for large groups).
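
A hedged sketch of the randomized probing style of failure detector the paper analyzes (a simulation-style illustration, not the paper's exact protocol): in every protocol period each member pings one member chosen uniformly at random, so the expected load is spread equally across the group while a crashed member is still detected quickly by some non-faulty member.

```python
# Randomized-probe failure detection sketch (illustration of the idea, not the
# paper's protocol): every live member probes one random member per round.
import random

def protocol_period(members, crashed, drop_prob=0.05, rng=random):
    """One round: every live member probes one random other member.

    Returns the set of members suspected this round. Message loss is
    modelled by drop_prob, which is why detections are probabilistic.
    """
    suspected = set()
    for m in members:
        if m in crashed:
            continue
        target = rng.choice([x for x in members if x != m])
        ack_received = (target not in crashed) and (rng.random() > drop_prob)
        if not ack_received:
            suspected.add(target)
    return suspected

members = [f"n{i}" for i in range(20)]
crashed = {"n7"}
random.seed(1)
for rnd in range(1, 6):
    print(f"round {rnd}: suspected {sorted(protocol_period(members, crashed))}")
```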