Open Access, Proceedings Article

Multi-class latency-bounded Web services

V. Kanodia, +1 more
- pp 231-239
TLDR
The goal of this work is to design a general "front-end" algorithm that uses a general service abstraction not only to adaptively control the latency of a particular class, but also to assess the inter-class relationships.
Abstract
Two recent advances have resulted in significant improvements in Web server quality of service (QoS). First, both centralized and distributed Web servers can provide isolation among service classes by fairly distributing system resources. Second, session admission control can protect classes from performance degradation due to overload. The goal of this work is to design a general "front-end" algorithm that uses these two building blocks to support a new Web service model, namely, multi-class services which control response latencies to within pre-specified targets. Our key technique is to devise a general service abstraction that is used not only to adaptively control the latency of a particular class, but also to assess the inter-class relationships. In this way, we capture the extent to which classes are isolated or share system resources (as determined by the server architecture and system internals) and hence their effects on each other's QoS. For example, if the server provides class isolation (i.e., a minimum fraction of system resources independent of other classes), yet also allows a class to utilize unused resources from other classes, the algorithm infers and exploits this behavior without an explicit low-level model of the server. Thus, as new functionalities are incorporated into Web servers, the approach naturally exploits their properties to efficiently satisfy the classes' performance targets. We validate the scheme with trace-driven simulations.
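
The idea of a front-end that admits sessions based only on measured behavior, without a low-level model of the server, can be illustrated with a minimal sketch. This is not the service abstraction from the paper: it substitutes a simple exponentially weighted moving average (EWMA) of observed latencies per class, and the names `FrontEnd`, `ClassState`, `record_latency`, `admit`, and the weight `alpha` are all hypothetical. A new session is admitted only if no class is currently over its latency target, which is a crude stand-in for accounting for inter-class effects.

```python
from dataclasses import dataclass, field


@dataclass
class ClassState:
    target_latency: float          # pre-specified latency target, in seconds
    smoothed_latency: float = 0.0  # EWMA of observed response latencies
    samples: int = 0               # number of latency measurements seen


@dataclass
class FrontEnd:
    """Hypothetical front-end: admission decisions use measured latency only."""
    alpha: float = 0.2  # EWMA weight (assumed value, not from the paper)
    classes: dict = field(default_factory=dict)

    def add_class(self, name: str, target_latency: float) -> None:
        self.classes[name] = ClassState(target_latency)

    def record_latency(self, name: str, latency: float) -> None:
        """Fold one observed response latency into the class's EWMA."""
        state = self.classes[name]
        if state.samples == 0:
            state.smoothed_latency = latency
        else:
            state.smoothed_latency = (
                (1 - self.alpha) * state.smoothed_latency + self.alpha * latency
            )
        state.samples += 1

    def admit(self, name: str) -> bool:
        """Admit a new session of class `name` only if every class meets its target.

        Checking all classes, not just `name`, reflects that admitting load in
        one class may degrade another class that shares server resources.
        """
        if name not in self.classes:
            raise KeyError(f"unknown service class: {name}")
        return all(
            s.smoothed_latency <= s.target_latency for s in self.classes.values()
        )


# Example with two assumed classes and made-up measurements.
front_end = FrontEnd()
front_end.add_class("gold", target_latency=0.5)
front_end.add_class("bronze", target_latency=2.0)
front_end.record_latency("gold", 0.3)
front_end.record_latency("bronze", 1.0)
admitted_before = front_end.admit("bronze")  # both classes within target
front_end.record_latency("gold", 2.0)        # gold's smoothed latency rises above 0.5
admitted_after = front_end.admit("bronze")   # bronze is now rejected to protect gold
```

A controller that tracks only a moving average reacts after a target has already been violated; a predictive abstraction of the kind the abstract describes would aim to reject load before that point.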

Citations
Proceedings Article

An analytical model for multi-tier internet services and its applications

TL;DR: This paper presents a model based on a network of queues, where the queues represent different tiers of the application, sufficiently general to capture the behavior of tiers with significantly different performance characteristics and application idiosyncrasies such as session-based workloads, concurrency limits, and caching at intermediate tiers.
Journal Article

Performance guarantees for Web server end-systems: a control-theoretical approach

TL;DR: This paper uses feedback control theory to achieve overload protection, performance guarantees, and service differentiation in the presence of load unpredictability, and shows that control-theoretic techniques offer a sound way of achieving desired performance in performance-critical Internet applications.
Journal Article

Agile dynamic provisioning of multi-tier Internet applications

TL;DR: A novel dynamic provisioning technique for multi-tier Internet applications that employs a flexible queuing model to determine how much of the resources to allocate to each tier of the application, and a combination of predictive and reactive methods that determine when to provision these resources, both at large and small time scales is proposed.
Journal Article

Session-based admission control: a mechanism for peak load management of commercial Web sites

TL;DR: It is shown that a Web server augmented with the admission control mechanism is able to provide a fair guarantee of completion, for any accepted session, independent of a session length, which is a critical requirement for any e-business.
Proceedings Article

Adaptive overload control for busy internet servers

TL;DR: This paper presents a set of techniques for managing overload in complex, dynamic Internet services based on an adaptive admission control mechanism that attempts to bound the 90th-percentile response time of requests flowing through the service.
References
Journal Article

Self-similarity in World Wide Web traffic: evidence and possible causes

TL;DR: It is shown that the self-similarity in WWW traffic can be explained based on the underlying distributions of WWW document sizes, the effects of caching and user preference in file transfer, the effect of user "think time", and the superimposition of many such transfers in a local-area network.
Proceedings Article

Cost-aware WWW proxy caching algorithms

TL;DR: GreedyDual-Size incorporates locality with cost and size concerns in a simple, non-parameterized fashion for high performance, and can potentially improve the performance of main-memory caching of Web documents.
Proceedings Article

Resource containers: a new facility for resource management in server systems

TL;DR: This work proposes and evaluates a new operating system abstraction called a resource container, which separates the notion of a protection domain from that of a resource principal, enables fine-grained resource management in server systems, and allows the development of robust servers with simple and firm control over priority policies.
Journal Article

Locality-aware request distribution in cluster-based network servers

TL;DR: A simple, practical strategy for locality-aware request distribution (LARD), in which the front-end distributes incoming requests in a manner that achieves high locality in the back-ends' main memory caches as well as load balancing.