Author

Mohit Aron

Bio: Mohit Aron is an academic researcher from Rice University. The author has contributed to research in topics: Web server & Server. The author has an h-index of 11 and has co-authored 14 publications receiving 1631 citations.
Topics: Web server, Server, Network packet, Server farm, H-TCP

Papers
Journal ArticleDOI
01 Oct 1998
TL;DR: A simple, practical strategy for locality-aware request distribution (LARD) is introduced, in which the front-end distributes incoming requests in a manner that achieves high locality in the back-ends' main memory caches as well as load balancing.
Abstract: We consider cluster-based network servers in which a front-end directs incoming requests to one of a number of back-ends. Specifically, we consider content-based request distribution: the front-end uses the content requested, in addition to information about the load on the back-end nodes, to choose which back-end will handle this request. Content-based request distribution can improve locality in the back-ends' main memory caches, increase secondary storage scalability by partitioning the server's database, and provide the ability to employ back-end nodes that are specialized for certain types of requests. As a specific policy for content-based request distribution, we introduce a simple, practical strategy for locality-aware request distribution (LARD). With LARD, the front-end distributes incoming requests in a manner that achieves high locality in the back-ends' main memory caches as well as load balancing. Locality is increased by dynamically subdividing the server's working set over the back-ends. Trace-based simulation results and measurements on a prototype implementation demonstrate substantial performance improvements over state-of-the-art approaches that use only load information to distribute requests. On workloads with working sets that do not fit in a single server node's main memory cache, the achieved throughput exceeds that of the state-of-the-art approach by a factor of two to four. With content-based distribution, incoming requests must be handed off to a back-end in a manner transparent to the client, after the front-end has inspected the content of the request. To this end, we introduce an efficient TCP handoff protocol that can hand off an established TCP connection in a client-transparent manner.

643 citations
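
The LARD policy described above admits a compact statement. The following is a minimal sketch of the basic policy: the re-assignment rule follows the paper's published pseudocode, but the threshold values and the class and method names are illustrative, not the authors' code.

```python
# Minimal sketch of basic LARD; T_LOW/T_HIGH values are illustrative.
T_LOW, T_HIGH = 25, 65   # load measured in active connections per back-end

class LardFrontEnd:
    def __init__(self, backends):
        self.load = {b: 0 for b in backends}   # active connections per node
        self.assigned = {}                     # target (e.g. URL path) -> node

    def least_loaded(self):
        return min(self.load, key=self.load.get)

    def dispatch(self, target):
        node = self.assigned.get(target)
        if node is None:
            node = self.least_loaded()         # first request: bind target here
        elif ((self.load[node] > T_HIGH and
               self.load[self.least_loaded()] < T_LOW) or
              self.load[node] >= 2 * T_HIGH):
            node = self.least_loaded()         # move target off an overloaded node
        self.assigned[target] = node
        self.load[node] += 1                   # decremented when the reply completes
        return node
```

Re-assigning a target only when the load gap between nodes is large keeps each target cached on one node most of the time, which is what yields the two-to-four-fold throughput gain on working sets that exceed a single node's cache.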

Proceedings Article
18 Jun 2000
TL;DR: In this architecture, a level-4 switch acts as the point of contact for the server on the Internet and distributes the incoming requests to a number of back-end nodes; the expensive operations of TCP connection establishment and handoff are distributed among the back-ends rather than being centralized in a front-end node.
Abstract: We present a scalable architecture for content-aware request distribution in web server clusters. In this architecture, a level-4 switch acts as the point of contact for the server on the Internet and distributes the incoming requests to a number of back-end nodes. The switch does not perform any content-based distribution. This function is performed by each of the back-end nodes, which may forward the incoming request to another back-end based on the requested content. In terms of scalability, this architecture compares favorably to existing approaches where a front-end node performs content-based distribution. In our architecture, the expensive operations of TCP connection establishment and handoff are distributed among the back-ends, rather than being centralized in the front-end node. Only a minimal additional latency penalty is paid for much improved scalability. We have implemented this new architecture, and we demonstrate its superior scalability by comparing it to a system that performs content-aware distribution in the front-end, both under synthetic and trace-driven workloads.

273 citations
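
The division of labor described above is easy to see in sketch form. Below is a hedged sketch of one back-end's dispatch step, assuming hypothetical serve(), handoff(), and dispatcher.lookup() helpers that wrap the kernel-level handoff mechanism; they are not the paper's actual interfaces.

```python
# Sketch of the per-back-end dispatch step in the distributed architecture;
# serve(), handoff(), and dispatcher.lookup() are hypothetical helpers.
def handle_connection(conn, self_id, dispatcher):
    request = conn.read_request()            # this node already accepted the TCP setup
    owner = dispatcher.lookup(request.url)   # content-aware policy, e.g. LARD
    if owner == self_id:
        serve(conn, request)                 # serve from the local cache/disk
    else:
        handoff(conn, request, owner)        # hand the TCP connection to the owner
```

Because every back-end runs this loop, connection-establishment and handoff costs scale with the cluster size instead of saturating a single front-end.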

Proceedings ArticleDOI
01 Jun 2000
TL;DR: This paper presents a design and evaluates a prototype implementation that extends existing techniques for performance isolation on a single-node server to cluster-based servers, and demonstrates that cluster reserves are effective in ensuring performance isolation while enabling high utilization of the server resources.
Abstract: In network (e.g., Web) servers, it is often desirable to isolate the performance of different classes of requests from each other. That is, one seeks to guarantee that a certain minimal proportion of server resources is available for a class of requests, independent of the load imposed by other requests. Recent work demonstrates how to achieve this performance isolation in servers consisting of a single, centralized node; however, achieving performance isolation in a distributed, cluster-based server remains a problem. This paper introduces a new abstraction, the cluster reserve, which represents a resource principal in a cluster-based network server. We present a design and evaluate a prototype implementation that extends existing techniques for performance isolation on a single-node server to cluster-based servers. In our design, the dynamic cluster-wide resource management problem is formulated as a constrained optimization problem, with the resource allocations on individual machines as independent variables and the desired cluster-wide resource allocations as constraints. Periodically collected resource usages serve as further inputs to the problem. Experimental results show that cluster reserves are effective in providing performance isolation in cluster-based servers. We demonstrate that, in a number of different scenarios, cluster reserves are effective in ensuring performance isolation while enabling high utilization of the server resources.

270 citations
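
The allocation step described above can be illustrated with a toy allocator. The sketch below is a simple proportional heuristic, assuming per-class CPU reserves expressed as cluster-wide fractions; it is a stand-in for, not a reproduction of, the paper's constrained-optimization solver.

```python
# Toy stand-in for the cluster-wide allocation step: give each node a
# share of a class's cluster-wide reserve in proportion to where that
# class recently consumed CPU, then renormalize each node to 100%.
def recompute_shares(usage, reserves):
    """usage[node][cls]: recent CPU fraction; reserves[cls]: cluster-wide target."""
    nodes, n = list(usage), len(usage)
    shares = {node: {} for node in nodes}
    for cls, target in reserves.items():
        total = sum(usage[node].get(cls, 0.0) for node in nodes)
        for node in nodes:
            w = usage[node].get(cls, 0.0) / total if total > 0 else 1.0 / n
            shares[node][cls] = target * n * w   # cluster-wide average == target
    for node in nodes:                            # each node's shares sum to 1
        s = sum(shares[node].values())
        if s > 0:
            shares[node] = {c: v / s for c, v in shares[node].items()}
    return shares
```

For example, a class reserved 30% cluster-wide but active on only one node ends up with most of that node's CPU, which is the kind of behavior the paper's optimization formalizes with explicit constraints.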

Proceedings Article
06 Jun 1999
TL;DR: Two mechanisms for the efficient, content-based distribution of HTTP/1.1 requests among the back-end nodes of a cluster server, combined with an extension of the locality-aware request distribution (LARD) policy, are presented.
Abstract: This paper studies mechanisms and policies for supporting HTTP/1.1 persistent connections in cluster-based Web servers that employ content-based request distribution. We present two mechanisms for the efficient, content-based distribution of HTTP/1.1 requests among the back-end nodes of a cluster server. A trace-driven simulation shows that these mechanisms, combined with an extension of the locality-aware request distribution (LARD) policy, are effective in yielding scalable performance for HTTP/1.1 requests. We implemented the simpler of these two mechanisms, back-end forwarding. Measurements of this mechanism in connection with extended LARD on a prototype cluster, driven with traces from actual Web servers, confirm the simulation results. The throughput of the prototype is up to four times better than that achieved by conventional weighted round-robin request distribution. In addition, throughput with persistent connections is up to 26% better than without.

128 citations
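
A hedged sketch of the back-end forwarding mechanism named above, assuming hypothetical read_local() and fetch_from() helpers: the node holding the persistent connection answers every request on it, fetching non-local content from the responsible back-end over the cluster network.

```python
# Sketch of back-end forwarding for HTTP/1.1 persistent connections;
# read_local() and fetch_from() are hypothetical helpers.
def serve_persistent(conn, self_id, dispatcher):
    for request in conn.requests():           # many requests, one TCP connection
        owner = dispatcher.lookup(request.url)
        if owner == self_id:
            body = read_local(request.url)    # content assigned here: stay local
        else:
            body = fetch_from(owner, request) # forward over the cluster LAN
        conn.send_response(body)              # the reply always leaves this node
```

This keeps the connection pinned to one node, avoiding repeated connection handoffs at the cost of relaying some responses internally.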

Journal ArticleDOI
TL;DR: This paper proposes and evaluates soft timers, a new operating system facility that allows the efficient scheduling of software events at a granularity down to tens of microseconds, and shows that this technique can improve the throughput of a Web server by up to 25%.
Abstract: This paper proposes and evaluates soft timers, a new operating system facility that allows the efficient scheduling of software events at a granularity down to tens of microseconds. Soft timers can be used to avoid interrupts and reduce context switches associated with network processing, without sacrificing low communication delays. More specifically, soft timers enable transport protocols like TCP to efficiently perform rate-based clocking of packet transmissions. Experiments indicate that soft timers allow a server to employ rate-based clocking with little CPU overhead (2-6%) at high aggregate bandwidths. Soft timers can also be used to perform network polling, which eliminates network interrupts and increases the memory access locality of the network subsystem without sacrificing delay. Experiments show that this technique can improve the throughput of a Web server by up to 25%.

102 citations
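
The mechanism lends itself to a small user-level sketch. In the kernel, the checks sit on existing paths such as system-call returns and interrupt exits (the "trigger states"); here poll_soft_timers() is a stand-in for that hook.

```python
# User-level sketch of soft timers: callbacks fire when a trigger state
# happens to poll them, not from a per-event hardware interrupt.
import heapq, itertools, time

_pending = []             # min-heap of (deadline, seq, callback)
_seq = itertools.count()  # tiebreaker so callbacks are never compared

def schedule_soft_timer(delay_s, callback):
    heapq.heappush(_pending, (time.monotonic() + delay_s, next(_seq), callback))

def poll_soft_timers():
    """Invoked at trigger states; runs every timer whose deadline has passed."""
    now = time.monotonic()
    while _pending and _pending[0][0] <= now:
        _, _, cb = heapq.heappop(_pending)
        cb()
```

Rate-based clocking then amounts to: after each transmission, schedule the next packet's send a few tens of microseconds out and let whichever trigger state occurs first fire it.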


Cited by
Proceedings ArticleDOI
21 Oct 2001
TL;DR: Experimental results from a prototype confirm that the system adapts to offered load and resource availability, and can reduce server energy usage by 29% or more for a typical Web workload.
Abstract: Internet hosting centers serve multiple service sites from a common hardware base. This paper presents the design and implementation of an architecture for resource management in a hosting center operating system, with an emphasis on energy as a driving resource management issue for large server clusters. The goals are to provision server resources for co-hosted services in a way that automatically adapts to offered load, improve the energy efficiency of server clusters by dynamically resizing the active server set, and respond to power supply disruptions or thermal events by degrading service in accordance with negotiated Service Level Agreements (SLAs). Our system is based on an economic approach to managing shared server resources, in which services "bid" for resources as a function of delivered performance. The system continuously monitors load and plans resource allotments by estimating the value of their effects on service performance. A greedy resource allocation algorithm adjusts resource prices to balance supply and demand, allocating resources to their most efficient use. A reconfigurable server switching infrastructure directs request traffic to the servers assigned to each service. Experimental results from a prototype confirm that the system adapts to offered load and resource availability, and can reduce server energy usage by 29% or more for a typical Web workload.

1,492 citations
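
The greedy, bid-driven allocation described above can be illustrated in a few lines; the utility shapes and numbers below are invented for illustration and are not from the paper.

```python
# Toy greedy allocator: grant server units one at a time to the service
# with the highest marginal bid; units nobody bids for stay powered down.
def allocate(total_units, bids):
    """bids[svc](n) -> marginal value the service places on its (n+1)-th unit."""
    alloc = {svc: 0 for svc in bids}
    for _ in range(total_units):
        svc = max(bids, key=lambda s: bids[s](alloc[s]))
        if bids[svc](alloc[svc]) <= 0:
            break                        # remaining servers can be switched off
        alloc[svc] += 1
    return alloc

# Illustrative diminishing-returns bids for two co-hosted sites:
bids = {"siteA": lambda n: 10 - 2 * n, "siteB": lambda n: 6 - n}
print(allocate(8, bids))                 # -> {'siteA': 4, 'siteB': 4}
```

Leaving low-value units unallocated is exactly where the energy savings come from: idle servers can be removed from the active set.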

Proceedings Article
11 Apr 2007
TL;DR: This work presents Sandpiper, a system that automates the task of monitoring and detecting hotspots, determining a new mapping of physical to virtual resources and initiating the necessary migrations, and implements a black-box approach that is fully OS- and application-agnostic and a gray-box approach that exploits OS- and application-level statistics.
Abstract: Virtualization can provide significant benefits in data centers by enabling virtual machine migration to eliminate hotspots. We present Sandpiper, a system that automates the task of monitoring and detecting hotspots, determining a new mapping of physical to virtual resources and initiating the necessary migrations. Sandpiper implements a black-box approach that is fully OS- and application-agnostic and a gray-box approach that exploits OS- and application-level statistics. We implement our techniques in Xen and conduct a detailed evaluation using a mix of CPU, network and memory-intensive applications. Our results show that Sandpiper is able to resolve single server hotspots within 20 seconds and scales well to larger, data center environments. We also show that the gray-box approach can help Sandpiper make more informed decisions, particularly in response to memory pressure.

931 citations
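
The black-box side can be sketched with a volume-style metric along the lines of the one the paper uses; the data layout below is assumed for illustration, and monitoring and hotspot thresholds are not shown.

```python
# Sketch of black-box candidate selection: fold per-VM utilizations into
# a "volume" that blows up as any resource nears saturation, and prefer
# migrating the VM with the highest volume per MB of RAM, i.e. the
# cheapest move that relieves the most pressure.
def volume(cpu, net, mem):
    """cpu/net/mem are utilization fractions in [0, 1)."""
    return 1.0 / ((1 - cpu) * (1 - net) * (1 - mem))

def pick_vm_to_migrate(vms):
    """vms: dicts with 'cpu', 'net', 'mem' utilizations and 'ram_mb' footprint."""
    return max(vms, key=lambda v: volume(v["cpu"], v["net"], v["mem"]) / v["ram_mb"])
```

Migration cost in Xen tracks the VM's memory footprint, which is why a ratio rather than raw volume drives the choice.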

Proceedings Article
10 Apr 2005
TL;DR: This paper examines a theoretical thermodynamic formulation that uses information about steady-state hot spots and cold spots in the data center to develop real-world scheduling algorithms and, building on the resulting insights, develops an alternate approach to the problem of heat management through temperature-aware workload placement.
Abstract: Trends towards consolidation and higher-density computing configurations make the problem of heat management one of the critical challenges in emerging data centers. Conventional approaches to addressing this problem have focused at the facilities level to develop new cooling technologies or optimize the delivery of cooling. In contrast to these approaches, our paper explores an alternate dimension to address this problem, namely a systems-level solution to control the heat generation through temperature-aware workload placement. We first examine a theoretic thermodynamic formulation that uses information about steady state hot spots and cold spots in the data center and develop real-world scheduling algorithms. Based on the insights from these results, we develop an alternate approach. Our new approach leverages the non-intuitive observation that the source of cooling inefficiencies can often be in locations spatially uncorrelated with its manifested consequences; this enables additional energy savings. Overall, our results demonstrate up to a factor of two reduction in annual data center cooling costs over location-agnostic workload distribution, purely through software optimizations without the need for any costly capital investment.

740 citations
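
As a toy illustration of temperature-aware placement, the sketch below packs work onto the servers whose heat is cheapest to remove first; the per-server cooling-cost score is an assumed input (e.g., derived from inlet temperatures or recirculation estimates), not the paper's model.

```python
# Toy temperature-aware placement: fill cheapest-to-cool servers first.
def place_jobs(job_demands, servers):
    """servers: (name, cooling_cost, capacity) tuples; demands in the same units."""
    order = sorted(servers, key=lambda s: s[1])     # cheapest-to-cool first
    free = {name: cap for name, _, cap in servers}
    placement = {}
    for i, demand in enumerate(job_demands):
        for name, _, _ in order:
            if free[name] >= demand:
                placement[f"job{i}"] = name
                free[name] -= demand
                break
    return placement
```

The paper's key refinement is that the cost scores must capture recirculation effects, since the cause of a cooling inefficiency may be spatially far from the hot spot it produces.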

Proceedings ArticleDOI
21 Mar 2007
TL;DR: An adaptive resource control system that dynamically adjusts the resource shares to individual tiers in order to meet application-level quality of service (QoS) goals while achieving high resource utilization in the data center is developed.
Abstract: Data centers are often under-utilized due to over-provisioning as well as time-varying resource demands of typical enterprise applications. One approach to increase resource utilization is to consolidate applications in a shared infrastructure using virtualization. Meeting application-level quality of service (QoS) goals becomes a challenge in a consolidated environment as application resource needs differ. Furthermore, for multi-tier applications, the amount of resources needed to achieve their QoS goals might be different at each tier and may also depend on availability of resources in other tiers. In this paper, we develop an adaptive resource control system that dynamically adjusts the resource shares to individual tiers in order to meet application-level QoS goals while achieving high resource utilization in the data center. Our control system is developed using classical control theory, and we used a black-box system modeling approach to overcome the absence of first principle models for complex enterprise applications and systems. To evaluate our controllers, we built a testbed simulating a virtual data center using Xen virtual machines. We experimented with two multi-tier applications in this virtual data center: a two-tier implementation of RUBiS, an online auction site, and a two-tier Java implementation of TPC-W. Our results indicate that the proposed control system is able to maintain high resource utilization and meets QoS goals in spite of varying resource demands from the applications.

645 citations
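
A minimal sketch of the feedback idea, reduced to a single-tier integral-style controller: nudge the tier's CPU share until its utilization of that share sits at a set point, so the tier keeps headroom to meet its QoS goal. The set point, gain, and bounds here are illustrative; the paper derives its controllers formally rather than hand-tuning them like this.

```python
# Toy per-tier CPU-share controller, run once per sampling interval.
def adjust_share(share, consumed, setpoint=0.8, gain=0.5, lo=0.05, hi=1.0):
    """share: current CPU cap (fraction); consumed: CPU used last interval."""
    utilization = consumed / share if share > 0 else 1.0
    error = utilization - setpoint        # > 0 means the tier is being squeezed
    share *= 1.0 + gain * error           # multiplicative integral-style update
    return min(hi, max(lo, share))        # clamp to the valid share range
```

Coupling between tiers, where relieving one tier shifts the bottleneck to another, is what makes the full multi-tier problem need the paper's more careful control design.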

Journal ArticleDOI
TL;DR: A novel dynamic provisioning technique for multi-tier Internet applications that employs a flexible queuing model to determine how much of the resources to allocate to each tier of the application, and a combination of predictive and reactive methods that determine when to provision these resources, both at large and small time scales is proposed.
Abstract: Dynamic capacity provisioning is a useful technique for handling the multi-time-scale variations seen in Internet workloads. In this article, we propose a novel dynamic provisioning technique for multi-tier Internet applications that employs (1) a flexible queuing model to determine how much of the resources to allocate to each tier of the application, and (2) a combination of predictive and reactive methods that determine when to provision these resources, both at large and small time scales. We propose a novel data center architecture based on virtual machine monitors to reduce provisioning overheads. Our experiments on a forty-machine Xen/Linux-based hosting platform demonstrate the responsiveness of our technique in handling dynamic workloads. In one scenario where a flash crowd caused the workload of a three-tier application to double, our technique was able to double the application capacity within five minutes, thus maintaining response-time targets. Our technique also reduced the overhead of switching servers across applications from several minutes to less than a second, while meeting the performance targets of residual sessions.

554 citations
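
The "how much" half of the technique reduces to a per-tier queuing calculation. Below is a worked toy version that uses an M/M/1 approximation in place of the paper's more general model; the numbers are invented for illustration.

```python
# Toy per-tier provisioning: with M/M/1 response time R = 1/(mu - lambda),
# a response-time target R caps each server at lambda_max = mu - 1/R.
import math

def servers_needed(arrival_rate, service_rate, resp_target):
    """arrival_rate, service_rate in req/s; resp_target in seconds."""
    lam_max = service_rate - 1.0 / resp_target
    if lam_max <= 0:
        raise ValueError("target unreachable at this per-server service rate")
    return math.ceil(arrival_rate / lam_max)

# Example: 900 req/s offered to a tier, 120 req/s per server, 1 s target:
print(servers_needed(900, 120, 1.0))     # -> ceil(900 / 119) = 8
```

The predictive path would run a calculation like this at coarse time scales on forecast peak rates; the reactive path re-runs it when observed arrivals outstrip the forecast.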