
Showing papers on "Load balancing (computing) published in 2008"


Proceedings ArticleDOI
15 Nov 2008
TL;DR: The design of an agile data center with integrated server and storage virtualization technologies is described and a novel load balancing algorithm called VectorDot is proposed for handling the hierarchical and multi-dimensional resource constraints in such systems.
Abstract: We describe the design of an agile data center with integrated server and storage virtualization technologies. Such data centers form a key building block for new cloud computing architectures. We also show how to leverage this integrated agility for non-disruptive load balancing in data centers across multiple resource layers - servers, switches, and storage. We propose a novel load balancing algorithm called VectorDot for handling the hierarchical and multi-dimensional resource constraints in such systems. The algorithm, inspired by the successful Toyoda method for multi-dimensional knapsacks, is the first of its kind. We evaluate our system on a range of synthetic and real data center testbeds comprising VMware ESX servers, IBM SAN Volume Controller, and Cisco and Brocade switches. Experiments under varied conditions demonstrate the end-to-end validity of our system and the ability of VectorDot to efficiently remove overloads on server, switch, and storage nodes.
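
The abstract does not spell the algorithm out, but its core intuition can be sketched briefly: place a workload on the node whose current load vector has the smallest dot product with the workload's demand vector, so heavily loaded dimensions repel further load of the same kind. The sketch below is a simplified, single-level illustration under that reading; the node names, four-dimensional load vectors, and capacity check are hypothetical, and the actual VectorDot additionally handles hierarchical constraints across servers, switches, and storage.

```python
# Hedged sketch of a VectorDot-style placement step (details assumed, not taken
# from the paper). Each node and each VM is described by a vector of resource
# fractions (e.g., CPU, memory, network I/O, storage I/O), normalized to capacity.

def vectordot_choose_node(vm_demand, nodes):
    """Pick the node whose current load vector is least aligned with the VM's
    demand vector, i.e., the smallest dot product, skipping nodes that any
    single resource dimension would overflow."""
    best_node, best_score = None, float("inf")
    for node in nodes:
        load = node["load"]
        if any(l + d > 1.0 for l, d in zip(load, vm_demand)):
            continue  # placing the VM here would exceed capacity on some dimension
        score = sum(l * d for l, d in zip(load, vm_demand))
        if score < best_score:
            best_node, best_score = node, score
    return best_node

nodes = [
    {"name": "esx1", "load": [0.70, 0.50, 0.20, 0.30]},
    {"name": "esx2", "load": [0.30, 0.40, 0.10, 0.20]},
]
print(vectordot_choose_node([0.20, 0.10, 0.05, 0.10], nodes)["name"])  # -> esx2
```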

367 citations


Patent
11 Mar 2008
TL;DR: In this article, the authors propose a spillover management technique for virtual servers of an appliance based on bandwidth, where a network administrator may configure a bandwidth threshold for one or more virtual servers.
Abstract: The present solution provides a spillover management technique for virtual servers of an appliance based on bandwidth. A network administrator may configure a bandwidth threshold for one or more virtual servers, such as virtual servers providing acceleration or load balancing for one or more services. The bandwidth threshold may be specified as a number of bytes transferred via the virtual server. The bandwidth threshold may also be specified as a round trip time or derivative thereof. A user may specify the bandwidth threshold via a configuration interface. Otherwise, the appliance may establish the bandwidth threshold. The appliance monitors the bandwidth used by a first virtual server. In response to detecting the bandwidth reaching or exceeding the bandwidth threshold, the appliance dynamically directs client requests to a second virtual server.
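
As a rough illustration of the spillover behavior described above, the sketch below redirects client requests to a second virtual server once a measured bandwidth value reaches a configured threshold. The class names, byte-based threshold, and the way bandwidth is passed in are illustrative assumptions, not the appliance's actual interfaces.

```python
# Minimal sketch of bandwidth-based spillover between two virtual servers.

class VirtualServer:
    def __init__(self, name):
        self.name = name

    def handle(self, request):
        return f"{self.name} handled {request}"


class SpilloverManager:
    def __init__(self, primary, backup, bandwidth_threshold_bytes):
        self.primary = primary
        self.backup = backup
        self.threshold = bandwidth_threshold_bytes

    def route(self, request, measured_bandwidth_bytes):
        # Once the primary's measured bandwidth reaches or exceeds the threshold,
        # dynamically direct new client requests to the backup virtual server.
        if measured_bandwidth_bytes >= self.threshold:
            return self.backup.handle(request)
        return self.primary.handle(request)


mgr = SpilloverManager(VirtualServer("vserver-1"), VirtualServer("vserver-2"),
                       bandwidth_threshold_bytes=100_000_000)
print(mgr.route("GET /", measured_bandwidth_bytes=120_000_000))  # served by vserver-2
```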

349 citations


Proceedings ArticleDOI
22 Aug 2008
TL;DR: Monsoon, a new network architecture that scales and commoditizes data center networking, is described. Monsoon realizes a simple mesh-like architecture using programmable commodity layer-2 switches and servers, creating a huge, flexible switching domain that supports any server/any service and unfragmented server capacity at low cost.
Abstract: Applications hosted in today's data centers suffer from internal fragmentation of resources, rigidity, and bandwidth constraints imposed by the architecture of the network connecting the data center's servers. Conventional architectures statically map web services to Ethernet VLANs, each constrained in size to a few hundred servers owing to control plane overheads. The IP routers used to span traffic across VLANs and the load balancers used to spray requests within a VLAN across servers are realized via expensive customized hardware and proprietary software. Bisection bandwidth is low, severely constraining distributed computation. Further, the conventional architecture concentrates traffic in a few pieces of hardware that must be frequently upgraded and replaced to keep pace with demand - an approach that directly contradicts the prevailing philosophy in the rest of the data center, which is to scale out (adding more cheap components) rather than scale up (adding more power and complexity to a small number of expensive components). Commodity switching hardware is now becoming available with programmable control interfaces and with very high port speeds at very low port cost, making this the right time to redesign the data center networking infrastructure. In this paper, we describe Monsoon, a new network architecture that scales and commoditizes data center networking. Monsoon realizes a simple mesh-like architecture using programmable commodity layer-2 switches and servers. In order to scale to 100,000 servers or more, Monsoon makes modifications to the control plane (e.g., source routing) and to the data plane (e.g., hot-spot free multipath routing via Valiant Load Balancing). It disaggregates the function of load balancing into a group of regular servers, with the result that load balancing server hardware can be distributed amongst racks in the data center, leading to greater agility and less fragmentation. The architecture creates a huge, flexible switching domain, supporting any server/any service and unfragmented server capacity at low cost.

336 citations


Patent
Salil Suri, Harish Chilkoti
24 Nov 2008
TL;DR: In this paper, a load balancing algorithm is used to select a vNIC from the vNICs connected or connectable to the virtual switch, based on the rate at which each of the vNICs has processed previous network packets, as measured by the size of its network packet queue.
Abstract: A virtualized platform includes a virtual switch connected to the virtual network interface cards (vNICs) for a group of virtual machines running the same application program that is associated with multiple software ports. A module in the virtualized platform monitors the virtual switch's receipt of a network packet that includes control information relating to the application program and its software ports. The module applies a load balancing algorithm to select a vNIC from the vNICs connected or connectable to the virtual switch, based on the rate of processing of previous network packets by each of the vNICs (e.g., as measured by the size of a network packet queue). The module might also apply the load balancing algorithm to select a software port for the application. The module then causes the virtual switch to route the network packet to the selected vNIC and software port.
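
A minimal sketch of the selection rule described above, assuming the size of each vNIC's pending packet queue stands in for its recent packet-processing rate; the vNIC names and queue representation are hypothetical.

```python
# Pick the vNIC with the shortest packet queue, a proxy for the rate at which
# it has been processing previous packets.

def select_vnic(vnics):
    return min(vnics, key=lambda v: len(v["queue"]))

vnics = [
    {"name": "vnic0", "queue": [b"pkt"] * 12},
    {"name": "vnic1", "queue": [b"pkt"] * 3},
]
print(select_vnic(vnics)["name"])  # -> vnic1
```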

312 citations


Proceedings ArticleDOI
19 May 2008
TL;DR: This paper addresses the challenge of assigning VNs to the underlying physical network in a distributed and efficient manner and proposes a VN mapping protocol to communicate and exchange messages between agent-based substrate nodes to achieve the mapping.
Abstract: Network virtualization is a promising concept to diversify the future Internet architecture into separate virtual networks (VNs) that can simultaneously support multiple network experiments, services, and architectures over a shared substrate network. To take full advantage of this paradigm, this paper addresses the challenge of assigning VNs to the underlying physical network in a distributed and efficient manner. A distributed algorithm responsible for load balancing and mapping virtual nodes and links to substrate nodes and links has been designed, implemented, and evaluated. A VN mapping protocol is proposed to communicate and exchange messages between agent-based substrate nodes to achieve the mapping. Results of the implementation and a performance evaluation of the distributed VN mapping algorithm using a multi-agent approach are reported.

298 citations


Journal ArticleDOI
30 Sep 2008
TL;DR: In this article, the authors argue that the natural evolution of the Internet is that it should achieve resource pooling by harnessing the responsiveness of multipath-capable end systems.
Abstract: Since the ARPAnet, network designers have built localized mechanisms for statistical multiplexing, load balancing, and failure resilience, often without understanding the broader implications. These mechanisms are all types of resource pooling, which means making a collection of resources behave like a single pooled resource. We believe that the natural evolution of the Internet is that it should achieve resource pooling by harnessing the responsiveness of multipath-capable end systems. We argue that this approach will solve the problems and limitations of the current piecemeal approaches.

262 citations


Patent
29 May 2008
TL;DR: In this article, the authors present methods and systems for performing load balancing via a plurality of virtual servers upon a failover, using metrics from a backup virtual server when the first virtual server is not available.
Abstract: The present invention provides methods and systems for performing load balancing via a plurality of virtual servers upon a failover, using metrics from a backup virtual server. The methods described herein provide for an appliance detecting that a first virtual server, of a plurality of virtual servers having one or more backup virtual servers load balanced by the appliance, is not available; identifying that at least a first backup virtual server of the one or more backup virtual servers of the first virtual server is available; maintaining a status of the first virtual server as available in response to the identification; obtaining one or more metrics from the first backup virtual server; and determining the load across the plurality of virtual servers using the metrics obtained from the first backup virtual server associated with the first virtual server.

258 citations


Journal ArticleDOI
Robert Knauerhase, Paul Brett, B. Hohlt, Tong Li, Scott D. Hahn
TL;DR: It is shown that the OS can use data obtained from dynamic runtime observation of task behavior to ameliorate performance variability and more effectively exploit multicore processor resources.
Abstract: Today's operating systems don't adequately handle the complexities of multicore processors. Architectural features confound existing OS techniques for task scheduling, load balancing, and power management. This article shows that the OS can use data obtained from dynamic runtime observation of task behavior to ameliorate performance variability and more effectively exploit multicore processor resources. The authors' research prototypes demonstrate the utility of observation-based policy.

227 citations


Patent
Raphael Yahalom, Assaf Levy
19 Dec 2008
TL;DR: In this paper, the authors describe methods and systems for periodically analyzing and correcting storage load imbalances in a storage network environment including virtual machines, which account for various resource types, logical access paths, and relationships among different storage environment components.
Abstract: Methods and systems for periodically analyzing and correcting storage load imbalances in a storage network environment including virtual machines are described. These methods and systems account for various resource types, logical access paths, and relationships among different storage environment components. Load balancing may be managed in terms of input/output (I/O) traffic and storage utilization. The aggregated information is stored, and may be used to identify and correct load imbalances in a virtual server environment in order to prevent primary congestion and bottlenecks.

224 citations


Patent
18 Mar 2008
TL;DR: A content distribution mechanism that distributes content of a content provider at various sites across a network and selects the site that is nearest a content requestor using an anycast address that resides at each of the sites is described in this paper.
Abstract: A content distribution mechanism that distributes content of a content provider at various sites across a network and selects the site that is nearest a content requestor using an anycast address that resides at each of the sites. The sites are configured as nodes (or clusters) and each node includes a content server and a DNS server. The DNS servers are so associated with the content servers at their respective nodes as to resolve the name of the content provider to the IP address of the content servers at the nodes. The DNS servers each are assigned the anycast address in addition to a unique address, and the anycast address is advertised to the network (in particular, the network routing infrastructure) using Border Gateway Protocol (BGP). Node selection occurs when the network routing infrastructure selects a shortest path to the anycast address during DNS name resolution.

216 citations


Patent
21 Oct 2008
TL;DR: In this article, a power monitoring agent monitors the level of load on each server, and a power management controller dynamically controls the level of power for a server responsive to the monitored level of load.
Abstract: A method for adaptively load balancing user sessions to reduce energy consumption includes identifying a session type for each of a plurality of user sessions. A server group is defined, providing access to a subset of the user sessions having a common session type. A power management schedule is also defined for the server group. The method includes consolidating, onto at least one server in the server group, the subset of user sessions. In still another aspect, a method for reducing energy consumption by dynamically managing power modes for a plurality of servers, includes monitoring, via a power monitoring agent, a level of load on one of the servers. A power management console generates a power management schedule for a server, responsive to the monitored level of load. Responsive to the power management schedule, a power management controller dynamically controls a level of power for the server.
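
One plausible reading of the consolidation step is a greedy packing of same-type user sessions onto as few servers as possible, leaving the remaining servers idle so the power management schedule can drop them to a lower power mode. The sketch below illustrates that reading; the capacity model, server names, and first-fit policy are assumptions, not the patent's mechanism.

```python
# Greedy first-fit consolidation of user sessions of a common type; any server
# left empty becomes a candidate for a lower power mode. Assumes total capacity
# is sufficient for all sessions.

def consolidate(sessions, servers, capacity):
    placement = {s: [] for s in servers}
    for session in sessions:
        target = next(s for s in servers if len(placement[s]) < capacity)
        placement[target].append(session)
    idle_servers = [s for s in servers if not placement[s]]
    return placement, idle_servers

placement, idle = consolidate([f"session-{i}" for i in range(5)],
                              ["srv1", "srv2", "srv3"], capacity=4)
print(idle)  # -> ['srv3']
```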

Journal ArticleDOI
TL;DR: This work proposes an interference-minimized multipath routing (I2MR) protocol that increases throughput by discovering zone-disjoint paths for load balancing while requiring minimal localization support, and proposes a congestion control scheme that further increases throughput by loading the paths for load balancing at the highest supportable rate.
Abstract: High-rate streaming in WSNs is required for future applications to provide high-quality information about battlefield hot spots. Although recent advances have enabled large-scale WSNs to be deployed, supported by high-bandwidth backbone networks, for high-rate streaming, the WSN remains the bottleneck due to the low-rate radios used and the effects of wireless interference. First, we propose a technique to evaluate the quality of a path set for multipath load balancing, taking into consideration the effects of wireless interference and the fact that nodes may interfere beyond their communication ranges. Second, we propose an interference-minimized multipath routing (I2MR) protocol that increases throughput by discovering zone-disjoint paths for load balancing, requiring minimal localization support. Third, we propose a congestion control scheme that further increases throughput by loading the paths for load balancing at the highest supportable rate. Finally, we validate the path-set evaluation technique and also evaluate the I2MR protocol and congestion control scheme by comparing them with the AODV protocol and the node-disjoint multipath routing (NDMR) protocol. Simulation results show that I2MR with congestion control achieves on average 230% and 150% gains in throughput over AODV and NDMR respectively, and consumes comparable or at most 24% more energy than AODV but up to 60% less energy than NDMR.

Journal ArticleDOI
TL;DR: This paper presents a system called Jitter, which reduces the frequency on nodes that are assigned less computation and therefore have slack time; the goal of Jitter is to ensure that these nodes arrive "just in time," so that overall execution time does not increase.

01 Jan 2008
TL;DR: This work explores extending the capacity provisioning model used in current clouds by using resource leases as a fundamental provisioning abstraction, and focuses in this work on advance reservation leases, which can be used to satisfy capacity peaks known in advance.
Abstract: Clouds can be used to provide on-demand capacity as a utility. Although the realization of this idea can differ among various cloud providers (from Google App Engine to Amazon EC2), the most flexible approach is the provisioning of virtualized resources as a service. These virtualization-based clouds, like Amazon EC2 or the Science Clouds (which use the Globus Virtual Workspace Service [4]), provide a way to build a large computing infrastructure by accessing remote computational, storage and network resources. Since a cloud typically comprises a large number of virtual and physical servers, on the order of hundreds or thousands, efficiently managing this virtual infrastructure becomes a major concern. Several solutions, such as VMware VirtualCenter, Platform Orchestrator, or Enomalism, have emerged to manage virtual infrastructures, providing a centralized control platform for the automatic deployment and monitoring of virtual machines (VMs) in resource pools. However, these solutions provide simple VM placement and load balancing policies. In particular, existing clouds use an immediate provisioning model, where virtualized resources are allocated at the time they are requested, without the possibility of requesting resources at a specific future time and, at most, being placed in a simple first-come-first-serve queue when no resources are available. However, service provisioning clouds, like the one being built by the RESERVOIR project, have requirements that cannot be supported within this model, such as resource requests that are subject to non-trivial policies, capacity reservations at specific times to meet peak capacity requirements, variable resource usage throughout a VM’s lifetime, and dynamic renegotiation of resources allocated to VMs. Additionally, smaller clouds with limited resources, where not all requests may be satisfiable immediately for lack of resources, could benefit from more complex VM placement strategies supporting queues, priorities, and advance reservations. In this work we explore extending the capacity provisioning model used in current clouds by using resource leases [3, 10, 9] as a fundamental provisioning abstraction. To do this, we have integrated the OpenNebula virtual infrastructure engine with the Haizea lease manager to produce a resource management system that can be used to support a variety of leases in clouds. We focus in this work on advance reservation leases, which can be used to satisfy capacity peaks known in advance, or for a variety of well-documented use cases where advance reservations are used (such as coscheduling of multiple resources [12, 5, 1, 2], urgent

Journal ArticleDOI
TL;DR: In this article, a scalable cross-layer framework is proposed to coordinate packet-level scheduling, call-level cell-site selection and handoff, and system-level coverage based on load, throughput, and channel measurements.
Abstract: We investigate a wireless system of multiple cells, each having a downlink shared channel in support of high-speed packet data services. In practice, such a system consists of hierarchically organized entities including a central server, Base Stations (BSs), and Mobile Stations (MSs). Our goal is to improve global resource utilization and reduce regional congestion given asymmetric arrivals and departures of mobile users, a goal requiring load balancing among multiple cells. For this purpose, we propose a scalable cross-layer framework to coordinate packet-level scheduling, call-level cell-site selection and handoff, and system-level cell coverage based on load, throughput, and channel measurements. In this framework, an opportunistic scheduling algorithm--the weighted Alpha-Rule--exploits the gain of multiuser diversity in each cell independently, trading aggregate (mean) down-link throughput for fairness and minimum rate guarantees among MSs. Each MS adapts to its channel dynamics and the load fluctuations in neighboring cells, in accordance with MSs' mobility or their arrival and departure, by initiating load-aware handoff and cell-site selection. The central server adjusts schedulers of all cells to coordinate their coverage by prompting cell breathing or distributed MS handoffs. Across the whole system, BSs and MSs constantly monitor their load, throughput, or channel quality in order to facilitate the overall system coordination. Our specific contributions in such a framework are highlighted by the minimum-rate guaranteed weighted Alpha-Rule scheduling, the load-aware MS handoff/cell-site selection, and the Media Access Control (MAC)-layer cell breathing. Our evaluations show that the proposed framework can improve global resource utilization and load balancing, resulting in a smaller blocking rate of MS arrivals without extra resources while the aggregate throughput remains roughly the same or improved at the hot-spots. Our simulation tests also show that the coordinated system is robust to dynamic load fluctuations and is scalable to both the system dimension and the size of MS population.
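
The abstract does not give the weighted Alpha-Rule explicitly; one common form of alpha-rule opportunistic scheduling serves, in each slot, the user maximizing w_i * r_i(t) / Rbar_i(t)^alpha, where r_i(t) is the instantaneous feasible rate and Rbar_i(t) is the exponentially smoothed average throughput. The sketch below illustrates that family under this assumption and omits the paper's minimum-rate guarantees and cross-layer coordination; all numbers are illustrative.

```python
# One common form of alpha-rule opportunistic scheduling (assumed, simplified):
# serve the user maximizing w_i * r_i(t) / Rbar_i(t)**alpha.

def alpha_rule_pick(users, alpha=1.0):
    return max(users, key=lambda u: u["w"] * u["rate"] / max(u["avg"], 1e-9) ** alpha)

def update_average(user, served, beta=0.05):
    # Exponentially weighted moving average of delivered throughput.
    user["avg"] = (1 - beta) * user["avg"] + beta * (user["rate"] if served else 0.0)

users = [
    {"id": "ms1", "w": 1.0, "rate": 2.4e6, "avg": 1.0e6},
    {"id": "ms2", "w": 1.0, "rate": 1.2e6, "avg": 0.2e6},
]
chosen = alpha_rule_pick(users, alpha=1.0)
for u in users:
    update_average(u, served=(u is chosen))
print(chosen["id"])  # -> ms2 (its rate is high relative to its average throughput)
```

Larger alpha trades aggregate throughput for fairness, which matches the abstract's description of trading mean downlink throughput for fairness among MSs.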

Proceedings ArticleDOI
08 Dec 2008
TL;DR: This study evaluates the relative accuracy of each metric by conducting experiments with multiple transmission rates and varying levels of interference on a large set of links, and suggests that a careful consideration of these limitations is essential.
Abstract: The accurate determination of link quality is critical for ensuring that functionalities such as intelligent routing, load balancing, power control, and frequency selection operate efficiently. There are four primary metrics for capturing the quality of a wireless link: RSSI (Received Signal Strength Indication), SINR (Signal-to-Interference-plus-Noise Ratio), PDR (Packet-Delivery Ratio), and BER (Bit-Error Rate). In this paper, we perform a measurement-based study in order to answer the question: which is the appropriate metric to use, and under what conditions? We evaluate the relative accuracy of each metric by conducting experiments with multiple transmission rates and varying levels of interference on a large set of links. We observe that each metric has advantages but also exhibits one or more limitations. Our study suggests that careful consideration of these limitations is essential, and provides guidelines on the applicability of each metric.


Patent
19 May 2008
TL;DR: In this paper, a proof of concept of a new type of WLAN, complete with simulation and results from the simulation has been described, where each AP node is implemented as a self-contained embedded OS unit, with all algorithms resident in its Operating system.
Abstract: A design and proof of concept of a new type of WLAN, complete with simulation and results from the simulation has been described. Each AP Node is implemented as a self-contained embedded OS unit, with all algorithms resident in its Operating system. The normal day-to-day functioning of the AP node is based entirely on resident control algorithms. Upgrades are possible through a simple secure communications interface supported by the OS kernel for each AP node. Benefits provided by a wireless network, as proposed in this invention, are that: it installs out of the box; the network is self-configuring; the network is redundant in that mesh network formalism is supported, ensuring multiple paths; load balancing is supported; there is no single point of failure; allows for decentralized execution; there is a central control; it is network application aware; there is application awareness; there is automatic channel allocation to manage and curtail RF interference, maximize non interference bandwidth and enable seamless roaming between adjoining wireless sub networks (BSS) and it supports the wireless equivalent for switching—for seamless roaming requirements.

Patent
11 Aug 2008
TL;DR: In this article, the authors disaggregate load balancing functionality to increase resilience and flexibility for both the load balancing and switching mechanisms of the data center.
Abstract: Systems and methods that distribute load balancing functionalities in a data center. A network of demultiplexers and load balancer servers enable a calculated scaling and growth operation, wherein capacity of load balancing operation can be adjusted by changing the number of load balancer servers. Accordingly, load balancing functionality/design can be disaggregated to increase resilience and flexibility for both the load balancing and switching mechanisms of the data center.

Proceedings ArticleDOI
13 Apr 2008
TL;DR: AdapCode, a reliable data dissemination protocol that uses adaptive network coding to reduce broadcast traffic in the process of code updates, is proposed, and it is shown that network coding is doable on sensor networks.
Abstract: Code updates, such as those for debugging purposes, are frequent and expensive in the early development stages of wireless sensor network applications. We propose AdapCode, a reliable data dissemination protocol that uses adaptive network coding to reduce broadcast traffic in the process of code updates. Packets on every node are coded by linear combination and decoded by Gaussian elimination. The core idea in AdapCode is to adaptively change the coding scheme according to the link quality. Our evaluation shows that AdapCode uses up to 40% fewer packets than Deluge in large networks. In addition, AdapCode performs much better in terms of load balancing, which prolongs the system lifetime, and has a slightly shorter propagation delay. Finally, we show that network coding is doable on sensor networks in that (i) it imposes only a 3-byte header overhead, (ii) it is easy to find linearly independent packets, and (iii) Gaussian elimination needs only 1 KB of memory.
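
To make the coding step concrete, here is a toy version of "code by linear combination, decode by Gaussian elimination" over GF(2), i.e., plain XOR. The field choice, group size, and packet contents are assumptions made for illustration; AdapCode itself adapts how many packets it combines according to link quality.

```python
# Toy GF(2) network coding: encode packets as XORs selected by coefficient
# vectors, decode a full-rank set of coded packets by Gaussian elimination.

def encode(packets, coeffs):
    """XOR together the original packets selected by a GF(2) coefficient vector."""
    out = bytearray(len(packets[0]))
    for c, p in zip(coeffs, packets):
        if c:
            out = bytearray(a ^ b for a, b in zip(out, p))
    return bytes(out)

def decode(coded, n):
    """Recover n original packets from n linearly independent coded packets."""
    rows = [(list(c), bytearray(p)) for c, p in coded]
    for col in range(n):
        pivot = next(i for i in range(col, len(rows)) if rows[i][0][col])
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for i in range(len(rows)):
            if i != col and rows[i][0][col]:
                rows[i] = ([a ^ b for a, b in zip(rows[i][0], rows[col][0])],
                           bytearray(a ^ b for a, b in zip(rows[i][1], rows[col][1])))
    return [bytes(r[1]) for r in rows[:n]]

originals = [b"page-0..", b"page-1..", b"page-2.."]
coded = [(c, encode(originals, c)) for c in ([1, 0, 0], [1, 1, 0], [1, 1, 1])]
print(decode(coded, 3) == originals)  # -> True
```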

Patent
03 Mar 2008
TL;DR: In this article, the authors present a local load balancing approach for virtual IP addresses. But the task of load balancing is distributed as opposed to being centralized at a server farm or cluster.
Abstract: An exemplary method for load balancing includes accessing a range of values for IP addresses associated with a virtual IP address associated with a domain name; selecting, using a local statistical algorithm, a value in the range; and, based at least in part on the selected value, connecting to a remote resource at one of the IP addresses. In such a method, a client can perform local load balancing when connecting to one of many fungible resources “behind” a virtual IP address. With many such clients, the task of load balancing is distributed as opposed to being centralized at a server farm or cluster. Other methods, devices and systems are also disclosed.
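
A minimal sketch of the client-side step, assuming the "local statistical algorithm" is a uniform random pick over the configured address range; the range below is illustrative.

```python
# Client-side selection of one IP address from the range behind a virtual IP,
# so load spreading happens at the clients rather than at a central balancer.

import ipaddress
import random

def pick_address(first_ip, last_ip):
    lo = int(ipaddress.ip_address(first_ip))
    hi = int(ipaddress.ip_address(last_ip))
    return str(ipaddress.ip_address(random.randint(lo, hi)))

print(pick_address("192.0.2.10", "192.0.2.19"))  # one of ten fungible backends
```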

Journal ArticleDOI
TL;DR: The problem of grouping the sensor nodes into clusters to enhance the overall scalability of the network is investigated; it is proved that the general case of LBCP is NP-hard, and an efficient 32-approximation algorithm is proposed.

Book
01 Jan 2008
TL;DR: Parallel Algorithms presents a rigorous yet accessible treatment of theoretical models of parallel computation, parallel algorithm design for homogeneous and heterogeneous platforms, complexity and performance analysis, and essential notions of scheduling.
Abstract: Focusing on algorithms for distributed-memory parallel architectures, Parallel Algorithms presents a rigorous yet accessible treatment of theoretical models of parallel computation, parallel algorithm design for homogeneous and heterogeneous platforms, complexity and performance analysis, and essential notions of scheduling. The book extracts fundamental ideas and algorithmic principles from the mass of parallel algorithm expertise and practical implementations developed over the last few decades. In the first section of the text, the authors cover two classical theoretical models of parallel computation (PRAMs and sorting networks), describe network models for topology and performance, and define several classical communication primitives. The next part deals with parallel algorithms on ring and grid logical topologies as well as the issue of load balancing on heterogeneous computing platforms. The final section presents basic results and approaches for common scheduling problems that arise when developing parallel algorithms. It also discusses advanced scheduling topics, such as divisible load scheduling and steady-state scheduling. With numerous examples and exercises in each chapter, this text encompasses both the theoretical foundations of parallel algorithms and practical parallel algorithm design.

Journal ArticleDOI
TL;DR: The algorithm developed combines the inherent efficiency of the centralized approach and the fault-tolerant nature of the distributed, decentralized approach to solve the grid load-balancing problem.
Abstract: Load balancing is a very important and complex problem in computational grids. A computational grid differs from traditional high-performance computing systems in the heterogeneity of the computing nodes, as well as the communication links that connect the different nodes together. There is a need to develop algorithms that can capture this complexity yet can be easily implemented and used to solve a wide range of load-balancing scenarios. In this paper, we propose a game-theoretic solution to the grid load-balancing problem. The algorithm developed combines the inherent efficiency of the centralized approach and the fault-tolerant nature of the distributed, decentralized approach. We model the grid load-balancing problem as a noncooperative game, whereby the objective is to reach the Nash equilibrium. Experiments were conducted to show the applicability of the proposed approaches. One advantage of our scheme is the relatively low overhead and robust performance against inaccuracies in performance prediction information.

Patent
30 Jun 2008
TL;DR: In this paper, a logical load balancing method for distributing traffic according to a set of weights among a group of network interfaces is proposed, where a logical identity of a packet may be generated, e.g., by generating a hash index of the packet's header.
Abstract: A logical load-balancing method for distributing traffic according to a set of weights among a group of network interfaces. A logical identity of a packet may be generated, e.g., by generating a hash index of the packet's header. Each of the weights may be associated with a network interface. A range of logical identities, or its boundary, may be determined for an interface according to the weight associated with the interface member. A packet may be directed to an interface if the packet's logical identity falls into the range of the interface.
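
The mechanism described above maps naturally to a short sketch: hash the packet header into a logical identity, carve the identity space into ranges proportional to the per-interface weights, and direct the packet to the interface owning its range. The hash function, identity-space size, and header format below are assumptions.

```python
# Weighted hash-range selection of a network interface for a packet.

import hashlib

SPACE = 2 ** 16  # size of the logical-identity space

def build_ranges(weights):
    """Give each interface an upper bound in the identity space proportional to its weight."""
    total, acc, bounds = sum(weights.values()), 0, []
    for iface, w in weights.items():
        acc += w
        bounds.append((round(SPACE * acc / total), iface))
    return bounds

def pick_interface(header_bytes, bounds):
    ident = int.from_bytes(hashlib.sha1(header_bytes).digest()[:2], "big")
    return next(iface for bound, iface in bounds if ident < bound)

bounds = build_ranges({"eth0": 3, "eth1": 1})  # 75% / 25% traffic split
print(pick_interface(b"10.0.0.1->10.0.0.2:443", bounds))
```

Shifting a range boundary changes the share of logical identities, and therefore of traffic, assigned to each interface without any per-flow state.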

Patent
11 Jan 2008
TL;DR: In this paper, the authors present a technique for maintaining mirrored storage cluster data consistency on systems with two-node, highly available storage solutions that employ an initiator-side agent operable to prevent split-brain scenarios.
Abstract: Techniques for maintaining mirrored storage cluster data consistency on systems with two-node, highly available storage solutions can employ an initiator-side agent operable to prevent split-brain scenarios. Split brain syndrome can be avoided, information identifying changes of synchronization states can be maintained, and both graceful and ungraceful shutdowns (or failures) of either one of the nodes or of the intelligent initiator itself can be mitigated. Technology presented herein supports load balancing and hot failover/failback in systems that may feature redundant network connectivity. Moreover, a method is supported for communicating storage cluster status between the storage nodes and the initiator.

Patent
05 Nov 2008
TL;DR: In this article, the authors propose load balancing and state management in a virtualized computing environment comprising a plurality of PCI-Express switches (the PCIe switching cloud) coupled to a plurality of network interface devices (NICs).
Abstract: Embodiments provide load balancing in a virtual computing environment comprising a plurality of PCI-Express switches (the PCIe switching cloud) coupled to a plurality of network interface devices (NICs). An NIC cluster is added between the PCIe switching cloud and the NICs. The NIC cluster is configured to hide NICs from system images and allow the system images to access functions across multiple NICs. The NIC cluster of an embodiment dynamically load balances network resources by performing a hashing function on a header field of received packets. The NIC cluster of an embodiment performs load balancing and state management in association with driver software, which is embedded in the system image. The driver software adds a tag for flow identification to downstream data packets. The NIC cluster distributes data packets based on information in the tag.

Proceedings ArticleDOI
20 Jun 2008
TL;DR: Four different dynamic load balancing methods are compared to see which one is most suited to the highly parallel world of graphics processors, and it is shown that lock-free methods achieve better performance than blocking ones and that they can be made to scale with increased numbers of processing units.
Abstract: To get maximum performance on many-core graphics processors it is important to have an even balance of the workload so that all processing units contribute equally to the task at hand. This can be hard to achieve when the cost of a task is not known beforehand and when new sub-tasks are created dynamically during execution. With the recent advent of scatter operations and atomic hardware primitives it is now possible to bring some of the more elaborate dynamic load balancing schemes from the conventional SMP systems domain to the graphics processor domain. We have compared four different dynamic load balancing methods to see which one is most suited to the highly parallel world of graphics processors. Three of these methods were lock-free and one was lock-based. We evaluated them on the task of creating an octree partitioning of a set of particles. The experiments showed that synchronization can be very expensive and that new methods that take more advantage of the graphics processors' features and capabilities might be required. They also showed that lock-free methods achieve better performance than blocking ones and that they can be made to scale with increased numbers of processing units.

Patent
Tienwei Chao, Bill Shao
07 Nov 2008
TL;DR: In this paper, a packet forwarding table is searched to find an entry with a hash value that matches the computed hash value and to identify the server to which the matching hash value maps.
Abstract: A switch device includes a packet forwarding table for providing load balancing across servers in a server group. Each table entry maps a hash value to a server in the server group. A hash value can be computed from the destination MAC address, destination IP address, and destination service port in the header of a received packet. The packet forwarding table is searched to find an entry with a hash value that matches the computed hash value and to identify the server to which the matching hash value maps. The switch device forwards the packet to the identified server. Implementing load-balancing decisions in hardware enables packet switching at the line rate of the switch ports. In addition, the hardware-based load balancing performed by the switch device eliminates session tables and the memory to store them, enabling the switch device to handle an unlimited number of client connections.
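
A software model of the lookup described above might look like the following; the hash function, table size, and server names are assumptions, and the actual device performs this lookup in switching hardware at line rate.

```python
# Hash the destination MAC, IP, and service port of a packet, then look the
# hash value up in a packet forwarding table that maps hash values to servers.

import hashlib

TABLE_SIZE = 256

def hash_value(dst_mac, dst_ip, dst_port):
    key = f"{dst_mac}|{dst_ip}|{dst_port}".encode()
    return hashlib.md5(key).digest()[0] % TABLE_SIZE

servers = ["srv-a", "srv-b", "srv-c", "srv-d"]
forwarding_table = {h: servers[h % len(servers)] for h in range(TABLE_SIZE)}

pkt = ("00:1a:2b:3c:4d:5e", "203.0.113.7", 443)
print(forwarding_table[hash_value(*pkt)])  # server the switch forwards this packet to
```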

Proceedings ArticleDOI
30 Nov 2008
TL;DR: A two-layer control architecture based on well-established control theory is proposed that can effectively reduce server power consumption while achieving required application-level performance for virtualized enterprise servers.
Abstract: Both power and performance are important concerns for enterprise data centers. While various management strategies have been developed to effectively reduce server power consumption by transitioning hardware components to lower-power states, they cannot be directly applied to today's data centers that rely on virtualization technologies. Virtual machines running on the same physical server are correlated, because the state transition of any hardware component will affect the application performance of all the virtual machines. As a result, reducing power solely based on the performance level of one virtual machine may cause another to violate its performance specification. This paper proposes a two-layer control architecture based on well-established control theory. The primary control loop adopts a multi-input-multi-output control approach to maintain load balancing among all virtual machines so that they can have approximately the same performance level relative to their allowed peak values. The secondary performance control loop then manipulates CPU frequency for power efficiency based on the uniform performance level achieved by the primary loop. Empirical results demonstrate that our control solution can effectively reduce server power consumption while achieving required application-level performance for virtualized enterprise servers.
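
As a very rough, hedged sketch of the two-layer structure: the primary loop nudges per-VM CPU shares toward a common performance level relative to each VM's allowed peak, and the secondary loop then picks a CPU frequency that covers that common level. The gains, performance numbers, and frequency table below are illustrative and are not the controllers designed in the paper.

```python
# Simplified two-layer control sketch for virtualized servers.

def primary_loop(vms, gain=0.5):
    """Equalize relative performance (perf / peak) across VMs by adjusting CPU shares."""
    rel = [vm["perf"] / vm["peak"] for vm in vms]
    target = sum(rel) / len(rel)
    for vm, r in zip(vms, rel):
        vm["share"] += gain * (target - r) * vm["share"]
    return target

def secondary_loop(common_level, freqs=(1.0, 1.6, 2.2, 2.8)):
    """Pick the lowest CPU frequency (GHz) whose relative capacity covers the common level."""
    return next((f for f in freqs if common_level <= f / max(freqs)), freqs[-1])

vms = [
    {"name": "vm1", "perf": 800.0, "peak": 1000.0, "share": 0.5},
    {"name": "vm2", "perf": 400.0, "peak": 1000.0, "share": 0.5},
]
level = primary_loop(vms)
print(round(level, 2), secondary_loop(level))  # -> 0.6 2.2
```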