Energy-aware resource allocation heuristics for efficient management of data centers for Cloud computing
Summary
1. Introduction
- Cloud computing can be classified as a new paradigm for the dynamic provisioning of computing services supported by state-of-the-art data centers that usually employ Virtual Machine (VM) technologies for consolidation and environment isolation purposes [1].
- Cloud computing delivers an infrastructure, platform, and software as services that are made available to consumers in a pay-as-you-go model.
- To address this problem and drive Green Cloud computing, data center resources need to be managed in an energy-efficient manner.
- Section 7 concludes the paper with a summary and future research directions.
3.1. Architectural framework
- Clouds aim to drive the design of the next generation data centers by architecting them as networks of virtual services (hardware, database, user-interface, application logic) so that users can access and deploy applications from anywhere in the world on demand at competitive costs depending on their QoS requirements [30].
- A consumer can be a company deploying a web-application, which presents varying workload according to the number of "users" accessing it.
- The Green Service Allocator acts as the interface between the Cloud infrastructure and consumers.
- The VM Manager keeps track of the availability of VMs and their resource usage.
- VMs: multiple VMs can be dynamically started and stopped on a single physical machine according to incoming requests, providing the flexibility of configuring various partitions of resources on the same physical machine to suit different service requests.
3.2. Power model
- Power consumption by computing nodes in data centers is mostly determined by the CPU, memory, disk storage and network interfaces.
- Recent studies [16,12,11,15] have shown that the application of DVFS on the CPU results in almost linear power-to-frequency relationship for a server.
- Moreover, these studies have shown that on average an idle server consumes approximately 70% of the power consumed by the server running at the full CPU speed.
- Therefore, in this work the authors use the power model defined in (1): P(u) = k·Pmax + (1 − k)·Pmax·u, where Pmax is the power consumed at full utilization, u is the current CPU utilization, and k ≈ 0.7 is the fraction of peak power consumed by an idle server.
- The utilization of the CPU may change over time due to the workload variability.
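The linear power model described above can be sketched in a few lines (the 250 W peak and the 70% idle fraction are illustrative values, not the paper's exact server figures):

```python
def power(utilization, p_max=250.0, k=0.7):
    """Linear power model: an idle server draws a fraction k of peak power,
    and consumption grows linearly with CPU utilization u in [0, 1]."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    return k * p_max + (1.0 - k) * p_max * utilization

print(power(0.0))  # idle: ~70% of peak, ≈ 175 W
print(power(1.0))  # fully loaded: ≈ 250 W
print(power(0.5))  # halfway between: ≈ 212.5 W
```

Because the idle floor dominates, consolidating load onto fewer hosts and sleeping the rest saves far more energy than merely lowering per-host utilization.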
4. Energy-aware allocation of data center resources
- Recent developments in virtualization have resulted in its proliferation across data centers.
- When VMs do not use all the provided resources, they can be logically resized and consolidated to the minimum number of physical nodes, while idle nodes can be switched to the sleep mode to eliminate the idle power consumption and reduce the total energy consumption of the data center.
- To explore both performance and energy efficiency, three crucial issues must be addressed.
- First, excessive power cycling of a server could reduce its reliability.
- Due to the variability of the workload and aggressive consolidation, some VMs may not obtain required resources under peak load, and fail to meet the desired QoS.
4.1. VM placement
- The problem of VM allocation can be divided in two: the first part is the admission of new requests for VM provisioning and placing the VMs on hosts, whereas the second part is the optimization of the current VM allocation.
- The first part can be seen as a bin packing problem with variable bin sizes and prices.
- To solve it the authors apply a modification of the Best Fit Decreasing algorithm.
- In this modification, the Modified Best Fit Decreasing (MBFD) algorithm, the authors sort all VMs in decreasing order of their current CPU utilizations and allocate each VM to the host that provides the least increase of power consumption due to this allocation.
- The pseudo-code for the algorithm is presented in Algorithm 1.
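A minimal sketch of the MBFD idea follows; the dictionary data structures and the linear power helper are hypothetical simplifications, as the paper's Algorithm 1 operates on richer host and VM objects:

```python
def mbfd(vms, hosts, power):
    """Modified Best Fit Decreasing (sketch): sort VMs by decreasing CPU
    demand, then place each on the host whose power draw rises the least.
    vms: {vm_id: cpu_demand}; hosts: {host_id: (used, capacity)};
    power(used, capacity) estimates host power at the given load."""
    allocation = {}
    for vm, demand in sorted(vms.items(), key=lambda kv: -kv[1]):
        best_host, best_increase = None, float("inf")
        for host, (used, cap) in hosts.items():
            if used + demand > cap:
                continue  # host lacks spare capacity for this VM
            increase = power(used + demand, cap) - power(used, cap)
            if increase < best_increase:
                best_host, best_increase = host, increase
        if best_host is not None:  # commit the greedy choice
            used, cap = hosts[best_host]
            hosts[best_host] = (used + demand, cap)
            allocation[vm] = best_host
    return allocation

# Linear power model: idle draw is 70% of a 250 W peak.
linear = lambda used, cap: 0.7 * 250 + 0.3 * 250 * (used / cap)
hosts = {"h1": (0.0, 100.0), "h2": (0.0, 100.0)}
print(mbfd({"vm1": 60.0, "vm2": 30.0}, hosts, linear))  # packs both onto h1
```

With a purely linear model the power increase is the same on every feasible host, so the greedy tie-break consolidates; non-linear power curves would make the "least increase" choice more discriminating.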
4.2. VM selection
- The optimization of the current VM allocation is carried out in two steps: first, the authors select VMs that need to be migrated; second, the chosen VMs are placed on the hosts using the MBFD algorithm.
- To determine when and which VMs should be migrated, the authors introduce three double-threshold VM selection policies.
- If the CPU utilization of a host falls below the lower threshold, all VMs have to be migrated from this host and the host has to be switched to the sleep mode in order to eliminate the idle power consumption.
- The aim is to preserve free resources in order to prevent SLA violations due to the consolidation in cases when the utilization by VMs increases.
- The difference between the old and new placements forms a set of VMs that have to be reallocated.
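The double-threshold trigger described in the bullets above amounts to a small per-host classifier; the 0.3/0.7 thresholds here are illustrative, since the paper evaluates many threshold pairs:

```python
def migration_action(host_util, lower=0.3, upper=0.7):
    """Double-threshold trigger (sketch): classify a host by CPU utilization.
    Below `lower`: evacuate all VMs and sleep the host; above `upper`:
    a selection policy (e.g. MM) picks VMs to offload; otherwise do nothing."""
    if host_util < lower:
        return "evacuate"  # migrate every VM, switch host to sleep mode
    if host_util > upper:
        return "offload"   # migrate some VMs to restore SLA headroom
    return "keep"

print(migration_action(0.15))  # evacuate
print(migration_action(0.85))  # offload
print(migration_action(0.50))  # keep
```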
4.2.1. The minimization of migrations policy
- The Minimization of Migrations (MM) policy selects the minimum number of VMs needed to migrate from a host to lower the CPU utilization below the upper utilization threshold if the upper threshold is violated.
- The pseudo-code for the MM algorithm for the over-utilization case is presented in Algorithm 2.
- Then, it repeatedly looks through the list of VMs and finds a VM that is the best to migrate from the host.
- The best VM is the one that satisfies two conditions: its utilization must be higher than the difference between the host's overall utilization and the upper threshold, and, among such VMs, migrating it must leave the host's utilization as close to the upper threshold as possible.
- The complexity of the algorithm is proportional to the product of the number of overutilized hosts and the number of VMs allocated to these hosts.
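A sketch of the MM selection loop, assuming VM loads are expressed as fractions of the host's utilization (the paper's Algorithm 2 additionally handles sorting and the under-utilization case):

```python
def minimization_of_migrations(vm_utils, host_util, upper=0.7):
    """MM policy (sketch): repeatedly pick the VM whose migration brings the
    host just below the upper threshold, minimizing the number of migrations.
    vm_utils: {vm_id: that VM's share of host CPU utilization}."""
    vms = dict(vm_utils)
    selected = []
    while host_util > upper and vms:
        excess = host_util - upper
        # Prefer the smallest VM that alone removes the excess...
        fitting = [(u, vm) for vm, u in vms.items() if u > excess]
        if fitting:
            util, vm = min(fitting)
        else:
            # ...otherwise migrate the largest VM and continue.
            util, vm = max((u, vm) for vm, u in vms.items())
        selected.append(vm)
        host_util -= util
        del vms[vm]
    return selected

# Host at 95%, upper threshold 70%: the 0.30 VM alone restores headroom.
print(minimization_of_migrations(
    {"a": 0.10, "b": 0.20, "c": 0.30, "d": 0.35}, 0.95))  # ['c']
```

Choosing the smallest VM that still covers the excess is what keeps both the migration count and the overshoot below the threshold low.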
4.2.3. The random choice policy
- The Random Choice (RC) policy relies on a random selection of a number of VMs needed to decrease the CPU utilization by a host below the upper utilization threshold.
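The RC policy is simpler; the following seeded sketch (the seed is only for reproducibility of the example) removes random VMs until the host is back under the threshold:

```python
import random

def random_choice(vm_utils, host_util, upper=0.7, rng=None):
    """RC policy (sketch): migrate randomly chosen VMs until the host's
    utilization falls below the upper threshold."""
    rng = rng or random.Random(42)  # fixed seed so the sketch is repeatable
    vms = dict(vm_utils)
    selected = []
    while host_util > upper and vms:
        vm = rng.choice(sorted(vms))
        selected.append(vm)
        host_util -= vms.pop(vm)
    return selected

# Host at 90% with three equal VMs: removing any single VM suffices.
print(random_choice({"a": 0.3, "b": 0.3, "c": 0.3}, 0.9))
```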
- The results of a simulation-based evaluation of the proposed algorithms in terms of power consumption, SLA violations and the number of VM migrations are presented in Section 5.
5. Performance analysis
- This is justified as, to enable live migration, the images and data of VMs must be stored on a Network Attached Storage (NAS); therefore, copying the VM's storage is not required.
- Live migration creates an extra CPU load; however, it has been shown that the performance overhead is low [33].
- This approach has been justified by Verma et al. [15].
5.1. Performance metrics
- In order to compare the efficiency of the algorithms the authors use several metrics to evaluate their performance.
- The first metric is the total energy consumption by the physical resources of a data center caused by the application workloads.
- The second metric is the percentage of SLA violations, which can happen in cases when VMs sharing the same host require a CPU performance that cannot be provided due to the consolidation.
- The third metric is the number of VM migrations initiated by the VM manager during the adaptation of the VM placement.
- The last performance metric is the average SLA violation, which represents the average CPU performance that has not been allocated to an application when requested, resulting in performance degradation.
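The last metric can be sketched as the average shortfall over under-provisioned intervals; the (requested, allocated) MIPS pairs below are illustrative inputs, not the paper's workload data:

```python
def average_sla_violation(samples):
    """Average SLA violation (sketch): over all intervals where a VM received
    less CPU than requested, average the unallocated fraction.
    samples: list of (requested_mips, allocated_mips) pairs."""
    shortfalls = [(req - alloc) / req for req, alloc in samples if alloc < req]
    return sum(shortfalls) / len(shortfalls) if shortfalls else 0.0

# Two of three intervals under-provisioned by 20% and 40% of the request.
print(round(average_sla_violation([(1000, 800), (1000, 600), (1000, 1000)]), 3))  # 0.3
```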
5.2. Experiment setup
- It is extremely difficult to conduct repeatable large-scale experiments on a real infrastructure, which is required to evaluate and compare the proposed resource management algorithms.
- Therefore, to ensure the repeatability of experiments, simulations have been chosen as a way to evaluate the performance of the proposed heuristics.
- It has been extended to enable energy-aware simulations as the core framework does not provide this capability.
- Apart from the energy consumption modeling and accounting, the ability to simulate service applications with workloads variable over time has been incorporated.
- The users submit requests for provisioning of 290 heterogeneous VMs that fill the full capacity of the simulated data center.
5.3. Simulation results
- For the benchmark experimental results the authors have used the Non Power-Aware (NPA) policy.
- The results show that with the growth of the utilization threshold energy consumption decreases, whereas the percentage of SLA violations increases.
- The graphs with fitted lines of the energy consumption, SLA violations, number of VM migrations and average SLA violation achieved by the policies with the 40% interval between the thresholds are presented in Fig.
- After the transformation the residuals are normally distributed with a P-value > 0.1.
- The authors have chosen three representative threshold pairs for the MM policy and two values of the threshold for the ST policy to conduct a final comparison.
5.4. Transitions to the sleep mode
- The authors have collected data on the number of times the hosts have been switched to the sleep mode by the proposed MM algorithm during the simulations.
- The distribution of the number of transitions obtained from 10 simulation runs is depicted in Fig.
- This value is effective for real-world systems, as modern servers allow low-latency transitions to the sleep mode consuming low power.
- Meisner et al. [35] have shown that a typical blade server consuming 450 W in the fully utilized state consumes approximately 10.4 W in the sleep mode, while the transition delay is 300 ms.
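Plugging those figures into the paper's 70%-idle assumption gives a rough sense of the saving from one hour of sleep instead of idling:

```python
# Back-of-the-envelope check using the Meisner et al. measurements cited
# above; the 70% idle fraction is the paper's own assumption.
P_FULL = 450.0          # W, fully utilized blade server
P_IDLE = 0.7 * P_FULL   # W, idle server at ~70% of peak
P_SLEEP = 10.4          # W, sleep mode

# Energy saved by sleeping instead of idling for one hour, in watt-hours.
saved_wh = P_IDLE - P_SLEEP
print(f"{saved_wh:.1f} Wh saved per idle hour")  # 304.6 Wh saved per idle hour
```

Against a saving of roughly 300 Wh per idle hour, the energy cost of a 300 ms transition is negligible, which is why aggressive sleep transitions pay off.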
6. Open challenges
- The virtualization technology, which Cloud computing environments heavily rely on, provides the ability to transfer VMs between physical nodes using live or offline migration.
- This enables the technique of dynamic consolidation of VMs onto the minimum number of physical nodes according to the current resource requirements.
- As a result, the idle nodes can be switched off or put to a power saving mode (e.g. sleep, hibernate) to reduce the total energy consumption by the data center.
- In this paper the authors have proposed algorithms that leverage this technique showing its efficiency.
- There are many open challenges that have to be addressed in order to take advantage of the full potential of energy conservation in Cloud data centers.
6.1. Optimization of VM placement according to the utilization of multiple system resources
- The CPU consumes the major part of power in a server, followed by memory, the next largest power consumer.
- The increased number of cores in servers, combined with the rapid adoption of virtualization technologies, creates an ever-growing demand for memory and makes memory one of the most important targets of power and energy usage optimization [36].
- The same applies to network and disk storage facilities in modern data centers.
- A generic Cloud computing environment (IaaS) is built to serve multiple applications for multiple users, creating mixed workloads and complicating the workload characterization.
6.2. Optimization of virtual network topologies
- In virtualized data centers VMs often communicate with each other, establishing virtual network topologies.
- There have been recent research efforts on the optimization of the allocation of communicating applications to minimize the network data transfer overhead [20–25].
- These works have not directly addressed the problem of energy consumption by the network infrastructure.
- As migrations consume additional energy and have a negative impact on the performance, before initiating a migration, the reallocation controller has to ensure that the cost of migration does not exceed the benefit.
- The optimal VM placement and its dynamic adaptation can substantially reduce the data transfer overheads, and thus energy consumed by the network infrastructure.
6.3. Optimization of thermal states and cooling system operation
- A significant part of electrical energy consumed by computing resources is transformed into heat.
- For a 30,000 ft² data center with 1000 standard computing racks, each consuming 10 kW, the initial cost of purchasing and installing the infrastructure is $2–$5 million, whereas the annual cost of cooling is around $4–$8 million [38].
- New challenges include how and when to reallocate VMs to minimize the power drawn by the cooling system, while preserving a safe temperature of the resources and minimizing the migration overhead and performance degradation.
- In addition, hardware level power management techniques, such as DVFS, can lower the temperature when it surpasses the thermal threshold.
- To meet the requirements of Cloud data centers, this problem should be explored for a case when multiple diverse applications with different QoS requirements are executing in the system simultaneously.
6.4. Efficient consolidation of VMs for managing heterogeneous workloads
- Cloud infrastructure services provide users with the ability to provision virtual machines and execute any kinds of applications on them.
- It is necessary to investigate which kinds of applications can be effectively combined and which parameters influence the efficiency.
- Additionally, Cloud applications can present varying workloads.
- End-users will benefit from decreased prices for resource usage.
- Knowledge of the efficient combination of different types of workloads will advance resource management strategies in energy-aware computing environments, where consolidation of VMs is one of the most productive energy saving techniques.
6.5. A holistic approach to energy-aware resource management
- The technique discussed in Section 6.1 is aimed at the consolidation of VMs and increases the amount of physical resources in cases of workload peaks.
- Therefore, the problem of combining different optimization techniques presents a significant research challenge creating a multi-objective optimization problem.
- Usually the optimization controller is centralized, which creates a single point of failure and limits scalability.
- This implies that reallocation controllers are distributed over multiple physical nodes in a data center and do not have the complete view of the system at any point of time.
7. Concluding remarks and future directions
- This work advances the Cloud computing field in two ways.
- Second, consumers are increasingly becoming conscious about the environment.
- Reducing greenhouse gas emissions is a key energy policy focus of many countries around the world.
- The research work is planned to be followed by the development of a software platform that supports the energy-efficient management and allocation of Cloud data center resources.
- The authors will leverage third-party Cloud technologies and service offerings including (a) VM technologies, such as open-source Xen and KVM, and commercial products from VMware; (b) Amazon's Elastic Compute Cloud (EC2), Simple Storage Service (S3), and Microsoft's Azure.
Acknowledgments
- This is a substantially extended version of the keynote paper presented at PDPTA 2010 [6].
- The authors thank Yoganathan Sivaram (Melbourne University), external reviewers and the Guest Editor of this special issue for their suggestions on enhancing the quality of the paper.