
Showing papers on "Data center published in 2013"


Journal ArticleDOI
TL;DR: This paper presents a system that uses virtualization technology to allocate data center resources dynamically based on application demands and support green computing by optimizing the number of servers in use, and develops a set of heuristics that effectively prevent overload while saving energy.
Abstract: Cloud computing allows business customers to scale up and down their resource usage based on needs. Many of the touted gains in the cloud model come from resource multiplexing through virtualization technology. In this paper, we present a system that uses virtualization technology to allocate data center resources dynamically based on application demands and support green computing by optimizing the number of servers in use. We introduce the concept of "skewness" to measure the unevenness in the multidimensional resource utilization of a server. By minimizing skewness, we can combine different types of workloads nicely and improve the overall utilization of server resources. We develop a set of heuristics that effectively prevent overload in the system while saving energy. Trace-driven simulation and experiment results demonstrate that our algorithm achieves good performance.
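The skewness idea above can be sketched in a few lines, assuming the natural definition (root-sum-square deviation of each resource's utilization from the server's mean utilization); the paper's exact formula may differ in detail:

```python
import math

def skewness(utilizations):
    """Unevenness of a server's multi-dimensional resource usage.

    utilizations: per-resource fractions, e.g. [cpu, ram, net].
    Returns 0.0 when all resources are equally loaded; larger
    values mean more skewed (unbalanced) usage.
    """
    avg = sum(utilizations) / len(utilizations)
    if avg == 0:
        return 0.0
    return math.sqrt(sum((u / avg - 1.0) ** 2 for u in utilizations))
```

A balanced server scores 0, while a CPU-hot, RAM-cold server scores high, making it a good candidate to receive a RAM-heavy VM.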

859 citations


Journal ArticleDOI
TL;DR: A survey of the current state-of-the-art in data center network virtualization, and a detailed comparison of the surveyed proposals are presented.
Abstract: With the growth of data volumes and variety of Internet applications, data centers (DCs) have become an efficient and promising infrastructure for supporting data storage and providing the platform for the deployment of diversified network services and applications (e.g., video streaming, cloud computing). These applications and services often impose multifarious resource demands (storage, compute power, bandwidth, latency) on the underlying infrastructure. Existing data center architectures lack the flexibility to effectively support these applications, which results in poor support of QoS, deployability, manageability, and defence against security attacks. Data center network virtualization is a promising solution to address these problems. Virtualized data centers are envisioned to provide better management flexibility, lower cost, scalability, better resource utilization, and energy efficiency. In this paper, we present a survey of the current state-of-the-art in data center network virtualization, and provide a detailed comparison of the surveyed proposals. We discuss the key challenges for future research and point out some potential directions for tackling the problems related to data center design.

633 citations


Journal ArticleDOI
TL;DR: A very general model is proposed and it is proved that the optimal offline algorithm for dynamic right-sizing has a simple structure when viewed in reverse time, and this structure is exploited to develop a new “lazy” online algorithm, which is proven to be 3-competitive.
Abstract: Power consumption imposes a significant cost for data centers implementing cloud services, yet much of that power is used to maintain excess service capacity during periods of low load. This paper investigates how much can be saved by dynamically "right-sizing" the data center by turning off servers during such periods and how to achieve that saving via an online algorithm. We propose a very general model and prove that the optimal offline algorithm for dynamic right-sizing has a simple structure when viewed in reverse time, and this structure is exploited to develop a new "lazy" online algorithm, which is proven to be 3-competitive. We validate the algorithm using traces from two real data-center workloads and show that significant cost savings are possible. Additionally, we contrast this new algorithm with the more traditional approach of receding horizon control.
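The "lazy" flavor of the algorithm can be illustrated with a toy heuristic. This is not the paper's 3-competitive algorithm, just the underlying intuition: scale up instantly to meet demand, but delay scale-downs until the idle time roughly justifies the toggling cost:

```python
def lazy_capacity(loads, beta):
    """Toy 'lazy' right-sizing (an illustration of the idea, not the
    paper's 3-competitive algorithm): scale capacity up instantly to
    meet demand, but retire an idle server only after demand has
    stayed below capacity for `beta` consecutive slots, a rough
    break-even point between idle energy and switching cost.

    loads: servers needed per time slot. Returns servers kept on per slot.
    """
    x, low_streak, plan = 0, 0, []
    for need in loads:
        if need >= x:
            x = max(x, need)       # scale up immediately
            low_streak = 0
        else:
            low_streak += 1
            if low_streak >= beta:
                x -= 1             # retire one idle server
                low_streak = 0
        plan.append(x)
    return plan
```

With `beta=2`, a brief one-slot dip in load never powers servers down, avoiding wasteful on/off cycling.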

324 citations


Proceedings ArticleDOI
03 Nov 2013
TL;DR: SPANStore is presented, a key-value store that exports a unified view of storage services in geographically distributed data centers that can lower costs by over 10x in several scenarios, in comparison with alternative solutions that either use a single storage provider or replicate every object to every data center from which it is accessed.
Abstract: By offering storage services in several geographically distributed data centers, cloud computing platforms enable applications to offer low latency access to user data. However, application developers are left to deal with the complexities associated with choosing the storage services at which any object is replicated and maintaining consistency across these replicas. In this paper, we present SPANStore, a key-value store that exports a unified view of storage services in geographically distributed data centers. To minimize an application provider's cost, we combine three key principles. First, SPANStore spans multiple cloud providers to increase the geographical density of data centers and to minimize cost by exploiting pricing discrepancies across providers. Second, by estimating application workload at the right granularity, SPANStore judiciously trades off greater geo-distributed replication necessary to satisfy latency goals with the higher storage and data propagation costs this entails in order to satisfy fault tolerance and consistency requirements. Finally, SPANStore minimizes the use of compute resources to implement tasks such as two-phase locking and data propagation, which are necessary to offer a global view of the storage services that it builds upon. Our evaluation of SPANStore shows that it can lower costs by over 10x in several scenarios, in comparison with alternative solutions that either use a single storage provider or replicate every object to every data center from which it is accessed.
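The core cost/latency tradeoff can be sketched as a greedy replica choice. The provider names, prices, and latencies below are hypothetical, and SPANStore's actual formulation also accounts for consistency, fault tolerance, and propagation costs:

```python
def choose_replicas(prices, latency, clients, slo_ms):
    """For each client location, pick the cheapest data center within
    the latency SLO; the union of picks is the replica set — often far
    smaller (and cheaper) than replicating everywhere.

    prices: {dc: storage price}; latency: {(client, dc): ms}.
    """
    chosen = set()
    for client in clients:
        reachable = [dc for dc in prices if latency[(client, dc)] <= slo_ms]
        chosen.add(min(reachable, key=lambda dc: prices[dc]))
    return chosen
```

With a loose SLO, one well-placed cheap data center can serve every client; tightening the SLO forces more (and pricier) replicas.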

251 citations


Journal ArticleDOI
TL;DR: This paper proposes a systematic approach to maximize a green data center's profit, i.e., revenue minus cost, and proposes a novel optimization-based profit maximization strategy, which significantly outperforms two comparable energy and performance management algorithms recently proposed in the literature.
Abstract: While a large body of work has recently focused on reducing data centers' energy expenses, there exists no prior work on investigating the trade-off between minimizing a data center's energy expenditure and maximizing its revenue for the various Internet and cloud computing services that it may offer. In this paper, we seek to tackle this shortcoming by proposing a systematic approach to maximize a green data center's profit, i.e., revenue minus cost. In this regard, we explicitly take into account the practical service-level agreements (SLAs) that currently exist between data centers and their customers. Our model also incorporates various other factors, such as the availability of local renewable power generation at data centers and the stochastic nature of data centers' workload. Furthermore, we propose a novel optimization-based profit maximization strategy for data centers for two different cases, without and with behind-the-meter renewable generators. We show that the formulated optimization problems in both cases are convex programs; therefore, they are tractable and appropriate for practical implementation. Using various experimental data and via computer simulations, we assess the performance of the proposed optimization-based profit maximization strategy and show that it significantly outperforms two comparable energy and performance management algorithms recently proposed in the literature.
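The revenue-minus-cost tradeoff at the heart of the paper can be illustrated by brute force over the number of active servers. This toy stands in for the paper's convex programs, and all numbers are hypothetical:

```python
def best_server_count(demand, revenue_per_req, power_cost, cap_per_server, max_servers):
    """Enumerate the profit tradeoff: each active server serves up to
    cap_per_server requests (earning revenue) but draws power (a cost);
    profit = revenue - energy cost peaks at an interior point, which the
    paper finds via convex programming rather than enumeration."""
    def profit(n):
        served = min(n * cap_per_server, demand)
        return served * revenue_per_req - n * power_cost
    return max(range(max_servers + 1), key=profit)
```

Past the point where demand is fully served, every extra server only adds energy cost, so profit is maximized exactly where capacity meets demand in this toy setting.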

234 citations


Journal ArticleDOI
TL;DR: The basic idea of VMPlanner is to optimize both virtual machine placement and traffic flow routing so as to turn off as many unneeded network elements as possible for power saving in virtualization-based data centers.

228 citations


Journal ArticleDOI
TL;DR: An update on recent developments in the field of ultra-high-capacity optical interconnects for intra-DCN communication is provided.
Abstract: Warehouse-scale data center operators need much-higher-bandwidth intra-data center networks (DCNs) to sustain the increase of network traffic due to cloud computing and other emerging web applications. Current DCNs based on commodity switches require excessive amounts of power to face this traffic increase. Optical intra-DCN interconnection networks have recently emerged as a promising solution that can provide higher throughput while consuming less power. This article provides an update on recent developments in the field of ultra-high-capacity optical interconnects for intra-DCN communication. Several recently proposed architectures and technologies are examined and compared, while future trends and research challenges are outlined.

202 citations


Journal ArticleDOI
01 Jul 2013
TL;DR: In this article, a self-organizing and adaptive approach for the consolidation of VMs on two resources, namely CPU and RAM, is presented; its decisions are based exclusively on local information, which makes the approach very simple to implement.
Abstract: Power efficiency is one of the main issues that will drive the design of data centers, especially of those devoted to providing Cloud computing services. In virtualized data centers, consolidation of Virtual Machines (VMs) on the minimum number of physical servers has been recognized as a very efficient approach, as this allows unloaded servers to be switched off or used to accommodate more load, which is clearly a cheaper alternative to buying more resources. The consolidation problem must be solved on multiple dimensions, since in modern data centers CPU is not the only critical resource: depending on the characteristics of the workload, other resources, for example RAM and bandwidth, can become the bottleneck. The problem is so complex that centralized and deterministic solutions are practically useless in large data centers with hundreds or thousands of servers. This paper presents ecoCloud, a self-organizing and adaptive approach for the consolidation of VMs on two resources, namely CPU and RAM. Decisions on the assignment and migration of VMs are driven by probabilistic processes and are based exclusively on local information, which makes the approach very simple to implement. Both a fluid-like mathematical model and experiments on a real data center show that the approach rapidly consolidates the workload, and that CPU-bound and RAM-bound VMs are balanced, so that both resources are exploited efficiently.
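The probabilistic assignment can be sketched as a bid function of a server's current utilization: lightly loaded servers rarely bid (so they can drain and switch off) and nearly full servers never bid (to avoid overload). The polynomial shape follows the paper's idea, but the constants here are illustrative rather than ecoCloud's calibrated values:

```python
def accept_probability(u, T=0.9, p=3):
    """Probability that a server bids to host a new VM given its
    current utilization u in [0, 1]. Zero below and above the useful
    range, peaking at mid-high utilization; T is the overload
    threshold, p controls how strongly consolidation is favored."""
    if not 0.0 <= u < T:
        return 0.0
    u_star = p * T / (p + 1)            # utilization that maximizes the bid
    m = (u_star ** p) * (T - u_star)    # normalization so the peak is 1.0
    return (u ** p) * (T - u) / m
```

Each server then runs an independent Bernoulli trial (`random.random() < accept_probability(u)`), which is what makes the scheme decentralized: no global view is needed.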

189 citations


Journal ArticleDOI
TL;DR: A virtual machine placement algorithm, EAGLE, is proposed, which can balance the utilization of multi-dimensional resources, reduce the number of running physical machines (PMs), and thus lower energy consumption; it is evaluated via extensive simulations and experiments on real traces.

185 citations


Journal ArticleDOI
TL;DR: This paper is devoted to the categorization of green computing performance metrics in data centers, covering basic metrics such as power and thermal metrics as well as extended performance metrics, i.e., multiple-data-center indicators.
Abstract: Data centers now play an important role in modern IT infrastructures. Although much research effort has been made in the field of green data center computing, performance metrics for green data centers have been largely ignored. This paper is devoted to the categorization of green computing performance metrics in data centers, covering basic metrics such as power and thermal metrics as well as extended performance metrics, i.e., multiple-data-center indicators. Based on a taxonomy of performance metrics, this paper summarizes the features of currently available metrics and presents insights for the study of green data center computing.
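Two of the basic power metrics in such taxonomies are directly computable:

```python
def pue(total_facility_kw, it_kw):
    """Power Usage Effectiveness: total facility power (IT plus
    cooling, power distribution, lighting) divided by IT equipment
    power. 1.0 is the ideal lower bound; typical 2013-era facilities
    ran well above it."""
    return total_facility_kw / it_kw

def dcie(total_facility_kw, it_kw):
    """Data Center infrastructure Efficiency: the reciprocal of PUE,
    expressed as a percentage of power reaching IT equipment."""
    return 100.0 * it_kw / total_facility_kw
```

A facility drawing 2 MW to power 1 MW of IT load has a PUE of 2.0, i.e. a DCiE of 50%.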

182 citations


Journal ArticleDOI
TL;DR: This paper studies state-of-the-art techniques and research related to power saving in the IaaS of a cloud computing system, which consumes a huge part of the total energy in a cloud computing system.
Abstract: Although cloud computing has rapidly emerged as a widely accepted computing paradigm, the research on cloud computing is still at an early stage. Cloud computing suffers from different challenging issues related to security, software frameworks, quality of service, standardization, and power consumption. Efficient energy management is one of the most challenging research issues. The core services in cloud computing system are the SaaS (Software as a Service), PaaS (Platform as a Service), and IaaS (Infrastructure as a Service). In this paper, we study state-of-the-art techniques and research related to power saving in the IaaS of a cloud computing system, which consumes a huge part of total energy in a cloud computing system. At the end, some feasible solutions for building green cloud computing are proposed. Our aim is to provide a better understanding of the design challenges of energy management in the IaaS of a cloud computing system.

Journal ArticleDOI
TL;DR: This work is the first to explore the problem of electricity cost saving using energy storage in multiple data centers by considering both the spatial and temporal variations in wholesale electricity prices and workload arrival processes.
Abstract: Electricity expenditure comprises a significant fraction of the total operating cost in data centers. Hence, cloud service providers are required to reduce electricity cost as much as possible. In this paper, we consider utilizing existing energy storage capabilities in data centers to reduce electricity cost under wholesale electricity markets, where the electricity price exhibits both temporal and spatial variations. A stochastic program is formulated by integrating the center-level load balancing, the server-level configuration, and the battery management while at the same time guaranteeing the quality-of-service experience by end users. We use the Lyapunov optimization technique to design an online algorithm that achieves an explicit tradeoff between cost saving and energy storage capacity. We demonstrate the effectiveness of our proposed algorithm through extensive numerical evaluations based on real-world workload and electricity price data sets. As far as we know, our work is the first to explore the problem of electricity cost saving using energy storage in multiple data centers by considering both the spatial and temporal variations in wholesale electricity prices and workload arrival processes.
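The battery's role in exploiting temporal price variation can be shown with a toy price-threshold policy. This is NOT the paper's Lyapunov-based algorithm, just the arbitrage intuition it formalizes, with all prices and sizes hypothetical:

```python
def threshold_battery(prices, demand, capacity, low, high):
    """Buy extra energy to charge the battery when the spot price is
    cheap (<= low) and serve demand from the battery when it is
    expensive (>= high). Charging is capped at `demand` kWh per slot
    as a crude rate limit. Returns total electricity cost."""
    soc, cost = 0.0, 0.0
    for price in prices:
        draw = demand                          # grid energy bought this slot
        if price >= high and soc > 0:
            used = min(soc, demand)            # discharge to cover demand
            soc -= used
            draw -= used
        elif price <= low and soc < capacity:
            charge = min(capacity - soc, demand)
            soc += charge
            draw += charge
        cost += draw * price
    return cost
```

Charging during the cheap slots and discharging during the price spike shifts purchases in time, which is exactly the temporal variation the paper exploits (its algorithm also load-balances across data centers to exploit spatial variation).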

Proceedings ArticleDOI
08 Jul 2013
TL;DR: Harmony is a heterogeneity-aware dynamic capacity provisioning scheme for cloud data centers that uses the K-means clustering algorithm to divide the workload into distinct task classes with similar characteristics in terms of resource and performance requirements, and can reduce energy consumption by up to 28 percent compared to heterogeneity-oblivious solutions.
Abstract: Data centers today consume tremendous amounts of energy in terms of power distribution and cooling. Dynamic capacity provisioning is a promising approach for reducing energy consumption by dynamically adjusting the number of active machines to match resource demands. However, despite extensive studies of the problem, existing solutions for dynamic capacity provisioning have not fully considered the heterogeneity of both workload and machine hardware found in production environments. In particular, production data centers often comprise several generations of machines with different capacities, capabilities, and energy consumption characteristics. Meanwhile, the workloads running in these data centers typically consist of a wide variety of applications with different priorities, performance objectives, and resource requirements. Failure to consider these heterogeneous characteristics will lead to both suboptimal energy savings and long scheduling delays, due to incompatibility between workload requirements and the resources offered by the provisioned machines. To address this limitation, in this paper we present HARMONY, a Heterogeneity-Aware Resource Management System for dynamic capacity provisioning in cloud computing environments. Specifically, we first use the K-means clustering algorithm to divide the workload into distinct task classes with similar characteristics in terms of resource and performance requirements. Then we present a novel technique for dynamically adjusting the number of machines of each type to minimize total energy consumption and performance penalty in terms of scheduling delay. Through simulations using real traces from Google's compute clusters, we found that our approach can improve data center energy efficiency by up to 28% compared to heterogeneity-oblivious solutions.
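The first step — clustering tasks into classes — is plain K-means over per-task resource vectors. A minimal Lloyd's-algorithm sketch (the paper additionally tunes k and feeds the classes into machine-type sizing):

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two resource vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Cluster task resource vectors, e.g. [cpu, mem] per task, into
    k classes. Returns (centroids, labels)."""
    rng = random.Random(seed)
    cents = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, cents[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # recompute centroid as the mean of its members
                cents[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return cents, labels
```

Each resulting class (e.g. "small CPU-light tasks" vs "large memory-heavy tasks") can then be matched to the machine generation that serves it most efficiently.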

Proceedings ArticleDOI
08 Jul 2013
TL;DR: The Glasgow Raspberry Pi Cloud (PiCloud), a scale model of a DC composed of clusters of Raspberry Pi devices, emulates every layer of a Cloud stack, ranging from resource virtualisation to network behaviour, providing a full-featured Cloud Computing research and educational environment.
Abstract: Data Centers (DC) used to support Cloud services often consist of tens of thousands of networked machines under a single roof. The significant capital outlay required to replicate such infrastructures constitutes a major obstacle to practical implementation and evaluation of research in this domain. Currently, most research into Cloud computing relies on either limited software simulation or the use of testbed environments with a handful of machines. The recent introduction of the Raspberry Pi, a low-cost, low-power single-board computer, has made the construction of miniature Cloud DCs more affordable. In this paper, we present the Glasgow Raspberry Pi Cloud (PiCloud), a scale model of a DC composed of clusters of Raspberry Pi devices. The PiCloud emulates every layer of a Cloud stack, ranging from resource virtualisation to network behaviour, providing a full-featured Cloud Computing research and educational environment.

Proceedings ArticleDOI
05 May 2013
TL;DR: Facebook's current data center network architecture is reviewed and some alternative architectures are explored.
Abstract: We review Facebook's current data center network architecture and explore some alternative architectures.

Book ChapterDOI
26 Aug 2013
TL;DR: This paper proposes a novel VM placement algorithm to increase environmental sustainability by taking into account distributed data centers with different carbon footprint rates and PUEs; results show that the proposed algorithm reduces CO2 emissions and power consumption while maintaining the same level of quality of service.
Abstract: Due to the increasing use of Cloud computing services and the amount of energy used by data centers, there is a growing interest in reducing the energy consumption and carbon footprint of data centers. Cloud data centers use virtualization technology to host multiple virtual machines (VMs) on a single physical server. By applying efficient VM placement algorithms, Cloud providers are able to enhance energy efficiency and reduce carbon footprint. Previous works have focused on reducing the energy used within a single data center or multiple data centers without considering their energy sources and Power Usage Effectiveness (PUE). In contrast, this paper proposes a novel VM placement algorithm to increase environmental sustainability by taking into account distributed data centers with different carbon footprint rates and PUEs. Simulation results show that the proposed algorithm reduces CO2 emissions and power consumption, while it maintains the same level of quality of service compared to other competitive algorithms.
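The selection criterion implied by the abstract — weigh each site by both its grid's carbon intensity and its PUE — can be sketched as follows. Site names and numbers are hypothetical, and the paper's algorithm additionally respects QoS and capacity constraints:

```python
def greenest_site(sites, vm_power_kw):
    """Pick the data center minimizing CO2 for hosting a VM:
    emissions = VM power * site PUE * grid carbon intensity.
    sites: {name: (pue, kg_co2_per_kwh)}."""
    def footprint(name):
        pue, intensity = sites[name]
        return vm_power_kw * pue * intensity
    return min(sites, key=footprint)
```

Note that neither factor alone decides: a renewable-powered but cooling-inefficient site can lose to one that balances a clean grid with a low PUE.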

Proceedings Article
27 May 2013
TL;DR: VDC Planner is proposed, a migration-aware dynamic virtual data center embedding framework that aims at achieving high revenue while minimizing the total energy cost over time; it achieves both higher revenue and lower average scheduling delay compared to existing migration-oblivious solutions.
Abstract: Cloud computing promises to provide computing resources to a large number of service applications in an on-demand manner. Traditionally, cloud providers such as Amazon only provide guaranteed allocation for compute and storage resources, and fail to support bandwidth requirements and performance isolation among these applications. To address this limitation, a number of recent proposals advocate providing both guaranteed server and network resources in the form of Virtual Data Centers (VDCs). This raises the problem of optimally allocating both servers and data center networks to multiple VDCs in order to maximize the total revenue, while minimizing the total energy consumption in the data center. However, despite recent studies on this problem, none of the existing solutions have considered the possibility of using VM migration to dynamically adjust the resource allocation in order to meet the fluctuating resource demand of VDCs. In this paper, we propose VDC Planner, a migration-aware dynamic virtual data center embedding framework that aims at achieving high revenue while minimizing the total energy cost over time. Our framework supports various usage scenarios, including VDC embedding, VDC scaling, as well as dynamic VDC consolidation. Through experiments using realistic workload traces, we show that our proposed approach achieves both higher revenue and lower average scheduling delay compared to existing migration-oblivious solutions.

Journal ArticleDOI
TL;DR: This work studies timely, cost-minimizing upload of massive, dynamically-generated, geo-dispersed data into the cloud, for processing using a MapReduce-like framework, and proposes two online algorithms: an online lazy migration (OLM) algorithm and a randomized fixed horizon control (RFHC) algorithm.
Abstract: Cloud computing, rapidly emerging as a new computation paradigm, provides agile and scalable resource access in a utility-like fashion, especially for the processing of big data. An important open issue here is how to efficiently move the data, from different geographical locations over time, into a cloud for effective processing. The de facto approach of hard drive shipping is not flexible or secure. This work studies timely, cost-minimizing upload of massive, dynamically generated, geo-dispersed data into the cloud, for processing using a MapReduce-like framework. Targeting a cloud encompassing disparate data centers, we model a cost-minimizing data migration problem, and propose two online algorithms: an online lazy migration (OLM) algorithm and a randomized fixed horizon control (RFHC) algorithm, for optimizing at any given time the choice of the data center for data aggregation and processing, as well as the routes for transmitting data there. Careful comparisons among these online and offline algorithms in realistic settings are conducted through extensive experiments, which demonstrate close-to-offline-optimum performance of the online algorithms.

Journal ArticleDOI
19 Sep 2013
TL;DR: This study analyzes robustness of the state-of-the-art Data Center Network (DCN) and presents multi-layered graph modeling of various DCNs, and proposes new procedures to quantify the DCN robustness.
Abstract: Data centers, being an architectural and functional block of cloud computing, are integral to the Information and Communication Technology (ICT) sector. Cloud computing is rigorously utilized by various domains, such as agriculture, nuclear science, smart grids, healthcare, and search engines for research, data storage, and analysis. A Data Center Network (DCN) constitutes the communicational backbone of a data center, ascertaining the performance boundaries for cloud infrastructure. The DCN needs to be robust to failures and uncertainties to deliver the required Quality of Service (QoS) level and satisfy Service Level Agreements (SLAs). In this paper, we analyze the robustness of the state-of-the-art DCNs. Our major contributions are: (a) we present multi-layered graph modeling of various DCNs; (b) we study the classical robustness metrics considering various failure scenarios to perform a comparative analysis; (c) we present the inadequacy of the classical network robustness metrics to appropriately evaluate the DCN robustness; and (d) we propose new procedures to quantify the DCN robustness. Currently, there is no detailed study available centering on DCN robustness. Therefore, we believe that this study will lay a firm foundation for future DCN robustness research.
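One classical robustness metric of the kind the paper evaluates — the fraction of nodes remaining in the largest connected component after failures — can be probed with a small BFS routine (a toy; real DCN studies run this over large multi-layered topology graphs):

```python
from collections import deque

def largest_cc_fraction(adj, removed=()):
    """Fraction of surviving nodes that sit in the largest connected
    component of the DCN graph after removing failed nodes.
    adj: {node: [neighbors]}; removed: failed switches/servers."""
    nodes = [n for n in adj if n not in removed]
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        comp, queue = 0, deque([start])
        seen.add(start)
        while queue:          # BFS over one connected component
            n = queue.popleft()
            comp += 1
            for m in adj[n]:
                if m not in seen and m not in removed:
                    seen.add(m)
                    queue.append(m)
        best = max(best, comp)
    return best / len(nodes) if nodes else 0.0
```

On a star topology the metric exposes a single point of failure: removing the hub switch shatters the network into isolated hosts.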

Proceedings ArticleDOI
30 Jul 2013
TL;DR: The study on the workloads reveals that data analysis applications share many inherent characteristics, which place them in a different class from desktop, HPC, and service workloads, including traditional server workloads (SPECweb2005) and scale-out service workloads (four among six benchmarks in CloudSuite), and accordingly the authors give several recommendations for architecture and system optimizations.
Abstract: As the amount of data explodes rapidly, more and more corporations are using data centers to make effective decisions and gain a competitive edge. Data analysis applications play a significant role in data centers, and hence it has become increasingly important to understand their behaviors in order to further improve the performance of data center computer systems. In this paper, after investigating the three most important application domains in terms of page views and daily visitors, we choose eleven representative data analysis workloads and characterize their micro-architectural characteristics by using hardware performance counters, in order to understand the impacts and implications of data analysis workloads on systems equipped with modern superscalar out-of-order processors. Our study on the workloads reveals that data analysis applications share many inherent characteristics, which place them in a different class from desktop (SPEC CPU2006), HPC (HPCC), and service workloads, including traditional server workloads (SPECweb2005) and scale-out service workloads (four among six benchmarks in CloudSuite), and accordingly we give several recommendations for architecture and system optimizations. On the basis of our workload characterization work, we released a benchmark suite named DCBench for typical datacenter workloads, including data analysis and service workloads, with an open-source license on our project home page at http://prof.ict.ac.cn/DCBench. We hope that DCBench is helpful for performing architecture and small- to medium-scale system research for datacenter computing.
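The counter-based characterization step boils down to deriving ratio metrics from raw counter deltas, along these lines (counter names here are illustrative; actual event names depend on the CPU and measurement tool, e.g. perf):

```python
def derive_microarch_metrics(counters):
    """Turn raw hardware performance-counter deltas into the usual
    characterization metrics: instructions per cycle (IPC),
    last-level-cache misses per kilo-instruction (MPKI), and branch
    misprediction rate."""
    return {
        "ipc": counters["instructions"] / counters["cycles"],
        "llc_mpki": 1000.0 * counters["llc_misses"] / counters["instructions"],
        "branch_miss_rate": counters["branch_misses"] / counters["branches"],
    }
```

Comparing such ratios across workload suites is what lets a study place data analysis applications in a different class from desktop, HPC, and service workloads.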

Journal ArticleDOI
01 Jan 2013
TL;DR: Greenhead, a holistic resource management framework for embedding VDCs across geographically distributed data centers connected through a backbone network, is proposed to maximize the cloud provider's revenue while ensuring that the infrastructure is as environment-friendly as possible.
Abstract: Cloud computing promises to provide on-demand computing, storage, and networking resources. However, most cloud providers simply offer virtual machines (VMs) without bandwidth and delay guarantees, which may hurt the performance of the deployed services. Recently, some proposals suggested remedying this limitation by offering virtual data centers (VDCs) instead of VMs only. However, they have only considered the case where VDCs are embedded within a single data center. In practice, infrastructure providers should have the ability to provision requested VDCs across their distributed infrastructure to achieve multiple goals, including revenue maximization, operational cost reduction, energy efficiency, and green IT, or to simply satisfy geographic location constraints of the VDCs. In this paper, we propose Greenhead, a holistic resource management framework for embedding VDCs across geographically distributed data centers connected through a backbone network. The goal of Greenhead is to maximize the cloud provider's revenue while ensuring that the infrastructure is as environment-friendly as possible. To evaluate the effectiveness of our proposal, we conducted extensive simulations of four data centers connected through the NSFNet topology. Results show that Greenhead improves requests' acceptance ratio and revenue by up to 40 percent while ensuring high usage of renewable energy and a minimal carbon footprint.

Patent
01 Nov 2013
TL;DR: In this article, the authors propose CDN load balancing in the cloud, where server resources are allocated at an edge data center of a content delivery network to properties that are being serviced by that edge data center.
Abstract: CDN load balancing in the cloud. Server resources are allocated at an edge data center of a content delivery network to properties that are being serviced by the edge data center. Based on near-real-time data, properties are sorted by trending traffic at the edge data center. Server resources are allocated for at least one property of the sorted properties at the edge data center. The server resources are allocated based on rules developed from long-term trends. The resource allocation includes calculating server needs for the property in a partition at the edge data center, and allocating the server needs for the property to available servers in the partition.
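The sort-then-allocate step described in the claim can be sketched as a proportional split of the edge server pool (a toy reading of the abstract, not the patent's actual rules; real allocation rules derived from long-term trends would be richer):

```python
def allocate_servers(traffic, total_servers, floor=1):
    """Sort properties by trending traffic at the edge data center and
    split the server pool proportionally, with a per-property floor.
    Rounding can leave a few servers unassigned; real rules would
    redistribute them. traffic: {property: recent request rate}."""
    total = sum(traffic.values())
    alloc, remaining = {}, total_servers
    for prop, rate in sorted(traffic.items(), key=lambda kv: -kv[1]):
        share = max(floor, round(total_servers * rate / total))
        alloc[prop] = min(share, remaining)
        remaining -= alloc[prop]
    return alloc
```

Processing the trending (highest-traffic) properties first means that, when the pool runs short, it is the long tail that gets squeezed rather than the hot content.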

Journal ArticleDOI
TL;DR: This paper implements and simulates the state-of-the-art DCN models, namely: (a) the legacy DCN architecture, (b) switch-based, and (c) hybrid models, and compares their effectiveness by monitoring network throughput and average packet delay.
Abstract: Data centers are experiencing a remarkable growth in the number of interconnected servers. Being one of the foremost data center design concerns, network infrastructure plays a pivotal role in the initial capital investment and in ascertaining the performance parameters for the data center. Legacy data center network (DCN) infrastructure lacks the inherent capability to meet the data centers' growth trend and aggregate bandwidth demands. Deployment of even the highest-end enterprise network equipment only delivers around 50% of the aggregate bandwidth at the edge of the network. The vital challenges faced by the legacy DCN architecture trigger the need for new DCN architectures to accommodate the growing demands of the 'cloud computing' paradigm. We have implemented and simulated the state-of-the-art DCN models in this paper, namely: (a) the legacy DCN architecture, (b) switch-based, and (c) hybrid models, and compared their effectiveness by monitoring the network's (a) throughput and (b) average packet delay. The presented analysis may be perceived as a background benchmarking study for further research on the simulation and implementation of DCN-customized topologies and customized addressing protocols in large-scale data centers. We have performed extensive simulations under various network traffic patterns to ascertain the strengths and inadequacies of the different DCN architectures. Moreover, we provide a firm foundation for further research and enhancement in DCN architectures. Copyright © 2012 John Wiley & Sons, Ltd.

Proceedings ArticleDOI
01 Oct 2013
TL;DR: This paper characterize Google applications, based on a one-month Google trace with over 650k jobs running across over 12,000 heterogeneous hosts from a Google data center, via a K-means clustering algorithm with optimized number of sets,based on task events and resource usage.
Abstract: In this paper, we characterize Google applications, based on a one-month Google trace with over 650k jobs running across over 12,000 heterogeneous hosts from a Google data center. On one hand, we carefully compute valuable statistics about task events and resource utilization for Google applications, based on various types of resources (such as CPU and memory) and execution types (e.g., whether they can run batch tasks or not). Resource utilization per application is observed to follow the Pareto principle closely. On the other hand, we classify applications via a K-means clustering algorithm with an optimized number of sets, based on task events and resource usage. The number of applications in the K-means clustering sets follows a Pareto-similar distribution. We believe our work is interesting and valuable for the further investigation of Cloud environments.
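The Pareto observation can be checked on any usage trace with a one-liner style computation (the classic 80/20 reading: the top ~20% of applications account for most of the usage):

```python
def pareto_share(usages, top_fraction=0.2):
    """Share of total resource usage consumed by the top
    `top_fraction` of applications; values near 0.8 and above for
    top_fraction=0.2 are the classic Pareto signature such trace
    studies report."""
    ranked = sorted(usages, reverse=True)
    k = max(1, int(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)
```

A perfectly uniform workload scores exactly `top_fraction`; the further the score exceeds it, the more a handful of heavy applications dominate the cluster.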

Proceedings ArticleDOI
30 Sep 2013
TL;DR: An adaptive anomaly identification mechanism that explores the most relevant principal components of different failure types in cloud computing infrastructures and integrates the cloud performance metric analysis with filtering techniques to achieve automated, efficient, and accurate anomaly identification.
Abstract: Cloud computing has become increasingly popular by obviating the need for users to own and maintain complex computing infrastructures. However, due to their inherent complexity and large scale, production cloud computing systems are prone to various runtime problems caused by hardware and software faults and environmental factors. Autonomic anomaly detection is a crucial technique for understanding emergent, cloud-wide phenomena and self-managing cloud resources for system-level dependability assurance. To detect anomalous cloud behaviors, we need to monitor the cloud execution and collect runtime cloud performance data. These data consist of values of performance metrics for different types of failures, which display different correlations with the performance metrics. In this paper, we present an adaptive anomaly identification mechanism that explores the most relevant principal components of different failure types in cloud computing infrastructures. It integrates the cloud performance metric analysis with filtering techniques to achieve automated, efficient, and accurate anomaly identification. The proposed mechanism adapts itself by recursively learning from the newly verified detection results to refine future detections. We have implemented a prototype of the anomaly identification system and conducted experiments in an on-campus cloud computing environment and by using the Google data center traces. Our experimental results show that our mechanism can achieve more efficient and accurate anomaly detection than other existing schemes.
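The general idea of scoring anomalies against the most relevant principal components can be illustrated with a pure-Python sketch: power iteration extracts the top component of healthy metric data, and samples are scored by their reconstruction error. The paper's mechanism additionally selects components per failure type and adapts recursively; none of that is shown here.

```python
import math

def top_component(X, iters=100):
    """Leading principal component of mean-centred rows X via power iteration."""
    d = len(X[0])
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [0.0] * d
        for x in X:                      # w = (X^T X) v
            s = sum(xi * vi for xi, vi in zip(x, v))
            for i in range(d):
                w[i] += x[i] * s
        norm = math.sqrt(sum(wi * wi for wi in w)) or 1.0
        v = [wi / norm for wi in w]
    return v

def anomaly_score(x, mean, v):
    """Reconstruction error of x after projecting onto the top component."""
    c = [xi - mi for xi, mi in zip(x, mean)]
    t = sum(ci * vi for ci, vi in zip(c, v))
    return math.sqrt(sum((ci - t * vi) ** 2 for ci, vi in zip(c, v)))

# Healthy samples: two performance metrics that move together.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
mean = tuple(sum(col) / len(samples) for col in zip(*samples))
centred = [tuple(xi - mi for xi, mi in zip(x, mean)) for x in samples]
v = top_component(centred)
```

A sample that follows the learned correlation scores near zero, while one that breaks it (e.g., CPU metric normal but memory metric collapsed) scores high and would be flagged for verification.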

Proceedings ArticleDOI
12 Feb 2013
TL;DR: Shroud is presented, a general storage system that hides data access patterns from the servers running it, protecting user privacy, and shows, via new techniques such as oblivious aggregation, how to securely use many inexpensive secure coprocessors acting in parallel to improve request latency.
Abstract: Recent events have shown online service providers the perils of possessing private information about users. Encrypting data mitigates but does not eliminate this threat: the pattern of data accesses still reveals information. Thus, we present Shroud, a general storage system that hides data access patterns from the servers running it, protecting user privacy. Shroud functions as a virtual disk with a new privacy guarantee: the user can look up a block without revealing the block's address. Such a virtual disk can be used for many purposes, including map lookup, microblog search, and social networking. Shroud aggressively targets hiding accesses among hundreds of terabytes of data. We achieve our goals by adapting oblivious RAM algorithms to enable large-scale parallelization. Specifically, we show, via new techniques such as oblivious aggregation, how to securely use many inexpensive secure coprocessors acting in parallel to improve request latency. Our evaluation combines large-scale emulation with an implementation on secure coprocessors and suggests that these adaptations bring private data access closer to practicality.
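The core idea behind oblivious RAM, the primitive Shroud builds on, can be shown with the simplest (and slowest) scheme: a linear scan that touches every block per request, so the observed access pattern is identical regardless of which block was wanted. This is only a didactic baseline; Shroud's contribution is parallelizing far more efficient ORAM constructions across secure coprocessors.

```python
def oblivious_read(blocks, wanted):
    """Read every block so an observer of the access pattern learns nothing
    about which index was requested. Cost is O(n) per lookup -- exactly the
    overhead that real ORAM schemes work to reduce to polylog(n)."""
    result = None
    for i, block in enumerate(blocks):
        value = block            # identical access pattern for every request
        if i == wanted:
            result = value
    return result

store = [b"alice", b"bob", b"carol"]
```

A server watching this loop sees the same sequence of reads whether the client wanted block 0 or block 2, which is the privacy guarantee; the research problem is getting it without scanning everything.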

Journal ArticleDOI
TL;DR: This article presents a survey on enabling DCN technologies for future cloud infrastructures through which the huge amount of resources in data centers can be efficiently managed.
Abstract: The increasing adoption of cloud services is demanding the deployment of more data centers. Data centers typically house a huge amount of storage and computing resources, in turn dictating better networking technologies to connect the large number of computing and storage nodes. Data center networking (DCN) is an emerging field to study networking challenges in data centers. In this article, we present a survey on enabling DCN technologies for future cloud infrastructures through which the huge amount of resources in data centers can be efficiently managed. Specifically, we start with a detailed investigation of the architecture, technologies, and design principles for future DCN. Following that, we highlight some of the design challenges and open issues that should be addressed for future DCN to improve its energy efficiency and increase its throughput while lowering its cost.

Journal ArticleDOI
25 Nov 2013
TL;DR: In this article, the authors present a planning problem and an extremely efficient tabu search heuristic for optimizing the locations of cloud data centers and software components while simultaneously finding the information routing and network link capacities.
Abstract: The ubiquity of cloud applications requires the meticulous design of cloud networks with high quality of service, low costs, and low CO2 emissions. This paper presents a planning problem and an extremely efficient tabu search heuristic for optimizing the locations of cloud data centers and software components while simultaneously finding the information routing and network link capacities. The objectives are to optimize the network performance, the CO2 emissions, the capital expenditures (CAPEX), and the operational expenditures (OPEX). The problem is modeled using a mixed-integer programming model and solved with both an optimization solver and a tabu search heuristic. A case study of a web search engine is presented to explain and optimize the different aspects, showing how planners can use the model to direct the optimization and find the best solutions. The efficiency of the tabu search algorithm is presented for networks with up to 500 access nodes and 1,000 potential data center locations distributed around the globe.
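The tabu search idea can be illustrated on a toy one-dimensional facility-location instance: toggle moves open or close candidate data center sites, a fixed tenure forbids immediately undoing recent moves, and an aspiration rule readmits a tabu move that beats the global best. The paper's heuristic additionally handles routing, link capacities, CO2 and CAPEX/OPEX objectives; all positions and costs below are illustrative.

```python
import random

def total_cost(open_sites, sites, nodes, open_cost):
    """Serving distance plus a fixed cost per opened data center."""
    if not open_sites:
        return float("inf")
    serve = sum(min(abs(n - sites[s]) for s in open_sites) for n in nodes)
    return serve + open_cost * len(open_sites)

def tabu_search(sites, nodes, open_cost, iters=200, tenure=5, seed=0):
    rng = random.Random(seed)
    current = {rng.randrange(len(sites))}
    best = set(current)
    best_cost = total_cost(current, sites, nodes, open_cost)
    tabu = {}                      # site index -> iteration until which it is tabu
    for it in range(iters):
        moves = []
        for s in range(len(sites)):
            cand = current ^ {s}   # toggle site s open/closed
            c = total_cost(cand, sites, nodes, open_cost)
            # aspiration: a tabu move is allowed if it beats the global best
            if tabu.get(s, -1) < it or c < best_cost:
                moves.append((c, s, cand))
        c, s, cand = min(moves)    # best admissible neighbour (may be uphill)
        current, tabu[s] = cand, it + tenure
        if c < best_cost:
            best, best_cost = set(cand), c
    return best, best_cost

# Candidate locations and access-node positions on a line (illustrative).
sites = [0, 10, 20, 30, 40, 50]
nodes = [1, 2, 11, 41, 42]
best, cost = tabu_search(sites, nodes, open_cost=5)
```

Accepting the best admissible neighbour even when it is uphill, while remembering recent moves in the tabu list, is what lets the search escape local optima that a pure greedy descent would get stuck in.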

Proceedings ArticleDOI
25 Mar 2013
TL;DR: In this paper, the authors present quantitative evidence that this crucial design consideration to meet interactive performance criteria limits data center consolidation and describe an architectural solution that is a seamless extension of today's cloud computing infrastructure.
Abstract: The convergence of mobile computing and cloud computing enables new multimedia applications that are both resource-intensive and interaction-intensive. For these applications, end-to-end network bandwidth and latency matter greatly when cloud resources are used to augment the computational power and battery life of a mobile device. We first present quantitative evidence that this crucial design consideration to meet interactive performance criteria limits data center consolidation. We then describe an architectural solution that is a seamless extension of today's cloud computing infrastructure.

Proceedings Article
27 May 2013
TL;DR: A new embedding solution for data centers that, in addition to virtual machine placement, explicitly considers the relation between switches and links, allows multiple resources of the same request to be mapped to a single physical resource, and reduces resource fragmentation in terms of CPU is proposed.
Abstract: Virtualizing data center networks has been considered a feasible alternative to satisfy the requirements of advanced cloud services. Proper mapping of virtual data center (VDC) resources to their physical counterparts, also known as virtual data center embedding, can impact the revenue of cloud providers. Similar to virtual networks, the problem of mapping virtual requests to physical infrastructures is known to be NP-hard. Although some proposals have come up with heuristics to cope with the complexity of the embedding process focusing on virtual machine placement, these solutions ignore the correlation among other data center resources, such as switches and storage. In this paper, we propose a new embedding solution for data centers that, in addition to virtual machine placement, explicitly considers the relation between switches and links, allows multiple resources of the same request to be mapped to a single physical resource, and reduces resource fragmentation in terms of CPU. Simulations show that our solution results in a high acceptance ratio of VDC requests, improves utilization of the physical substrate, and generates increased revenue for infrastructure providers.
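The flavour of such an embedding heuristic can be sketched with a best-fit-decreasing placement that packs virtual machines by CPU demand to limit fragmentation, and that naturally maps multiple resources of one request onto the same physical host. This is an illustrative baseline only, not the switch- and link-aware algorithm proposed in the paper.

```python
def embed(vms, hosts):
    """Map each VM (name -> CPU demand) onto a host (name -> CPU capacity).
    Best-fit decreasing: place big demands first, each on the host whose
    free capacity it fills most tightly, to limit CPU fragmentation."""
    free = dict(hosts)                       # remaining CPU per host
    placement = {}
    for vm, demand in sorted(vms.items(), key=lambda kv: -kv[1]):
        feasible = [h for h, f in free.items() if f >= demand]
        if not feasible:
            return None                      # request rejected
        host = min(feasible, key=lambda h: free[h] - demand)  # best fit
        placement[vm] = host
        free[host] -= demand
    return placement

# One VDC request against two physical hosts (illustrative figures).
request = {"web": 4, "db": 8, "cache": 2}
hosts = {"h1": 10, "h2": 6}
```

Here "db" and "cache" end up co-located on the same host, showing how consolidating resources of one request onto one physical node keeps the substrate less fragmented and leaves room to accept further VDC requests.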