Open Access Proceedings ArticleDOI

DRL-cloud: deep reinforcement learning-based resource provisioning and task scheduling for cloud service providers

TLDR
DRL-Cloud, a novel Deep Reinforcement Learning (DRL)-based RP and TS system, is presented to minimize energy cost for large-scale CSPs with a very large number of servers that receive enormous numbers of user requests per day.
Abstract
Cloud computing has become an attractive computing paradigm in both academia and industry. Through virtualization technology, Cloud Service Providers (CSPs) that own data centers can structure physical servers into Virtual Machines (VMs) to provide services, resources, and infrastructure to users. Profit-driven CSPs charge users for service access and VM rental, and reduce power consumption and electricity bills to increase their profit margin. The key challenge faced by CSPs is data center energy cost minimization. Prior works proposed various algorithms to reduce energy cost through Resource Provisioning (RP) and/or Task Scheduling (TS), but they either have scalability issues or do not consider TS with task dependencies, a crucial factor for the correct parallel execution of tasks. This paper presents DRL-Cloud, a novel Deep Reinforcement Learning (DRL)-based RP and TS system, to minimize energy cost for large-scale CSPs with a very large number of servers that receive enormous numbers of user requests per day. A deep Q-learning-based two-stage RP-TS processor is designed to automatically generate the best long-term decisions by learning from the changing environment, such as user request patterns and realistic electricity prices. With training techniques such as a target network, experience replay, and exploration and exploitation, the proposed DRL-Cloud achieves remarkably high energy cost efficiency, a low reject rate, and low runtime with fast convergence. Compared with one of the state-of-the-art energy-efficient algorithms, the proposed DRL-Cloud achieves up to 320% energy cost efficiency improvement while maintaining a lower reject rate on average. For an example CSP setup with 5,000 servers and 200,000 tasks, the proposed DRL-Cloud achieves up to 144% runtime reduction compared to a fast round-robin baseline.
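
The abstract names the training techniques (target network, experience replay, and exploration/exploitation) but not how they fit together. The sketch below is a minimal, generic deep Q-learning loop in PyTorch illustrating only those three pieces; the state encoding, discrete action space, and reward are hypothetical placeholders, and this is not the paper's two-stage RP-TS processor.

```python
# Minimal deep Q-learning loop with a target network, experience replay, and
# epsilon-greedy exploration/exploitation -- the three training techniques
# named in the abstract. State/action encoding and reward are placeholders.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.optim as optim

STATE_DIM = 16     # assumed: server utilizations + features of the current task
N_ACTIONS = 8      # assumed: candidate servers/VMs for the current task
GAMMA = 0.99
BATCH_SIZE = 64
SYNC_EVERY = 200   # steps between target-network synchronizations


def make_q_net() -> nn.Module:
    return nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(),
                         nn.Linear(128, N_ACTIONS))


q_net = make_q_net()
target_net = make_q_net()
target_net.load_state_dict(q_net.state_dict())    # start both networks in sync
optimizer = optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=50_000)                      # experience replay buffer


def select_action(state: torch.Tensor, epsilon: float) -> int:
    """Epsilon-greedy: explore with probability epsilon, else exploit Q-values."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(state).argmax().item())


def train_step() -> None:
    """One gradient step on a minibatch sampled uniformly from the replay buffer."""
    if len(replay) < BATCH_SIZE:
        return
    s, a, r, s2, done = map(torch.stack, zip(*random.sample(replay, BATCH_SIZE)))
    q = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():                          # frozen target network stabilizes bootstrapping
        target = r + GAMMA * target_net(s2).max(dim=1).values * (1.0 - done)
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()


# Synthetic rollout with random placeholder transitions, only to show the wiring;
# a real agent would observe cluster state and an energy-cost-related reward.
epsilon = 0.1
for step in range(1, 501):
    s = torch.rand(STATE_DIM)
    a = select_action(s, epsilon)
    r = -torch.rand(1).squeeze(0)                  # placeholder negative cost
    s2 = torch.rand(STATE_DIM)
    replay.append((s, torch.tensor(float(a)), r, s2, torch.tensor(0.0)))
    train_step()
    if step % SYNC_EVERY == 0:
        target_net.load_state_dict(q_net.state_dict())
```

In the paper's setting the reward would presumably reflect the energy cost being minimized, and DRL-Cloud additionally splits decisions across a two-stage RP-TS pipeline; the sketch above covers only the generic DQN machinery the abstract names.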


Citations
Journal ArticleDOI

Deep Reinforcement Learning for Cyber Security.

TL;DR: This article presents a survey of DRL approaches developed for cyber security, including DRL-based security methods for cyber-physical systems, autonomous intrusion detection techniques, and multiagent DRL-based game theory simulations for defense strategies against cyberattacks.
Journal ArticleDOI

A scheduling scheme in the cloud computing environment using deep Q-learning

TL;DR: A novel artificial intelligence algorithm called deep Q-learning task scheduling (DQTS), which combines the advantages of the Q-learning algorithm and a deep neural network, is proposed to solve the problem of handling directed acyclic graph tasks in a cloud computing environment.
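
Both this DQTS summary and the DRL-Cloud abstract above hinge on respecting task dependencies: a task may only be dispatched once all of its predecessors in the DAG have completed. The snippet below is a minimal, generic readiness check for that constraint; the task IDs and dependency encoding are illustrative assumptions, not details from either paper.

```python
# Minimal dependency-aware ready-set computation for DAG task scheduling.
# Task IDs and the dependency encoding are illustrative assumptions.
from typing import Dict, List, Set


def ready_tasks(deps: Dict[str, Set[str]], finished: Set[str]) -> List[str]:
    """Return unfinished tasks whose predecessors have all completed."""
    return [t for t, preds in deps.items()
            if t not in finished and preds <= finished]


# Example: t3 depends on t1 and t2, so it only becomes ready once both finish.
dag = {"t1": set(), "t2": set(), "t3": {"t1", "t2"}}
print(ready_tasks(dag, finished={"t1"}))           # ['t2']
print(ready_tasks(dag, finished={"t1", "t2"}))     # ['t3']
```

A scheduler, DRL-based or otherwise, would then choose placements only among the tasks returned by such a check, which is what prevents a child task from running before its parents.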
Journal ArticleDOI

Dynamic Scheduling for Stochastic Edge-Cloud Computing Environments using A3C learning and Residual Recurrent Neural Networks

TL;DR: In this article, the authors propose an asynchronous advantage actor-critic (A3C)-based real-time scheduler for stochastic edge-cloud environments that allows decentralized learning concurrently across multiple agents.
References
Journal ArticleDOI

Human-level control through deep reinforcement learning

TL;DR: This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Journal ArticleDOI

Mastering the game of Go with deep neural networks and tree search

TL;DR: Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0, the first time that a computer program has defeated a human professional player in the full-sized game of Go.
Posted Content

Playing Atari with Deep Reinforcement Learning

TL;DR: This work presents the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning, which outperforms all previous approaches on six of the games and surpasses a human expert on three of them.
Proceedings ArticleDOI

Dryad: distributed data-parallel programs from sequential building blocks

TL;DR: The Dryad execution engine handles all the difficult problems of creating a large distributed, concurrent application: scheduling the use of computers and their CPUs, recovering from communication or computer failures, and transporting data between vertices.
Book

The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines

TL;DR: The architecture of WSCs, the main factors influencing their design, operation, and cost structure, and the characteristics of their software base are described.