Supporting GPU sharing in cloud environments with a transparent runtime consolidation framework

This paper presents a framework that lets applications executing within virtual machines transparently share one or more GPUs. Even when contention is high, the consolidation algorithm is effective in improving throughput, and the runtime overhead of the framework is low.

Abstract:

Driven by the emergence of GPUs as a major player in high performance computing and the rapidly growing popularity of cloud environments, GPU instances are now being offered by cloud providers. The use of GPUs in a cloud environment, however, is still at an early stage, and the challenge of making the GPU a true shared resource in the cloud has not yet been addressed. This paper presents a framework to enable applications executing within virtual machines to transparently share one or more GPUs. Our contributions are twofold: we extend open source GPU virtualization software to include efficient GPU sharing, and we propose solutions to the conceptual problem of GPU kernel consolidation. In particular, we introduce a method for computing the affinity score between two or more kernels, which provides an indication of potential performance improvements upon kernel consolidation. In addition, we explore molding as a means to achieve efficient GPU sharing even for kernels with high or conflicting resource requirements. We use these concepts to develop an algorithm to efficiently map a set of kernels on a pair of GPUs. We extensively evaluate our framework using eight popular GPU kernels and two Fermi GPUs. We find that even when contention is high our consolidation algorithm is effective in improving the throughput, and that the runtime overhead of our framework is low.
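
The abstract does not give the affinity formula or the mapping algorithm, so the following is only a rough sketch of the general idea, not the paper's method. It assumes a hypothetical affinity score that drops as the combined compute or memory-bandwidth demand of co-located kernels exceeds what one GPU can supply, and it assumes an exhaustive search over two-way splits in place of the paper's algorithm. The kernel names, the resource fractions, and the functions `affinity` and `map_to_two_gpus` are all illustrative assumptions. Molding is not modeled.

```python
from itertools import combinations

# Hypothetical kernel profiles: fraction of one GPU's compute and memory
# bandwidth each kernel uses when run alone (illustrative, not from the paper).
KERNELS = {
    "A": {"compute": 0.9, "memory": 0.2},
    "B": {"compute": 0.3, "memory": 0.8},
    "C": {"compute": 0.8, "memory": 0.3},
    "D": {"compute": 0.2, "memory": 0.7},
}


def affinity(group, kernels=KERNELS):
    """Assumed score in (0, 1]: 1.0 when the group's combined demand fits on
    one GPU, lower as the most oversubscribed resource grows."""
    score = 1.0
    for resource in ("compute", "memory"):
        demand = sum(kernels[name][resource] for name in group)
        score = min(score, 1.0 / max(demand, 1.0))
    return score


def map_to_two_gpus(kernels=KERNELS):
    """Try every split of the kernels across two GPUs and keep the split
    whose worse-off GPU has the highest assumed affinity score."""
    names = sorted(kernels)
    best = None
    for size in range(len(names) + 1):
        for gpu0 in combinations(names, size):
            gpu1 = tuple(n for n in names if n not in gpu0)
            quality = min(affinity(gpu0, kernels), affinity(gpu1, kernels))
            if best is None or quality > best[0]:
                best = (quality, gpu0, gpu1)
    return best


if __name__ == "__main__":
    quality, gpu0, gpu1 = map_to_two_gpus()
    print(f"GPU 0: {gpu0}, GPU 1: {gpu1}, worst-case affinity: {quality:.3f}")
```

With these made-up profiles, the sketch pairs each compute-heavy kernel with a memory-heavy one (A with D, B with C) rather than placing two compute-heavy kernels on the same GPU. Exhaustive search is only practical for a handful of kernels; it is used here to keep the example short.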

Citations

Proceedings Article

Planaria: Dynamic Architecture Fission for Spatial Multi-Tenant Acceleration of Deep Neural Networks

Soroush Ghodrati, Byung Hoon Ahn, Joon Kyung Kim, Sean Kinzer, Brahmendra Reddy Yatham, Navateja Alla, Hardik Sharma, Mohammad Alian, Eiman Ebrahimi, Nam Sung Kim, Cliff Young, Hadi Esmaeilzadeh (University of California, San Diego; University of Kansas; Nvidia; University of Illinois at Urbana–Champaign; Google)

TL;DR: This paper defines Planaria, a microarchitectural capability that can dynamically fission (break) into multiple smaller yet full-fledged DNN engines at runtime. This enables spatially co-locating multiple DNN inference services on the same hardware, offering simultaneous multi-tenant DNN acceleration.

Journal Article

VGRIS: Virtualized GPU Resource Isolation and Scheduling in Cloud Gaming

Zhengwei Qi, Jianguo Yao, Chao Zhang, Miao Yu, Zhizhou Yang, Haibing Guan (Shanghai Jiao Tong University)

15 Jul 2014, ACM Transactions on Architecture and Cod...

TL;DR: VGRIS, a resource management framework for virtualized GPU resource isolation and scheduling in cloud gaming, is proposed, and experimental results show that VGRIS can effectively schedule GPU resources among various workloads.

Proceedings Article

A virtual memory based runtime to support multi-tenancy in clusters with GPUs

Michela Becchi, Kittisak Sajjapongse, Ian Graves, Adam Procter, Vignesh T. Ravi, Srimat Chakradhar (University of Missouri; Ohio State University; Princeton University)

TL;DR: This paper proposes a runtime system that provides abstraction and sharing of GPUs while allowing isolation of concurrent applications. A central component of this runtime is a memory manager that provides a virtual memory abstraction to the applications.

Book

Enabling Real-Time Mobile Cloud Computing through Emerging Technologies

Tolga Soyata

Proceedings Article

pvFPGA: accessing an FPGA-based hardware accelerator in a paravirtualized environment

Wei Wang, Miodrag Bolic, Jonathan Parri (University of Ottawa)

TL;DR: This paper presents pvFPGA, described as the first system design solution for virtualizing an FPGA-based hardware accelerator on the x86 platform, in which each unprivileged domain allocates a shared data pool for both user-kernel and inter-domain data transfer.

References

Accelerator: using data parallelism to program GPUs for general-purpose uses

David Tarditi, Sidd Puri, Jose M. Oglesby (Microsoft)

TL;DR: This work describes Accelerator, a system that uses data parallelism to program GPUs for general-purpose uses, and compares the performance of Accelerator versions of the benchmarks against hand-written pixel shaders.

IEEE Transactions on Computers

Related Papers (5)

GViM: GPU-accelerated virtual machines

A GPGPU transparent virtualization component for high performance computing clouds

rCUDA: Reducing the number of GPU-based accelerators in high performance clusters

Rodinia: A benchmark suite for heterogeneous computing

vCUDA: GPU-Accelerated High-Performance Computing in Virtual Machines