Supporting GPU sharing in cloud environments with a transparent runtime consolidation framework

This paper presents a framework that lets applications running in virtual machines transparently share one or more GPUs. The consolidation algorithm improves throughput even when contention is high, and the runtime overhead of the framework is low.

Abstract:

Driven by the emergence of GPUs as a major player in high performance computing and the rapidly growing popularity of cloud environments, GPU instances are now being offered by cloud providers. The use of GPUs in a cloud environment, however, is still in its initial stages, and the challenge of making the GPU a true shared resource in the cloud has not yet been addressed. This paper presents a framework to enable applications executing within virtual machines to transparently share one or more GPUs. Our contributions are twofold: we extend an open source GPU virtualization software to include efficient GPU sharing, and we propose solutions to the conceptual problem of GPU kernel consolidation. In particular, we introduce a method for computing the affinity score between two or more kernels, which provides an indication of potential performance improvements upon kernel consolidation. In addition, we explore molding as a means to achieve efficient GPU sharing even for kernels with high or conflicting resource requirements. We use these concepts to develop an algorithm to efficiently map a set of kernels on a pair of GPUs. We extensively evaluate our framework using eight popular GPU kernels and two Fermi GPUs. We find that even when contention is high our consolidation algorithm is effective in improving the throughput, and that the runtime overhead of our framework is low.
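The abstract does not give the affinity formula or the mapping algorithm, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes each kernel can be summarized by two hypothetical numbers (the fraction of compute throughput and of memory bandwidth it uses), defines a made-up affinity score as the headroom left on the most contended resource, and greedily places kernels on two GPUs. The `Kernel` fields, the `affinity` definition, and `map_to_two_gpus` are all assumptions introduced here; molding is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kernel:
    name: str
    compute: float    # hypothetical: fraction of compute throughput used (0..1)
    bandwidth: float  # hypothetical: fraction of memory bandwidth used (0..1)


def affinity(kernels):
    """Illustrative score: headroom on the most contended resource.

    Positive values mean the group fits without oversubscribing either
    resource; negative values mean at least one resource is oversubscribed.
    """
    compute = sum(k.compute for k in kernels)
    bandwidth = sum(k.bandwidth for k in kernels)
    return 1.0 - max(compute, bandwidth)


def map_to_two_gpus(kernels):
    """Greedy placement: most demanding kernels first, each onto the GPU
    where the resulting group keeps the highest illustrative affinity."""
    gpus = ([], [])
    by_demand = sorted(
        kernels, key=lambda k: max(k.compute, k.bandwidth), reverse=True
    )
    for k in by_demand:
        best = max(gpus, key=lambda group: affinity(group + [k]))
        best.append(k)
    return gpus


if __name__ == "__main__":
    workload = [
        Kernel("A", compute=0.8, bandwidth=0.1),
        Kernel("B", compute=0.1, bandwidth=0.8),
        Kernel("C", compute=0.7, bandwidth=0.2),
        Kernel("D", compute=0.2, bandwidth=0.7),
    ]
    for i, group in enumerate(map_to_two_gpus(workload)):
        print(f"GPU {i}:", [k.name for k in group])
```

Under these assumptions, a compute-bound kernel tends to be paired with a bandwidth-bound one, which is the intuition behind consolidating kernels with complementary resource demands. A real score would have to be derived from measured kernel characteristics.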

Citations


Proceedings Article

GPUShare: Fair-Sharing Middleware for GPU Clouds

Anshuman Goswami, Jeffrey Young, Karsten Schwan, Naila Farooqui, Ada Gavrilovska, Matthew Wolf, Greg Eisenhauer (Georgia Institute of Technology)


TL;DR: This work presents GPUShare, a software-based mechanism that can yield a kernel before all of its threads have run, giving finer control over the time slice for which the GPU is allocated to a process and improving fair GPU sharing across tenants.


Book Chapter

Hardware and Software Aspects of VM-Based Mobile-Cloud Offloading

Yang Song, Haoliang Wang, Tolga Soyata (University of Rochester, George Mason University)


Proceedings Article

GMOD: a dynamic GPU memory overflow detector

Bang Di, Jianhua Sun, Dong Li, Hao Chen, Zhe Quan (Hunan University, University of California)


TL;DR: GMOD is a software system that detects GPU buffer overflows at runtime. It uses a canary-based design to perform always-on monitoring of dynamically allocated buffers with small runtime overhead.


Journal Article

Maximizing the GPU resource usage by reordering concurrent kernels submission

Rommel Anatoli Quintanilla Cruz, Cristiana Bentes, Bernardo Breder, Eduardo Charles Vasconcellos, Esteban Clua, Pablo Carvalho, Lúcia Maria de A. Drummond (Federal Fluminense University, Rio de Janeiro State University)

- 25 Sep 2019 -

Concurrency and Computation: Practice an...


TL;DR: This work proposes a novel optimization approach that reorders kernel invocations to maximize resource utilization, improving the average turnaround time and system throughput.


Proceedings Article

Understanding the virtualization "Tax" of scale-out pass-through GPUs in GaaS clouds: An empirical study

Ming Liu, Tao Li, Neo Jia, Andy Currid, Vladimir Troy (University of Florida, Nvidia)


TL;DR: This work makes the first attempt to characterize pass-through GPUs running in different consolidation scenarios and uncover the root causes of virtualization overheads, which can slow down the GPU command generation rate.



David Tarditi, Sidd Puri, Jose M. Oglesby (Microsoft)


TL;DR: This work describes Accelerator, a system that uses data parallelism to program GPUs for general-purpose uses instead of C, and compares the performance of Accelerator versions of the benchmarks against hand-written pixel shaders.


IEEE Transactions on Computers



References

The cost of doing science on the cloud: the Montage example

Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping

Automated control of multiple virtualized resources

Cost-benefit analysis of Cloud Computing versus desktop grids

Accelerator: using data parallelism to program GPUs for general-purpose uses

Related Papers (5)

GViM: GPU-accelerated virtual machines

A GPGPU transparent virtualization component for high performance computing clouds

rCUDA: Reducing the number of GPU-based accelerators in high performance clusters

Rodinia: A benchmark suite for heterogeneous computing

vCUDA: GPU-Accelerated High-Performance Computing in Virtual Machines