Supporting GPU sharing in cloud environments with a transparent runtime consolidation framework

This paper presents a framework that enables applications executing within virtual machines to transparently share one or more GPUs. Even when contention is high, the consolidation algorithm is effective in improving throughput, and the runtime overhead of the framework is low.

Abstract:

Driven by the emergence of GPUs as a major player in high performance computing and the rapidly growing popularity of cloud environments, GPU instances are now being offered by cloud providers. The use of GPUs in a cloud environment, however, is still at an early stage, and the challenge of making the GPU a true shared resource in the cloud has not yet been addressed. This paper presents a framework to enable applications executing within virtual machines to transparently share one or more GPUs. Our contributions are twofold: we extend an open source GPU virtualization software to include efficient GPU sharing, and we propose solutions to the conceptual problem of GPU kernel consolidation. In particular, we introduce a method for computing the affinity score between two or more kernels, which provides an indication of potential performance improvements upon kernel consolidation. In addition, we explore molding as a means to achieve efficient GPU sharing even for kernels with high or conflicting resource requirements. We use these concepts to develop an algorithm to efficiently map a set of kernels on a pair of GPUs. We extensively evaluate our framework using eight popular GPU kernels and two Fermi GPUs. We find that even when contention is high, our consolidation algorithm is effective in improving throughput, and that the runtime overhead of our framework is low.

Citations

Posted Content

An Empirical-cum-Statistical Approach to Power-Performance Characterization of Concurrent GPU Kernels

Nilanjan Goswami, Amer Qouneh, Chao Li, Tao Li +3 more

04 Nov 2020

arXiv: Distributed, Parallel, and Cluste...

TL;DR: This work proposes and demonstrates a multi-kernel throughput workload generation framework intended to facilitate aggressive energy and performance management of exascale data centers and to stimulate synergistic power-performance co-optimization of throughput architectures.

Consolidating batch and transactional workloads using dependency structure prioritization

S. Nivethitha, V. S. Shankar Sriram +1 more

TL;DR: The proposed work implements Dependency Structure Prioritization (DSP), which assigns a priority to each job so that resources are used effectively by reducing the number of job migrations and missed-deadline jobs.

Proceedings Article

Algorithms for Preemptive Co-scheduling of Kernels on GPUs

Lionel Eyraud-Dubois, Cristiana Bentes +1 more

University of Bordeaux, Rio de Janeiro State University

TL;DR: This paper proposes a graph-based preemptive co-scheduling algorithm that reduces the number of preemptions in high-competitiveness scenarios on GPUs. However, finding the minimum number of preemptions among all preemptive solutions of optimal makespan is NP-hard.

Proceedings Article

Scaling Software Experiments to the Thousands

Christian Neuhaus, Frank Feinbube, Andreas Polze, Arkady Retik +3 more

University of Potsdam, Microsoft

TL;DR: This paper discusses challenges and solutions for scaling InstantLab to provide experiment infrastructure for thousands of users, along with an approach to automatically managing experiment feedback and grading users based on learning success and successful completion of operating systems and software engineering experiments.

Book Chapter

A Heterogeneous Dynamic Scheduling Minimize Energy—HDSME

Saba Fatima, V. M. Viswanatha +1 more

Visvesvaraya Technological University

TL;DR: This paper implements HDSME (Heterogeneous-Dynamic-Scheduling-Minimize-Energy), a new technique for dynamically scheduling heterogeneous resources over time to minimize the energy and cost of operation.

David Tarditi, Sidd Puri, Jose M. Oglesby +2 more

Microsoft

TL;DR: This work describes Accelerator, a system that uses data parallelism to program GPUs for general-purpose uses instead of C, and compares the performance of Accelerator versions of the benchmarks against hand-written pixel shaders.

IEEE Transactions on Computers

References

The cost of doing science on the cloud: the Montage example

Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping

Automated control of multiple virtualized resources

Cost-benefit analysis of Cloud Computing versus desktop grids

Accelerator: using data parallelism to program GPUs for general-purpose uses

Related Papers (5)

GViM: GPU-accelerated virtual machines

A GPGPU transparent virtualization component for high performance computing clouds

rCUDA: Reducing the number of GPU-based accelerators in high performance clusters

Rodinia: A benchmark suite for heterogeneous computing

vCUDA: GPU-Accelerated High-Performance Computing in Virtual Machines