Scheduling multi-tenant cloud workloads on accelerator-based systems

doi:10.1109/SC.2014.47

Proceedings Article


Dipanjan Sengupta, +3 more

- pp 513-524


TLDR

The Strings scheduler realizes the vision of a dynamic model where GPUs are treated as first class schedulable entities by decomposing the GPU scheduling problem into a combination of load balancing and per-device scheduling.

Abstract:

Accelerator-based systems are making rapid inroads into becoming platforms of choice for high-end cloud services. There is a need, therefore, to move from the current model, in which high-performance applications explicitly and programmatically select the GPU devices on which to run, to a dynamic model where GPUs are treated as first-class schedulable entities. The Strings scheduler realizes this vision by decomposing the GPU scheduling problem into a combination of load balancing and per-device scheduling: (i) device-level scheduling efficiently uses all of a GPU's hardware resources, including its computational and data movement engines, and (ii) load balancing goes beyond obtaining high throughput to ensure fairness by prioritizing GPU requests that have attained the least service. With these methods, Strings achieves improvements in system throughput and fairness of up to 8.70x and 13%, respectively, compared to the CUDA runtime.
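The load-balancing idea in the abstract (prioritize the request that has attained the least service, then place it on a lightly loaded device) can be illustrated with a toy model. The following is a minimal sketch under stated assumptions, not the Strings implementation: the names `Tenant`, `pick_tenant`, `pick_device`, and `schedule` are hypothetical, request costs are assumed to be known in advance, and per-device scheduling of compute and data movement engines is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Tenant:
    """Hypothetical tenant record; not taken from the paper."""
    name: str
    attained: float = 0.0                        # GPU time received so far
    pending: list = field(default_factory=list)  # costs of queued requests


def pick_tenant(tenants):
    """Fairness step: among tenants with queued work, pick least attained service."""
    ready = [t for t in tenants if t.pending]
    return min(ready, key=lambda t: t.attained) if ready else None


def pick_device(device_load):
    """Load-balancing step: pick the device with the least outstanding work."""
    return min(range(len(device_load)), key=lambda d: device_load[d])


def schedule(tenants, num_devices):
    """Dispatch all queued requests; return (tenant, device, cost) tuples in order."""
    device_load = [0.0] * num_devices
    order = []
    while (tenant := pick_tenant(tenants)) is not None:
        device = pick_device(device_load)
        cost = tenant.pending.pop(0)   # assumed-known request cost
        tenant.attained += cost        # charge the tenant for the service
        device_load[device] += cost    # account for the work on the device
        order.append((tenant.name, device, cost))
    return order


if __name__ == "__main__":
    tenants = [Tenant("A", pending=[4, 4]), Tenant("B", pending=[1, 1, 1])]
    for entry in schedule(tenants, num_devices=2):
        print(entry)
```

In this toy run, tenant B's short requests are dispatched ahead of A's second long request because B has attained less service, which is the fairness behaviour the abstract describes at a high level.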

Citations


Proceedings Article

IVM: a task-based shared memory programming model and runtime system to enable uniform access to CPU-GPU clusters

Kittisak Sajjapongse, +2 more

TL;DR: This work proposes a programming framework called Inter-node Virtual Memory (IVM) that provides the programmer with a uniform view of compute resources and memory spaces within a CPU-GPU cluster, and a mechanism to easily incorporate load balancing within the application.


Proceedings Article

A systems perspective on GPU computing: a tribute to Karsten Schwan

Naila Farooqui

TL;DR: His legacy of key research contributions in general-purpose GPU computing includes novel scheduling and resource management abstractions, runtime specialization, and novel data management techniques to support scalable, distributed GPU frameworks.


Proceedings Article

Improving Application Concurrency on GPUs by Managing Implicit and Explicit Synchronizations

Michael Butler, +2 more

TL;DR: This work designs runtime mechanisms to bypass synchronizations in applications, along with a memory management scheme that can be integrated with these synchronization avoidance mechanisms to improve GPU utilization and system throughput, and integrates these mechanisms into a recently proposed GPU virtualization runtime named Sync-Free GPU (SF-GPU).


Dissertation

Enhancing manageability of execution and data for GPGPU computing

Anshuman Goswami

TL;DR: Symphony focuses on abstracting scheduling control at the granularity of thread blocks of application kernels, and uses the software supervisor approach where a supervisor kernel cooperates with the hardware thread block scheduler to schedule application kernels on the SMs.


Patent

Automatic localization of acceleration in edge computing environments

Bernat Francesc Guim, +4 more

TL;DR: An estimated time (and cost or other identifiable or estimable considerations) to execute a function at each location is identified, and the local acceleration circuitry or the remote acceleration resource is then selected based on the estimated time and other appropriate factors in relation to a service level agreement.


References


Posted Content

A Quantitative Measure Of Fairness And Discrimination For Resource Allocation In Shared Computer Systems

Raj Jain, +2 more

- 24 Sep 1998 -

arXiv: Networking and Internet Architect...

TL;DR: This work proposes a quantitative measure called the Index of Fairness, applicable to any resource sharing or allocation problem and independent of the amount of the resource; its boundedness aids intuitive understanding of the fairness index.


Journal Article

StarPU: a unified platform for task scheduling on heterogeneous multicore architectures

Cédric Augonnet, +3 more

TL;DR: StarPU is a runtime system that provides a high-level unified execution model, giving numerical kernel designers a convenient way to generate parallel tasks over heterogeneous hardware and to easily develop and tune powerful scheduling algorithms.


A Quantitative Measure Of Fairness And Discrimination For Resource Allocation In Shared Computer Systems

Raj Jain, +2 more

TL;DR: The Index of Fairness is a quantitative measure applicable to any resource sharing or allocation problem; it is independent of the amount of the resource, and the fairness index always lies between 0 and 1.


Proceedings Article

On micro-kernel construction

Jochen Liedtke

TL;DR: Contrary to this belief, it is shown and supported by documentary evidence that the inefficiency and inflexibility of current μ-kernels is not inherited from the basic idea but mostly from overloading the kernel and/or from improper implementation.


Journal Article

Symbiotic jobscheduling for a simultaneous multithreaded processor

Allan Snavely, +1 more

TL;DR: It is demonstrated that performance on a hardware multithreaded processor is sensitive to the set of jobs that are coscheduled by the operating system jobscheduler, and that a small sample of the possible schedules is sufficient to identify a good schedule quickly.
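The fairness index from the Jain et al. reference listed above is commonly written as J(x) = (Σ x_i)² / (n · Σ x_i²) for n allocations x_i. The following is a minimal sketch of that formula; the function name `jain_index` and the handling of an all-zero allocation are assumptions made for illustration, and this page does not say how the Strings paper computes its fairness figures.

```python
def jain_index(allocations):
    """Fairness index: (sum x)^2 / (n * sum x^2) for non-negative allocations."""
    xs = list(allocations)
    if not xs:
        raise ValueError("at least one allocation is required")
    total = sum(xs)
    sum_sq = sum(x * x for x in xs)
    if sum_sq == 0:
        # Assumption: if nobody received anything, treat the split as equal.
        return 1.0
    return (total * total) / (len(xs) * sum_sq)


if __name__ == "__main__":
    print(jain_index([1, 1, 1, 1]))  # equal shares
    print(jain_index([1, 0, 0, 0]))  # one tenant receives everything
```

For non-negative allocations the value ranges from 1/n (one user receives everything) to 1 (all users receive equal shares), which is consistent with the 0-to-1 bound stated in the summary and does not depend on the absolute amount of the resource.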


IEEE Transactions on Computer-Aided Desi...

A black-box approach to energy-aware scheduling on integrated CPU-GPU systems

Rajkishore Barik, +4 more


Related Papers (5)

GViM: GPU-accelerated virtual machines

Multi-tenancy on GPGPU-based servers

Supporting GPU sharing in cloud environments with a transparent runtime consolidation framework

Run-Time Scheduling Framework for Event-Driven Applications on a GPU-Based Embedded System

A black-box approach to energy-aware scheduling on integrated CPU-GPU systems