Core Algorithms of the Maui Scheduler

doi:10.1007/3-540-45540-X_6

Book Chapter

- pp 87-102

TLDR

This paper focuses on three areas of Maui scheduling: backfill, job prioritization, and fairshare. It briefly discusses the goals of each component, the issues and corresponding design decisions, and the algorithms enabling the Maui policies.

Abstract:

The Maui scheduler has received wide acceptance in the HPC community as a highly configurable and effective batch scheduler. It is currently in use on hundreds of SP, O2K, and Linux cluster systems throughout the world, including a high percentage of the largest and most cutting-edge research sites. While the algorithms used within Maui have proven themselves effective, nothing has been published to date documenting these algorithms nor the configurable aspects they support. This paper focuses on three areas of Maui scheduling, specifically, backfill, job prioritization, and fairshare. It briefly discusses the goals of each component, the issues and corresponding design decisions, and the algorithms enabling the Maui policies. It also covers the configurable aspects of each algorithm and the impact of various parameter selections.

Citations

Proceedings Article

Apache Hadoop YARN: yet another resource negotiator

Vinod Kumar Vavilapalli, +15 more

TL;DR: Summarizes the design, development, and current state of deployment of YARN, the next generation of Hadoop's compute platform, which decouples the programming model from the resource management infrastructure and delegates many scheduling functions to per-application components.

Journal Article

Distributed computing in practice: the Condor experience

Douglas Thain, +2 more

- 01 Feb 2005 -

Concurrency and Computation: Practice an...

TL;DR: Presents the history and philosophy of the Condor project and describes how it has interacted with other projects and evolved along with the field of distributed computing.

Book Chapter

SLURM: Simple Linux Utility for Resource Management

Andy Yoo, +2 more

TL;DR: This paper describes a new cluster resource management system called Simple Linux Utility for Resource Management (SLURM), which is designed to be flexible and fault-tolerant and can be ported to other clusters of different size and architecture with minimal effort.

Proceedings Article

Large-scale cluster management at Google with Borg

Abhishek Verma, +5 more

TL;DR: A summary of the Borg system architecture and features, important design decisions, a quantitative analysis of some of its policy decisions, and a qualitative examination of lessons learned from a decade of operational experience with it are presented.

Proceedings Article

Omega: flexible, scalable schedulers for large compute clusters

Malte Schwarzkopf, +3 more

TL;DR: This work presents a novel approach that uses parallelism, shared state, and lock-free optimistic concurrency control to address the increasing scale and the need for rapid response to changing requirements that challenge monolithic cluster scheduler architectures.

References

Book Chapter

Job Scheduling Under the Portable Batch System

Robert L. Henderson

TL;DR: The typical batch queuing system schedules jobs for execution by a set of queue controls, which limits the set of scheduling policies available to a site.

Proceedings Article

Utilization and predictability in scheduling the IBM SP2 with backfilling

Dror G. Feitelson, +1 more

TL;DR: A more conservative approach is shown, in which small jobs move ahead only if they do not delay any job in the queue, which produces essentially the same benefits in terms of utilization as the EASY scheduler.

Book Chapter

The EASY - LoadLeveler API Project

Joseph F. Skovira, +3 more

TL;DR: A LoadLeveler API is developed that allows external schedulers like EASY to control the starting and stopping of jobs through LoadLeveler, and helps address Cornell's difficulties with the current scheduling algorithm.

Book Chapter

The Performance Impact of Advance Reservation Meta-scheduling

Quinn Snell, +3 more

TL;DR: This research quantifies the impact of advance reservations, outlines the algorithms that must be used to schedule metajobs, and indicates that advance reservations can improve the response time for metajobs while not significantly impacting overall system performance.

Job Scheduling Strategies for Parallel Processing

Alexandru Iosup, +6 more

Related Papers (5)

Utilization, predictability, workloads, and user runtime estimates in scheduling the IBM SP2 with backfilling

The ANL/IBM SP Scheduling System

SLURM: Simple Linux Utility for Resource Management

The Grid 2: Blueprint for a New Computing Infrastructure

Theory and Practice in Parallel Job Scheduling