Classification Framework of MapReduce Scheduling Algorithms

doi:10.1145/2693315

Journal Article

Nidhi Tiwari, +3 more

- 16 Apr 2015 -

ACM Computing Surveys

- Vol. 47, Iss: 3, pp 49

TLDR

This comprehensive, structured survey of the MapReduce scheduling algorithms proposed so far uses a novel multidimensional classification framework and identifies open issues and directions for future research.

Abstract:

A MapReduce scheduling algorithm plays a critical role in managing large clusters of hardware nodes and in meeting multiple quality requirements: it controls the order in which users, jobs, and tasks are executed and how that execution is distributed across the cluster. This article presents a comprehensive and structured survey of the scheduling algorithms proposed so far, using a novel multidimensional classification framework. The dimensions are (i) meeting quality requirements, (ii) scheduling entities, and (iii) adapting to dynamic environments; each dimension has its own taxonomy. An empirical evaluation framework for these algorithms is recommended. The survey also identifies various open issues and directions for future research.

Citations

Book

コンピュータ・サイエンス : ACM computing surveys

共立出版株式会社

Journal Article

Architecting Time-Critical Big-Data Systems

Pablo Basanta-Val, +4 more

- 01 Dec 2016 -

IEEE Transactions on Big Data

TL;DR: This paper defines a time-critical big-data system from the point of view of requirements, analyzes the specific characteristics of some popular big-data applications, proposes an architecture, and offers initial performance patterns that connect application costs with infrastructure performance.

Journal Article

MapReduce Scheduling for Deadline-Constrained Jobs in Heterogeneous Cloud Computing Systems

Chien-Hung Chen, +2 more

- 01 Jan 2018 -

IEEE Transactions on Cloud Computing

TL;DR: Bipartite graph modelling is used to propose a new MapReduce scheduler, BGMRS, which obtains the optimal solution of the deadline-constrained scheduling problem by transforming it into a well-known graph problem: minimum weighted bipartite matching.
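
To make the graph problem named above concrete, here is a minimal sketch of minimum weighted bipartite matching on a tiny square cost matrix. It is not the BGMRS formulation: the function name, the reading of rows as tasks and columns as slots, and the cost values are all hypothetical, and the brute-force search over permutations stands in for a polynomial-time method such as the Hungarian algorithm.

```python
from itertools import permutations


def min_weight_matching(cost):
    """Return (total_cost, assignment) for a square cost matrix.

    assignment[i] is the column matched to row i. Every row is matched
    to a distinct column, so the result is a perfect matching of
    minimum total weight. Brute force: only suitable for tiny inputs.
    """
    n = len(cost)
    best_total, best_assignment = None, None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if best_total is None or total < best_total:
            best_total, best_assignment = total, perm
    return best_total, list(best_assignment)


# Hypothetical data: cost[i][j] is an assumed cost of running task i on slot j.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
total, assignment = min_weight_matching(cost)
print(total, assignment)  # 5 [1, 0, 2]
```

In this toy instance, task 0 goes to slot 1, task 1 to slot 0, and task 2 to slot 2, for a total cost of 5. How a real scheduler would derive edge weights from deadlines and node heterogeneity is not covered by the summary above and is left out here.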

Journal Article

A data locality based scheduler to enhance MapReduce performance in heterogeneous environments

Nenavath Srinivas Naik, +3 more

- 01 Jan 2019 -

Future Generation Computer Systems

TL;DR: Experimental results show that the proposed scheduler enhances MapReduce performance in heterogeneous environments and improves data locality across different parameters, compared with the Hadoop default scheduler, the Matchmaking scheduler, and the Delay scheduler.

Journal Article

MapReduce scheduling algorithms: a review

Ibrahim Abaker Targio Hashem, +11 more

- 01 Jul 2020 -

The Journal of Supercomputing

TL;DR: This study analyzes scheduling in MapReduce from two aspects, taxonomy and performance evaluation. It can serve as a benchmark for expert researchers proposing a novel MapReduce scheduling algorithm and as a starting point for novice researchers.

References

Journal Article

MapReduce: simplified data processing on large clusters

Jeffrey Dean, +1 more

TL;DR: This paper presents MapReduce, a programming model and an associated implementation for processing and generating large data sets; the implementation runs on a large cluster of commodity machines and is highly scalable.
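
As an illustration of the programming model summarized above, here is a minimal single-process word-count sketch with a map step, a grouping step, and a reduce step. The function names are hypothetical, and nothing here reflects the actual MapReduce or Hadoop API; there is no parallelism, distribution, or fault tolerance.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Group intermediate values by key, as the runtime would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def run_job(documents):
    """Run map, shuffle, and reduce sequentially in one process."""
    intermediate = []
    for doc in documents:
        intermediate.extend(map_phase(doc))
    grouped = shuffle(intermediate)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


counts = run_job(["the quick fox", "the lazy dog", "The end"])
print(counts["the"])  # 3
```

In a real cluster, each call to `map_phase` and `reduce_phase` would be a separate task placed on a node by the scheduler, which is the decision the surveyed algorithms address.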

Journal Article

MapReduce: simplified data processing on large clusters

Jeffrey Dean, +1 more

- 01 Jan 2008 -

Communications of The ACM

TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.

Book

Hadoop: The Definitive Guide

Tom White

TL;DR: This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters.

Proceedings Article

Apache Hadoop YARN: yet another resource negotiator

Vinod Kumar Vavilapalli, +15 more

TL;DR: This paper summarizes the design, development, and current state of deployment of YARN, the next generation of Hadoop's compute platform, which decouples the programming model from the resource management infrastructure and delegates many scheduling functions to per-application components.

Book

Scheduling Algorithms

Peter Brucker

TL;DR: Besides scheduling problems for single and parallel machines and shop scheduling problems, this book covers advanced models involving due dates, sequence-dependent changeover times, and batching.

Related Papers (5)

Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling

MapReduce: simplified data processing on large clusters

Improving MapReduce performance in heterogeneous environments

Improving MapReduce Performance Using Smart Speculative Execution Strategy

Hadoop: The Definitive Guide