Journal Article

Classification Framework of MapReduce Scheduling Algorithms

TLDR
This paper presents a comprehensive and structured survey of the MapReduce scheduling algorithms proposed so far, organized by a novel multidimensional classification framework, and identifies various open issues and directions for future research.
Abstract
A MapReduce scheduling algorithm plays a critical role in managing large clusters of hardware nodes and in meeting multiple quality requirements by controlling the order and distribution of user, job, and task execution. This paper presents a comprehensive and structured survey of the scheduling algorithms proposed so far, using a novel multidimensional classification framework. The framework's dimensions are (i) meeting quality requirements, (ii) scheduling entities, and (iii) adapting to dynamic environments; each dimension has its own taxonomy. An empirical evaluation framework for these algorithms is recommended. The survey also identifies various open issues and directions for future research.


Citations
Journal Article

Architecting Time-Critical Big-Data Systems

TL;DR: This paper defines a time-critical big-data system from the point of view of requirements, analyzes the specific characteristics of some popular big-data applications, proposes an architecture, and offers initial performance patterns that connect application costs with infrastructure performance.
Journal Article

MapReduce Scheduling for Deadline-Constrained Jobs in Heterogeneous Cloud Computing Systems

TL;DR: Bipartite graph modelling is used to propose a new MapReduce scheduler called BGMRS, which can obtain the optimal solution of the deadline-constrained scheduling problem by transforming it into a well-known graph problem: minimum weighted bipartite matching.
Journal Article

A data locality based scheduler to enhance MapReduce performance in heterogeneous environments

TL;DR: The experimental results show that the proposed scheduler enhances MapReduce performance in heterogeneous environments and improves data locality across different parameters compared with the Hadoop default scheduler, the Matchmaking scheduler, and the Delay scheduler.
Journal Article

MapReduce scheduling algorithms: a review

TL;DR: This study analyzes scheduling in MapReduce along two aspects, taxonomy and performance evaluation; it can serve as a benchmark for expert researchers proposing a novel MapReduce scheduling algorithm and as a starting point for novice researchers.
References
Journal Article

MapReduce: simplified data processing on large clusters

TL;DR: This paper presents MapReduce, a programming model and an associated implementation for processing and generating large data sets; the implementation runs on a large cluster of commodity machines and is highly scalable.
Journal Article

MapReduce: simplified data processing on large clusters

TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
Book

Hadoop: The Definitive Guide

Tom White
TL;DR: This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters.
Proceedings Article

Apache Hadoop YARN: yet another resource negotiator

TL;DR: This paper summarizes the design, development, and current state of deployment of YARN, the next generation of Hadoop's compute platform, which decouples the programming model from the resource management infrastructure and delegates many scheduling functions to per-application components.
Book

Scheduling Algorithms

Peter Brucker
TL;DR: Besides scheduling problems for single and parallel machines and shop scheduling problems, this book covers advanced models involving due dates, sequence-dependent changeover times, and batching.