Proceedings Article

Improving Energy Efficiency of IO-Intensive MapReduce Jobs

TLDR
This paper investigates, through a thorough empirical study, which power management setting can improve the energy efficiency of IO-intensive MapReduce jobs, and indicates that a constant CPU frequency can reduce the energy consumption of an IO-intensive job while improving its performance.
Abstract
MapReduce is a popular data-parallel programming model for varied analysis of huge volumes of data. While a multicore, many-CPU HPC infrastructure can be used to improve the parallelism of MapReduce tasks, IO-bandwidth limitations may make it ineffective. IO-intensive activities are essential in any MapReduce cluster. In HPC nodes, IO-intensive jobs get queued at the IO resources while the CPUs remain underutilized, resulting in poor performance, high power consumption and, thus, energy inefficiency. In this paper, we investigate which power management setting can be used to improve the energy efficiency of IO-intensive MapReduce jobs by performing a thorough empirical study. Our analysis indicates that a constant CPU frequency can reduce the energy consumption of an IO-intensive job while improving its performance. Consequently, we build a set of regression models to predict the energy consumption of IO-intensive jobs at a given CPU frequency for a given input data volume. We obtained the same set of models, with different coefficients, for two different types of IO-intensive jobs, which substantiates the suitability of the identified models. These models predict their respective outcomes with 80% accuracy for 80% of the new test cases.
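The regression idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the form of the models, so this assumes a simple linear relation between input data volume and energy at one fixed CPU frequency. The function names `fit_line` and `fraction_within`, the 20% error tolerance used as a reading of "80% accuracy", and all numbers are hypothetical and synthetic; none are taken from the paper.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def fraction_within(a, b, xs, ys, tolerance=0.2):
    """Share of cases whose relative prediction error is <= tolerance."""
    hits = sum(abs((a + b * x) - y) / abs(y) <= tolerance
               for x, y in zip(xs, ys))
    return hits / len(xs)


# Synthetic, made-up measurements at one fixed CPU frequency (assumption):
# input data volume in GB and energy in joules.
train_gb = [1.0, 2.0, 4.0, 8.0]
train_joules = [150.0, 250.0, 450.0, 850.0]
a, b = fit_line(train_gb, train_joules)

# Synthetic "new test cases", also made up for illustration.
test_gb = [3.0, 5.0, 6.0, 10.0, 12.0]
test_joules = [340.0, 560.0, 640.0, 1400.0, 1260.0]
share = fraction_within(a, b, test_gb, test_joules)

print(f"intercept={a:.1f} J, slope={b:.1f} J/GB, within 20%: {share:.0%}")
```

Fitting one such model per CPU frequency, or adding frequency as a second predictor, would be a natural extension, but which of these the authors used is not stated in the abstract.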


Citations
Journal Article

A heuristic method towards deadline-aware energy-efficient mapreduce scheduling problem in Hadoop YARN

TL;DR: This paper considers a deadline-aware energy-efficient MR scheduling problem in the Hadoop YARN framework, and proposes a heuristic method which considerably minimizes the energy consumption for all benchmarks against the custom-made makespan minimizing scheme which does not consider energy-saving criteria.
Proceedings Article

A review of big data environment and its related technologies

TL;DR: This paper reviews big data and its background, and examines several representative related technologies, such as Hadoop, Data Center, Cloud Computing, and Internet of Things (IoT).
Journal Article

Optimizing MapReduce for energy efficiency

TL;DR: This paper presents a Configurator based on performance and energy models to improve the energy efficiency of MapReduce systems, and is the first to model it and design a configurator to optimize these parameter settings for maximizing the energy efficiency of MapReduce systems.
References
Journal Article

MapReduce: simplified data processing on large clusters

TL;DR: This paper presents the implementation of MapReduce, a programming model and an associated implementation for processing and generating large data sets that runs on a large cluster of commodity machines and is highly scalable.
Journal Article

MapReduce: simplified data processing on large clusters

TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
Book

Regression Analysis by Example

TL;DR: The book covers simple linear regression; multiple linear regression; regression diagnostics (detection of model violations); qualitative variables as predictors; transformation of variables; weighted least squares; the problem of correlated errors; analysis of collinear data; biased estimation of regression coefficients; variable selection procedures; and logistic regression, followed by an appendix and references.
Journal Article

Regression Analysis by Example

Terri L. Moore
- 01 May 2001 - 
TL;DR: This book serves well as an introduction to the specific area of methods for detecting and correcting model violations in the standard linear regression model and provides a general overview of transformations of variables and focuses on three traditional situations where transformations can be applied.