Open Access · Proceedings Article

On Using Pattern Matching Algorithms in MapReduce Applications

TLDR
This paper studies the CPU utilization time patterns of several MapReduce applications to evaluate a hypothesis about tweaking system parameters when executing similar applications; the results show that the approach is effective on pseudo-distributed MapReduce platforms.
Abstract
In this paper, we study CPU utilization time patterns of several MapReduce applications. After the running patterns of several applications are extracted, they are saved in a reference database to be used later to tweak system parameters so that unknown applications can be executed efficiently in the future. To achieve this goal, the CPU utilization patterns of new applications are compared with the known ones in the reference database to find or predict their most probable execution patterns. Because the patterns differ in length, Dynamic Time Warping (DTW) is used for the comparison; a correlation analysis is then applied to the DTW outcomes to produce feasible similarity patterns. Three real applications (Word Count, Exim Mainlog parsing, and Terasort) are used to evaluate our hypothesis about tweaking system parameters when executing similar applications. The results were very promising and showed the effectiveness of our approach on pseudo-distributed MapReduce platforms.
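The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the absolute-difference cost, the use of Pearson correlation on the warped series, and the function names `dtw`, `pearson`, and `similarity` are all assumptions made for illustration, and the sample traces are invented.

```python
import math


def dtw(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute-difference cost.

    Returns (total_cost, path), where path is a list of (i, j) index
    pairs aligning a[i] with b[j].
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = math.sqrt(sum((p - mx) ** 2 for p in x))
    sy = math.sqrt(sum((q - my) ** 2 for q in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def similarity(a, b):
    """Assumed scoring: correlate the two series after DTW alignment."""
    _, path = dtw(a, b)
    warped_a = [a[i] for i, _ in path]
    warped_b = [b[j] for _, j in path]
    return pearson(warped_a, warped_b)


# Hypothetical CPU utilization traces (percent) of different lengths.
reference = [10, 40, 80, 80, 30, 10]
candidate = [10, 40, 40, 80, 30, 30, 10]
score = similarity(reference, candidate)
```

Under these assumptions, a new application would be matched to the reference pattern with the highest score. The quadratic cost of this DTW variant is acceptable for short traces; longer traces may call for an approximation.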

Citations
Proceedings Article

Automated design of self-adaptive software with control-theoretical formal guarantees

TL;DR: This paper proposes a broad scope methodology for automatically constructing both an approximate dynamic model of a software system and a suitable controller for managing its non-functional requirements, and provides formal guarantees concerning the system's dynamic behavior.
Journal Article

Automated design of self-adaptive software with control-theoretical formal guarantees

TL;DR: In this paper, a broad scope methodology for automatically constructing both an approximate dynamic model of a software system and a suitable controller for managing its non-functional requirements is proposed, which provides formal guarantees concerning the system's dynamic behavior by keeping its model continuously updated to compensate for changes in the execution environment and effects of the initial approximation.
Proceedings Article

Automated multi-objective control for self-adaptive software design

TL;DR: This paper develops an automated control synthesis methodology that takes, as input, the configurable software components (or knobs) and the goals to be achieved, and automatically constructs a control system that manages the specified knobs and guarantees the goals are met.
Journal Article

A study on using uncertain time series matching algorithms for MapReduce applications

TL;DR: In this paper, the authors study CPU utilization time patterns of several MapReduce applications and save the patterns along with their statistical information in a reference database to be later used to tweak system parameters to efficiently execute future unknown applications.
Proceedings Article

A Framework for Implementing Asynchronous Replication Scheme in Utility-Based Computing Environment

TL;DR: This paper proposes an intelligent framework that can reinforce an effective resource selection scheme by taking into account the components that affect performance, such as the resource/data freshness of the replicated system in such an environment.
References
Journal Article

MapReduce: simplified data processing on large clusters

TL;DR: This paper presents the implementation of MapReduce, a programming model and an associated implementation for processing and generating large data sets that runs on a large cluster of commodity machines and is highly scalable.
Proceedings Article

Improving MapReduce performance in heterogeneous environments

TL;DR: A new scheduling algorithm, Longest Approximate Time to End (LATE), that is highly robust to heterogeneity and can improve Hadoop response times by a factor of 2 in clusters of 200 virtual machines on EC2.
Journal Article

Toward accurate dynamic time warping in linear time and space

TL;DR: This paper introduces FastDTW, an approximation of DTW that has a linear time and space complexity and shows a large improvement in accuracy over existing methods.

Job Scheduling for Multi-User MapReduce Clusters

TL;DR: Two simple techniques, delay scheduling and copy-compute splitting, are developed which improve throughput and response times in multi-user MapReduce workloads by factors of 2 to 10 and can also raise throughput in a single-user, FIFO workload by a factor of 2.
Journal Article

On the energy (in)efficiency of Hadoop clusters

TL;DR: It is found that running Hadoop clusters in fractional configurations can save between 9% and 50% of energy consumption, and that there is a tradeoff between performance and energy consumption.