Journal ArticleDOI
Method for testing the fault tolerance of MapReduce frameworks
TLDR
A method to create a set of fault cases, derived from a Petri net (PN), and a framework to automate the execution of these fault cases in a distributed system, providing network reliability enhancements as a byproduct.
About:
This article is published in Computer Networks. The article was published on 2015-07-05. It has received 19 citations to date. The article focuses on the topics: Fault injection & Fault tolerance.
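To make the idea concrete, below is a minimal Python sketch, not the paper's actual tool, of how fault cases could be derived from a Petri net: every transition reachable within a few firings is paired with an injectable fault. The PetriNet class, the derive_fault_cases helper, and the fault names are illustrative assumptions.

```python
# Minimal sketch (not the paper's actual tool): pair each transition firing
# reachable within a few steps of a Petri net with an injectable fault.
# PetriNet, derive_fault_cases, and the fault names are illustrative assumptions.

class PetriNet:
    def __init__(self, transitions, marking):
        # transitions: {name: (consumed_places, produced_places)}
        self.transitions = transitions
        self.marking = dict(marking)                  # place -> token count

    def enabled(self, marking):
        return [t for t, (pre, _) in self.transitions.items()
                if all(marking.get(p, 0) > 0 for p in pre)]

    def fire(self, marking, t):
        pre, post = self.transitions[t]
        m = dict(marking)
        for p in pre:
            m[p] -= 1
        for p in post:
            m[p] = m.get(p, 0) + 1
        return m

def derive_fault_cases(net, faults=("crash_worker", "drop_message"), depth=3):
    """Pair every transition reachable within `depth` firings with each fault."""
    cases, frontier = [], [(net.marking, [])]
    for _ in range(depth):
        next_frontier = []
        for marking, trace in frontier:
            for t in net.enabled(marking):
                for fault in faults:
                    cases.append({"trace": trace + [t], "inject": fault, "at": t})
                next_frontier.append((net.fire(marking, t), trace + [t]))
        frontier = next_frontier
    return cases

# Toy MapReduce-like net: a map task must complete before the reduce task starts.
net = PetriNet(
    transitions={"run_map": (["input"], ["mapped"]),
                 "run_reduce": (["mapped"], ["output"])},
    marking={"input": 1},
)
print(len(derive_fault_cases(net)))                   # -> 4 generated fault cases
```

Each generated case records a firing trace plus the fault to inject at that point; an execution framework would then replay the trace on the cluster and trigger the fault.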
Citations
BookDOI
Algorithms and architectures for parallel processing
TL;DR: This work aims to overcome inefficiency by designing a distributed parallel system architecture that improves the performance of SPARQL endpoints by incorporating two functionalities: a queuing system that avoids bottlenecks during the execution of SPARQL queries, and an intelligent relaxation of the queries submitted to the endpoint whenever the relaxation, and the consequently lower query complexity, benefit the overall performance of the system.
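As a rough illustration of those two functionalities, and not the chapter's actual system, the sketch below places a bounded queue in front of a SPARQL endpoint and applies a placeholder relax_query rewrite when the backlog grows. EndpointGate, relax_query, and the thresholds are hypothetical names and values.

```python
# Illustrative sketch only: a bounded queue in front of a SPARQL endpoint,
# with a placeholder relaxation applied when the backlog grows, trading result
# completeness for lower query complexity.
import queue

def relax_query(sparql: str) -> str:
    # Placeholder relaxation: cap the result size; a real system would rewrite
    # OPTIONAL/FILTER clauses based on cost estimates.
    return sparql if "LIMIT" in sparql.upper() else sparql + " LIMIT 100"

class EndpointGate:
    def __init__(self, execute, max_pending=8, relax_threshold=4):
        self.execute = execute                  # callable: sparql -> results
        self.pending = queue.Queue(max_pending)
        self.relax_threshold = relax_threshold

    def submit(self, sparql: str):
        if self.pending.qsize() >= self.relax_threshold:
            sparql = relax_query(sparql)        # lighten the load under pressure
        self.pending.put(sparql)                # blocks if the endpoint is saturated
        try:
            return self.execute(sparql)
        finally:
            self.pending.get()
            self.pending.task_done()
```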
Book ChapterDOI
Modeling performance of Hadoop applications: A journey from queueing networks to stochastic well formed nets
Danilo Ardagna, Simona Bernardi, Eugenio Gianniti, Soroush Karimian Aliabadi, Diego Perez-Palacin, José Ignacio Requeno +5 more
TL;DR: This paper provides performance analysis models to estimate MapReduce job execution times in Hadoop clusters governed by the YARN Capacity Scheduler, and proposes models of increasing complexity and accuracy, able to estimate job performance under a number of scenarios of interest.
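For intuition only, the snippet below gives a back-of-the-envelope, wave-based estimate of job execution time; the chapter's queueing-network and stochastic well-formed net models are far more detailed. The function name, parameters, and numbers are illustrative assumptions.

```python
# Back-of-the-envelope estimate, assuming waves of identical tasks on a fixed
# number of containers granted by the YARN Capacity Scheduler.
import math

def estimate_job_time(n_map, n_reduce, map_time, reduce_time, containers):
    map_waves = math.ceil(n_map / containers)
    reduce_waves = math.ceil(n_reduce / containers)
    return map_waves * map_time + reduce_waves * reduce_time

# Example: 400 map tasks of ~30 s, 50 reduce tasks of ~90 s, 100 containers.
print(estimate_job_time(400, 50, 30.0, 90.0, 100))   # -> 210.0 seconds
```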
Journal ArticleDOI
Analytical composite performance models for Big Data applications
TL;DR: Analytical models based on Stochastic Activity Networks (SANs) are proposed to accurately model the execution of Hadoop, Tez and Spark applications, i.e., the most referred frameworks to support Big Data analyses.
Journal ArticleDOI
Automatic Testing of Design Faults in MapReduce Applications
TL;DR: New testing techniques are proposed that aim to detect design faults by simulating different infrastructure configurations that, as a whole, are more likely to reveal failures, using random testing, and partition testing together with combinatorial testing.
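A hedged sketch of the combinatorial side of this idea, not the paper's tool, is shown below: infrastructure configurations are enumerated from a small parameter space and the MapReduce program is run under each one, flagging configurations whose output diverges from a baseline. The configuration parameters and the run_job harness are hypothetical.

```python
# Sketch only: enumerate infrastructure configurations combinatorially and run
# the MapReduce program under each one, flagging configurations whose output
# diverges from a baseline run. Parameters and run_job are hypothetical.
from itertools import product

CONFIG_SPACE = {                       # illustrative parameters only
    "mappers":      [1, 2, 4],
    "reducers":     [1, 2],
    "slow_node":    [False, True],     # simulate a straggler
    "task_failure": [False, True],     # simulate a re-executed task
}

def configurations(space):
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

def test_design_faults(run_job, job_input):
    """run_job(config, job_input) -> output; a hypothetical execution harness."""
    baseline = run_job({"mappers": 1, "reducers": 1,
                        "slow_node": False, "task_failure": False}, job_input)
    failures = []
    for cfg in configurations(CONFIG_SPACE):
        if run_job(cfg, job_input) != baseline:
            failures.append(cfg)       # a design fault surfaced by this configuration
    return failures
```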
Journal ArticleDOI
Testing MapReduce programs: A systematic mapping study
TL;DR: MapReduce is a processing model used in Big Data that facilitates the analysis of large volumes of data under a distributed architecture and simplifies the management of that data.
References
Journal ArticleDOI
MapReduce: simplified data processing on large clusters
Jeffrey Dean, Sanjay Ghemawat +1 more
TL;DR: This paper presents MapReduce, a programming model and an associated implementation for processing and generating large data sets, which runs on large clusters of commodity machines and is highly scalable.
Journal ArticleDOI
MapReduce: simplified data processing on large clusters
Jeffrey Dean, Sanjay Ghemawat +1 more
TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
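The programming model itself can be sketched in a few lines of Python, as below: word count expressed through user-supplied map and reduce functions over a local, in-memory shuffle. This is only a didactic stand-in; the real framework shards input, schedules tasks across a cluster, and transparently re-executes failed tasks.

```python
# Classic word count as a minimal, single-process stand-in for the model:
# user-supplied map and reduce functions plus a local shuffle step.
from collections import defaultdict

def map_fn(_key, line):                     # emit (word, 1) for every word
    return [(word, 1) for word in line.split()]

def reduce_fn(word, counts):                # sum all counts for one word
    return word, sum(counts)

def run_mapreduce(records, map_fn, reduce_fn):
    groups = defaultdict(list)              # shuffle: group values by key
    for key, value in records:
        for k, v in map_fn(key, value):
            groups[k].append(v)
    return dict(reduce_fn(k, vs) for k, vs in groups.items())

print(run_mapreduce(enumerate(["to be or not to be"]),
                    map_fn, reduce_fn))     # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```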
Journal ArticleDOI
Petri nets: Properties, analysis and applications
TL;DR: The author proceeds through introductory modeling examples, behavioral and structural properties, three methods of analysis, and subclasses of Petri nets and their analysis; one section is devoted to marked graphs, the concurrent system model most amenable to analysis.
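As a small illustration of the enumeration-style analysis such surveys cover, and not the survey's own notation, the sketch below explores the reachability set of a toy net and reports deadlocked markings; the net encoding and function names are my own assumptions.

```python
# Sketch of enumeration-style reachability analysis on a toy net: explore all
# reachable markings and report deadlocks (markings enabling no transition).
def enabled(transitions, m):
    return [t for t, (pre, _) in transitions.items()
            if all(m.get(p, 0) > 0 for p in pre)]

def fire(transitions, m, t):
    pre, post = transitions[t]
    m = dict(m)
    for p in pre:
        m[p] -= 1
    for p in post:
        m[p] = m.get(p, 0) + 1
    return m

def reachability(transitions, m0, limit=1000):
    seen, stack, deadlocks = set(), [m0], []
    while stack and len(seen) < limit:
        m = stack.pop()
        key = tuple(sorted((p, n) for p, n in m.items() if n))  # ignore empty places
        if key in seen:
            continue
        seen.add(key)
        firable = enabled(transitions, m)
        if not firable:
            deadlocks.append(m)
        stack.extend(fire(transitions, m, t) for t in firable)
    return seen, deadlocks

# Two processes sharing a single resource token.
net = {"acquire_a": (["idle_a", "res"], ["busy_a"]),
       "release_a": (["busy_a"], ["idle_a", "res"]),
       "acquire_b": (["idle_b", "res"], ["busy_b"]),
       "release_b": (["busy_b"], ["idle_b", "res"])}
markings, deadlocks = reachability(net, {"idle_a": 1, "idle_b": 1, "res": 1})
print(len(markings), len(deadlocks))    # -> 3 reachable markings, 0 deadlocks
```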
Journal ArticleDOI
Basic concepts and taxonomy of dependable and secure computing
TL;DR: The aim is to explicate a set of general concepts of relevance across a wide range of situations, thereby helping communication and cooperation among a number of scientific and technical communities, including ones that concentrate on particular types of systems, of system failures, or of causes of system failures.
Basic Concepts and Taxonomy of Dependable and Secure Computing
TL;DR: In this paper, the main definitions relating to dependability are presented: a generic concept that includes, as special cases, such attributes as reliability, availability, safety, integrity, and maintainability.