Open Access Journal Article

Automatic Testing of Design Faults in MapReduce Applications

TLDR
This paper proposes new testing techniques that aim to detect design faults by simulating different infrastructure configurations. The configurations are generated with random testing and with partition testing combined with combinatorial testing, so that as a whole they are more likely to reveal failures.
Abstract
New processing models are being adopted in Big Data engineering to overcome the limitations of traditional technology. Among them, MapReduce stands out by allowing the processing of large volumes of data over a distributed infrastructure that can change at runtime. The developer designs only the functionality of the program, and its execution is managed by a distributed system. As a consequence, a program can behave differently in each execution because it is automatically adapted to the resources available at each moment. Therefore, when the program has a design fault, the fault may be revealed in some executions and masked in others. During testing, however, these faults are usually masked because the test infrastructure is stable; they are revealed only in production, where, among other reasons, the environment is more aggressive and infrastructure failures occur. This paper proposes new testing techniques that aim to detect these design faults by simulating different infrastructure configurations. The techniques generate a representative set of infrastructure configurations that, as a whole, are more likely to reveal failures, using random testing and partition testing combined with combinatorial testing. The techniques are automated by a test execution engine called MRTest, which is able to detect these faults using only the test input data, regardless of the expected output. Our empirical evaluation shows that MRTest can automatically detect these design faults within a reasonable time.
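
The idea in the abstract can be illustrated with a toy sketch: run the same program on the same input under several simulated infrastructure configurations and flag a design fault when the outputs disagree, so no expected output is needed. This is not MRTest. The in-memory engine, the configuration parameters (number of input splits, whether a combiner runs), the partition classes, and all function names are assumptions made for illustration. The faulty program averages values in both the combiner and the reducer, which yields an average of averages under some configurations.

```python
import itertools
import random
from collections import defaultdict


def run_job(records, mapper, reducer, combiner, num_splits, use_combiner):
    """Simulate one MapReduce execution under one (hypothetical) configuration."""
    # Distribute the input round-robin over the simulated map tasks.
    splits = [records[i::num_splits] for i in range(num_splits)]
    shuffled = defaultdict(list)
    for split in splits:
        local = defaultdict(list)
        for record in split:
            for key, value in mapper(record):
                local[key].append(value)
        for key, values in local.items():
            # The combiner, when enabled, pre-aggregates each map task's output.
            if use_combiner:
                values = [combiner(key, values)]
            shuffled[key].extend(values)
    return {key: reducer(key, values) for key, values in shuffled.items()}


def mapper(record):
    station, temperature = record
    yield station, temperature


def average(key, values):
    # Faulty when reused as a combiner: an average of averages is not the average.
    return sum(values) / len(values)


def total(key, values):
    # Safe as both combiner and reducer because addition is associative.
    return sum(values)


def partition_configs(split_classes, combiner_classes):
    """Combine one representative per partition class (full combination)."""
    return list(itertools.product(split_classes, combiner_classes))


def random_configs(max_splits, count, seed=0):
    """Draw configurations at random; the seed keeps runs reproducible."""
    rng = random.Random(seed)
    return [(rng.randint(1, max_splits), rng.choice([True, False]))
            for _ in range(count)]


def reveals_fault(records, configs, reducer, combiner):
    """Report a fault if any two configurations give different outputs."""
    outputs = [run_job(records, mapper, reducer, combiner, n, c)
               for n, c in configs]
    return any(output != outputs[0] for output in outputs[1:])


records = [("s1", 10.0), ("s1", 20.0), ("s1", 60.0)]
configs = partition_configs([1, 2, 3], [False, True])
fault_found = reveals_fault(records, configs, average, average)
print("design fault revealed:", fault_found)
```

With two splits and the combiner enabled, the simulated job averages 35.0 and 20.0 instead of the three original values, so its output differs from the single-split run and the fault is flagged. A stable test setup that always used one split would mask it. Comparing floating-point outputs for exact equality is a simplification that works for these inputs and would need a tolerance in general.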



Citations
Journal Article

TEA-Cloud: A Formal Framework for Testing Cloud Computing Systems

TL;DR: The aim of the framework is to provide a complete methodology to help users to model both software and hardware parts of cloud systems and automatically test the validity of these clouds using a cost-effective approach.
Posted Content

Quality Assurance Technologies of Big Data Applications: A Systematic Literature Review

TL;DR: This paper summarizes and assesses existing quality assurance (QA) technologies that address quality issues in big data applications, based on a systematic literature review of major scientific databases.
Proceedings Article

FSM Modeling of Testing Security Policies for MapReduce Frameworks

TL;DR: A novel approach to testing Access Control List (ACL) policies in MapReduce frameworks that uses the FSM formalism to write a system specification that takes into account these policies, expressed in the XACML language.
Journal Article

Optimization Driven Constraints Handling in Combinatorial Interaction Testing

TL;DR: The proposed Jaya-Bat based optimization algorithm integrates the Jaya optimization algorithm (JOA) and the Bat optimization algorithm (BA), and the authors report that it selects test cases optimally with better performance.