Testing MapReduce programs: A systematic mapping study
Summary
1 INTRODUCTION
- Big Data or data-intensive programs are those that cannot run on traditional technologies/techniques 1 and usually need novel approaches.
- There are not many studies related to MapReduce applications.
- The interest in Big Data in recent years may have advanced the state of the art of software testing for MapReduce programs.
- In contrast to the aforementioned mapping study, this paper obtains more thorough results because of its deeper scope and different approach/motivation.
- The research questions are proposed in Section 3 together with the systematic steps planned to answer them.
2 MAPREDUCE PROCESSING MODEL
- MapReduce programs 2 divide one problem into several subproblems that are executed in parallel across a large number of computers.
- To illustrate MapReduce, let us imagine a program that calculates the average temperature per year.
- The Map function receives years with temperatures and creates the <key, value> pairs in order to group the temperatures per year.
- The programs are executed by a framework that automatically manages the resource allocation, the re-execution of one part of the program in case of infrastructure failures, and the scheduling of all executions, among other mechanisms.
- Another problem is that new raw data are continuously generated and the data model could change over time, so the program then needs some changes.
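The average-temperature example above can be sketched as a minimal, framework-free MapReduce-style computation in plain Python (an illustrative sketch; the function and variable names are not from any specific framework):

```python
from collections import defaultdict

# Input records: (year, temperature) readings
records = [(2015, 10.0), (2015, 20.0), (2016, 30.0), (2016, 10.0)]

def map_fn(year, temp):
    # Emit a <key, value> pair so temperatures can be grouped per year
    return (year, temp)

def reduce_fn(year, temps):
    # Aggregate all values for one key: the average temperature
    return (year, sum(temps) / len(temps))

# Shuffle phase: group the mapped pairs by key
groups = defaultdict(list)
for year, temp in (map_fn(y, t) for y, t in records):
    groups[year].append(temp)

# Reduce phase: one call per key
result = dict(reduce_fn(y, ts) for y, ts in groups.items())
print(result)  # {2015: 15.0, 2016: 20.0}
```

In a real framework the shuffle step, re-execution on failure, and scheduling are handled automatically, as the bullet above notes.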
3 PLANNING OF THE MAPPING STUDY
- This mapping study aims to characterize the knowledge of software testing approaches in the MapReduce programs through an empirical study of the research literature.
- To avoid bias, the planning of the mapping study describes several tasks based on Kitchenham et al. guidelines 22: 1. Formulation of the research questions (Subsection 3.1).
- The search process to extract the significant literature (primary studies) to answer the research questions (Subsection 3.2).
- Data synthesis to summarize, combine and contextualize the data to answer the questions (Subsection 3.4).
- These tasks are planned and then conducted independently as described in Figure 2.
3.1 Research Questions
- The research questions are formulated to cover all the information about software testing research in the context of MapReduce programs from different points of view.
- This work formulates the research questions based on the 5W+1H model 37,38, also known as the Kipling method 39.
- This method is used in other software engineering empirical studies 40,41 and answers the questions what, why, who, when, where and how.
- The research questions of this mapping study are RQ1 to RQ4, answered in Subsections 4.2.1 to 4.2.4.
3.2 Search Process
- The mapping study answers the research questions based on a series of studies that contain relevant information about these questions.
- These studies are called primary studies and are obtained through the tasks described in Figure 3.
- First, the search terms (sets of several words/terms) related to software testing and MapReduce are searched for in different data sources (journals, conferences and electronic databases).
- The papers that match these searches together with other studies recommended by experts constitute the potential primary studies.
- Finally, these studies are filtered in the study selection in order to obtain only the studies that contain information to answer the research questions.
3.2.1 Search Terms
- The search terms are obtained from the three points of view proposed by Kitchenham et al. 22: (1) population, the technologies and areas related to MapReduce; (2) intervention, the issues related to software testing; and (3) outcomes, the improvements obtained through software testing.
- Hadoop is a distributed system that supports the execution of both MapReduce and non-MapReduce programs, but several papers use the words Hadoop and MapReduce interchangeably in the title.
- In order to obtain the maximum relevant literature and avoid missing some primary studies due to buzzwords and jargon, a thorough search is performed considering the MapReduce and Big Data related technologies enumerated in Table 1.
- This work performs a wide search with 9384 combinations of terms in the paper title, obtained from 92 MapReduce technology-related terms and 102 quality-related terms.
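The 9384 figure is simply the cross product of the two term lists (92 × 102). A sketch of how such title queries could be generated, using placeholder terms since the actual lists are those enumerated in the paper's Table 1:

```python
import itertools

# Placeholder term lists; the study uses 92 MapReduce-related
# and 102 quality-related terms (92 * 102 = 9384 combinations).
mapreduce_terms = [f"mr_term_{i}" for i in range(92)]
quality_terms = [f"quality_term_{i}" for i in range(102)]

# One title query per pair of terms
queries = [f'"{mr}" AND "{q}"'
           for mr, q in itertools.product(mapreduce_terms, quality_terms)]
print(len(queries))  # 9384
```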
3.2.2 Data Sources
- The potential primary studies may be in different data sources.
- This category contains 624 proceedings/volumes from the year of the MapReduce paper (2004) to June 2016.
- The opinion of authors with experience in software testing and MapReduce, together with the other related mapping study 25, can provide additional potential primary studies.
- To avoid this problem, the authors created a program that splits the 9384 combinations of search terms into 2346 searches and simulates a human performing these requests.
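A throttled search client of the kind described might look like the following sketch (hypothetical: the authors' actual tool, delay value, and the search engines' rate limits are not specified in the summary):

```python
import time

def run_searches(queries, fetch, delay_s=2.0):
    """Run queries sequentially with a pause between requests,
    mimicking a human user to avoid search-engine bans.

    fetch: a callable that takes a query string and returns its results.
    """
    results = {}
    for query in queries:
        results[query] = fetch(query)
        time.sleep(delay_s)  # pause before the next request
    return results

# Example with a stand-in fetch function (no real network access)
demo = run_searches(["mapreduce testing"], lambda q: q.upper(), delay_s=0)
print(demo)
```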
3.2.3 Study Selection
- Some potential primary studies obtained from the data sources may not contain information about software testing in MapReduce programs.
- The filters consist of the following criteria, applied in order: C1) Filter by year.
- The potential primary studies are excluded when they do not contain Big Data information.
- Some other papers employ the MapReduce and Big Data capabilities to speed up testing of other non-MapReduce programs.
- This mapping study performs a wide search with more than 70,000 research papers found before filter C1.
3.3 Data Extraction
- The relevant information of the primary studies is extracted through a template divided into two parts.
- The first checklist contains the following roles: Manager, Analyst, Architect, Tester, Test manager, Test strategist, Other stakeholders, Unclear and Not applicable.
- These data are extracted in a checklist containing information about the research validation of the studies.
3.4 Data Synthesis
- The data extracted from the primary studies are synthesized in order to answer the research questions.
- In empirical software engineering there are several synthesis methods 62 based on different approaches according to the type of data or research questions, among others.
- The synthesis uses (1) labeling of the reasons for testing and (2) meta-ethnography 64 for the remaining research questions.
- A group of labels is created for each previous segment/phrase based on the type of reason for testing.
- Once the data are extracted from the primary studies in these checklists, the research questions are answered by a frequency analysis.
3.5 Limitations of the Systematic Mapping Study
- Another, less important, potential bias could occur during the search process if some primary studies are not found through either the search terms or the expert opinions.
- This study searches 5 electronic databases and 2311 proceedings/volumes related to software testing in MapReduce programs (the Data sources step).
- The mapping study excludes the non-relevant studies based on 4 filters (the Study selection step).
- The majority of the data extracted are based on checklists, in some cases obtained from international standards and in others created or adapted to the MapReduce processing model (the Data extraction step).
4 RESULTS
- The results are obtained by conducting the systematic mapping study, answering the research questions based on the planning of Section 3.
- From them, the data are extracted, and the synthesis is developed in Subsection 4.2.
- Finally, the general results are discussed in Subsection 4.3.
4.1 Primary Studies
- In this work there are 54 primary studies derived from more than 70,000 potential studies obtained through the search process detailed in Figure 4.
- The MapReduce processing model was described in 2004, but according to the primary studies the software testing efforts in this field started in 2010 with only 1 study; six and a half years later the number of primary studies has increased to 54.
- The different types of validations employed in the research are summarized in Table 4.
- Testing in Big Data has opened up new challenges 67, especially in the understanding of the data and its complex structures 68.
- In the case of high Volume, it can be difficult to check whether the test case output is the expected one, and the use of automatic tools can be helpful 69.
4.1.1 Performance testing and analysis
- These prediction models characterize the performance based on different kinds of input parameters.
- The prediction models can have different goals beyond the execution time, for example the Yang et al. model 74 helps to obtain the values of the input parameters that achieve the best execution time.
- While some models predict the performance by analyzing the execution time of several samples 78 or considering previous executions 79, other models consider some specific characteristics of the MapReduce execution.
- The Vianna et al. model 80 considers the influence on performance of the MapReduce tasks that are executed in parallel.
- The tester can also monitor the execution of the MapReduce programs and test cases, obtaining charts to evaluate the performance and potential bottlenecks 102.
4.1.2 Functional testing
- Misconfiguration is one of the most common problems that lead to memory/performance issues in MapReduce 104.
- According to an empirical study 105, users rarely tune the configuration parameters that are related to performance.
- Another technique to detect these faults caused by non-determinism dynamically checks the properties of the program under test with random data 113.
- Several studies propose injecting infrastructure failures in the test case design 114.
- Another technique to generate the test case data employs a bacteriological algorithm aimed at killing semantic mutants specific to MapReduce: it varies both the number of Reducers and the presence or absence of the Combiner functionality 117.
4.2 Synthesis
- The primary studies contain the answers to the research questions, but this information is hidden within them.
- The synthesis obtains valuable information in order to answer the research questions based on the data extracted from the primary studies.
- The data are extracted following the template defined in Subsection 3.3 and then synthesized by the methods described in Subsection 3.4.
- In the following subsections the primary studies are analyzed, classified and summarized in order to obtain the answer to each research question systematically.
4.2.1 RQ1: Why is testing performed in the MapReduce programs?
- The MapReduce programs are tested for several reasons.
- Failure-related reasons include the specific faults of the MapReduce programs and the number of programs that fail in production.
- In Table 5 each reason for testing is also classified based on the degree of formality of the evidence in accordance with the following types: reasons with formal evidence and with informal evidence.
- Influence of the infrastructure on application performance: whereas the MapReduce applications can be designed without considering the infrastructure, the program performance is influenced by the production infrastructure.
- Of the 41 “performance related” reasons for testing, the most frequent are focused on the analysis (36.83% of these reasons) and optimization (26.83%) of the performance, followed by the fulfillment of the performance goals (24.39%).
4.2.2 RQ2: What testing is performed in the MapReduce programs?
- The planning of Subsection 3.4 proposes a meta-ethnography 64 to answer this research question.
- The data extracted from each primary study has two facets in order to answer RQ2: a. Quality (sub)characteristics for each study according to the ISO/IEC 25010:2011 42 represented in Table 7. b. Quality-Related Types of Testing proposed in each study based on ISO/IEC/IEEE 29119-4:2015 58 and summarized in Table 8.
- The majority of efforts are focused on “performance efficiency” with 64.81% of the studies, then on “functional suitability” with 25.93% of the studies, and finally on “reliability” with 5.56% of the studies.
- Regarding the type of testing, 59.26% apply “performance-related testing”, 22.22% employ “functional testing” and 3.7% use “backup/recovery testing”.
- The results obtained through the combination of both facets are largely as expected: “performance-related testing” is applied to “performance efficiency” characteristics, “functional testing” to “functional suitability”, and “backup/recovery testing” to “reliability”.
4.2.3 RQ3: How is testing performed in the MapReduce programs?
- This research question is answered through the meta-ethnography 64 proposed in Subsection 3.4.
- In order to answer RQ3, the primary studies are analyzed considering three facets: (a) testing methods/techniques, summarized in Table 9 according to the test activities proposed in Annex A of ISO/IEC/IEEE 29119-1:2013 9; (b) the dependency between the primary studies and the MapReduce processing model, depicted in Table 10; and (c) the tools created or used in the primary studies to perform software testing, characterized in Table 11.
- Other testing activities are used to a lesser degree, for example “structure based” in 7.41% of the studies or static analysis in 5.56% of the studies.
4.2.4 RQ4: Who, where and when is testing performed in the MapReduce programs?
- The planning of the mapping study described in Section 3.4 proposes a meta-ethnography 64 to answer the research question through three facets, the first being the different roles that participate in the testing efforts of the MapReduce programs, described in Table 12.
- To a lesser extent, the testing efforts are oriented towards the parts of the program that might not contain MapReduce functions: 7.41% of the studies consider the integration testing between the MapReduce functions and other parts of the program, and 3.7% of the studies test the system.
- From these results, it appears that the fulfillment of the contract or user requirements tested in the acceptance testing level is not greatly affected by the existence of MapReduce functions in the system.
- Regardless of the test level, the testing described in the primary studies is mainly performed in the Software/System Qualification Testing Process.
4.3 Discussion of Results
- The research questions of Subsection 3.1 are answered through the primary studies, data extraction and data synthesis.
- The most frequent reasons are based on performance issues (analyzing, optimizing and fulfilling performance goals), the existence of several specific failures, the type and quality of the data processed by these programs, and testing to predict and efficiently select the resources.
- According to Table 8, the studies related to functionality only represent 22.22%, even though 42.17% of the reasons for testing are related to functionality.
- This classification reflects the research efforts to boost the Big Data Engineering field because 44.1% of the studies improve the technology, 18.31% analyse the technology through studies and surveys, 9.01% create new technologies to manage and analyse data, and 6.62% are focused on the state-of-the-art and challenges.
5 CONCLUSIONS
- The number of studies on software testing in the MapReduce programs has increased in recent years.
- A characterization is carried out based on 54 research studies obtained from more than 70,000 potential papers.
- These reasons for testing assume that both functional and performance testing are necessary, but the studies employ different approaches: functional testing considers different aspects of the program (such as specification and structure) while performance testing is more focused on simulation and evaluation.
- Regardless of the type of testing, the majority of efforts are specific to the MapReduce technology, at the unit and integration levels of the Map and Reduce functions.
- There is room to mature with better validations and thus improve the research impact.
Frequently Asked Questions (11)
Q2. What are the journals and conferences that could be a good source of potential studies?
The non-JCR journals and non-CORE conferences related to software testing or Big Data could be a good source of potential primary studies.
Q3. What is the MRFlow testing technique 116?
The MRFlow testing technique 116 generates the test coverage items that can be used to generate the test inputs, based on the data-flow technique adapted to the MapReduce processing model.
Q4. What is the key to the data processed by the functions?
The data processed by these functions are in the form of <key, value> pairs in which the key is the identifier of each subproblem and the value contains the information needed to solve it.
Q5. What are the common reasons for testing the MapReduce programs?
The most frequent reasons are based on performance issues (analyzing, optimizing and fulfilling performance goals), the existence of several specific failures, the type and quality of the data processed by these programs, and testing to predict and efficiently select the resources.
Q6. What can be done to improve the performance of the prediction models?
The prediction models can then be designed with a more standardized subset of parameters that have a notable influence on performance.
Q7. How can the performance of the Pig programs be predicted?
When the programs executed in the public cloud have deadline requirements to satisfy, the performance can be predicted with a Locally Weighted Linear Regression model considering the previous executions and the data executed in parallel 91.
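To illustrate the kind of regression involved, here is a minimal one-dimensional locally weighted linear regression in plain Python (a generic textbook sketch, not the model from study 91, whose features, kernel and parameters are not given in this summary):

```python
import math

def lwlr_predict(x_query, xs, ys, tau=1.0):
    """Predict y at x_query with locally weighted linear regression:
    weight each training point by a Gaussian kernel centered on the
    query, then solve the weighted least-squares line y = a + b*x."""
    w = [math.exp(-((x - x_query) ** 2) / (2 * tau ** 2)) for x in xs]
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, xs))
    swy = sum(wi * y for wi, y in zip(w, ys))
    swxx = sum(wi * x * x for wi, x in zip(w, xs))
    swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    # Normal equations for the weighted line fit
    b = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    a = (swy - b * swx) / sw
    return a + b * x_query

# Example: exactly linear data (y = 2x + 1) is recovered exactly
print(lwlr_predict(3, [0, 1, 2, 4], [1, 3, 5, 9]))  # 7.0 (approx.)
```

In a performance-prediction setting, `xs` would be an execution characteristic (e.g. input size) and `ys` the observed execution times of previous runs.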
Q8. How can the performance of a program in a cloud be predicted?
In the case of I/O-intensive programs in the cloud, the performance can be predicted using a CART (Classification And Regression Tree) model 90.
Q9. How long does it take to obtain the potential studies?
The potential primary studies are obtained after approximately a week, to avoid bans in the search engines due to the high number of requests.
Q10. How can the performance of MapReduce and Big Data applications be evaluated?
The performance of the MapReduce and Big Data applications can also be evaluated through large scale stochastic models by Mean Field Analysis 77.
Q11. What are some of the technologies to execute and manage MapReduce programs?
There are several technologies to execute and manage MapReduce programs, such as Spark 3, Flink 4 and Hadoop 5, all broadly adopted in industry 6.