Proceedings Article

Comparative Analysis of Techniques for Big-data Performance Testing

TLDR
In this article, the authors present several performance testing approaches for big data, compare them on parameters such as domain applicability and scripting interface, and suggest tools to improve overall performance.
Abstract
Big data has become a primary focus for virtually every sector across the globe, including government, healthcare, and industry. A large amount of data is generated every day, and information is added according to the three characteristics of big data. Handling massive amounts of data with traditional methods has become a significant challenge, and testing large amounts of duplicate data is a very challenging task for researchers. Big data performance testing holds great promise for obtaining more optimised solutions: better performance saves time and money while also producing more optimised results. This paper presents several performance testing approaches. Furthermore, we compare the performance testing techniques and analyse the performance improvement factors. Finally, various tools for performance testing of big data, such as Apache JMeter, Apache Drill, LoadRunner, WebLoad, and YCSB, are compared on parameters such as domain applicability and scripting interface, and tools are suggested to improve overall performance.

