Topic

Spark (mathematics)

About: Spark (mathematics) is a research topic. Over its lifetime, 7,304 publications have been published within this topic, receiving 63,322 citations.


Papers
Proceedings Article
30 Jun 2014
TL;DR: The evaluation shows that AJIRA is competitive in a wide range of scenarios both in terms of processing time and scalability, making it an ideal choice where flexibility, extensibility, and the processing of both large and dynamic data with a single programming model are either desirable or even mandatory requirements.
Abstract: Currently, MapReduce is the most popular programming model for large-scale data processing and this motivated the research community to improve its efficiency either with new extensions, algorithmic optimizations, or hardware. In this paper we address two main limitations of MapReduce: one relates to the model's limited expressiveness, which prevents the implementation of complex programs that require multiple steps or iterations. The other relates to the efficiency of its most popular implementations (e.g., Hadoop), which provide good resource utilization only for massive volumes of input, operating suboptimally for smaller or rapidly changing input. To address these limitations, we present AJIRA, a new middleware designed for efficient and generic data processing. At a conceptual level, AJIRA replaces the traditional map/reduce primitives by generic operators that can be dynamically allocated, allowing the execution of more complex batch and stream processing jobs. At a more technical level, AJIRA adopts a distributed, multi-threaded architecture that strives to minimize overhead for non-critical functionality. These characteristics allow AJIRA to be used as a single programming model for both batch and stream processing. To this end, we evaluated its performance against Hadoop, Spark, Esper, and Storm, which are state-of-the-art systems for both batch and stream processing. Our evaluation shows that AJIRA is competitive in a wide range of scenarios both in terms of processing time and scalability, making it an ideal choice where flexibility, extensibility, and the processing of both large and dynamic data with a single programming model are either desirable or even mandatory requirements.

26 citations

Journal Article
01 Oct 2018
TL;DR: Titian is a library that enables data provenance (tracking data through transformations) in Apache Spark while minimally impacting Spark job performance; observed overheads for capturing data lineage rarely exceed 30% above the baseline job execution time.
Abstract: Debugging data processing logic in data-intensive scalable computing (DISC) systems is a difficult and time-consuming effort. Today's DISC systems offer very little tooling for debugging programs, and as a result, programmers spend countless hours collecting evidence (e.g., from log files) and performing trial-and-error debugging. To aid this effort, we built Titian, a library that enables data provenance--tracking data through transformations--in Apache Spark. Data scientists using the Titian Spark extension will be able to quickly identify the input data at the root cause of a potential bug or outlier result. Titian is built directly into the Spark platform and offers data provenance support at interactive speeds--orders of magnitude faster than alternative solutions--while minimally impacting Spark job performance; observed overheads for capturing data lineage rarely exceed 30% above the baseline job execution time.

26 citations

Proceedings Article
01 Mar 1999

26 citations

Patent
28 Feb 1994
TL;DR: In this paper, a torch jet spark plug is designed for an internal combustion engine with a relatively simple structure, requiring a minimal number of discrete components to enhance its manufacturability.
Abstract: A torch jet spark plug is provided which is suitable for use in a torch jet-assisted spark ignition system for an internal combustion engine. The torch jet spark plug is configured to ignite an air/fuel mixture within a combustion prechamber formed integrally within the body of the spark plug, such that a jet emanates from the prechamber and projects into the main combustion chamber of the engine, so as to enhance the burning rate within the main chamber. More particularly, the spark plug is configured so as to be relatively unsusceptible to pre-ignition, and also substantially eliminates the occurrence of internal short circuits to ground due to deposits accumulating within the prechamber. The spark plug achieves each of the above objects while having a relatively uncomplicated structure, requiring a minimal number of discrete components so as to enhance its manufacturability.

26 citations


Network Information
Related Topics (5)
Software
130.5K papers, 2M citations
76% related
Combustion
172.3K papers, 1.9M citations
72% related
Cluster analysis
146.5K papers, 2.9M citations
72% related
Cloud computing
156.4K papers, 1.9M citations
71% related
Hydrogen
132.2K papers, 2.5M citations
69% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2022    10
2021    429
2020    525
2019    661
2018    758
2017    683