Topic
Spark (mathematics)
About: Spark (mathematics) is a research topic. Over its lifetime, 7304 publications have been published within this topic, receiving 63322 citations.
Papers
••
30 Jun 2014
TL;DR: The evaluation shows that AJIRA is competitive in a wide range of scenarios both in terms of processing time and scalability, making it an ideal choice where flexibility, extensibility, and the processing of both large and dynamic data with a single programming model are either desirable or even mandatory requirements.
Abstract: Currently, MapReduce is the most popular programming model for large-scale data processing, and this has motivated the research community to improve its efficiency with new extensions, algorithmic optimizations, or hardware. In this paper we address two main limitations of MapReduce: one relates to the model's limited expressiveness, which prevents the implementation of complex programs that require multiple steps or iterations. The other relates to the efficiency of its most popular implementations (e.g., Hadoop), which provide good resource utilization only for massive volumes of input, operating suboptimally for smaller or rapidly changing input. To address these limitations, we present AJIRA, a new middleware designed for efficient and generic data processing. At a conceptual level, AJIRA replaces the traditional map/reduce primitives with generic operators that can be dynamically allocated, allowing the execution of more complex batch and stream processing jobs. At a more technical level, AJIRA adopts a distributed, multi-threaded architecture that strives to minimize overhead for non-critical functionality. These characteristics allow AJIRA to be used as a single programming model for both batch and stream processing. To this end, we evaluated its performance against Hadoop, Spark, Esper, and Storm, which are state-of-the-art systems for both batch and stream processing. Our evaluation shows that AJIRA is competitive in a wide range of scenarios both in terms of processing time and scalability, making it an ideal choice where flexibility, extensibility, and the processing of both large and dynamic data with a single programming model are either desirable or even mandatory requirements.
26 citations
••
01 Oct 2018
TL;DR: Titian is a library that enables data provenance (tracking data through transformations) in Apache Spark while minimally impacting Spark job performance; observed overheads for capturing data lineage rarely exceed 30% above the baseline job execution time.
Abstract: Debugging data processing logic in data-intensive scalable computing (DISC) systems is a difficult and time-consuming effort. Today's DISC systems offer very little tooling for debugging programs, and as a result, programmers spend countless hours collecting evidence (e.g., from log files) and performing trial-and-error debugging. To aid this effort, we built Titian, a library that enables data provenance--tracking data through transformations--in Apache Spark. Data scientists using the Titian Spark extension will be able to quickly identify the input data at the root cause of a potential bug or outlier result. Titian is built directly into the Spark platform and offers data provenance support at interactive speeds--orders of magnitude faster than alternative solutions--while minimally impacting Spark job performance; observed overheads for capturing data lineage rarely exceed 30% above the baseline job execution time.
26 citations
••
06 May 2002
TL;DR: A torch jet spark plug for an internal combustion engine is described that has a relatively simple structure, requiring a minimal number of discrete components so as to enhance its manufacturability.
Abstract: A torch jet spark plug is provided which is suitable for use in a torch jet-assisted spark ignition system for an internal combustion engine. The torch jet spark plug is configured to ignite an air/fuel mixture within a combustion prechamber formed integrally within the body of the spark plug, such that a jet emanates from the prechamber and projects into the main combustion chamber of the engine, so as to enhance the burning rate within the main chamber. More particularly, the spark plug is configured so as to be relatively unsusceptible to pre-ignition, and also substantially eliminates the occurrence of internal short circuits to ground due to deposits accumulating within the prechamber. The spark plug achieves each of the above objects while having a relatively uncomplicated structure, requiring a minimal number of discrete components so as to enhance its manufacturability.
26 citations