Topic

Spark (mathematics)

About: Spark (mathematics) is a research topic. Over its lifetime, 7,304 publications have been published within this topic, receiving 63,322 citations.


Papers
Patent
24 Dec 2014
TL;DR: This patent proposes a Spark-based method for semantic annotation of mass video, which relies on elastic distributed storage of the video in a Hadoop big data cluster environment and uses a Spark computation mode to perform the annotation.
Abstract: The invention provides a mass video semantic annotation method based on Spark. The method is mainly based on elastic distributed storage of mass video under a Hadoop big data cluster environment and adopts a Spark computation mode to conduct video annotation. The method mainly comprises the following contents: a video segmentation method based on a fractal theory and realization thereof on Spark; a video feature extraction method based on Spark and a visual word forming method based on a meta-learning strategy; a video annotation generation method based on Spark. Compared with the traditional single machine computation, parallel computation or distributed computation, the mass video semantic annotation method based on Spark can improve the computation speed by more than a hundred times and has the advantages of complete annotation content information, low error rate and the like.

18 citations

Proceedings ArticleDOI
09 Apr 2018
TL;DR: SnailTrail uses a novel metric, critical participation, computed on time-based snapshots of execution traces, to provide immediate insights into specific parts of the computation. This lets it work online in real time, rather than requiring complete offline traces as traditional critical path analysis (CPA) does.
Abstract: We rigorously generalize critical path analysis (CPA) to long-running and streaming computations and present SnailTrail, a system built on Timely Dataflow, which applies our analysis to a range of popular distributed dataflow engines. Our technique uses the novel metric of critical participation, computed on time-based snapshots of execution traces, that provides immediate insights into specific parts of the computation. This allows SnailTrail to work online in real-time, rather than requiring complete offline traces as with traditional CPA. It is thus applicable to scenarios like model training in machine learning, and sensor stream processing. SnailTrail assumes only a highly general model of dataflow computation (which we define) and we show it can be applied to systems as diverse as Spark, Flink, TensorFlow, and Timely Dataflow itself. We further show with examples from all four of these systems that SnailTrail is fast and scalable, and that critical participation can deliver performance analysis and insights not available using prior techniques.

18 citations

Journal ArticleDOI
TL;DR: In this paper, the electrical conditions necessary for reliable operation of sliding-spark arrays have been determined and the emission from an array can be considerably enhanced by critically damping the spark discharge circuit.
Abstract: The electrical conditions necessary for reliable operation of sliding-spark arrays have been determined. The emission from an array can be considerably enhanced by critically damping the spark discharge circuit. When used to pre-ionize a CO2 laser, there must be a delay between the array and main discharge pulses, but this delay may be as small as 100 ns and less than the array discharge pulse width.

18 citations

Proceedings ArticleDOI
24 Sep 2017
TL;DR: PBSE is a robust, path-based speculative execution technique that employs three key ingredients (path progress, path diversity, and path-straggler detection and speculation) and is applicable to many data-parallel frameworks such as Hadoop/HDFS+QFS, Spark, and Flume.
Abstract: We reveal loopholes of Speculative Execution (SE) implementations under a unique fault model: node-level network throughput degradation. This problem appears in many data-parallel frameworks such as Hadoop MapReduce and Spark. To address this, we present PBSE, a robust, path-based speculative execution that employs three key ingredients: path progress, path diversity, and path-straggler detection and speculation. We show how PBSE is superior to other approaches such as cloning and aggressive speculation under the aforementioned fault model. PBSE is a general solution, applicable to many data-parallel frameworks such as Hadoop/HDFS+QFS, Spark and Flume.

18 citations

Journal ArticleDOI
TL;DR: In this paper, an electronically controlled high-voltage spectroscopic spark source is described, which is suitable for producing low-energy ignitor sparks as well as high-energy excitation sparks.

18 citations


Network Information
Related Topics (5)
Software: 130.5K papers, 2M citations, 76% related
Combustion: 172.3K papers, 1.9M citations, 72% related
Cluster analysis: 146.5K papers, 2.9M citations, 72% related
Cloud computing: 156.4K papers, 1.9M citations, 71% related
Hydrogen: 132.2K papers, 2.5M citations, 69% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2022  10
2021  429
2020  525
2019  661
2018  758
2017  683