Topic

Spark (mathematics)

About: Spark (mathematics) is a research topic. Over its lifetime, 7,304 publications have been published within this topic, receiving 63,322 citations.


Papers
Journal Article
TL;DR: A two-phase parallelization algorithm including fitness evaluation parallelization and genetic operation parallelization for pairwise test suite generation is proposed, which outperforms the sequential genetic algorithm and competes with other approaches in both test suite size and computational performance.
Abstract: Pairwise testing is an effective test generation technique that requires every pair of parameter values to be covered by at least one test case. Generating a minimum test suite has been proven to be an NP-complete problem. Researchers have used genetic algorithms for pairwise test suite generation. However, this is always a time-consuming process, which significantly limits the practical use of genetic algorithms on large-scale test problems. Parallelism is an effective way not only to enhance computational performance but also to improve the quality of the solutions. In this paper, we use Spark, a fast and general parallel computing platform, to parallelize the genetic algorithm to tackle the problem. We propose a two-phase parallelization algorithm including fitness evaluation parallelization and genetic operation parallelization. Experimental results show that our algorithm outperforms the sequential genetic algorithm and competes with other approaches in both test suite size and computational performance. As a result, our algorithm is a promising improvement of the genetic algorithm for pairwise test suite generation.
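As a rough illustration of the fitness evaluation step mentioned in the abstract, the sketch below scores a candidate test suite by the fraction of required value pairs it covers. This is not the paper's algorithm and does not use Spark; the function names (all_pairs, covered_pairs, fitness) and the coverage-ratio scoring are assumptions chosen for illustration. Each suite is scored independently of the others, which is the property that makes this step a natural candidate for parallelization.

```python
from itertools import combinations, product


def all_pairs(domains):
    """Return every (param_i, value_i, param_j, value_j) tuple that must be covered.

    `domains` is a list where domains[k] holds the allowed values of parameter k.
    """
    required = set()
    for i, j in combinations(range(len(domains)), 2):
        for vi, vj in product(domains[i], domains[j]):
            required.add((i, vi, j, vj))
    return required


def covered_pairs(test_case):
    """Return the value pairs exercised by a single test case."""
    return {
        (i, test_case[i], j, test_case[j])
        for i, j in combinations(range(len(test_case)), 2)
    }


def fitness(suite, domains):
    """Hypothetical fitness: fraction of required pairs covered by the suite."""
    required = all_pairs(domains)
    covered = set()
    for test_case in suite:
        covered |= covered_pairs(test_case)
    return len(covered & required) / len(required)


if __name__ == "__main__":
    # Three boolean parameters: 3 parameter pairs x 4 value pairs = 12 required pairs.
    domains = [[0, 1], [0, 1], [0, 1]]
    suite = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
    print(fitness(suite, domains))
```

A genetic algorithm would call a function like fitness once per individual in the population; because the calls share no state, they could be distributed across workers.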

42 citations

Proceedings Article
14 Dec 2015
TL;DR: GeeLytics is designed to support dynamic stream processing topologies by taking into account the system characteristics of heterogeneous edge/Cloud nodes and also the current system workload, which shall achieve low latency analytics results while minimizing the edge-to-Cloud bandwidth consumption.
Abstract: High data rate sensors such as video cameras, audio sensors, and motion sensors are becoming ubiquitous in the Internet of Things (IoT). In large-scale IoT systems like smart cities, a large number of sensors are now widely deployed at different locations, generating a huge amount of stream data. Although the generated data give us great potential to sense our live environments, it remains a big challenge to efficiently extract real-time results from sensor data to make fast decisions. Existing stream processing platforms, such as Storm, Spark Streaming, and S4, are well designed to process stream data within a cluster in the Cloud, but they are not suitable for highly distributed IoT systems in which data are naturally geo-distributed and low latency analytics results are expected to be shared across users and applications. To tackle this problem, we design an edge analytics platform called GeeLytics, which can perform real-time stream processing both at the network edges and in the Cloud in a dynamic and transparent manner. In this position paper we discuss its use cases, motivation, and preliminary architecture design. Compared with the state of the art, GeeLytics is designed to support dynamic stream processing topologies by taking into account the system characteristics of heterogeneous edge/Cloud nodes and also the current system workload. This shall achieve low latency analytics results while minimizing the edge-to-Cloud bandwidth consumption. In addition, by using Docker application containers for packaging up deployable tasks and a distributed pub/sub mechanism for inter-task stream data routing, GeeLytics shall provide better resource isolation and system efficiency to support multi-tenancy.

42 citations

01 Aug 1993
TL;DR: The Simulation Problem Analysis Research Kernel (SPARK) environment for simulation of nonlinear differential algebraic systems has been revised to improve modeling convenience, modeling flexibility, and solution efficiency.
Abstract: The Simulation Problem Analysis Research Kernel (SPARK) environment for simulation of nonlinear differential algebraic systems has been revised to improve modeling convenience, modeling flexibility, and solution efficiency. Solution efficiency has been enhanced by automatic decomposition of the problem into strongly connected components, characterized as separately solvable subproblems. The normally constructed data flow graph in SPARK allows such components to be identified and placed in the correct order for sequential solution, resulting in significant speed-up for problems that are not strongly interconnected. Modeling flexibility has been enhanced by adding Multivalued Objects. Whereas conventional SPARK objects represent single equations and produce a single result, this extension allows more complex objects which themselves solve simultaneous sets of equations for multiple results. The need for such objects arises when submodels are to be solved independently of the SPARK solver, e.g., to use a specially tailored algorithm. With regard to modeling convenience, the graphical user interface now allows model definition by selection and placement of object icons in a graphical window in an X-windows environment. These objects can be connected with macro links comprising multiple problem variables. The resulting problem is then translated into a Network Language Specification file for SPARK processing.
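The decomposition described in the abstract, splitting a data flow graph into strongly connected components and ordering them for sequential solution, can be sketched with Tarjan's algorithm. The abstract does not say which algorithm SPARK uses, so Tarjan's algorithm is a stand-in here, and the graph encoding (a dict mapping each variable to the variables that depend on it) and the function names are assumptions made for illustration.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. `graph` maps a node to the nodes it feeds into.

    Components are returned in reverse topological order of the condensed
    graph (downstream components first).
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop its members off the stack.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(sorted(component))

    for v in graph:
        if v not in index:
            visit(v)
    return components


def solution_order(graph):
    """Order components so each one appears after everything it depends on."""
    return list(reversed(strongly_connected_components(graph)))


if __name__ == "__main__":
    # Hypothetical data flow: a feeds b; b and c form a loop; c feeds d.
    graph = {"a": ["b"], "b": ["c"], "c": ["b", "d"], "d": []}
    print(solution_order(graph))
```

In this toy graph only b and c must be solved simultaneously; a and d become single-variable subproblems, which is the kind of saving the abstract attributes to problems that are not strongly interconnected. The recursive form is kept for brevity and would hit Python's recursion limit on very deep graphs.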

42 citations

Patent
Noyes D Smith1
30 Jul 1968

42 citations


Network Information
Related Topics (5)
Software: 130.5K papers, 2M citations, 76% related
Combustion: 172.3K papers, 1.9M citations, 72% related
Cluster analysis: 146.5K papers, 2.9M citations, 72% related
Cloud computing: 156.4K papers, 1.9M citations, 71% related
Hydrogen: 132.2K papers, 2.5M citations, 69% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2022    10
2021    429
2020    525
2019    661
2018    758
2017    683