Topic

Spark (mathematics)

About: Spark (mathematics) is a research topic. Over its lifetime, 7,304 publications have been published within this topic, receiving 63,322 citations.


Papers
Journal Article
TL;DR: This paper explores Apache Spark with in-memory computing and introduces a novel feature descriptor, the adaptive local motion descriptor (ALMD), which considers both motion and appearance. ALMD extends the local ternary pattern used for static texture analysis and generates persistent codes to describe local textures.
Abstract: Human action recognition plays a significant part in the computer vision and multimedia research community due to its numerous applications. However, despite the different approaches proposed to address this problem, some issues regarding the robustness and efficiency of action recognition still need to be solved. Moreover, due to the rapid development of multimedia applications from numerous sources, e.g., CCTV or video surveillance, there is an increasing demand for parallel processing of large-scale video data. In this paper, we introduce a novel approach to recognizing human actions. First, we explore Apache Spark with in-memory computing to resolve the task of human action recognition in a distributed environment. Second, we introduce a novel feature descriptor, the adaptive local motion descriptor (ALMD), which considers motion and appearance; it is an extension of the local ternary pattern used for static texture analysis, and it also generates persistent codes to describe local textures. Finally, the random forest from the Spark machine learning library is employed to recognize the human actions. Experimental results show the superiority of the proposed approach over other state-of-the-art approaches.

33 citations
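
The abstract above describes ALMD as an extension of the local ternary pattern (LTP). As a rough illustration of the standard LTP only, and not of ALMD itself, the sketch below encodes one 3x3 grayscale patch into the usual pair of "upper" and "lower" binary codes. The function name `ltp_codes`, the threshold `t=5`, the clockwise neighbor ordering, and the sample patch are all assumptions made for this example; the paper's actual descriptor and parameters are not given in the abstract.

```python
# Standard local ternary pattern (LTP) on a single 3x3 grayscale patch.
# This is NOT the paper's ALMD; names and the threshold are hypothetical.

def ltp_codes(patch, t=5):
    """Return the (upper, lower) 8-bit LTP codes for a 3x3 patch."""
    center = patch[1][1]
    # Neighbors in clockwise order, starting at the top-left pixel.
    coords = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    upper = 0
    lower = 0
    for bit, (r, c) in enumerate(coords):
        value = patch[r][c]
        if value >= center + t:
            upper |= 1 << bit      # ternary state +1 goes to the upper code
        elif value <= center - t:
            lower |= 1 << bit      # ternary state -1 goes to the lower code
        # Otherwise the ternary state is 0 and neither bit is set.
    return upper, lower


patch = [
    [60, 52, 41],
    [58, 50, 49],
    [70, 44, 30],
]
upper, lower = ltp_codes(patch, t=5)
print(upper, lower)  # prints "193 52"
```

Splitting the ternary code into two binary codes keeps each code in the 0 to 255 range, so histograms of the two codes can be concatenated into a feature vector for a classifier such as a random forest.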

Patent
09 Sep 2015
TL;DR: A parallelized human body behavior identification method is presented, in which Kinect skeleton data is used as input, a distributed behavior identification algorithm is implemented on the Spark computing framework, and a complete parallel identification process is formed.
Abstract: The present invention discloses a parallelized human body behavior identification method. According to the method, Kinect skeleton data is used as input; a distributed behavior identification algorithm is implemented on the Spark computing framework; and a complete parallel identification process is formed. Acquisition of the human skeleton data relies on the scene depth acquisition capability of Kinect, and the data is preprocessed to ensure that the features are invariant to displacement and scale. A human body structural vector, joint included angle information, and skeleton weight bias are selected as static behavior features, and a dynamic behavior search algorithm based on structural similarity is provided. For the identification algorithm, a neural network algorithm is parallelized on Spark, and the quasi-Newton method L-BFGS is adopted to optimize the network weight updating process, so that the training speed is significantly increased. For the identification platform, the Hadoop distributed file system (HDFS) is used as the behavior data storage layer; Spark runs on the universal resource manager YARN; the parallel neural network algorithm serves as the upper-layer application; and the overall system architecture has excellent extensibility.

33 citations
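
The patent abstract above lists joint included angle information among its static features and asks for features that are invariant to displacement and scale. The sketch below shows one common way to compute such an angle from three 3D joint positions. The function name `joint_angle` and the sample coordinates are assumptions for illustration; the patent's actual feature definitions are not given in the abstract.

```python
import math

# Included angle at a joint from three 3D skeleton points.
# Names and sample coordinates are hypothetical, not taken from the patent.

def joint_angle(a, b, c):
    """Angle in degrees at joint b between segments b->a and b->c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0 or n2 == 0:
        raise ValueError("joint positions must be distinct")
    # Clamp to [-1, 1] to guard against floating-point drift.
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_theta))


shoulder = (0.0, 1.0, 0.0)
elbow = (0.0, 0.0, 0.0)
wrist = (1.0, 0.0, 0.0)
angle = joint_angle(shoulder, elbow, wrist)

# Translating and uniformly scaling the skeleton leaves the angle unchanged.
moved = [tuple(2.0 * x + 3.0 for x in p) for p in (shoulder, elbow, wrist)]
angle_moved = joint_angle(*moved)
print(round(angle, 6), round(angle_moved, 6))
```

Because the angle depends only on the directions of the two segments, it is unaffected by where the subject stands or how large the subject appears, which is the invariance the abstract asks of its preprocessed features.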

Book
22 Feb 2018
TL;DR: Algorithmic Aspects of Parallel Data Processing discusses recent algorithmic developments for distributed data processing and uses a theoretical model of parallel processing called the Massively Parallel Computation (MPC) model, which is a simplification of the BSP model.
Abstract: The last decade has seen a huge and growing interest in processing large data sets on large distributed clusters. This trend began with the MapReduce framework and has been widely adopted by several other systems, including PigLatin, Hive, Scope, Dremel, Spark, and Myria, to name a few. While the applications of such systems are diverse (for example, machine learning and data analytics), most involve relatively standard data processing tasks like identifying relevant data, cleaning, filtering, joining, grouping, transforming, extracting features, and evaluating results. This has generated great interest in the study of algorithms for data processing on large distributed clusters. Algorithmic Aspects of Parallel Data Processing discusses recent algorithmic developments for distributed data processing. It uses a theoretical model of parallel processing called the Massively Parallel Computation (MPC) model, which is a simplification of the BSP model where the only cost is given by the amount of communication and the number of communication rounds. The survey studies several algorithms for multi-join queries, sorting, and matrix multiplication. It discusses their relationships and common techniques applied across the different data processing tasks.

33 citations
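
The abstract above says the MPC model charges only for the amount of communication and the number of communication rounds. As a minimal sketch of that cost accounting, the code below simulates a one-round hash-partitioned join of two relations across `p` servers and reports the maximum number of tuples any server receives. The function name `hash_join_one_round`, the relation schemas R(a, b) and S(b, c), and the sample data are assumptions for illustration and are not taken from the book.

```python
# One-round hash join in a simulated MPC setting.
# Servers are plain Python lists; all names and data are hypothetical.

def hash_join_one_round(r, s, p):
    """Join R(a, b) with S(b, c) on b using p simulated servers.

    Returns (sorted join result, maximum per-server load in tuples).
    """
    servers = [{"r": [], "s": []} for _ in range(p)]

    # Single communication round: route every tuple by its join key.
    for a, b in r:
        servers[hash(b) % p]["r"].append((a, b))
    for b, c in s:
        servers[hash(b) % p]["s"].append((b, c))

    # The load of a server is the number of tuples it received.
    max_load = max(len(srv["r"]) + len(srv["s"]) for srv in servers)

    # Local computation is free in the model: join within each server.
    out = []
    for srv in servers:
        by_key = {}
        for b, c in srv["s"]:
            by_key.setdefault(b, []).append(c)
        for a, b in srv["r"]:
            for c in by_key.get(b, []):
                out.append((a, b, c))
    return sorted(out), max_load


R = [("a1", 1), ("a2", 2), ("a3", 1)]
S = [(1, "c1"), (2, "c2"), (3, "c3")]
result, max_load = hash_join_one_round(R, S, p=2)
print(result)    # prints [('a1', 1, 'c1'), ('a2', 2, 'c2'), ('a3', 1, 'c1')]
print(max_load)  # prints 4
```

Integer join keys are used so that Python's `hash` is deterministic across runs. With skewed keys, a single server can receive most of the input, which is why the maximum load, rather than the total communication, is the quantity worth tracking in this sketch.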

Journal Article
TL;DR: This paper proposes a shared nearest-neighbor quantum game-based attribute reduction (SNNQGAR) algorithm that incorporates the hierarchical coevolutionary Spark model. The algorithm can be successfully applied to segment overlapping and interdependent fuzzy cerebral tissues, and it exhibits a stable and consistent segmentation performance for neonatal cerebral cortical surfaces.
Abstract: The unprecedented increase in data volume has become a severe challenge for conventional patterns of data mining and learning systems tasked with handling big data. The recently introduced Spark platform is a new processing method for big data analysis and related learning systems, which has attracted increasing attention from both the scientific community and industry. In this paper, we propose a shared nearest-neighbor quantum game-based attribute reduction (SNNQGAR) algorithm that incorporates the hierarchical coevolutionary Spark model. We first present a shared coevolutionary nearest-neighbor hierarchy with self-evolving compensation that considers the features of nearest-neighborhood attribute subsets and calculates the similarity between attribute subsets according to the shared neighbor information of attribute sample points. We then present a novel attribute weight tensor model to generate ranking vectors of attributes and apply them to balance the relative contributions of different neighborhood attribute subsets. To optimize the model, we propose an embedded quantum equilibrium game paradigm (QEGP) to ensure that noisy attributes do not degrade the big data reduction results. A combination of the hierarchical coevolutionary Spark model and an improved MapReduce framework is then constructed so that it can better parallelize the SNNQGAR to efficiently determine the preferred reduction solutions of the distributed attribute subsets. The experimental comparisons demonstrate the superior performance of the SNNQGAR, which outperforms most of the state-of-the-art attribute reduction algorithms. Moreover, the results indicate that the SNNQGAR can be successfully applied to segment overlapping and interdependent fuzzy cerebral tissues, and it exhibits a stable and consistent segmentation performance for neonatal cerebral cortical surfaces.

33 citations


Network Information
Related Topics (5)
Software: 130.5K papers, 2M citations, 76% related
Combustion: 172.3K papers, 1.9M citations, 72% related
Cluster analysis: 146.5K papers, 2.9M citations, 72% related
Cloud computing: 156.4K papers, 1.9M citations, 71% related
Hydrogen: 132.2K papers, 2.5M citations, 69% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2022    10
2021    429
2020    525
2019    661
2018    758
2017    683