Topic
Spark (mathematics)
About: Spark (mathematics) is a research topic. Over its lifetime, 7,304 publications have appeared within this topic, receiving 63,322 citations.
Papers
••
TL;DR: This paper explores Apache Spark with in-memory computing and introduces a novel feature descriptor, the adaptive local motion descriptor (ALMD), which considers both motion and appearance. ALMD extends the local ternary pattern used for static texture analysis and also generates persistent codes to describe local textures.
Abstract: Human action recognition plays a significant part in the computer vision and multimedia research community due to its numerous applications. However, despite the different approaches proposed to address this problem, some issues regarding the robustness and efficiency of action recognition still need to be solved. Moreover, due to the rapid development of multimedia applications from numerous sources, e.g., CCTV or video surveillance, there is an increasing demand for parallel processing of large-scale video data. In this paper, we introduce a novel approach to recognizing human actions. First, we explore Apache Spark with in-memory computing to address the task of human action recognition in a distributed environment. Second, we introduce a novel feature descriptor, the adaptive local motion descriptor (ALMD), which considers both motion and appearance; it is an extension of the local ternary pattern used for static texture analysis, and it also generates persistent codes to describe local textures. Finally, the random forest from the Spark machine learning library is employed to recognize the human actions. Experimental results show the superiority of the proposed approach over other state-of-the-art methods.
33 citations
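The abstract above describes ALMD as an extension of the local ternary pattern (LTP). As a rough illustration of that underlying idea only, the following is a minimal sketch of the standard LTP coding for a single 3x3 grayscale patch. It is not the paper's ALMD; the function names, the clockwise neighbor order, and the fixed threshold are assumptions made for this example.

```python
def local_ternary_pattern(patch, threshold):
    """Return the 8 ternary codes (-1, 0, +1) for the centre of a 3x3 patch.

    Hypothetical helper: `patch` is a 3x3 list of grey levels and
    `threshold` is the tolerance band around the centre value.
    """
    center = patch[1][1]
    # Neighbours visited clockwise from the top-left corner (an assumed order).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    codes = []
    for row, col in offsets:
        value = patch[row][col]
        if value >= center + threshold:
            codes.append(1)       # clearly brighter than the centre
        elif value <= center - threshold:
            codes.append(-1)      # clearly darker than the centre
        else:
            codes.append(0)       # within the tolerance band
    return codes


def split_upper_lower(codes):
    """Split ternary codes into the two binary patterns commonly used with LTP."""
    upper = [1 if code == 1 else 0 for code in codes]
    lower = [1 if code == -1 else 0 for code in codes]
    return upper, lower


patch = [[54, 60, 70],
         [40, 55, 52],
         [10, 58, 61]]
codes = local_ternary_pattern(patch, threshold=5)
upper, lower = split_upper_lower(codes)
```

Splitting the ternary code into an upper and a lower binary pattern is the usual way to keep histogram sizes manageable; how ALMD adapts the threshold or incorporates motion is not specified in the abstract and is not modeled here.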
•
09 Sep 2015
TL;DR: In this paper, a parallelized human body behavior identification method is presented, where skeleton data of Kinect is used as input; a distributed behavior identification algorithm is implemented based on a Spark computing framework; and a complete parallel identifying process is formed.
Abstract: The present invention discloses a parallelized human body behavior identification method. The method takes Kinect skeleton data as input, implements a distributed behavior identification algorithm on the Spark computing framework, and forms a complete parallel identification process. Human skeleton data is acquired using Kinect's scene depth sensing and preprocessed so that the features are invariant to displacement and scale. A human body structural vector, joint included angle information, and skeleton weight bias are selected as static behavior features, and a dynamic behavior search algorithm based on structural similarity is provided. For the identification algorithm, a neural network algorithm is parallelized on Spark, and the quasi-Newton method L-BFGS is adopted to optimize the network weight update process, which markedly increases training speed. For the identification platform, the Hadoop distributed file system (HDFS) serves as the behavior data storage layer, Spark runs on the general-purpose resource manager YARN, and the parallel neural network algorithm is the upper-layer application; the overall system architecture is highly extensible.
33 citations
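The abstract above lists joint included angles among its static features and requires invariance to displacement and scale. A minimal sketch of one such feature follows, assuming each joint is a 3D coordinate tuple; the function name and the sample coordinates are hypothetical and are not taken from the patent.

```python
import math


def joint_angle(parent, joint, child):
    """Angle in degrees at `joint` between the bones joint->parent and joint->child."""
    u = [p - j for p, j in zip(parent, joint)]
    v = [c - j for c, j in zip(child, joint)]
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("bone has zero length; angle is undefined")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


# Hypothetical shoulder, elbow, and wrist positions forming a right angle.
shoulder, elbow, wrist = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
angle = joint_angle(shoulder, elbow, wrist)

# The same pose shifted and scaled gives the same angle.
shifted = joint_angle((2.0, 5.0, 2.0), (2.0, 2.0, 2.0), (5.0, 2.0, 2.0))
```

Because the angle depends only on the directions of the two bones, translating or uniformly scaling the skeleton leaves it unchanged, which is one simple way to obtain the invariance the abstract asks for.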
•
22 Feb 2018
TL;DR: Algorithmic Aspects of Parallel Data Processing discusses recent algorithmic developments for distributed data processing and uses a theoretical model of parallel processing called the Massively Parallel Computation (MPC) model, which is a simplification of the BSP model.
Abstract: The last decade has seen a huge and growing interest in processing large data sets on large distributed clusters. This trend began with the MapReduce framework, and has been widely adopted by several other systems, including PigLatin, Hive, Scope, Dremel, Spark and Myria, to name a few. While the applications of such systems are diverse (for example, machine learning, data analytics), most involve relatively standard data processing tasks like identifying relevant data, cleaning, filtering, joining, grouping, transforming, extracting features, and evaluating results. This has generated great interest in the study of algorithms for data processing on large distributed clusters. Algorithmic Aspects of Parallel Data Processing discusses recent algorithmic developments for distributed data processing. It uses a theoretical model of parallel processing called the Massively Parallel Computation (MPC) model, which is a simplification of the BSP model where the only cost is given by the amount of communication and the number of communication rounds. The survey studies several algorithms for multi-join queries, sorting, and matrix multiplication. It discusses their relationships and common techniques applied across the different data processing tasks.
33 citations
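In the MPC model described in the abstract above, cost is measured by the amount of communication and the number of rounds. As an illustrative sketch only, the following simulates a one-round hash-partitioned join of R(a, b) and S(b, c) on a single machine and reports the maximum per-server load. The function name, the use of `b % num_servers` as the hash, and the integer join keys are assumptions for this example and are not taken from the survey.

```python
from collections import defaultdict


def mpc_hash_join(r_tuples, s_tuples, num_servers):
    """Simulate a one-round hash join; return (output tuples, max server load)."""
    # Communication round: route each tuple to the server chosen by its join key.
    r_parts = defaultdict(list)
    s_parts = defaultdict(list)
    for a, b in r_tuples:
        r_parts[b % num_servers].append((a, b))
    for b, c in s_tuples:
        s_parts[b % num_servers].append((b, c))

    # Local computation: each server joins only the tuples it received.
    output = []
    for server in range(num_servers):
        by_key = defaultdict(list)
        for a, b in r_parts[server]:
            by_key[b].append(a)
        for b, c in s_parts[server]:
            for a in by_key[b]:
                output.append((a, b, c))

    # Load = number of tuples a server received in the single round.
    max_load = max(len(r_parts[s]) + len(s_parts[s]) for s in range(num_servers))
    return output, max_load


R = [(1, 10), (2, 11), (3, 10)]
S = [(10, "x"), (11, "y"), (12, "z")]
joined, load = mpc_hash_join(R, S, num_servers=2)
```

With skewed join keys, a single server can receive most of the input, so the maximum load grows well beyond the average; handling that skew is a recurring concern for join algorithms in this model.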
••
TL;DR: This paper proposes a shared nearest-neighbor quantum game-based attribute reduction (SNNQGAR) algorithm that incorporates the hierarchical coevolutionary Spark model. The algorithm can be successfully applied to segment overlapping and interdependent fuzzy cerebral tissues, and it exhibits a stable and consistent segmentation performance for neonatal cerebral cortical surfaces.
Abstract: The unprecedented increase in data volume has become a severe challenge for conventional patterns of data mining and learning systems tasked with handling big data. The recently introduced Spark platform is a new processing method for big data analysis and related learning systems, which has attracted increasing attention from both the scientific community and industry. In this paper, we propose a shared nearest-neighbor quantum game-based attribute reduction (SNNQGAR) algorithm that incorporates the hierarchical coevolutionary Spark model. We first present a shared coevolutionary nearest-neighbor hierarchy with self-evolving compensation that considers the features of nearest-neighborhood attribute subsets and calculates the similarity between attribute subsets according to the shared neighbor information of attribute sample points. We then present a novel attribute weight tensor model to generate ranking vectors of attributes and apply them to balance the relative contributions of different neighborhood attribute subsets. To optimize the model, we propose an embedded quantum equilibrium game paradigm (QEGP) to ensure that noisy attributes do not degrade the big data reduction results. A combination of the hierarchical coevolutionary Spark model and an improved MapReduce framework is then constructed so that it can better parallelize the SNNQGAR to efficiently determine the preferred reduction solutions of the distributed attribute subsets. The experimental comparisons demonstrate the superior performance of the SNNQGAR, which outperforms most of the state-of-the-art attribute reduction algorithms. Moreover, the results indicate that the SNNQGAR can be successfully applied to segment overlapping and interdependent fuzzy cerebral tissues, and it exhibits a stable and consistent segmentation performance for neonatal cerebral cortical surfaces.
33 citations