Topic

Degree of parallelism

About: Degree of parallelism is a research topic. Over the lifetime, 1515 publications have been published within this topic receiving 25546 citations.


Papers
Proceedings Article
01 Dec 2010
TL;DR: A Binary Neural Network Classifier (BNNC) is proposed in which hidden layer training is done in parallel; it offers a high degree of parallelism in hidden layer formation and can be used for voluminous realistic databases.
Abstract: In this paper, a Binary Neural Network Classifier (BNNC) is proposed in which hidden layer training is done in parallel. The learning algorithm for the BNNC is described; it is based on the principle of the Fast Covering Learning Algorithm (FCLA) proposed by Wang and Chaudhari [1]. The BNNC offers a high degree of parallelism in hidden layer formation. Each module in the hidden layer of the BNNC is exposed to the patterns of only one class. To achieve better accuracy, the issue of overlapped classes is also handled. The method is tested on a few benchmark datasets, and the accuracies are within the acceptable range. Because of parallelism at the hidden layer level, training time is decreased, so the method can be used for voluminous realistic databases. An analytical formulation is developed to evaluate the number of hidden layer neurons, which is in O(log(N)), where N represents the number of inputs.

3 citations

Journal Article
TL;DR: This paper presents replicated data algorithms for digital image convolutions and median filtering, and compares their performance with conventional data parallel algorithms for the same tasks on three popular array interconnection networks, namely, the 2-D mesh, the 3-D mesh, and the hypercube.
Abstract: Data parallel processing on processor array architectures has gained popularity in data intensive applications, such as image processing and scientific computing, as massively parallel processor array machines became commercially feasible. The data parallel paradigm of assigning one processing element to each data element results in inefficient utilization of a large processor array when a relatively small data structure is processed on it. The large degree of parallelism of a massively parallel processor array machine does not solve a problem involving relatively small data structures any faster than the modest degree of parallelism of a machine that is just as large as the data structure. We presented a data replication technique to speed up the processing of small data structures on large processor arrays. In this paper, we present replicated data algorithms for digital image convolutions and median filtering, and compare their performance with conventional data parallel algorithms for the same tasks on three popular array interconnection networks, namely, the 2-D mesh, the 3-D mesh, and the hypercube.

3 citations

Dissertation
01 Jan 2014
TL;DR: A number of modified algorithms, for accelerating the identification of halos and sub-structures, using entry-level graphics hardware, based on an adaptive hierarchical refinement of the friends-of-friends (FoF) method using six phase-space dimensions are presented.
Abstract: Cosmological simulations are used by astronomers to investigate large scale structure formation and galaxy evolution. Structure finding, that is, the discovery of gravitationally-bound objects such as dark matter halos, is a crucial step in many such simulations. During recent years, advancing computational capacity has led to halo-finders needing to manage increasingly large simulations. As a result, many multi-core solutions have arisen in an attempt to process these simulations more efficiently. However, a many-core approach to the problem using graphics processing units (GPUs) appears largely unexplored. Since these simulations are inherently n-body problems, they contain a high degree of parallelism, which makes them very well suited to a GPU architecture. Therefore, it makes sense to determine the potential for further research in halo-finding algorithms on a GPU. We present a number of modified algorithms for accelerating the identification of halos and sub-structures using entry-level graphics hardware. The algorithms are based on an adaptive hierarchical refinement of the friends-of-friends (FoF) method using six phase-space dimensions, which allows for robust tracking of sub-structures. These methods are highly amenable to parallel implementation and run on GPUs. We implemented four separate systems: two on GPUs and two on CPUs. The first system for both CPU and GPU was implemented as a proof of concept exercise to familiarise us with the problem; these utilised minimum spanning trees (MSTs) and brute force methods. Our second implementation, for the CPU and GPU, capitalised on knowledge gained from the proof of concept applications, leading us to use kd-trees to efficiently solve the problem. The CPU implementations were intended to serve as benchmarks for our GPU applications.
In order to verify the efficacy of the implemented systems, we applied our halo finders to cosmological simulations of varying size and compared the results obtained to those given by a widely used FoF commercial halo-finder. To conduct a fair comparison, CPU benchmarks
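To make the friends-of-friends (FoF) idea in the abstract above concrete, here is a minimal brute-force sketch in Python. It is not the dissertation's implementation: it uses three position coordinates rather than six phase-space dimensions, performs no hierarchical refinement, and runs on the CPU. The function name `friends_of_friends` and the `linking_length` parameter are illustrative assumptions.

```python
from itertools import combinations
from math import dist


def friends_of_friends(points, linking_length):
    """Group points so that two points share a group whenever a chain of
    'friends' (pairs closer than linking_length) connects them."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force O(n^2) pair check. Each distance test is independent of
    # the others, which is where the parallelism of the problem comes from.
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= linking_length:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical toy data: three particles in a chain and one isolated particle.
points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
groups = friends_of_friends(points, linking_length=0.6)
print(groups)
```

Particles 0 and 2 are farther apart than the linking length, yet they end up in the same group because particle 1 is a friend of both. A spatial index such as the kd-trees mentioned in the abstract would replace the quadratic pair loop in a practical halo finder.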

3 citations

Patent
03 Nov 2014
TL;DR: In this article, a window operator can be processed according to a variety of techniques that introduce parallelism, and the window function sub-results can be calculated separately on different nodes.
Abstract: A window operator can be processed according to a variety of techniques that introduce parallelism. Window function sub-results can be calculated separately on different nodes. Overall superior performance can result. Skewness in input data can be accounted for by controlling a degree of parallelism at nodes.
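As a rough illustration of computing window function sub-results separately, the sketch below evaluates a running sum per partition in a thread pool. This is an assumption-laden toy, not the patented method: worker threads stand in for nodes, the names `running_sum`, `parallel_window`, and `degree_of_parallelism` are hypothetical, and no handling of skewed input is attempted.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from itertools import accumulate


def running_sum(partition):
    """Window function for one partition: cumulative sum in order_key order."""
    key, rows = partition
    rows = sorted(rows)  # rows are (order_key, value) pairs
    totals = accumulate(value for _, value in rows)
    return [(key, order_key, total) for (order_key, _), total in zip(rows, totals)]


def parallel_window(rows, degree_of_parallelism):
    """Sketch of SUM(value) OVER (PARTITION BY key ORDER BY order_key)."""
    partitions = defaultdict(list)
    for key, order_key, value in rows:
        partitions[key].append((order_key, value))
    # Each partition yields an independent sub-result, so partitions can be
    # processed concurrently; max_workers caps the degree of parallelism.
    with ThreadPoolExecutor(max_workers=degree_of_parallelism) as pool:
        results = list(pool.map(running_sum, sorted(partitions.items())))
    return [row for part in results for row in part]


# Hypothetical input rows: (partition key, order key, value).
rows = [("a", 2, 10), ("b", 1, 5), ("a", 1, 1), ("b", 2, 7)]
result = parallel_window(rows, degree_of_parallelism=2)
print(result)
```

Because partitions do not share state, the sub-results can simply be concatenated. One very large partition would still be processed by a single worker here, which is the kind of skew the abstract says is addressed by controlling the degree of parallelism at nodes.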

3 citations

Journal Article
TL;DR: A new optimized Monte Carlo algorithm that drastically reduces the number of iterations is designed, together with an efficient parallel version implemented on GPU. Experimental results show that the algorithm is efficient enough to be comparable with the formula compilation approach, with the significant advantage of avoiding exponential behavior.
Abstract: In recent years, probabilistic data management has received a lot of attention due to several applications that deal with uncertain data: RFID systems, sensor networks, data cleaning, scientific and biomedical data management, and approximate schema mappings. Query evaluation is a challenging problem in probabilistic databases, proved to be #P-hard. A general method for query evaluation is based on the lineage of the query and reduces the query evaluation problem to computing the probability of a propositional formula. The main approaches proposed in the literature to approximate the confidence computation of probabilistic queries are based on Monte Carlo simulation or on formula compilation into decision diagrams (e.g., d-trees). The former executes a polynomial, but too large, number of iterations, while the latter is polynomial for easy queries but may be exponential in the worst case. We designed a new optimized Monte Carlo algorithm that drastically reduces the number of iterations and proposed an efficient parallel version that we implemented on GPU. Thanks to the high degree of parallelism provided by the GPU, combined with the linear speedup of our algorithm, we managed to significantly reduce the long running time required by a sequential Monte Carlo algorithm. Experimental results show that our algorithm is efficient enough to be comparable with the formula compilation approach, with the significant advantage of avoiding exponential behavior.
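The reduction described above, from query evaluation to the probability of a propositional formula, can be illustrated with a naive sequential Monte Carlo estimator. This is the plain baseline, not the optimized or GPU algorithm of the paper; the DNF encoding, the function name `estimate_probability`, and the example probabilities are assumptions made for illustration.

```python
import random


def estimate_probability(clauses, prob, samples, seed=0):
    """Naive Monte Carlo estimate of P(formula is true).

    clauses: DNF formula as a list of clauses, each a list of variable names
             (positive literals only, an assumption of this sketch).
    prob:    dict mapping each variable to its independent probability.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Sample one possible world: each variable is independently true.
        world = {var: rng.random() < p for var, p in prob.items()}
        if any(all(world[v] for v in clause) for clause in clauses):
            hits += 1
    return hits / samples


# Hypothetical lineage formula: (x1 AND x2) OR x3.
clauses = [["x1", "x2"], ["x3"]]
prob = {"x1": 0.5, "x2": 0.5, "x3": 0.5}
estimate = estimate_probability(clauses, prob, samples=20000)
# The two clauses share no variables, so the exact value is easy to compute.
exact = 1 - (1 - 0.5 * 0.5) * (1 - 0.5)  # 0.625
print(estimate, exact)
```

Every sample is drawn and checked independently of the others, so the loop body maps naturally onto many parallel threads. That independence is what makes a GPU implementation attractive for this kind of estimator.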

3 citations


Network Information
Related Topics (5)
Server: 79.5K papers, 1.4M citations, 85% related
Scheduling (computing): 78.6K papers, 1.3M citations, 83% related
Network packet: 159.7K papers, 2.2M citations, 80% related
Web service: 57.6K papers, 989K citations, 80% related
Quality of service: 77.1K papers, 996.6K citations, 79% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2022  1
2021  47
2020  48
2019  52
2018  70
2017  75