Topic

Degree of parallelism

About: Degree of parallelism is a research topic. Over its lifetime, 1,515 publications have been published within this topic, receiving 25,546 citations.

Papers

Proceedings Article

Binary Neural Network Classifier and it's bound for the number of hidden layer neurons

Narendra S. Chaudhari¹, Aruna Tiwari

Indian Institute of Technology Indore¹

01 Dec 2010

TL;DR: A Binary Neural Network Classifier (BNNC) is proposed in which hidden layer training is done in parallel; it offers a high degree of parallelism in hidden layer formation and can be used for voluminous realistic databases.

Abstract: In this paper, a Binary Neural Network Classifier (BNNC) is proposed in which hidden layer training is done in parallel. The learning algorithm for the BNNC is described; it is based on the principle of the Fast Covering Learning Algorithm (FCLA) proposed by Wang and Chaudhari [1]. The BNNC offers a high degree of parallelism in hidden layer formation. Each module in the hidden layer of the BNNC is exposed to the patterns of only one class. To achieve better accuracy, the issue of overlapped classes is also handled. The method is tested on a few benchmark datasets, and the accuracies are within the acceptable range. Because of parallelism at the hidden layer level, training time is decreased, so the method can be used for voluminous realistic databases. An analytical formulation is developed to evaluate the number of hidden layer neurons, which is O(log(N)), where N represents the number of inputs.

3 citations

Journal Article

Replicated image algorithms and their analyses on SIMD machines

P. J. Narayanan¹, Larry S. Davis¹

University of Maryland, College Park¹

01 Jan 1992-International Journal of Pattern Recognition and Artificial Intelligence

TL;DR: This paper presents replicated data algorithms for digital image convolutions and median filtering, and compares their performance with conventional data parallel algorithms on three popular array interconnection networks: the 2-D mesh, the 3-D mesh, and the hypercube.

Abstract: Data parallel processing on processor array architectures has gained popularity in data intensive applications, such as image processing and scientific computing, as massively parallel processor array machines became feasible commercially. The data parallel paradigm of assigning one processing element to each data element results in inefficient utilization of a large processor array when a relatively small data structure is processed on it. The large degree of parallelism of a massively parallel processor array machine does not result in a faster solution to a problem involving relatively small data structures than the modest degree of parallelism of a machine that is just as large as the data structure. We presented a data replication technique to speed up the processing of small data structures on large processor arrays. In this paper, we present replicated data algorithms for digital image convolutions and median filtering, and compare their performance with conventional data parallel algorithms for the same tasks on three popular array interconnection networks, namely, the 2-D mesh, the 3-D mesh, and the hypercube.

3 citations

Dissertation

Fast galactic structure finding using graphics processing units

Daniel Wood

01 Jan 2014

TL;DR: A number of modified algorithms are presented for accelerating the identification of halos and sub-structures using entry-level graphics hardware, based on an adaptive hierarchical refinement of the friends-of-friends (FoF) method using six phase-space dimensions.

Abstract: Cosmological simulations are used by astronomers to investigate large scale structure formation and galaxy evolution. Structure finding, that is, the discovery of gravitationally-bound objects such as dark matter halos, is a crucial step in many such simulations. During recent years, advancing computational capacity has led to halo-finders needing to manage increasingly large simulations. As a result, many multi-core solutions have arisen in an attempt to process these simulations more efficiently. However, a many-core approach to the problem using graphics processing units (GPUs) appears largely unexplored. Since these simulations are inherently n-body problems, they contain a high degree of parallelism, which makes them very well suited to a GPU architecture. Therefore, it makes sense to determine the potential for further research in halo-finding algorithms on a GPU. We present a number of modified algorithms for accelerating the identification of halos and sub-structures using entry-level graphics hardware. The algorithms are based on an adaptive hierarchical refinement of the friends-of-friends (FoF) method using six phase-space dimensions; this allows for robust tracking of sub-structures. These methods are highly amenable to parallel implementation and run on GPUs. We implemented four separate systems: two on GPUs and two on CPUs. The first system for both CPU and GPU was implemented as a proof of concept exercise to familiarise us with the problem; these utilised minimum spanning trees (MSTs) and brute force methods. Our second implementation, for the CPU and GPU, capitalised on knowledge gained from the proof of concept applications, leading us to use kd-trees to efficiently solve the problem. The CPU implementations were intended to serve as benchmarks for our GPU applications.
In order to verify the efficacy of the implemented systems, we applied our halo finders to cosmological simulations of varying size and compared the results obtained to those given by a widely used FoF commercial halo-finder. To conduct a fair comparison, CPU benchmarks

3 citations
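
The friends-of-friends idea named in the abstract above can be illustrated with a minimal brute-force sketch. This is not the dissertation's implementation: it assumes plain Euclidean distance in however many dimensions the points have, uses a hypothetical `linking_length` parameter, and merges groups with a union-find structure rather than the MST or kd-tree approaches the abstract mentions.

```python
from itertools import combinations
from math import dist


def friends_of_friends(points, linking_length):
    """Group points into FoF clusters.

    Two points are "friends" if they lie within linking_length of each
    other; a group is a connected component of the friendship graph.
    Returns a sorted list of groups, each a list of point indices.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute force: test every pair, O(n^2) distance checks.
    # The distance checks are independent of each other; only the
    # merge step below touches shared state.
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= linking_length:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

The pairwise distance tests are the part that maps naturally onto many-core hardware; a spatial index such as a kd-tree would cut down the number of pairs that need testing.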

Patent

Parallelized execution of window operator

Di Wu, Boyung Lee, Yongsik Yoon

03 Nov 2014

TL;DR: A window operator can be processed according to a variety of techniques that introduce parallelism, and window function sub-results can be calculated separately on different nodes.

Abstract: A window operator can be processed according to a variety of techniques that introduce parallelism. Window function sub-results can be calculated separately on different nodes. Overall superior performance can result. Skewness in input data can be accounted for by controlling a degree of parallelism at nodes.

3 citations
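
As a rough illustration of computing window function sub-results separately, the sketch below evaluates a running sum per partition, with each partition handled by its own worker. It is not the patented method: worker threads stand in for nodes, the function name and the `degree_of_parallelism` parameter are hypothetical, and the example shows only the structure of the computation, not a performance gain.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from itertools import accumulate


def running_sum_by_partition(rows, degree_of_parallelism=2):
    """Compute SUM(value) OVER (PARTITION BY key ORDER BY order_key).

    rows: list of (partition_key, order_key, value) tuples.
    Returns {partition_key: [(order_key, running_sum), ...]}.
    """
    # Split the input by partition key; partitions are independent.
    partitions = defaultdict(list)
    for key, order, value in rows:
        partitions[key].append((order, value))

    def window_sub_result(item):
        # One sub-result: order the partition, then accumulate values.
        key, part = item
        part = sorted(part)
        sums = accumulate(v for _, v in part)
        return key, [(o, s) for (o, _), s in zip(part, sums)]

    # max_workers plays the role of the degree of parallelism
    # (an assumption of this sketch; threads stand in for nodes).
    with ThreadPoolExecutor(max_workers=degree_of_parallelism) as pool:
        return dict(pool.map(window_sub_result, partitions.items()))
```

With skewed input, one partition can dominate the running time, which is the situation in which adjusting the degree of parallelism becomes relevant.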

Journal Article

An Effective GPU-Based Approach to Probabilistic Query Confidence Computation

Edoardo Serra¹, Francesca Spezzano¹

University of Maryland, College Park¹

01 Jan 2015-IEEE Transactions on Knowledge and Data Engineering

TL;DR: A new optimized Monte Carlo algorithm that drastically reduces the number of iterations is designed, together with an efficient parallel version implemented on GPU. Experimental results show that the algorithm is efficient enough to be comparable with the formula compilation approach, with the significant advantage of avoiding exponential behavior.

Abstract: In recent years, probabilistic data management has received a lot of attention due to several applications that deal with uncertain data: RFID systems, sensor networks, data cleaning, scientific and biomedical data management, and approximate schema mappings. Query evaluation is a challenging problem in probabilistic databases, proved to be #P-hard. A general method for query evaluation is based on the lineage of the query and reduces the query evaluation problem to computing the probability of a propositional formula. The main approaches proposed in the literature to approximate probabilistic query confidence computation are based on Monte Carlo simulation or on formula compilation into decision diagrams (e.g., d-trees). The former executes a polynomial, but too large, number of iterations, while the latter is polynomial for easy queries but may be exponential in the worst case. We designed a new optimized Monte Carlo algorithm that drastically reduces the number of iterations and proposed an efficient parallel version that we implemented on GPU. Thanks to the elevated degree of parallelism provided by the GPU, combined with the linear speedup of our algorithm, we managed to significantly reduce the long running time required by a sequential Monte Carlo algorithm. Experimental results show that our algorithm is so efficient as to be comparable with the formula compilation approach, but with the significant advantage of avoiding exponential behavior.

3 citations
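
The reduction described in the abstract above, from query confidence to the probability of a propositional formula, can be sketched with a naive sequential Monte Carlo estimator. This is the plain baseline, not the authors' optimized algorithm: it assumes the lineage is a DNF over positive, independent tuple variables, and the function name, `samples`, and `seed` are hypothetical.

```python
import random


def estimate_confidence(lineage, probs, samples=20000, seed=0):
    """Estimate the probability that a DNF lineage formula is true.

    lineage: list of clauses, each a list of tuple ids (all positive).
    probs: {tuple_id: marginal probability}, tuples assumed independent.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Sample one possible world: each tuple is present independently.
        world = {t for t, p in probs.items() if rng.random() < p}
        # The formula holds if some clause has all of its tuples present.
        if any(all(t in world for t in clause) for clause in lineage):
            hits += 1
    return hits / samples
```

Each sampled world is independent of the others, so the loop iterations can be distributed across many processing elements and their hit counts summed at the end.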

Metrics

Papers: 1,515
Citations: 27,447

No. of papers in the topic in previous years
Year	Papers
2022	1
2021	47
2020	48
2019	52
2018	70
2017	75
