
Showing papers by "Rezaul Chowdhury" published in 2010


Proceedings Article
19 Apr 2010
TL;DR: This work introduces a multicore-oblivious (MO) approach to algorithms and schedulers for HM, and presents efficient MO algorithms for several fundamental problems including matrix transposition, FFT, sorting, the Gaussian Elimination Paradigm, list ranking, and connected components.
Abstract: We address the design of algorithms for multicores that are oblivious to machine parameters. We propose HM, a multicore model consisting of a parallel shared-memory machine with hierarchical multi-level caching, and we introduce a multicore-oblivious (MO) approach to algorithms and schedulers for HM. An MO algorithm is specified with no mention of any machine parameters, such as the number of cores, number of cache levels, cache sizes and block lengths. However, it is equipped with a small set of instructions that can be used to provide hints to the run-time scheduler on how to schedule parallel tasks. We present efficient MO algorithms for several fundamental problems including matrix transposition, FFT, sorting, the Gaussian Elimination Paradigm, list ranking, and connected components. The notion of an MO algorithm is complementary to that of a network-oblivious (NO) algorithm, recently introduced by Bilardi et al. for parallel distributed-memory machines where processors communicate point-to-point. We show that several of our MO algorithms translate into efficient NO algorithms, adding to the body of known efficient NO algorithms.
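To illustrate what "no mention of any machine parameters" can look like, here is a minimal sketch of the classic sequential cache-oblivious matrix transpose by recursive halving. This is not the paper's MO algorithm: it has no scheduler hints and no parallelism, and the function names `transpose_block` and `transposed` are hypothetical. It only shows that the code never refers to cache sizes, block lengths, or core counts.

```python
from typing import List

Matrix = List[List[float]]


def transpose_block(a: Matrix, b: Matrix, r0: int, r1: int, c0: int, c1: int) -> None:
    """Write the transpose of a[r0:r1][c0:c1] into b, splitting the longer side.

    No cache size or block length appears anywhere: the recursion keeps
    halving until sub-blocks are single elements, so at some depth the
    sub-blocks fit in every level of any cache hierarchy.
    """
    rows, cols = r1 - r0, c1 - c0
    if rows <= 0 or cols <= 0:
        return
    if rows == 1 and cols == 1:
        b[c0][r0] = a[r0][c0]
        return
    if rows >= cols:
        mid = r0 + rows // 2  # split the row range in half
        transpose_block(a, b, r0, mid, c0, c1)
        transpose_block(a, b, mid, r1, c0, c1)
    else:
        mid = c0 + cols // 2  # split the column range in half
        transpose_block(a, b, r0, r1, c0, mid)
        transpose_block(a, b, r0, r1, mid, c1)


def transposed(a: Matrix) -> Matrix:
    """Return a new matrix holding the transpose of a (assumed rectangular)."""
    rows = len(a)
    cols = len(a[0]) if rows else 0
    b: Matrix = [[0.0] * rows for _ in range(cols)]
    transpose_block(a, b, 0, rows, 0, cols)
    return b
```

In a parallel setting the two recursive calls in each branch are independent, which is the kind of task structure a run-time scheduler could exploit; how the paper's hints express that is not shown here.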

70 citations


Journal Article
TL;DR: The results indicate that cache-oblivious GEP offers an attractive trade-off between efficiency and portability.
Abstract: We consider triply-nested loops of the type that occur in the standard Gaussian elimination algorithm, which we denote by GEP (or the Gaussian Elimination Paradigm). We present two related cache-oblivious methods I-GEP and C-GEP, both of which reduce the number of cache misses incurred (or I/Os performed) by the computation over that performed by standard GEP by a factor of $\sqrt{M}$, where M is the size of the cache. Cache-oblivious I-GEP computes in-place and solves most of the known applications of GEP including Gaussian elimination and LU-decomposition without pivoting and Floyd-Warshall all-pairs shortest paths. Cache-oblivious C-GEP uses a modest amount of additional space, but is completely general and applies to any code in GEP form. Both I-GEP and C-GEP produce system-independent cache-efficient code, and are potentially applicable to being used by optimizing compilers for loop transformation. We present parallel I-GEP and C-GEP that achieve good speed-up and match the sequential caching performance cache-obliviously for both shared and distributed caches for sufficiently large inputs. We present extensive experimental results for both in-core and out-of-core performance of our algorithms. We consider both sequential and parallel implementations, and compare them with finely-tuned cache-aware BLAS code for matrix multiplication and Gaussian elimination without pivoting. Our results indicate that cache-oblivious GEP offers an attractive trade-off between efficiency and portability.
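The triply-nested loop form described in the abstract can be sketched as follows. This is a plain, non-cache-oblivious rendering of a GEP-style loop, assuming the usual shape in which each update of an entry reads that entry, one entry from its row, one from its column, and the pivot. The names `gep`, `f`, and `in_sigma` are illustrative choices, and the code is not I-GEP or C-GEP.

```python
from typing import Callable, List

Matrix = List[List[float]]


def gep(c: Matrix,
        f: Callable[[float, float, float, float], float],
        in_sigma: Callable[[int, int, int], bool]) -> Matrix:
    """Run a GEP-style triply-nested loop in place on the square matrix c.

    f        : update function applied to (c[i][j], c[i][k], c[k][j], c[k][k])
    in_sigma : predicate selecting which (i, j, k) triples perform an update
    """
    n = len(c)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if in_sigma(i, j, k):
                    c[i][j] = f(c[i][j], c[i][k], c[k][j], c[k][k])
    return c


def floyd_warshall(dist: Matrix) -> Matrix:
    """All-pairs shortest paths: every triple updates, f takes a minimum."""
    return gep(dist, lambda x, u, v, w: min(x, u + v), lambda i, j, k: True)


def eliminate_no_pivot(a: Matrix) -> Matrix:
    """Gaussian elimination without pivoting: update only entries with i, j > k."""
    return gep(a, lambda x, u, v, w: x - u * v / w,
               lambda i, j, k: i > k and j > k)
```

Both applications share the same loop skeleton and differ only in `f` and `in_sigma`, which is why a single transformation of the loop nest can serve several algorithms.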

45 citations


Proceedings Article
01 Sep 2010
TL;DR: This paper presents fast multi-level grid based approximation algorithms for efficiently estimating the compute-intensive terms of $E_{MM}$ and $G_{sol}$, and demonstrates the efficiency and scalability of fast free energy estimation for bio-molecules, potentially with millions of atoms.
Abstract: Bio-molecules reach their stable configuration in solvent which is primarily water with a small concentration of salt ions. One approximation of the total free energy of a bio-molecule includes the classical molecular mechanical energy $E_{MM}$ (which is understood as the self intra-molecular energy in vacuum) and the solvation energy $G_{sol}$ which is caused by the change of the environment of the molecule from vacuum to solvent (and hence also known as the molecule-solvent interaction energy). This total free energy is used to model and study the stability of bio-molecules in isolation or in their interactions with drugs. In this paper we present fast $O(N \log N)$ multi-level grid based approximation algorithms (where N is the number of atoms) for efficiently estimating the compute-intensive terms of $E_{MM}$ and $G_{sol}$. The fast octree-based algorithm for $G_{sol}$ is additionally dependent on an $O(N)$ size computation of the biomolecular surface and its spatial derivatives (normals). We also provide several examples with timing results, and speed/accuracy tradeoffs, demonstrating the efficiency and scalability of our fast free energy estimation of bio-molecules, potentially with millions of atoms.

6 citations