scispace - formally typeset
Search or ask a question

Showing papers by "Rezaul Chowdhury published in 2007"


Proceedings ArticleDOI
09 Jun 2007
TL;DR: This paper establishes several important properties of this cache-oblivious framework, and extends the framework to solve GEP in its full generality within the same time and I/O bounds, and presents extensive experimental results for both in-core and out-of-core performance of the algorithms.
Abstract: The Gaussian Elimination Paradigm (GEP) was introduced by the authors in [6] to represent the triply-nested loop computation that occurs in several important algorithms including Gaussian elimination without pivoting and Floyd-Warshall's all-pairs shortest paths algorithm. An efficient cache-oblivious algorithm for these instances of GEP was presented in [6]. In this paper we establish several important properties of this cache-oblivious framework, and extend the framework to solve GEP in its full generality within the same time and I/O bounds. We then analyze a parallel implementation of the framework and its caching performance for both shared and distributed caches. We present extensive experimental results for both in-core and out-of-core performance of our algorithms. We consider both sequential and parallel implementations of our algorithms, and compare them with finely-tuned cache-aware BLAS code for matrix multiplication and Gaussian elimination without pivoting. Our results indicate that cache-oblivious GEP offers an attractive tradeoff between efficiency and portability.

28 citations


Dissertation
01 Jan 2007
TL;DR: A general cache-oblivious framework for cache-efficient sequential and parallel solution of any problem in GEP, and experimental results suggest that the GEP framework offers an attractive trade-off between efficiency and portability.
Abstract: The ideal-cache model is an abstraction of the memory hierarchy in modern computers which facilitates the design of algorithms that can use the caches (i.e., memory levels) in the hierarchy efficiently without using the knowledge of cache parameters. In addition to possibly running faster than traditional flat-memory algorithms due to reduced cache-misses, these cache-oblivious algorithms are also system-independent and thus more portable than cache-aware algorithms. These algorithms are useful both in applications that work on massive datasets and in applications that run on small-memory systems such as handheld devices. The major contribution of this dissertation is a number of new cache-efficient and cache-oblivious algorithms and data structures for problems in three different domains: graph algorithms, problems in the Gaussian Elimination Paradigm (GEP), and problems with dynamic programming algorithms. Among graph problems we concentrate on shortest path computation, and for the computation-intensive problems in the latter two domains we also present efficient parallelizations of our cache-oblivious algorithms for distributed and shared caches. We perform extensive experimental study of most of our algorithms, and compare them with best known existing algorithms and software. In the area of graph algorithms and data structures, we introduce the first efficient cache-oblivious priority queue supporting Decrease-Key operations, and use it to obtain the first non-trivial cache-oblivious single-source shortest path algorithms for both directed and undirected graphs with general non-negative edge-weights. Our experimental results show that shortest path computation using a light-weight version of this new priority queue is faster than using highly optimized traditional priority queues even when the computation is in-core. We also present several new cache-efficient exact and approximate all-pairs shortest path algorithms for both weighted and unweighted undirected graphs. The Gaussian Elimination Paradigm (GEP) includes many important practical problems with constructs similar to that in Gaussian elimination without pivoting, e.g., Floyd-Warshall's all-pairs shortest path, LU decomposition without pivoting, matrix multiplication, etc. We present a general cache-oblivious framework for cache-efficient sequential and parallel solution of any problem in GEP. Our experimental results comparing our cache-oblivious algorithms with industrial-strength cache-aware BLAS (i.e., Basic Linear Algebra Subprogram) code suggest that our GEP framework offers an attractive trade-off between efficiency and portability. In the domain of dynamic programs, we present efficient cache-oblivious sequential and parallel algorithms for a number of important dynamic programs in bioinformatics including optimal pairwise sequence alignment, median of three sequences, and RNA secondary structure prediction with and without (simple) pseudo-knots. All our algorithms improve significantly over the cache complexity of earlier algorithms, and either match or improve over their space complexity. We empirically compare most of our algorithms with the best publicly available code written by others, and our experimental results indicate that our algorithms run faster than these software.

18 citations