Cache-oblivious shortest paths in graphs using buffer heap

doi:10.1145/1007912.1007949


Proceedings Article

Rezaul Chowdhury¹, Vijaya Ramachandran¹

University of Texas at Austin¹

27 Jun 2004, pp. 245-254

TL;DR: The paper presents the Buffer Heap, a cache-oblivious priority queue with a Decrease-Key operation, and uses it to obtain what appear to be the first non-trivial cache-oblivious bounds for undirected and directed single-source shortest path (SSSP) problems on general graphs with non-negative edge weights.


Abstract: We present the Buffer Heap (BH), a cache-oblivious priority queue that supports Delete-Min, Delete, and Decrease-Key operations in $O(\frac{1}{B}\log_2\frac{N}{B})$ amortized block transfers from external memory, where $B$ is the (unknown) block size and $N$ is the maximum number of elements in the queue. As is common in cache-oblivious algorithms, we assume a 'tall cache' (i.e., $M = \Omega(B^{1+\epsilon})$, where $M$ is the size of the main memory). We also assume the Decrease-Key operation only verifies that the element does not exist in the priority queue with a smaller key value, hence it also supports the insert operation in the same amortized bound. The amortized time bound for each operation is $O(\log N)$. We also present a Cache-Oblivious Tournament Tree (COTT), which is simpler than the Buffer Heap, but has weaker bounds. Using the Buffer Heap we present cache-oblivious algorithms for undirected and directed single-source shortest path (SSSP) problems for graphs with non-negative edge weights. On a graph with $V$ vertices and $E$ edges, our algorithm for the undirected case performs $O(V + \frac{E}{B}\log_2\frac{V}{B})$ block transfers and for the directed case performs $O((V + \frac{E}{B}) \cdot \log_2\frac{V}{B})$ block transfers. The running time of both algorithms is $O((V + E) \cdot \log V)$. For both priority queues with Decrease-Key operation, and for shortest path problems on general graphs, our results appear to give the first non-trivial cache-oblivious bounds.
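The priority-queue interface described in the abstract (Delete-Min, Delete, and a Decrease-Key that also serves as Insert) can be illustrated with a small in-memory sketch. This is not the Buffer Heap and makes no claim about block transfers: it is an ordinary binary heap with lazy deletion, and the names `WeakDecreaseKeyQueue` and `sssp` are hypothetical, introduced only for illustration. It assumes vertex labels are hashable and mutually comparable (for tie-breaking in the heap).

```python
import heapq


class WeakDecreaseKeyQueue:
    """In-memory stand-in for the interface in the abstract (NOT the Buffer Heap)."""

    def __init__(self):
        self._heap = []   # (key, item) entries; some may be stale
        self._best = {}   # item -> smallest key currently queued

    def decrease_key(self, item, key):
        # Weak Decrease-Key: do nothing if the item is already queued with a
        # key that is no larger; otherwise record the new key. An item that is
        # not queued is simply inserted, so this doubles as Insert.
        if item in self._best and self._best[item] <= key:
            return False
        self._best[item] = key
        heapq.heappush(self._heap, (key, item))
        return True

    def delete(self, item):
        # Lazy deletion: forget the item; its heap entries become stale.
        self._best.pop(item, None)

    def delete_min(self):
        # Pop entries until one matches the current key of its item.
        while self._heap:
            key, item = heapq.heappop(self._heap)
            if self._best.get(item) == key:
                del self._best[item]
                return item, key
        return None


def sssp(graph, source):
    """Dijkstra-style SSSP on {u: [(v, weight), ...]} with non-negative weights."""
    dist = {}
    pq = WeakDecreaseKeyQueue()
    pq.decrease_key(source, 0)
    while True:
        entry = pq.delete_min()
        if entry is None:
            break
        u, d = entry
        dist[u] = d  # u is settled; its distance is final
        for v, w in graph.get(u, ()):
            if v not in dist:
                pq.decrease_key(v, d + w)
    return dist
```

Because `decrease_key` ignores updates that would not lower the key, the SSSP loop can call it for every outgoing edge without first checking whether the target vertex is already queued.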


Citations


Book Chapter

Cache-oblivious algorithms and data structures

Gerth Stølting Brodal¹

Aarhus University¹

08 Jul 2004

TL;DR: An overview of the results achieved on cache-oblivious algorithms and data structures since the seminal paper by Frigo et al. in 1999 is given.

Abstract: Frigo, Leiserson, Prokop and Ramachandran in 1999 introduced the ideal-cache model as a formal model of computation for developing algorithms in environments with multiple levels of caching, and coined the terminology of cache-oblivious algorithms. Cache-oblivious algorithms are described as standard RAM algorithms with only one memory level, i.e. without any knowledge about memory hierarchies, but are analyzed in the two-level I/O model of Aggarwal and Vitter for an arbitrary memory and block size and an optimal off-line cache replacement strategy. The results are algorithms that automatically apply to multi-level memory hierarchies. This paper gives an overview of the results achieved on cache-oblivious algorithms and data structures since the seminal paper by Frigo et al.

113 citations

Cites background from "Cache-oblivious shortest paths in g..."

...Undirected single source shortest path (SSSP) can be solved cache-obliviously in O(V + (E/B) log(E/B)) I/Os [32, 37], matching the known bounds for the I/O model [51]....

Journal Article

Engineering a cache-oblivious sorting algorithm

Gerth Stølting Brodal¹, Rolf Fagerberg², Kristoffer Vinther

Aarhus University¹, University of Southern Denmark²

12 Jun 2008, ACM Journal of Experimental Algorithmics

TL;DR: A carefully implemented cache-oblivious sorting algorithm (Lazy Funnelsort) can be faster than the best Quicksort implementation the authors were able to find for input sizes well within the limits of RAM, and is at least as fast as the recent cache-aware implementations included in the test.

Abstract: This paper is an algorithmic engineering study of cache-oblivious sorting. We investigate by empirical methods a number of implementation issues and parameter choices for the cache-oblivious sorting algorithm Lazy Funnelsort and compare the final algorithm with Quicksort, the established standard for comparison-based sorting, as well as with recent cache-aware proposals. The main result is a carefully implemented cache-oblivious sorting algorithm, which, our experiments show, can be faster than the best Quicksort implementation we are able to find for input sizes well within the limits of RAM. It is also at least as fast as the recent cache-aware implementations included in the test. On disk, the difference is even more pronounced regarding Quicksort and the cache-aware algorithms, whereas the algorithm is slower than a careful implementation of multiway Mergesort, such as TPIE.


59 citations

Book Chapter

Cache-Oblivious Data Structures and Algorithms for Undirected Breadth-First Search and Shortest Paths

Gerth Stølting Brodal¹, Rolf Fagerberg², Ulrich Meyer³, Norbert Zeh⁴

Aarhus University¹, University of Southern Denmark², Max Planck Society³, Dalhousie University⁴

08 Jul 2004

TL;DR: In this paper, the authors present improved cache-oblivious data structures and algorithms for breadth-first search and the single-source shortest path problem on undirected graphs with non-negative edge weights.

Abstract: We present improved cache-oblivious data structures and algorithms for breadth-first search and the single-source shortest path problem on undirected graphs with non-negative edge weights. Our results remove the performance gap between the currently best cache-aware algorithms for these problems and their cache-oblivious counterparts. Our shortest-path algorithm relies on a new data structure, called bucket heap, which is the first cache-oblivious priority queue to efficiently support a weak DecreaseKey operation.

48 citations

Cache-oblivious data structures and algorithms for undirected breadth-first search and shortest paths

Gerth Stølting Brodal¹, Rolf Fagerberg, Ulrich Meyer¹, Norbert Zeh¹, Torben Hagerup¹, Jyrki Katajainen

Max Planck Society¹

01 Jan 2004

TL;DR: This work presents improved cache-oblivious data structures and algorithms for breadth-first search and the single-source shortest path problem on undirected graphs with non-negative edge weights, and removes the performance gap between the currently best cache-aware algorithms for these problems and their cache-oblivious counterparts.

38 citations

Cites methods from "Cache-oblivious shortest paths in g..."

...Independently of our work, the bucket heap as well as a cache-oblivious version of the tournament tree have simultaneously been developed by Chowdhury and Ramachandran [12]....

Journal Article

An Optimal Cache-Oblivious Priority Queue and Its Application to Graph Algorithms

Lars Arge¹, Michael A. Bender, Erik D. Demaine, Bryan Holland-Minkley, J. Ian Munro

Massachusetts Institute of Technology¹

01 Feb 2007, SIAM Journal on Computing

TL;DR: An optimal cache-oblivious priority queue data structure is developed, supporting insertion, deletion, and delete-min operations in $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ amortized memory transfers; it is as efficient as several previously developed external memory (cache-aware) priority queue data structures.

Abstract: We develop an optimal cache-oblivious priority queue data structure, supporting insertion, deletion, and delete-min operations in $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ amortized memory transfers, where $M$ and $B$ are the memory and block transfer sizes of any two consecutive levels of a multilevel memory hierarchy. In a cache-oblivious data structure, $M$ and $B$ are not used in the description of the structure. Our structure is as efficient as several previously developed external memory (cache-aware) priority queue data structures, which all rely crucially on knowledge about $M$ and $B$. Priority queues are a critical component in many of the best known external memory graph algorithms, and using our cache-oblivious priority queue we develop several cache-oblivious graph algorithms.


38 citations

Cites background from "Cache-oblivious shortest paths in g..."

...Note that recently, cache-oblivious algorithms for undirected shortest path computation have also been developed [29, 34]....
...Brodal et al. [29], as well as Chowdhury and Ramachandran [34], have also developed cache-oblivious priority queues that support updates in the same bound as the I/O-efficient structure of Kumar and Schwabe [42]....


References


Journal Article

A note on two problems in connexion with graphs

Edsger W. Dijkstra

01 Dec 1959, Numerische Mathematik

TL;DR: Given n nodes connected by branches of known length, with at least one path between any two nodes, the paper considers two problems, the first of which is to construct the tree of minimum total length between the n nodes.

Abstract: We consider n points (nodes), some or all pairs of which are connected by a branch; the length of each branch is given. We restrict ourselves to the case where at least one path exists between any two nodes. We now consider two problems. Problem 1. Construct the tree of minimum total length between the n nodes. (A tree is a graph with one and only one path between every two nodes.) In the course of the construction that we present here, the branches are subdivided into three sets: I. the branches definitely assigned to the tree under construction (they will form a subtree); II. the branches from which the next branch to be added to set I will be selected; III. the remaining branches (rejected or not yet considered). The nodes are subdivided into two sets: A. the nodes connected by the branches of set I, B. the remaining nodes (one and only one branch of set II will lead to each of these nodes). We start the construction by choosing an arbitrary node as the only member of set A, and by placing all branches that end in this node in set II. To start with, set I is empty. From then onwards we perform the following two steps repeatedly. Step 1. The shortest branch of set II is removed from this set and added to

22,704 citations

"Cache-oblivious shortest paths in g..." refers background in this paper

...We also assume the Decrease-Key operation only verifies that the element does not exist in the priority queue with a smaller key value, hence it also supports the insert operation in the same amortized bound....

Journal Article

Fibonacci heaps and their uses in improved network optimization algorithms

Michael L. Fredman¹, Robert E. Tarjan²

University of California, San Diego¹, Bell Labs²

01 Jul 1987, Journal of the ACM

TL;DR: Fibonacci heaps (F-heaps), a new data structure for implementing heaps that extends the binomial queues proposed by Vuillemin and studied further by Brown, yield improved running times for several network optimization algorithms; the improved bound for minimum spanning trees is the most striking.

Abstract: In this paper we develop a new data structure for implementing heaps (priority queues). Our structure, Fibonacci heaps (abbreviated F-heaps), extends the binomial queues proposed by Vuillemin and studied further by Brown. F-heaps support arbitrary deletion from an $n$-item heap in $O(\log n)$ amortized time and all other standard heap operations in $O(1)$ amortized time. Using F-heaps we are able to obtain improved running times for several network optimization algorithms. In particular, we obtain the following worst-case bounds, where $n$ is the number of vertices and $m$ the number of edges in the problem graph: $O(n \log n + m)$ for the single-source shortest path problem with nonnegative edge lengths, improved from $O(m \log_{(m/n+2)} n)$; $O(n^2 \log n + nm)$ for the all-pairs shortest path problem, improved from $O(nm \log_{(m/n+2)} n)$; $O(n^2 \log n + nm)$ for the assignment problem (weighted bipartite matching), improved from $O(nm \log_{(m/n+2)} n)$; $O(m \beta(m, n))$ for the minimum spanning tree problem, improved from $O(m \log\log_{(m/n+2)} n)$; where $\beta(m, n) = \min\{i \mid \log^{(i)} n \le m/n\}$. Note that $\beta(m, n) \le \log^* n$ if $m \ge n$. Of these results, the improved bound for minimum spanning trees is the most striking, although all the results give asymptotic improvements for graphs of appropriate densities.

2,484 citations

Proceedings Article

Fibonacci Heaps And Their Uses In Improved Network Optimization Algorithms

Michael L. Fredman¹, Robert E. Tarjan²

University of California¹, Bell Labs²

24 Oct 1984

TL;DR: The structure, Fibonacci heaps (abbreviated F-heaps), extends the binomial queues proposed by Vuillemin and studied further by Brown to obtain improved running times for several network optimization algorithms.


1,757 citations

"Cache-oblivious shortest paths in g..." refers background in this paper

...We also assume the Decrease-Key operation only verifies that the element does not exist in the priority queue with a smaller key value, hence it also supports the insert operation in the same amortized bound....

Journal Article

The input/output complexity of sorting and related problems

Alok Aggarwal¹, Jeffrey S. Vitter²

IBM¹, Brown University²

01 Sep 1988, Communications of the ACM

TL;DR: Tight upper and lower bounds are provided for the number of inputs and outputs (I/Os) between internal memory and secondary storage required for five sorting-related problems: sorting, the fast Fourier transform (FFT), permutation networks, permuting, and matrix transposition.

Abstract: We provide tight upper and lower bounds, up to a constant factor, for the number of inputs and outputs (I/Os) between internal memory and secondary storage required for five sorting-related problems: sorting, the fast Fourier transform (FFT), permutation networks, permuting, and matrix transposition. The bounds hold both in the worst case and in the average case, and in several situations the constant factors match. Secondary storage is modeled as a magnetic disk capable of transferring P blocks each containing B records in a single time unit; the records in each block must be input from or output to B contiguous locations on the disk. We give two optimal algorithms for the problems, which are variants of merge sorting and distribution sorting. In particular we show for P = 1 that the standard merge sorting algorithm is an optimal external sorting method, up to a constant factor in the number of I/Os. Our sorting algorithms use the same number of I/Os as does the permutation phase of key sorting, except when the internal memory size is extremely small, thus affirming the popular adage that key sorting is not faster. We also give a simpler and more direct derivation of Hong and Kung's lower bound for the FFT for the special case B = P = O(1).

1,344 citations

Proceedings Article

Cache-oblivious algorithms

Matteo Frigo¹, Charles E. Leiserson¹, Harald Prokop¹, Sridhar Ramachandran¹

Massachusetts Institute of Technology¹

17 Oct 1999

TL;DR: It is proved that an optimal cache-oblivious algorithm designed for two levels of memory is also optimal for multiple levels and that the assumption of optimal replacement in the ideal-cache model can be simulated efficiently by LRU replacement.


Abstract: This paper presents asymptotically optimal algorithms for rectangular matrix transpose, FFT, and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms, these algorithms are cache oblivious: no variables dependent on hardware parameters, such as cache size and cache-line length, need to be tuned to achieve optimality. Nevertheless, these algorithms use an optimal amount of work and move data optimally among multiple levels of cache. For a cache with size $Z$ and cache-line length $L$ where $Z = \Omega(L^2)$ the number of cache misses for an $m \times n$ matrix transpose is $\Theta(1 + mn/L)$. The number of cache misses for either an $n$-point FFT or the sorting of $n$ numbers is $\Theta(1 + (n/L)(1 + \log_Z n))$. We also give a $\Theta(mnp)$-work algorithm to multiply an $m \times n$ matrix by an $n \times p$ matrix that incurs $\Theta(1 + (mn + np + mp)/L + mnp/(L\sqrt{Z}))$ cache faults. We introduce an "ideal-cache" model to analyze our algorithms. We prove that an optimal cache-oblivious algorithm designed for two levels of memory is also optimal for multiple levels and that the assumption of optimal replacement in the ideal-cache model can be simulated efficiently by LRU replacement. We also provide preliminary empirical results on the effectiveness of cache-oblivious algorithms in practice.
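The divide-and-conquer idea behind a cache-oblivious matrix transpose can be sketched as follows: split the larger dimension in half and recurse, so that sub-problems eventually fit in any cache level without the code knowing its size. This is an illustrative sketch only, not the paper's implementation. The function name `co_transpose` and the `base` cutoff are assumptions introduced here; `base` only limits recursion overhead and is not tuned to any hardware parameter. Python lists also give no control over memory layout, so the sketch shows the recursion pattern rather than real cache behaviour.

```python
def co_transpose(a, m, n, base=4):
    """Return the n x m transpose of the m x n matrix `a`.

    `a` is a flat list in row-major order; so is the result.
    """
    out = [0] * (m * n)

    def rec(r0, r1, c0, c1):
        rows, cols = r1 - r0, c1 - c0
        if rows <= base and cols <= base:
            # Small block: transpose it directly.
            for i in range(r0, r1):
                for j in range(c0, c1):
                    out[j * m + i] = a[i * n + j]
        elif rows >= cols:
            # Split the longer dimension (rows) in half.
            mid = r0 + rows // 2
            rec(r0, mid, c0, c1)
            rec(mid, r1, c0, c1)
        else:
            # Split the longer dimension (columns) in half.
            mid = c0 + cols // 2
            rec(r0, r1, c0, mid)
            rec(r0, r1, mid, c1)

    rec(0, m, 0, n)
    return out
```

For example, `co_transpose(list(range(6)), 2, 3)` transposes the 2 x 3 matrix with rows `[0, 1, 2]` and `[3, 4, 5]`.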


789 citations