
Showing papers by "Gerth Stølting Brodal" published in 2008


Journal ArticleDOI
TL;DR: A carefully implemented cache-oblivious sorting algorithm that can be faster than the best Quicksort implementation the authors were able to find, for input sizes well within the limits of RAM, and at least as fast as the recent cache-aware implementations included in the test.
Abstract: This paper is an algorithmic engineering study of cache-oblivious sorting. We investigate by empirical methods a number of implementation issues and parameter choices for the cache-oblivious sorting algorithm Lazy Funnelsort and compare the final algorithm with Quicksort, the established standard for comparison-based sorting, as well as with recent cache-aware proposals. The main result is a carefully implemented cache-oblivious sorting algorithm, which, our experiments show, can be faster than the best Quicksort implementation we are able to find for input sizes well within the limits of RAM. It is also at least as fast as the recent cache-aware implementations included in the test. On disk, the gap to Quicksort and the cache-aware algorithms is even more pronounced, but the algorithm is slower than a careful implementation of multiway Mergesort, such as the one in TPIE.
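
The funnel data structure itself is intricate, but the recursive shape of Funnelsort is easy to sketch. The following Python sketch is our illustration, not the paper's implementation: it splits the input into roughly n^(1/3) segments of size n^(2/3), sorts them recursively, and merges the results. As a hedged simplification, heapq.merge stands in for the cache-oblivious k-merger (the funnel), and the base-case cutoff is arbitrary.

    import heapq

    def funnelsort(a):
        # Recursive shape of Funnelsort; heapq.merge is a toy stand-in
        # for the cache-oblivious k-merger (the "funnel").
        n = len(a)
        if n <= 4:                         # arbitrary base-case cutoff
            return sorted(a)
        k = max(2, round(n ** (1 / 3)))    # ~n^(1/3) segments ...
        seg = -(-n // k)                   # ... of size ~n^(2/3) (ceiling division)
        runs = [funnelsort(a[i:i + seg]) for i in range(0, n, seg)]
        return list(heapq.merge(*runs))    # merge the sorted runs

    print(funnelsort([5, 3, 8, 1, 9, 2, 7]))   # [1, 2, 3, 5, 7, 8, 9]

The engineering effort the paper reports lies precisely in what this sketch glosses over: the layout, buffer sizes, and laziness of the mergers.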

59 citations


Journal ArticleDOI
TL;DR: It is demonstrated empirically that the actual running time of Quicksort is adaptive with respect to the presortedness measure Inv, and it is proved that randomized Quicksort performs expected O(n(1 + log(1 + Inv/n))) element swaps, i.e., its swap count is provably adaptive with respect to Inv.
Abstract: Quicksort was first introduced in 1961 by Hoare. Many variants have been developed, the best of which are among the fastest generic sorting algorithms available, as testified by the choice of Quicksort as the default sorting algorithm in most programming libraries. Some sorting algorithms are adaptive, i.e., they have a complexity analysis that is better for inputs that are nearly sorted according to some specified measure of presortedness. Quicksort is not among these, as it uses Ω(n log n) comparisons even for sorted inputs. However, in this paper, we demonstrate empirically that the actual running time of Quicksort is adaptive with respect to the presortedness measure Inv. Differences close to a factor of two are observed between instances with low and high Inv value. We then show that for the randomized version of Quicksort, the number of element swaps performed is provably adaptive with respect to the measure Inv. More precisely, we prove that randomized Quicksort performs expected O(n(1 + log(1 + Inv/n))) element swaps, where Inv denotes the number of inversions in the input sequence. This result provides a theoretical explanation for the observed behavior and gives new insights on the behavior of Quicksort. We also give some empirical results on the adaptive behavior of Heapsort and Mergesort.
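
The claimed adaptivity is easy to observe. The Python sketch below is our own demonstration, not the paper's experimental code: it counts the element swaps of a randomized Quicksort with a Lomuto-style partition (an assumption for the demo; the paper's analysis applies to randomized Quicksort generally) on a nearly sorted and a fully shuffled permutation.

    import random

    def inversions(a):
        # Naive O(n^2) count of the measure Inv; fine for a small demo
        return sum(a[i] > a[j] for i in range(len(a)) for j in range(i + 1, len(a)))

    def randomized_quicksort(a):
        swaps = 0
        def sort(lo, hi):
            nonlocal swaps
            if hi - lo <= 1:
                return
            p = random.randrange(lo, hi)                     # uniformly random pivot
            if p != hi - 1:
                a[p], a[hi - 1] = a[hi - 1], a[p]; swaps += 1
            pivot, store = a[hi - 1], lo
            for i in range(lo, hi - 1):                      # Lomuto-style partition
                if a[i] < pivot:
                    if i != store:
                        a[i], a[store] = a[store], a[i]; swaps += 1
                    store += 1
            if store != hi - 1:
                a[store], a[hi - 1] = a[hi - 1], a[store]; swaps += 1
            sort(lo, store)
            sort(store + 1, hi)
        sort(0, len(a))
        return swaps

    n = 1000
    nearly = list(range(n))                                  # low Inv: 10 random transpositions
    for _ in range(10):
        i, j = random.randrange(n), random.randrange(n)
        nearly[i], nearly[j] = nearly[j], nearly[i]
    shuffled = list(range(n)); random.shuffle(shuffled)      # high Inv
    for name, data in (("low Inv:", nearly), ("high Inv:", shuffled)):
        print(name, "Inv =", inversions(data), "swaps =", randomized_quicksort(list(data)))

With few inversions, almost no elements need to move past the pivot, so the swap count stays near linear, consistent with the expected O(n(1 + log(1 + Inv/n))) bound.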

27 citations


Journal ArticleDOI
TL;DR: It is shown that the minmax regret median of a tree can be found in O(n log n) time by a modification of Averbakh and Berman's O(n log^2 n)-time algorithm: a dynamic solution to their bottleneck subproblem of finding the middle of every root-leaf path in a tree.

23 citations


Journal ArticleDOI
TL;DR: Two algorithms are presented for calculating the quartet distance between all pairs of trees in a set of binary evolutionary trees on a common set of species; by exploiting common substructure among the trees, they perform significantly better on large sets of trees than distinct pairwise distance calculations.
Abstract: We present two algorithms for calculating the quartet distance between all pairs of trees in a set of binary evolutionary trees on a common set of species. The algorithms exploit common substructure among the trees to speed up the pairwise distance calculations, and thus perform significantly better on large sets of trees than distinct pairwise distance calculations, as we illustrate experimentally; in the best case we observe a speedup factor of around 130.
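
As a point of reference, the quartet distance between two trees counts the 4-leaf subsets whose induced topologies differ, and a brute force over all $\binom{n}{4}$ quartets follows directly from the four-point condition. The Python sketch below is such a naive baseline (all names are ours; trees are assumed to be unweighted adjacency dicts); it represents exactly the per-pair cost the paper's shared-substructure algorithms avoid.

    from collections import deque
    from itertools import combinations

    def dists_from(tree, src):
        # BFS path lengths from src in an unweighted tree
        dist, q = {src: 0}, deque([src])
        while q:
            u = q.popleft()
            for v in tree[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        return dist

    def quartet_topology(tree, a, b, c, d):
        # Four-point condition: the pairing with the smallest sum of
        # within-pair path lengths is the induced quartet topology
        # (unique in a binary tree).
        da, db, dc = dists_from(tree, a), dists_from(tree, b), dists_from(tree, c)
        sums = {"ab|cd": da[b] + dc[d], "ac|bd": da[c] + db[d], "ad|bc": da[d] + db[c]}
        return min(sums, key=sums.get)

    def quartet_distance(t1, t2, leaves):
        # Count 4-leaf subsets whose induced topologies differ
        return sum(quartet_topology(t1, *q) != quartet_topology(t2, *q)
                   for q in combinations(leaves, 4))

    t1 = {'r': ['x', 'y'], 'x': ['r', 'A', 'B'], 'y': ['r', 'C', 'D'],
          'A': ['x'], 'B': ['x'], 'C': ['y'], 'D': ['y']}    # ((A,B),(C,D))
    t2 = {'r': ['x', 'y'], 'x': ['r', 'A', 'C'], 'y': ['r', 'B', 'D'],
          'A': ['x'], 'C': ['x'], 'B': ['y'], 'D': ['y']}    # ((A,C),(B,D))
    print(quartet_distance(t1, t2, 'ABCD'))                  # 1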

5 citations


Book ChapterDOI
15 Dec 2008
TL;DR: This paper focuses on algorithms for selecting and reporting maximal sums from an array of numbers and obtains an O(n · max{1, log(k/n)}) time algorithm that selects a subarray storing the k'th largest sum.
Abstract: In an array of n numbers, each of the $\binom{n}{2}+n$ contiguous subarrays defines a sum. In this paper we focus on algorithms for selecting and reporting maximal sums from an array of numbers. First, we consider the problem of reporting k subarrays inducing the k largest sums among all subarrays of length at least l and at most u. For this problem we design an optimal O(n + k) time algorithm. Second, we consider the problem of selecting a subarray storing the k'th largest sum. For this problem we prove a time bound of Θ(n · max{1, log(k/n)}) by describing an algorithm with this running time and by proving a matching lower bound. Finally, we combine the ideas and obtain an O(n · max{1, log(k/n)}) time algorithm that selects a subarray storing the k'th largest sum among all subarrays of length at least l and at most u.
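
For small inputs these quantities are easy to cross-check by brute force. The Python sketch below is an O(n^2 log k) baseline of our own, not the paper's algorithm: it enumerates every subarray sum with a running total, filters by the length bounds l and u, and keeps the k largest with a heap; the last entry returned is then the k'th largest sum.

    import heapq

    def k_largest_sums(a, k, l=1, u=None):
        # Brute-force baseline: all subarray sums with length in [l, u]
        u = len(a) if u is None else u
        sums = []
        for i in range(len(a)):
            s = 0
            for j in range(i, len(a)):
                s += a[j]                      # running sum of a[i..j]
                if l <= j - i + 1 <= u:
                    sums.append((s, i, j))     # (sum, start index, end index)
        return heapq.nlargest(k, sums)

    top = k_largest_sums([2, -3, 4, -1, 2], k=3)
    print(top)           # [(5, 2, 4), (4, 2, 2), (4, 0, 4)]
    print(top[-1][0])    # 4 -- the k'th largest sum, for k = 3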

5 citations


Proceedings ArticleDOI
09 Jun 2008
TL;DR: This paper presents an I/O-efficient dynamic data structure for point location in general planar subdivisions that uses linear space to store a subdivision with N segments.
Abstract: Point location is an extremely well-studied problem both in internal memory models and recently also in the external memory model. In this paper, we present an I/O-efficient dynamic data structure for point location in general planar subdivisions. Our structure uses linear space to store a subdivision with N segments. Insertions and deletions of segments can be performed in amortized O(log_B N) I/Os, and queries can be answered in O(log_B^2 N) I/Os in the worst case. The previous best known linear-space dynamic structure also answers queries in O(log_B^2 N) I/Os, but only supports insertions in amortized O(log_B^2 N) I/Os. Our structure is also considerably simpler than previous structures.

2 citations


Book ChapterDOI
01 Jan 2008
TL;DR: An overview of the cache-oblivious model: a two-level memory hierarchy with cache capacity M and block size B, both unknown to the algorithm, in contrast to the I/O model of Aggarwal and Vitter, where M and B are known and block transfers are explicit.
Abstract: In the cache-oblivious setting, the computational model is a machine with two levels of memory: a cache of limited capacity and a secondary memory of infinite capacity. The capacity of the cache is assumed to be M elements, and data is moved between the two levels of memory in blocks of B consecutive elements. Computations can only be performed on elements stored in cache, i.e., elements from secondary memory need to be moved to the cache before operations can access them. Programs are written as if acting directly on a single unbounded memory, i.e., they are standard RAM programs. The necessary block transfers between cache and secondary memory are handled automatically by the model, assuming an optimal offline cache-replacement strategy. The core assumption of the cache-oblivious model is that M and B are unknown to the algorithm, whereas in the related I/O model introduced by Aggarwal and Vitter [1], the algorithms know M and B and perform the block transfers explicitly. A thorough discussion of the cache-oblivious model and its relation to multilevel memory hierarchies is given in [8].
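
To make the block-transfer accounting concrete, the Python sketch below is a toy simulator of our own, not taken from [8]: it replays a sequential scan against the two-level model. As a hedged simplification it uses LRU replacement in place of the model's optimal offline strategy, a substitution that standard competitiveness arguments justify up to constant factors.

    from collections import OrderedDict

    class TwoLevelMemory:
        def __init__(self, M, B):
            self.B, self.capacity = B, M // B        # cache holds M/B blocks
            self.blocks = OrderedDict()              # LRU order, oldest first
            self.transfers = 0

        def access(self, addr):
            blk = addr // self.B                     # block containing the element
            if blk in self.blocks:
                self.blocks.move_to_end(blk)         # hit: refresh LRU position
            else:
                self.transfers += 1                  # miss: one block transfer
                if len(self.blocks) >= self.capacity:
                    self.blocks.popitem(last=False)  # evict least recently used block
                self.blocks[blk] = True

    mem = TwoLevelMemory(M=1024, B=32)
    for i in range(100_000):                         # scan N elements sequentially
        mem.access(i)
    print(mem.transfers)                             # 3125 = N/B block transfers

A sequential scan incurs exactly ⌈N/B⌉ transfers regardless of M, the canonical example of a bound that holds without the algorithm knowing either parameter.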