Author

Gerth Stølting Brodal

Bio: Gerth Stølting Brodal is an academic researcher from Aarhus University. The author has contributed to research in topics including data structures and priority queues, has an h-index of 39, and has co-authored 166 publications receiving 4,420 citations. Previous affiliations of Gerth Stølting Brodal include the National Research Foundation of South Africa and the Max Planck Society.


Papers
01 Jan 2004
TL;DR: This work presents improved cache-oblivious data structures and algorithms for breadth-first search and the single-source shortest path problem on undirected graphs with non-negative edge weights, narrowing the performance gap to the currently best cache-aware algorithms for these problems.

38 citations

Proceedings ArticleDOI
21 Oct 2006
TL;DR: This work develops the first linear-space data structures for dynamic planar point location in general subdivisions that achieve logarithmic query time and poly-logarithmic update time.
Abstract: We develop the first linear-space data structures for dynamic planar point location in general subdivisions that achieve logarithmic query time and poly-logarithmic update time.

37 citations

Journal ArticleDOI
TL;DR: A new strategy for Dietz and Raman's dynamic two-player pebble game on graphs is presented, the upper bound on the required number of pebbles is improved from 2b+2d+O(√b) to d+2b, and a lower bound is given showing that the number of pebbles depends on the out-degree d.
Abstract: The problem of making bounded in-degree and out-degree data structures partially persistent is considered. The node copying method of Driscoll et al. is extended so that updates can be performed in worst-case constant time on the pointer machine model. Previously this was only known to be possible in amortised constant time. The result is presented in terms of a new strategy for Dietz and Raman's dynamic two-player pebble game on graphs. It is shown how to implement the strategy, and the upper bound on the required number of pebbles is improved from 2b+2d+O(√b) to d+2b, where b is the bound on the in-degree and d the bound on the out-degree. We also give a lower bound that shows that the number of pebbles depends on the out-degree d.
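For readers unfamiliar with partial persistence, the sketch below illustrates the semantics (every past version of the structure stays readable, only the newest version is writable) using the simpler fat-node method, in which each mutable field keeps a version-stamped history that queries binary-search. All names here are illustrative assumptions; this is not the paper's worst-case constant-time node-copying scheme, which avoids the O(log) access overhead incurred by this approach.

import bisect

class PersistentField:
    """One mutable field made partially persistent by the fat-node method:
    every update appends a (version, value) pair, and reads binary-search
    for the latest value written at or before the queried version."""
    def __init__(self, version, value):
        self.versions = [version]
        self.values = [value]

    def set(self, version, value):
        # Updates arrive with non-decreasing version numbers (partial persistence).
        self.versions.append(version)
        self.values.append(value)

    def get(self, version):
        i = bisect.bisect_right(self.versions, version) - 1
        return self.values[i] if i >= 0 else None

# Usage: a partially persistent singly linked list; old versions stay readable.
version = 0
head = PersistentField(version, None)

def push_front(value):
    global version
    version += 1
    node = {"value": value, "next": PersistentField(version, head.get(version))}
    head.set(version, node)
    return version

def snapshot(v):
    out, node = [], head.get(v)
    while node is not None:
        out.append(node["value"])
        node = node["next"].get(v)
    return out

v1 = push_front(3)
v2 = push_front(5)
print(snapshot(v1))   # [3]
print(snapshot(v2))   # [5, 3]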

36 citations

Proceedings ArticleDOI
23 Jan 2011
TL;DR: A data structure using O((N/f) log_m n) space is constructed that reports the K smallest elements of a query range in sorted order in O(log_B N + fK/B) I/Os, this is shown to be nearly space-optimal, and it is shown that it is the requirement that the K smallest elements be reported in sorted order which makes the problem hard.
Abstract: We study the following problem: Given an array A storing N real numbers, preprocess it to allow fast reporting of the K smallest elements in the subarray A[i, j] in sorted order, for any triple (i, j, K) with 1 ≤ i ≤ j ≤ N and 1 ≤ K ≤ j − i + 1. We are interested in scenarios where the array A is large, necessitating an I/O-efficient solution. For a parameter f with 1 ≤ f ≤ log_m n, we construct a data structure that uses O((N/f) log_m n) space and achieves a query bound of O(log_B N + fK/B) I/Os, where B is the block size, M is the size of the main memory, n := N/B, and m := M/B. Our main contribution is to show that this solution is nearly optimal. To be precise, we show that achieving a query bound of O(log^α n + fK/B) I/Os, for any constant α, requires Ω((N/f) log_M n / log(f^{-1} log_M n)) space, assuming B = Ω(log N). For M ≥ B^{1+ε}, this is within a log log_m n factor of the upper bound. The lower bound assumes indivisibility of records and holds even if we assume K is always set to j − i + 1. We also show that it is the requirement that the K smallest elements be reported in sorted order which makes the problem hard. If the K smallest elements in the query range can be reported in any order, then we can obtain a linear-size data structure with a query bound of O(log_B N + K/B) I/Os.
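To make the query interface concrete, here is a minimal in-memory baseline with no preprocessing and no I/O-efficiency: it simply scans the range with a size-K heap, which is exactly the kind of per-query work the paper's external-memory structures avoid. The function name and the 1-indexed convention are assumptions for illustration.

import heapq

def sorted_topk(A, i, j, K):
    """Report the K smallest elements of A[i..j] (1-indexed, inclusive) in
    sorted order by scanning the range with a size-K max-heap.  This takes
    O((j - i + 1) log K) time per query; the paper's structures answer the
    same query I/O-efficiently after preprocessing."""
    heap = []                          # max-heap simulated with negated keys
    for x in A[i - 1:j]:
        if len(heap) < K:
            heapq.heappush(heap, -x)
        elif -heap[0] > x:             # current K-th smallest is larger than x
            heapq.heapreplace(heap, -x)
    return sorted(-v for v in heap)

# Example: the 3 smallest values in A[2..6], reported in sorted order.
A = [7.0, 2.0, 9.0, 4.0, 1.0, 8.0, 3.0]
print(sorted_topk(A, 2, 6, 3))         # [1.0, 2.0, 4.0]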

34 citations

Proceedings ArticleDOI
17 Jan 2010
TL;DR: The xDict attains an optimal tradeoff between insertion and query costs, even in the broader external-memory model, over the range of insertion costs obtained for constants ε with 0 < ε < 1.
Abstract: Several existing cache-oblivious dynamic dictionaries achieve O(log_B N) (or the slightly better O(log_B (N/M))) memory transfers per operation, where N is the number of items stored, M is the memory size, and B is the block size, which matches the classic B-tree data structure. One recent structure achieves the same query bound and a sometimes-better amortized update bound of O((1/B^{Θ(1/(log log B)^2)}) log_B N + (1/B) log^2 N) memory transfers. This paper presents a new data structure, the xDict, implementing predecessor queries in O((1/ε) log_B (N/M)) worst-case memory transfers and insertions and deletions in O((1/(ε B^{1−ε})) log_B (N/M)) amortized memory transfers, for any constant ε with 0 < ε < 1.
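For intuition about why deferring work on insertions can beat the B-tree update bound, here is a toy, cache-unaware sketch of batched insertion into a sorted array via a small unsorted buffer. It is not the xDict; the class name, buffer policy, and parameters are assumptions, and it only illustrates the general insertion/query trade-off that write-optimized dictionaries exploit.

import bisect

class BufferedDict:
    """Toy write-optimized dictionary: insertions go to a small unsorted
    buffer that is merged into the main sorted array only when it fills up,
    so the cost of rewriting the large array is amortized over many
    insertions; predecessor queries must consult both parts."""
    def __init__(self, buffer_size=1024):
        self.main = []                 # sorted keys
        self.buffer = []               # recently inserted, unsorted keys
        self.buffer_size = buffer_size

    def insert(self, key):
        self.buffer.append(key)
        if len(self.buffer) >= self.buffer_size:
            # Batch merge: one rebuild pays for buffer_size insertions.
            self.main = sorted(self.main + self.buffer)
            self.buffer.clear()

    def predecessor(self, key):
        """Largest stored key <= key, or None if there is none."""
        i = bisect.bisect_right(self.main, key)
        best = self.main[i - 1] if i > 0 else None
        for k in self.buffer:
            if k <= key and (best is None or k > best):
                best = k
        return best

d = BufferedDict(buffer_size=4)
for x in [10, 3, 7, 15, 1]:
    d.insert(x)
print(d.predecessor(9))                # 7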

34 citations


Cited by
Journal ArticleDOI
TL;DR: This article reviews the terminology used for phylogenetic networks, covers both split networks and reticulate networks, how they are defined, and how they can be interpreted, and outlines the beginnings of a comprehensive statistical framework for applying split network methods.
Abstract: The evolutionary history of a set of taxa is usually represented by a phylogenetic tree, and this model has greatly facilitated the discussion and testing of hypotheses. However, it is well known that more complex evolutionary scenarios are poorly described by such models. Further, even when evolution proceeds in a tree-like manner, analysis of the data may not be best served by using methods that enforce a tree structure but rather by a richer visualization of the data to evaluate its properties, at least as an essential first step. Thus, phylogenetic networks should be employed when reticulate events such as hybridization, horizontal gene transfer, recombination, or gene duplication and loss are believed to be involved, and, even in the absence of such events, phylogenetic networks have a useful role to play. This article reviews the terminology used for phylogenetic networks and covers both split networks and reticulate networks, how they are defined, and how they can be interpreted. Additionally, the article outlines the beginnings of a comprehensive statistical framework for applying split network methods. We show how split networks can represent confidence sets of trees and introduce a conservative statistical test for whether the conflicting signal in a network is treelike. Finally, this article describes a new program, SplitsTree4, an interactive and comprehensive tool for inferring different types of phylogenetic networks from sequences, distances, and trees.

7,273 citations

Journal ArticleDOI
TL;DR: FastTree is a method for constructing large phylogenies and for estimating their reliability, instead of storing a distance matrix, that uses sequence profiles of internal nodes in the tree to implement Neighbor-Joining and uses heuristics to quickly identify candidate joins.
Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement Neighbor-Joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and a different characters, a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O(NLa + N√N) memory and O(N√N log(N) La) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 h and 2.4 GB of memory. Just computing pairwise Jukes–Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 h and 50 GB of memory. In simulations, FastTree was slightly more accurate than Neighbor-Joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.
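As a rough illustration of the core idea (representing internal nodes by per-site character profiles rather than keeping an N × N distance matrix), here is a simplified sketch. The profile distance below is just an uncorrected expected mismatch probability, and all function names are assumptions; this is not FastTree's actual implementation, which adds corrections (such as Jukes–Cantor), weights, and join heuristics.

from collections import Counter

def profile(sequences):
    """Per-site character frequency vectors for a set of aligned sequences."""
    prof = []
    for site in range(len(sequences[0])):
        counts = Counter(seq[site] for seq in sequences)
        total = sum(counts.values())
        prof.append({c: n / total for c, n in counts.items()})
    return prof

def merge_profiles(p1, p2, w1=0.5, w2=0.5):
    """Profile of an internal node as a weighted average of its children's profiles."""
    merged = []
    for f1, f2 in zip(p1, p2):
        chars = set(f1) | set(f2)
        merged.append({c: w1 * f1.get(c, 0.0) + w2 * f2.get(c, 0.0) for c in chars})
    return merged

def profile_distance(p1, p2):
    """Expected per-site mismatch probability between two profiles
    (an uncorrected distance)."""
    total = 0.0
    for f1, f2 in zip(p1, p2):
        match = sum(f1.get(c, 0.0) * f2.get(c, 0.0) for c in set(f1) | set(f2))
        total += 1.0 - match
    return total / len(p1)

# The distance between two single-sequence profiles is their fraction of
# mismatched sites; internal nodes get merged profiles instead of rows in
# an N x N distance matrix.
pA, pB = profile(["ACGT"]), profile(["ACGA"])
print(profile_distance(pA, pB))                       # 0.25
print(profile_distance(merge_profiles(pA, pB), pA))   # 0.125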

3,500 citations

Journal Article
TL;DR: FastTree as mentioned in this paper uses sequence profiles of internal nodes in the tree to implement neighbor-joining and uses heuristics to quickly identify candidate joins, then uses nearest-neighbor interchanges to reduce the length of the tree.
Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement neighbor-joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest-neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and a different characters, a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O( NLa + N sqrt(N) ) memory and O( N sqrt(N) log(N) L a ) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 hours and 2.4 gigabytes of memory. Just computing pairwise Jukes-Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 hours and 50 gigabytes of memory. In simulations, FastTree was slightly more accurate than neighbor joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.

2,436 citations

01 Jan 2007
TL;DR: This paper provides a brief introduction to the key elements of BOLD, discusses their functional capabilities, and concludes by examining computational resources and future prospects.
Abstract: The Barcode of Life Data System (BOLD) is an informatics workbench aiding the acquisition, storage, analysis and publication of DNA barcode records. By assembling molecular, morphological and distributional data, it bridges a traditional bioinformatics chasm. BOLD is freely available to any researcher with interests in DNA barcoding. By providing specialized services, it aids the assembly of records that meet the standards needed to gain BARCODE designation in the global sequence databases. Because of its web-based delivery and flexible data security model, it is also well positioned to support projects that involve broad research alliances. This paper provides a brief introduction to the key elements of BOLD, discusses their functional capabilities, and concludes by examining computational resources and future prospects.

1,859 citations