Author

Gerth Stølting Brodal

Other affiliations: National Research Foundation of South Africa, Max Planck Society, Aalborg University

Bio: Gerth Stølting Brodal is an academic researcher at Aarhus University. The author has contributed to research on data structures and priority queues. The author has an h-index of 39 and has co-authored 166 publications, which have received 4,420 citations. Previous affiliations of Gerth Stølting Brodal include the National Research Foundation of South Africa and the Max Planck Society.

Papers published on a yearly basis (chart not reproduced; years shown: 1994 to 2017 and 2019 to 2023)

Papers

Proceedings Article

Cache-oblivious planar orthogonal range searching and counting

Lars Arge¹, Gerth Stølting Brodal¹, Rolf Fagerberg², Morten Laustsen¹

Aarhus University¹, University of Southern Denmark²

06 Jun 2005

TL;DR: The first cache-oblivious data structure for planar orthogonal range counting is presented, along with a general four-sided range searching structure that uses O(N log_2^2 N / log_2 log_2 N) space and answers queries in O(log_B N + T/B) memory transfers.

Abstract: We present the first cache-oblivious data structure for planar orthogonal range counting, and improve on previous results for cache-oblivious planar orthogonal range searching. Our range counting structure uses O(N log_2 N) space and answers queries using O(log_B N) memory transfers, where B is the block size of any memory level in a multilevel memory hierarchy. Using bit manipulation techniques, the space can be further reduced to O(N). The structure can also be modified to support more general semigroup range sum queries in O(log_B N) memory transfers, using O(N log_2 N) space for three-sided queries and O(N log_2^2 N / log_2 log_2 N) space for four-sided queries. Based on the O(N log N) space range counting structure, we develop a data structure that uses O(N log_2 N) space and answers three-sided range queries in O(log_B N + T/B) memory transfers, where T is the number of reported points. Based on this structure, we present a general four-sided range searching structure that uses O(N log_2^2 N / log_2 log_2 N) space and answers queries in O(log_B N + T/B) memory transfers.

27 citations

Journal Article

On the adaptiveness of Quicksort

Gerth Stølting Brodal¹, Rolf Fagerberg², Gabriel Moruz³

Aarhus University¹, University of Southern Denmark², Goethe University Frankfurt³

29 Aug 2008-ACM Journal of Experimental Algorithms

TL;DR: It is demonstrated empirically that the actual running time of Quicksort is adaptive with respect to the presortedness measure Inv, and it is proved that for the randomized version of Quicksort, the expected number of element swaps performed is adaptive with respect to Inv.

Abstract: Quicksort was first introduced in 1961 by Hoare. Many variants have been developed, the best of which are among the fastest generic-sorting algorithms available, as testified by the choice of Quicksort as the default sorting algorithm in most programming libraries. Some sorting algorithms are adaptive, i.e., they have a complexity analysis that is better for inputs, which are nearly sorted, according to some specified measure of presortedness. Quicksort is not among these, as it uses Ω(n log n) comparisons even for sorted inputs. However, in this paper, we demonstrate empirically that the actual running time of Quicksort is adaptive with respect to the presortedness measure Inv. Differences close to a factor of two are observed between instances with low and high Inv value. We then show that for the randomized version of Quicksort, the number of element swaps performed is provably adaptive with respect to the measure Inv. More precisely, we prove that randomized Quicksort performs expected O(n(1 + log(1 + Inv/n))) element swaps, where Inv denotes the number of inversions in the input sequence. This result provides a theoretical explanation for the observed behavior and gives new insights on the behavior of Quicksort. We also give some empirical results on the adaptive behavior of Heapsort and Mergesort.
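The measure Inv and the expression inside the swap bound can be made concrete with a small sketch. The code below is an illustration only, not the authors' implementation: the function names `count_inversions` and `swap_bound` are hypothetical, inversions are counted with a standard merge-sort pass, and the base-2 logarithm is an assumption (the base only changes the constant hidden by the O-notation).

```python
import math


def count_inversions(seq):
    """Return (sorted copy, Inv), where Inv counts pairs i < j with seq[i] > seq[j]."""
    items = list(seq)
    if len(items) <= 1:
        return items, 0
    mid = len(items) // 2
    left, inv_left = count_inversions(items[:mid])
    right, inv_right = count_inversions(items[mid:])
    merged = []
    inv = inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] is smaller than every remaining element of left,
            # so it forms one inversion with each of them.
            merged.append(right[j])
            j += 1
            inv += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inv


def swap_bound(n, inv):
    """Evaluate n * (1 + log(1 + Inv/n)); constant factors are ignored (assumed base 2)."""
    return n * (1 + math.log2(1 + inv / n))


if __name__ == "__main__":
    for data in ([1, 2, 3, 4, 5, 6, 7, 8], [2, 1, 4, 3, 6, 5, 8, 7], [8, 7, 6, 5, 4, 3, 2, 1]):
        _, inv = count_inversions(data)
        print(data, "Inv =", inv, "bound expression =", round(swap_bound(len(data), inv), 2))
```

For a sorted input Inv is 0 and the expression reduces to n, while for a reversed input Inv is n(n-1)/2 and the expression grows to roughly n log n, which matches the intuition that nearly sorted inputs need fewer swaps.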

27 citations

Proceedings Article

Efficient algorithms for computing the triplet and quartet distance between trees of arbitrary degree

Gerth Stølting Brodal¹, Rolf Fagerberg², Thomas Mailund¹, Christian N. S. Pedersen¹, Andreas Sand¹

Aarhus University¹, University of Southern Denmark²

06 Jan 2013

TL;DR: This paper presents efficient algorithms for the triplet and quartet distances, showing how to compute the triplet distance in time O(n log n) and the quartet distance in time O(dn log n), where d is the maximal degree of any node in the two trees.

Abstract: The triplet and quartet distances are distance measures to compare two rooted and two unrooted trees, respectively. The leaves of the two trees should have the same set of n labels. The distances are defined by enumerating all subsets of three labels (triplets) and four labels (quartets), respectively, and counting how often the induced topologies in the two input trees are different. In this paper we present efficient algorithms for computing these distances. We show how to compute the triplet distance in time O(n log n) and the quartet distance in time O(dn log n), where d is the maximal degree of any node in the two trees. Within the same time bounds, our framework also allows us to compute the parameterized triplet and quartet distances, where a parameter is introduced to weight resolved (binary) topologies against unresolved (non-binary) topologies. The previous best algorithms for computing the triplet and parameterized triplet distances have O(n^2) running time, while the previous best algorithms for computing the quartet distance include an O(d^9 n log n) time algorithm and an O(n^2.688) time algorithm, where the latter can also compute the parameterized quartet distance. Since d ≤ n, our algorithms improve on all these algorithms.
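The definition of the triplet distance can be illustrated by a direct enumeration. The sketch below is a cubic-time brute force that follows the definition literally; it is not the O(n log n) algorithm of the paper. Its tree encoding (nested tuples with string leaf labels) and all function names are assumptions made for the example, and a resolved topology is counted as different from an unresolved one.

```python
from itertools import combinations


def leaf_paths(tree, prefix=()):
    """Map each leaf label to the tuple of child indices on its root-to-leaf path."""
    if not isinstance(tree, tuple):
        return {tree: prefix}
    paths = {}
    for index, child in enumerate(tree):
        paths.update(leaf_paths(child, prefix + (index,)))
    return paths


def lca_depth(path_a, path_b):
    """Depth of the lowest common ancestor, i.e. the length of the common path prefix."""
    depth = 0
    for step_a, step_b in zip(path_a, path_b):
        if step_a != step_b:
            break
        depth += 1
    return depth


def triplet_topology(paths, a, b, c):
    """Return the pair that is grouped together, or None if the triplet is unresolved."""
    ab = lca_depth(paths[a], paths[b])
    ac = lca_depth(paths[a], paths[c])
    bc = lca_depth(paths[b], paths[c])
    if ab == ac == bc:
        return None
    deepest = max(ab, ac, bc)
    if ab == deepest:
        return frozenset((a, b))
    if ac == deepest:
        return frozenset((a, c))
    return frozenset((b, c))


def triplet_distance(tree1, tree2):
    """Count the triplets of leaf labels whose induced topologies differ."""
    paths1, paths2 = leaf_paths(tree1), leaf_paths(tree2)
    if set(paths1) != set(paths2):
        raise ValueError("both trees must have the same leaf labels")
    return sum(
        triplet_topology(paths1, *triple) != triplet_topology(paths2, *triple)
        for triple in combinations(sorted(paths1), 3)
    )


if __name__ == "__main__":
    t1 = (("a", "b"), ("c", "d"))
    t2 = (("a", "c"), ("b", "d"))
    print(triplet_distance(t1, t2))
```

With n leaves this enumerates all n(n-1)(n-2)/6 triplets, which is why it is only practical for small trees.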

26 citations

Book Chapter

The Randomized Complexity of Maintaining the Minimum

Gerth Stølting Brodal¹, Shiva Chaudhuri¹, Jaikumar Radhakrishnan²

Max Planck Society¹, Tata Institute of Fundamental Research²

03 Jul 1996

TL;DR: The complexity of maintaining a set under the operations Insert, Delete and FindMin is considered, and it is shown that, in the comparison model, any randomized algorithm with expected amortized cost t comparisons per Insert and Delete has expected cost at least n/(e·2^(2t)) − 1 comparisons for FindMin.

Abstract: The complexity of maintaining a set under the operations Insert, Delete and FindMin is considered. In the comparison model it is shown that any randomized algorithm with expected amortized cost t comparisons per Insert and Delete has expected cost at least n/(e·2^(2t)) − 1 comparisons for FindMin. If FindMin is replaced by a weaker operation, FindAny, then it is shown that a randomized algorithm with constant expected cost per operation exists, but no deterministic algorithm. Finally, a deterministic algorithm with constant amortized cost per operation for an offline version of the problem is given.

26 citations

Journal Article

Faster algorithms for computing longest common increasing subsequences

Gerth Stølting Brodal¹, Kanela Kaligosi², Irit Katriel¹, Martin Kutz²

Aarhus University¹, Max Planck Society²

01 Jan 2006-Lecture Notes in Computer Science

TL;DR: For two sequences of lengths m and n, an algorithm for finding a longest common increasing subsequence (LCIS) is presented with an output-dependent expected running time of O((m + nl) log log a + Sort) and O(m) space, where l is the length of an LCIS, a is the size of the alphabet, and Sort is the time to sort each input sequence.

Abstract: We present algorithms for finding a longest common increasing subsequence of two or more input sequences. For two sequences of lengths m and n, where m > n, we present an algorithm with an output-dependent expected running time of O((m + nl) log log a + Sort) and O(m) space, where l is the length of an LCIS, a is the size of the alphabet, and Sort is the time to sort each input sequence. For k ≥ 3 length-n sequences we present an algorithm which improves the previous best bound by more than a factor k for many inputs. In both cases, our algorithms are conceptually quite simple but rely on existing sophisticated data structures. Finally, we introduce the problem of longest common weakly-increasing (or non-decreasing) subsequences (LCWIS), for which we present an O(m + n log n)-time algorithm for the 3-letter alphabet case. For the extensively studied longest common subsequence problem, comparable speedups have not been achieved for small alphabets.
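To make the LCIS problem itself concrete, the sketch below computes the length of a longest common strictly increasing subsequence of two sequences with the well-known O(mn) dynamic program. This is a textbook baseline, not the output-dependent algorithm described in the abstract, and the function name `lcis_length` is hypothetical.

```python
def lcis_length(a, b):
    """Length of a longest common strictly increasing subsequence of a and b."""
    # dp[j] holds the length of the best LCIS found so far that ends with b[j].
    dp = [0] * len(b)
    for x in a:
        best = 0  # best LCIS length ending in a value smaller than x, seen left of j
        for j, y in enumerate(b):
            if y == x:
                dp[j] = max(dp[j], best + 1)
            elif y < x:
                best = max(best, dp[j])
    return max(dp, default=0)


if __name__ == "__main__":
    print(lcis_length([3, 4, 9, 1], [5, 3, 8, 9, 10, 2, 1]))
```

The table has one entry per element of the second sequence, so the extra space is linear in its length.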

25 citations

Cited by

Journal Article

Application of Phylogenetic Networks in Evolutionary Studies

Daniel H. Huson¹, David Bryant

University of Tübingen¹

01 Feb 2006-Molecular Biology and Evolution

TL;DR: This article reviews the terminology used for phylogenetic networks, covers how split networks and reticulate networks are defined and interpreted, and outlines the beginnings of a comprehensive statistical framework for applying split network methods.

Abstract: The evolutionary history of a set of taxa is usually represented by a phylogenetic tree, and this model has greatly facilitated the discussion and testing of hypotheses. However, it is well known that more complex evolutionary scenarios are poorly described by such models. Further, even when evolution proceeds in a tree-like manner, analysis of the data may not be best served by using methods that enforce a tree structure but rather by a richer visualization of the data to evaluate its properties, at least as an essential first step. Thus, phylogenetic networks should be employed when reticulate events such as hybridization, horizontal gene transfer, recombination, or gene duplication and loss are believed to be involved, and, even in the absence of such events, phylogenetic networks have a useful role to play. This article reviews the terminology used for phylogenetic networks and covers both split networks and reticulate networks, how they are defined, and how they can be interpreted. Additionally, the article outlines the beginnings of a comprehensive statistical framework for applying split network methods. We show how split networks can represent confidence sets of trees and introduce a conservative statistical test for whether the conflicting signal in a network is treelike. Finally, this article describes a new program, SplitsTree4, an interactive and comprehensive tool for inferring different types of phylogenetic networks from sequences, distances, and trees.

7,273 citations

Journal Article

BOLD: The Barcode of Life Data System: Barcoding

Sujeevan Ratnasingham, Paul D. N. Hebert

01 Jan 2007-Molecular Ecology Notes

3,792 citations

Journal Article

FastTree: Computing Large Minimum Evolution Trees with Profiles instead of a Distance Matrix

Morgan N. Price¹, Paramvir S. Dehal¹, Adam P. Arkin¹,²

Lawrence Berkeley National Laboratory¹, University of California, Berkeley²

01 Jul 2009-Molecular Biology and Evolution

TL;DR: FastTree is a method for constructing large phylogenies and for estimating their reliability that, instead of storing a distance matrix, uses sequence profiles of internal nodes in the tree to implement Neighbor-Joining and uses heuristics to quickly identify candidate joins.

Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement Neighbor-Joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and a different characters, a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O(NLa + N sqrt(N)) memory and O(N sqrt(N) log(N) La) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 h and 2.4 GB of memory. Just computing pairwise Jukes–Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 h and 50 GB of memory. In simulations, FastTree was slightly more accurate than Neighbor-Joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.
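The 50 GB figure for the distance matrix can be sanity-checked with a back-of-the-envelope calculation. The sketch below rests on two assumptions that the abstract does not state: only one distance per unordered pair of sequences is stored, and each distance takes 4 bytes. The function name `pairwise_matrix_bytes` is hypothetical.

```python
def pairwise_matrix_bytes(n_sequences, bytes_per_entry=4):
    """Return (number of unordered pairs, bytes needed to store one value per pair)."""
    # Assumption: one stored distance per unordered pair, 4 bytes each by default.
    pairs = n_sequences * (n_sequences - 1) // 2
    return pairs, pairs * bytes_per_entry


pairs, size_in_bytes = pairwise_matrix_bytes(158_022)

if __name__ == "__main__":
    print("pairwise distances:", pairs)
    print("approximate size in GB:", round(size_in_bytes / 1e9, 1))
```

Under these assumptions the matrix for 158,022 sequences needs about 1.25 × 10^10 entries, which lands close to the 50 GB quoted in the abstract and shows how the quadratic space term dominates at this scale.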

3,500 citations

Journal Article

Fast Tree: Computing Large Minimum-Evolution Trees with Profiles instead of a Distance Matrix

Morgan N. Price, Paramvir S. Dehal, Adam P. Arkin

18 Jun 2009-Lawrence Berkeley National Laboratory

TL;DR: FastTree uses sequence profiles of internal nodes in the tree to implement neighbor-joining, uses heuristics to quickly identify candidate joins, and then uses nearest-neighbor interchanges to reduce the length of the tree.

Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement neighbor-joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest-neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and a different characters, a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O( NLa + N sqrt(N) ) memory and O( N sqrt(N) log(N) L a ) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 hours and 2.4 gigabytes of memory. Just computing pairwise Jukes-Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 hours and 50 gigabytes of memory. In simulations, FastTree was slightly more accurate than neighbor joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.

2,436 citations

BOLD: The Barcode of Life Data System (www.barcodinglife.org)

Sujeevan Ratnasingham, Paul D. N. Hebert

01 Jan 2007

TL;DR: This paper provides a brief introduction to the key elements of BOLD, discusses their functional capabilities, and concludes by examining computational resources and future prospects.

Abstract: The Barcode of Life Data System (BOLD) is an informatics workbench aiding the acquisition, storage, analysis and publication of DNA barcode records. By assembling molecular, morphological and distributional data, it bridges a traditional bioinformatics chasm. BOLD is freely available to any researcher with interests in DNA barcoding. By providing specialized services, it aids the assembly of records that meet the standards needed to gain BARCODE designation in the global sequence databases. Because of its web-based delivery and flexible data security model, it is also well positioned to support projects that involve broad research alliances. This paper provides a brief introduction to the key elements of BOLD, discusses their functional capabilities, and concludes by examining computational resources and future prospects.

1,859 citations
