Author

Rolf Fagerberg

Bio: Rolf Fagerberg is an academic researcher at the University of Southern Denmark. The author has contributed to research on the topics Cache-oblivious algorithm and Vertex (geometry). The author has an h-index of 25 and has co-authored 97 publications receiving 1,906 citations. Previous affiliations of Rolf Fagerberg include Aarhus University and Odense University.


Papers
Book Chapter
08 Jul 2001
TL;DR: An algorithm that constructs an evolutionary tree of $n$ species in time $O(nd \log_d n)$ using at most $n\lceil d/2\rceil(\log_{2\lceil d/2\rceil-1} n+O(1))$ experiments for $d > 2$ and at most $n(\log n+O(1))$ experiments for $d = 2$, improving the previous best upper bound by a factor $\Theta(\log d)$.
Abstract: We present tight upper and lower bounds for the problem of constructing evolutionary trees in the experiment model. We describe an algorithm which constructs an evolutionary tree of $n$ species in time $O(nd \log_d n)$ using at most $n\lceil d/2\rceil(\log_{2\lceil d/2\rceil-1} n+O(1))$ experiments for $d > 2$, and at most $n(\log n+O(1))$ experiments for $d = 2$, where $d$ is the degree of the tree. This improves the previous best upper bound by a factor $\Theta(\log d)$. For $d = 2$ the previously best algorithm with running time $O(n \log n)$ had a bound of $4n \log n$ on the number of experiments. By an explicit adversary argument, we show an $\Omega(nd \log_d n)$ lower bound, matching our upper bounds and improving the previous best lower bound by a factor $\Theta(\log_d n)$. Central to our algorithm is the construction and maintenance of separator trees of small height, which may be of independent interest.

34 citations

Journal Article
TL;DR: This work formalizes the student-project allocation problem as a mixed integer linear programming problem and focuses on different ways to model fairness and utilitarian principles, and proposes novel combinations of the models that attain feasible, stable, fair and collectively satisfactory solutions within a minute of computation.
Abstract: We consider the problem of allocating students to project topics satisfying side constraints and taking into account students’ preferences. Students rank projects according to their preferences for the topic and side constraints limit the possibilities to team up students in the project topics. The goal is to find assignments that are fair and that maximize the collective satisfaction. Moreover, we consider issues of stability and envy from the students’ viewpoint. This problem arises as a crucial activity in the organization of a first year course at the Faculty of Science of the University of Southern Denmark. We formalize the student-project allocation problem as a mixed integer linear programming problem and focus on different ways to model fairness and utilitarian principles. On the basis of real-world data, we compare empirically the quality of the allocations found by the different models and the computational effort to find solutions by means of a state-of-the-art commercial solver. We provide empirical evidence about the effects of these models on the distribution of the student assignments, which could be valuable input for policy makers in similar settings. Building on these results we propose novel combinations of the models that, for our case, attain feasible, stable, fair and collectively satisfactory solutions within a minute of computation. Since 2010, these solutions are used in practice at our institution.

33 citations

Journal Article
TL;DR: This work presents a deterministic local routing algorithm that is guaranteed to find a path between any pair of vertices in a half-$\theta_6$-graph (the half-$\theta_6$-graph is equivalent to the Delaunay triangulation where the empty region is an equilateral triangle).
Abstract: We present a deterministic local routing algorithm that is guaranteed to find a path between any pair of vertices in a half-$\theta_6$-graph (the half-$\theta_6$-graph is equivalent to the Delaunay triangulation where the empty region is an equilateral triangle). The length of the path is at most $5/\sqrt{3} \approx 2.887$ times the Euclidean distance between the pair of vertices. Moreover, we show that no local routing algorithm can achieve a better routing ratio, thereby proving that our routing algorithm is optimal. This is somewhat surprising because the spanning ratio of the half-$\theta_6$-graph is 2, meaning that even though there always exists a path whose length is at most twice the Euclidean distance, we cannot always find such a path when routing locally. Since every triangulation can be embedded in the plane as a half-$\theta_6$-graph using $O(\log n)$ bits per vertex coordinate via Schnyder's embedding scheme [W. Schnyder, Embedding planar graphs on the grid, in Proceedings of the 1st Annual ...

31 citations

Proceedings Article
06 Jun 2005
TL;DR: The first cache-oblivious data structure for planar orthogonal range counting is presented, along with a general four-sided range searching structure that uses $O(N \log_2^2 N/\log_2 \log_2 N)$ space and answers queries in $O(\log_B N + T/B)$ memory transfers.
Abstract: We present the first cache-oblivious data structure for planar orthogonal range counting, and improve on previous results for cache-oblivious planar orthogonal range searching. Our range counting structure uses $O(N \log_2 N)$ space and answers queries using $O(\log_B N)$ memory transfers, where $B$ is the block size of any memory level in a multilevel memory hierarchy. Using bit manipulation techniques, the space can be further reduced to $O(N)$. The structure can also be modified to support more general semigroup range sum queries in $O(\log_B N)$ memory transfers, using $O(N \log_2 N)$ space for three-sided queries and $O(N \log_2^2 N/\log_2 \log_2 N)$ space for four-sided queries. Based on the $O(N \log N)$ space range counting structure, we develop a data structure that uses $O(N \log_2 N)$ space and answers three-sided range queries in $O(\log_B N+T/B)$ memory transfers, where $T$ is the number of reported points. Based on this structure, we present a general four-sided range searching structure that uses $O(N \log_2^2 N/\log_2 \log_2 N)$ space and answers queries in $O(\log_B N + T/B)$ memory transfers.

27 citations

Journal Article
TL;DR: This work presents a time-space trade-off for approximate string matching and regular expression matching on text compressed with the Ziv-Lempel adaptive dictionary compression schemes, improving the previously known complexities for both problems and significantly improving the space bounds.
Abstract: We study the approximate string matching and regular expression matching problem for the case when the text to be searched is compressed with the Ziv-Lempel adaptive dictionary compression schemes. We present a time-space trade-off that leads to algorithms improving the previously known complexities for both problems. In particular, we significantly improve the space bounds, which in practical applications are likely to be a bottleneck.

27 citations


Cited by
Journal Article
TL;DR: This article reviews the terminology used for phylogenetic networks and covers both split networks and reticulate networks, how they are defined, and how they can be interpreted and outlines the beginnings of a comprehensive statistical framework for applying split network methods.
Abstract: The evolutionary history of a set of taxa is usually represented by a phylogenetic tree, and this model has greatly facilitated the discussion and testing of hypotheses. However, it is well known that more complex evolutionary scenarios are poorly described by such models. Further, even when evolution proceeds in a tree-like manner, analysis of the data may not be best served by using methods that enforce a tree structure but rather by a richer visualization of the data to evaluate its properties, at least as an essential first step. Thus, phylogenetic networks should be employed when reticulate events such as hybridization, horizontal gene transfer, recombination, or gene duplication and loss are believed to be involved, and, even in the absence of such events, phylogenetic networks have a useful role to play. This article reviews the terminology used for phylogenetic networks and covers both split networks and reticulate networks, how they are defined, and how they can be interpreted. Additionally, the article outlines the beginnings of a comprehensive statistical framework for applying split network methods. We show how split networks can represent confidence sets of trees and introduce a conservative statistical test for whether the conflicting signal in a network is treelike. Finally, this article describes a new program, SplitsTree4, an interactive and comprehensive tool for inferring different types of phylogenetic networks from sequences, distances, and trees.

7,273 citations

Journal Article
TL;DR: FastTree is a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, it stores sequence profiles of internal nodes in the tree, uses these profiles to implement Neighbor-Joining, and uses heuristics to quickly identify candidate joins.
Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement Neighbor-Joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and an alphabet of a different characters, a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O(NLa + N sqrt(N)) memory and O(N sqrt(N) log(N) La) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 h and 2.4 GB of memory. Just computing pairwise Jukes–Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 h and 50 GB of memory. In simulations, FastTree was slightly more accurate than Neighbor-Joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.

3,500 citations
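
As a rough sanity check on the 50 GB figure in the FastTree abstract above, the following sketch estimates the storage needed for all pairwise distances among 158,022 sequences. The 4-byte entry size and the choice to store only the upper triangle of the matrix are assumptions made for illustration; the abstract does not say how its figure was derived, and the function name is hypothetical.

```python
def pairwise_distance_bytes(n_sequences: int, bytes_per_entry: int = 4) -> int:
    """Bytes needed to store one distance per unordered pair of sequences.

    Assumption: only the upper triangle of the symmetric distance matrix
    is stored, with a fixed-size entry (4 bytes by default).
    """
    pairs = n_sequences * (n_sequences - 1) // 2
    return pairs * bytes_per_entry


if __name__ == "__main__":
    n = 158_022  # number of 16S ribosomal RNA sequences quoted in the abstract
    total = pairwise_distance_bytes(n)
    print(f"{total / 1e9:.1f} GB")  # decimal gigabytes
```

Under these assumptions the estimate lands close to the quoted 50 GB, which illustrates why quadratic storage becomes the bottleneck at this scale.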

Journal Article
TL;DR: FastTree uses sequence profiles of internal nodes in the tree to implement neighbor-joining, uses heuristics to quickly identify candidate joins, and then uses nearest-neighbor interchanges to reduce the length of the tree.
Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement neighbor-joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest-neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and a different characters, a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O( NLa + N sqrt(N) ) memory and O( N sqrt(N) log(N) L a ) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 hours and 2.4 gigabytes of memory. Just computing pairwise Jukes-Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 hours and 50 gigabytes of memory. In simulations, FastTree was slightly more accurate than neighbor joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.

2,436 citations