Author

Rolf Fagerberg

Bio: Rolf Fagerberg is an academic researcher from the University of Southern Denmark. The author has contributed to research on the topics Cache-oblivious algorithm and Vertex (geometry). The author has an h-index of 25 and has co-authored 97 publications receiving 1906 citations. Previous affiliations of Rolf Fagerberg include Aarhus University and Odense University.


Papers
Journal Article
Rolf Fagerberg
TL;DR: A generalization of binomial queues is presented, the definition of which involves a freely chosen sequence of integers greater than one that leads to different worst case bounds for the priority queue operations, allowing the user to adapt the data structure to the needs of a specific application.

8 citations

Book Chapter
TL;DR: This paper presents the zipper tree, an $O(\log \log n)$-competitive online binary search tree that performs each access in $O(\log n)$ worst-case time, where $n$ is the number of nodes in the tree.
Abstract: We present the zipper tree, an $O(\log \log n)$-competitive online binary search tree that performs each access in $O(\log n)$ worst-case time. This shows that for binary search trees, optimal worst-case access time and near-optimal amortized access time can be guaranteed simultaneously.

7 citations

01 May 1994
TL;DR: The results proven in this paper show that congestion is unlikely to be a problem with a chromatic priority queue. The paper also presents some heuristics which can be used to reduce the probability of congestion problems even further.
Abstract: We investigate the problem of implementing a priority queue to be used in a parallel environment, where asynchronous processes have access to a shared memory. Chromatic trees are a generalization of red-black trees appropriate for applications in such an environment, and it turns out that an appropriate priority queue can be obtained via minor modifications of chromatic trees. As opposed to earlier proposals, our deletemin operation is worst-case constant time, and insert is carried out as a fast search and constant time update, followed by an amortized constant number of rebalancing operations, which can be performed later by other processes, one at a time. If a general delete is desired, it can be implemented as a fast search and constant time update, followed by an amortized constant number of rebalancing operations, which again can be performed later by other processes, one at a time. The amortization results here extend the results previously obtained for chromatic search trees. Since our proposal differs from previous work in that the processes doing updates need not have an exclusive lock on the root at any point, the number of processes which can work on the structure at any one time is not limited by the height of the tree. In fact, parallelism on the order of the size of the tree is possible. The results proven in this paper show that congestion is unlikely to be a problem with a chromatic priority queue. We also present some heuristics which can be used if one wishes to even further reduce the probability of any problem due to congestion. As always, when working in an asynchronous environment, a locking scheme is required. We have included one possible locking scheme, specifically tuned to this application.

7 citations

Journal Article
TL;DR: It is proved that randomized Quicksort performs expected O(n (1 + log (1 + Inv/n))) element swaps, where Inv denotes the number of inversions in the input sequence. This provides a theoretical explanation for the empirically observed adaptive running time and gives new insights into the behavior of the Quicksort algorithm.
Abstract: Quicksort was first introduced in 1961 by Hoare. Many variants have been developed, the best of which are among the fastest generic sorting algorithms available, as testified by the choice of Quicksort as the default sorting algorithm in most programming libraries. Some sorting algorithms are adaptive, i.e. they have a complexity analysis which is better for inputs which are nearly sorted, according to some specified measure of presortedness. Quicksort is not among these, as it uses Omega(n log n) comparisons even when the input is already sorted. However, in this paper we demonstrate empirically that the actual running time of Quicksort is adaptive with respect to the presortedness measure Inv. Differences close to a factor of two are observed between instances with low and high Inv value. We then show that for the randomized version of Quicksort, the number of element swaps performed is provably adaptive with respect to the measure Inv. More precisely, we prove that randomized Quicksort performs expected O(n (1 + log (1 + Inv/n))) element swaps, where Inv denotes the number of inversions in the input sequence. This result provides a theoretical explanation for the observed behavior, and gives new insights on the behavior of the Quicksort algorithm. We also give some empirical results on the adaptive behavior of Heapsort and Mergesort.

7 citations
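
The adaptivity claim in the Quicksort abstract above can be explored with a small experiment. The following is a minimal sketch, not the authors' code: it counts Inv with a merge-sort pass and counts element swaps in a randomized Quicksort. The Hoare-style partition, the pivot selection rule, and the function names `count_inversions` and `quicksort_swaps` are assumptions made for illustration; the abstract does not specify which Quicksort variant the analysis covers.

```python
import random


def count_inversions(a):
    """Return Inv, the number of pairs i < j with a[i] > a[j] (merge-sort based)."""
    def sort(xs):
        if len(xs) <= 1:
            return xs, 0
        mid = len(xs) // 2
        left, inv_left = sort(xs[:mid])
        right, inv_right = sort(xs[mid:])
        merged, inv = [], inv_left + inv_right
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] precedes every remaining element of left
                merged.append(right[j])
                j += 1
                inv += len(left) - i
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inv

    return sort(list(a))[1]


def quicksort_swaps(a, rng):
    """Sort `a` in place with randomized Quicksort; return the number of element swaps.

    Assumption: Hoare-style partition with the pivot value drawn from a
    random index in [lo, hi - 1], which keeps both sub-ranges non-empty.
    """
    swaps = 0
    stack = [(0, len(a) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot = a[rng.randrange(lo, hi)]
        i, j = lo - 1, hi + 1
        while True:
            i += 1
            while a[i] < pivot:
                i += 1
            j -= 1
            while a[j] > pivot:
                j -= 1
            if i >= j:
                break
            a[i], a[j] = a[j], a[i]
            swaps += 1
        stack.append((lo, j))
        stack.append((j + 1, hi))
    return swaps


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10_000
    nearly_sorted = list(range(n))
    for _ in range(50):  # introduce a few local disorderings
        k = rng.randrange(n - 1)
        nearly_sorted[k], nearly_sorted[k + 1] = nearly_sorted[k + 1], nearly_sorted[k]
    shuffled = list(range(n))
    rng.shuffle(shuffled)
    for name, data in (("nearly sorted", nearly_sorted), ("shuffled", shuffled)):
        inv = count_inversions(data)
        print(name, "Inv =", inv, "swaps =", quicksort_swaps(list(data), rng))
```

Comparing the swap counts for inputs with low and high Inv gives a rough feel for the trend the bound describes; it is not a substitute for the paper's running-time measurements.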

Journal Article
TL;DR: Using graph transformation, this work derives about 1000 rules for amino acid side chain chemistry from the M-CSA database, a curated repository of enzymatic mechanisms, and proposes hundreds of hypothetical catalytic mechanisms for a large number of unrelated reactions in the Rhea database.
Abstract: Motivation: The design of enzymes is as challenging as it is consequential for making chemical synthesis in medical and industrial applications more efficient, cost-effective and environmentally friendly. While several aspects of this complex problem are computationally assisted, the drafting of catalytic mechanisms, i.e. the specification of the chemical steps (and hence intermediate states) that the enzyme is meant to implement, is largely left to human expertise. The ability to capture specific chemistries of multistep catalysis in a fashion that enables its computational construction and design is therefore highly desirable and would equally impact the elucidation of existing enzymatic reactions whose mechanisms are unknown. Results: We use the mathematical framework of graph transformation to express the distinction between rules and reactions in chemistry. We derive about 1000 rules for amino acid side chain chemistry from the M-CSA database, a curated repository of enzymatic mechanisms. Using graph transformation, we are able to propose hundreds of hypothetical catalytic mechanisms for a large number of unrelated reactions in the Rhea database. We analyze these mechanisms to find that they combine in chemically sound fashion individual steps from a variety of known multistep mechanisms, showing that plausible novel mechanisms for catalysis can be constructed computationally. Availability and implementation: The source code of the initial prototype of our approach is available at https://github.com/Nojgaard/mechsearch. Supplementary information: Supplementary data are available at Bioinformatics online.

6 citations


Cited by
Journal Article
TL;DR: This article reviews the terminology used for phylogenetic networks, covers both split networks and reticulate networks (how they are defined and how they can be interpreted), and outlines the beginnings of a comprehensive statistical framework for applying split network methods.
Abstract: The evolutionary history of a set of taxa is usually represented by a phylogenetic tree, and this model has greatly facilitated the discussion and testing of hypotheses. However, it is well known that more complex evolutionary scenarios are poorly described by such models. Further, even when evolution proceeds in a tree-like manner, analysis of the data may not be best served by using methods that enforce a tree structure but rather by a richer visualization of the data to evaluate its properties, at least as an essential first step. Thus, phylogenetic networks should be employed when reticulate events such as hybridization, horizontal gene transfer, recombination, or gene duplication and loss are believed to be involved, and, even in the absence of such events, phylogenetic networks have a useful role to play. This article reviews the terminology used for phylogenetic networks and covers both split networks and reticulate networks, how they are defined, and how they can be interpreted. Additionally, the article outlines the beginnings of a comprehensive statistical framework for applying split network methods. We show how split networks can represent confidence sets of trees and introduce a conservative statistical test for whether the conflicting signal in a network is treelike. Finally, this article describes a new program, SplitsTree4, an interactive and comprehensive tool for inferring different types of phylogenetic networks from sequences, distances, and trees.

7,273 citations

Journal Article
TL;DR: FastTree is a method for constructing large phylogenies and for estimating their reliability; instead of storing a distance matrix, it stores sequence profiles of internal nodes in the tree, uses them to implement Neighbor-Joining, and uses heuristics to quickly identify candidate joins.
Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement Neighbor-Joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and a different characters (a being the alphabet size), a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O(NLa + N sqrt(N)) memory and O(N sqrt(N) log(N) L a) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 h and 2.4 GB of memory. Just computing pairwise Jukes–Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 h and 50 GB of memory. In simulations, FastTree was slightly more accurate than Neighbor-Joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.

3,500 citations
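
The 50 GB figure quoted in the FastTree abstract above can be sanity-checked with a short calculation. This is a rough sketch under stated assumptions: that only the upper triangle of the symmetric distance matrix is stored and that each distance takes 4 bytes. Neither assumption is stated in the abstract, and the helper names are hypothetical.

```python
import math


def distance_matrix_bytes(n_sequences, bytes_per_entry=4):
    """Bytes needed for the upper triangle of an N x N distance matrix.

    Assumptions (not from the abstract): symmetric storage, 4-byte entries.
    """
    entries = n_sequences * (n_sequences - 1) // 2
    return entries * bytes_per_entry


def n_sqrt_n_entries(n_sequences):
    """Size of the N * sqrt(N) term in FastTree's stated memory bound, in entries."""
    return int(n_sequences * math.sqrt(n_sequences))


n = 158_022  # number of 16S sequences quoted in the abstract
matrix_gb = distance_matrix_bytes(n) / 1e9
print(f"full distance matrix: about {matrix_gb:.1f} GB")
print(f"N*sqrt(N) term: about {n_sqrt_n_entries(n) / 1e6:.1f} million entries")
```

Under these assumptions the matrix alone lands close to the quoted 50 GB, while the N sqrt(N) term is smaller by a factor of roughly sqrt(N)/2, which is consistent with the much lower memory use reported for FastTree.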

Journal Article
TL;DR: FastTree uses sequence profiles of internal nodes in the tree to implement neighbor-joining, uses heuristics to quickly identify candidate joins, and then uses nearest-neighbor interchanges to reduce the length of the tree.
Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement neighbor-joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest-neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and a different characters, a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O( NLa + N sqrt(N) ) memory and O( N sqrt(N) log(N) L a ) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 hours and 2.4 gigabytes of memory. Just computing pairwise Jukes-Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 hours and 50 gigabytes of memory. In simulations, FastTree was slightly more accurate than neighbor joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.

2,436 citations