# A Linear-Time Majority Tree Algorithm

15 Sep 2003-pp 216-227

TL;DR: A randomized linear-time algorithm is given for computing the majority rule consensus tree, which is widely used for summarizing a set of phylogenetic trees, usually as a post-processing step in constructing a phylogeny.

Abstract: We give a randomized linear-time algorithm for computing the majority rule consensus tree. The majority rule tree is widely used for summarizing a set of phylogenetic trees, which is usually a post-processing step in constructing a phylogeny. We are implementing the algorithm as part of an interactive visualization system for exploring distributions of trees, where speed is a serious concern for real-time interaction. The linear running time is achieved by using succinct representation of the subtrees and efficient methods for the final tree reconstruction.

## Summary (3 min read)

### 1 Introduction

- With the recent explosion in the amount of genomic data available, and exponential increases in computing power, biologists are now able to consider larger scale problems in phylogeny: that is, the construction of evolutionary trees on hundreds or thousands of taxa, and ultimately of the entire “Tree of Life” which would include millions of taxa.
- Large sets of trees arise given any kind of input data on the taxa (e.g. gene sequence, gene order, character) and whatever optimization criterion is used to select the “best” tree.
- Maximum likelihood estimation, also computationally hard, generally produces trees with unique scores.
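- The authors' visualization system is designed to support both kinds of projects: maximum parsimony or maximum likelihood analyses on sets of about 500 taxa, and very large phylogenies built with other methods such as genetic algorithms and super-tree methods.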

### 1.1 Notation

- Without loss of generality, the authors assume the input trees are rooted at the branch connecting a distinguished taxon s0, known as the outgroup, to the rest of the tree.
- Consider a node i in an input tree Tj. Removing the branch from i towards the root divides Tj into the subtree below i and the remainder of the tree (including s0).
- The induced bipartition of the taxa set into two subsets identifies the combinatorial type of node i.
- The majority rule tree, or M_l tree, includes nodes for exactly those bipartitions which occur in more than half of the input trees, or more generally in more than some fraction l of the input trees.
- While this example shows binary trees, the algorithm also works for input trees with polytomies (internal nodes of degree greater than three).
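
The definitions above can be made concrete with a small brute-force sketch. It is not the linear-time algorithm of the paper: it stores each set S(B) explicitly as a `frozenset`, which costs roughly O(tn^2) time overall. The nested-tuple tree encoding, the function names, and the three example trees are assumptions made for illustration; the outgroup s0 is left implicit, so each tuple is the subtree on the far side of the root branch.

```python
from collections import Counter

def clusters(tree):
    """Post-order traversal: return (taxa below tree, list of all clusters S(B))."""
    if isinstance(tree, str):                 # a leaf is a single taxon name
        leaf = frozenset([tree])
        return leaf, [leaf]
    taxa, found = frozenset(), []
    for child in tree:                        # visit children before the node itself
        child_taxa, child_found = clusters(child)
        taxa |= child_taxa
        found.extend(child_found)
    found.append(taxa)                        # the cluster of this internal node
    return taxa, found

def majority_bipartitions(trees, l=0.5):
    """Clusters that occur in more than a fraction l of the input trees."""
    counts = Counter()
    for tree in trees:
        counts.update(clusters(tree)[1])
    return {b for b, c in counts.items() if c > l * len(trees)}

# Hypothetical input: three trees on s1..s4, with the outgroup s0 left implicit.
T1 = ((("s1", "s2"), "s3"), "s4")
T2 = ((("s1", "s2"), "s4"), "s3")
T3 = (("s1", "s3"), ("s2", "s4"))
majority = majority_bipartitions([T1, T2, T3])
```

In this example the only non-trivial majority cluster is {s1, s2}, which appears in two of the three trees; the singletons and the full taxon set appear in every tree.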

### 1.2 Prior Work

- The authors' algorithm follows the same intuitive scheme as most previous algorithms.
- In the first stage, the authors read through the input trees and count the occurrences of each bipartition, storing the counts in a table.
- The bit-string representation used in PHYLIP requires n/w machine words per node, which accounts for the (n/w) factor in its running-time bound.
- If the authors assume that the size of a machine word is O(lg x), so that for instance they can compare two bipartitions in O(1) time, then they say that Day’s algorithm achieves an optimal O(tn) running time.
- Majority trees are also computed by PAUP [17], using an unknown (to us) algorithm.

### 2 Majority Rule Tree Algorithm

- The authors' algorithm has two main stages: scanning the trees to find the majority bipartitions (details in Section 2.1) and then constructing the majority rule tree from these bipartitions (details in Section 2.2).
- It ends by checking the output tree for errors due to (very unlikely) bad random choices.
- Figure 3 contains pseudo-code for the algorithm.

### 2.1 Finding Majority Bipartitions

- In the first stage of the algorithm, the authors traverse each input tree in post-order, determining each bipartition as they complete the traversal of its subtree.
- To handle collisions, the authors use a standard strategy called chaining: instead of storing a count at each table address, they store a linked list of counts, one for each bipartition which has hashed to that address.
- Similarly if B1 and B2 are bipartitions corresponding to leaves the authors can detect the double collision immediately by checking that the two taxa match before incrementing the count.
- A similar statement of course holds for h2, and when B has more than two children.
- The authors can use this fact to compute the hash code recursively during the postorder traversal.
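
A minimal sketch of the counting stage follows. The excerpt does not give the hash functions, so the sketch assumes an additive form: each taxon gets a random coefficient, and the code of a bipartition is the sum of its taxa's coefficients modulo the table size, so a parent's code is the sum of its children's codes. The moduli `m1` and `m2`, the record layout, and all names are illustrative assumptions. The check that two leaf records refer to the same taxon is omitted.

```python
import random

class DoubleCollision(Exception):
    """Two different bipartitions produced the same (h1, h2) pair."""

def count_bipartitions(trees, taxa, m1=1009, m2=1_000_003, seed=0):
    rng = random.Random(seed)
    a1 = {s: rng.randrange(m1) for s in taxa}   # random coefficients for h1
    a2 = {s: rng.randrange(m2) for s in taxa}   # random coefficients for h2
    table = {}                                  # h1 -> chain of [h2, cardinality, count]

    def visit(node):
        if isinstance(node, str):               # leaf: hash of a singleton
            h1, h2, size = a1[node], a2[node], 1
        else:                                   # internal node: combine the children
            h1 = h2 = size = 0
            for child in node:
                c1, c2, csize = visit(child)
                h1, h2, size = (h1 + c1) % m1, (h2 + c2) % m2, size + csize
        chain = table.setdefault(h1, [])        # chaining resolves h1 collisions
        for record in chain:
            if record[0] == h2:
                if record[1] != size:           # same codes, different cardinality
                    raise DoubleCollision
                record[2] += 1
                break
        else:
            chain.append([h2, size, 1])
        return h1, h2, size

    for tree in trees:
        visit(tree)
    return table

# Hypothetical input trees (outgroup s0 implicit).
trees = [((("s1", "s2"), "s3"), "s4"),
         ((("s1", "s2"), "s4"), "s3"),
         (("s1", "s3"), ("s2", "s4"))]
table = count_bipartitions(trees, ["s1", "s2", "s3", "s4"])
records = [r for chain in table.values() for r in chain]
```

Each node is handled with a constant number of arithmetic operations plus a chain lookup, which is what makes the first stage linear in expectation under the word-size assumption.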

### 2.2 Constructing the Majority Tree

- Once the authors have all the counts in the table they are ready to compute the majority rule consensus tree.
- The counts identify the majority bipartitions, that is, those that appear in more than lt trees.
- For any majority bipartition B and its parent.
- When the authors are done, each node B in the output tree, interior or leaf, points to the node of smallest cardinality that was an ancestor in any one of the input trees.
- Assuming there was no double collision, Facts 2, 3, and 4 imply that the output tree is the correct majority rule consensus tree.
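
The parent-selection rule described above can be sketched as a second pass over the input trees. For readability this sketch keys nodes by explicit `frozenset` clusters rather than by hash codes, so it illustrates the rule but not the linear running time; the tree encoding and all names are assumptions.

```python
def annotate(tree):
    """Post-order pass: return (cluster, children) for every node."""
    if isinstance(tree, str):
        return (frozenset([tree]), [])
    kids = [annotate(child) for child in tree]
    return (frozenset().union(*(k[0] for k in kids)), kids)

def hook_together(trees, majority):
    """Map each majority cluster to its smallest majority ancestor in any tree."""
    parent = {}

    def walk(node, ancestor):          # ancestor: nearest majority node above
        cluster, kids = node
        if cluster in majority:
            if ancestor is not None and (
                cluster not in parent or len(ancestor) < len(parent[cluster])
            ):
                parent[cluster] = ancestor
            ancestor = cluster
        for kid in kids:
            walk(kid, ancestor)

    for tree in trees:
        walk(annotate(tree), None)
    return parent

# Hypothetical input: the majority set is given, e.g. from the counting stage.
trees = [((("s1", "s2"), "s3"), "s4"),
         ((("s1", "s2"), "s4"), "s3"),
         (("s1", "s3"), ("s2", "s4"))]
everything = frozenset({"s1", "s2", "s3", "s4"})
pair = frozenset({"s1", "s2"})
majority = {frozenset({s}) for s in everything} | {pair, everything}
parent = hook_together(trees, majority)
```

Here s1 and s2 end up under {s1, s2}, while s3, s4 and {s1, s2} hang directly from the root cluster, giving an output tree with a polytomy at the root.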

### 2.3 Final Check

- After constructing the majority rule tree, the authors check it against the hash table in order to detect any occurrence of the final remaining case of a double collision, when two bipartitions B1, B2 of the same cardinality k > 1 have the same value for both h1 and h2.
- Recall that, if B1, B2 are singletons or have different cardinalities, double collisions would already have been detected when putting the data into the hash table.
- To check the tree, the authors do a post-order traversal of the completed majority rule tree, recursively computing the cardinality of the bipartition at each node, and checking that these cardinalities match those in the corresponding records in the hash table.
- Whenever B1 or B2 was encountered during the first stage of the algorithm, the count for B was incremented.
- Notice that since B1, B2 have the same cardinality, one cannot be the ancestor of the other; so the two sets S(B1), S(B2) are disjoint.
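
The cardinality check can be sketched as one post-order traversal of the output tree. The dictionary layout (`children`, `recorded`) and the node names are assumptions for illustration; in the algorithm itself the recorded cardinalities would come from the hash-table records.

```python
def cardinalities_match(children, recorded, root):
    """Return True if every node's recomputed cardinality equals its recorded one."""
    ok = True

    def size(node):
        nonlocal ok
        kids = children.get(node, [])
        # A leaf has cardinality 1; an internal node sums its children.
        actual = 1 if not kids else sum(size(k) for k in kids)
        if actual != recorded[node]:
            ok = False
        return actual

    size(root)
    return ok

# Hypothetical output tree: root -> (A, s3, s4), A -> (s1, s2).
children = {"root": ["A", "s3", "s4"], "A": ["s1", "s2"]}
recorded = {"root": 4, "A": 2, "s1": 1, "s2": 1, "s3": 1, "s4": 1}
good = cardinalities_match(children, recorded, "root")
bad = cardinalities_match(children, {**recorded, "A": 3}, "root")
```

A mismatch signals that some record was shared by two different bipartitions, in which case the algorithm would be rerun with fresh random choices.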

### 2.4 Analysis Summary

- The majority rule consensus tree algorithm runs in O(tn) time.
- It does two traversals of the input set, and every time it visits a node it does a constant number of operations, each of which requires constant expected time (again, assuming that w = O(lg x)).
- The final check of the majority tree takes O(n) time.
- The probability that any double collision occurs is 1/c, where c is the constant such that m2 > ctn.
- Thus the probability that the algorithm succeeds on its first try is 1 − 1/c, the probability that r attempts will be required decreases exponentially with r, and the expected number of attempts is less than two.
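
The last bullet can be spelled out as a short calculation. It assumes that attempts use independent random choices, that each attempt fails with probability at most $1/c$, and that $c > 2$ (the excerpt does not state the value of $c$). Writing $R$ for the number of attempts:

$$
\Pr[R \ge r] \le \left(\frac{1}{c}\right)^{r-1},
\qquad
E[R] = \sum_{r \ge 1} \Pr[R \ge r] \le \sum_{r \ge 1} c^{-(r-1)} = \frac{c}{c-1} < 2 \quad \text{for } c > 2 .
$$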

### 3 Weighted Trees

- The majority rule tree has an interesting characterization as the median of the set of input trees, which is useful for extending the definition to weighted trees.
- Now consider the median weight of each bipartition over all input trees, including those that do not contain the bipartition.
- Note that it is simple, although space-consuming, to compute this median weight for each majority bipartition in O(nt) time.
- In the second pass through the set of input trees, the authors store the weights for each majority edge in a linked list, as they are encountered.
- Since there are O(n) majority bipartitions and t trees the number of weights stored is O(nt).
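
A minimal sketch of the median computation for one majority bipartition follows. It assumes that a tree which does not contain the bipartition contributes a weight of zero; that convention, and the function name, are assumptions not stated in the excerpt.

```python
from statistics import median

def median_weight(weights, t):
    """Median over all t trees of one bipartition's weight.

    weights: the weights collected from the trees that contain the bipartition.
    Trees that lack it are assumed to contribute weight 0.
    """
    padded = list(weights) + [0.0] * (t - len(weights))
    return median(padded)

# Hypothetical weights: the bipartition appears in 2 of 3 trees.
m = median_weight([0.2, 0.4], t=3)
```

Because a majority bipartition occurs in more than half of the trees, fewer than half of the padded values are zeros, so with positive edge weights the median is one of the observed weights.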

### 4 Implementation

- The authors' majority rule consensus tree algorithm is implemented as part of their treeset visualization system, which in turn is implemented within Mesquite [10].
- Mesquite is a framework for phylogenetic analysis written by Wayne and David Maddison, available for download at their Web site [10].
- Mesquite is organized into cooperating modules.
- The authors' visualization system has been implemented in such a module, TreeSetVisualization, the first published version of which can be downloaded from their webpage [1].
- The majority tree implementation will be part of the next version of the module.

### 5 Acknowledgments

- The first author was also supported by an Alfred P. Sloan Foundation Research Fellowship.
- The authors thank Jeff Klingner for the tree set visualization module and Wayne and David Maddison for Mesquite, and for encouraging us to consider the majority tree.
- The second and third authors would like to thank the Department of Computer Sciences and the Center for Computational Biology and Bioinformatics at University of Texas, and the Computer Science Department at the University of California, Davis for hosting them for several visits during 2002 and 2003.

A Linear-Time Majority Tree Algorithm

Nina Amenta (1), Frederick Clarke (2), and Katherine St. John (2,3)

(1) Computer Science Department, University of California, 2063 Engineering II, One Shields Ave, Davis, CA 95616. amenta@cs.ucdavis.edu

(2) Dept. of Mathematics & Computer Science, Lehman College, City University of New York, Bronx, NY 12581. fclarke72@aol.com, stjohn@lehman.cuny.edu

(3) Department of Computer Science, CUNY Graduate Center, New York, NY 10016

Abstract. We give a randomized linear-time algorithm for computing the majority rule consensus tree. The majority rule tree is widely used for summarizing a set of phylogenetic trees, which is usually a post-processing step in constructing a phylogeny. We are implementing the algorithm as part of an interactive visualization system for exploring distributions of trees, where speed is a serious concern for real-time interaction. The linear running time is achieved by using succinct representation of the subtrees and efficient methods for the final tree reconstruction.

1 Introduction

Making sense of large quantities of data is a fundamental challenge in computational biology in general and phylogenetics in particular. With the recent explosion in the amount of genomic data available, and exponential increases in computing power, biologists are now able to consider larger scale problems in phylogeny: that is, the construction of evolutionary trees on hundreds or thousands of taxa, and ultimately of the entire “Tree of Life” which would include millions of taxa. One difficulty with this program is that most programs used for phylogeny reconstruction [8,9,17] are based upon heuristics for NP-hard optimization problems, and instead of producing a single optimal tree they generally output hundreds or thousands of likely candidates for the optimal tree. The usual way this large volume of data is summarized is with a consensus tree.

A consensus tree for a set of input trees is a single tree which includes features on which all or most of the input trees agree. There are several kinds of consensus trees. The simplest is the strict consensus tree, which includes only nodes that appear in all of the input trees. A node here is identified by the set of taxa in the subtree rooted at the node; the roots of two subtrees with different topologies, but on the same subset of taxa, are considered the same node. For some sets of input trees, the strict consensus tree works well, but for others, it produces a tree with very few interior (non-terminal) nodes, since if a node is missing in even one input tree it is not in the strict consensus. The majority rule consensus tree includes all nodes that appear in a majority of input trees, rather than all of them. The majority rule tree is interesting for a much broader range of inputs than the strict consensus tree. Other kinds of consensus tree, such as Adams consensus, are also used (see [3], §6.2, for an excellent overview of consensus methods). The maximum agreement subtree, which includes a maximal subset of taxa for which the subtrees induced by the input trees agree, gives meaningful results in some cases in which the majority rule tree does not, but the best algorithm has an $O(tn^3 + n^d)$ running time [7] (where $d$ is the maximum outdegree of the trees), which is not as practical for large trees as the majority rule tree. Much recent work has been done on the related question of combining trees on overlapping, but not identical, sets of taxa ([2,13,14,15,16]).

Fig. 1. The tree visualization module in Mesquite. The window on the left shows a projection of the distribution of trees. The user interactively selects subsets of trees with the mouse, and, in response, the consensus tree of the subset is computed on-the-fly and displayed in the window on the right. Two selected subsets and their majority trees are shown.

G. Benson and R. Page (Eds.): WABI 2003, LNBI 2812, pp. 216–227, 2003. © Springer-Verlag Berlin Heidelberg 2003

In this paper, we present a randomized algorithm to compute the majority rule consensus tree, where the expected running time is linear both in the number $t$ of trees and in the number $n$ of taxa. Earlier algorithms were quadratic in $n$, which will be problematic for larger phylogenies. Our $O(tn)$ expected running time is optimal, since just reading a set of $t$ trees on $n$ taxa requires $\Omega(tn)$ time. The expectation in the running time is over random choices made during the course of the algorithm, independent of the input; thus, on any input, the running time is linear with high probability.

We were motivated to find an efficient algorithm for the majority rule tree, because we wanted to compute it on-the-fly in an interactive visualization application [1]. The goal of the visualization system is to give the user a more sensitive description of the distribution of a set of trees than can be presented with a single consensus tree. Figure 1 shows a screen shot. The window on the left shows a representation of the distribution of trees, where each point corresponds to a tree. The user interactively selects subsets of trees and, in response, the consensus tree of the subset is computed on-the-fly and displayed. This package is built as a module within Mesquite [10], a framework for phylogenetic computation by Wayne and David Maddison. See Section 4 for more details.

Our original version of the visualization system computed only strict consensus trees. We found in our prototype implementation that a simple $O(tn^2)$ algorithm for the strict consensus tree was unacceptably slow for real-time interaction, and we implemented instead the $O(tn)$ strict consensus algorithm of Day [6]. This inspired our search for a linear-time majority tree algorithm.

Having an algorithm which is efficient in $t$ is essential, and most earlier algorithms focus on this. Large sets of trees arise given any kind of input data on the taxa (e.g. gene sequence, gene order, character) and whatever optimization criterion is used to select the “best” tree. The heuristic searches used for maximizing parsimony often return large sets of trees with equal parsimony scores. Maximum likelihood estimation, also computationally hard, generally produces trees with unique scores. While technically one of these is the optimal tree, there are many others for which the likelihood is only negligibly sub-optimal. So, the output of the computation is again more accurately represented by a consensus tree.

Handling larger sets of taxa is also becoming increasingly important. Maximum parsimony and maximum likelihood have been used on sets of about 500 taxa, while researchers are exploring other methods, including genetic algorithms and super-tree methods, for constructing very large phylogenies, with the ultimate goal of estimating the entire “Tree of Life”. Our visualization system is designed to support both kinds of projects. It is also important for the visualization application to have an algorithm which is efficient when $n > t$, so that when a user selects a small subset of trees on many taxa some efficiency can be realized.

1.1 Notation

Let $S$ represent a set of taxa, with $|S| = n$. Let $T = \{T_1, T_2, \ldots, T_t\}$ be the input set of trees, each with $n$ leaves labeled by $S$, with $|T| = t$.

Without loss of generality, we assume the input trees are rooted at the branch connecting a distinguished taxon $s_0$, known as the outgroup, to the rest of the tree. If $T$ is given as unrooted trees, or trees rooted arbitrarily, we choose an arbitrary taxon as $s_0$ and use it to root (or re-root) the trees.

[Figure 2: four panels labeled $T_1$, $T_2$, $T_3$ and “Majority rule consensus tree”, each with leaves $s_0$ through $s_4$.]

Fig. 2. Three input trees, rooted at the branch connecting $s_0$, and their majority tree (for a > 1/2 majority). The input trees need not be binary.

Consider a node $i$ in an input tree $T_j$. Removing the branch from $i$ towards the root divides $T_j$ into the subtree below $i$ and the remainder of the tree (including $s_0$). The induced bipartition of the taxa set into two subsets identifies the combinatorial type of node $i$. We can represent the bipartition by the subset of taxa which does not include $s_0$; that is, by the taxa at the leaves of the subtree rooted at $i$. If $B$ is the bipartition, this set is $S(B)$. We will say that the cardinality of $B$, and of $i$, is the cardinality of $S(B)$. For example, in Figure 2, $s_1 s_2 \mid s_0 s_3 s_4 s_5$ is a bipartition of tree $T_1$ and $S(s_1 s_2 \mid s_0 s_3 s_4 s_5) = \{s_1, s_2\}$. The cardinality of this bipartition is 2.

The majority rule tree, or $M_l$ tree, includes nodes for exactly those bipartitions which occur in more than half of the input trees, or more generally in more than some fraction $l$ of the input trees. Margush and McMorris [11] showed that this set of bipartitions does indeed constitute a tree for any $1/2 < l \le 1$. McMorris, Meronk and Neumann [12] called this family of trees the $M_l$ trees (e.g. the $M_1$ tree is the strict consensus tree); we shall call them all generically majority rule trees, regardless of the size of the majority.

See Figure 2 for a simple example. While this example shows binary trees, the algorithm also works for input trees with polytomies (internal nodes of degree greater than three).

1.2 Prior Work

Our algorithm follows the same intuitive scheme as most previous algorithms. In the first stage, we read through the input trees and count the occurrences of each bipartition, storing the counts in a table. Then, in the second stage, we create nodes for the bipartitions that occur in a majority of input trees (the majority nodes) and “hook them together” into a tree.

An algorithm along these lines is implemented in PHYLIP [8] by Felsenstein et al. The overall running time as implemented seems to be $O((n/w)(tn + x \lg x + n^2))$, where $x$ is the number of bipartitions found ($O(tn)$ in the worst case, but often $O(n)$), and $w$ is the number of bits in a machine word. The bipartition $B$ of each of the $tn$ input nodes is represented as a bit-string: a string of $n$ bits, one per taxon, with a one for every taxon in $S(B)$ and a zero for every taxon not in $S(B)$. This requires $n/w$ machine words per node, and accounts for the $(n/w)$ factor in the bound. The first term is for counting the bipartitions. The $x \lg x$ term is for sorting the bipartitions by the number of times each appears; it could be eliminated if the code was intended only to compute majority trees. The $n^2$ term is the running time for the subroutine for hooking together the majority nodes. For each majority node, every other majority node is tested to see if it is its parent, each in $n/w$ time.
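
The bit-string representation and the quadratic hooking step described above can be sketched with Python integers as bit-strings. This illustrates the scheme as the paragraph describes it and is not PHYLIP's actual code; the function names and the taxon indexing are assumptions.

```python
def to_bits(cluster, index):
    """Encode S(B) as an integer with one bit per taxon."""
    bits = 0
    for taxon in cluster:
        bits |= 1 << index[taxon]
    return bits

def is_subset(a, b):
    """True if every taxon of bit-string a is also in bit-string b."""
    return (a & b) == a

def quadratic_parents(majority_bits):
    """Pick the smallest proper superset of each node as its parent.

    Every node is tested against every other node, so k majority nodes
    need about k*k subset tests, each costing about n/w word operations.
    """
    parents = {}
    for b in majority_bits:
        supersets = [p for p in majority_bits if p != b and is_subset(b, p)]
        if supersets:
            parents[b] = min(supersets, key=lambda p: bin(p).count("1"))
    return parents

# Hypothetical majority nodes on taxa s1..s4 (outgroup s0 implicit).
index = {"s1": 0, "s2": 1, "s3": 2, "s4": 3}
nodes = [{"s1"}, {"s2"}, {"s3"}, {"s4"}, {"s1", "s2"}, {"s1", "s2", "s3", "s4"}]
parents = quadratic_parents([to_bits(c, index) for c in nodes])
```

With these inputs the singletons s1 and s2 (bit-strings 1 and 2) get parent 3, the code for {s1, s2}, and every other non-root node gets parent 15, the full taxon set.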

For the strict consensus tree, Day’s deterministic algorithm uses a clever $O((\lg x)/w)$ representation for bipartitions. If we assume that the size of a machine word is $O(\lg x)$, so that for instance we can compare two bipartitions in $O(1)$ time, then we say that Day’s algorithm achieves an optimal $O(tn)$ running time. Day’s algorithm does not seem to generalize to other $M_l$ trees, however. Wareham, in his undergraduate thesis at the Memorial University of Newfoundland with Day [18], developed an $O(n^2 + t^2 n)$ algorithm, which only uses $O(n)$ space. It uses Day’s data structure to test each bipartition encountered separately against all of the other input trees. Majority trees are also computed by PAUP [17], using an unknown (to us) algorithm.

Our algorithm follows the same general scheme, but we introduce a new representation for each bipartition of size $O((\lg x)/w) \approx O(1)$, giving an $O(tn)$ algorithm for the first counting step, and we also give an $O(tn)$ algorithm for hooking together the majority nodes.

2 Majority Rule Tree Algorithm

Our algorithm has two main stages: scanning the trees to find the majority bipartitions (details in Section 2.1) and then constructing the majority rule tree from these bipartitions (details in Section 2.2). It ends by checking the output tree for errors due to (very unlikely) bad random choices. Figure 3 contains pseudo-code for the algorithm.

2.1 Finding Majority Bipartitions

In the first stage of the algorithm, we traverse each input tree in post-order, determining each bipartition as we complete the traversal of its subtree. We count the number of times each bipartition occurs, storing the counts in a table. With the record containing the count, we also store the cardinality of the bipartition, which turns out to be needed as well.

A first thought might be to use the bit-string representation of a bipartition as an address into the table of counts, but this would be very space-inefficient: there are at most $O(tn)$ distinct bipartitions, but $2^n$ possible bit-strings. A better idea, used in our algorithm and in PHYLIP, is to store the counts in a hash-table.

##### References

•

01 Jan 1990

TL;DR: The updated new edition of the classic Introduction to Algorithms is intended primarily for use in undergraduate or graduate courses in algorithms or data structures and presents a rich variety of algorithms and covers them in considerable depth while making their design and analysis accessible to all levels of readers.

Abstract: From the Publisher:
The updated new edition of the classic Introduction to Algorithms is intended primarily for use in undergraduate or graduate courses in algorithms or data structures. Like the first edition, this text can also be used for self-study by technical professionals since it discusses engineering issues in algorithm design as well as the mathematical aspects.
In its new edition, Introduction to Algorithms continues to provide a comprehensive introduction to the modern study of algorithms. The revision has been updated to reflect changes in the years since the book's original publication. New chapters on the role of algorithms in computing and on probabilistic analysis and randomized algorithms have been included. Sections throughout the book have been rewritten for increased clarity, and material has been added wherever a fuller explanation has seemed useful or new information warrants expanded coverage.
As in the classic first edition, this new edition of Introduction to Algorithms presents a rich variety of algorithms and covers them in considerable depth while making their design and analysis accessible to all levels of readers. Further, the algorithms are presented in pseudocode to make the book easily accessible to students from all programming language backgrounds.
Each chapter presents an algorithm, a design technique, an application area, or a related topic. The chapters are not dependent on one another, so the instructor can organize his or her use of the book in the way that best suits the course's needs. Additionally, the new edition offers a 25% increase over the first edition in the number of problems, giving the book 155 problems and over 900 exercises that reinforce the concepts the students are learning.

21,651 citations

••

TL;DR: The program MRBAYES performs Bayesian inference of phylogeny using a variant of Markov chain Monte Carlo, and an executable is available at http://brahms.rochester.edu/software.html.

Abstract: Summary: The program MRBAYES performs Bayesian inference of phylogeny using a variant of Markov chain Monte Carlo. Availability: MRBAYES, including the source code, documentation, sample data files, and an executable, is available at http://brahms.biology.rochester.edu/software.html.

20,627 citations

### Additional excerpts

...One difficulty with this program is that most programs used for phylogeny reconstruction [8,9,17] are based upon heuristics for NP-hard optimization problems, and instead of producing a single optimal tree they generally output hundreds or thousands of likely candidates for the optimal tree....

[...]

01 Jan 2002

16,957 citations