Journal ArticleDOI

FastTree: Computing Large Minimum Evolution Trees with Profiles instead of a Distance Matrix

TL;DR: FastTree is a method for constructing large phylogenies and estimating their reliability; instead of storing a distance matrix, it stores sequence profiles of internal nodes in the tree, uses those profiles to implement Neighbor-Joining, and uses heuristics to quickly identify candidate joins.
Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement Neighbor-Joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and a different characters, a distance matrix requires O(N²) space and O(N²L) time, but FastTree requires just O(NLa + N√N) memory and O(N√N log(N)La) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 h and 2.4 GB of memory. Just computing pairwise Jukes–Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 h and 50 GB of memory. In simulations, FastTree was slightly more accurate than Neighbor-Joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.
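
To make the profile idea concrete, here is a minimal illustrative sketch in Python/NumPy. It is not FastTree's actual code: the function names and the nucleotide alphabet are assumptions, and FastTree itself uses weighted joins and corrected distances. The sketch shows the two operations the abstract describes: computing an expected fraction of differing characters directly from two profiles, and replacing two joined nodes with a single profile.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed alphabet for this sketch

def seq_to_profile(seq):
    """One-hot frequency profile for a leaf sequence: shape (L, a)."""
    idx = {c: i for i, c in enumerate(ALPHABET)}
    prof = np.zeros((len(seq), len(ALPHABET)))
    for s, ch in enumerate(seq.upper()):
        if ch in idx:                     # gaps/ambiguities stay all-zero
            prof[s, idx[ch]] = 1.0
    return prof

def profile_distance(p, q):
    """Expected fraction of differing characters between two profiles:
    the mean over sites of 1 - sum_c p[s, c] * q[s, c]."""
    return float(np.mean(1.0 - np.sum(p * q, axis=1)))

def join_profiles(p, q):
    """Profile of the new internal node as the average of its children.
    (FastTree weights this average; it is unweighted here for brevity.)"""
    return 0.5 * (p + q)

# Each join replaces two profiles with one, so only O(N) profiles are
# alive at any time instead of an O(N^2) distance matrix.
ab = join_profiles(seq_to_profile("ACGTACGT"), seq_to_profile("ACGAACGT"))
print(profile_distance(ab, seq_to_profile("ACGTTCGT")))
```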


Citations
Journal ArticleDOI
10 Mar 2010-PLOS ONE
TL;DR: Improvements to FastTree are described that increase its accuracy without sacrificing scalability; FastTree 2 allows the inference of maximum-likelihood phylogenies for huge alignments.
Abstract: Background We recently described FastTree, a tool for inferring phylogenies for alignments with up to hundreds of thousands of sequences. Here, we describe improvements to FastTree that improve its accuracy without sacrificing scalability.

10,010 citations


Cites methods or results from "FastTree: Computing Large Minimum E..."

  • ...The simulated protein alignments and the genuine COG alignments were described previously [2]....

  • ...0 is more accurate than most other minimum-evolution methods, but not as accurate as maximum-likelihood methods [2]....

  • ...We tested FastTree on simulated protein alignments with 250 to 5,000 sequences [2]....

  • ...Nevertheless, FastTree with NNIs and FastME with NNIs give very similar results [2], and computing the exact change in total tree length does not improve the accuracy of FastTree’s SPRs (data not shown)....

  • ...For example, on simulated protein alignments with just 10 sequences (from [2]), adding the CAT model improves FastTree’s accuracy from 76....

Journal ArticleDOI
TL;DR: This work sequences a diverse array of 25 environmental samples and three known “mock communities” at a depth averaging 3.1 million reads per sample, demonstrating excellent consistency in taxonomic recovery and recapturing diversity patterns that were previously reported on the basis of meta-analysis of many studies from the literature.
Abstract: The ongoing revolution in high-throughput sequencing continues to democratize the ability of small groups of investigators to map the microbial component of the biosphere. In particular, the coevolution of new sequencing platforms and new software tools allows data acquisition and analysis on an unprecedented scale. Here we report the next stage in this coevolutionary arms race, using the Illumina GAIIx platform to sequence a diverse array of 25 environmental samples and three known “mock communities” at a depth averaging 3.1 million reads per sample. We demonstrate excellent consistency in taxonomic recovery and recapture diversity patterns that were previously reported on the basis of meta-analysis of many studies from the literature (notably, the saline/nonsaline split in environmental samples and the split between host-associated and free-living communities). We also demonstrate that 2,000 Illumina single-end reads are sufficient to recapture the same relationships among samples that we observe with the full dataset. The results thus open up the possibility of conducting large-scale studies analyzing thousands of samples simultaneously to survey microbial communities at an unprecedented spatial and temporal resolution.

6,767 citations


Cites methods from "FastTree: Computing Large Minimum E..."

  • ...reference collection using fasttree (23) was used for the calculation of phylogeny-based α and β diversity metrics....

Journal ArticleDOI
TL;DR: An objective measure of genome quality is proposed that can be used to select genomes suitable for specific gene- and genome-centric analyses of microbial communities; the underlying method, CheckM, is shown to provide accurate estimates of genome completeness and contamination and to outperform existing approaches.
Abstract: Large-scale recovery of genomes from isolates, single cells, and metagenomic data has been made possible by advances in computational methods and substantial reductions in sequencing costs. Although this increasing breadth of draft genomes is providing key information regarding the evolutionary and functional diversity of microbial life, it has become impractical to finish all available reference genomes. Making robust biological inferences from draft genomes requires accurate estimates of their completeness and contamination. Current methods for assessing genome quality are ad hoc and generally make use of a limited number of “marker” genes conserved across all bacterial or archaeal genomes. Here we introduce CheckM, an automated method for assessing the quality of a genome using a broader set of marker genes specific to the position of a genome within a reference genome tree and information about the collocation of these genes. We demonstrate the effectiveness of CheckM using synthetic data and a wide range of isolate-, single-cell-, and metagenome-derived genomes. CheckM is shown to provide accurate estimates of genome completeness and contamination and to outperform existing approaches. Using CheckM, we identify a diverse range of errors currently impacting publicly available isolate genomes and demonstrate that genomes obtained from single cells and metagenomic data vary substantially in quality. In order to facilitate the use of draft genomes, we propose an objective measure of genome quality that can be used to select genomes suitable for specific gene- and genome-centric analyses of microbial communities.
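
To give a feel for the marker-gene idea, here is a deliberately simplified sketch. CheckM's actual model is lineage-specific and uses the collocation of marker sets, so the function and the marker names below are hypothetical illustrations only:

```python
def completeness_contamination(expected_markers, found_counts):
    """Toy marker-gene quality estimate (illustrative only; not
    CheckM's model). Completeness: fraction of expected single-copy
    markers found at least once. Contamination: extra copies of
    those markers, as a fraction of the expected set."""
    n = len(expected_markers)
    found = sum(1 for m in expected_markers if found_counts.get(m, 0) >= 1)
    extra = sum(max(found_counts.get(m, 0) - 1, 0) for m in expected_markers)
    return found / n, extra / n

# A draft genome missing one of four markers and carrying a duplicated
# one: 75% complete, 25% contamination by this toy measure.
print(completeness_contamination(
    ["rpoB", "gyrA", "recA", "dnaK"],
    {"rpoB": 1, "gyrA": 2, "recA": 1}))
```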

5,788 citations


Cites background from "FastTree: Computing Large Minimum E..."

  • ...1.3 (Price et al. 2009) under the WAG (Whelan and Goldman 2001) and GAMMA (Yang 1994) models....

Journal ArticleDOI
TL;DR: It is found that in direct contrast to the highly differentiated communities of their mothers, neonates harbored bacterial communities that were undifferentiated across multiple body habitats, regardless of delivery mode.
Abstract: Upon delivery, the neonate is exposed for the first time to a wide array of microbes from a variety of sources, including maternal bacteria. Although prior studies have suggested that delivery mode shapes the microbiota's establishment and, subsequently, its role in child health, most researchers have focused on specific bacterial taxa or on a single body habitat, the gut. Thus, the initiation stage of human microbiome development remains obscure. The goal of the present study was to obtain a community-wide perspective on the influence of delivery mode and body habitat on the neonate's first microbiota. We used multiplexed 16S rRNA gene pyrosequencing to characterize bacterial communities from mothers and their newborn babies, four born vaginally and six born via Cesarean section. Mothers' skin, oral mucosa, and vagina were sampled 1 h before delivery, and neonates' skin, oral mucosa, and nasopharyngeal aspirate were sampled <5 min, and meconium <24 h, after delivery. We found that in direct contrast to the highly differentiated communities of their mothers, neonates harbored bacterial communities that were undifferentiated across multiple body habitats, regardless of delivery mode. Our results also show that vaginally delivered infants acquired bacterial communities resembling their own mother's vaginal microbiota, dominated by Lactobacillus, Prevotella, or Sneathia spp., and C-section infants harbored bacterial communities similar to those found on the skin surface, dominated by Staphylococcus, Corynebacterium, and Propionibacterium spp. These findings establish an important baseline for studies tracking the human microbiome's successional development in different body habitats following different delivery modes, and their associated effects on infant health.

3,640 citations


Cites methods from "FastTree: Computing Large Minimum E..."

  • ...Taxonomy was assigned using the Ribosomal Database Project (RDP) classifier with a minimum support threshold of 60% (42) and the RDP taxonomic nomenclature....

Journal ArticleDOI
TL;DR: Soils collected across a long-term liming experiment were used to investigate the direct influence of pH on the abundance and composition of the two major soil microbial taxa, fungi and bacteria; both the relative abundance and diversity of bacteria were positively related to pH.
Abstract: Soils collected across a long-term liming experiment (pH 4.0-8.3), in which variation in factors other than pH have been minimized, were used to investigate the direct influence of pH on the abundance and composition of the two major soil microbial taxa, fungi and bacteria. We hypothesized that bacterial communities would be more strongly influenced by pH than fungal communities. To determine the relative abundance of bacteria and fungi, we used quantitative PCR (qPCR), and to analyze the composition and diversity of the bacterial and fungal communities, we used a bar-coded pyrosequencing technique. Both the relative abundance and diversity of bacteria were positively related to pH, the latter nearly doubling between pH 4 and 8. In contrast, the relative abundance of fungi was unaffected by pH and fungal diversity was only weakly related with pH. The composition of the bacterial communities was closely defined by soil pH; there was as much variability in bacterial community composition across the 180-m distance of this liming experiment as across soils collected from a wide range of biomes in North and South America, emphasizing the dominance of pH in structuring bacterial communities. The apparent direct influence of pH on bacterial community composition is probably due to the narrow pH ranges for optimal growth of bacteria. Fungal community composition was less strongly affected by pH, which is consistent with pure culture studies, demonstrating that fungi generally exhibit wider pH ranges for optimal growth.

2,966 citations


Cites methods from "FastTree: Computing Large Minimum E..."

  • ...Phylogenetic trees were then built from all representative sequences using the FastTree algorithm (Price et al., 2009)....

References
Journal ArticleDOI
TL;DR: The neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods for reconstructing phylogenetic trees from evolutionary distance data.
Abstract: A new method called the neighbor-joining method is proposed for reconstructing phylogenetic trees from evolutionary distance data. The principle of this method is to find pairs of operational taxonomic units (OTUs [= neighbors]) that minimize the total branch length at each stage of clustering of OTUs starting with a starlike tree. The branch lengths as well as the topology of a parsimonious tree can quickly be obtained by using this method. Using computer simulation, we studied the efficiency of this method in obtaining the correct unrooted tree in comparison with that of five other tree-making methods: the unweighted pair group method of analysis, Farris's method, Sattath and Tversky's method, Li's method, and Tateno et al.'s modified Farris method. The new, neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods.
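
To make "minimize the total branch length at each stage" concrete, the pair-selection step can be sketched in a few lines using the Studier and Keppler (1988) formulation of the criterion. This is an illustrative sketch, not code from either paper:

```python
import numpy as np

def nj_pick_pair(D):
    """Pick the next pair to join under the neighbor-joining criterion
    Q(i, j) = (n - 2) * D[i, j] - r_i - r_j, with r_i = sum_k D[i, k].
    Minimizing Q over all pairs minimizes total branch length at this
    stage of clustering. D: symmetric (n, n) matrix, zero diagonal."""
    n = D.shape[0]
    r = D.sum(axis=1)
    Q = (n - 2) * D - r[:, None] - r[None, :]
    np.fill_diagonal(Q, np.inf)   # a node cannot be joined with itself
    i, j = divmod(int(np.argmin(Q)), n)
    return min(i, j), max(i, j)

# Four taxa where (0, 1) and (2, 3) are the true neighbor pairs:
D = np.array([[0., 2., 7., 7.],
              [2., 0., 7., 7.],
              [7., 7., 0., 2.],
              [7., 7., 2., 0.]])
print(nj_pick_pair(D))   # (0, 1); (2, 3) ties it as the other true pair
```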

57,055 citations


"FastTree: Computing Large Minimum E..." refers methods in this paper

  • ...Given an alignment, Neighbor-Joining and related minimum evolution methods are the fastest and most scalable approaches for inferring phylogenies (Saitou and Nei, 1987; Studier and Keppler, 1988; Desper and Gascuel, 2002)....

  • ...FastTree: Computing Large Minimum Evolution Trees with Profiles instead of a Distance Matrix...

Journal ArticleDOI
TL;DR: The recently developed statistical method known as the “bootstrap” can be used to place confidence intervals on phylogenies; when all characters are perfectly compatible, it shows significant evidence for a group if that group is defined by three or more characters.
Abstract: The recently-developed statistical method known as the "bootstrap" can be used to place confidence intervals on phylogenies. It involves resampling points from one's own data, with replacement, to create a series of bootstrap samples of the same size as the original data. Each of these is analyzed, and the variation among the resulting estimates taken to indicate the size of the error involved in making estimates from the original data. In the case of phylogenies, it is argued that the proper method of resampling is to keep all of the original species while sampling characters with replacement, under the assumption that the characters have been independently drawn by the systematist and have evolved independently. Majority-rule consensus trees can be used to construct a phylogeny showing all of the inferred monophyletic groups that occurred in a majority of the bootstrap samples. If a group shows up 95% of the time or more, the evidence for it is taken to be statistically significant. Existing computer programs can be used to analyze different bootstrap samples by using weights on the characters, the weight of a character being how many times it was drawn in bootstrap sampling. When all characters are perfectly compatible, as envisioned by Hennig, bootstrap sampling becomes unnecessary; the bootstrap method would show significant evidence for a group if it is defined by three or more characters.
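
As a concrete sketch of this resampling scheme (with an illustrative function name, not from any particular package): keep every taxon and draw columns with replacement.

```python
import numpy as np

def bootstrap_columns(alignment, n_reps=100, seed=0):
    """Yield bootstrap replicates of an alignment, following the
    scheme argued for here: keep every taxon (row) and resample
    characters (columns) with replacement, each replicate having
    as many columns as the original."""
    rng = np.random.default_rng(seed)
    n_cols = alignment.shape[1]
    for _ in range(n_reps):
        cols = rng.integers(0, n_cols, size=n_cols)
        yield alignment[:, cols]

# Typical use: infer a tree from each replicate, then report, for each
# split of the original tree, the fraction of replicates containing it
# (95% or more is conventionally taken as statistically significant).
```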

40,349 citations


"FastTree: Computing Large Minimum E..." refers methods in this paper

  • ...is to use the bootstrap: to resample the columns of the alignment, to rerun the method 100–1,000 times, to compare the resulting trees to each other or to the tree inferred from the full alignment, and to count the number of times that each split occurs in the resulting trees (Felsenstein 1985)....

  • ...FastTree: Computing Large Minimum Evolution Trees with Profiles instead of a Distance Matrix...

Journal ArticleDOI
TL;DR: MUSCLE is a new computer program for creating multiple alignments of protein sequences that includes fast distance estimation using kmer counting, progressive alignment using a new profile function the authors call the log-expectation score, and refinement using tree-dependent restricted partitioning.
Abstract: We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5.com/muscle.
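
To illustrate the kmer-counting idea behind fast distance estimation, here is a toy alignment-free distance. It is a hedged sketch, not MUSCLE's exact formula or code:

```python
from collections import Counter

def kmer_distance(seq_a, seq_b, k=3):
    """Alignment-free distance in the spirit of MUSCLE's fast kmer
    distance estimation (illustrative only; not MUSCLE's exact
    formula): one minus the fraction of shared k-mers, counted
    with multiplicity and normalized by the shorter sequence."""
    count_a = Counter(seq_a[i:i + k] for i in range(len(seq_a) - k + 1))
    count_b = Counter(seq_b[i:i + k] for i in range(len(seq_b) - k + 1))
    shared = sum(min(count_a[w], count_b[w]) for w in count_a)
    denom = min(sum(count_a.values()), sum(count_b.values()))
    return 1.0 - shared / denom if denom else 1.0

print(kmer_distance("MKVLITGAGSG", "MKVLLTGAGSG"))  # small distance
```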

37,524 citations


"FastTree: Computing Large Minimum E..." refers background in this paper

  • ...A faster UPGMA variant of FastTree is available at http://www.microbesonline.org/fasttree and might be useful for this purpose, both because of its speed and because UPGMA guide trees may lead to better alignments (Edgar, 2004)....

Journal ArticleDOI
TL;DR: A nonparametric approach to the analysis of areas under correlated ROC curves is presented, by using the theory on generalized U-statistics to generate an estimated covariance matrix.
Abstract: Methods of evaluating and comparing the performance of diagnostic tests are of increasing importance as new tests are developed and marketed. When a test is based on an observed variable that lies on a continuous or graded scale, an assessment of the overall value of the test can be made through the use of a receiver operating characteristic (ROC) curve. The curve is constructed by varying the cutpoint used to determine which values of the observed variable will be considered abnormal and then plotting the resulting sensitivities against the corresponding false positive rates. When two or more empirical curves are constructed based on tests performed on the same individuals, statistical analysis on differences between curves must take into account the correlated nature of the data. This paper presents a nonparametric approach to the analysis of areas under correlated ROC curves, by using the theory on generalized U-statistics to generate an estimated covariance matrix.
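
The estimate at the heart of the method is the area under the empirical ROC curve, which equals the Mann-Whitney U-statistic. A minimal sketch of that estimate follows; DeLong's covariance machinery for comparing correlated curves on the same individuals is omitted:

```python
def auc_u_statistic(pos_scores, neg_scores):
    """Nonparametric area under the ROC curve as a generalized
    U-statistic: the probability that a random positive case scores
    above a random negative case, counting ties as 1/2. (The full
    method also derives the covariance matrix of several such AUCs
    computed on the same individuals; not shown here.)"""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

print(auc_u_statistic([0.9, 0.8, 0.7], [0.6, 0.8]))  # 4.5 / 6 = 0.75
```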

16,496 citations


"FastTree: Computing Large Minimum E..." refers methods in this paper

  • ...To quantify how effective the measures were in distinguishing correct splits, we used the area under the receiver operating characteristic curve (AUC, DeLong and Clarke-Pearson (1988))....

Journal ArticleDOI
TL;DR: This work has used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches.
Abstract: The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximum-likelihood principle, which clearly satisfies these requirements. The core of this method is a simple hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distance-based method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximum-likelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting of 500 rbcL sequences with 1,428 base pairs from plant plastids, thus reaching a speed of the same order as some popular distance-based and parsimony algorithms. This new method is implemented in the PHYML program, which is freely available on our web page: http://www.lirmm.fr/w3ifa/MAAS/. (Algorithm; computer simulations; maximum likelihood; phylogeny; rbcL; RDPII project.)
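
The "nearest neighbor interchanges" used during tree rearrangement are easy to state concretely. A toy sketch follows, with trees as nested tuples; the representation and the function name are illustrative assumptions, not PHYML's internals:

```python
def nni_variants(left, right):
    """Given an internal edge whose two endpoints carry subtree pairs
    (a, b) and (c, d), the two nearest-neighbor-interchange
    rearrangements each swap one subtree across the edge."""
    (a, b), (c, d) = left, right
    return [((a, c), (b, d)), ((a, d), (b, c))]

# Around the central edge of ((A, B), (C, D)), the two NNI alternatives
# regroup the leaves as ((A, C), (B, D)) and ((A, D), (B, C)).
print(nni_variants(("A", "B"), ("C", "D")))
```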

16,261 citations


"FastTree: Computing Large Minimum E..." refers methods in this paper

  • ...To quantify the quality of each topology, we used PhyML to optimize the branch lengths and compute the log likelihood....

  • ...To quantify the quality of each topology, we used PhyML with the Hasegawa–Kishino–Yano 85 model, which accounts for the higher rate of transitions over transversions, and four categories of gamma-distributed rates....

  • ...(Despite the high usage of virtual memory by PhyML, both PhyML and RAxML ran at over 99% CPU utilization.)...

  • ...We ran PhyML with the Jones, Taylor, and Thornton (JTT) model of amino acid substitution and four categories of gamma-distributed rates....

  • ...Even for COG alignments of just 1,250 proteins, PhyML 3 typically took over a week....
