
Journal ArticleDOI

The neighbor-joining method: a new method for reconstructing phylogenetic trees.

01 Jul 1987-Molecular Biology and Evolution (Oxford University Press)-Vol. 4, Iss: 4, pp 406-425

TL;DR: The neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods for reconstructing phylogenetic trees from evolutionary distance data.

Abstract: A new method called the neighbor-joining method is proposed for reconstructing phylogenetic trees from evolutionary distance data. The principle of this method is to find pairs of operational taxonomic units (OTUs [= neighbors]) that minimize the total branch length at each stage of clustering of OTUs starting with a starlike tree. The branch lengths as well as the topology of a parsimonious tree can quickly be obtained by using this method. Using computer simulation, we studied the efficiency of this method in obtaining the correct unrooted tree in comparison with that of five other tree-making methods: the unweighted pair group method of analysis, Farris's method, Sattath and Tversky's method, Li's method, and Tateno et al.'s modified Farris method. The new, neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods.
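
As a rough illustration of the clustering principle described in the abstract, the following Python sketch starts from a starlike tree and repeatedly joins the pair of OTUs whose union gives the smallest total branch length. It uses the Q-matrix form of the selection criterion that is common in later textbook presentations, which is an assumption here rather than notation taken from the abstract. The function name `neighbor_joining`, the `node0`-style labels for internal nodes, and the five-OTU distance table are likewise hypothetical.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return a list of (parent, child, branch_length) edges of an unrooted tree.

    names: list of OTU labels.
    dist:  dict mapping frozenset({a, b}) -> distance between a and b.
    """
    d = dict(dist)          # working copy; new internal nodes are added to it
    nodes = list(names)     # nodes still attached to the central star
    edges = []
    next_id = 0

    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other current nodes.
        r = {a: sum(d[frozenset((a, b))] for b in nodes if b != a) for a in nodes}
        # Choose the pair minimizing Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j).
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        # Branch lengths from the new internal node u to i and to j.
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        u = f"node{next_id}"
        next_id += 1
        edges += [(u, i, li), (u, j, lj)]
        # Distance from u to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [u]

    # The last two nodes are connected by a single branch.
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Hypothetical additive distance table for five OTUs.
pairs = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
dist = {frozenset(k): v for k, v in pairs.items()}
edges = neighbor_joining(["a", "b", "c", "d", "e"], dist)
```

On this table the sketch assigns leaf branch lengths of 2, 3, 4, 2, and 1 to a, b, c, d, and e, and the seven branches sum to a total tree length of 17.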

Topics: Tree rearrangement (55%), Split (55%), Computational phylogenetics (54%), Split networks (54%), Phylogenetic network (54%)

Citations

Journal ArticleDOI
TL;DR: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved and modifications are incorporated into a new program, CLUSTAL W, which is freely available.
Abstract: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to down-weight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific gap penalties and locally reduced gap penalties in hydrophilic regions encourage new gaps in potential loop regions rather than regular secondary structure. Fourthly, positions in early alignments where gaps have been opened receive locally reduced gap penalties to encourage the opening up of new gaps at these positions. These modifications are incorporated into a new program, CLUSTAL W, which is freely available.

61,038 citations


Journal ArticleDOI
TL;DR: The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models, inferring ancestral states and sequences, and estimating evolutionary rates site-by-site.
Abstract: Comparative analysis of molecular sequence data is essential for reconstructing the evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is a user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven to make it easier for the use of both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net.

37,583 citations


Cites methods from "The neighbor-joining method: a new ..."

  • ...MEGA5 automatically infers the evolutionary tree by the Neighbor-Joining (NJ) algorithm that uses a matrix of pairwise distances estimated under the Jones–Thornton–Taylor (JTT) model for amino acid sequences or the Tamura and Nei (1993) model for nucleotide sequences (Saitou and Nei 1987; Jones et al. 1992; Tamura and Nei 1993; Tamura et al. 2004)....

    [...]

  • ...The initial tree for the ML search can be supplied by the user (Newick format) or generated automatically by applying NJ and BIONJ algorithms to a matrix of pairwise distances estimated using a maximum composite likelihood approach for nucleotide sequences and a JTT model for amino acid sequences (Saitou and Nei 1987; Jones et al. 1992; Gascuel 1997; Tamura et al. 2004)....

    [...]


Journal ArticleDOI
TL;DR: MUSCLE is a new computer program for creating multiple alignments of protein sequences that includes fast distance estimation using kmer counting, progressive alignment using a new profile function the authors call the log-expectation score, and refinement using tree-dependent restricted partitioning.
Abstract: We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5.com/muscle.

32,394 citations


Cites methods from "The neighbor-joining method: a new ..."

  • ...Distance matrices are clustered using UPGMA (11), which we find to give slightly improved results over neighbor-joining (12), despite the expectation that neighbor-joining will give a more reliable estimate of the evolutionary tree....

    [...]
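
For contrast with neighbor-joining, the UPGMA clustering mentioned in this excerpt can be sketched in a few lines of Python. This is a minimal illustration of size-weighted average linkage, not the MUSCLE implementation; the function name `upgma`, the nested-tuple tree representation, and the four-OTU distance table are hypothetical.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster OTUs by UPGMA.

    names: list of OTU labels.
    dist:  dict mapping frozenset({a, b}) -> distance between a and b.
    Returns (tree, merges): tree is a nested tuple, and merges lists each
    new cluster together with the height at which it was formed.
    """
    d = dict(dist)
    size = {name: 1 for name in names}   # number of OTUs in each cluster
    clusters = list(names)
    merges = []

    while len(clusters) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))] / 2
        new = (a, b)
        merges.append((new, height))
        # Distance to the new cluster is the size-weighted average.
        for c in clusters:
            if c not in (a, b):
                d[frozenset((new, c))] = (
                    size[a] * d[frozenset((a, c))]
                    + size[b] * d[frozenset((b, c))]
                ) / (size[a] + size[b])
        size[new] = size[a] + size[b]
        clusters = [c for c in clusters if c not in (a, b)] + [new]

    return clusters[0], merges


# Hypothetical ultrametric distance table for four OTUs.
table = {
    ("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
    ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10,
}
tree, merges = upgma(["A", "B", "C", "D"], {frozenset(k): v for k, v in table.items()})
```

Unlike neighbor-joining, this procedure always joins the closest pair, which implicitly assumes a constant rate of evolution across lineages and yields a rooted tree.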


Journal ArticleDOI
TL;DR: Version 4 of MEGA software expands on the existing facilities for editing DNA sequence data from autosequencers, mining Web-databases, performing automatic and manual sequence alignment, analyzing sequence alignments to estimate evolutionary distances, inferring phylogenetic trees, and testing evolutionary hypotheses.
Abstract: We announce the release of the fourth version of MEGA software, which expands on the existing facilities for editing DNA sequence data from autosequencers, mining Web-databases, performing automatic and manual sequence alignment, analyzing sequence alignments to estimate evolutionary distances, inferring phylogenetic trees, and testing evolutionary hypotheses. Version 4 includes a unique facility to generate captions, written in figure legend format, in order to provide natural language descriptions of the models and methods used in the analyses. This facility aims to promote a better understanding of the underlying assumptions used in analyses, and of the results generated. Another new feature is the Maximum Composite Likelihood (MCL) method for estimating evolutionary distances between all pairs of sequences simultaneously, with and without incorporating rate variation among sites and substitution pattern heterogeneities among lineages. This MCL method also can be used to estimate transition/transversion bias and nucleotide substitution pattern without knowledge of the phylogenetic tree. This new version is a native 32-bit Windows application with multi-threading and multi-user supports, and it is also available to run in a Linux desktop environment (via the Wine compatibility layer) and on Intel-based Macintosh computers under the Parallels program. The current version of MEGA is available free of charge at (http://www.megasoftware.net).

28,578 citations


Cites methods from "The neighbor-joining method: a new ..."

  • ...…Neighbor-Joining method (Saitou and Nei 1987), as the use of the MCL distances leads to a much higher accuracy (Tamura, Nei, and Kumar 2004)....

    [...]


Journal ArticleDOI
TL;DR: The latest version of the Molecular Evolutionary Genetics Analysis (Mega) software, which contains many sophisticated methods and tools for phylogenomics and phylomedicine, has been optimized for use on 64-bit computing systems for analyzing larger datasets.
Abstract: We present the latest version of the Molecular Evolutionary Genetics Analysis (Mega) software, which contains many sophisticated methods and tools for phylogenomics and phylomedicine. In this major upgrade, Mega has been optimized for use on 64-bit computing systems for analyzing larger datasets. Researchers can now explore and analyze tens of thousands of sequences in Mega. The new version also provides an advanced wizard for building timetrees and includes a new functionality to automatically predict gene duplication events in gene family trees. The 64-bit Mega is made available in two interfaces: graphical and command line. The graphical user interface (GUI) is a native Microsoft Windows application that can also be used on Mac OS X. The command line Mega is available as native applications for Windows, Linux, and Mac OS X. They are intended for use in high-throughput and scripted analysis. Both versions are available from www.megasoftware.net free of charge.

25,894 citations


Cites methods from "The neighbor-joining method: a new ..."

  • ...For the Neighbor-Joining (NJ) method (Saitou and Nei 1987), memory usage increased at a polynomial rate as the number of sequences was increased....

    [...]


References

Book
01 Feb 1987
TL;DR: Recent developments of statistical methods in molecular phylogenetics are reviewed and it is shown that the mathematical foundations of these methods are not well established, but computer simulations and empirical data indicate that currently used methods produce reasonably good phylogenetic trees when a sufficiently large number of nucleotides or amino acids are used.
Abstract: Recent developments of statistical methods in molecular phylogenetics are reviewed. It is shown that the mathematical foundations of these methods are not well established, but computer simulations and empirical data indicate that currently used methods such as neighbor joining, minimum evolution, likelihood, and parsimony methods produce reasonably good phylogenetic trees when a sufficiently large number of nucleotides or amino acids are used. However, when the rate of evolution varies extensively from branch to branch, many methods may fail to recover the true topology. Solid statistical tests for examining the accuracy of trees obtained by neighbor joining, minimum evolution, and least-squares method are available, but the methods for likelihood and parsimony trees are yet to be refined. Parsimony, likelihood, and distance methods can all be used for inferring amino acid sequences of the proteins of ancestral organisms that have become extinct.

15,371 citations


Additional excerpts

  • ...Here, $Z$ represents a group of OTUs including all but 1 and 2, and $D_{1Z}$ and $D_{2Z}$ are the distances between 1 and $Z$ and 2 and $Z$, respectively (see Nei 1987, pp. 298-302, for an elementary exposition of this method)....

    [...]
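
The branch-length estimates that this excerpt describes are usually written as follows. This is a sketch in standard notation, assuming $X$ denotes the internal node that joins OTUs 1 and 2 and $N$ the total number of OTUs; neither symbol appears in the excerpt itself.

```latex
\begin{align}
L_{1X} &= \tfrac{1}{2}\left(D_{12} + D_{1Z} - D_{2Z}\right), \\
L_{2X} &= \tfrac{1}{2}\left(D_{12} + D_{2Z} - D_{1Z}\right), \\
D_{1Z} &= \frac{1}{N-2}\sum_{i=3}^{N} D_{1i}, \qquad
D_{2Z} = \frac{1}{N-2}\sum_{i=3}^{N} D_{2i}.
\end{align}
```

Adding the first two lines gives $L_{1X} + L_{2X} = D_{12}$, so the two branches always account for the full distance between the joined pair.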


Book ChapterDOI
01 Jan 1969

9,921 citations


Journal ArticleDOI

3,493 citations


"The neighbor-joining method: a new ..." refers methods in this paper

  • ...After the nucleotide sequences for eight OTUs were produced, nucleotide differences were counted for all pairs of sequences, and the evolutionary distance (Jukes and Cantor 1969) was computed for each pair of OTUs....

    [...]
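
The distance computation mentioned in this excerpt can be sketched as below, assuming the usual one-parameter form of the Jukes and Cantor correction, d = -(3/4) ln(1 - 4p/3), where p is the proportion of sites that differ. The function name `jukes_cantor_distance` and the choice to raise an error at saturation are hypothetical and not taken from the paper.

```python
import math


def jukes_cantor_distance(seq_a, seq_b):
    """Estimate substitutions per site between two aligned nucleotide sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length, and non-empty")
    # Count nucleotide differences between the two sequences.
    differences = sum(x != y for x, y in zip(seq_a, seq_b))
    p = differences / len(seq_a)
    if p >= 0.75:
        # The correction is undefined once p reaches the saturation level 3/4.
        raise ValueError("sequences are too divergent for this correction")
    return -0.75 * math.log(1 - 4 * p / 3)


# One difference in ten sites gives p = 0.1.
d = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAT")
```

The corrected distance is always at least as large as the observed proportion p, because it accounts for multiple substitutions at the same site.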


Journal ArticleDOI
20 Jan 1967-Science

3,417 citations


"The neighbor-joining method: a new ..." refers methods in this paper

  • ...Our procedure of estimating branch lengths is essentially the same as that of Fitch and Margoliash (1967)....

    [...]


Book
01 Jan 1963
Abstract: This new edition continues the story of psychology with added research and enhanced content from the most dynamic areas of the field--cognition, gender and diversity studies, neuroscience and more, while at the same time using the most effective teaching approaches and learning tools.

3,328 citations