Journal ArticleDOI

The neighbor-joining method: a new method for reconstructing phylogenetic trees.

01 Jul 1987-Molecular Biology and Evolution (Oxford University Press)-Vol. 4, Iss: 4, pp 406-425
TL;DR: The neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods for reconstructing phylogenetic trees from evolutionary distance data.
Abstract: A new method called the neighbor-joining method is proposed for reconstructing phylogenetic trees from evolutionary distance data. The principle of this method is to find pairs of operational taxonomic units (OTUs [= neighbors]) that minimize the total branch length at each stage of clustering of OTUs starting with a starlike tree. The branch lengths as well as the topology of a parsimonious tree can quickly be obtained by using this method. Using computer simulation, we studied the efficiency of this method in obtaining the correct unrooted tree in comparison with that of five other tree-making methods: the unweighted pair group method of analysis, Farris's method, Sattath and Tversky's method, Li's method, and Tateno et al.'s modified Farris method. The new, neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods.
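The clustering principle in the abstract can be illustrated with a minimal Python sketch. It is not the authors' original program. It assumes the commonly cited Q-matrix formulation of neighbor-joining, which is widely described as selecting the same pairs as the total-branch-length criterion, and the function name, the `frozenset`-keyed distance dictionary, and the four-OTU example distances are all hypothetical choices made for illustration.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return (node, node, branch_length) edges of an unrooted tree.

    names: list of OTU labels.
    dist:  dict mapping frozenset({a, b}) -> pairwise distance.
    Sketch only: uses the Q-matrix form of the pair-selection criterion.
    """
    nodes = list(names)
    d = dict(dist)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # r[i]: sum of distances from node i to every other active node.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair with the smallest Q_ij = (n - 2) * d_ij - r_i - r_j.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        # Branch lengths from i and j to the new internal node u.
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        u = f"node{next_id}"
        next_id += 1
        edges += [(i, u, li), (j, u, lj)]
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Hypothetical additive distances for four OTUs (illustration only).
pairs = {("A", "B"): 5, ("A", "C"): 7, ("A", "D"): 11,
         ("B", "C"): 8, ("B", "D"): 12, ("C", "D"): 6}
dist = {frozenset(k): v for k, v in pairs.items()}
edges = neighbor_joining(["A", "B", "C", "D"], dist)
total_length = sum(length for _, _, length in edges)
```

For these tree-like (additive) example distances the sketch pairs A with B and C with D, and each returned edge carries its estimated branch length, so topology and branch lengths come out of the same pass.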


Citations
Journal ArticleDOI
TL;DR: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved and modifications are incorporated into a new program, CLUSTAL W, which is freely available.
Abstract: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to down-weight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific gap penalties and locally reduced gap penalties in hydrophilic regions encourage new gaps in potential loop regions rather than regular secondary structure. Fourthly, positions in early alignments where gaps have been opened receive locally reduced gap penalties to encourage the opening up of new gaps at these positions. These modifications are incorporated into a new program, CLUSTAL W which is freely available.

63,427 citations

Journal ArticleDOI
TL;DR: The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models, inferring ancestral states and sequences, and estimating evolutionary rates site-by-site.
Abstract: Comparative analysis of molecular sequence data is essential for reconstructing the evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is a user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven to make it easier for the use of both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net.

39,110 citations


Cites methods from "The neighbor-joining method: a new ..."

  • ...MEGA5 automatically infers the evolutionary tree by the Neighbor-Joining (NJ) algorithm that uses a matrix of pairwise distances estimated under the Jones–Thornton–Taylor (JTT) model for amino acid sequences or the Tamura and Nei (1993) model for nucleotide sequences (Saitou and Nei 1987; Jones et al. 1992; Tamura and Nei 1993; Tamura et al. 2004)....

  • ...The initial tree for the ML search can be supplied by the user (Newick format) or generated automatically by applying NJ and BIONJ algorithms to a matrix of pairwise distances estimated using a maximum composite likelihood approach for nucleotide sequences and a JTT model for amino acid sequences (Saitou and Nei 1987; Jones et al. 1992; Gascuel 1997; Tamura et al. 2004)....


Journal ArticleDOI
TL;DR: MUSCLE is a new computer program for creating multiple alignments of protein sequences that includes fast distance estimation using kmer counting, progressive alignment using a new profile function the authors call the log-expectation score, and refinement using tree-dependent restricted partitioning.
Abstract: We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5.com/muscle.

37,524 citations


Cites methods from "The neighbor-joining method: a new ..."

  • ...Distance matrices are clustered using UPGMA (11), which we find to give slightly improved results over neighbor-joining (12), despite the expectation that neighbor-joining will give a more reliable estimate of the evolutionary tree....


Journal ArticleDOI
TL;DR: The latest version of the Molecular Evolutionary Genetics Analysis (Mega) software, which contains many sophisticated methods and tools for phylogenomics and phylomedicine, has been optimized for use on 64-bit computing systems for analyzing larger datasets.
Abstract: We present the latest version of the Molecular Evolutionary Genetics Analysis (Mega) software, which contains many sophisticated methods and tools for phylogenomics and phylomedicine. In this major upgrade, Mega has been optimized for use on 64-bit computing systems for analyzing larger datasets. Researchers can now explore and analyze tens of thousands of sequences in Mega. The new version also provides an advanced wizard for building timetrees and includes a new functionality to automatically predict gene duplication events in gene family trees. The 64-bit Mega is made available in two interfaces: graphical and command line. The graphical user interface (GUI) is a native Microsoft Windows application that can also be used on Mac OS X. The command line Mega is available as native applications for Windows, Linux, and Mac OS X. They are intended for use in high-throughput and scripted analysis. Both versions are available from www.megasoftware.net free of charge.

33,048 citations


Cites methods from "The neighbor-joining method: a new ..."

  • ...For the Neighbor-Joining (NJ) method (Saitou and Nei 1987), memory usage increased at a polynomial rate as the number of sequences was increased....


Journal ArticleDOI
TL;DR: Version 4 of MEGA software expands on the existing facilities for editing DNA sequence data from autosequencers, mining Web-databases, performing automatic and manual sequence alignment, analyzing sequence alignments to estimate evolutionary distances, inferring phylogenetic trees, and testing evolutionary hypotheses.
Abstract: We announce the release of the fourth version of MEGA software, which expands on the existing facilities for editing DNA sequence data from autosequencers, mining Web-databases, performing automatic and manual sequence alignment, analyzing sequence alignments to estimate evolutionary distances, inferring phylogenetic trees, and testing evolutionary hypotheses. Version 4 includes a unique facility to generate captions, written in figure legend format, in order to provide natural language descriptions of the models and methods used in the analyses. This facility aims to promote a better understanding of the underlying assumptions used in analyses, and of the results generated. Another new feature is the Maximum Composite Likelihood (MCL) method for estimating evolutionary distances between all pairs of sequences simultaneously, with and without incorporating rate variation among sites and substitution pattern heterogeneities among lineages. This MCL method also can be used to estimate transition/transversion bias and nucleotide substitution pattern without knowledge of the phylogenetic tree. This new version is a native 32-bit Windows application with multi-threading and multi-user supports, and it is also available to run in a Linux desktop environment (via the Wine compatibility layer) and on Intel-based Macintosh computers under the Parallels program. The current version of MEGA is available free of charge at (http://www.megasoftware.net).

29,021 citations


Cites methods from "The neighbor-joining method: a new ..."

  • ...…the Neighbor-Joining method (Saitou and Nei 1987), as the use of the MCL distances leads to a much higher accuracy (Tamura, Nei, and Kumar 2004)....


References
Journal ArticleDOI
TL;DR: The tree is not formed a portion at a time, but is formed en toto without intervening estimates of branch lengths, based on a relaxed additivity (four-point metric) constraint.
Abstract: A procedure is presented that forms an unrooted tree-like structure from a matrix of pairwise differences. The tree is not formed a portion at a time, as methods now in use generally do, but is formed en toto without intervening estimates of branch lengths. The method is based on a relaxed additivity (four-point metric) constraint. From the tree, a classification may be formed.

112 citations
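The four-point metric named in the abstract above can be made concrete with a short Python check. This is only a sketch of the strict four-point condition, not the relaxed constraint or the tree-building procedure of that paper: for four taxa, the three ways of pairing them give three distance sums, and on a tree-like (additive) matrix the two largest sums are equal while the smallest identifies the two neighbor pairs. The function name, the nested-dictionary layout, and the example distances are hypothetical.

```python
def four_point_split(d, a, b, c, e, tol=1e-9):
    """Return (best_split, is_additive) for the quartet a, b, c, e.

    d is a nested dict of symmetric pairwise distances, d[x][y].
    best_split is the pairing with the smallest distance sum;
    is_additive is True when the two larger sums agree within tol.
    """
    sums = {
        ((a, b), (c, e)): d[a][b] + d[c][e],
        ((a, c), (b, e)): d[a][c] + d[b][e],
        ((a, e), (b, c)): d[a][e] + d[b][c],
    }
    ordered = sorted(sums.items(), key=lambda kv: kv[1])
    best_split = ordered[0][0]
    is_additive = abs(ordered[1][1] - ordered[2][1]) <= tol
    return best_split, is_additive


# Hypothetical additive distances for four taxa (illustration only).
d = {
    "A": {"B": 5, "C": 7, "D": 11},
    "B": {"A": 5, "C": 8, "D": 12},
    "C": {"A": 7, "B": 8, "D": 6},
    "D": {"A": 11, "B": 12, "C": 6},
}
split, additive = four_point_split(d, "A", "B", "C", "D")
```

With real data the two larger sums rarely match exactly, which is presumably why a relaxed form of the constraint is needed in practice; the tolerance argument here is a crude stand-in for that idea.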

Journal ArticleDOI
Susan M. Case
TL;DR: Two biochemical methods, starch gel electrophoresis and microcomplement fixation, have been used in an examination of the evolutionary relationships among western North American frogs of the genus Rana and indicate that the Rana boylii species group presently includes two very different evolutionary lineages.
Abstract: Case, S. M. (Museum of Vertebrate Zoology and Departments of Zoology and Biochemistry, University of California, Berkeley, California 94720) 1978. Biochemical systematics of members of the genus Rana native to western North America. Syst. Zool. 27:299-311. -Few supraspecific groups have been defined in North American ranids and the informal groupings which are recognized are often poorly characterized. Two biochemical methods, starch gel electrophoresis and microcomplement fixation, have been used in an examination of the evolutionary relationships among western North American frogs of the genus Rana. Both the electrophoretic and albumin comparisons indicate that the Rana boylii species group presently includes two very different evolutionary lineages. Rana aurora, R. boylii, R. cascadae, R. muscosa, and R. pretiosa are all members of one lineage allied to R. temporaria of Europe. A Mexican species traditionally included in this group, R. tarahumarae, is most closely related to other members of the genus that occur in Mexico and is part of a larger lineage that also includes R. pipiens. Frogs found in eastern North America diverged from western European frogs in mid-Eocene; estimates of divergence time are consistent with the hypothesis that separation of these lineages coincided with the end of a land connection between Europe and North America. The catesbeiana, pipiens, and tarahumarae groups diverged from each other in the Oligocene. Western North American Rana diverged from a Eurasian ancestor in the Oligocene and radiated in this area to form the five members of the boylii group. [Evolutionary relationships; electrophoresis; microcomplement fixation; Rana; western North America.] There have been conflicting views about the relationships among North American species of the genus Rana, particularly of the western forms. The traditional species groups, which are primarily defined by similarities in external morphology, are presented in Table 1. 
The results of two recent studies, one which examined several osteological features (Chantell, 1970) and the other utilizing biochemical comparisons (Wallace et al., 1973), have suggested that the composition of the species groups of western ranids needs to be reevaluated. Their data suggested that R. boylii and R. muscosa were closely allied with the members of the aurora group, with the possible exception of R. sylvatica. This study deals primarily with Rana aurora, R. boylii, R. cascadae, R. muscosa, and R. pretiosa, all of which are native to the western United States, and R. tarahumarae which occurs primarily in Mexico. [Present address: Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts 02138.] The geographical ranges of these six species are shown in Figs. 1 and 2. The methods of starch gel electrophoresis and microcomplement fixation have been utilized here in an analysis of the evolutionary relationships among members of the genus Rana in western North America. MATERIALS AND METHODS Rana boylii from six localities, R. muscosa from eight localities, R. aurora draytoni from seven localities, R. pretiosa from two localities, R. tarahumarae and R. catesbeiana from one locality each were examined (see Appendix for locality data). Eleven enzymes encoded by 15 loci and five serum proteins were examined electrophoretically in horizontal starch gel. These were: lactate dehydrogenase (LDH-1 and 2), malate dehy-

95 citations

Journal ArticleDOI
TL;DR: In this article, the relative merits of four different tree-making methods in obtaining the correct topology were studied by using computer simulation, including unweighted pair-group method with arithmetic mean (UPGMA), Fitch and Margoliash's (FM), the distance Wagner (DW) method, and Tateno et al.'s modified Farris (MF) method.
Abstract: The relative merits of four different tree-making methods in obtaining the correct topology were studied by using computer simulation. The methods studied were the unweighted pair-group method with arithmetic mean (UPGMA), Fitch and Margoliash's (FM) method, the distance Wagner (DW) method, and Tateno et al.'s modified Farris (MF) method. An ancestral DNA sequence was assumed to evolve into eight sequences following a given model tree. Both constant and varying rates of nucleotide substitution were considered. Once the DNA sequences for the eight extant species were obtained, phylogenetic trees were constructed by using corrected (d) and uncorrected (p) nucleotide substitutions per site. The topologies of the trees obtained were then compared with that of the model tree. The results obtained can be summarized as follows: (1) The probability of obtaining the correct rooted or unrooted tree is low unless a large number of nucleotide differences exists between different sequences. (2) When the number of nucleotide substitutions per sequence is small or moderately large, the FM, DW, and MF methods show a better performance than UPGMA in recovering the correct topology. The former group of methods is particularly good for obtaining the correct unrooted tree. (3) When the number of substitutions per sequence is large, UPGMA is at least as good as the other methods, particularly for obtaining the correct rooted tree. (4) When the rate of nucleotide substitution varies with evolutionary lineage, the FM, DW, and MF methods show a better performance in obtaining the correct topology than UPGMA, except when a rooted tree is to be produced from data with a large number of nucleotide substitutions per sequence. (ABSTRACT TRUNCATED AT 250 WORDS)

75 citations
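The distinction between uncorrected (p) and corrected (d) distances in the abstract above can be sketched in Python. The abstract does not say which correction was applied, so the Jukes-Cantor formula used here is an assumption chosen because it is the simplest standard correction; the function names and example sequences are hypothetical.

```python
import math


def p_distance(seq1, seq2):
    """Uncorrected distance: proportion of aligned sites that differ."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned, equal-length, non-empty")
    diffs = sum(a != b for a, b in zip(seq1, seq2))
    return diffs / len(seq1)


def jukes_cantor(p):
    """Corrected distance under the Jukes-Cantor model (assumed here).

    d = -(3/4) * ln(1 - 4p/3); undefined once p reaches 0.75.
    """
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical aligned sequences differing at one of eight sites.
p = p_distance("ACGTACGT", "ACGTACGA")
d = jukes_cantor(p)
```

The corrected value is always at least as large as p, because it accounts for multiple substitutions at the same site that a simple count of differences cannot see.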

Journal ArticleDOI
TL;DR: This paper describes another method, related to the previous one, in which a present-day sequence can serve temporarily as an ancestor for purposes of determining the evolutionary tree regardless of the rates of evolution of the sequences involved.

64 citations

Journal ArticleDOI
TL;DR: The distance-Wagner algorithm is suggested as a possible heuristic method for the approximation of most-parsimonious trees from distance data, and a new algorithm developed in this study compares favorably with not only the distance-Wagner algorithm but also the character-Wagner algorithm in the approximation of most-parsimonious trees.
Abstract: Distance data have posed a number of problems for phylogenetic analysis. Among these are the loss of information about individual character states, and the frequent departures from metric properties of distance matrices derived by molecular techniques. The common degree-of-fit methods for the analysis of such data imply possibly unrealistic assumptions about these distances. As an alternative, a minimum-length criterion is considered. This has the appeal of requiring more conservative assumptions about distance data and represents the equivalent criterion to that for the analysis of character data by numerical cladistic techniques. Based upon its similarity to the character-Wagner algorithm, the distance-Wagner algorithm is suggested as a possible heuristic method for the approximation of most-parsimonious trees from distance data. Both the distance-Wagner algorithm and a recent modification have weaknesses in this role. Computer simulations demonstrate that the new algorithm developed in this study compares favorably with not only the distance-Wagner algorithm but also the character-Wagner algorithm in the approximation of most-parsimonious trees. (Metric; molecular distances; Wagner algorithm; parsimony; cladistics; immunological distance.) Among current methods for the inference of phylogenetic trees, those relying on the use of character data seem to contrast increasingly with those methods being developed that depend only upon the analysis of a matrix of pairwise distances among taxa. The contrast is not only in terms of the mathematics of the respective algorithms, but also apparently in terms of the goal of the analysis; what is being optimized may be quite different in the two cases. In the first case, for character data, an approximation to a phylogenetic tree is usually derived by the application of some form of a parsimony criterion (for a recent review of parsimony methods see Felsenstein, 1982).
For example, the character-Wagner algorithm of Farris (1970) is used to approximate an overall most-parsimonious tree (Wagner tree) by a sequence of individually most-parsimonious additions of taxa to the tree or network. The best approximation is equivalent to the tree that has minimum length. This length is computed using a Manhattan-metric measure based upon the character data. The second group of methods, using

28 citations