Journal ArticleDOI

Methods for Computing Wagner Trees

01 Mar 1970-Systematic Biology (Oxford University Press)-Vol. 19, Iss: 1, pp 83-92
TL;DR: The concept of a Wagner Network is formalized and a number of algorithms for calculating such networks are discussed; the rationale for the methods is not treated extensively, as it is published elsewhere (Kluge and Farris, 1969).
Abstract: Farris, J. S. (Biol. Sci., State Univ., Stony Brook, N. Y.) 1970. Methods for computing Wagner Trees. Syst. Zool., 19:83-92. The article derives some properties of Wagner Trees and Networks and describes computational procedures for Prim Networks, the Wagner Method, Rootless Wagner Method and optimization of hypothetical intermediates (HTUs). The Wagner Ground Plan Analysis method for estimating evolutionary trees has been widely employed in botanical studies (see references in Wagner, 1961) and has more recently been employed in zoological evolutionary taxonomy (Kluge, 1966; Kluge and Farris, 1969). Wagner Trees are one possible generalization of the most parsimonious trees of Camin and Sokal (1965). The Wagner technique is of considerable interest for quantitative evolutionary taxonomists because it is readily programmable and because the type of tree produced can tractably be extended to applications in a variety of novel quantitative phyletic techniques. In this paper I shall formalize the concept of a Wagner Network and discuss a number of algorithms for calculating such networks. The rationale for the methods described will not be treated extensively here, as it is published elsewhere (Kluge and Farris, 1969).
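The optimization of hypothetical intermediates (HTUs) mentioned in the abstract can be illustrated with a minimal sketch. It assumes numerically coded characters and Manhattan distance as the measure of branch length, neither of which is spelled out in the abstract itself; under that assumption, the HTU that minimizes the summed distance to three neighboring taxa takes the median state for each character. The function names `manhattan` and `median_htu` are hypothetical and chosen for illustration only.

```python
from statistics import median


def manhattan(a, b):
    """Manhattan distance between two character-state vectors (assumed length measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def median_htu(a, b, c):
    """Character-by-character median of three taxa.

    Under Manhattan distance, the median state minimizes the summed
    distance from the intermediate to its three neighbors.
    """
    return [median(states) for states in zip(a, b, c)]


# Three taxa scored for three characters (illustrative data, not from the paper).
a = [0, 1, 2]
b = [1, 1, 0]
c = [3, 0, 1]

htu = median_htu(a, b, c)
total_length = sum(manhattan(htu, taxon) for taxon in (a, b, c))
print(htu, total_length)
```

Moving any character of the HTU away from the median can only lengthen the three branches in total, which is why the median serves as the optimal intermediate in this setting.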
Citations
Journal ArticleDOI
TL;DR: It is shown that a combination of hill-climbing approaches and a stochastic perturbation method can be implemented time-efficiently; given the same CPU time as RAxML and PhyML, IQ-TREE found higher likelihoods in 62.2% to 87.1% of the studied alignments, thus efficiently exploring the tree-space.
Abstract: Large phylogenomics data sets require fast tree inference methods, especially for maximum-likelihood (ML) phylogenies. Fast programs exist, but due to inherent heuristics to find optimal trees, it is not clear whether the best tree is found. Thus, there is need for additional approaches that employ different search strategies to find ML trees and that are at the same time as fast as currently available ML programs. We show that a combination of hill-climbing approaches and a stochastic perturbation method can be time-efficiently implemented. If we allow the same CPU time as RAxML and PhyML, then our software IQ-TREE found higher likelihoods between 62.2% and 87.1% of the studied alignments, thus efficiently exploring the tree-space. If we use the IQ-TREE stopping rule, RAxML and PhyML are faster in 75.7% and 47.1% of the DNA alignments and 42.2% and 100% of the protein alignments, respectively. However, the range of obtaining higher likelihoods with IQ-TREE improves to 73.3-97.1%. IQ-TREE is freely available at http://www.cibiv.at/software/iqtree.

13,668 citations

Journal ArticleDOI
01 Jun 1992-Genetics
TL;DR: In this article, a framework for the study of molecular variation within a single species is presented, where information on DNA haplotype divergence is incorporated into an analysis of variance format, derived from a matrix of squared-distances among all pairs of haplotypes.
Abstract: We present here a framework for the study of molecular variation within a single species. Information on DNA haplotype divergence is incorporated into an analysis of variance format, derived from a matrix of squared-distances among all pairs of haplotypes. This analysis of molecular variance (AMOVA) produces estimates of variance components and F-statistic analogs, designated here as phi-statistics, reflecting the correlation of haplotypic diversity at different levels of hierarchical subdivision. The method is flexible enough to accommodate several alternative input matrices, corresponding to different types of molecular data, as well as different types of evolutionary assumptions, without modifying the basic structure of the analysis. The significance of the variance components and phi-statistics is tested using a permutational approach, eliminating the normality assumption that is conventional for analysis of variance but inappropriate for molecular data. Application of AMOVA to human mitochondrial DNA haplotype data shows that population subdivisions are better resolved when some measure of molecular differences among haplotypes is introduced into the analysis. At the intraspecific level, however, the additional information provided by knowing the exact phylogenetic relations among haplotypes or by a nonlinear translation of restriction-site change into nucleotide diversity does not significantly modify the inferred population genetic structure. Monte Carlo studies show that site sampling does not fundamentally affect the significance of the molecular variance components. The AMOVA treatment is easily extended in several different directions and it constitutes a coherent and flexible framework for the statistical analysis of molecular data.

12,835 citations

Journal ArticleDOI
TL;DR: A method for constructing networks from recombination-free population data that combines features of Kruskal's algorithm for finding minimum spanning trees by favoring short connections, and Farris's maximum-parsimony (MP) heuristic algorithm, which sequentially adds new vertices called "median vectors", except that the MJ method does not resolve ties.
Abstract: Reconstructing phylogenies from intraspecific data (such as human mitochondrial DNA variation) is often a challenging task because of large sample sizes and small genetic distances between individuals. The resulting multitude of plausible trees is best expressed by a network which displays alternative potential evolutionary paths in the form of cycles. We present a method ("median joining" [MJ]) for constructing networks from recombination-free population data that combines features of Kruskal's algorithm for finding minimum spanning trees by favoring short connections, and Farris's maximum-parsimony (MP) heuristic algorithm, which sequentially adds new vertices called "median vectors", except that our MJ method does not resolve ties. The MJ method is hence closely related to the earlier approach of Foulds, Hendy, and Penny for estimating MP trees but can be adjusted to the level of homoplasy by setting a parameter epsilon. Unlike our earlier reduced median (RM) network method, MJ is applicable to multistate characters (e.g., amino acid sequences). An additional feature is the speed of the implemented algorithm: a sample of 800 worldwide mtDNA hypervariable segment I sequences requires less than 3 h on a Pentium 120 PC. The MJ method is demonstrated on a Tibetan mitochondrial DNA RFLP data set.

9,937 citations

Journal ArticleDOI
TL;DR: A method is presented that is asserted to provide all hypothetical ancestral character states that are consistent with describing the descent of the present-day character states in a minimum number of changes of state using a predetermined phylogenetic relationship among the taxa represented.
Abstract: Fitch, W. M. (Dept. of Physiological Chemistry, Univ. of Wisconsin, Madison, Wisconsin, 53706), 1971. Toward defining the course of evolution: minimum change for a specific tree topology. Syst. Zool., 20:406-416.-A method is presented that is asserted to provide all hypothetical ancestral character states that are consistent with describing the descent of the present-day character states in a minimum number of changes of state using a predetermined phylogenetic relationship among the taxa represented. The character states used as examples are the four messenger RNA nucleotides encoding the amino acid sequences of proteins, but the method is general. [Evolution; parsimonious trees.] It has been a goal of those attempting to deduce phylogenetic relationships from information on biological characteristics to find the ancestral relationship(s) that would permit one to account for the descent of those characteristics in a manner requiring a minimum number of evolutionary steps or changes. The result could be called the most parsimonious evolutionary tree and might be expected to have a high degree of correspondence to the true phylogeny (Camin and Sokal, 1965). Its justification lies in the most efficient use of the information available and does not presuppose that evolution follows a most parsimonious course. There are no known algorithms for finding the most parsimonious tree(s) apart from the brute force method of examining nearly every possible tree.[1] This is impractical for trees involving a dozen or more taxonomic units. Most numerical taxonomic procedures (Sokal and Sneath, 1963; Farris, 1969, 1970; Fitch and Margoliash, 1967) provide dendrograms that would be among the more parsimonious solutions; one just cannot be sure that a more parsimonious tree structure does not exist. Farris (1970) has explicitly considered the parsimony principle as a part of his method which, like the present method, has its roots in the Wagner tree (Wagner,
[1] An elegant beginning to an attack on the problem has recently been published by Farris (1969) who developed a method which estimates the reliability of various characters and then weights the characters on the basis of that reliability.

7,028 citations
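Counting the minimum number of state changes on a predetermined tree, as described in the Fitch (1971) abstract above, can be sketched as follows. This is a minimal illustration covering only a bottom-up pass for a single unordered character on a rooted binary tree; it does not enumerate all consistent ancestral states, which the abstract says the full method provides. The nested-tuple tree encoding and the name `fitch_pass` are assumptions made for this example.

```python
def fitch_pass(tree, states):
    """Return (candidate state set, minimum change count) for a subtree.

    `tree` is either a leaf name (str) or a pair (left, right) of subtrees.
    `states` maps each leaf name to its observed character state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_pass(left, states)
    right_set, right_changes = fitch_pass(right, states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra change needed here.
        return shared, left_changes + right_changes
    # The children disagree: keep both possibilities and count one change.
    return left_set | right_set, left_changes + right_changes + 1


# Illustrative data: one nucleotide site scored for four taxa.
tree = (("t1", "t2"), ("t3", "t4"))
states = {"t1": "A", "t2": "G", "t3": "G", "t4": "G"}

root_set, changes = fitch_pass(tree, states)
print(root_set, changes)
```

For multiple characters, the same pass would be run once per character and the change counts summed to give the length of the tree.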

References
Journal ArticleDOI
TL;DR: In this paper, the basic problem of interconnecting a given set of terminals with a shortest possible network of direct links is considered, and a set of simple and practical procedures are given for solving this problem both graphically and computationally.
Abstract: The basic problem considered is that of interconnecting a given set of terminals with a shortest possible network of direct links. Simple and practical procedures are given for solving this problem both graphically and computationally. It develops that these procedures also provide solutions for a much broader class of problems, containing other examples of practical interest.
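The shortest connection problem described in this abstract is commonly solved with what is now called Prim's algorithm: grow the network from one terminal, repeatedly adding the shortest link that joins a new terminal to those already connected. The sketch below is a simple quadratic-time illustration, not the original procedure as published; the function name `prim_network`, the choice of Manhattan distance, and the sample points are assumptions for this example.

```python
def manhattan(a, b):
    """Manhattan distance between two coordinate vectors (assumed link length)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def prim_network(points, dist):
    """Return a list of (i, j, length) links forming a shortest spanning network."""
    n = len(points)
    connected = {0}  # start from an arbitrary terminal
    links = []
    while len(connected) < n:
        best = None
        # Find the shortest link from any connected terminal to an unconnected one.
        for i in connected:
            for j in range(n):
                if j in connected:
                    continue
                d = dist(points[i], points[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        links.append((best[1], best[2], best[0]))
        connected.add(best[2])
    return links


# Four illustrative terminals in the plane.
points = [[0, 0], [0, 1], [2, 1], [2, 2]]
links = prim_network(points, manhattan)
network_length = sum(length for _, _, length in links)
print(links, network_length)
```

A network built this way uses only the given terminals as nodes; it does not introduce intermediate points.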

4,395 citations

Journal ArticleDOI
TL;DR: Results indicate that the successive weighting procedure can be highly successful, even when cladistically reliable characters are heavily outnumbered by unreliable ones, and computer simulation tests of the technique are described.
Abstract: Farris, J. S. (Dept. Biol. Sci., State Univ., Stony Brook, New York 11790) 1969. A successive approximations approach to character weighting. Syst. Zool., 18:374-385.-Characters that are reliable for cladistic inference are those that are consistent with the true phyletic relationships, that is, those that have little homoplasy. A set of cladistically reliable characters are correlated with each other in a particular non-linear fashion here referred to as hierarchic correlation. Cladistically unreliable characters can be hierarchically correlated only by chance. A technique that infers cladistic relationships by successively weighting characters according to apparent cladistic reliability is suggested, and computer simulation tests of the technique are described. Results indicate that the successive weighting procedure can be highly successful, even when cladistically reliable characters are heavily outnumbered by unreliable ones. [Evolutionary taxonomy. Cladistics. Character weighting.]

1,207 citations

Journal ArticleDOI
TL;DR: With the advent of relatively objective classifications, such as the phenetic classifications produced by the operational techniques of numerical taxonomy, it was inevitable that biologists would wonder what phylogenetic conclusions could be drawn from them and with what reliability.
Abstract: With the advent of relatively objective classifications, such as the phenetic classifications produced by the operational techniques of numerical taxonomy (Sokal and Sneath, 1963), it was inevitable that biologists would wonder what phylogenetic conclusions could be drawn from them and with what reliability. If these phenetic taxonomies did not reflect all of the elements of phyletics (Sokal and Camin, 1965), could techniques be devised for deducing the latter? For example, could operational methods be devised for deducing the cladistic relationships among taxa, so that, given the same initial information, different investigators would obtain the same results? By cladistic relationships we mean the evolutionary branching sequences among taxonomic units without regard to phenetic similarities among them or to an absolute time scale. There is no question that phylogenies could probably be reconstructed without error for any taxonomic group if complete fossil sequences for that group were available. However, can cladistic reconstructions be carried out with any degree of

710 citations