
Journal ArticleDOI

MODELTEST: testing the model of DNA substitution.

01 Jan 1998-Bioinformatics (Oxford University Press)-Vol. 14, Iss: 9, pp 817-818

TL;DR: The program MODELTEST uses log likelihood scores to establish the model of DNA evolution that best fits the data.

Abstract: Summary: The program MODELTEST uses log likelihood scores to establish the model of DNA evolution that best fits the data. Availability: The MODELTEST package, including the source code and some documentation, is available at http://bioag.byu.edu/zoology/crandall_lab/modeltest.html. Contact: dp47@email.byu.edu.



Citations

Journal ArticleDOI
TL;DR: The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models, inferring ancestral states and sequences, and estimating evolutionary rates site-by-site.
Abstract: Comparative analysis of molecular sequence data is essential for reconstructing the evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is a user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven to make it easier for the use of both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net.

37,583 citations


Cites methods from "MODELTEST: testing the model of DNA..."

  • ...The Molecular Evolutionary Genetics Analysis (MEGA) software was developed with the goal of providing a biologist centric, integrated suite of tools for statistical analyses of DNA and protein sequence data from an evolutionary standpoint....



Journal ArticleDOI
TL;DR: The new version provides convergence diagnostics and allows multiple analyses to be run in parallel with convergence progress monitored on the fly, and provides more output options than previously, including samples of ancestral states, site rates, site dN/dS ratios, branch rates, and node dates.
Abstract: Since its introduction in 2001, MrBayes has grown in popularity as a software package for Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) methods. With this note, we announce the release of version 3.2, a major upgrade to the latest official release presented in 2003. The new version provides convergence diagnostics and allows multiple analyses to be run in parallel with convergence progress monitored on the fly. The introduction of new proposals and automatic optimization of tuning parameters has improved convergence for many problems. The new version also sports significantly faster likelihood calculations through streaming single-instruction-multiple-data extensions (SSE) and support of the BEAGLE library, allowing likelihood calculations to be delegated to graphics processing units (GPUs) on compatible hardware. Speedup factors range from around 2 with SSE code to more than 50 with BEAGLE for codon problems. Checkpointing across all models allows long runs to be completed even when an analysis is prematurely terminated. New models include relaxed clocks, dating, model averaging across time-reversible substitution models, and support for hard, negative, and partial (backbone) tree constraints. Inference of species trees from gene trees is supported by full incorporation of the Bayesian estimation of species trees (BEST) algorithms. Marginal model likelihoods for Bayes factor tests can be estimated accurately across the entire model space using the stepping stone method. The new version provides more output options than previously, including samples of ancestral states, site rates, site dN/dS ratios, branch rates, and node dates. A wide range of statistics on tree parameters can also be output for visualization in FigTree and compatible software.

14,723 citations


Cites methods from "MODELTEST: testing the model of DNA..."

  • ...It is standard practice today to select a substitution model for Bayesian phylogenetic inference using a priori model selection procedures (Goldman 1993; Posada 1998, 2008; Suchard et al. 2001)....



Journal ArticleDOI
TL;DR: jModelTest 2: more models, new heuristics and parallel computing. Diego Darriba, Guillermo L. Taboada, Ramón Doallo and David Posada.
Abstract: jModelTest 2: more models, new heuristics and parallel computing. Diego Darriba, Guillermo L. Taboada, Ramón Doallo and David Posada. Supplementary Table 1: New features in jModelTest 2. Supplementary Table 2: Model selection accuracy. Supplementary Table 3: Mean square errors for model averaged estimates. Supplementary Note 1: Hill-climbing hierarchical clustering algorithm. Supplementary Note 2: Heuristic filtering. Supplementary Note 3: Simulations from prior distributions. Supplementary Note 4: Speed-up benchmark on real and simulated datasets.

10,986 citations


Journal ArticleDOI
David Posada
TL;DR: jModelTest is a new program for the statistical selection of models of nucleotide substitution based on "Phyml" that implements 5 different selection strategies, including "hierarchical and dynamical likelihood ratio tests," the "Akaike information criterion", the "Bayesian information criterion," and a "decision-theoretic performance-based" approach.
Abstract: jModelTest is a new program for the statistical selection of models of nucleotide substitution based on "Phyml" (Guindon and Gascuel 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol. 52:696-704.). It implements 5 different selection strategies, including "hierarchical and dynamical likelihood ratio tests," the "Akaike information criterion," the "Bayesian information criterion," and a "decision-theoretic performance-based" approach. This program also calculates the relative importance and model-averaged estimates of substitution parameters, including a model-averaged estimate of the phylogeny. jModelTest is written in Java and runs under Mac OSX, Windows, and Unix systems with a Java Runtime Environment installed. The program, including documentation, can be freely downloaded from the software section at http://darwin.uvigo.es.

9,080 citations


Cites background from "MODELTEST: testing the model of DNA..."

  • ...jModelTest: Phylogenetic Model Averaging...

  • ...Among these, Modeltest (Posada and Crandall 1998) has been one of the most popular....

  • ...I want to acknowledge Sudhir Kumar for inviting me to present the latest advances in Modeltest at the 2006 SMBE annual meeting, which finally prompted the completion of jModelTest....

  • ...This note describes a new program called jModelTest that supersedes Modeltest in several aspects. jModelTest allows for the definition of restricted sets of candidate models (table 1), implements customizable "hierarchical likelihood ratio tests" (hLRTs) (Frati et al. 1997; Huelsenbeck and Crandall 1997; Sullivan et al. 1997) and "dynamic likelihood ratio tests" (dLRTs) (Posada and Crandall 2001), provides a rank of models according to the "Akaike Information Criterion" (AIC) (Akaike 1973), to the "Bayesian Information Criterion" (BIC) (Schwarz 1978) or to a "decision-theoretic performance-based" approach (DT) (Minin et al. 2003) (table 2), calculates the relative importance of every parameter, and computes model-averaged estimates of these, including a model-averaged estimate of the tree topology (Posada and Buckley 2004)....

  • ...Modeltest: testing the model of DNA substitution....


Journal ArticleDOI
TL;DR: Analysis of the microbiota of genetically obese ob/ob mice, lean ob/+ and wild-type siblings, and their ob/+ mothers, all fed the same polysaccharide-rich diet, indicates that obesity affects the diversity of the gut microbiota and suggests that intentional manipulation of community structure may be useful for regulating energy balance in obese individuals.
Abstract: We have analyzed 5,088 bacterial 16S rRNA gene sequences from the distal intestinal (cecal) microbiota of genetically obese ob/ob mice, lean ob/+ and wild-type siblings, and their ob/+ mothers, all fed the same polysaccharide-rich diet. Although the majority of mouse gut species are unique, the mouse and human microbiota(s) are similar at the division (superkingdom) level, with Firmicutes and Bacteroidetes dominating. Microbial-community composition is inherited from mothers. However, compared with lean mice and regardless of kinship, ob/ob animals have a 50% reduction in the abundance of Bacteroidetes and a proportional increase in Firmicutes. These changes, which are division-wide, indicate that, in this model, obesity affects the diversity of the gut microbiota and suggest that intentional manipulation of community structure may be useful for regulating energy balance in obese individuals.

4,700 citations


References

Journal ArticleDOI
Abstract: The history of the development of statistical hypothesis testing in time series analysis is reviewed briefly and it is pointed out that the hypothesis testing procedure is not adequately defined as the procedure for statistical model identification. The classical maximum likelihood estimation procedure is reviewed and a new estimate minimum information theoretical criterion (AIC) estimate (MAICE) which is designed for the purpose of statistical identification is introduced. When there are several competing models the MAICE is defined by the model and the maximum likelihood estimates of the parameters which give the minimum of AIC defined by AIC = (-2) log(maximum likelihood) + 2 (number of independently adjusted parameters within the model). MAICE provides a versatile procedure for statistical model identification which is free from the ambiguities inherent in the application of conventional hypothesis testing procedure. The practical utility of MAICE in time series analysis is demonstrated with some numerical examples.

42,619 citations
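
As a rough illustration of the AIC definition quoted in this abstract, the sketch below scores a few candidate substitution models and ranks them by AIC (lower is better). The log likelihood values and the helper names `aic` and `rank_by_aic` are hypothetical, chosen only for the example; the parameter counts cover substitution parameters only and ignore branch lengths, which is a simplifying assumption.

```python
def aic(log_likelihood, n_params):
    """AIC = -2 * ln(maximum likelihood) + 2 * (number of free parameters)."""
    return -2.0 * log_likelihood + 2.0 * n_params


def rank_by_aic(candidates):
    """Return (name, AIC, delta) tuples sorted from lowest AIC to highest.

    `candidates` maps a model name to (maximized log likelihood, parameter count).
    `delta` is the AIC difference from the best-scoring model.
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
    best = min(scores.values())
    return sorted(
        ((name, score, score - best) for name, score in scores.items()),
        key=lambda row: row[1],
    )


# Hypothetical maximized log likelihoods (not taken from any real data set)
# paired with the number of free substitution parameters of each model.
candidates = {
    "JC69": (-2500.0, 0),
    "HKY85": (-2450.0, 4),
    "GTR": (-2448.0, 8),
}

for name, score, delta in rank_by_aic(candidates):
    print(f"{name:6s} AIC={score:.1f} delta={delta:.1f}")
```

With these made-up numbers the richer GTR model has the highest likelihood, but its four extra parameters cost more than the likelihood gain, so HKY85 ranks first.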


Journal ArticleDOI
TL;DR: Some examples were worked out using reported globin sequences to show that synonymous substitutions occur at much higher rates than amino acid-altering substitutions in evolution.
Abstract: Some simple formulae were obtained which enable us to estimate evolutionary distances in terms of the number of nucleotide substitutions (and, also, the evolutionary rates when the divergence times are known). In comparing a pair of nucleotide sequences, we distinguish two types of differences; if homologous sites are occupied by different nucleotide bases but both are purines or both pyrimidines, the difference is called type I (or “transition” type), while, if one of the two is a purine and the other is a pyrimidine, the difference is called type II (or “transversion” type). Letting P and Q be respectively the fractions of nucleotide sites showing type I and type II differences between two sequences compared, then the evolutionary distance per site is K = -(1/2) ln {(1 - 2P - Q) √(1 - 2Q)}. The evolutionary rate per year is then given by k = K/(2T), where T is the time since the divergence of the two sequences. If only the third codon positions are compared, the synonymous component of the evolutionary base substitutions per site is estimated by K'_S = -(1/2) ln (1 - 2P - Q). Also, formulae for standard errors were obtained. Some examples were worked out using reported globin sequences to show that synonymous substitutions occur at much higher rates than amino acid-altering substitutions in evolution.

23,941 citations
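
The two-parameter distance described in this abstract can be sketched in a few lines. The code below is a minimal illustration, assuming the standard form of the estimator, K = -(1/2) ln[(1 - 2P - Q) √(1 - 2Q)], together with the rate k = K/(2T). The function names `k2p_distance` and `substitution_rate` and the toy sequences are hypothetical; sites with gaps or ambiguous bases are simply skipped, which is an assumption of this sketch rather than part of the method.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1, seq2):
    """Two-parameter distance per site between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes (assumption of this sketch)
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1    # type I: purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # type II: purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    if 1 - 2 * q <= 0 or 1 - 2 * p - q <= 0:
        raise ValueError("sequences too divergent for this estimator")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def substitution_rate(distance, divergence_time_years):
    """Rate per site per year, k = K / (2T)."""
    return distance / (2.0 * divergence_time_years)


# Toy alignment: one transition (G/A) and one transversion (C/A) in ten sites.
k = k2p_distance("GAAAAAAAAC", "AAAAAAAAAA")
print(k, substitution_rate(k, 1.0e7))
```

The guard on the logarithm's argument matters in practice: for highly divergent pairs the observed P and Q can make the argument non-positive, in which case the distance is undefined.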


Book ChapterDOI
01 Jan 1969

9,921 citations


Journal ArticleDOI
TL;DR: A new statistical method for estimating divergence dates of species from DNA sequence data by a molecular clock approach is developed, and this dating may pose a problem for the widely believed hypothesis that the bipedal creature Australopithecus afarensis, which lived some 3.7 million years ago, was ancestral to man and evolved after the human-ape splitting.
Abstract: A new statistical method for estimating divergence dates of species from DNA sequence data by a molecular clock approach is developed. This method takes into account effectively the information contained in a set of DNA sequence data. The molecular clock of mitochondrial DNA (mtDNA) was calibrated by setting the date of divergence between primates and ungulates at the Cretaceous-Tertiary boundary (65 million years ago), when the extinction of dinosaurs occurred. A generalized least-squares method was applied in fitting a model to mtDNA sequence data, and the clock gave dates of 92.3 +/- 11.7, 13.3 +/- 1.5, 10.9 +/- 1.2, 3.7 +/- 0.6, and 2.7 +/- 0.6 million years ago (where the second of each pair of numbers is the standard deviation) for the separation of mouse, gibbon, orangutan, gorilla, and chimpanzee, respectively, from the line leading to humans. Although there is some uncertainty in the clock, this dating may pose a problem for the widely believed hypothesis that the bipedal creature Australopithecus afarensis, which lived some 3.7 million years ago at Laetoli in Tanzania and at Hadar in Ethiopia, was ancestral to man and evolved after the human-ape splitting. Another likelier possibility is that mtDNA was transferred through hybridization between a proto-human and a proto-chimpanzee after the former had developed bipedalism.

7,677 citations


Journal ArticleDOI
01 Jan 1978

6,002 citations