Author

Johan A. A. Nylander

Bio: Johan A. A. Nylander is an academic researcher from the Swedish Museum of Natural History. The author has contributed to research in topics: Phylogenetic tree & Phylogenetics. The author has an h-index of 23 and has co-authored 38 publications receiving 6004 citations. Previous affiliations of Johan A. A. Nylander include Florida State University & Science for Life Laboratory.

Papers
Journal Article
TL;DR: A Bayesian MCMC approach to the analysis of combined data sets was developed and its utility in inferring relationships among gall wasps based on data from morphology and four genes was explored, supporting the utility of morphological data in multigene analyses.
Abstract: The recent development of Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) techniques has facilitated the exploration of parameter-rich evolutionary models. At the same time, stochastic models have become more realistic (and complex) and have been extended to new types of data, such as morphology. Based on this foundation, we developed a Bayesian MCMC approach to the analysis of combined data sets and explored its utility in inferring relationships among gall wasps based on data from morphology and four genes (nuclear and mitochondrial, ribosomal and protein coding). Examined models range in complexity from those recognizing only a morphological and a molecular partition to those having complex substitution models with independent parameters for each gene. Bayesian MCMC analysis deals efficiently with complex models: convergence occurs faster and more predictably for complex models, mixing is adequate for all parameters even under very complex models, and the parameter update cycle is virtually unaffected by model partitioning across sites. Morphology contributed only 5% of the characters in the data set but nevertheless influenced the combined-data tree, supporting the utility of morphological data in multigene analyses. We used Bayesian criteria (Bayes factors) to show that process heterogeneity across data partitions is a significant model component, although not as important as among-site rate variation. More complex evolutionary models are associated with more topological uncertainty and less conflict between morphology and molecules. Bayes factors sometimes favor simpler models over considerably more parameter-rich models, but the best model overall is also the most complex and Bayes factors do not support exclusion of apparently weak parameters from this model. 
Thus, Bayes factors appear to be useful for selecting among complex models, but it is still unclear whether their use strikes a reasonable balance between model complexity and error in parameter estimates.

1,758 citations
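
The Bayes factor comparison described in this abstract can be illustrated with a minimal sketch. It assumes that log marginal likelihoods for two competing models are already available; the numeric values and function names below are hypothetical and are not taken from the paper. The evidence labels follow the commonly used Kass and Raftery (1995) scale for 2 ln BF.

```python
def two_ln_bayes_factor(log_ml_1: float, log_ml_0: float) -> float:
    """Return 2*ln(BF_10) from the log marginal likelihoods of models 1 and 0."""
    return 2.0 * (log_ml_1 - log_ml_0)


def interpret(two_ln_bf: float) -> str:
    """Label the evidence for model 1 over model 0 (Kass and Raftery 1995 scale)."""
    if two_ln_bf < 0:
        return "favors model 0"
    if two_ln_bf < 2:
        return "barely worth mentioning"
    if two_ln_bf < 6:
        return "positive"
    if two_ln_bf < 10:
        return "strong"
    return "very strong"


# Hypothetical log marginal likelihoods for an unpartitioned and a partitioned model.
log_ml_simple = -15250.0
log_ml_partitioned = -15210.0
score = two_ln_bayes_factor(log_ml_partitioned, log_ml_simple)
print(score, interpret(score))
```

Working on the log scale avoids numerical underflow, since marginal likelihoods of phylogenetic models are far too small to represent directly.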

Journal Article
TL;DR: A simple tool is presented that uses the output from MCMC simulations and visualizes a number of properties of primary interest in a Bayesian phylogenetic analysis, such as convergence rates of posterior split probabilities and branch lengths.
Abstract: Summary: A key element to a successful Markov chain Monte Carlo (MCMC) inference is the programming and run performance of the Markov chain. However, the explicit use of quality assessments of the MCMC simulations—convergence diagnostics—in phylogenetics is still uncommon. Here, we present a simple tool that uses the output from MCMC simulations and visualizes a number of properties of primary interest in a Bayesian phylogenetic analysis, such as convergence rates of posterior split probabilities and branch lengths. Graphical exploration of the output from phylogenetic MCMC simulations gives intuitive and often crucial information on the success and reliability of the analysis. The tool presented here complements convergence diagnostics already available in other software packages primarily designed for other applications of MCMC. Importantly, the common practice of using trace-plots of a single parameter or summary statistic, such as the likelihood score of sampled trees, can be misleading for assessing the success of a phylogenetic MCMC simulation. Availability: The program is available as source under the GNU General Public License and as a web application at http://ceb.scs.fsu.edu/awty Contact: jwilgenb@scs.fsu.edu

1,740 citations
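
One way to picture the convergence of posterior split probabilities mentioned in this abstract is to track the running frequency of a split across the sampled trees. The sketch below is an illustration under stated assumptions, not the tool's actual code: each sampled tree is assumed to be reduced beforehand to a set of splits, and the function name and toy data are hypothetical.

```python
from typing import FrozenSet, List, Sequence, Set

Split = FrozenSet[str]


def cumulative_split_frequency(samples: Sequence[Set[Split]], split: Split) -> List[float]:
    """Running frequency of `split` after each successive sampled tree."""
    frequencies = []
    hits = 0
    for n_seen, tree_splits in enumerate(samples, start=1):
        if split in tree_splits:
            hits += 1
        frequencies.append(hits / n_seen)
    return frequencies


# Toy sample of four trees, each represented only by the splits it contains.
ab = frozenset({"A", "B"})
cd = frozenset({"C", "D"})
samples = [{ab, cd}, {cd}, {ab}, {ab, cd}]
print(cumulative_split_frequency(samples, ab))
```

If the chain has converged, the running frequency of each split should level off as more trees are added; a curve that keeps drifting suggests the run is too short.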

Journal Article
TL;DR: The results suggest that the Rubiaceae originated in the Paleotropics and used the boreotropical connection to reach South America, and the biogeographic patterns found corroborate the existence of a long-lasting dispersal barrier between the Northern and Central Andes, the “Western Andean Portal.”
Abstract: Recent phylogenetic studies have revealed the major role played by the uplift of the Andes in the extraordinary diversification of the Neotropical flora. These studies, however, have typically considered the Andean uplift as a single, time-limited event fostering the evolution of highland elements. This contrasts with geological reconstructions indicating that the uplift occurred in discrete periods from west to east and that it affected different regions at different times. We introduce an approach for integrating Andean tectonics with biogeographic reconstructions of Neotropical plants, using the coffee family (Rubiaceae) as a model group. The distribution of this family spans highland and montane habitats as well as tropical lowlands of Central and South America, thus offering a unique opportunity to study the influence of the Andean uplift on the entire Neotropical flora. Our results suggest that the Rubiaceae originated in the Paleotropics and used the boreotropical connection to reach South America. The biogeographic patterns found corroborate the existence of a long-lasting dispersal barrier between the Northern and Central Andes, the “Western Andean Portal.” The uplift of the Eastern Cordillera ended this barrier, allowing dispersal of boreotropical lineages to the South, but gave rise to a huge wetland system (“Lake Pebas”) in western Amazonia that prevented in situ speciation and floristic dispersal between the Andes and Amazonia for at least 6 million years. Here, we provide evidence of these events in plants.

561 citations

Journal Article
TL;DR: A Bayesian approach to dispersal-vicariance analysis is applied that allows a more accurate analysis of the biogeographic history of lineages and finds that despite the uncertainty in tree topology, ancestral area reconstructions indicate that the Turdus clade originated in the eastern Palearctic during the Late Miocene.
Abstract: The phylogeny of the thrushes (Aves: Turdus) has been difficult to reconstruct due to short internal branches and lack of node support for certain parts of the tree. Reconstructing the biogeographic history of this group is further complicated by the fact that current implementations of biogeographic methods, such as dispersal-vicariance analysis (DIVA; Ronquist, 1997), require a fully resolved tree. Here, we apply a Bayesian approach to dispersal-vicariance analysis that accounts for phylogenetic uncertainty and allows a more accurate analysis of the biogeographic history of lineages. Specifically, ancestral area reconstructions can be presented as marginal distributions, thus displaying the underlying topological uncertainty. Moreover, if there are multiple optimal solutions for a single node on a certain tree, integrating over the posterior distribution of trees often reveals a preference for a narrower set of solutions. We find that despite the uncertainty in tree topology, ancestral area reconstructions indicate that the Turdus clade originated in the eastern Palearctic during the Late Miocene. This was followed by an early dispersal to Africa from where a worldwide radiation took place. The uncertainty in tree topology and short branch lengths seems to indicate that this radiation took place within a limited time span during the Late Pliocene. The results support the role of Africa as a probable source area for intercontinental dispersals as suggested for other passerine groups, including basal diversification within the songbird tree.

363 citations
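
The idea of presenting ancestral area reconstructions as marginal distributions over a posterior sample of trees can be sketched as follows. This is a simplified illustration, not the authors' implementation. It assumes that the optimal reconstructions for a node of interest have already been obtained for each sampled tree, that trees lacking the node contribute nothing, and that equally optimal reconstructions on one tree share that tree's weight equally. All names and data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def marginal_area_frequencies(per_tree: Sequence[List[str]]) -> Dict[str, float]:
    """Average ancestral-area reconstructions for one node over sampled trees.

    Each element of `per_tree` lists the equally optimal reconstructions for
    the node on one tree; an empty list means the node is absent from that tree.
    """
    if not per_tree:
        raise ValueError("at least one sampled tree is required")
    totals: Dict[str, float] = defaultdict(float)
    for optimal in per_tree:
        if not optimal:
            continue  # node not present in this tree
        weight = 1.0 / len(optimal)  # split the tree's weight among ties
        for area in optimal:
            totals[area] += weight
    return {area: total / len(per_tree) for area, total in totals.items()}


# Toy posterior sample: area codes are hypothetical labels.
trees = [["A"], ["A", "AB"], ["B"], []]
print(marginal_area_frequencies(trees))
```

Because trees without the node add no weight, the frequencies sum to the posterior probability of the node rather than to one.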

Journal Article
TL;DR: A parametric method, dispersal–extinction–cladogenesis (DEC), is compared against a parsimony‐based method, dispersal–vicariance analysis (DIVA), which does not incorporate branch lengths but accounts for phylogenetic uncertainty through a Bayesian empirical approach (Bayes‐DIVA).
Abstract:
Aim: Recently developed parametric methods in historical biogeography allow researchers to integrate temporal and palaeogeographical information into the reconstruction of biogeographical scenarios, thus overcoming a known bias of parsimony-based approaches. Here, we compare a parametric method, dispersal-extinction-cladogenesis (DEC), against a parsimony-based method, dispersal-vicariance analysis (DIVA), which does not incorporate branch lengths but accounts for phylogenetic uncertainty through a Bayesian empirical approach (Bayes-DIVA). We analyse the benefits and limitations of each method using the cosmopolitan plant family Sapindaceae as a case study.
Location: World-wide.
Methods: Phylogenetic relationships were estimated by Bayesian inference on a large dataset representing generic diversity within Sapindaceae. Lineage divergence times were estimated by penalized likelihood over a sample of trees from the posterior distribution of the phylogeny to account for dating uncertainty in biogeographical reconstructions. We compared biogeographical scenarios between Bayes-DIVA and two different DEC models: one with no geological constraints and another that employed a stratified palaeogeographical model in which dispersal rates were scaled according to area connectivity across four time slices, reflecting the changing continental configuration over the last 110 million years.
Results: Despite differences in the underlying biogeographical model, Bayes-DIVA and DEC inferred similar biogeographical scenarios. The main differences were: (1) in the timing of dispersal events - which in Bayes-DIVA sometimes conflicts with palaeogeographical information, and (2) in the lower frequency of terminal dispersal events inferred by DEC. Uncertainty in divergence time estimations influenced both the inference of ancestral ranges and the decisiveness with which an area can be assigned to a node.
Main conclusions: By considering lineage divergence times, the DEC method gives more accurate reconstructions that are in agreement with palaeogeographical evidence. In contrast, Bayes-DIVA showed the highest decisiveness in unequivocally reconstructing ancestral ranges, probably reflecting its ability to integrate phylogenetic uncertainty. Care should be taken in defining the palaeogeographical model in DEC because of the possibility of overestimating the frequency of extinction events, or of inferring ancestral ranges that are outside the extant species ranges, owing to dispersal constraints enforced by the model. The wide-spanning spatial and temporal model proposed here could prove useful for testing large-scale biogeographical patterns in plants.

187 citations


Cited by
Journal Article
TL;DR: PAML, currently in version 4, is a package of programs for phylogenetic analyses of DNA and protein sequences using maximum likelihood (ML), which can be used to estimate parameters in models of sequence evolution and to test interesting biological hypotheses.
Abstract: PAML, currently in version 4, is a package of programs for phylogenetic analyses of DNA and protein sequences using maximum likelihood (ML). The programs may be used to compare and test phylogenetic trees, but their main strengths lie in the rich repertoire of evolutionary models implemented, which can be used to estimate parameters in models of sequence evolution and to test interesting biological hypotheses. Uses of the programs include estimation of synonymous and nonsynonymous rates (d(N) and d(S)) between two protein-coding DNA sequences, inference of positive Darwinian selection through phylogenetic comparison of protein-coding genes, reconstruction of ancestral genes and proteins for molecular restoration studies of extinct life forms, combined analysis of heterogeneous data sets from multiple gene loci, and estimation of species divergence times incorporating uncertainties in fossil calibrations. This note discusses some of the major applications of the package, which includes example data sets to demonstrate their use. The package is written in ANSI C, and runs under Windows, Mac OSX, and UNIX systems. It is available at http://abacus.gene.ucl.ac.uk/software/paml.html.

10,773 citations

Journal Article
TL;DR: The software package Tracer is presented, for visualizing and analyzing the MCMC trace files generated through Bayesian phylogenetic inference, which provides kernel density estimation, multivariate visualization, demographic trajectory reconstruction, conditional posterior distribution summary, and more.
Abstract: Bayesian inference of phylogeny using Markov chain Monte Carlo (MCMC) plays a central role in understanding evolutionary history from molecular sequence data. Visualizing and analyzing the MCMC-generated samples from the posterior distribution is a key step in any non-trivial Bayesian inference. We present the software package Tracer (version 1.7) for visualizing and analyzing the MCMC trace files generated through Bayesian phylogenetic inference. Tracer provides kernel density estimation, multivariate visualization, demographic trajectory reconstruction, conditional posterior distribution summary, and more. Tracer is open-source and available at http://beast.community/tracer.

5,492 citations

Journal Article
TL;DR: Two new objective methods for the combined selection of best-fit partitioning schemes and nucleotide substitution models are described and implemented in an open-source program, PartitionFinder, which it is hoped will encourage the objective selection of partitions and thus lead to improvements in phylogenetic analyses.
Abstract: In phylogenetic analyses of molecular sequence data, partitioning involves estimating independent models of molecular evolution for different sets of sites in a sequence alignment. Choosing an appropriate partitioning scheme is an important step in most analyses because it can affect the accuracy of phylogenetic reconstruction. Despite this, partitioning schemes are often chosen without explicit statistical justification. Here, we describe two new objective methods for the combined selection of best-fit partitioning schemes and nucleotide substitution models. These methods allow millions of partitioning schemes to be compared in realistic time frames and so permit the objective selection of partitioning schemes even for large multilocus DNA data sets. We demonstrate that these methods significantly outperform previous approaches, including both the ad hoc selection of partitioning schemes (e.g., partitioning by gene or codon position) and a recently proposed hierarchical clustering method. We have implemented these methods in an open-source program, PartitionFinder. This program allows users to select partitioning schemes and substitution models using a range of information-theoretic metrics (e.g., the Bayesian information criterion, Akaike information criterion [AIC], and corrected AIC). We hope that PartitionFinder will encourage the objective selection of partitioning schemes and thus lead to improvements in phylogenetic analyses. PartitionFinder is written in Python and runs under Mac OSX 10.4 and above. The program, source code, and a detailed manual are freely available from www.robertlanfear.com/partitionfinder.

4,877 citations
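
The information-theoretic metrics named in this abstract can be written out as a short sketch. It uses the standard textbook definitions of AIC, AICc, and BIC and is not PartitionFinder's own code. The scheme names, log-likelihoods, parameter counts, and the choice of alignment length as the sample size are all hypothetical assumptions made for illustration.

```python
import math


def aic(log_lik: float, k: int) -> float:
    """Akaike information criterion for k free parameters."""
    return 2.0 * k - 2.0 * log_lik


def aicc(log_lik: float, k: int, n: int) -> float:
    """Small-sample corrected AIC; requires n > k + 1."""
    if n <= k + 1:
        raise ValueError("AICc is undefined unless n > k + 1")
    return aic(log_lik, k) + (2.0 * k * (k + 1)) / (n - k - 1)


def bic(log_lik: float, k: int, n: int) -> float:
    """Bayesian information criterion for sample size n."""
    return k * math.log(n) - 2.0 * log_lik


# Hypothetical schemes: name -> (maximized log-likelihood, number of parameters).
schemes = {
    "single partition": (-12000.0, 10),
    "by gene": (-11950.0, 40),
    "by codon position": (-11900.0, 30),
}
n_sites = 2000  # assumed sample size: number of alignment sites

bic_scores = {name: bic(ll, k, n_sites) for name, (ll, k) in schemes.items()}
best_scheme = min(bic_scores, key=bic_scores.get)
print(best_scheme, round(bic_scores[best_scheme], 2))
```

Lower scores are better for all three criteria. BIC penalizes extra parameters more heavily than AIC once the sample size exceeds about eight, so it tends to prefer schemes with fewer partitions.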

Journal Article
TL;DR: It is argued that the most commonly implemented model selection approach, the hierarchical likelihood ratio test, is not the optimal strategy for model selection in phylogenetics, and that approaches like the Akaike Information Criterion (AIC) and Bayesian methods offer important advantages.
Abstract: Model selection is a topic of special relevance in molecular phylogenetics that affects many, if not all, stages of phylogenetic inference. Here we discuss some fundamental concepts and techniques of model selection in the context of phylogenetics. We start by reviewing different aspects of the selection of substitution models in phylogenetics from a theoretical, philosophical and practical point of view, and summarize this comparison in table format. We argue that the most commonly implemented model selection approach, the hierarchical likelihood ratio test, is not the optimal strategy for model selection in phylogenetics, and that approaches like the Akaike Information Criterion (AIC) and Bayesian methods offer important advantages. In particular, the latter two methods are able to simultaneously compare multiple nested or nonnested models, assess model selection uncertainty, and allow for the estimation of phylogenies and model parameters using all available models (model-averaged inference or multimodel inference). We also describe how the relative importance of the different parameters included in substitution models can be depicted. To illustrate some of these points, we have applied AIC-based model averaging to 37 mitochondrial DNA sequences from the subgenus Ohomopterus (genus Carabus) ground beetles described by Sota and Vogler (2001). (AIC; Bayes factors; BIC; likelihood ratio tests; model averaging; model uncertainty; model selection; multimodel inference.) It is clear that models of nucleotide substitution (henceforth models of evolution) play a significant role in molecular phylogenetics, particularly in the context of distance, maximum likelihood (ML), and Bayesian estimation. We know that the use of one or other model affects many, if not all, stages of phylogenetic inference. 
For example, estimates of phylogeny, substitution rates, bootstrap values, posterior probabilities, or tests of the molecular clock are clearly influenced by the model of evolution used in the analysis (Buckley, 2002; Buckley

3,712 citations
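
The AIC-based model averaging mentioned in the abstract above rests on Akaike weights, which turn AIC differences into relative support for each candidate model. The sketch below uses the standard definition of Akaike weights and is not the authors' code. The model names and AIC scores are hypothetical.

```python
import math
from typing import Dict


def akaike_weights(aic_scores: Dict[str, float]) -> Dict[str, float]:
    """Convert AIC scores into Akaike weights that sum to one."""
    best = min(aic_scores.values())
    # Relative likelihood of each model given its AIC difference from the best.
    relative = {m: math.exp(-0.5 * (s - best)) for m, s in aic_scores.items()}
    total = sum(relative.values())
    return {m: r / total for m, r in relative.items()}


# Hypothetical AIC scores for three substitution models.
scores = {"JC69": 2110.0, "HKY85": 2042.0, "GTR": 2040.0}
weights = akaike_weights(scores)
for model, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{model}: {weight:.4f}")
```

A model-averaged estimate of a parameter is then the weighted sum of the per-model estimates, using these weights. Subtracting the best score before exponentiating keeps the calculation numerically stable.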