Showing papers in "Systematic Biology in 2003"


Journal Article
TL;DR: This work has used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches.
Abstract: The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximum-likelihood principle, which clearly satisfies these requirements. The core of this method is a simple hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distance-based method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximum-likelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting of 500 rbcL sequences with 1,428 base pairs from plant plastids, thus reaching a speed of the same order as some popular distance-based and parsimony algorithms. This new method is implemented in the PHYML program, which is freely available on our web page: http://www.lirmm.fr/w3ifa/MAAS/. (Algorithm; computer simulations; maximum likelihood; phylogeny; rbcL; RDPII project.)
The size of homologous sequence data sets has increased dramatically in recent years, and many of these data sets now involve several hundreds of taxa. Moreover, current probabilistic sequence evolution models (Swofford et al., 1996; Page and Holmes, 1998), notably those including rate variation among sites (Uzzell and Corbin, 1971; Jin and Nei, 1990; Yang, 1996), require an increasing number of calculations. Therefore, the speed of phylogeny reconstruction methods is becoming a significant requirement and good compromises between speed and accuracy must be found.
The maximum likelihood (ML) approach is especially accurate for building molecular phylogenies. Felsenstein (1981) brought this framework to nucleotide-based phylogenetic inference, and it was later also applied to amino acid sequences (Kishino et al., 1990). Several variants were proposed, most notably the Bayesian methods (Rannala and Yang 1996; and see below), and the discrete Fourier analysis of Hendy et al. (1994), for example. Numerous computer studies (Huelsenbeck and Hillis, 1993; Kuhner and Felsenstein, 1994; Huelsenbeck, 1995; Rosenberg and Kumar, 2001; Ranwez and Gascuel, 2002) have shown that ML programs can recover the correct tree from simulated data sets more frequently than other methods can. Another important advantage of the ML approach is the ability to compare different trees and evolutionary models within a statistical framework (see Whelan et al., 2001, for a review). However, like all optimality criterion-based phylogenetic reconstruction approaches, ML is hampered by computational difficulties, making it impossible to obtain the optimal tree with certainty from even moderate data sets (Swofford et al., 1996). Therefore, all practical methods rely on heuristics that obtain near-optimal trees in reasonable computing time. Moreover, the computation problem is especially difficult with ML, because the tree likelihood not only depends on the tree topology but also on numerical parameters, including branch lengths. Even computing the optimal values of these parameters on a single tree is not an easy task, particularly because of possible local optima (Chor et al., 2000).
The usual heuristic method, implemented in the popular PHYLIP (Felsenstein, 1993) and PAUP* (Swofford, 1999) packages, is based on hill climbing. It combines stepwise insertion of taxa in a growing tree and topological rearrangement. For each possible insertion position and rearrangement, the branch lengths of the resulting tree are optimized and the tree likelihood is computed. When the rearrangement improves the current tree or when the position insertion is the best among all possible positions, the corresponding tree becomes the new current tree. Simple rearrangements are used during tree growing, namely "nearest neighbor interchanges" (see below), while more intense rearrangements can be used once all taxa have been inserted. The procedure stops when no rearrangement improves the current best tree. Despite significant decreases in computing times, notably in fastDNAml (Olsen et al., 1994), this heuristic becomes impracticable with several hundreds of taxa. This is mainly due to the two-level strategy, which separates branch lengths and tree topology optimization. Indeed, most calculations are done to optimize the branch lengths and evaluate the likelihood of trees that are finally rejected.
New methods have thus been proposed. Strimmer and von Haeseler (1996) and others have assembled four-taxon (quartet) trees inferred by ML, in order to reconstruct a complete tree. However, the results of this approach have not been very satisfactory to date (Ranwez and Gascuel, 2001). Ota and Li (2000, 2001) described

16,261 citations
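The hill-climbing idea in the abstract above can be illustrated on the smallest possible case: one branch length separating two aligned sequences. The sketch below is not the PHYML algorithm (which also rearranges the topology); it assumes the Jukes-Cantor (JC69) substitution model, and the function names `jc69_log_likelihood` and `hill_climb_branch_length` are hypothetical names chosen for illustration.

```python
import math

def jc69_log_likelihood(d, n_sites, n_diff):
    """Log-likelihood of a distance d (substitutions per site) between two
    sequences under JC69, given n_diff mismatches among n_sites sites."""
    p_diff = 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))  # P(a site differs)
    p_same = 1.0 - p_diff
    # Under JC69 the three possible mismatching bases are equally likely,
    # and each starting base has stationary frequency 1/4.
    return (n_sites * math.log(0.25)
            + (n_sites - n_diff) * math.log(p_same)
            + n_diff * math.log(p_diff / 3.0))

def hill_climb_branch_length(n_sites, n_diff, start=0.1, step=0.05, tol=1e-7):
    """Move d uphill in likelihood; halve the step when neither neighbour improves."""
    d = start
    best = jc69_log_likelihood(d, n_sites, n_diff)
    while step > tol:
        improved = False
        for candidate in (d + step, d - step):
            if candidate <= 0.0:
                continue  # branch lengths must stay positive
            ll = jc69_log_likelihood(candidate, n_sites, n_diff)
            if ll > best:
                d, best, improved = candidate, ll, True
                break
        if not improved:
            step /= 2.0
    return d, best

if __name__ == "__main__":
    # Made-up data: 100 mismatches in 1,000 aligned sites.
    d_hat, ll_hat = hill_climb_branch_length(1000, 100)
    closed_form = -0.75 * math.log(1.0 - (4.0 / 3.0) * 0.1)
    print(d_hat, closed_form, ll_hat)
```

For this one-parameter case the JC69 estimate also has a closed form, which the usage lines print next to the hill-climbing result as a check. On a full tree no such closed form exists, which is why iterative heuristics are needed.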


Journal Article
TL;DR: A method described by Nielsen is extended to the mapping of morphological characters under continuous-time Markov models, and its utility for mapping characters on trees and for identifying character correlation is demonstrated.
Abstract: Many questions in evolutionary biology are best addressed by comparing traits in different species. Often such studies involve mapping characters on phylogenetic trees. Mapping characters on trees allows the nature, number, and timing of the transformations to be identified. The parsimony method is the only method available for mapping morphological characters on phylogenies. Although the parsimony method often makes reasonable reconstructions of the history of a character, it has a number of limitations. These limitations include the inability to consider more than a single change along a branch on a tree and the uncoupling of evolutionary time from amount of character change. We extended a method described by Nielsen (2002, Syst. Biol. 51:729-739) to the mapping of morphological characters under continuous-time Markov models and demonstrate here the utility of the method for mapping characters on trees and for identifying character correlation. (Bayesian estimation; character correlation; character mapping; Markov chain Monte Carlo.)
The footprint of natural selection on organisms can often be detected using phylogenetic methods. Correlation in either molecular or morphological characters is taken as evidence of natural selection acting on those characters (Harvey and Pagel, 1991). The correlation might be between a character and the environment, with the repeated evolution of the character in a particular environment indicating that the trait confers an advantage, or the correlation may be between one character and another. In ribosomal RNA sequences, for example, correlated changes occur in nucleotides paired in the stem structures; natural selection is acting to maintain Watson-Crick pairing of nucleotides in the functionally important stem structures. In either case (correlation between different characters or the repeated evolution of a character in a particular environment), phylogenetic methods provide the best framework for the analysis of correlation because they allow the effects of a common phylogenetic history that simultaneously acts on all of the characters to be partitioned from the evolutionary processes generating the character patterns (Felsenstein, 1985).
Despite the importance of phylogenetic analysis of character change in evolutionary biology, detection of correlation in characters is fraught with difficulties. One dilemma involves how characters should be mapped onto a phylogenetic tree. Many methods for detecting correlations rely on mapping character changes on a phylogenetic tree using the parsimony method (Ridley, 1983; Maddison, 1990). The parsimony method provides the minimum number of transformations required to explain the evolution of the character on the tree and therefore necessarily underestimates the total number of changes. Furthermore, some methods treat the parsimony mapping of a character as an observation in further statistical analyses (Ridley, 1983; Maddison, 1990). Although the parsimony method is expected to provide a reasonable mapping of a character when the rates of evolution are low, the fundamental problem with the method is that it does not account for the uncertainty in the process of character change. In effect, the parsimony method wagers all on the mapping requiring the fewest changes, when in reality many other perhaps slightly less parsimonious mappings may be nearly as good or

775 citations
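The limitation of parsimony noted in the entry above (at most one change per branch) can be made concrete with the simplest continuous-time Markov model: a symmetric two-state character. The sketch below is not the stochastic mapping method of the paper. It only computes transition probabilities along one branch, and the function names and the single `rate` parameter are assumptions made for illustration.

```python
import math

def two_state_transition_probs(rate, t):
    """Return (P(same state), P(different state)) at the two ends of a branch
    of length t, for a symmetric two-state Markov chain with rate `rate`
    of change in each direction."""
    decay = math.exp(-2.0 * rate * t)
    return 0.5 + 0.5 * decay, 0.5 - 0.5 * decay

def expected_changes(rate, t):
    """Expected number of state changes along the branch (Poisson mean)."""
    return rate * t

def hidden_change_probability(rate, t):
    """P(both ends show the same state although at least two changes occurred).
    Parsimony would count zero changes on such a branch."""
    p_same, _ = two_state_transition_probs(rate, t)
    p_no_change = math.exp(-rate * t)
    return p_same - p_no_change

if __name__ == "__main__":
    for t in (0.1, 1.0, 5.0):
        print(t, two_state_transition_probs(1.0, t),
              expected_changes(1.0, t), hidden_change_probability(1.0, t))
```

As the branch gets longer, the expected number of changes keeps growing while the probability of observing a difference saturates at 0.5, which is one way to see why a minimum-change count underestimates the amount of change.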


Journal Article
TL;DR: It is shown that Bayesian posterior probabilities are significantly higher than corresponding nonparametric bootstrap frequencies for true clades, but also that erroneous conclusions will be made more often.
Abstract: Many empirical studies have revealed considerable differences between nonparametric bootstrapping and Bayesian posterior probabilities in terms of the support values for branches, despite claimed predictions about their approximate equivalence. We investigated this problem by simulating data, which were then analyzed by maximum likelihood bootstrapping and Bayesian phylogenetic analysis using identical models and reoptimization of parameter values. We show that Bayesian posterior probabilities are significantly higher than corresponding nonparametric bootstrap frequencies for true clades, but also that erroneous conclusions will be made more often. These errors are strongly accentuated when the models used for analyses are underparameterized. When data are analyzed under the correct model, nonparametric bootstrapping is conservative. Bayesian posterior probabilities are also conservative in this respect, but less so.
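The resampling step of nonparametric bootstrapping referred to above can be sketched as follows: alignment columns are drawn with replacement to build a pseudo-replicate of the same size, a tree is inferred from each replicate, and support for a clade is the fraction of replicate trees that contain it. Tree inference itself is omitted here. The names `bootstrap_alignment` and `clade_support`, and the dict-of-strings alignment format, are assumptions made for illustration.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement.

    alignment: dict mapping taxon name -> sequence string (all equal length).
    rng: a random.Random instance, passed in so results are reproducible.
    """
    n_sites = len(next(iter(alignment.values())))
    # The same column indices are used for every taxon, so columns stay intact.
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

def clade_support(replicate_clades, clade):
    """Fraction of replicates whose tree contains `clade`.

    replicate_clades: list of sets of frozensets (the clades of each replicate tree).
    clade: frozenset of taxon names.
    """
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return hits / len(replicate_clades)

if __name__ == "__main__":
    toy = {"A": "ACGTAC", "B": "ACGTTT", "C": "AAGTTC"}  # made-up toy alignment
    print(bootstrap_alignment(toy, random.Random(0)))
```

A Bayesian posterior probability is computed differently (from the sampled posterior distribution of trees), which is part of why the two support measures need not agree.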

620 citations


Journal Article
TL;DR: In this study, simulations are used to show that the reduced accuracy associated with including incomplete taxa is caused by these taxa bearing too few complete characters rather than too many missing data cells, and the results suggest a more effective strategy for dealing with incomplete taxa.
Abstract: The problem of missing data is often considered to be the most important obstacle in reconstructing the phylogeny of fossil taxa and in combining data from diverse characters and taxa for phylogenetic analysis. Empirical and theoretical studies show that including highly incomplete taxa can lead to multiple equally parsimonious trees, poorly resolved consensus trees, and decreased phylogenetic accuracy. However, the mechanisms that cause incomplete taxa to be problematic have remained unclear. It has been widely assumed that incomplete taxa are problematic because of the proportion or amount of missing data that they bear. In this study, I use simulations to show that the reduced accuracy associated with including incomplete taxa is caused by these taxa bearing too few complete characters rather than too many missing data cells. This seemingly subtle distinction has a number of important implications. First, the so-called missing data problem for incomplete taxa is, paradoxically, not directly related to their amount or proportion of missing data. Thus, the level of completeness alone should not guide the exclusion of taxa (contrary to common practice), and these results may explain why empirical studies have sometimes found little relationship between the completeness of a taxon and its impact on an analysis. These results also (1) suggest a more effective strategy for dealing with incomplete taxa, (2) call into question a justification of the controversial phylogenetic supertree approach, and (3) show the potential for the accurate phylogenetic placement of highly incomplete taxa, both when combining diverse data sets and when analyzing relationships of fossil taxa.

609 citations


Journal Article
TL;DR: The view that rigorous and critical anatomical studies of fewer morphological characters, in the context of molecular phylogenies, is a more fruitful approach to integrating the strengths of morphological data with those of sequence data is presented.
Abstract: In this article we explore the paradox of why morphological data are currently utilized less for phylogeny reconstruction than are DNA sequence data, whereas most of what we know about phylogeny stems from classifications founded on morphological data. The crucial difference between the two data sources relates to the number of potentially unambiguous characters available, their ease and speed of discovery, and their suitability for analysis using transformational models. We consider that the increased use of DNA sequence data, relative to morphology, for phylogeny reconstruction is inevitable and well founded, but that a crucial issue remains concerning the role of morphology in phylogeny reconstruction. We present the view that rigorous and critical anatomical studies of fewer morphological characters, in the context of molecular phylogenies, is a more fruitful approach to integrating the strengths of morphological data with those of sequence data. This approach is preferable to compiling larger data matrices of increasingly ambiguous and problematic morphological characters. We argue below that a main constraint of morphology-based phylogenetic inference concerns the limited number of unambiguous characters available for analysis in a transformational framework. This problem of a limited number of unambiguous characters is further compounded by obstacles to accurate homology assessment and character coding, which further reduce the number of characters available for analysis. We discuss and disagree with the view that more morphological data should be used in phylogeny reconstruction. Furthermore, we consider the claim that the greatest strength of morphological data (increased taxon sampling) to be mistaken. In the discussion that follows we use "phylogeny reconstruction" to refer to the computer-based algorithmic analyses routinely conducted in systematics today.

422 citations


Journal Article
TL;DR: This work develops a novel approach to model selection, which is based on the Bayesian information criterion, but incorporates relative branch-length error as a performance measure in a decision theory (DT) framework.
Abstract: Phylogenetic estimation has largely come to rely on explicitly model-based methods. This approach requires that a model be chosen and that that choice be justified. To date, justification has largely been accomplished through use of likelihood-ratio tests (LRTs) to assess the relative fit of a nested series of reversible models. While this approach certainly represents an important advance over arbitrary model selection, the best fit of a series of models may not always provide the most reliable phylogenetic estimates for finite real data sets, where all available models are surely incorrect. Here, we develop a novel approach to model selection, which is based on the Bayesian information criterion, but incorporates relative branch-length error as a performance measure in a decision theory (DT) framework. This DT method includes a penalty for overfitting, is applicable prior to running extensive analyses, and simultaneously compares all models being considered and thus does not rely on a series of pairwise comparisons of models to traverse model space. We evaluate this method by examining four real data sets and by using those data sets to define simulation conditions. In the real data sets, the DT method selects the same or simpler models than conventional LRTs. In order to lend generality to the simulations, codon-based models (with parameters estimated from the real data sets) were used to generate simulated data sets, which are therefore more complex than any of the models we evaluate. On average, the DT method selects models that are simpler than those chosen by conventional LRTs. Nevertheless, these simpler models provide estimates of branch lengths that are more accurate both in terms of relative error and absolute error than those derived using the more complex (yet still wrong) models chosen by conventional LRTs. This method is available in a program called DT-ModSel. 
(Bayesian model selection; decision theory; incorrect models; likelihood ratio test; maximum likelihood; nucleotide-substitution model; phylogeny.)
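The Bayesian information criterion on which the DT method builds can be sketched in a few lines: BIC = -2 ln L + k ln n, where k is the number of free parameters and n the number of sites, and the model with the lowest score is preferred. The sketch below shows plain BIC ranking only. It does not implement the branch-length-error loss that the DT method adds. The log-likelihoods and parameter counts in the usage lines are made-up numbers for illustration, and the function names are hypothetical.

```python
import math

def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion; lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_sites)

def select_by_bic(candidates, n_sites):
    """candidates: dict mapping model name -> (log-likelihood, free parameter count).
    Returns (best model name, dict of all BIC scores)."""
    scores = {name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()}
    return min(scores, key=scores.get), scores

if __name__ == "__main__":
    # Hypothetical fits on a 1,000-site alignment (values invented for this example).
    fits = {
        "JC69": (-5200.0, 0),
        "HKY85": (-5100.0, 4),
        "GTR": (-5098.0, 8),
    }
    best, scores = select_by_bic(fits, n_sites=1000)
    print(best, scores)
```

Because all candidates are scored at once, no chain of pairwise likelihood-ratio tests is needed, and the `k ln n` term penalizes the extra parameters of the richer model when they buy only a small gain in likelihood.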

421 citations


Journal Article
TL;DR: Previous likelihood models of local molecular clocks for estimating species divergence times are extended to accommodate multiple calibration points and multiple genes, and are applied to two mitochondrial protein-coding genes to estimate divergence times of Malagasy mouse lemurs and related outgroups.
Abstract: Divergence time and substitution rate are seriously confounded in phylogenetic analysis, making it difficult to estimate divergence times when the molecular clock (rate constancy among lineages) is violated. This problem can be alleviated to some extent by analyzing multiple gene loci simultaneously and by using multiple calibration points. While different genes may have different patterns of evolutionary rate change, they share the same divergence times. Indeed, the fact that each gene may violate the molecular clock differently leads to the advantage of simultaneous analysis of multiple loci. Multiple calibration points provide the means for characterizing the local evolutionary rates on the phylogeny. In this paper, we extend previous likelihood models of local molecular clock for estimating species divergence times to accommodate multiple calibration points and multiple genes. Heterogeneity among different genes in evolutionary rate and in substitution process is accounted for by the models. We apply the likelihood models to analyze two mitochondrial protein-coding genes, cytochrome oxidase II and cytochrome b, to estimate divergence times of Malagasy mouse lemurs and related outgroups. The likelihood method is compared with the Bayes method of Thorne et al. (1998, Mol. Biol. Evol. 15:1647-1657), which uses a probabilistic model to describe the change in evolutionary rate over time and uses the Markov chain Monte Carlo procedure to derive the posterior distribution of rates and times. Our likelihood implementation has the drawbacks of failing to accommodate uncertainties in fossil calibrations and of requiring the researcher to classify branches on the tree into different rate groups. Both problems are avoided in the Bayes method. Despite the differences in the two methods, however, data partitions and model assumptions had the greatest impact on date estimation. 
The three codon positions have very different substitution rates and evolutionary dynamics, and assumptions in the substitution model affect date estimation in both likelihood and Bayes analyses. The results demonstrate that the separate analysis is unreliable, with dates variable among codon positions and between methods, and that the combined analysis is much more reliable. When the three codon positions were analyzed simultaneously under the most realistic models using all available calibration information, the two methods produced similar results. The divergence of the mouse lemurs is dated to be around 7-10 million years ago, indicating a surprisingly early species radiation for such a morphologically uniform group of primates.

333 citations


Journal Article
TL;DR: Rosenberg and Kumar (2002) have concluded that the beneficial effect of increasing taxon sample size is not small, but they suggested that the benefit comes simply from the overall increase in size of the data matrix (the total number of characters × taxa).
Abstract: Rosenberg and Kumar (2001) addressed the importance of taxon sampling in phylogenetic analysis and concluded that phylogenetic error is “largely independent of taxon sample size” (2001:10756) and that their “results do not provide evidence in favor of adding taxa to problematic phylogenies” (2001:10756). In response to these conclusions, Zwickl and Hillis (2002) and Pollock et al. (2002) conducted additional simulations and reanalyzed the data presented by Rosenberg and Kumar (2001). Zwickl and Hillis and Pollock et al. showed that these conclusions of Rosenberg and Kumar could not be supported either by analyses of their original data or by new simulations that corrected a number of deficiencies in Rosenberg and Kumar’s original experimental design. Both Zwickl and Hillis and Pollock et al. found that increased taxon sampling resulted in greatly reduced phylogenetic estimation error, and Pollock et al. showed that the benefits of increased taxon sampling were similar to adding an equivalent amount of sequence length for the same taxa (in the ranges simulated by Rosenberg and Kumar). In their response, Rosenberg and Kumar (2002) focused on a slightly different conclusion from that in their original paper, which was that “longer sequences, rather than extensive sampling, will better improve the accuracy of phylogenetic inference” (2001:10751). In 2001, Rosenberg and Kumar argued that the beneficial effect of increasing taxa was 10-fold lower than the beneficial effect of increasing sequence length and that the effects of increased taxon sampling for the same genes were negligible (“largely independently” of phylogenetic error). Rosenberg and Kumar (2002) have now concluded that the beneficial effect of increasing taxon sample size is not small, but they suggested that the benefit comes simply from the overall increase in size of the data matrix (the total number of characters × taxa). 
Furthermore, they maintained that there is a greater benefit to increasing the total sequence length for few taxa than can be obtained by increasing taxon sampling for the same genes. Here, we discuss the two sets of conclusions reached by Rosenberg and Kumar (2001, 2002).

326 citations


Journal Article
TL;DR: The results corroborate the findings of others that posterior probability values are excessively high and suggest that extrapolations from single topology branch-length studies are unlikely to provide any general conclusions regarding the relationship between bootstrap and posterior probabilities.
Abstract: Assessment of the reliability of a given phylogenetic hypothesis is an important step in phylogenetic analysis. Historically, the nonparametric bootstrap procedure has been the most frequently used method for assessing the support for specific phylogenetic relationships. The recent employment of Bayesian methods for phylogenetic inference problems has resulted in clade support being expressed in terms of posterior probabilities. We used simulated data and the four-taxon case to explore the relationship between nonparametric bootstrap values (as inferred by maximum likelihood) and posterior probabilities (as inferred by Bayesian analysis). The results suggest a complex association between the two measures. Three general regions of tree space can be identified: (1) the neutral zone, where differences between mean bootstrap and mean posterior probability values are not significant, (2) near the two-branch corner, and (3) deep in the two-branch corner. In the last two regions, significant differences occur between mean bootstrap and mean posterior probability values. Whether bootstrap or posterior probability values are higher depends on the data in support of alternative topologies. Examination of star topologies revealed that both bootstrap and posterior probability values differ significantly from theoretical expectations; in particular, there are more posterior probability values in the range 0.85-1 than expected by theory. Therefore, our results corroborate the findings of others that posterior probability values are excessively high. Our results also suggest that extrapolations from single topology branch-length studies are unlikely to provide any general conclusions regarding the relationship between bootstrap and posterior probability values. (Bayesian analysis; Markov chain Monte Carlo sampling; maximum likelihood; phylogenetics.)

288 citations


Journal Article
TL;DR: Analysis of a large DNA data set shows that the family Tragulidae occupies a basal position with respect to all other ruminant families, confirming the traditional view that separates Tragulina and Pecora; a Bayesian relaxed molecular clock approach based on the continuous autocorrelation of evolutionary rates along branches was applied to estimate the divergence ages between the major clades of ruminants.
Abstract: The ruminants constitute the largest group of ungulates, with >190 species, and its distribution is widespread throughout all continents except Australia and Antarctica. Six families are traditionally recognized within the suborder Ruminantia: Antilocapridae (pronghorns), Bovidae (cattle, sheep, and antelopes), Cervidae (deer), Giraffidae (giraffes and okapis), Moschidae (musk deer), and Tragulidae (chevrotains). The interrelationships of the families have been an area of controversy among morphology, palaeontology, and molecular studies, and almost all possible evolutionary scenarios have been proposed in the literature. We analyzed a large DNA data set (5,322 nucleotides) for 23 species including both mitochondrial (cytochrome b, 12S ribosomal RNA (rRNA), and 16S rRNA) and nuclear (κ-casein, cytochrome P-450, lactoferrin, and α-lactalbumin) markers. Our results show that the family Tragulidae occupies a basal position with respect to all other ruminant families, confirming the traditional view that separates Tragulina and Pecora. Within the pecorans, Antilocapridae and Giraffidae emerge first, and the families Bovidae, Moschidae, and Cervidae are allied, with the unexpected placement of Moschus close to bovids rather than to cervids. We used these molecular results to assess the homoplastic evolution of morphological characters within the Ruminantia. A Bayesian relaxed molecular clock approach based on the continuous autocorrelation of evolutionary rates along branches was applied to estimate the divergence ages between the major clades of ruminants. The evolutionary radiation of Pecora occurred at the Early/Late Oligocene transition, and Pecoran families diversified and dispersed rapidly during the Early and Middle Miocene. We propose a biogeographic scenario to explain the extraordinary expansion of this group during the Cenozoic era. (Bayesian relaxed clock; Bovidae; molecules; morphology; Moschidae; phylogeny; Ruminantia.)

247 citations


Journal Article
TL;DR: It is concluded that the recognition of zoogeographic lines, though insightful, may oversimplify the biogeography of widespread taxa in this region.
Abstract: The interface of the Asian and Australian faunal zones is defined by a network of deep ocean trenches that separate intervening islands of the Philippines and Wallacea (Sulawesi, the Lesser Sundas, and the Moluccas). Studies of this region by Wallace marked the genesis of the field of biogeography, yet few workers have used molecular methods to investigate the biogeography of taxa whose distribution spans this interface. Some taxa, such as the fanged frogs of the ranid genus Limnonectes, have distributions on either side of the zoogeographical lines of Wallace and Huxley, offering an opportunity to ask how frequently these purported barriers were crossed and by what paths. To examine diversification of Limnonectes in Southeast Asia, the Philippines, and Wallacea, we estimated a phylogeny from mitochondrial DNA sequences obtained from a robust geographic sample. Our analyses suggest that these frogs dispersed from Borneo to the Philippines at least twice, from Borneo to Sulawesi once or twice, from Sulawesi to the Philippines once, and from the Philippines to Sulawesi once. Dispersal to the Moluccas occurred from Sulawesi and to the Lesser Sundas from Java/Bali. Species distributions are generally concordant with Pleistocene aggregate island complexes of the Philippines and with areas of endemism on Sulawesi. We conclude that the recognition of zoogeographic lines, though insightful, may oversimplify the biogeography of widespread taxa in this region.

Journal Article
TL;DR: This study combined several tree-based phylogeny reconstruction methods with nested-clade analysis to extract maximum historical signal at various levels in the poorly known Liolaemus elongatus-kriegi lizard complex in temperate South America, and suggests that the number of putative species could be doubled.
Abstract: Recovery of evolutionary history and delimiting species boundaries in widely distributed, poorly known groups requires extensive geographic sampling, but sampling regimes are difficult to design a priori because evolutionary diversity is often "hidden" by inadequate taxonomy. Large data sets are needed, and these provide unique challenges for analysis when they span intra- and interspecific levels of divergence. However, protocols have been designed to combine methods of analysis for DNA sequences that exhibit both very shallow and relatively deeper divergences. In this study, we combined several tree-based phylogeny reconstruction methods with nested-clade analysis to extract maximum historical signal at various levels in the poorly known Liolaemus elongatus-kriegi lizard complex in temperate South America. We implemented a recently described tree-based protocol for DNA sequences to test for species boundaries, and we propose modifications to accommodate large data sets and gene regions with heterogeneous substitution rates. Combining haplotype trees with nested-clade analyses allowed testing of species boundaries on the basis of a priori defined criteria. The results obtained suggest that the number of putative species in the L. elongatus-kriegi complex could be doubled. We discuss these findings in the context of the advantages and limitations of a combined approach for retrieval of maximum historical information in large data sets and with reference to the yet formidable unresolved issues of sampling strategies. (Liolaemus; lizards; mitochondrial DNA; nested-clade analysis; phylogeny; sampling design; species boundaries.)

Journal Article
TL;DR: The phylogenetic analysis of nucleic acid sequences, as with other data, is unavoidably based on explicit and implicit assumptions, and at the fore are character transformation models (usually transversion-transition ratios) and the relative cost of alignment-derived sequence gaps.
Abstract: The phylogenetic analysis of nucleic acid sequences, as with other data, is unavoidably based on explicit and implicit assumptions. At the fore are character transformation models (usually transversion-transition ratios) and the relative cost of alignment-derived sequence gaps. These values are the fulcra of sequence analysis. Simple homogeneous weighting does not avoid the issue of arbitrary, yet crucial, assumptions. (Wheeler, 1995:321) Sensitivity analysis (SA) is the study of how variation in the output of a model can be apportioned, qualitatively or quantitatively, to different sources of variation, and how the given model depends upon the information fed into it.

Journal Article
TL;DR: Simultaneous analyses of four gene sequences and paleontological data suggest that putative adaptive convergences in the jaws of gavialines and tomistomines offer character support for a grouping of these taxa, making Gavialinae an atavistic taxon.
Abstract: Morphological and molecular data sets favor robustly supported, contradictory interpretations of crocodylian phylogeny. A longstanding perception in the field of systematics is that such significantly conflicting data sets should be analyzed separately. Here we utilize a combined approach, simultaneous analyses of all relevant character data, to summarize common support and to reconcile discrepancies among data sets. By conjoining rather than separating incongruent classes of data, secondary phylogenetic signals emerge from both molecular and morphological character sets and provide solid evidence for a unified hypothesis of crocodylian phylogeny. Simultaneous analyses of four gene sequences and paleontological data suggest that putative adaptive convergences in the jaws of gavialines (gavials) and tomistomines (false gavials) offer character support for a grouping of these taxa, making Gavialinae an atavistic taxon. Simple new methods for measuring the influence of extinct taxa on topological support indicate that in this vertebrate order fossils generally stabilize relationships and accentuate hidden phylogenetic signals. Remaining inconsistencies in minimum length trees, including concentrated hierarchical patterns of homoplasy and extensive gaps in the fossil record, indicate where future work in crocodylian systematics should be directed.

Journal ArticleDOI
TL;DR: Criteria that can be used to infer whether or not a phylogenetic analysis has been misled by convergence are proposed and applied in a study of central Texas cave salamanders (genus Eurycea).
Abstract: Convergence, i.e., similarity between organisms that is not the direct result of shared phylogenetic history (and that may instead result from independent adaptations to similar environments), is a fundamental issue that lies at the interface of systematics and evolutionary biology. Although convergence is often cited as an important problem in morphological phylogenetics, there have been few well-documented examples of strongly supported and misleading phylogenetic estimates that result from adaptive convergence in morphology. In this article, we propose criteria that can be used to infer whether or not a phylogenetic analysis has been misled by convergence. We then apply these criteria in a study of central Texas cave salamanders (genus Eurycea). Morphological characters (apparently related to cave-dwelling habitat use) support a clade uniting the species E. rathbuni and E. tridentifera, whereas mitochondrial DNA sequences and allozyme data show that these two species are not closely related. We suggest that a likely explanation for the paucity of examples of strongly misleading morphological convergence is that the conditions under which adaptive convergence is most likely to produce strongly misleading results are limited. Specifically, convergence is most likely to be problematic in groups (such as the central Texas Eurycea) in which most species are morphologically very similar and some of the species have invaded and adapted to a novel selective environment.

Journal ArticleDOI
TL;DR: Divergence times based on the 28S rDNA and several fossil constraints indicate that the Brachycera originated in the late Triassic or earliest Mesozoic and that all major lower brachyceran fly lineages had near contemporaneous origins in the mid-Jurassic prior to the origin of flowering plants.
Abstract: The insect order Diptera, the true flies, contains one of the four largest Mesozoic insect radiations within its suborder Brachycera. Estimates of phylogenetic relationships and divergence dates among the major brachyceran lineages have been problematic or vague because of a lack of consistent evidence and the rarity of well-preserved fossils. Here, we combine new evidence from nucleotide sequence data, morphological reinterpretations, and fossils to improve estimates of brachyceran evolutionary relationships and ages. The 28S ribosomal DNA (rDNA) gene was sequenced for a broad diversity of taxa, and the data were combined with recently published morphological scorings for a parsimony-based phylogenetic analysis. The phylogenetic topology inferred from the combined 28S rDNA and morphology data set supports brachyceran monophyly and the monophyly of the four major brachyceran infraorders and suggests relationships largely consistent with previous classifications. Weak support was found for a basal brachyceran clade comprising the infraorders Stratiomyomorpha (soldier flies and relatives), Xylophagomorpha (xylophagid flies), and Tabanomorpha (horse flies, snipe flies, and relatives). This topology and similar alternative arrangements were used to obtain Bayesian estimates of divergence times, both with and without the assumption of a constant evolutionary rate. The estimated times were relatively robust to the choice of prior distributions. Divergence times based on the 28S rDNA and several fossil constraints indicate that the Brachycera originated in the late Triassic or earliest Mesozoic and that all major lower brachyceran fly lineages had near contemporaneous origins in the mid-Jurassic prior to the origin of flowering plants (angiosperms). 
This study provides increased resolution of brachyceran phylogeny, and our revised estimates of fly ages should improve the temporal context of evolutionary inferences and genomic comparisons between fly model organisms.

Journal ArticleDOI
TL;DR: This work reconstructed trees from mitochondrial and nuclear DNA sequences for pigeons and doves and their feather lice and identified three apparent cases where the host has speciated but the associated parasite has not.
Abstract: Cospeciation generally increases the similarity between host and parasite phylogenies. Incongruence between host and parasite phylogenies has previously been explained in terms of host switching, sorting, and duplication events. Here, we describe an additional process, failure of the parasite to speciate in response to host speciation, that may be important in some host-parasite systems. Failure to speciate is likely to occur when gene flow among parasite populations is much higher than that of their hosts. We reconstructed trees from mitochondrial and nuclear DNA sequences for pigeons and doves (Aves: Columbiformes) and their feather lice in the genus Columbicola (Insecta: Phthiraptera). Although comparisons of the trees from each group revealed a significant amount of cospeciation, there was also a significant degree of incongruence. Cophylogenetic analyses generally indicated that host switching may be an important process in the history of this host-parasite association. Using terminal sister taxon comparisons, we also identified three apparent cases where the host has speciated but the associated parasite has not. In two of these cases of failure to speciate, these comparisons involve allopatric sister taxa of hosts whose lice also occur on hosts sympatric with both of the allopatric sisters. These additional hosts for generalist lice may promote gene flow with lice on the allopatric sister species. Relative rate comparisons for the mitochondrial cytochrome oxidase I gene indicate that molecular substitution occurs about 11 times faster in lice than in their avian hosts.

Journal ArticleDOI
TL;DR: A simulation study of the phylogenetic methods UPGMA, neighbor joining, maximum parsimony, and maximum likelihood for a five-taxon tree under a molecular clock identified another region of the parameter space where, although consistent for a given method, some incorrect trees were each selected with up to twice the frequency of the correct tree for sequences of bounded length.
Abstract: We conducted a simulation study of the phylogenetic methods UPGMA, neighbor joining, maximum parsimony, and maximum likelihood for a five-taxon tree under a molecular clock. The parameter space included a small region where maximum parsimony is inconsistent, so we tested inconsistency correction for parsimony and distance correction for neighbor joining. As expected, corrected parsimony was consistent. For these data, maximum likelihood with the clock assumption outperformed each of the other methods tested. The distance-based methods performed marginally better than did maximum parsimony and maximum likelihood without the clock assumption. Data correction was generally detrimental to accuracy, especially for short sequence lengths. We identified another region of the parameter space where, although consistent for a given method, some incorrect trees were each selected with up to twice the frequency of the correct (generating) tree for sequences of bounded length. These incorrect trees are those where the outgroup has been incorrectly placed. In addition to this problem, the placement of the outgroup sequence can have a confounding effect on the ingroup tree, whereby the ingroup is correct when using the ingroup sequences alone, but with the inclusion of the outgroup the ingroup tree becomes incorrect.

Journal ArticleDOI
TL;DR: Either of the new methods can be used for both two- and three-dimensional landmark data and thus generalize Bookstein's linearized Procrustes formula for estimating the uniform component in two dimensions.
Abstract: Any change in shape of a configuration of landmark points in two or three dimensions includes a uniform component, a component that is a wholly linear (affine) transformation. The formulas for estimating this component have been standardized for two-dimensional data but not for three-dimensional data. We suggest estimating the component by way of the complementarity between the uniform component and the space of partial warps. The component can be estimated by regression in either one space or the other: regression on the partial warps, followed by their removal, or regression on a basis for the uniform component itself. Either of the new methods can be used for both two- and three-dimensional landmark data and thus generalize Bookstein's (1996, pages 153-168 in Advances in morphometrics [L. F. Marcus et al., eds.], Plenum, New York) linearized Procrustes formula for estimating the uniform component in two dimensions.

Journal ArticleDOI
TL;DR: Relations among 56 individuals from 20 of the 23 described species using maximum likelihood and Bayesian phylogenetic analysis of mitochondrial and nuclear DNA sequence data are examined to propose a general model for the development of endemic damselfly species on Hawaiian Islands and document five potential cases of hybridization.
Abstract: Damselflies of the endemic Hawaiian genus Megalagrion have radiated into a wide variety of habitats and are an excellent model group for the study of adaptive radiation. Past phylogenetic analysis based on morphological characters has been problematic. Here, we examine relationships among 56 individuals from 20 of the 23 described species using maximum likelihood (ML) and Bayesian phylogenetic analysis of mitochondrial (1287 bp) and nuclear (1039 bp) DNA sequence data. Models of evolution were chosen using the Akaike information criterion. Problems with distant outgroups were accommodated by constraining the best ML ingroup topology but allowing the outgroups to attach to any ingroup branch in a bootstrap analysis. No strong contradictions were obtained between either data partition and the combined data set. Areas of disagreement are mainly confined to clades that are strongly supported by the mitochondrial DNA and weakly supported by the elongation factor 1alpha data because of lack of changes. However, the combined analysis resulted in a unique tree. Correlation between Bayesian posterior probabilities and bootstrap percentages decreased in concert with decreasing information in the data partitions. In cases where nodes were supported by single characters bootstrap proportions were dramatically reduced compared with posterior probabilities. Two speciation patterns were evident from the phylogenetic analysis. First, most speciation is interisland and occurred as members of established ecological guilds colonized new volcanoes after they emerged from the sea. Second, there are several instances of rapid radiation into a variety of specialized habitats, in one case entirely within the island of Kauai. Application of a local clock procedure to the mitochondrial DNA topology suggests that two of these radiations correspond to the development of habitat on the islands of Kauai and Oahu. 
About 4.0 million years ago, species simultaneously moved into fast streams and plant leaf axils on Kauai, and about 1.5 million years later another group moved simultaneously to seeps and terrestrial habitats on Oahu. Results from the local clock analysis also strongly suggest that Megalagrion arrived in Hawaii about 10 million years ago, well before the emergence of Kauai. Date estimates were more sensitive to the particular node that was fixed in time than to the model of local branch evolution used. We propose a general model for the development of endemic damselfly species on Hawaiian Islands and document five potential cases of hybridization (M. xanthomelas x M. pacificum, M. eudytum x M. vagabundum, M. orobates x M. oresitrophum, M. nesiotes x M. oahuense, and M. mauka x M. paludicola).
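The abstract above states that models of evolution were chosen using the Akaike information criterion (AIC). As a minimal sketch of that selection step, and not the authors' actual procedure, the standard definition AIC = 2k - 2 ln L can be applied to each candidate model and the lowest score preferred. The model names, log-likelihoods, and parameter counts below are hypothetical values chosen only for illustration; the parameter counts cover substitution-model parameters only, which is itself an assumption.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_model(candidates: dict) -> str:
    """Return the name of the candidate with the smallest AIC.

    `candidates` maps a model name to a (log_likelihood, n_params) pair.
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


# Hypothetical fits for one alignment; these numbers are NOT from the paper.
candidates = {
    "JC69": (-5230.4, 0),
    "HKY85": (-5105.2, 4),
    "GTR+G": (-5098.7, 9),
}

if __name__ == "__main__":
    for name, (lnl, k) in candidates.items():
        print(f"{name}: AIC = {aic(lnl, k):.1f}")
    print("Preferred model:", best_model(candidates))
```

Because AIC penalizes each extra parameter by 2 units, a richer model is preferred only when it improves the log-likelihood by more than one unit per added parameter.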

Journal ArticleDOI
TL;DR: The issue of the database-restricted sampling was addressed and it was concluded that although there was a consistent decrease in error when using more taxa, the decrease was generally minor relative to the number of taxa added to the data set.
Abstract: Taxon sampling is often thought to be of extreme importance for phylogenetic inference, and increased sampling of taxa is commonly advocated as a solution to resolving problematic phylogenies. Another solution is to increase the number of sites (by sequencing additional genes) sampled for each taxon. In an ideal world, one would like to increase samples of both taxa and genes, but taxon sampling has not kept up with the pace of gene sampling increase because of the increasing ease and emphasis on genome sequencing. The question of taxon sampling is necessarily driven by resource limitation. The precise scope of “sufficient” taxon sampling is always dependent on questions being addressed. If we need to know the complete phylogeny of a genus, we must sample the genus exhaustively. In experimental design, partial sampling is an issue only when certain taxa can stand as proxies for the clades to which they belong (clade-based or stratified sampling; see Hillis, 1998). In bioinformatics studies, taxon sampling is restricted by the data availability in genetic databases (database-restricted sampling). Clearly, the nature of the problem in these two research programs is different. In stratified sampling, we are interested in knowing whether to sequence more genes per species or fewer genes for a large number of species per clade. In contrast, in database-restricted sampling it is important to know whether the overall accuracy of inferred phylogenetic trees for small taxa sets is similar to that of trees inferred from larger taxa sets. We recently addressed the issue of the database-restricted sampling (Rosenberg and Kumar, 2001) and concluded that although there was a consistent decrease in error when using more taxa, the decrease was generally minor relative to the number of taxa added to the data set. Pollock et al. (2002) challenged this conclusion by modifying our measure of the phylogenetic error. 
This measure, ΔE, differs from ours in that we used the difference in error between the subsampled tree [E_S] and full sampled tree [E_P], whereas Pollock et al. (2002) divided this difference by E_S to measure the relative reduction in error. ΔE plotted against the number of additional taxa in the full sampled tree (=66 minus the number of taxa in the subsample tree) shows a clear positive effect (Pollock et al., 2002: Figs. 4, 5). Unfortunately, this impressive result brings little biological benefit, as clearly shown by a scatterplot of the average number of additional branches inferred correctly in each case (Fig. 1). In no instance are there more than 1.5 additional branches reconstructed correctly, even though the number of taxa has often increased many fold. For instance, more than doubling the number of taxa only led to an average increase of 0.7 additional correct branches (points in the middle of the x-axis in Fig. 1). This fact was clearly noted in our original article: “Note that even though E_S is greater than E_G and E_P for very small subsamples (<10 taxa), the difference in phylogenetic error is usually much smaller than one branch per tree” (Rosenberg and Kumar, 2001: 10754). Therefore, although an increase in the number of taxa sampled will lead to improvement in accuracy, the improvement is minimal, particularly when we consider the amount of data (in terms of the number of total nucleotides) being added. We do not advocate using fewer taxa when more are available, as is clear from the results presented by Rosenberg and Kumar (2001:10754).
Figure 1: Number of branches reconstructed correctly with increased taxon sampling. (a) All simulated genes. (b) Genes with rates >0.7 and >500 sites (after Pollock et al., 2002).
Figure 4: Plot of the percentage of times interordinal branches were reconstructed correctly when the total number of bases was held constant. In each comparison, the data set with fewer taxa (and more sites per taxon) is always plotted on the x-axis. The dotted ...
Zwickl and Hillis (2002) also challenged conclusions reached by Rosenberg and Kumar (2001) by using the concept of tree diameter (the maximum distance between all pairs of taxa) to partition genes with different subsampled sets of taxa for analysis. They showed that four-taxon subsamples with a smaller tree diameter generate more accurate results than those subsamples with larger tree diameters. This result is expected because, with sequence divergence and length kept constant, the larger diameter four-taxon trees will encompass higher average divergence and would thus involve larger estimation errors. Furthermore, for the simulations involving the model tree in Figure 2a, four-taxon data sets containing sequences with larger diameters would include interordinal relationships (with many small interior branches) more frequently than would small diameter samples (see also Zwickl and Hillis, 2002: Fig. 3a). Therefore, Zwickl and Hillis’s study is an examination of the phylogenetic error at different evolutionary divergence cross sections of the phylogenetic tree specifically simulated. This and the complete absence of resource limitation (a must for any sampling study) clearly establish that Zwickl and Hillis have not evaluated either stratified or database-restricted taxon-sampling problems. Therefore, Zwickl and Hillis were not correct in stating that their results are in contradiction with our previous results (Rosenberg and Kumar, 2001). In fact, Zwickl and Hillis’s results represent another facet of statistical analysis of the same data. Also, Zwickl and Hillis took issue with our choice of a fast heuristic search used in computer simulations (Rosenberg and Kumar, 2001). 
We chose this strategy based on results of multiple previous studies, which showed that the most optimal tree is often more optimal than the true tree and that the fast and more exhaustive searches produce trees with comparable phylogenetic errors (Kumar, 1996; Nei et al., 1998; Takahashi and Nei, 2000). Zwickl and Hillis found that with the maximum parsimony (MP) method for the given data set, the TBR searches produced topologies that had less error than those from NNI. This result (based on a single simulation data set) seems to be in conflict with previous studies. We plan to evaluate this result more thoroughly analytically and by computer simulation in the future.
Figure 2: Model tree for the simulations based on the Eutherian mammal tree from Murphy et al. (2001) and Eizirik et al. (2001). (a) Full 66-taxon tree; interordinal relationships are represented by thick branches designated with letters. (b) Phylogenetic relationships ...
Figure 3: Plot of the percentage of times the interordinal branches were reconstructed correctly in 66-taxon trees versus n-taxon trees, where n = 15, 30, and 45. These values are for all genes and all replicates. The dotted lines indicate a 1:1 relationship. Analyses ...
However, we extrapolated our database-restricted sampling and random sampling results to conclude that the phylogenetic trees with fewer taxa but large numbers of genes per taxon may be more accurate than those with many taxa but fewer genes (Rosenberg and Kumar, 2001). Neither Pollock et al. (2002) nor Zwickl and Hillis (2002) addressed that issue, which lies at the heart of the experimental design. Here, we tackle this issue along with biological relevance of many other assumptions made and conclusions reached by Rosenberg and Kumar (2001) that Zwickl and Hillis (2002) objected to. We show that the conclusions reached by Rosenberg and Kumar (2001) are applicable for both phyloinformatic and phylogenomic studies.
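The two error measures contrasted in the abstract above can be written side by side. This is a sketch based only on the verbal description given there: the subsampled-tree error is written $E_S$ and the fully sampled-tree error $E_P$, and the sign convention (subsample error minus full-sample error, so that a reduction in error is positive) is an assumption.

```latex
% Absolute difference in error, as described for Rosenberg and Kumar (2001):
\[
  D = E_S - E_P
\]
% Relative reduction in error, as described for Pollock et al. (2002):
\[
  \Delta E = \frac{E_S - E_P}{E_S}
\]
```

The two measures can diverge sharply when $E_S$ is small: a large relative reduction $\Delta E$ may then correspond to an absolute difference of well under one branch per tree, which is the point the authors make with their Figure 1.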

Journal ArticleDOI
TL;DR: The phylogeny implies that characters of the skeleton architecture are highly homoplastic, as are characters of the aquiferous system; however, axial symmetry seems to be primitive for all Calcispongia, a conclusion that has potentially far-reaching implications for hypotheses of early body plan evolution in Metazoa.
Abstract: Because calcareous sponges are triggering renewed interest with respect to basal metazoan evolution, a phylogenetic framework of their internal relationships is needed to clarify the evolutionary history of key morphological characters. Morphological variation was scored at the suprageneric level within Calcispongia, but little phylogenetic information could be retrieved from morphological characters. For the main subdivision of Calcispongia, the analysis of morphological data weakly supports a classification based upon cytological and embryological characters (Calcinea/Calcaronea) rather than the older classification scheme based upon the aquiferous system (Homocoela/Heterocoela). The 18S ribosomal RNA data were then analyzed, both alone and in combination with morphological characters. The monophyly of Calcispongia is highly supported, but the position of this group with respect to other sponge lineages and to eumetazoan taxa is not resolved. The monophyly of both Calcinea and Calcaronea is retrieved, and the data strongly rejected the competing Homocoela/Heterocoela hypothesis. The phylogeny implies that characters of the skeleton architecture are highly homoplastic, as are characters of the aquiferous system. However, axial symmetry seems to be primitive for all Calcispongia, a conclusion that has potentially far-reaching implications for hypotheses of early body plan evolution in Metazoa.

Journal ArticleDOI
TL;DR: The results support the view that eusociality is hard to evolve but easily lost, and are potentially important for understanding the early evolution of the advanced eusocial insects, such as ants, termites, and corbiculate bees.
Abstract: We performed a phylogenetic analysis of the species, species groups, and subgenera within the predominantly eusocial lineage of Lasioglossum (the Hemihalictus series) based on three protein coding genes: mitochondrial cytochrome oxidase I, nuclear elongation factor 1alpha and long-wavelength rhodopsin. The entire data set consisted of 3421 aligned nucleotide sites, 854 of which were parsimony informative. Analyses by equal weights parsimony, maximum likelihood, and Bayesian methods yielded good resolution among the 53 taxa/populations, with strong bootstrap support and high posterior probabilities for most nodes. There was no significant incongruence among genes, and parsimony, maximum likelihood, and Bayesian methods yielded congruent results. We mapped social behavior onto the resulting tree for 42 of the taxa/populations to infer the likely history of social evolution within Lasioglossum. Our results indicate that eusociality had a single origin within Lasioglossum. Within the predominantly eusocial clade, however, there have been multiple (six) reversals from eusociality to solitary nesting, social polymorphism, or social parasitism, suggesting that these reversals may be more common in primitively eusocial Hymenoptera than previously anticipated. Our results support the view that eusociality is hard to evolve but easily lost. This conclusion is potentially important for understanding the early evolution of the advanced eusocial insects, such as ants, termites, and corbiculate bees.

Journal ArticleDOI
TL;DR: This work proposes an alternative middle ground by constructing a Bayesian hierarchical phylogenetic model that returns substantially more precise continuous parameter estimates than an independent parameter approach without losing the salient features of the data.
Abstract: Debate exists over how to incorporate information from multipartite sequence data in phylogenetic analyses. Strict combined-data approaches argue for concatenation of all partitions and estimation of one evolutionary history, maximizing the explanatory power of the data. Consensus/independence approaches endorse a two-step procedure where partitions are analyzed independently and then a consensus is determined from the multiple results. Mixtures across the model space of a strict combined-data approach and a priori independent parameters are popular methods to integrate these methods. We propose an alternative middle ground by constructing a Bayesian hierarchical phylogenetic model. Our hierarchical framework enables researchers to pool information across data partitions to improve estimate precision in individual partitions while permitting estimation and testing of tendencies in across-partition quantities. Such across-partition quantities include the distribution from which individual topologies relating the sequences within a partition are drawn. We propose standard hierarchical priors on continuous evolutionary parameters across partitions, while the structure on topologies varies depending on the research problem. We illustrate our model with three examples. We first explore the evolutionary history of the guinea pig (Cavia porcellus) using alignments of 13 mitochondrial genes. The hierarchical model returns substantially more precise continuous parameter estimates than an independent parameter approach without losing the salient features of the data. Second, we analyze the frequency of horizontal gene transfer using 50 prokaryotic genes. We assume an unknown species-level topology and allow individual gene topologies to differ from this with a small estimable probability. Simultaneously inferring the species and individual gene topologies returns a transfer frequency of 17%. We also examine HIV sequences longitudinally sampled from HIV+ patients. 
We ask whether posttreatment development of CCR5 coreceptor virus represents concerted evolution from middisease CXCR4 virus or reemergence of initial infecting CCR5 virus. The hierarchical model pools partitions from multiple unrelated patients by assuming that the topology for each patient is drawn from a multinomial distribution with unknown probabilities. Preliminary results suggest evolution and not reemergence.

Journal ArticleDOI
TL;DR: A statistical test for clustering of distribution areas based on a Monte Carlo simulation with a null model that considers the spatial autocorrelation in the data is proposed and the importance of grid size is demonstrated.
Abstract: Biotic element analysis is an alternative to the areas-of-endemism approach for recognizing the presence or absence of vicariance events in a given region. If an ancestral biota was fragmented by vicariance events, biotic elements or clusters of distribution areas should emerge. We propose a statistical test for clustering of distribution areas based on a Monte Carlo simulation with a null model that considers the spatial autocorrelation in the data. The hypothesis tested is that the observed degree of clustering of ranges can be explained by the range size distribution, the varying number of taxa per cell, and the spatial autocorrelation of the occurrences of a taxon alone. A method for the delimitation of biotic elements which uses model-based Gaussian clustering is introduced. We demonstrate our methods and show the importance of grid size by means of a case study, an analysis of the distribution patterns of southern African species of the weevil genus Scobius. The example highlights the difficulties in delimiting areas of endemism if dispersal has occurred and illustrates the advantages of the biotic element approach. (Area of endemism; biogeography; biotic elements; null model; Scobius; South Africa; vicariance.)
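The Monte Carlo test described above compares an observed clustering statistic with values obtained from data simulated under a null model. The following is a minimal, generic sketch of how such a comparison is usually turned into a p-value; it is not the authors' implementation. It assumes that larger values of the statistic indicate stronger clustering, and it takes the simulated statistics as given: the null model itself (range sizes, taxa per cell, spatial autocorrelation) is not reproduced here, and the function name is hypothetical.

```python
from typing import Sequence


def monte_carlo_p_value(observed: float, simulated: Sequence[float]) -> float:
    """One-sided Monte Carlo p-value.

    Counts how many null-model statistics are at least as large as the
    observed one. The "+1" in numerator and denominator treats the observed
    value as one more draw from the null, so the p-value is never zero.
    Assumption: larger statistic = stronger clustering.
    """
    n_extreme = sum(1 for s in simulated if s >= observed)
    return (1 + n_extreme) / (1 + len(simulated))


if __name__ == "__main__":
    # Hypothetical statistics, for illustration only.
    null_stats = [0.8, 1.1, 0.9, 1.3, 1.0, 0.7, 1.2, 0.95, 1.05]
    print(monte_carlo_p_value(1.6, null_stats))
```

With this estimator the smallest attainable p-value is 1 / (number of simulations + 1), so the number of null simulations bounds how strong a rejection can be reported.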

Journal ArticleDOI
TL;DR: The marmotine mandible may have evolved as a mosaic of characters and does not show convergence determined by size similarities, and the phylogenetic signal in the variation of landmark geometry, which describes mandible morphology, seems to account for the shape differences at intermediate taxonomic levels.
Abstract: Marmots have a prominent role in the study of mammalian social evolution, but only recently has their systematics received the attention it deserves if sociobiological studies are to be placed in a phylogenetic context. Sciurid morphology can be used as a model to test the congruence between morphological change and phylogeny because sciurid skeletal characters are considered to be inclined to convergence. However, no morphological study involving all marmot species has ever been undertaken. Geometric morphometric techniques were applied in a comparative study of the marmot mandible. The adults of all 14 living marmot species were compared, and mean mandible shapes were used to investigate morphological evolution in the genus Marmota. Three major trends were observed. First, the phylogenetic signal in the variation of landmark geometry, which describes mandible morphology, seems to account for the shape differences at intermediate taxonomic levels. The subgenera Marmota and Petromarmota, recently proposed on the basis of mitochondrial cytochrome b sequence, receive support from mandible morphology. When other sciurid genera were included in the analysis, the monophyly of the genus Marmota and that of the tribe Marmotini (i.e., marmots, prairie dogs, and ground squirrels) was strengthened by the morphological data. Second, the marmotine mandible may have evolved as a mosaic of characters and does not show convergence determined by size similarities. Third, allopatric speciation in peripheral isolates may have acted as a powerful force for modeling shape. This hypothesis is strongly supported by the peculiar mandible of M. vancouverensis and, to a lesser degree, by that of M. olympus, both thought to have originated as isolated populations in Pleistocene ice-free refugia.

Journal ArticleDOI
TL;DR: The results suggest that the oldest divergences of Orsonwelles spiders (on Kauai) go back about 4 million years, and the monophyly of the genus Orsonwelles is strongly supported.
Abstract: Spiders of the recently described linyphiid genus Orsonwelles (Araneae, Linyphiidae) are one of the most conspicuous groups of terrestrial arthropods of Hawaiian native forests. There are 13 known Orsonwelles species, and all are single-island endemics. This radiation provides an excellent example of insular gigantism. We reconstructed the cladistic relationships of Orsonwelles species using a combination of morphological and molecular characters (both mitochondrial and nuclear sequences) within a parsimony framework. We explored and quantified the contribution of different character partitions and their sensitivity to changes in the traditional parameters (gap, transition, and transversion costs). The character data show a strong phylogenetic signal, robust to parameter changes. The monophyly of the genus Orsonwelles is strongly supported. The parsimony analysis of all character evidence combined recovered a clade comprising all the non-Kauai Orsonwelles species; the species from Kauai form a paraphyletic assemblage with respect to that clade. The biogeographic pattern of the Hawaiian Orsonwelles species is consistent with colonization by island progression, but alternative explanations for our data exist. Although the geographic origin of the radiation remains unknown, it appears that the ancestral colonizing species arrived first on Kauai (or an older island). The ambiguity in the area cladogram (i.e., post-Oahu colonization) is not derived from conflicting or unresolved phylogenetic signal among Orsonwelles species but rather from the number of taxa on the youngest islands. Speciation in Orsonwelles occurred more often within islands (8 of the 12 cladogenic events) than between islands. A molecular clock was rejected for the sequence data. Divergence times were estimated by using the nonparametric rate smoothing method of Sanderson (1997, Mol. Biol. Evol. 14:1218-1231) and the available geological data for calibration. 
The results suggest that the oldest divergences of Orsonwelles spiders (on Kauai) go back about 4 million years. (Biogeography; cladistics; colonization; Hawaii; Linyphiidae; Orsonwelles; phylogenetics; speciation; spiders.)

Journal Article
TL;DR: Systematic and biogeographical relationships within the Hawaiian clade of the pantropical understory shrub genus Psychotria (Rubiaceae) were investigated using phylogenetic analysis of 18S-26S ribosomal DNA internal (ITS) and external (ETS) transcribed spacers, suggesting monophyletic relationships and extremely rapid radiation in the lineage.
Abstract: Systematic and biogeographical relationships within the Hawaiian clade of the pantropical understory shrub genus Psychotria (Rubiaceae) were investigated using phylogenetic analysis of 18S-26S ribosomal DNA internal (ITS) and external (ETS) transcribed spacers. Phylogenetic analyses strongly suggest that the Hawaiian Psychotria are monophyletic and the result of a single introduction to the Hawaiian Islands. The results of phylogenetic analyses of ITS and ETS partitions alone give slightly different topologies among basal lineages of the Hawaiian clade; however, such differences are not well supported. Relationships in the section Straussia clade in particular are not well resolved because of few nucleotide changes on internal branches, suggesting extremely rapid radiation in the lineage. Parsimony and likelihood reconstructions of ancestral geographical distributions using the topologies inferred from both parsimony and likelihood analysis of combined data and using different combinations of models and branch lengths gave highly congruent results. However, for one internal node (corresponding to the majority of the "greenwelliae" clade), parsimony reconstructions were unable to distinguish between three possible island states, whereas likelihood reconstructions resulted in clear ordering of possible states, with the island of Oahu slightly more probable than other islands under all but one model and branch length combination considered (the Jukes-Cantor-like model with branch lengths inferred under parsimony, under which conditions Maui Nui is more probable). A pattern of colonization from oldest to youngest islands was inferred from the phylogeny, using maximum parsimony and maximum likelihood. Additionally, a much higher incidence of intraisland versus interisland speciation was inferred. (Ancestral character state reconstruction; biogeography; ETS; Hawaii; island evolution; ITS; molecular systematics; Psychotria.)

Journal Article
TL;DR: Phylogenetic relationships among advanced snakes and the position of the genus Acrochordus relative to colubroid taxa are investigated by phylogenetic analysis of fragments from four mitochondrial genes representing 62 caenophidian genera and 5 noncaenophidian taxa, and Acrochordus + Xenoderminae appears to be the sister group to the Colubroidea.
Abstract: Phylogenetic relationships among advanced snakes (Acrochordus + Colubroidea = Caenophidia) and the position of the genus Acrochordus relative to colubroid taxa are contentious. These concerns were investigated by phylogenetic analysis of fragments from four mitochondrial genes representing 62 caenophidian genera and 5 noncaenophidian taxa. Four methods of phylogeny reconstruction were applied: matrix representation with parsimony (MRP) supertree consensus, maximum parsimony, maximum likelihood, and Bayesian analysis. Because of incomplete sampling, extensive missing data were inherent in this study. Analyses of individual genes retrieved roughly the same clades, but branching order varied greatly between gene trees, and nodal support was poor. Trees generated from combined data sets using maximum parsimony, maximum likelihood, and Bayesian analysis had medium to low nodal support but were largely congruent with each other and with MRP supertrees. Conclusions about caenophidian relationships were based on these combined analyses. The Xenoderminae, Viperidae, Pareatinae, Psammophiinae, Pseudoxyrophiinae, Homalopsinae, Natricinae, Xenodontinae, and Colubrinae (redefined) emerged as monophyletic, whereas Lamprophiinae, Atractaspididae, and Elapidae were not in one or more topologies. A clade comprising Acrochordus and Xenoderminae branched closest to the root, and when Acrochordus was assessed in relation to a colubroid subsample and all five noncaenophidians, it remained associated with the Colubroidea. Thus, Acrochordus + Xenoderminae appears to be the sister group to the Colubroidea, and Xenoderminae should be excluded from Colubroidea. Within Colubroidea, Viperidae was the most basal clade. Other relationships appearing in all final topologies were (1) a clade comprising Psammophiinae, Lamprophiinae, Atractaspididae, Pseudoxyrophiinae, and Elapidae, within which the latter four taxa formed a subclade, and (2) a clade comprising Colubrinae, Natricinae, and Xenodontinae, within which the latter two taxa formed a subclade. Pareatinae and Homalopsinae were the most unstable clades.

Journal Article
TL;DR: A phylogenetic analysis of calyptraeid gastropods using DNA sequence data from mitochondrial cytochrome oxidase I and 16S genes and the nuclear 28S gene was used to examine the biogeographic patterns of speciation in the Calyptraeidae.
Abstract: Although calyptraeid gastropods are not well understood taxonomically, in part because their simple plastic shells are the primary taxonomic character, they provide an ideal system to examine questions about evolution in the marine environment. I conducted a phylogenetic analysis of calyptraeid gastropods using DNA sequence data from mitochondrial cytochrome oxidase I (COI) and 16S genes and the nuclear 28S gene. The resultant phylogeny was used to examine the biogeographic patterns of speciation in the Calyptraeidae. Parsimony and Bayesian analyses of the combined data sets for 94 calyptraeid operational taxonomic units and 24 outgroups produced well-resolved phylogenies. Both approaches resulted in identical sister-species relationships, and the few differences in deeper topology did not affect biogeographic inferences. The geographic distribution of the species included here demonstrates numerous dispersal events both between the Pacific and Atlantic oceans and across the equator. When parsimony is used to reconstruct the movement from the Pacific to the Atlantic oceans on the phylogeny, there are 12 transitions between oceans, primarily from the Pacific to the Atlantic. When the latitude is coded as north versus south of the equator, the most-parsimonious reconstruction gives the origin of calyptraeids in the north followed by 15 dispersal events to regions south of the equator and no returns to the north. Many clades of the most closely related species are either sympatric or occur along a single coastline. Closely related species can, however, occur in such divergent regions as Southern California and South Africa. There is little evidence for sister-species pairs or larger clades having been split by the Isthmus of Panama or the Benguela upwelling, but the East Pacific Barrier appears to separate the most basal taxa from the rest of the family. (Biogeographic barriers; Calyptraea; Crepidula; Crucibulum; cytochrome oxidase I; 16S; sympatric speciation.)